-
Notifications
You must be signed in to change notification settings - Fork 6.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[1/4][PerfImprove][ResourceMngmt] Deallocate payload of BlockBasedTableBuilder::Rep::FilterBlockBuilder earlier for Full/PartitionedFilter #9070
Conversation
@hx235 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
HISTORY.md
Outdated
@@ -10,6 +10,7 @@ | |||
* Fixed a bug where stalled writes would remain stalled forever after the user calls `WriteBufferManager::SetBufferSize()` with `new_size == 0` to dynamically disable memory limiting. | |||
* Make `DB::close()` thread-safe. | |||
* Fix a bug in atomic flush where one bg flush thread will wait forever for a preceding bg flush thread to commit its result to MANIFEST but encounters an error which is mapped to a soft error (DB not stopped). | |||
* Fix a bug in `BlockBasedTableBuilder` where `BlockBasedTableBuilder::Rep::FilterBlockBuilder` object is not deallocated until `BlockBasedTableBuilder` is deallocated, despite the fact that it is no longer useful after `BlockBasedTableBuilder::Finish()`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would be considered a "performance improvement" because the only thing "wrong" before was using more memory than necessary. And the level of detail should be appropriate for someone only interested in public API, not internal details.
@@ -2009,6 +2009,11 @@ Status BlockBasedTableBuilder::Finish() { | |||
if (ok()) { | |||
WriteFooter(metaindex_block_handle, index_block_handle); | |||
} | |||
if (ok()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good catch, though I don't think we need to be so concerned about the FilterBlockBuilder object itself as we are about ensuring its payload is deallocated as soon as possible. And that should be done earlier, on call to FilterBlockBuilder::Finish(). Ideally, the FilterBlockBuilder would transfer ownership of the payload (e.g. FullFilterBlock::filter_data_) to the caller of FilterBlockBuilder::Finish() (like how FilterBitsBuilder::Finish() does in an output parameter). That's a small internal refactoring that seems worthwhile.
Btw, .reset()
has same effect as .reset(nullptr)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
though I don't think we need to be so concerned about the FilterBlockBuilder object itself as we are about ensuring its payload is deallocated as soon as possible. And that should be done earlier, on call to FilterBlockBuilder::Finish()
Good point! I will proceed with the small internal refactoring and then deallocate the payload on call to FilterBlockBuilder::Finish().
@hx235 has updated the pull request. You must reimport the pull request before landing. |
@hx235 has updated the pull request. You must reimport the pull request before landing. |
2 similar comments
@hx235 has updated the pull request. You must reimport the pull request before landing. |
@hx235 has updated the pull request. You must reimport the pull request before landing. |
Update:
|
@hx235 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
@hx235 has updated the pull request. You must reimport the pull request before landing. |
1 similar comment
@hx235 has updated the pull request. You must reimport the pull request before landing. |
@hx235 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
1 similar comment
@hx235 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Otherwise seems reasonable :)
// Deallocate the filter data payload of FilterBlockBuilder when done | ||
// using it to release memory. Otherwise, it will remain until | ||
// BlockBasedTableBuilder is deallocated. | ||
if (filter_data) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a free before use (use after free) because filter_content
should be referring to filter_data
. So you can just let RAII free filter_data
automatically when it goes out of scope.
I wonder why this was not caught by ASAN run. That could indicate a bug that neither of us sees. One way to verify your new logic is actually being exercised it to throw in a temporary assert(filter_data == nullptr);
and make sure it crashes some tests.
You can put a short comment on filter_data
, or not a big deal if adding comment to Finish().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder why this was not caught by ASAN run.
It turns out to be a bug of mine that prevents this new logic from being run. Rather silly but previous impl in 084ea78 accidentally puts ownership transferring after return statement.
Valuable lessons learned:
- Make sure new logic being run by existing tests before relying on them by deliberately crashing some tests
- Always remember lesson 1 🙃
table/block_based/filter_block.h
Outdated
virtual Slice Finish(const BlockHandle& tmp, Status* status) = 0; | ||
virtual Slice Finish( | ||
const BlockHandle& tmp, Status* status, | ||
std::unique_ptr<const char[]>* filter_data = nullptr) = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think filter_data
deserves a comment here (as well as const BlockHandle&
which is very confusing in this context), like
If filter_data is not nullptr, Finish() may transfer ownership of block data to the caller, so that it can be freed as soon as possible.
@hx235 has updated the pull request. You must reimport the pull request before landing. |
Update:
|
@hx235 has updated the pull request. You must reimport the pull request before landing. |
41e149c
to
6ed0f9a
Compare
@hx235 has updated the pull request. You must reimport the pull request before landing. |
@hx235 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
Update:
|
f862683
to
d80cf48
Compare
@hx235 has updated the pull request. You must reimport the pull request before landing. |
@hx235 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
Update:
|
#9073) Summary: Note: This PR is the 4th part of a bigger PR stack (#9073) and will rebase/merge only after the first three PRs (#9070, #9071, #9130) merge. **Context:** Similar to #8428, this PR is to track memory usage during (new) Bloom Filter (i.e,FastLocalBloom) and Ribbon Filter (i.e, Ribbon128) construction, moving toward the goal of [single global memory limit using block cache capacity](https://github.com/facebook/rocksdb/wiki/Projects-Being-Developed#improving-memory-efficiency). It also constrains the size of the banding portion of Ribbon Filter during construction by falling back to Bloom Filter if that banding is, at some point, larger than the available space in the cache under `LRUCacheOptions::strict_capacity_limit=true`. The option to turn on this feature is `BlockBasedTableOptions::reserve_table_builder_memory = true` which by default is set to `false`. We [decided](#9073 (comment)) not to have separate option for separate memory user in table building therefore their memory accounting are all bundled under one general option. **Summary:** - Reserved/released cache for creation/destruction of three main memory users with the passed-in `FilterBuildingContext::cache_res_mgr` during filter construction: - hash entries (i.e`hash_entries`.size(), we bucket-charge hash entries during insertion for performance), - banding (Ribbon Filter only, `bytes_coeff_rows` +`bytes_result_rows` + `bytes_backtrack`), - final filter (i.e, `mutable_buf`'s size). - Implementation details: in order to use `CacheReservationManager::CacheReservationHandle` to account final filter's memory, we have to store the `CacheReservationManager` object and `CacheReservationHandle` for final filter in `XXPH3BitsFilterBuilder` as well as explicitly delete the filter bits builder when done with the final filter in block based table. - Added option fo run `filter_bench` with this memory reservation feature Pull Request resolved: #9073 Test Plan: - Added new tests in `db_bloom_filter_test` to verify filter construction peak cache reservation under combination of `BlockBasedTable::Rep::FilterType` (e.g, `kFullFilter`, `kPartitionedFilter`), `BloomFilterPolicy::Mode`(e.g, `kFastLocalBloom`, `kStandard128Ribbon`, `kDeprecatedBlock`) and `BlockBasedTableOptions::reserve_table_builder_memory` - To address the concern for slow test: tests with memory reservation under `kFullFilter` + `kStandard128Ribbon` and `kPartitionedFilter` take around **3000 - 6000 ms** and others take around **1500 - 2000 ms**, in total adding **20000 - 25000 ms** to the test suit running locally - Added new test in `bloom_test` to verify Ribbon Filter fallback on large banding in FullFilter - Added test in `filter_bench` to verify that this feature does not significantly slow down Bloom/Ribbon Filter construction speed. Local result averaged over **20** run as below: - FastLocalBloom - baseline `./filter_bench -impl=2 -quick -runs 20 | grep 'Build avg'`: - **Build avg ns/key: 29.56295** (DEBUG_LEVEL=1), **29.98153** (DEBUG_LEVEL=0) - new feature (expected to be similar as above)`./filter_bench -impl=2 -quick -runs 20 -reserve_table_builder_memory=true | grep 'Build avg'`: - **Build avg ns/key: 30.99046** (DEBUG_LEVEL=1), **30.48867** (DEBUG_LEVEL=0) - new feature of RibbonFilter with fallback (expected to be similar as above) `./filter_bench -impl=2 -quick -runs 20 -reserve_table_builder_memory=true -strict_capacity_limit=true | grep 'Build avg'` : - **Build avg ns/key: 31.146975** (DEBUG_LEVEL=1), **30.08165** (DEBUG_LEVEL=0) - Ribbon128 - baseline `./filter_bench -impl=3 -quick -runs 20 | grep 'Build avg'`: - **Build avg ns/key: 129.17585** (DEBUG_LEVEL=1), **130.5225** (DEBUG_LEVEL=0) - new feature (expected to be similar as above) `./filter_bench -impl=3 -quick -runs 20 -reserve_table_builder_memory=true | grep 'Build avg' `: - **Build avg ns/key: 131.61645** (DEBUG_LEVEL=1), **132.98075** (DEBUG_LEVEL=0) - new feature of RibbonFilter with fallback (expected to be a lot faster than above due to fallback) `./filter_bench -impl=3 -quick -runs 20 -reserve_table_builder_memory=true -strict_capacity_limit=true | grep 'Build avg'` : - **Build avg ns/key: 52.032965** (DEBUG_LEVEL=1), **52.597825** (DEBUG_LEVEL=0) - And the warning message of `"Cache reservation for Ribbon filter banding failed due to cache full"` is indeed logged to console. Reviewed By: pdillinger Differential Revision: D31991348 Pulled By: hx235 fbshipit-source-id: 9336b2c60f44d530063da518ceaf56dac5f9df8e
Note: This PR is the 1st part of a bigger PR stack (#9073).
Context:
Previously, the payload (i.e, filter data) within
BlockBasedTableBuilder::Rep::FilterBlockBuilder
object is not deallocated untilBlockBasedTableBuilder
is deallocated, despite it is no longer useful after its relatedfilter_content
being written.Summary:
BlockBasedTableBuilder::Rep::FilterBlockBuilder
objectfilters
andfilter_gc
lists into onestd::deque<FilterEntry> filters
by adding a new fieldlast_filter_entry_key
and storing thestd::unique_ptr filter_data
with theSlice filter
in the same entrylast_filter_data
in the case wherefilters
is empty, which should be as by then we would've finish using all theSlice filter
filter_content
associated with the payloadFilterBlockBuilder::Finish()
, which leads to touching the inherited interface inBlockBasedFilterBlockBuilder
. But for that, the payload transferring is ignored.Test:
FilterBlockBuilder::Finish()
andBlockBasedTableBuilder::Finish()
and interface mismatch. Relying on existing CI tests is enough asassert(false)
was temporarily added to verify the new logic of transferring ownership indeed run