
[ResourceManagement] Account for dictionary-building buffer in global memory limit #8428

Closed
wants to merge 13 commits

Conversation

@hx235 (Contributor) commented on Jun 19, 2021

Context:
Some data blocks are temporarily buffered in memory in BlockBasedTableBuilder to build the compression dictionary used for data block compression. Currently this memory usage is not counted toward the global memory limit backed by block cache capacity. To improve that, this PR charges that memory usage to the block cache, achieving better memory tracking and limiting.

Summary:

  • Reserve memory in the block cache for data blocks buffered to build a compression dictionary
  • Release all the memory associated with buffering those data blocks in EnterUnbuffered(), which is called when (a) the buffer limit is exceeded after buffering, (b) the block cache becomes full after a reservation, or (c) BlockBasedTableBuilder calls Finish(); a minimal sketch of this flow follows below
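
For illustration, here is a minimal C++ sketch of the reserve/release flow. It assumes a CacheReservationManager-style helper with the UpdateCacheReservation() API described later in this PR; the stub types and function names mirror BlockBasedTableBuilder's members (Rep, data_begin_offset, cache_rev_mng) but are illustrative, not the exact diff.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Illustrative stand-ins for the RocksDB pieces involved; the real classes
// are CacheReservationManager and BlockBasedTableBuilder::Rep.
struct Status {
  bool ok = true;  // the real rocksdb::Status also carries a code/message
};

class CacheReservationManager {
 public:
  // Inserts or releases dummy block-cache entries so that the total bytes
  // reserved track `new_mem_used` (the consolidated API from PR #8506).
  Status UpdateCacheReservation(std::size_t new_mem_used) {
    reserved_ = new_mem_used;
    return Status{};  // a failure here would mean the block cache is full
  }

 private:
  std::size_t reserved_ = 0;
};

struct Rep {
  std::vector<std::string> data_block_buffers;  // buffered, uncompressed blocks
  std::size_t data_begin_offset = 0;            // total bytes currently buffered
  std::size_t buffer_limit = 0;                 // user-set cap; 0 means unlimited
  CacheReservationManager* cache_rev_mng = nullptr;
};

// After a data block has been appended to the buffer while the builder is in
// kBuffered mode: charge the buffered bytes against the block cache. Returns
// true if the builder should switch to unbuffered mode.
bool ChargeBufferedBlockAndCheck(Rep* r) {
  const bool exceeds_buffer_limit =
      r->buffer_limit != 0 && r->data_begin_offset > r->buffer_limit;
  bool exceeds_global_block_cache_limit = false;
  if (!exceeds_buffer_limit && r->cache_rev_mng != nullptr) {
    Status s = r->cache_rev_mng->UpdateCacheReservation(r->data_begin_offset);
    exceeds_global_block_cache_limit = !s.ok;
  }
  // Either condition triggers EnterUnbuffered() in the builder.
  return exceeds_buffer_limit || exceeds_global_block_cache_limit;
}

// From EnterUnbuffered(): the buffered blocks get compressed and written out,
// so the whole reservation is released in one call.
void ReleaseBufferReservation(Rep* r) {
  r->data_block_buffers.clear();
  r->data_begin_offset = 0;
  if (r->cache_rev_mng != nullptr) {
    r->cache_rev_mng->UpdateCacheReservation(/*new_mem_used=*/0);
  }
}
```

The key design point is that the reservation always tracks the total buffered bytes (data_begin_offset), so releasing everything in EnterUnbuffered() amounts to a single UpdateCacheReservation(0) call.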

Test Plan:

  • Passing existing unit tests
  • Passing new unit tests

if (r->state == Rep::State::kBuffered) {
  if ((r->buffer_limit != 0 && r->data_begin_offset > r->buffer_limit) ||
      r->buffer_exceeds_global_memory_limit) {
    EnterUnbuffered();
@hx235 (Contributor, Author) commented on Jun 19, 2021:

Question 2:
I decided to do the checking and the switch to unbuffered here instead of in WriteBlock() right after we add the data block to the buffer. My motivation is to (1) keep WriteBlock() simple and Flush() easy to reason about, and (2) align with the existing check against the user-set buffer_limit, since both checks follow the same unbuffer-by-size policy.

The downside is that we need to keep a shared bool variable "buffer_exceeds_global_memory_limit" around (like "data_begin_offset") and remember to update it.

Let me know if you think we should do the checking and switching somewhere else.

A contributor replied:

Can we do ReserveLastBufferedDataBlockMemInCache() here (after the flush in kBuffered mode), and also make it return the cache-insertion Status so we can use that to check whether the block cache was full? The goal is to avoid new state.

With that change we'd no longer charge the block cache for a Flush() in kBuffered mode during Finish(). I think that is fine because Finish() is going to EnterUnbuffered() right away regardless of the cache-insertion result. Maybe for consistency we should also skip ReserveLastBufferedDataBlockMemInCache() when we've already surpassed r->buffer_limit, since in that case we are also going to EnterUnbuffered() right away regardless.

The high-level idea is that we only charge the block cache for the last buffered block when we are going to build another buffered block (sketched below).
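
A rough sketch of the suggested control flow, reusing the stub Rep/Status/CacheReservationManager/ReleaseBufferReservation types from the sketch in the PR description; ReserveLastBufferedDataBlockMemInCache is the proposed helper name, while the surrounding structure is an assumption, not the actual diff:

```cpp
// Charge the block cache for everything buffered so far; the returned Status
// doubles as the "is the block cache full?" signal, avoiding a new member.
Status ReserveLastBufferedDataBlockMemInCache(Rep* r) {
  if (r->cache_rev_mng == nullptr) {
    return Status{};
  }
  return r->cache_rev_mng->UpdateCacheReservation(r->data_begin_offset);
}

void EnterUnbuffered(Rep* r) {
  // The real builder compresses and writes out the buffered blocks here; this
  // sketch only shows the reservation release.
  ReleaseBufferReservation(r);
}

// After flushing a data block in kBuffered mode: only charge the cache when
// another buffered block is actually going to be built.
void AfterBufferedFlush(Rep* r) {
  const bool exceeds_buffer_limit =
      r->buffer_limit != 0 && r->data_begin_offset > r->buffer_limit;
  bool cache_full = false;
  if (!exceeds_buffer_limit) {
    cache_full = !ReserveLastBufferedDataBlockMemInCache(r).ok;
  }
  if (exceeds_buffer_limit || cache_full) {
    EnterUnbuffered(r);  // releases both the buffers and the reservation
  }
}
```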

@hx235 force-pushed the account_dict_building_buffer_memory branch from cd73cd8 to 962d2b5 on June 19, 2021 02:15
@hx235 force-pushed the account_dict_building_buffer_memory branch from 962d2b5 to c7cf217 on June 19, 2021 03:18
@hx235 force-pushed the account_dict_building_buffer_memory branch 4 times, most recently from 1e13c97 to 39e492f, on June 21, 2021 19:42
table/block_based/block_based_table_builder.cc — four outdated review threads (resolved)
// TODO: figure out what is the right status message for successfully
// flushing all the buffered data blocks
if (ok()) {
  FreeAllBufferedDataBlocksMemFromCache();
@ajkr (Contributor) commented on Jun 26, 2021:

This appears to be inside the loop over r->data_block_buffers, so it will execute many times.

Also, we need to make sure to call this in the error case so that a failed table build releases its dummy entry handles.

@hx235 (Contributor, Author) replied:

> Also, we need to make sure to call this in the error case so that a failed table build releases its dummy entry handles.

I changed the code to release the memory only after the line r->data_block_buffers.clear(); and removed my previous buggy ok() check, since the reserved memory should stay as close as possible to the total bytes in r->data_block_buffers (which is r->data_begin_offset, I believe) regardless of any failure that may have produced a non-OK status up to this point. Not sure if this is what your comment was suggesting :)

facebook-github-bot pushed a commit that referenced this pull request Aug 24, 2021
…8506)

Summary:
Context:
To help cap various memory usage under a single limit (the block cache capacity), we charge memory usage by inserting/releasing dummy entries in the block cache. CacheReservationManager is such a class (non-thread-safe), responsible for inserting/removing dummy entries to reserve cache space for memory used by its user.

- Refactored the inner private class CacheRep of WriteBufferManager into a public CacheReservationManager class so it can be reused, e.g., by #8428

- Encapsulated implementation details of cache-key generation and dummy-entry insertion/release for cache reservation, as discussed in #8506 (comment)

- Consolidated increase/decrease cache reservation into one API - UpdateCacheReservation.

- Adjusted the previous dummy-entry release algorithm, when decreasing the cache reservation, to loop-release dummy entries so it stays symmetric to the dummy-entry insertion algorithm

- Made the previous dummy-entry release algorithm in delayed-decrease mode more aggressive, so the cache reservation shrinks faster when memory usage is unlikely to grow back.

  Previously, the algorithm released only one dummy entry when new_mem_used < 3/4 * cache_allocated_size_ and cache_allocated_size_ - kSizeDummyEntry > new_mem_used. Now, the algorithm loop-releases as many dummy entries as possible when new_mem_used < 3/4 * cache_allocated_size_ (sketched after this list).

- Updated WriteBufferManager's test cases to adapt to the changes in the release algorithm mentioned above, and left comments on some test cases for clarity

- Replaced the previous cache-key prefix generation (which used an object address related to the cache client) with one that uses Cache->NewId(), to prevent cache-key collisions among dummy-entry clients sharing the same cache.

  The specific collision we are preventing happens when an object address is reused for a new cache-key prefix while an old cache key using that same object address in its prefix still exists in the cache. This can happen because, under the LRU cache policy, there can be a delay in releasing a cache entry after the cache client object that owns it is deallocated; the object address can then be reused by another client object to generate a new cache-key prefix.

  This prefix generation can be made obsolete after Peter's unification of all the code that generates cache keys, mentioned in #8506 (comment)
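
For reference, here is a minimal sketch of the release behavior described above. The dummy-entry bookkeeping is stubbed out and kSizeDummyEntry's value is illustrative; the real logic lives in CacheReservationManager::UpdateCacheReservation.

```cpp
#include <cstddef>

// Illustrative sketch of the delayed-decrease release policy described above;
// the class shape and constants are assumptions for illustration only.
class DummyEntryReservation {
 public:
  static constexpr std::size_t kSizeDummyEntry = 256 * 1024;

  // Grow the reservation one dummy entry at a time until it covers
  // new_mem_used.
  void Increase(std::size_t new_mem_used) {
    while (cache_allocated_size_ < new_mem_used) {
      ++dummy_entries_;  // stands in for a block-cache Insert()
      cache_allocated_size_ += kSizeDummyEntry;
    }
  }

  // Old behavior: release at most one dummy entry per call. New behavior
  // (this commit): once memory usage drops below 3/4 of the reserved size,
  // loop-release as many dummy entries as possible while the remaining
  // reservation still covers new_mem_used.
  void DelayedDecrease(std::size_t new_mem_used) {
    if (4 * new_mem_used >= 3 * cache_allocated_size_) {
      return;  // not enough headroom yet; keep the reservation as-is
    }
    while (dummy_entries_ > 0 &&
           cache_allocated_size_ - kSizeDummyEntry >= new_mem_used) {
      --dummy_entries_;  // stands in for a block-cache Release()/Erase()
      cache_allocated_size_ -= kSizeDummyEntry;
    }
  }

  std::size_t cache_allocated_size() const { return cache_allocated_size_; }

 private:
  std::size_t dummy_entries_ = 0;
  std::size_t cache_allocated_size_ = 0;
};
```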

Pull Request resolved: #8506

Test Plan:
- Passing the added unit tests cache_reservation_manager_test.cc
- Passing existing and adjusted write_buffer_manager_test.cc

Reviewed By: ajkr

Differential Revision: D29644135

Pulled By: hx235

fbshipit-source-id: 0fc93fbfe4a40bb41be85c314f8f2bafa8b741f7
@hx235 force-pushed the account_dict_building_buffer_memory branch from 39e492f to 401c60b on August 27, 2021 20:14
@hx235 changed the title from "[draft][early review][global memory limit] Account for dictionary-building buffer in global memory limit" to "[draft][global memory limit] Account for dictionary-building buffer in global memory limit" on Aug 27, 2021
@hx235 (Contributor, Author) commented on Aug 27, 2021:

Will address review comments in next update

Update:

  • Rebased and integrated with CacheReservationManager for memory accounting
  • Removed some todos

@hx235 force-pushed the account_dict_building_buffer_memory branch from 2d6c828 to c9636c1 on September 4, 2021 04:02
@hx235 changed the title from "[draft][global memory limit] Account for dictionary-building buffer in global memory limit" to "[ResourceManagement] Account for dictionary-building buffer in global memory limit" on Sep 4, 2021
@hx235 (Contributor, Author) commented on Sep 4, 2021:

Update:

  • Addressed PR comments about where to implement the memory-reservation logic (comment) and fixed two previous implementation bugs (comment)
  • Added new unit tests
  • Clarified a subtle detail in the class comment about adding a new CacheEntryRole enum value
  • Ready for review

// only if the block is not going to be unbuffered immediately
// and there exists a cache reservation manager
if (!exceeds_buffer_limit && r->cache_rev_mng != nullptr) {
  Status s = r->cache_rev_mng->UpdateCacheReservation<
@hx235 (Contributor, Author) commented on Sep 4, 2021:

[Question]
It seems there isn't any synchronization inside BlockBasedTableBuilder for data structures like data_block_buffers, nor did I see multi-threaded usage of one BlockBasedTableBuilder object in the tests other than parallel compression, which seems irrelevant to our data buffering (although I'm having a bit of difficulty fully understanding how parallel compression plays its part in BlockBasedTableBuilder). I'd also imagine it would be pretty hard to guarantee key ordering when doing multi-threaded Add() from one builder.

I wonder if we can skip external synchronization of cache_rev_mng inside BlockBasedTableBuilder (which I feel we can).

A contributor replied:

BlockBasedTableBuilder is generally single-threaded (except the experimental parallel compression parts, probably not relevant here), so having a separate cache_rev_mng per builder should suffice. 👍

@hx235 (Contributor, Author) replied:

Thanks for jumping in and answering my question!

A contributor replied:

Right, the background threads will not be involved in BlockBasedTableBuilder::Add(). They only do compressing/writing, and they only start doing work once we EnterUnbuffered().

@facebook-github-bot: @hx235 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@hx235 force-pushed the account_dict_building_buffer_memory branch from a6ef874 to 15a3747 on September 8, 2021 17:15
@facebook-github-bot: @hx235 has updated the pull request. You must reimport the pull request before landing.

@hx235 (Contributor, Author) commented on Sep 8, 2021:

Update:

  • Adopted the suggestion to adjust test values so they hide implementation details of metadata
  • Refined the wording of some comments
  • Will merge after signals clear

@facebook-github-bot: @hx235 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

2 similar comments

@facebook-github-bot: @hx235 merged this pull request in 91b95ca.

facebook-github-bot pushed a commit that referenced this pull request Oct 9, 2021
Summary:
Context:
HISTORY.md was not properly updated along with the change in #8428, where we introduced accounting of the compression-dictionary buffering memory and an extra condition for triggering data unbuffering.
Updated HISTORY.md for #8428 in the 6.25.0 section of HISTORY.md.
Updated blog post https://rocksdb.org/blog/2021/05/31/dictionary-compression.html.

Pull Request resolved: #9001

Reviewed By: ajkr

Differential Revision: D31517836

Pulled By: hx235

fbshipit-source-id: 01f6b30de4e1ff6b315aa8221139d9b700c7c629
ajkr pushed a commit to ajkr/rocksdb that referenced this pull request on Oct 11, 2021 (same commit message as the Oct 9, 2021 commit above).
facebook-github-bot pushed a commit that referenced this pull request Nov 1, 2021
… global memory limit) (#9032)

Summary:
- Renamed `cache_rev_mng` to `compression_dict_buffer_cache_res_mgr`
   - This distinguishes it from other potential `cache_res_mgr` instances in `BlockBasedTableBuilder` and uses the correct shorthand for the words "reservation" and "manager"
- Added `table_options.block_cache == nullptr` in addition to `table_options.no_block_cache == true` as conditions under which we don't create a `CacheReservationManager`
   - Theoretically `table_options.no_block_cache == true` is equivalent to `table_options.block_cache == nullptr` per the API. But since passing `nullptr` into `CacheReservationManager`'s constructor would cause a segmentation fault, it does not hurt to directly verify `table_options.block_cache != nullptr` before passing it in
- Renamed `is_cache_full` to `exceeds_global_block_cache_limit`
   - This hides the implementation detail of cache reservation and emphasizes the design intent of capping memory within the global block cache limit

Pull Request resolved: #9032

Test Plan: - Passing existing tests

Reviewed By: ajkr

Differential Revision: D32005807

Pulled By: hx235

fbshipit-source-id: 619fd17bb924199de3db5924d8ab7dae53b1efa2
facebook-github-bot pushed a commit that referenced this pull request Nov 18, 2021
#9073)

Summary:
Note: This PR is the 4th part of a bigger PR stack (#9073) and will rebase/merge only after the first three PRs (#9070, #9071, #9130) merge.

**Context:**
Similar to #8428, this PR is to track memory usage during (new) Bloom Filter (i.e., FastLocalBloom) and Ribbon Filter (i.e., Ribbon128) construction, moving toward the goal of a [single global memory limit using block cache capacity](https://github.com/facebook/rocksdb/wiki/Projects-Being-Developed#improving-memory-efficiency). It also constrains the size of the banding portion of the Ribbon Filter during construction by falling back to the Bloom Filter if that banding is, at some point, larger than the available space in the cache under `LRUCacheOptions::strict_capacity_limit=true`.

The option to turn on this feature is `BlockBasedTableOptions::reserve_table_builder_memory = true`, which is set to `false` by default. We [decided](#9073 (comment)) not to have a separate option for each memory user in table building; therefore their memory accounting is all bundled under one general option.

**Summary:**
- Reserved/released cache for the creation/destruction of three main memory users with the passed-in `FilterBuildingContext::cache_res_mgr` during filter construction (a rough sketch of the reservation pattern follows this list):
   - hash entries (i.e., `hash_entries`.size(); we bucket-charge hash entries during insertion for performance),
   - banding (Ribbon Filter only, `bytes_coeff_rows` + `bytes_result_rows` + `bytes_backtrack`),
   - final filter (i.e., `mutable_buf`'s size).
      - Implementation detail: in order to use `CacheReservationManager::CacheReservationHandle` to account for the final filter's memory, we have to store the `CacheReservationManager` object and `CacheReservationHandle` for the final filter in `XXPH3BitsFilterBuilder`, and explicitly delete the filter bits builder when done with the final filter in the block-based table.
- Added an option to run `filter_bench` with this memory reservation feature
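
To make the handle-based, bucket-charging idea concrete, here is an illustrative sketch. The ReservationHandle class, bucket size, and per-entry size are assumptions for illustration only; the real code uses CacheReservationManager and its handle type inside the filter builders.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Illustrative RAII handle: holding it keeps a dummy-entry reservation alive
// in the block cache; destroying it releases that reservation.
class ReservationHandle {
 public:
  explicit ReservationHandle(std::size_t bytes) : bytes_(bytes) {}
  ~ReservationHandle() { /* would release `bytes_` worth of dummy entries */ }
  std::size_t bytes() const { return bytes_; }

 private:
  std::size_t bytes_;
};

// Hypothetical bucket-charging filter builder: instead of updating the
// reservation on every added key, charge one bucket's worth of hash-entry
// memory at a time so the hot path stays cheap.
class BucketChargedFilterBuilder {
 public:
  static constexpr std::size_t kEntriesPerBucket = 1 << 14;  // illustrative
  static constexpr std::size_t kBytesPerEntry = sizeof(std::uint64_t);

  void AddHash(std::uint64_t h) {
    hash_entries_.push_back(h);
    if (hash_entries_.size() % kEntriesPerBucket == 0) {
      // One reservation (handle) per bucket of hash entries.
      handles_.push_back(std::make_unique<ReservationHandle>(
          kEntriesPerBucket * kBytesPerEntry));
    }
  }

  void FinishAndRelease() {
    // Dropping the handles releases the hash-entry reservations; in the real
    // code the final filter keeps its own handle alive until the table
    // builder is done with it.
    handles_.clear();
    hash_entries_.clear();
  }

 private:
  std::vector<std::uint64_t> hash_entries_;
  std::vector<std::unique_ptr<ReservationHandle>> handles_;
};
```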

Pull Request resolved: #9073

Test Plan:
- Added new tests in `db_bloom_filter_test` to verify filter construction peak cache reservation under combination of  `BlockBasedTable::Rep::FilterType` (e.g, `kFullFilter`, `kPartitionedFilter`), `BloomFilterPolicy::Mode`(e.g, `kFastLocalBloom`, `kStandard128Ribbon`, `kDeprecatedBlock`) and `BlockBasedTableOptions::reserve_table_builder_memory`
  - To address the concern about slow tests: tests with memory reservation under `kFullFilter` + `kStandard128Ribbon` and `kPartitionedFilter` take around **3000 - 6000 ms** and others take around **1500 - 2000 ms**, in total adding **20000 - 25000 ms** to the test suite when run locally
- Added new test in `bloom_test` to verify Ribbon Filter fallback on large banding in FullFilter
- Added test in `filter_bench` to verify that this feature does not significantly slow down Bloom/Ribbon Filter construction speed. Local result averaged over **20** run as below:
   - FastLocalBloom
      - baseline `./filter_bench -impl=2 -quick -runs 20 | grep 'Build avg'`:
         - **Build avg ns/key: 29.56295** (DEBUG_LEVEL=1), **29.98153** (DEBUG_LEVEL=0)
      - new feature (expected to be similar as above)`./filter_bench -impl=2 -quick -runs 20 -reserve_table_builder_memory=true | grep 'Build avg'`:
         - **Build avg ns/key: 30.99046** (DEBUG_LEVEL=1), **30.48867** (DEBUG_LEVEL=0)
      - new feature of RibbonFilter with fallback  (expected to be similar as above) `./filter_bench -impl=2 -quick -runs 20 -reserve_table_builder_memory=true -strict_capacity_limit=true | grep 'Build avg'` :
         - **Build avg ns/key: 31.146975** (DEBUG_LEVEL=1), **30.08165** (DEBUG_LEVEL=0)

    - Ribbon128
       - baseline `./filter_bench -impl=3 -quick -runs 20 | grep 'Build avg'`:
           - **Build avg ns/key: 129.17585** (DEBUG_LEVEL=1), **130.5225** (DEBUG_LEVEL=0)
       - new feature  (expected to be similar as above) `./filter_bench -impl=3 -quick -runs 20 -reserve_table_builder_memory=true | grep 'Build avg' `:
           - **Build avg ns/key: 131.61645** (DEBUG_LEVEL=1), **132.98075** (DEBUG_LEVEL=0)
       - new feature of RibbonFilter with fallback (expected to be a lot faster than above due to fallback) `./filter_bench -impl=3 -quick -runs 20 -reserve_table_builder_memory=true -strict_capacity_limit=true | grep 'Build avg'` :
          - **Build avg ns/key: 52.032965** (DEBUG_LEVEL=1), **52.597825** (DEBUG_LEVEL=0)
          - And the warning message of `"Cache reservation for Ribbon filter banding failed due to cache full"` is indeed logged to console.

Reviewed By: pdillinger

Differential Revision: D31991348

Pulled By: hx235

fbshipit-source-id: 9336b2c60f44d530063da518ceaf56dac5f9df8e
facebook-github-bot pushed a commit that referenced this pull request Apr 6, 2022
…y limit (#9748)

Summary:
**Context:**
Through heap profiling, we discovered that `BlockBasedTableReader` objects can accumulate and lead to high memory usage (e.g., with `max_open_files = -1`). This memory is currently not saved, not tracked, not constrained, and not cache-evictable. As a first step to improve this, similar to #8428, this PR tracks an estimate of each `BlockBasedTableReader` object's memory in the block cache and fails future creations if that memory usage exceeds the available space in the cache at creation time.

**Summary:**
- Approximated the memory usage of the big memory users (`BlockBasedTable::Rep` and `TableProperties`), in addition to the existing estimates (filter block / index block / uncompression dictionary)
- Charge all of these memory usages to the block cache on `BlockBasedTable::Open()` and release them in `~BlockBasedTable()`, as there is no memory-usage fluctuation of concern in between (a rough sketch follows this list)
- Refactored CacheReservationManager (and its call sites) to add concurrency support for the BlockBasedTable use in this PR.
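
An illustrative sketch of the charge-on-Open / release-on-destruction pattern described above; the class names, the reservation API, and the byte estimate below are stand-ins, not the PR's exact code.

```cpp
#include <cstddef>
#include <memory>
#include <utility>

// Minimal stand-ins; the real classes are CacheReservationManager and
// BlockBasedTable in RocksDB.
struct Status {
  bool ok = true;
};

class CacheReservationManager {
 public:
  // Charges `bytes` against the block cache via dummy entries; a real
  // implementation fails when the cache is already full under a strict
  // capacity limit.
  Status MakeReservation(std::size_t bytes) {
    reserved_ += bytes;
    return Status{};
  }
  void Release(std::size_t bytes) { reserved_ -= bytes; }

 private:
  std::size_t reserved_ = 0;
};

// Sketch of a table reader that charges an estimate of its own footprint on
// Open() and releases it when destroyed.
class TableReaderSketch {
 public:
  static Status Open(CacheReservationManager* mgr,
                     std::unique_ptr<TableReaderSketch>* out) {
    // Approximate Rep + TableProperties footprint; value is illustrative.
    const std::size_t estimate = 64 * 1024;
    Status s = mgr->MakeReservation(estimate);
    if (!s.ok) {
      return s;  // fail reader creation when the global limit is exceeded
    }
    *out = std::unique_ptr<TableReaderSketch>(
        new TableReaderSketch(mgr, estimate));
    return Status{};
  }

  ~TableReaderSketch() {
    // The reservation lives exactly as long as the reader object.
    mgr_->Release(charged_bytes_);
  }

 private:
  TableReaderSketch(CacheReservationManager* mgr, std::size_t charged)
      : mgr_(mgr), charged_bytes_(charged) {}

  CacheReservationManager* mgr_;
  std::size_t charged_bytes_;
};
```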

Pull Request resolved: #9748

Test Plan:
- New unit tests
- db bench: `OpenDb` : **-0.52% in ms**
  - Setup `./db_bench -benchmarks=fillseq -db=/dev/shm/testdb -disable_auto_compactions=1 -write_buffer_size=1048576`
  - Repeated run with pre-change w/o feature and post-change with feature, benchmark `OpenDb`:  `./db_bench -benchmarks=readrandom -use_existing_db=1 -db=/dev/shm/testdb -reserve_table_reader_memory=true (remove this when running w/o feature) -file_opening_threads=3 -open_files=-1 -report_open_timing=true| egrep 'OpenDb:'`

#-run | (feature-off) avg milliseconds | std milliseconds | (feature-on) avg milliseconds | std milliseconds | change (%)
-- | -- | -- | -- | -- | --
10 | 11.4018 | 5.95173 | 9.47788 | 1.57538 | -16.87382694
20 | 9.23746 | 0.841053 | 9.32377 | 1.14074 | 0.9343477536
40 | 9.0876 | 0.671129 | 9.35053 | 1.11713 | 2.893283155
80 | 9.72514 | 2.28459 | 9.52013 | 1.0894 | -2.108041632
160 | 9.74677 | 0.991234 | 9.84743 | 1.73396 | 1.032752389
320 | 10.7297 | 5.11555 | 10.547 | 1.97692 | **-1.70275031**
640 | 11.7092 | 2.36565 | 11.7869 | 2.69377 | **0.6635807741**

-  db bench on write with cost to cache in WriteBufferManager (just in case this PR's CRM refactoring accidentally slows down anything in WBM) : `fillseq` : **+0.54% in micros/op**
`./db_bench -benchmarks=fillseq -db=/dev/shm/testdb -disable_auto_compactions=1 -cost_write_buffer_to_cache=true -write_buffer_size=10000000000 | egrep 'fillseq'`

#-run | (pre-PR) avg micros/op | std micros/op | (post-PR)  avg micros/op | std micros/op | change (%)
-- | -- | -- | -- | -- | --
10 | 6.15 | 0.260187 | 6.289 | 0.371192 | 2.260162602
20 | 7.28025 | 0.465402 | 7.37255 | 0.451256 | 1.267813605
40 | 7.06312 | 0.490654 | 7.13803 | 0.478676 | **1.060579461**
80 | 7.14035 | 0.972831 | 7.14196 | 0.92971 | **0.02254791432**

-  filter bench: `bloom filter`: **-0.78% in ms/key**
    - ` ./filter_bench -impl=2 -quick -reserve_table_builder_memory=true | grep 'Build avg'`

#-run | (pre-PR) avg ns/key | std ns/key | (post-PR)  ns/key | std ns/key | change (%)
-- | -- | -- | -- | -- | --
10 | 26.4369 | 0.442182 | 26.3273 | 0.422919 | **-0.4145720565**
20 | 26.4451 | 0.592787 | 26.1419 | 0.62451 | **-1.1465262**

- Crash test `python3 tools/db_crashtest.py blackbox --reserve_table_reader_memory=1 --cache_size=1` killed as normal

Reviewed By: ajkr

Differential Revision: D35136549

Pulled By: hx235

fbshipit-source-id: 146978858d0f900f43f4eb09bfd3e83195e3be28
facebook-github-bot pushed a commit that referenced this pull request May 17, 2022
Summary:
**Context:**
Previous PRs #9748, #9073, and #8428 added a separate flag for each charged memory area. Such an API design does not scale as we charge more and more memory areas. We also foresee an opportunity to consolidate this feature with other cache-usage-related features, such as `cache_index_and_filter_blocks`, using `CacheEntryRole`.

Therefore we decided to consolidate all these flags with `CacheUsageOptions cache_usage_options` and this PR serves as the first step by consolidating memory-charging related flags.

**Summary:**
- Replaced old API references with new ones, including making `kCompressionDictionaryBuildingBuffer` opt-out, and added a unit test for that (a hedged configuration sketch follows this list)
- Added missing db bench/stress test coverage for some memory-charging features
- Renamed the related test suites to indicate they share the theme of memory charging
- Refactored a commonly used mocked cache component in memory-charging-related tests to reduce code duplication
- Replaced the phrases "memory tracking" / "cache reservation" (other than CacheReservationManager-related ones) with "memory charging" as the standard description of this feature.
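
As an illustration of the consolidated configuration, here is a hedged sketch of opting the dictionary-building buffer out of charging, assuming the `CacheUsageOptions` / `CacheEntryRoleOptions` API this PR introduces; header paths and field names should be checked against the actual release.

```cpp
#include <memory>

#include <rocksdb/cache.h>
#include <rocksdb/table.h>

// Assumed usage of the consolidated memory-charging options; verify the
// exact names against include/rocksdb/table.h of the release you use.
rocksdb::BlockBasedTableOptions MakeTableOptions(
    std::shared_ptr<rocksdb::Cache> block_cache) {
  rocksdb::BlockBasedTableOptions table_options;
  table_options.block_cache = block_cache;
  // After this PR, kCompressionDictionaryBuildingBuffer is charged by
  // default (opt-out); this override explicitly disables charging for it.
  table_options.cache_usage_options.options_overrides.insert(
      {rocksdb::CacheEntryRole::kCompressionDictionaryBuildingBuffer,
       {/*charged=*/rocksdb::CacheEntryRoleOptions::Decision::kDisabled}});
  return table_options;
}
```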

Pull Request resolved: #9926

Test Plan:
- New unit test for opt-out `kCompressionDictionaryBuildingBuffer` `TEST_F(ChargeCompressionDictionaryBuildingBufferTest, Basic)`
- New unit test for option validation/sanitization `TEST_F(CacheUsageOptionsOverridesTest, SanitizeAndValidateOptions)`
- CI
- db bench (in case querying new options introduces regression) **+0.5% micros/op**: `TEST_TMPDIR=/dev/shm/testdb ./db_bench -benchmarks=fillseq -db=$TEST_TMPDIR  -charge_compression_dictionary_building_buffer=1(remove this for comparison)  -compression_max_dict_bytes=10000 -disable_auto_compactions=1 -write_buffer_size=100000 -num=4000000 | egrep 'fillseq'`

#-run | (pre-PR) avg micros/op | std micros/op | (post-PR)  micros/op | std micros/op | change (%)
-- | -- | -- | -- | -- | --
10 | 3.9711 | 0.264408 | 3.9914 | 0.254563 | 0.5111933721
20 | 3.83905 | 0.0664488 | 3.8251 | 0.0695456 | **-0.3633711465**
40 | 3.86625 | 0.136669 | 3.8867 | 0.143765 | **0.5289363078**

- db_stress: `python3 tools/db_crashtest.py blackbox  -charge_compression_dictionary_building_buffer=1 -charge_filter_construction=1 -charge_table_reader=1 -cache_size=1` killed as normal

Reviewed By: ajkr

Differential Revision: D36054712

Pulled By: hx235

fbshipit-source-id: d406e90f5e0c5ea4dbcb585a484ad9302d4302af
yoori pushed a commit to yoori/rocksdb that referenced this pull request on Nov 26, 2023 (same commit message as the #8506 commit above).
yoori pushed a commit to yoori/rocksdb that referenced this pull request on Nov 26, 2023 — the landed commit for this PR, with the same Context/Summary/Test Plan as the PR description above, plus:

Pull Request resolved: facebook/rocksdb#8428

Reviewed By: ajkr

Differential Revision: D30755305

Pulled By: hx235

fbshipit-source-id: 6e66665020b775154a94c4c5e0f2adaeaff13981
yoori pushed a commit to yoori/rocksdb that referenced this pull request on Nov 26, 2023 (same commit message as the #9001 commit above).