Avoid double-compacting data in bottom level in manual compactions #5138

Conversation

miasantreble (Contributor)

Depending on the config, manual compaction (leveled compaction style) performs the following compactions:
L0 -> L1
L1 -> L2
...
Ln-1 -> Ln
Ln -> Ln
The final Ln -> Ln compaction is partly unnecessary because it recompacts all the files that were just generated by the Ln-1 -> Ln compaction. We should avoid recompacting such files. This rule should be applied to Lmax only.
Resolves issue #4995
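
As a rough illustration of the intended fix (not the actual RocksDB code), one can imagine recording the largest file number that exists when CompactRange() starts and then, when picking the final Lmax compaction, dropping any file with a larger number, since such a file was just produced by the Ln-1 -> Ln step. The FileMeta struct and function name below are hypothetical stand-ins:

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a table file's metadata; the real RocksDB type
// (FileMetaData) carries much more state.
struct FileMeta {
  uint64_t file_number;
};

// Keep only bottom-level files that already existed when CompactRange()
// started; files with a larger number were just written by the Ln-1 -> Ln
// step and hold no older data that still needs to be pushed down.
std::vector<FileMeta> FilterNewlyCompactedFiles(
    const std::vector<FileMeta>& bottom_level_inputs,
    uint64_t max_file_num_at_compact_range) {
  std::vector<FileMeta> shrunk;
  for (const FileMeta& f : bottom_level_inputs) {
    if (f.file_number <= max_file_num_at_compact_range) {
      shrunk.push_back(f);
    }
  }
  return shrunk;
}
```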

@miasantreble (Contributor, Author)

Hey @ajkr, do you mind taking a quick look to see if this approach makes sense? Thanks!

@miasantreble added the WIP (work in progress) label on Apr 2, 2019
@ajkr (Contributor) commented Apr 2, 2019

Thanks for doing this, will take a look. BTW, do you mind reminding me what @yoshinorim's reason was for only operating on the bottom level? I remember he wrote some insightful reason on the task.

@ajkr (Contributor) commented Apr 2, 2019

Never mind, I think it makes sense. We wouldn't want to skip files in middle levels because the latest data for the range (as of the time CompactRange is called) might not be included by other compactions happening in parallel. So a higher file number in those levels doesn't necessarily mean the newest data has been pushed down.

if (inputs_shrinked.empty()) {
return nullptr;
}
inputs.files.swap(inputs_shrinked);
Contributor

Is it a problem that the files may not cover a contiguous key-range after this code filters some out?

I wonder if we should pick each compaction consisting of one contiguous range of files that all need to be compacted. We can then set compaction_end to the end of that range and another call will be made for the next contiguous range, etc.
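
For illustration, a minimal sketch of this alternative (hypothetical types and names, not the PR's code): split the level's files into maximal contiguous runs of files that predate the CompactRange() call, and compact each run separately, advancing compaction_end between calls:

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for RocksDB's FileMetaData, for illustration only.
struct FileMeta {
  uint64_t file_number;
};

// Split the level's files (sorted by smallest key) into maximal contiguous
// runs of "old" files, i.e. files that existed when CompactRange() started.
// Each run could then be compacted in its own call, with compaction_end set
// to the end of the run so the driver comes back for the next one.
std::vector<std::vector<FileMeta>> SplitIntoOldRuns(
    const std::vector<FileMeta>& level_files,
    uint64_t max_file_num_at_compact_range) {
  std::vector<std::vector<FileMeta>> runs;
  std::vector<FileMeta> cur;
  for (const FileMeta& f : level_files) {
    if (f.file_number > max_file_num_at_compact_range) {
      // A newly generated file ends the current run.
      if (!cur.empty()) {
        runs.push_back(std::move(cur));
        cur.clear();
      }
    } else {
      cur.push_back(f);
    }
  }
  if (!cur.empty()) {
    runs.push_back(std::move(cur));
  }
  return runs;
}
```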

Contributor

Hm, it looks like the ExpandInputsToCleanCut call that follows refills any holes if we pick non-contiguous files here. It seems safe, but OTOH some cases may be suboptimal. For example, the first and last file might be the only ones that need compaction (i.e., inputs_shrinked contains only those two). But then ExpandInputsToCleanCut will repopulate the input files with everything in the range, which leaves us with no files skipped.

@miasantreble (Contributor, Author) Apr 2, 2019

Thanks @ajkr!
Let me see if I understand the idea correctly:
In CompactionPicker::CompactRange, after the call to ExpandInputsToCleanCut and SetupOtherInputs, we would still filter out new files larger than max_sst_file_number, then perhaps call a new function to break up inputs_shrinked into contiguous ranges of files, and for each range call SetupOtherInputs and then create/register one new Compaction?
That would create a number of Compaction objects, but the caller DBImpl::RunManualCompaction seems to expect only one Compaction?

Contributor Author

I guess if it's only suboptimal in some corner cases but helps in the general case, we can live with that. Or, if really needed, we could also do some trick in ExpandInputsToCleanCut, but that might be a bit error-prone.

Contributor

Rather than leaving holes in the middle, should we just cut the compaction?
If we have files [a b] [c d] [e f], and [c d] is newly generated, should we pick [a b] to compact itself, and then [e f] in a following one?

@miasantreble (Contributor, Author) commented Apr 2, 2019

@ajkr I was in a meeting earlier, but here is the explanation from @yoshinorim:

For example:
Suppose CompactRange() is called and the max SST file number at that time is 100. Currently, compactions will happen like this:
001.sst, 002.sst, 003.sst (L0) -> 101.sst (L1)
004.sst, 005.sst, 101.sst, 006.sst (L1) -> 102.sst (L2)
011.sst, 012.sst, 102.sst, 013.sst (L2) -> 103.sst (L3)
If you skip SST files with numbers greater than the max for all levels, several files (101.sst in L1 and 102.sst in L2 in this case) will remain in higher levels, which may waste some space. That's why I think the only reasonable level at which to skip compaction is Lmax.

I believe it's the exact same reason that you summarized in the next comment

@ajkr (Contributor) commented Apr 3, 2019 via email

@miasantreble (Contributor, Author)

I see what you mean now: RunManualCompaction will repeatedly call manual.cfd->CompactRange until manual.done is set to true, and every time BackgroundCompaction runs it updates begin with manual_end. So I guess all we need to do is, in CompactionPicker, when filtering inputs by file number, stop as soon as we find a newly created file, let ExpandInputsToCleanCut do its job, and run the compaction using the truncated inputs. Will try out this idea. Thanks!
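
A hedged sketch of that truncation step (again with hypothetical names; the real logic lives in CompactionPicker and operates on FileMetaData): keep bottom-level inputs, in key order, only up to the first file created after the manual compaction began. RunManualCompaction then calls back with an advanced begin key until the whole range is covered.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for RocksDB's FileMetaData, for illustration only.
struct FileMeta {
  uint64_t file_number;
};

// Keep the prefix of bottom-level input files (sorted by key) that predates
// the CompactRange() call; stop at the first newly generated file so the
// picked inputs stay contiguous and ExpandInputsToCleanCut has no holes to
// refill.
std::vector<FileMeta> TruncateAtFirstNewFile(
    const std::vector<FileMeta>& bottom_level_inputs,
    uint64_t max_file_num_at_compact_range) {
  std::vector<FileMeta> truncated;
  for (const FileMeta& f : bottom_level_inputs) {
    if (f.file_number > max_file_num_at_compact_range) {
      break;  // first newly created file ends this compaction's inputs
    }
    truncated.push_back(f);
  }
  return truncated;
}
```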

@siying (Contributor) commented Apr 10, 2019

@miasantreble oops, I didn't see your latest comments. I was just trying to mention the same thing as you guys concluded:)

num = comp_stats2[2].num_input_files_in_output_level;
ASSERT_EQ(num, 0);
}

Contributor

We have compaction_picker_test.cc. Can we have more thorough test coverage there for some corner cases?

@siying (Contributor) commented Apr 10, 2019

I gave it a pass, and I believe we are on the right track!

@miasantreble force-pushed the bottom-level-double-compaction branch from 39ea4b9 to 1127a66 on April 11, 2019 at 23:06
@miasantreble removed the WIP (work in progress) label on Apr 11, 2019
@miasantreble force-pushed the bottom-level-double-compaction branch from 8652a22 to 9305fe1 on April 14, 2019 at 19:34
@miasantreble force-pushed the bottom-level-double-compaction branch from 9305fe1 to 855556d on April 14, 2019 at 20:32
@siying (Contributor) left a comment

LGTM. Defer to @ajkr for approval.

@facebook-github-bot (Contributor) left a comment

@miasantreble has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@miasantreble (Contributor, Author)

@ajkr do you mind taking another look before we land the PR? Thanks!

@ajkr (Contributor) left a comment

Makes sense to me. Nice!

HISTORY.md Outdated
@@ -18,6 +18,7 @@
## 6.1.0 (3/27/2019)
### New Features
* Introduce two more stats levels, kExceptHistogramOrTimers and kExceptTimers.
* Added BottommostLevelCompaction::kForceOptimized to avoid double compacting newly compacted files in bottom level compaction of manual compaction.
Contributor

I guess it should be under "Unreleased" in the master branch and under "6.1.x" in the 6.1.fb branch, assuming you're going to backport it. Although perhaps the protocol changed after my tenure.

Contributor Author

Right, thanks for pointing that out. The protocol hasn't changed yet :)

HISTORY.md Outdated
@@ -18,6 +18,7 @@
## 6.1.0 (3/27/2019)
### New Features
* Introduce two more stats levels, kExceptHistogramOrTimers and kExceptTimers.
* Added BottommostLevelCompaction::kForceOptimized to avoid double compacting newly compacted files in bottom level compaction of manual compaction.
Contributor

For my understanding, is it an option because kForce should be able to force recompaction even on files that were trivially moved to the bottom level?

Contributor Author

Yes, we decided to make it an option because tools like option_change_migration and tests like DBCompactionTestWithParam.ConvertCompactionStyle expect manual compactions to generate a single file in the bottom level.
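
For reference, a minimal usage sketch of the resulting option (assuming an already-opened rocksdb::DB* named db): requesting kForceOptimized instead of kForce lets the manual compaction skip bottom-level files it has just written, while kForce still rewrites the whole bottommost level.

```cpp
#include <cassert>
#include "rocksdb/db.h"
#include "rocksdb/options.h"

// Run a full-range manual compaction that avoids recompacting files the
// manual compaction itself just produced in the bottommost level.
void FullManualCompactionOptimized(rocksdb::DB* db) {
  rocksdb::CompactRangeOptions cro;
  cro.bottommost_level_compaction =
      rocksdb::BottommostLevelCompaction::kForceOptimized;
  rocksdb::Status s =
      db->CompactRange(cro, /*begin=*/nullptr, /*end=*/nullptr);
  assert(s.ok());
}
```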

@facebook-github-bot (Contributor)

@miasantreble has updated the pull request.

@facebook-github-bot (Contributor) left a comment

@miasantreble has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot (Contributor)

@miasantreble merged this pull request in baa5302.

@miasantreble deleted the bottom-level-double-compaction branch on April 18, 2019 at 21:36
vagogte pushed a commit to vagogte/rocksdb that referenced this pull request Jun 18, 2019
…acebook#5138)

Pull Request resolved: facebook#5138

Differential Revision: D14940106

Pulled By: miasantreble

fbshipit-source-id: 8d3cf5507a17e76f3333cfd4bac5256d005636e5