-
Notifications
You must be signed in to change notification settings - Fork 6.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reduce iterator key comparison for upper/lower bound check #5111
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yiwu-arbug thank you for your continuous contibution!
I think it is a great approach in general. My only concern is that HintWithinLowerBound() is a virtual function, and calling it for every key is kind of expensive. And this is really virtual, in the sense that compiler or CPU really can't predict the class is BlockBasedTableIterator, or LevelIterator at all. I wonder whether there is a better way to solve a problem. Is it possible to make a protected field of InernalIterator and return it as a non-virtual function, for example?
@siying Valid concern. But to do that, caller (e.g. LevelIterator) will need to pass down a callback to callee (e.g. BlockBasedTableIterator) and the callee will need to use this callback to update the protected field of caller. I feel it is making iterators of different layers couple together more. Not sure if there's a better way to avoid the virtual call. |
Instead of relying on callback, can levelIterator fetch SST file iterator's value and put it in local variable after each Seek(), Next(), Prev(), etc? Is it possible? |
Sure. That's doable. |
Still WIP |
@yiwu-arbug do you still plan to move away from the virtual function? |
yes, still working on it. |
Signed-off-by: Yi Wu <yiwu@pingcap.com>
Signed-off-by: Yi Wu <yiwu@pingcap.com>
@siying I tried to remove virtual call, and the patch is here: https://github.com/yiwu-arbug/rocksdb/commit/01dd9aac1a92b83c3a4c8086400d27595667594a I also did a quick benchmark for short range scan and #5142 and #5111 combined is neutral vs. master.
Shall we keeping the virtual call? |
@yiwu-arbug Valid() seems to be pretty expensive call. Do you need to call it? |
Changed my second guess is: passing the hint layer by layer up the chain is a lot of random memory access, which is more expensive than virtual table lookup. |
Signed-off-by: Yi Wu <yiwu@pingcap.com>
Signed-off-by: Yi Wu <yiwu@pingcap.com>
Signed-off-by: Yi Wu <yiwu@pingcap.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not able to understand why moving away from virtual function is slower. Do you have any theory?
db/version_set.cc
Outdated
bool hint_within_upper_bound = | ||
read_options_.iterate_upper_bound != nullptr && | ||
user_comparator_.Compare(ExtractUserKey(file_largest_key(file_index_)), | ||
*read_options_.iterate_upper_bound) < 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do allow users to change upper bound for a new Seek(). So maybe we should do it in InitFileIterator and persistent the result in this iterator, even if we don't create a new file iterator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good point, will fix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wait, the upper bound check result won't change even in InitFileIterator
we don't create new file iterator, so it is safe to not updating it in that case. Do I miss anything?
@yiwu-arbug comparing the virtual function and non-virtual function cases, are they doing the same amount of comparisons? |
@siying regarding virtual call, my theory is we shall not just look at the part that makes the function call, but overall cpu cost. In the non-virtual attempt, we are passing the result bottom up, which also takes cpu cycle. Those cpu cycles may look trivial, but the virtual call that they comparing with is not that significant as well. |
@siying yes, I verified and the number of comparison is exactly the same. |
@yiwu-arbug it's roughly the same number of calls, isn't? If it is non-virtual, it's going to be inlined so it's not even a function call, but directly reading from memory location. |
@siying but the total cpu cost also include the effort to make it non-virtual. At least in my patch, the return value is copied several times layer by layer up, which is maybe not cheaper than just a virtual table lookup. |
@yiwu-arbug isn't the CPU mostly reading from one variable to another? |
@siying Forgive for my laziness but resolving issue with virtual calls is not the main goal of this patch. As I can see the method we are trying is at least making the code very messy. And we don't use the same pattern in the rest of the rocksdb code. As per-my bench the virtual call doesn't cause noticeable regression, but if it does I'll come back to fix. |
table/block_based_table_reader.h
Outdated
@@ -705,6 +710,8 @@ class BlockBasedTableIterator : public InternalIteratorBase<TValue> { | |||
TBlockIter block_iter_; | |||
bool block_iter_points_to_real_block_; | |||
bool is_out_of_bound_ = false; | |||
// Whether current data block being fully within iterate upper bound. | |||
bool data_block_within_upper_bound_ = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be default false to be safe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Neither one is safer to me:
- when used to check if next block needs to be read, setting to true is safer, it allow the next block to be read.
- when used to check whether upper bound needs to be checked, setting to false is safer, it force upper bound to be checked.
The initial value shouldn't be used anyway, but I don't know a way to assert that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then it's better to find why it failed in the first place then. It introduces a potential bug. When you try to reproduce, which flavor did you run? Did you run ASAN?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
btw, I don't understand 1. If the value is false, will we skip the next block? Why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if the current block is not completely within upper bound, the next block will be completely out of upper bound, so it's safe to skip.
@siying was the test fixed? |
Yes it is fixed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I read the code and suspect that the failure happens in read_options_.iterate_upper_bound == nullptr
case.
I think so. Do you prefer null-checking |
@yiwu-arbug if possible, I hope we avoid extra branching for |
@yiwu-arbug has updated the pull request. Re-import the pull request |
@siying make sense. done. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@siying has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Landed. Thank you for your contribution! |
Thanks much for the help to get the tricky patch merged! |
…bleIterator Summary: PR facebook#5111 reduced the number of key comparisons when iterating with upper/lower bounds; however, this caused a regression for MyRocks. Reverting to the previous behavior as a hotfix. Test Plan: make check + affected MyRocks test
…bleIterator (#5428) Summary: PR #5111 reduced the number of key comparisons when iterating with upper/lower bounds; however, this caused a regression for MyRocks. Reverting to the previous behavior in BlockBasedTableIterator as a hotfix. Pull Request resolved: #5428 Differential Revision: D15721038 Pulled By: ltamasi fbshipit-source-id: 5450106442f1763bccd17f6cfd648697f2ae8b6c
…acebook#5111)" This reverts commit f3a7847.
…5111) Summary: Previously if iterator upper/lower bound presents, `DBIter` will check the bound for every key. This patch turns the check into per-file or per-data block check when applicable, by checking against either file largest/smallest key or block index key. Pull Request resolved: facebook#5111 Differential Revision: D15330061 Pulled By: siying fbshipit-source-id: 8a653fe3cd50d94d81eb2d13b087326c58ee2024
…bleIterator (facebook#5428) Summary: PR facebook#5111 reduced the number of key comparisons when iterating with upper/lower bounds; however, this caused a regression for MyRocks. Reverting to the previous behavior in BlockBasedTableIterator as a hotfix. Pull Request resolved: facebook#5428 Differential Revision: D15721038 Pulled By: ltamasi fbshipit-source-id: 5450106442f1763bccd17f6cfd648697f2ae8b6c
…acebook#5111)" (facebook#5440) Summary: This reverts commit f3a7847. Pull Request resolved: facebook#5440 Differential Revision: D15765967 Pulled By: ltamasi fbshipit-source-id: d027fe24132e3729289cd7c01857a7eb449d9dd0
…pt) (#5468) Summary: This is a second attempt for #5111, with the fix to redo iterate bounds check after `SeekXXX()`. This is because MyRocks may change iterate bounds between seek. See #5111 for original benchmark result and discussion. Closes #5463. Pull Request resolved: #5468 Test Plan: Existing rocksdb tests, plus myrocks test `rocksdb.optimizer_loose_index_scans` and `rocksdb.group_min_max`. Differential Revision: D15863332 fbshipit-source-id: ab4aba5899838591806b8673899bd465f3f53e18
…pt) (facebook#5468) Summary: This is a second attempt for facebook#5111, with the fix to redo iterate bounds check after `SeekXXX()`. This is because MyRocks may change iterate bounds between seek. See facebook#5111 for original benchmark result and discussion. Closes facebook#5463. Pull Request resolved: facebook#5468 Test Plan: Existing rocksdb tests, plus myrocks test `rocksdb.optimizer_loose_index_scans` and `rocksdb.group_min_max`. Differential Revision: D15863332 fbshipit-source-id: ab4aba5899838591806b8673899bd465f3f53e18
Summary:
Previously if iterator upper/lower bound presents,
DBIter
will check the bound for every key. This patch turns the check into per-file or per-data block check when applicable, by checking against either file largest/smallest key or block index key.Test Plan:
Test 1: sequential scan
Use the following db_bench test, and compare
The benchmark do multiple (relatively long) range scan on a DB that's uncompressed, with smaller SST files, and can full cached in memory, to amplify the benefit.
Throughput & latency:
~10% throughput improvement.
Key comparison count (reported by perf context):
~40% less key comparison.
Also verified block_cache_hit and block_read_count are all the same, as a proxy to verify the behavior of the iterator is the same.
CPU usage ("task-clock (msec)" as reported by
perf stat
):~9% cpu usage reduction.
Test 2: reverse scan
Test with the same benchmark but added
--reverse_iterator=true
.Throughput & latency:
~2% throughput improvement
Key comparison count:
~14% less key comparison
CPU usage:
~2% cpu usage reduction
Also verified the optimization works when partitioned-index is being used.