Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Bypass window store cache when doing windowed pull queries #6548

Merged

Conversation

AlanConfluent
Copy link
Member

@AlanConfluent AlanConfluent commented Oct 29, 2020

Description

Windowed pull queries have very mediocre performance. After lots of investigation, it was clear that this was due to the use of the streams cache. We experimented with disabling the cache, and performance is good for pull queries. The issue is that if we did this, then other areas such as persistent queries suffer in their performance.

This PR aims to disable the use of the streams cache for only windowed pull queries. There was a lot of discussion over whether a public API should be exposed in Streams to bypass the cache during a state store lookup. At the moment they don't want to do this with the existing API. Next public API will expose bypassing the cache.

In order to give ksqlDB good performance for all pull query types, it was decided to use some reflection to bypass the lack of proper public APIs and skip the caching layers.

Testing done

Ran local unit tests. Ran RestQueryTranslationTests. Ran pull query benchmarks for windowed queries.

Reviewer checklist

  • Ensure docs are updated if necessary. (eg. if a user visible feature is being added or changed).
  • Ensure relevant issues are linked (description should include text like "Fixes #")

Copy link
Contributor

@guozhangwang guozhangwang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM. Maybe we can only answer whether the performance is similar to the case with caching disabled via config directly by doing some benchmarks. I'm just wondering if calling reflection per-fetch call would be too costly or is it negligible. If benchmark shows latter case, we can consider caching the provider / type along with the CompositeReadOnlyXXStore (actually, does that work without changing streams code? anyways, we'll see if it is really necessary)

Copy link
Member

@vvcephei vvcephei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this, @AlanConfluent ! It looks good to me. I just had a couple of minor comments.

Comment on lines +129 to +131
if (!(store instanceof SessionStore)) {
break;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is interesting; why does it happen?

...Ah, I see. It's because the RocksDBSessionStore wraps a SegmentedBytesStore.

The rest seems fairly obvious, but this one might be subtle enough to warrant a comment.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's exactly as you mentioned. Stopping at the last SessionStore is sufficient since it is below the cache and still has the expected interface for fetching. Added a few comments about that.

}

@Test
public void shouldAvoidNonWindowStore() throws IllegalAccessException {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this test verifies that we bottom out at a wrapped store when it doesn't wrap a WindowStore, right?
It seems like a similar test could positively verify that we actually do skip the caching layer, or any other wrapped layer, but I didn't see that test. Did I miss it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic was technically cache agnostic. So long as the last WindowStore layer is not a cache, which we know it wouldn't be since the caching layer is built upon that, this check would hopefully be sufficient.

I changed the bypassing logic so that it verifies that it actually passes the caching layer when it's run and the tests verify that as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I removed the bypass check because it's possible that the user can configure it without a cache. I don't really want to try to reproduce the logic for determining if cache is enabled, so I removed this bypass check.

@AlanConfluent
Copy link
Member Author

AlanConfluent commented Oct 31, 2020

Maybe we can only answer whether the performance is similar to the case with caching disabled via config directly by doing some benchmarks. I'm just wondering if calling reflection per-fetch call would be too costly or is it negligible. If benchmark shows latter case, we can consider caching the provider / type along with the CompositeReadOnlyXXStore (actually, does that work without changing streams code? anyways, we'll see if it is really necessary)

From my benchmarking, the reflection doesn't appear to have a significant effect on performance. I think the performance is dominated by other factors, namely query parsing, and actually doing the reads from rocksdb. I'm hitting more or less the same numbers as with caching disabled.

On our standard single node windowed benchmark, we can do about 1800qps, and in benchmarks with this change, I'm seeing 1850 qps on my latest run, so the overhead is well within the noise.

@guozhangwang
Copy link
Contributor

On our standard single node windowed benchmark, we can do about 1800qps, and in benchmarks with this change, I'm seeing 1850 qps on my latest run, so the overhead is well within the noise.

That's great!!

@@ -129,6 +129,7 @@
.withProperty(KsqlRestConfig.ADVERTISED_LISTENER_CONFIG, "http://localhost:8188")
.withProperty(KsqlConfig.KSQL_QUERY_PULL_ENABLE_STANDBY_READS, true)
.withProperty(KsqlConfig.KSQL_STREAMS_PREFIX + "num.standby.replicas", 1)
.withProperty(KsqlConfig.KSQL_STREAMS_PREFIX + "cache.max.bytes.buffering", 10000)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you add these?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just added them to one of the pull query tests because I wanted to exercise the cache bypassing code in a functional test. It's hard to set up a realistic scenario in unit tests since it's not easy to construct all of the state store layers as they actually exist.

If something basic goes wrong in the reflection or types with underlying streams when faced with the cache, these tests should fail. The real test for this will be the automated benchmarks which test that this bypassing effect works as intended.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, makes sense

Copy link
Member

@vpapavas vpapavas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you Alan! LGTM!

@AlanConfluent AlanConfluent changed the title Bypass window store cache when doing windowed pull queries fix: Bypass window store cache when doing windowed pull queries Nov 2, 2020
@AlanConfluent AlanConfluent merged commit 8f84e41 into confluentinc:master Nov 2, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants