feat: Make pull queries streamed asynchronously #6813

Merged

Conversation

AlanConfluent (Member)

Description

Before this PR, pull query results were gathered in memory as a TableRows object containing a List<List<?>> of rows, across all of the endpoints where pull queries are exposed.

This change adds the class PullQueryQueue, a queue of result rows meant to decouple the producers and consumers of row data, allowing partially complete producer calls to enqueue rows from both local and remote sources of data. This largely follows the methodology used by push queries. This allows:

  • Fetches of large batches of data to begin returning results immediately, rather than waiting until the batch from every partition is complete before returning the first row.
  • Fetches of amounts of data that would be prohibitively expensive, or impossible, to hold entirely in memory under the old methodology; this change keeps relatively few rows in memory at a time and can apply back-pressure to the producers if the queue fills up.
  • Queries like table scans and range queries, which may return many more rows than past pull queries allowed for.

Specifically, on each end of the queue:

  • New rows are produced and enqueued by PullPhysicalPlan if the request is being handled locally, or by HARouting if the request must be forwarded to another node.
  • Rows are consumed by the request thread of the endpoint, as sketched below.
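
As a rough mental model of the queue's contract, here is an illustrative sketch (not the PR's actual PullQueryQueue: the capacity and signatures are assumptions, though acceptRow and drainTo mirror method names visible in the diff):

```java
import java.util.Collection;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

/** Illustrative sketch of a bounded row queue that applies back-pressure to producers. */
final class RowQueueSketch {

  // A bounded capacity is what provides back-pressure; the real capacity is a guess here.
  private final BlockingQueue<List<?>> rows = new ArrayBlockingQueue<>(50);

  /** Producer side (local plan or forwarded fetch): blocks while the queue is full. */
  void acceptRow(final List<?> row) throws InterruptedException {
    rows.put(row);
  }

  /** Consumer side (endpoint request thread): moves buffered rows into the sink. */
  int drainTo(final Collection<? super List<?>> sink) {
    return rows.drainTo(sink);
  }

  /** Consumer side: wait briefly for a single row; returns null if nothing arrives. */
  List<?> pollRow(final long timeoutMs) throws InterruptedException {
    return rows.poll(timeoutMs, TimeUnit.MILLISECONDS);
  }
}
```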

For each of the endpoints:

  • StreamedQueryResource: This returns a chunked-encoding response via the class PullQueryStreamWriter, a StreamingOutput response type that periodically reads from the queue and writes a chunk to the response (see the sketch after this list).
  • QueryEndpoint: This uses a KsqlPullQueryHandle with the existing BlockingQueryPublisher to connect a publisher to the queued data.
  • PullQueryPublisher: This uses PullQuerySubscription, a new PollingSubscription, rather than the former block of logic that fed raw rows to the subscriber.
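
To make the StreamedQueryResource flow concrete, here is a minimal sketch of a StreamingOutput that drains a queue and flushes chunks. It is illustrative only: toString() stands in for real JSON serialization, and the completion signalling is an assumption, so it is not the PR's PullQueryStreamWriter.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import javax.ws.rs.core.StreamingOutput;

/** Rough shape of a StreamingOutput that drains a row queue and flushes chunks. */
final class ChunkedRowWriterSketch implements StreamingOutput {
  private final BlockingQueue<List<?>> queue;  // fed by the local plan or by forwarding
  private volatile boolean complete;           // flipped once all rows have been enqueued

  ChunkedRowWriterSketch(final BlockingQueue<List<?>> queue) {
    this.queue = queue;
  }

  void markComplete() {
    complete = true;
  }

  @Override
  public void write(final OutputStream output) throws IOException {
    try {
      while (true) {
        // Wait briefly for the next row; null means nothing arrived in time.
        final List<?> row = queue.poll(100, TimeUnit.MILLISECONDS);
        if (row == null) {
          if (complete && queue.isEmpty()) {
            break;                             // producer finished and queue drained
          }
          continue;
        }
        output.write((row.toString() + "\n").getBytes(StandardCharsets.UTF_8));
        output.flush();                        // each flush ends up as an HTTP chunk
      }
    } catch (final InterruptedException e) {
      Thread.currentThread().interrupt();      // stop streaming if the thread is interrupted
    }
  }
}
```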

Testing done

Ran unit and integration tests. Also manually experimented with batches of rows being streamed back to the user by introducing artificial delays.

Reviewer checklist

  • Ensure docs are updated if necessary (e.g. if a user-visible feature is being added or changed).
  • Ensure relevant issues are linked (description should include text like "Fixes #")

@AlanConfluent AlanConfluent requested a review from a team as a code owner December 22, 2020 03:46
@agavra (Contributor) left a comment

Starting to lose steam, so I'll revisit in a bit. Note to self: I've reviewed up until PullQueryStreamWriter and will continue later today!

// If the queue has been closed, we stop adding rows and cleanup.
break;
}
pullQueryQueue.acceptRow(rowFactory.apply(row, schema));
Contributor:

with this asynchronous model, how would we handle operators that require sorting? e.g. SELECT * FROM foo ORDER BY date

One option might be to block at the operator node itself, but I'm not sure what the canonical way of handling this is.

Member Author:

I think what you're describing is likely how this would be done. Obviously any operation that cannot be done on a per-row basis complicates streaming since it requires caching many rows from the lower layer before returning even a single one.

Probably the easiest thing to do would be to allow sorting up to N entries (or M bytes of memory) and throw an error if that limit is hit. If there were sufficient interest, we might consider something more complex: you can do disk-based sorts if memory is limited. Conventional DBs handle exactly this case; I believe they sort in memory up to a point before spilling to disk and doing a kind of merge sort from disk.
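
As an illustration of the "sort up to N buffered rows, otherwise fail" idea (a hypothetical sketch, not anything in the PR):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;

/** Sketch of "buffer and sort up to N rows from a streaming source, otherwise fail". */
final class BoundedSortSketch {

  static <T> List<T> sortOrThrow(final Iterator<T> rows,
                                 final Comparator<T> comparator,
                                 final int maxRows) {
    final List<T> buffered = new ArrayList<>();
    while (rows.hasNext()) {
      buffered.add(rows.next());
      if (buffered.size() > maxRows) {
        // Beyond this point a real implementation might spill to disk and merge-sort;
        // the simple version just refuses rather than exhausting memory.
        throw new IllegalStateException(
            "ORDER BY requires buffering more than " + maxRows + " rows");
      }
    }
    buffered.sort(comparator);
    return buffered;
  }
}
```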

Contributor:

Just throwing out some more ideas here: if a query touches multiple partitions, we can let the sorting happen first on each partition itself, and then at the operator we can do a merge sort over the returned streams of sorted rows, which would take only constant space.

More generally, if we allow aggregations in the future, similar push-downs can be applied to them as well.
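
A sketch of that constant-space merge (illustrative only; the row and operator types here are placeholders):

```java
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

/** Constant-space k-way merge of per-partition streams that are already sorted. */
final class MergeSortedStreamsSketch {

  static <T> Iterator<T> merge(final List<Iterator<T>> sortedPartitions,
                               final Comparator<T> comparator) {
    // Holds at most one buffered element per partition, so space is O(#partitions).
    final PriorityQueue<Entry<T>> heap =
        new PriorityQueue<>((a, b) -> comparator.compare(a.value, b.value));
    for (final Iterator<T> it : sortedPartitions) {
      if (it.hasNext()) {
        heap.add(new Entry<>(it.next(), it));
      }
    }
    return new Iterator<T>() {
      @Override
      public boolean hasNext() {
        return !heap.isEmpty();
      }

      @Override
      public T next() {
        final Entry<T> smallest = heap.poll();   // the overall next row in sort order
        if (smallest.source.hasNext()) {
          heap.add(new Entry<>(smallest.source.next(), smallest.source));
        }
        return smallest.value;
      }
    };
  }

  private static final class Entry<T> {
    final T value;
    final Iterator<T> source;

    Entry(final T value, final Iterator<T> source) {
      this.value = value;
      this.source = source;
    }
  }
}
```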

@@ -1797,22 +1797,23 @@
"statements": [
"CREATE STREAM INPUT (ID DOUBLE KEY, IGNORED INT) WITH (kafka_topic='test_topic', value_format='JSON');",
"CREATE TABLE AGGREGATE AS SELECT ID, COUNT(1) AS COUNT FROM INPUT WINDOW TUMBLING(SIZE 1 SECOND) GROUP BY ID;",
"SELECT * FROM AGGREGATE WHERE ID IN (10.1, 8.1);",
"SELECT * FROM AGGREGATE WHERE ID IN (10.5, 8.5);",
Contributor:

Out of interest, why did you make this change?

Member Author:

Because in the CPU's floating-point (i.e. binary) representation, 0.1 cannot be represented exactly: https://www.exploringbinary.com/why-0-point-1-does-not-exist-in-floating-point

Doing a key lookup using equality with such a value sometimes creates weird flaky results: https://floating-point-gui.de/errors/comparison/

I was banging my head against the wall over these tests not returning the row I thought they should until I tried this. Look at the result and you'll see that the existing answer is wrong! 0.5 can be represented exactly in binary. Given all this, it probably doesn't make sense to do double key lookups at all, but I guess we allow it.

User beware!
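
A standalone illustration of the pitfall (not from the PR): values like 10.1 have no exact binary representation, while halves such as 10.5 do, so equality checks on the former are fragile.

```java
import java.math.BigDecimal;

public class DoubleKeyPitfall {
  public static void main(String[] args) {
    // 0.1 has no exact binary representation, so arithmetic drifts:
    System.out.println(0.1 + 0.2 == 0.3);       // false
    System.out.println(0.1 + 0.2);              // 0.30000000000000004

    // Halves are exact sums of powers of two, so they compare cleanly:
    System.out.println(0.25 + 0.25 == 0.5);     // true

    // The exact binary values actually stored for the literals in the test:
    System.out.println(new BigDecimal(10.1));   // 10.0999999999999996...
    System.out.println(new BigDecimal(10.5));   // 10.5
  }
}
```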

Contributor:

🙀

Comment on lines 71 to 74
// This allows us to hit the limit without having to queue one last row
if (queue.isEmpty()) {
ctx.runOnContext(v -> sendComplete());
} else {
ctx.runOnContext(v -> doSend());
}
Contributor:

I'm confused. Why is this necessary? Why would we otherwise have to enqueue another row?

Member Author:

With push queries, the only thing that would stop a stream was a limit being hit. The way the code was set up, a row was enqueued and then, if that made the query hit the limit, it would fire off the limit handler. Look below in doSend(). Effectively, the only way to trigger sendComplete was to push one more row into the queue. In the pull query case, when we push a row onto the queue we don't know if it's the last one until we try to fetch another.

To fix this, I could have either changed pull query behavior to match push queries in that respect, or made this publisher do a sendComplete without having to enqueue another row; the latter is simpler and makes a fair amount of sense.


PullQueryExecutionUtil.checkRateLimit(rateLimiter);

final PullQueryResult result = ksqlEngine.executePullQuery(
Contributor:

Previously we were calling this inside the Subscription#request method; now we're doing it before calling the subscriber. I guess this is intentional because of the new asynchronous nature of pull queries? Just wanted to confirm that I understand correctly (and to confirm that it is OK to do this here in the subscribe method instead of on the first poll call).

Member Author:

That's a good question. I think it was largely done that way so that we could have a big try/catch around it and ensure we called subscriber.onError(e); if we hit an exception. In this case, even though it's kicked off immediately, it still sets result.onException(this::setError); which should allow us to handle errors correctly.

I roughly patterned this off of push queries in PushQueryPublisher, and they also start the query immediately, so as far as I can tell there shouldn't be a big difference.

@agavra (Contributor) left a comment

A few more comments; all that's left now are the tests :) I'll get to them tomorrow morning.

writeRow(toProcess, head, sb);
if (sb.length() >= FLUSH_SIZE_BYTES || (clock.millis() - lastFlush) >= MAX_FLUSH_MS) {
output.write(sb.toString().getBytes(StandardCharsets.UTF_8));
output.flush();
Contributor:

I'm wondering how this works from a client perspective. It looks like we'd be flushing incomplete JSON, right? Is that what was happening previously for pull queries? Would clients know how to handle that properly?

Member Author:

Correct, we flush incomplete JSON. This is how push queries work at the moment, so I just copied that behavior (check out QueryStreamWriter). If you look at KsqlTarget in this PR, you'll see how we handle these partial results for forwarded requests. I assume the CLI does something similar.

Contributor:

For my own clarification: were you referring to the JSON format of the HTTP response, or the JSON format of a single record?

}

@Override
public void write(final OutputStream output) {
@agavra (Contributor), Jan 14, 2021:

There are a lot of different levels going on here, and wrapping my head around the passing around of different heads/string buffers and flushing in the middle (etc.) is a struggle. At a minimum, I think we could do with inline comments and javadoc (including on the private methods), though I urge you to take another stab and see if you can refactor this to make it cleaner and reduce the cyclomatic complexity (this is one of those times where I think checkstyle got it right; it needs improvement).

There's a lot of complexity in producing proper JSON (commas in the right places, the header row coming first, etc.). I wonder if there's a library we can use to make that simpler.

Member Author:

I added a ton of comments and simplified the methods. Hopefully you agree it's easier to understand.

I am using objectMapper.writeValueAsString to do the heavy lifting of JSON serialization. The comma handling is a bit of a pain that I think I have to do myself, simply because of the partial flushing: I can't rely on a JSON library to write partial JSON (or at least I don't know of any that do). This is exactly what's done for push queries today. I'll see if some internet searching turns up any ideas.
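
For illustration only, a stripped-down version of the "serialize each row with Jackson, handle array commas by hand, flush partial JSON" approach. Note this sketch prepends the separator to each subsequent row; the PR instead keeps the comma at the end of the previous line, which is why it tracks a one-row lookahead.

```java
import com.fasterxml.jackson.databind.ObjectMapper;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;

/** Streams a JSON array one row at a time, flushing a valid prefix of the final document. */
final class PartialJsonWriterSketch {
  private static final ObjectMapper MAPPER = new ObjectMapper();

  static void streamRows(final Iterable<List<?>> rows, final OutputStream out)
      throws IOException {
    out.write("[".getBytes(StandardCharsets.UTF_8));
    boolean first = true;
    for (final List<?> row : rows) {
      final StringBuilder sb = new StringBuilder();
      if (!first) {
        sb.append(",");                       // manual comma handling between elements
      }
      sb.append("\n").append(MAPPER.writeValueAsString(row));
      out.write(sb.toString().getBytes(StandardCharsets.UTF_8));
      out.flush();                            // the client sees incomplete-but-growing JSON
      first = false;
    }
    out.write("\n]".getBytes(StandardCharsets.UTF_8));
    out.flush();
  }
}
```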

Contributor:

💯 this is way cleaner and easier for me to reason about. Thanks for the heavy refactoring!

@agavra (Contributor) left a comment

Well... morning quickly turned into afternoon and then evening, but I finally got around to the tests for this PR.

}
times[0]++;
return null;
}).when(pullQueryQueue).drainTo(any());
Contributor:

I wonder if it makes sense to have a TestPullQueryQueue, because I see this pattern (mocking what gets returned on which calls) come up in the tests quite often. It would make tests a little easier to write going forward.

Member Author:

I played around a bit with trying to create one, and it's a little hard to do given how I'm using the queue in tests. In some cases a real queue is fine, and I create one in some tests. In others, I want to verify that the queue was used in a certain way, so a mock is required. In still others, I want to test surrounding logic and call various callbacks, which I often do from the queue's "answer" methods to simulate things happening while waiting for new rows. That could be done with a test queue and methods like completeAfterEmpty, which would implicitly know to call the completion method, but I'm not sure how reusable it would be given that the completion scenario differs between tests. What do you think?
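
For context, the mocking pattern in question looks roughly like this (a simplified, hypothetical sketch; the real PullQueryQueue interface and test setup differ):

```java
import static org.mockito.ArgumentMatchers.any;
import static org.mockito.Mockito.doAnswer;
import static org.mockito.Mockito.mock;

import java.util.Collection;
import java.util.List;

final class QueueMockingSketch {

  // Hypothetical stand-in for the real PullQueryQueue used by the tests.
  interface RowQueue {
    void drainTo(Collection<List<?>> sink);
  }

  static RowQueue queueThatReturnsOneBatchThenCompletes(
      final List<List<?>> batch, final Runnable onQueryComplete) {
    final RowQueue queue = mock(RowQueue.class);
    final int[] times = {0};
    doAnswer(invocation -> {
      final Collection<List<?>> sink = invocation.getArgument(0);
      if (times[0] == 0) {
        sink.addAll(batch);          // first drain hands back a batch of rows
      } else {
        onQueryComplete.run();       // later drains simulate the query finishing
      }
      times[0]++;
      return null;
    }).when(queue).drainTo(any());
    return queue;
  }
}
```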

Contributor:

I trust that if you played around with it and it wasn't clear then it would probably require more work to retrofit it than it's worth. I'm happy leaving it as is

@agavra (Contributor) commented Jan 15, 2021

Overall, the PR seems like a huge improvement. I have some questions inline (amongst a less-important small army of nits) that I'd like addressed before giving the green light. I'll mark those 3/4 comments with the 🚀 emoji so that we can focus on the important ones

@AlanConfluent (Member Author) left a comment

@agavra Followed up on some of your comments. Still working through them.

@AlanConfluent (Member Author) left a comment

@agavra I believe I've addressed your comments. PTAL

@agavra (Contributor) left a comment

Thanks @AlanConfluent! Big step forward. I'm still having a little trouble wrapping my head around the limit handler / reached-end code, but I don't want to block this PR on that, as you've clearly thought it out and the longer it sits the more chance there is of nasty merge conflicts. I'll dig into that offline 🙂

final boolean hasAnotherRow
) {
// Send for a comma after the header
if (!sentAtLeastOneRow) {
Contributor:

I think this can be simplified by always prepending ,\n when we write a row that's not the header (as opposed to handling the comma only in the case of the header). Then, when we're done, just append \n] (so we don't need to check hasAnotherRow).

@AlanConfluent (Member Author), Jan 22, 2021:

You're actually right, and that would honestly simplify my code a bit. The only issue is that the code parsing this (specifically the partial responses, i.e. chunks) in both KsqlTarget and the CLI looks for "\n" and also assumes that if there's a "," it's at the end of the line. I could potentially fix that, but I don't want to make a big change to the PR at this point. Also, the current format is consistent with push queries. I'd rather do this as a follow-up, if we want to go that route.

// The head becomes the next thing to process and the newly polled row becomes the head.
final PullQueryRow toProcess = head;
head = row;
return toProcess;
Contributor:

I guess your algorithm above handles this gracefully, but the first time you call pollNextRow, won't it always return null? Might just want to add a little comment there as well so that the next person doesn't scratch their head 😄

Member Author:

Yes, that's right. I'll add a comment.
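
For readers following the thread, the lookahead pattern under discussion is roughly this (a simplified sketch, not the PR's actual code): keep the most recently polled row as a buffered head, so that when emitting a row you already know whether another one follows (and therefore whether to append a trailing comma).

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

/** Simplified one-row lookahead over a blocking queue of rows. */
final class LookaheadSketch<T> {
  private final BlockingQueue<T> queue;
  private T head;                       // most recently polled, not yet emitted row

  LookaheadSketch(final BlockingQueue<T> queue) {
    this.queue = queue;
  }

  /**
   * Returns the previously polled row (null on the very first call, since there is
   * no lookahead yet) and stores the newly polled row as the new head.
   */
  T pollNextRow() throws InterruptedException {
    final T row = queue.poll(100, TimeUnit.MILLISECONDS);
    final T toProcess = head;           // emit the old head...
    head = row;                         // ...and remember the new row for next time
    return toProcess;
  }

  /** True if a row is already buffered, i.e. the emitted row is not the last one. */
  boolean hasAnotherRow() {
    return head != null;
  }
}
```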


@AlanConfluent AlanConfluent merged commit b69e3f8 into confluentinc:master Jan 28, 2021
@agavra agavra mentioned this pull request May 27, 2021