docs: add documentation for using lambda functions (confluentinc#7092)
* docs: add documentation for using lambda functions

* Adding lambda example docs

* Updates for reduce

* Review update

* Apply suggestions from code review

Co-authored-by: Jim Galasyn <jim.galasyn@confluent.io>

* Review updates - adding index cards

* language clean-up

Co-authored-by: Steven Zhang <stevenz@confluent.io>
Co-authored-by: Jim Galasyn <jim.galasyn@confluent.io>
3 people committed Mar 8, 2021
1 parent 170a3c0 commit 8f684da
Showing 6 changed files with 314 additions and 1 deletion.
7 changes: 6 additions & 1 deletion docs/concepts/index.md
Original file line number Diff line number Diff line change
@@ -74,7 +74,12 @@ Learn the core concepts that ksqlDB is built around.
<p class="card-body"><small>Connectors source and sink data from external systems.</small></p>
<span><a href="/concepts/connectors">Learn →</a></span>
</div>

<div class="card concepts">
<strong>Lambda Functions</strong>
<p class="card-body"><small>Lambda functions allow you to apply in-line functions without creating a full UDF.</small></p>
<span><a href="/concepts/lambda-functions">Learn →</a></span>
</div>

<div class="card concepts">
<a href="/overview/apache-kafka-primer"><strong>Apache Kafka primer</strong></a>
<p class="card-body"><small>None of this making sense? Take a step back and learn the basics of Kafka first.</small></p>
14 changes: 14 additions & 0 deletions docs/concepts/lambda-functions.md
@@ -0,0 +1,14 @@
---
layout: page
title: Lambda Functions
keywords: ksqldb, function, udf, lambda
---

# Lambda Functions

Use lambda functions, or "lambdas" for short, to express simple inline functions that can be applied to input values in various ways.
For example, you could apply a lambda function to each element of a collection, resulting in a transformed output collection.
Also, you can use lambdas to filter the elements of a collection, or reduce a collection to a single value.
The advantage of a lambda is that you can express user-defined functionality in a way that doesn’t require implementing a full [UDF](/how-to-guides/create-a-user-defined-function).

Learn how to use lambda functions in the [how-to guide](/how-to-guides/use-lambda-functions-in-udfs).
54 changes: 54 additions & 0 deletions docs/developer-guide/ksqldb-reference/scalar-functions.md
@@ -520,6 +520,60 @@ SLICE(col1, from, to)
Slices a list based on the supplied indices. The indices start at 1 and
include both endpoints.

## Invocation Functions

Apply lambda functions to collections.

### `TRANSFORM`

Since: 0.17.0

```sql
TRANSFORM(array, x => ...)

TRANSFORM(map, (k,v) => ..., (k,v) => ...)
```

Transform a collection by using a lambda function.

If the collection is an array, the lambda function must have one input argument.

If the collection is a map, two lambda functions must be provided, and both lambdas must have two arguments: a map entry key and a map entry value.
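
The arity rules can be sketched with a small Python analogue (illustrative only; `transform_array` and `transform_map` are hypothetical helpers, not ksqlDB code):

```python
# Hypothetical Python analogue of TRANSFORM's arity rules.
def transform_array(arr, f):
    # Array: one lambda, applied to each element.
    return [f(x) for x in arr]

def transform_map(m, key_fn, val_fn):
    # Map: two lambdas; each receives the entry's key and value.
    return {key_fn(k, v): val_fn(k, v) for k, v in m.items()}

print(transform_array([1, 2, 3], lambda x: x + 5))  # [6, 7, 8]
print(transform_map({"a": 1}, lambda k, v: k.upper(), lambda k, v: v * 10))  # {'A': 10}
```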

### `REDUCE`

Since: 0.17.0

```sql
REDUCE(array, state, (s, x) => ...)

REDUCE(map, state, (s, k, v) => ...)
```

Reduce a collection starting from an initial state.

If the collection is an array, the lambda function must have two input arguments.

If the collection is a map, the lambda function must have three input arguments.

If the state is `null`, the result is `null`.
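
These rules can be sketched with a Python analogue (illustrative only; `reduce_array` and `reduce_map` are hypothetical helpers, not ksqlDB code):

```python
# Hypothetical Python analogue of REDUCE's semantics.
def reduce_array(arr, state, f):
    if state is None:           # a null initial state yields null
        return None
    for x in arr:
        state = f(state, x)     # lambda takes (state, element)
    return state

def reduce_map(m, state, f):
    if state is None:
        return None
    for k, v in m.items():
        state = f(state, k, v)  # lambda takes (state, key, value)
    return state

print(reduce_array([2, 3, 4], 0, lambda s, x: s + x))  # 9
```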

### `FILTER`

Since: 0.17.0

```sql
FILTER(array, x => ...)

FILTER(map, (k,v) => ...)
```

Filter a collection with a lambda function.

If the collection is an array, the lambda function must have one input argument.

If the collection is a map, the lambda function must have two input arguments.
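
A Python analogue makes the two cases concrete (illustrative only; `filter_array` and `filter_map` are hypothetical helpers, not ksqlDB code):

```python
# Hypothetical Python analogue of FILTER's semantics.
def filter_array(arr, pred):
    return [x for x in arr if pred(x)]                 # one lambda per element

def filter_map(m, pred):
    return {k: v for k, v in m.items() if pred(k, v)}  # lambda gets key and value

print(filter_array([1, -2, 3], lambda x: x > 0))          # [1, 3]
print(filter_map({"a": 0, "b": 2}, lambda k, v: v != 0))  # {'b': 2}
```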

## Strings

### `CHR`
6 changes: 6 additions & 0 deletions docs/how-to-guides/index.md
@@ -69,6 +69,12 @@ Follow compact lessons that help you work with common ksqlDB functionality.
</div>

<div class="cards">
<div class="card how-to-guide">
<strong>Transforming columns with structured data</strong>
<p class="card-body"><small>Transform columns of structured data without user-defined functions.</small></p>
<span><a href="/how-to-guides/use-lambda-functions">Learn →</a></span>
</div>

<div class="card how-to-guide contribute">
<a href="https://github.com/confluentinc/ksql"><strong>Help us write another?</strong></a>
<p class="card-body"><small>We're always looking for more guides. Just send a pull request!</small></p>
232 changes: 232 additions & 0 deletions docs/how-to-guides/use-lambda-functions.md
@@ -0,0 +1,232 @@
---
layout: page
title: How to transform columns with structured data
tagline: Transform columns of structured data without user-defined functions.
description: ksqlDB can compose existing functions to create new expressions over structured data
keywords: function, lambda, aggregation, user-defined function, ksqlDB
---
# Use lambda functions

## Context

You want to transform a column with structured data in a particular way, but no
built-in function suits your needs, and you're unable to implement and deploy a
user-defined function. ksqlDB can compose existing functions to create
new expressions over structured data. These are called lambda functions.

## In action
```sql
CREATE STREAM stream1 (
id INT,
lambda_map MAP<STRING, INTEGER>
) WITH (
kafka_topic = 'stream1',
partitions = 1,
value_format = 'avro'
);

CREATE STREAM output AS
SELECT id,
TRANSFORM(lambda_map, (k, v) => UCASE(k), (k, v) => v + 5)
FROM stream1
EMIT CHANGES;
```

## Syntax

The arguments for the lambda function are separated from the body of the lambda with the lambda operator, `=>`.

When there are two or more arguments, you must enclose the arguments with parentheses. Parentheses are optional for lambda functions with one argument.

Currently, ksqlDB supports up to three arguments in a single lambda function.

```sql
x => x + 5

(x,y) => x - y

(x,y,z) => z AND x OR y
```
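
For readers who know Python, the three lambdas above correspond roughly to the following (an informal analogy; ksqlDB lambdas are SQL expressions, and SQL's three-valued NULL logic has no direct Python equivalent):

```python
one_arg    = lambda x: x + 5
two_args   = lambda x, y: x - y
three_args = lambda x, y, z: (z and x) or y  # AND binds tighter than OR, as in SQL

print(one_arg(10))                    # 15
print(two_args(7, 2))                 # 5
print(three_args(True, False, True))  # True
```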

## Invocation UDFs

Lambda functions must be used inside designated invocation functions. The following invocation functions are available:

- [TRANSFORM](/developer-guide/ksqldb-reference/scalar-functions#transform)
- [REDUCE](/developer-guide/ksqldb-reference/scalar-functions#reduce)
- [FILTER](/developer-guide/ksqldb-reference/scalar-functions#filter)

## Create a lambda-compatible stream
Invocation functions require either a map or array input. The following example creates a stream
with a column type of `MAP<STRING, INTEGER>`.
```sql
CREATE STREAM stream1 (
id INT,
lambda_map MAP<STRING, INTEGER>
) WITH (
kafka_topic = 'stream1',
partitions = 1,
value_format = 'avro'
);
```

## Apply a lambda invocation function
A lambda invocation function is a [scalar UDF](/developer-guide/ksqldb-reference/scalar-functions), and you use it like other scalar functions.

The following example lambda function transforms both the key and the value of a map and produces a new map. The key is transformed
into an uppercase string using the built-in UDF `UCASE`, and the value is transformed through addition. The order of the variables
is important: the first item in the arguments list, named `k` in this example, is treated as the key, and the second,
named `v` in this example, is treated as the value. Keep this in mind if your map has different key and value types.
Note that `TRANSFORM` on a map requires two lambda functions, while `TRANSFORM` on an array requires one.
```sql
CREATE STREAM output AS
SELECT id,
TRANSFORM(lambda_map, (k, v) => UCASE(k), (k, v) => v + 5)
FROM stream1;
```
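
To see what the expression computes, the same key/value transform can be traced in plain Python (an informal analogue, not ksqlDB code):

```python
lambda_map = {"hello": 15, "goodbye": -5}

# (k, v) => UCASE(k) builds the new key; (k, v) => v + 5 builds the new value.
result = {k.upper(): v + 5 for k, v in lambda_map.items()}
print(result)  # {'HELLO': 20, 'GOODBYE': 0}
```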

Insert some values into `stream1`.
```sql
INSERT INTO stream1 (
id, lambda_map
) VALUES (
3, MAP('hello' := 15, 'goodbye' := -5)
);
```

Query the output.
```sql
SELECT * FROM output AS final_output EMIT CHANGES;
```

Your output should resemble:
```sql
+------------------------------+------------------------------+
|id |final_output |
+------------------------------+------------------------------+
|3                             |{HELLO: 20, GOODBYE: 0}      |
```

## Use a reduce lambda invocation function
The following example creates a stream with a column type `ARRAY<INTEGER>` and applies the `reduce` lambda
invocation function.
```sql
CREATE STREAM stream1 (
id INT,
lambda_arr ARRAY<INTEGER>
) WITH (
kafka_topic = 'stream1',
partitions = 1,
value_format = 'avro'
);

CREATE STREAM output AS
SELECT id,
REDUCE(lambda_arr, 2, (s, x) => ceil(x/s))
FROM stream1
EMIT CHANGES;
```
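
The fold can be traced step by step in Python (an informal analogue; it assumes real division inside `ceil`, which matches the documented result):

```python
import math

# REDUCE(lambda_arr, 2, (s, x) => ceil(x/s)): the state starts at 2 and is
# replaced by ceil(x / state) for each element in turn: 1, 3, 2, 3.
state = 2
for x in [2, 3, 4, 5]:
    state = math.ceil(x / state)
print(state)  # 3
```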
Insert some values into `stream1`.
```sql
INSERT INTO stream1 (
id, lambda_arr
) VALUES (
1, ARRAY[2, 3, 4, 5]
);
```

Query the output.
```sql
SELECT * FROM output AS final_output EMIT CHANGES;
```

You should see something similar to:
```sql
+------------------------------+------------------------------+
|id |final_output |
+------------------------------+------------------------------+
|1                             |3                             |
```

## Use a filter lambda invocation function
The following example creates a stream with a column of type `MAP<STRING, INTEGER>` and applies the `filter` lambda
invocation function.
```sql
CREATE STREAM stream1 (
id INT,
lambda_map MAP<STRING, INTEGER>
) WITH (
kafka_topic = 'stream1',
partitions = 1,
value_format = 'avro'
);

CREATE STREAM output AS
SELECT id,
FILTER(lambda_map, (k, v) => instr(k, 'name') > 0 AND v != 0)
FROM stream1
EMIT CHANGES;
```
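
The predicate can be traced in Python (an informal analogue; `instr(k, 'name') > 0` corresponds to a substring check):

```python
lambda_map = {"first name": 15, "middle": 25, "last name": 0, "alt name": 33}

# Keep entries whose key contains 'name' and whose value is nonzero.
result = {k: v for k, v in lambda_map.items() if "name" in k and v != 0}
print(result)  # {'first name': 15, 'alt name': 33}
```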
Insert some values into `stream1`.
```sql
INSERT INTO stream1 (
id, lambda_map
) VALUES (
1, MAP('first name' := 15, 'middle' := 25, 'last name' := 0, 'alt name' := 33)
);
```

Query the output.
```sql
SELECT * FROM output AS final_output EMIT CHANGES;
```

Your output should resemble:
```sql
+------------------------------+-----------------------------------------------+
|id |final_output |
+------------------------------+-----------------------------------------------+
|1 |{first name: 15, alt name: 33} |
```

## Advanced lambda use cases
The following example creates a stream with a column of type `MAP<STRING, ARRAY<DECIMAL(3,2)>>` and applies the `transform`
lambda invocation function with a nested `transform` lambda invocation function.
```sql
CREATE STREAM stream1 (
id INT,
lambda_map MAP<STRING, ARRAY<DECIMAL(3,2)>>
) WITH (
kafka_topic = 'stream1',
partitions = 1,
value_format = 'avro'
);

CREATE STREAM output AS
SELECT id,
TRANSFORM(lambda_map, (k, v) => concat(k, '_new'), (k, v) => transform(v, x => round(x)))
FROM stream1
EMIT CHANGES;
```
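
The nested transform can be traced in Python (an informal analogue; Python's `round()` differs from ksqlDB's `ROUND` at exact .5 values, but none occur in this data):

```python
# Outer transform renames each key; inner transform rounds each array element.
lambda_map = {"Mary": [1.23, 3.65, 8.45], "Jose": [5.23, 1.65]}
result = {k + "_new": [round(x) for x in v] for k, v in lambda_map.items()}
print(result)  # {'Mary_new': [1, 4, 8], 'Jose_new': [5, 2]}
```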
Insert some values into `stream1`.
```sql
INSERT INTO stream1 (
id, lambda_map
) VALUES (
1, MAP('Mary' := ARRAY[1.23, 3.65, 8.45], 'Jose' := ARRAY[5.23, 1.65])
);
```

Query the output.
```sql
SELECT * FROM output AS final_output EMIT CHANGES;
```

Your output should resemble:
```sql
+------------------------------+----------------------------------------------------------+
|id |final_output |
+------------------------------+----------------------------------------------------------+
|1 |{Mary_new: [1, 4, 8], Jose_new: [5, 2]} |
```
2 changes: 2 additions & 0 deletions mkdocs.yml
@@ -46,6 +46,7 @@ nav:
- Time and Windows: concepts/time-and-windows-in-ksqldb-queries.md
- User-defined functions: concepts/functions.md
- Connectors: concepts/connectors.md
- Lambda Functions: concepts/lambda-functions.md
- Apache Kafka primer: concepts/apache-kafka-primer.md
- How-to guides:
- Synopsis: how-to-guides/index.md
@@ -58,6 +59,7 @@ nav:
- Use a custom timestamp column: how-to-guides/use-a-custom-timestamp-column.md
- Test an application: how-to-guides/test-an-app.md
- Substitute variables: how-to-guides/substitute-variables.md
- Transforming columns with structured data: how-to-guides/use-lambda-functions.md
- Tutorials:
- Synopsis: tutorials/index.md
- Materialized cache: tutorials/materialized.md
