Embedded code performance #424

alturkovic · 2024-09-22T09:35:06Z

Hi!
I am trying to use GraalVM from Kotlin to execute a Python script. I need to pass some input parameters, evaluate the script and read the result as JSON.

I did a simple comparison with Jython:

package org.example.test

import com.fasterxml.jackson.databind.JsonNode
import com.fasterxml.jackson.module.kotlin.jacksonObjectMapper
import org.graalvm.polyglot.Context
import org.graalvm.polyglot.Engine
import org.graalvm.polyglot.Source
import org.python.util.PythonInterpreter
import kotlin.time.measureTime

fun main() {
    graalvm()
    jython()
}

val mapper = jacksonObjectMapper()

private fun graalvm() {
    val duration = measureTime {
        val engine = Engine.newBuilder()
            .option("engine.WarnInterpreterOnly", "false")
            .build()

        val script = Source.create("python", "{'name':'John', 'id': id}")

        repeat(100) {
            val ctx = Context.newBuilder().engine(engine).build()
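            // Python globals live in the language bindings; a value in the polyglot
            // bindings is only reachable from Python via polyglot.import_value,
            // so `id` in the script would otherwise resolve to the builtin id().
            ctx.getBindings("python").putMember("id", it)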

            val result = ctx.eval(script)
            mapper.valueToTree<JsonNode>(result.`as`(Map::class.java))
        }
    }

    println("GraalVM: $duration")
}

private fun jython() {
    val duration = measureTime {
        val script = PythonInterpreter().compile("{'name':'John', 'id': id}")

        repeat(100) {
            PythonInterpreter().use { interpreter ->
                interpreter["id"] = it
                val pyResult = interpreter.eval(script)
                mapper.valueToTree<JsonNode>(pyResult)
            }
        }
    }

    println("Jython: $duration")
}

But the Jython implementation is ~4x faster for this example. I tried to compile and reuse the script with both implementations and just inject the necessary value.

I noticed that I can make this example fast by moving val ctx = Context.newBuilder().engine(engine).build() outside the repeat block, but that just mutates the same context object in a loop. Old parameters would be retained unless they are reset after every execution, which seems fiddly.

Am I doing something wrong? Is there a better way to evaluate scripts from Java/Kotlin code?

After revisiting this issue, I added some simple logging to track the performance and it seems that most of the Jython time is spent on script parsing:

import com.fasterxml.jackson.databind.JsonNode
import com.fasterxml.jackson.module.kotlin.jacksonObjectMapper
import org.graalvm.polyglot.Context
import org.graalvm.polyglot.Engine
import org.graalvm.polyglot.Source
import org.python.util.PythonInterpreter
import kotlin.time.measureTime
import kotlin.time.measureTimedValue

fun main() {
    graalvm()
    jython()
}

val mapper = jacksonObjectMapper()

private fun graalvm() {
    val duration = measureTime {
        val engine = Engine.newBuilder()
            .option("engine.WarnInterpreterOnly", "false")
            .build()

        val builder = Context.newBuilder("python").engine(engine)

        val source = Source.create("python", "{'name':'John', 'id': id}")

        repeat(1000) {
            val executionTime = measureTime {
                builder.build().use { ctx ->
                    // Set the Python global `id`; polyglot bindings are not Python globals.
                    ctx.getBindings("python").putMember("id", it)
                    val result = ctx.eval(source)
                    mapper.valueToTree<JsonNode>(result.`as`(Map::class.java))
                }
            }
            println(executionTime)
        }
    }
    println("GraalVM: $duration")
}

private fun jython() {
    val duration = measureTime {
        // Named initDuration so it does not clash with the outer `duration`.
        val (script, initDuration) = measureTimedValue { PythonInterpreter().compile("{'name':'John', 'id': id}") }

        println("Init: $initDuration")

        repeat(1000) {
            val executionTime = measureTime {
                PythonInterpreter().use { interpreter ->
                    interpreter["id"] = it
                    val pyResult = interpreter.eval(script)
                    mapper.valueToTree<JsonNode>(pyResult)
                }
            }
            println(executionTime)
        }
    }

    println("Jython: $duration")
}

With Jython, this simple example takes almost the same total time for 100 or 1000 iterations, but the GraalVM implementation scales very poorly. After warmup, the average Jython execution time is around 10 microseconds, whereas GraalVM seems to be around 10 ms, so is it actually a 1000x difference?

I must be doing something wrong with the GraalVM implementation, but I cannot figure out what.

msimacek · 2024-09-26T09:14:27Z

Jython's interpreters are not the same concept as our contexts. Contexts provide full isolation: they don't share any language state, so a module imported in one context is independent of the same module imported in another context. Recreating all that state and reimporting all the modules has a cost (Python needs to import a lot of stuff for the core to work even if you don't import anything yourself). Jython's interpreters have different global namespaces, but they are not isolated: they share modules and other state, so writing into a module in one interpreter is visible in the other interpreters.
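
A rough plain-Python analogy of that distinction, offered as a sketch rather than a description of Jython or GraalPy internals: two separate global namespaces (standing in for two Jython interpreters) do not see each other's variables, but they do share the module table. The module name shared_demo is made up for the illustration.

```python
import sys
import types

# Hypothetical throwaway module, registered so that `import shared_demo` finds it.
shared = types.ModuleType("shared_demo")
shared.counter = 0
sys.modules["shared_demo"] = shared

# Two independent global namespaces, standing in for two interpreters.
ns_a = {}
ns_b = {}

# Namespace A writes into the module and defines a variable of its own.
exec("import shared_demo\nshared_demo.counter = 42\nlocal_value = 'a'", ns_a)

# Namespace B imports the same module object and reads the value A wrote.
exec("import shared_demo\nseen = shared_demo.counter", ns_b)

print(ns_b["seen"])           # 42: the module write is visible across namespaces
print("local_value" in ns_b)  # False: plain globals are not shared
```

Separate GraalPy contexts would correspond to separate module tables as well, which is where the extra setup cost per context comes from.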

So having a single context is closer to what Jython is doing. Unfortunately, the context API doesn't have a way to say you just want a new namespace. The idiomatic way of running parametrized code with GraalPy is to create a function and execute it with parameters. In your example, that would be:

        val ctx = Context.newBuilder().option("engine.WarnInterpreterOnly", "false").build()
        val fn = ctx.eval("python", """
            def fn(id):
                return {'name':'alen', 'id': id}
            fn
        """.trimIndent())

        repeat(100) {
            val result = fn.execute(it)
            mapper.valueToTree<JsonNode>(result.`as`(Map::class.java))
        }

It's still a bit slower than Jython because we do more work during the first initialization, but it's much better than before.
If you need more flexibility than functions give you (i.e. the code itself needs to change), you can wrap Python's exec/eval functions, like this:

        val ctx = Context.newBuilder().option("engine.WarnInterpreterOnly", "false").build()
        val evalFn = ctx.eval("python", """
            def eval_fn(code, namespace):
                return eval(code, namespace, namespace)
            eval_fn
        """.trimIndent())
        val createDict = ctx.eval("python", "dict")

        repeat(100) {
            val namespace = createDict.execute()
            namespace.putHashEntry("id", it)
            val result = evalFn.execute("{'name':'alen', 'id': id}", namespace)
            mapper.valueToTree<JsonNode>(result.`as`(Map::class.java))
        }
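
For reference, the Python side of that wrapper can be tried on its own. The sketch below assumes nothing about GraalPy; it only shows that a fresh dict per call keeps parameters from leaking between evaluations, and that a snippet evaluated many times can be compiled once with the builtin compile (the "<snippet>" filename is an arbitrary label).

```python
def eval_fn(code, namespace):
    # Same wrapper as above: one dict serves as both globals and locals.
    return eval(code, namespace, namespace)

results = []
for i in range(3):
    namespace = {}       # fresh namespace per call, like createDict.execute()
    namespace["id"] = i  # shadows the builtin id() for this evaluation only
    results.append(eval_fn("{'name': 'alen', 'id': id}", namespace))

# With no binding in the namespace, `id` falls back to the builtin function,
# so nothing from the earlier calls has leaked.
leftover = eval_fn("id", {})

# eval also accepts a code object, so the parse step can be done once up front.
compiled = compile("{'name': 'alen', 'id': id}", "<snippet>", "eval")
precompiled_result = eval_fn(compiled, {"id": 7})

print(results)
print(leftover is id)
print(precompiled_result)
```

Note that eval adds a __builtins__ entry to the dict it is given, which is one more reason to hand it a throwaway namespace rather than a long-lived one.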

alturkovic · 2024-09-26T09:40:25Z

Great, that makes sense, thank you for the explanation and the code samples, that helps a lot!

alturkovic closed this as completed Sep 26, 2024
