Embedded code performance #424

Closed
alturkovic opened this issue Sep 22, 2024 · 2 comments

Comments

@alturkovic

alturkovic commented Sep 22, 2024

Hi!
I am trying to use GraalVM from Kotlin to execute a Python script. I need to pass some input parameters, evaluate a script, and read the result as JSON.

I did a simple comparison with Jython:

package org.example.test

import com.fasterxml.jackson.databind.JsonNode
import com.fasterxml.jackson.module.kotlin.jacksonObjectMapper
import org.graalvm.polyglot.Context
import org.graalvm.polyglot.Engine
import org.graalvm.polyglot.Source
import org.python.util.PythonInterpreter
import kotlin.time.measureTime

fun main() {
    graalvm()
    jython()
}

val mapper = jacksonObjectMapper()

private fun graalvm() {
    val duration = measureTime {
        val engine = Engine.newBuilder()
            .option("engine.WarnInterpreterOnly", "false")
            .build()

        val script = Source.create("python", "{'name':'John', 'id': id}")

        repeat(100) {
            val ctx = Context.newBuilder().engine(engine).build()
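            // Python globals live in the language bindings; polyglotBindings is a
            // separate namespace, so `id` would otherwise resolve to the builtin.
            ctx.getBindings("python").putMember("id", it)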

            val result = ctx.eval(script)
            mapper.valueToTree<JsonNode>(result.`as`(Map::class.java))
        }
    }

    println("GraalVM: $duration")
}

private fun jython() {
    val duration = measureTime {
        val script = PythonInterpreter().compile("{'name':'John', 'id': id}")

        repeat(100) {
            PythonInterpreter().use { interpreter ->
                interpreter["id"] = it
                val pyResult = interpreter.eval(script)
                mapper.valueToTree<JsonNode>(pyResult)
            }
        }
    }

    println("Jython: $duration")
}

But the Jython implementation is ~4x faster for this example. I tried to compile and reuse the script with both implementations and just inject the necessary value.

I noticed that I can make this example fast by moving val ctx = Context.newBuilder().engine(engine).build() outside the repeat block and reusing it, but that simply mutates the same context object in a loop. Parameters from earlier executions would be retained unless they were reset after every execution, and that seems fiddly.

Am I doing something wrong? Is there a better way to evaluate scripts from Java/Kotlin code?


After revisiting this issue, I added some simple logging to track the performance and it seems that most of the Jython time is spent on script parsing:

import com.fasterxml.jackson.databind.JsonNode
import com.fasterxml.jackson.module.kotlin.jacksonObjectMapper
import org.graalvm.polyglot.Context
import org.graalvm.polyglot.Engine
import org.graalvm.polyglot.Source
import org.python.util.PythonInterpreter
import kotlin.time.measureTime
import kotlin.time.measureTimedValue

fun main() {
    graalvm()
    jython()
}

val mapper = jacksonObjectMapper()

private fun graalvm() {
    val duration = measureTime {
        val engine = Engine.newBuilder()
            .option("engine.WarnInterpreterOnly", "false")
            .build()

        val builder = Context.newBuilder("python").engine(engine)

        val source = Source.create("python", "{'name':'John', 'id': id}")

        repeat(1000) {
            val executionTime = measureTime {
                builder.build().use { ctx ->
                    // Python globals live in the language bindings, not in polyglotBindings.
                    ctx.getBindings("python").putMember("id", it)
                    val result = ctx.eval(source)
                    mapper.valueToTree<JsonNode>(result.`as`(Map::class.java))
                }
            }
            println(executionTime)
        }
    }
    println("GraalVM: $duration")
}

private fun jython() {
    val duration = measureTime {
        // Named initDuration so it does not shadow the outer total duration.
        val (script, initDuration) = measureTimedValue { PythonInterpreter().compile("{'name':'John', 'id': id}") }

        println("Init: $initDuration")

        repeat(1000) {
            val executionTime = measureTime {
                PythonInterpreter().use { interpreter ->
                    interpreter["id"] = it
                    val pyResult = interpreter.eval(script)
                    mapper.valueToTree<JsonNode>(pyResult)
                }
            }
            println(executionTime)
        }
    }

    println("Jython: $duration")
}

This simple example takes almost the same total time on Jython with 100 or 1000 iterations, but the GraalVM implementation scales very poorly. After warmup, the average execution time per iteration is around 10 microseconds on Jython, whereas on GraalVM it seems to be around 10 ms. So is it actually a 1000x difference after warmup?

I must be doing something wrong with the GraalVM implementation, but I cannot figure out what.

@msimacek
Contributor

Jython's interpreters are not the same concept as our contexts. Contexts provide full isolation: they don't share any language state, so a module imported in one context is independent from the same module imported in another context. Recreating all the state and reimporting all the modules has a cost (Python needs to import a lot of stuff for the core to work, even if you don't import anything yourself). Jython's interpreters have different global namespaces, but they are not isolated; they share modules and other state. Writing into a module in one interpreter is visible in other interpreters.
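
The difference between "separate namespaces" and "full isolation" can be illustrated in plain Python, independent of Jython or GraalPy. This is only an analogy, not either runtime's API: two dicts stand in for two interpreters' global namespaces, and shared_demo is a hypothetical module registered just for the demonstration.

```python
import sys
import types

# Hypothetical module, registered only for this demonstration.
sys.modules["shared_demo"] = types.ModuleType("shared_demo")

# Two separate global namespaces, standing in for two interpreters.
ns_a = {}
ns_b = {}

# Code run in the first namespace defines a global and writes to the module.
exec("import shared_demo\nx = 1\nshared_demo.marker = 'written from a'", ns_a)

# Code run in the second namespace cannot see x, but it does see the module write.
exec(
    "import shared_demo\n"
    "has_x = 'x' in globals()\n"
    "marker = getattr(shared_demo, 'marker', None)",
    ns_b,
)

print(ns_b["has_x"])   # False
print(ns_b["marker"])  # written from a
```

With fully isolated contexts, the second namespace would get its own copy of the module and would not see the marker, which is why each new context has to pay for importing everything again.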

So having a single context is closer to what Jython is doing. Unfortunately, the context API doesn't have a way to say you just want a new namespace. The idiomatic way of running parametrized code with GraalPy is to create a function and execute it with parameters. In your example, that would be:

        val ctx = Context.newBuilder().option("engine.WarnInterpreterOnly", "false").build()
        val fn = ctx.eval("python", """
            def fn(id):
                return {'name':'alen', 'id': id}
            fn
        """.trimIndent())

        repeat(100) {
            val result = fn.execute(it)
            mapper.valueToTree<JsonNode>(result.`as`(Map::class.java))
        }

It's still a bit slower than Jython, because we just do more stuff in the first initialization, but it's much better than before.
If you need more flexibility than you can get with functions (i.e. the code itself needs to change between runs), you can wrap Python's exec/eval functions, like this:

        val ctx = Context.newBuilder().option("engine.WarnInterpreterOnly", "false").build()
        val evalFn = ctx.eval("python", """
            def eval_fn(code, namespace):
                return eval(code, namespace, namespace)
            eval_fn
        """.trimIndent())
        val createDict = ctx.eval("python", "dict")

        repeat(100) {
            val namespace = createDict.execute()
            namespace.putHashEntry("id", it)
            val result = evalFn.execute("{'name':'alen', 'id': id}", namespace)
            mapper.valueToTree<JsonNode>(result.`as`(Map::class.java))
        }
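
The namespace-per-call idea also addresses the earlier concern about parameters lingering when one context is reused. A minimal sketch in plain Python of what happens on the Python side: the helper name run is illustrative, and compiling the expression once with the built-in compile() is an optional addition that the snippet above does not do.

```python
# Compile the expression once; compile() in "eval" mode returns a reusable code object.
CODE = compile("{'name': 'alen', 'id': id}", "<script>", "eval")


def eval_fn(code, namespace):
    # Same dict for globals and locals, as in the wrapper above.
    return eval(code, namespace, namespace)


def run(params):
    # A fresh dict per call, so nothing from an earlier call can leak in.
    namespace = dict(params)
    return eval_fn(CODE, namespace)


first = run({"id": 1})
second = run({"id": 2})
print(first)   # {'name': 'alen', 'id': 1}
print(second)  # {'name': 'alen', 'id': 2}

# With no "id" supplied, the name falls back to the builtin id() function
# instead of picking up a stale value from a previous call.
leftover = run({})
print(leftover["id"] is id)  # True
```

The last check is the useful one: a forgotten parameter shows up as the builtin rather than as the previous caller's value, so stale state cannot silently carry over between executions.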

@alturkovic
Author

Great, that makes sense, thank you for the explanation and the code samples, that helps a lot!
