-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(expressions): Use compound names in function identifiers #23
fix(expressions): Use compound names in function identifiers #23
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure about this change, the intent for allowing usage of the simple names was so that the builders can be used to auto resolve the compound function names based on the passed in argument types. You can see that resolution issue here when you had to change the types of the outputs. Notice that in the expressions_tests the output is no longer fp32. We want the type resolution to still work.
As for the NewCustom*Func functions, those are necessary for the cases where users utilize custom URNs rather than valid URIs. This is why we don't error out when we can't find the function. You can see more discussion of it here: #14 (comment)
func ExampleExpression_scalarFunction() { | ||
func TestExampleExpressionScalarFunction(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why make this change? The intent was for this to show up as an example in the docs on pkg.go.dev, by making this change the test will no longer show up as a testable example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah! That makes sense. I wanted to use assert
as used elsewhere; I'm not used to example's like that, so I didn't realize the regression. Thanks for pointing it out!
// print string represention of the expression | ||
fmt.Println(fromProto) | ||
// print the string representation of our | ||
// manually constructed expression | ||
// Check the string represention of the expression | ||
assert.Equal(t, "add:i8_i8(.field(0), fp64(10)) => i8", fromProto.String()) | ||
// Check our manually constructed expression | ||
assert.Equal(t, "add:i8_i8(.field(0) => i8, fp64(10)) => i8", ex.String()) | ||
fmt.Println(ex) | ||
|
||
// verify that the Equals methods work recursively | ||
fmt.Println(ex.Equals(fromProto)) | ||
// verify that the two are equal | ||
assert.True(t, ex.Equals(fromProto)) | ||
// confirm our manually constructed expression is the same | ||
// as the one we got from protojson | ||
fmt.Println(pb.Equal(&exprProto, toProto)) | ||
|
||
// Output: | ||
// add(.field(0), fp64(10)) => i32 | ||
// add(.field(0) => i32, fp64(10)) => i32 | ||
// true | ||
// true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, we want the // Output:
and such so that this shows up as a testable example in the documentation to make it easy for consumers to see how to work with the library.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, as above 👍
@zeroshade - great points! Yes, I had some uncertainty going into this, and you've confirmed it. Perhaps we should start by clarifying goals / requirements, so that we can get an overall improvement? I see a few that we've gathered:
So you pointed out 3-5, and that makes sense to me; I can work on that. I might do a bit of looking at the spec and ask about (3) on Slack, as the intention makes sense in a vague way to me, but I don't understand the specifics, and I'd like to. For 1-2, do those make sense to you as reasonable goals? Also - please let me know if any of the above is incorrect or could be restated better, I'm new to this repo and trying to figure it out. Thanks for your help, I appreciate it! |
Hey @wackywendell: 1 and 2 make perfect sense as reasonable goals for me. I believe that number 2 should already occur (to my knowledge compound function names should already get interpretted into the correct variant, if not then I must have missed a test somewhere and I'm glad you're fixing it!). Did you find a case where number 2 isn't currently true in the code? |
For (2), it looks like that works most of the time, but has some issues with functions that are (a) specified with compound names, and (b) not found in the YAML, so they end up getting created with the See example here, where an ID with a compound name This becomes relevant because if you fix (1), then you get problems related to (2), where parsing a protobuf file into the substrait-go representation and then back to protobuf fails. 🤔 I think this might be solved by changing the |
Without digging too much into it myself at the moment, that sounds like a reasonable approach. Wanna try it out and see if that fixes the issue? |
Yes - I'm giving it a try. Thanks! Let me see what code I can come up with and get back to you 🙇 |
Closing in favor of #24. |
This change addresses a bug where the type signatures in substrait proto files were lost in decoding -> encoding, and only simple names were used.
To address this, I updated the expression builders to always use compound names (e.g.
add:i64_i64
) and not just simple names (add
). The spec currently requires compound names for functions with multiple impls (and may soon require them always), and the signatures were lost in the conversion from protobuf to internal formats. That is the first commit, with tests updated in the second commit.Secondly, I changed the code so that if the function is not found with the given type signature, a "not found" error is returned, just as it is if the function is not found. That's the third commit. I'm less attached to this, but it helped in testing because I found a number of tests that "passed" incorrectly as (with the change above) they were referring to non-existent functions.
For that second part - I'm not sure it's necessary; we can drop that commit if you prefer. Alternatively - it might be worth removing the
NewCustom*Func
functions altogether.