-
Notifications
You must be signed in to change notification settings - Fork 555
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implemented function translateAlgebra. This functions takes a SPARQL … #1322
Conversation
…query algebra as an input and returns the query text. Tests are in test/translate_algebra. I did not follow any specific test framework like Nose but wrote my own one.
Automated tests passed for python 3.6 and 3.8 but not for 3.7, which is strange. |
This is interesting because the same situation happened yesterday with #1321 |
This error ( |
…query algebra as an input and returns the query text. Tests are in test/translate_algebra. I did not follow any specific test framework like Nose but wrote my own one. Automated tests did not pass for python 3.7. Trying again
You can do an amend without actually changing anything, if I do |
ooohh ok I did not know that. But very useful, thanks! |
…atterns and solution modifiers. Added test (see test_integration__complex_query1.txt) A potential bug has been found which occurs when non-aggregation variables are translated into Aggregate-Sample by the translateQuery function. A back-translation of the algebra will get rid of the "sample aggregation function" but only in cases when variables are not bound (see test_integration__complex_query1.txt) and the code under "elif node.name == "AggregateJoin"" where I include a treatment for SAMPLE)
…ver these bugfixes
@GreenfishK apologies for not addressing this PR sooner - project work is busy towards the end of the Financial Year in June! Can you please provide a bit more background, specifically what the motivation for this work is? I see a special test rig here and a series of SPARQL queries testing many aspects of conversions to/from the algebra, but what is the ultimate purpose of the work? Is it that, until now, there hasn't been any comprehensive test suite for rdflib's SPARQL handling and you want one due to bugs you've noticed? If this is the case, what can we add to documentation to show what parts of SPARQL are tested and what parts are not? |
Hi @nicholascar, thank you for the question. This project is not about testing conversions, it is about creating a single function that let's you provide a SPARQL algebra and in return get a SPARQL query. Currently, in rdflib only the other way around is possible, so, converting a SPARQL algebra into a query tree/query algebra. The motivation behind this is Data Citation. Read more about it here "Data Citation WG | RDA" https://www.rd-alliance.org/groups/data-citation-wg.html There is one recommendation to normalize queries in order to tell whether two queries are equal, so basically what query containment solvers do. My idea was to do the normalization not on the query but on the query's algebra, which is easier to work with. Once you have done this, you need to turn back the SPARQL algebra into a SOARQL query but there is currently no offered function by rdflib and that is what I am providing now. |
OK, so this is essentially a stand-alone new functionality then. Well that's fine! I do know the RDA well, and even that WG. I was the co-chair of the Provenance Patterns WG when that WG was in operations and I reviewed their outputs, including the qurey re-writing stuff. I worked with Andreas Rauber on some elements of query normalisation. So happy to see work on this toolkit that supports those aims! |
Wow that's such a coincidence! Thank you for approving it. Such a great feeling. |
@GreenfishK I've merged but would you consider adding a documetation section to one of the existing pages, perhaps https://rdflib.readthedocs.io/en/stable/intro_to_sparql.html - just a few lines at the end of that page about the capability you've introduced? The reason is that without a note in there, people won't know this capability's there. |
Than you! Sure, i will document it. |
pass | ||
|
||
|
||
def translateAlgebra(query_algebra: Query = None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure under what circumstance it makes sense for this to be null, because if it is null traverse(query_algebra.algebra, visitPre=sparql_query_text)
will fail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right. I will correct it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not too serious, just saw the error in mypy and thought I would let you know in case I was missing something.
Other changes: - fix return value for `Graph.serialize` (refs RDFLib#1394) - remove default value for `rdflib.plugins.sparql.algebra.translateAlgebra` (refs RDFLib#1322) - add .dockerignore to reduce context size and make docker quicker to run. - add .flake8 config to ignore line length as black is managing formatting. - add mypy to docker-compose, makefile and tox.ini - fix the way csv2rdf is invoked to ensure that the right code gets executed.
Other changes: - fix return value for `Graph.serialize` (refs RDFLib#1394) - remove default value for `rdflib.plugins.sparql.algebra.translateAlgebra` (refs RDFLib#1322) - add .dockerignore to reduce context size and make docker quicker to run. - add .flake8 config to ignore line length as black is managing formatting. - add mypy to docker-compose, makefile and tox.ini - fix the way csv2rdf is invoked to ensure that the right code gets executed.
Other changes: - fix return value for `Graph.serialize` (refs RDFLib#1394) - remove default value for `rdflib.plugins.sparql.algebra.translateAlgebra` (refs RDFLib#1322) - add .dockerignore to reduce context size and make docker quicker to run. - add .flake8 config to ignore line length as black is managing formatting. - add mypy to docker-compose, makefile and tox.ini - fix the way csv2rdf is invoked to ensure that the right code gets executed.
Fixes RDFLib#1311 Add `mypy` to .drone.yml and fix type errors that come up. Not type checking examples or tests. Other changes: - fix return value for `Graph.serialize` (refs RDFLib#1394) - remove default value for `rdflib.plugins.sparql.algebra.translateAlgebra` (refs RDFLib#1322) - add .dockerignore to reduce context size and make docker quicker to run. - add .flake8 config to ignore line length as black is managing formatting. - add mypy to docker-compose, makefile and tox.ini - fix the way csv2rdf is invoked to ensure that the right code gets executed.
…query algebra as an input and returns the query text. It should cover the whole SPARQL 1.1 specification for Select-Queries. Tests are in test/translate_algebra. I did not follow any specific test framework like Nose but wrote my own one which can simply be executed by running test/translate_algebra/main.py file.
The specifications I followed from here: https://www.w3.org/TR/sparql11-query/
Especially following sections:
17.3 Operator Mapping
17.4 Function Definitions
18.2 Translation to the SPARQL Algebra
Tests:
I tried to cover all possible constructs.
Two tests did not pass due to two potential bugs:
Reason for not covering ASK queries:
The query form "ASK" is currently not supported as its algebra tree that comes from translateQuery() might be wrongly implemented. ASK-queries do not project any variables in theory but the query algebra shows a "project" node with "vars". Therefore, it would require a special treatment in the new function I am proposing.