sparqlstore: Optionally support blank nodes as <bnode:> URIs #512

Merged (1 commit) on Aug 27, 2015

Conversation

ssssam
Contributor

@ssssam ssssam commented Aug 22, 2015

The 4Store triple store understands bnode: URIs as pointing to blank
nodes, which makes it possible to query and update them using its SPARQL
endpoint.

#511

@ssssam ssssam force-pushed the sam/sparqlstore-blank-nodes branch from 2a956e9 to 0aac7b0 Compare August 22, 2015 19:33
@joernhees joernhees added enhancement New feature or request SPARQL store Related to a store. labels Aug 23, 2015
@joernhees joernhees added this to the rdflib 4.2.2 milestone Aug 23, 2015
@uholzer
Contributor

uholzer commented Aug 23, 2015

Seems to be a nice feature. However, as it is non-standard, other SPARQL stores might have other solutions to this problem. I personally would prefer it if one could pass the SPARQLStore a function as bNodeAsURI (or in this case, better, bNodeToURI) which transforms a BNode into a URIRef. Then we could predefine a SPARQLStore.BNODE_TO_BNODE_URI = lambda bn: URIRef('<bnode:%s>' % bn).

If we implement this feature, shouldn't we also convert <bnode:...> URIs received from the server to BNode instances if bNodeAsURI is set? And then, there is also skolemization, which may be supported by some SPARQL endpoints.

Any ideas on how to find a flexible solution that can be adapted to a specific endpoint?

And finally, we should not forget about the SPARQLUpdateStore, which might need to be adapted too.

@joernhees
Member

yupp, i had pretty similar thoughts on this one: maybe the right thing to do is to ease the customizability of sparqlstore. This could for example be done by passing a to_sparql method into the SPARQLStore constructor. All the checking for BNodes and raising exceptions by default should be done in our default method, which you can substitute with your own to_sparql method if you want to use non-standard features such as the 4store one.

I'm not really sure how common the <bnode:...> extension is, but i somehow fear that other stores use other ways. So until it becomes clear that this is at least a common thing of several endpoints, i'd somehow be against including it as a default.

@ssssam
Contributor Author

ssssam commented Aug 23, 2015

Thanks for the comments!

Virtuoso seems to allow <NodeID://b0005> for blank nodes, and Jena's ARQ query engine allows <_:0005> to refer to blank nodes by ID. So there are at least two alternative syntaxes to support!
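
To make the three syntaxes mentioned so far concrete, here is a minimal sketch of per-endpoint formatters. It is an illustration only: the function names, the `BNODE_SYNTAX` table and `render_bnode` are hypothetical and not part of rdflib, and the blank node is represented by its plain string identifier.

```python
# Hypothetical helpers (not rdflib API): each takes a blank node identifier
# as a plain string and returns the token to embed in a SPARQL query.

def bnode_to_4store(bnode_id):
    # 4store form discussed in this pull request
    return "<bnode:%s>" % bnode_id


def bnode_to_virtuoso(bnode_id):
    # Virtuoso form as quoted in the thread
    return "<NodeID://%s>" % bnode_id


def bnode_to_jena_arq(bnode_id):
    # Jena ARQ form as quoted in the thread
    return "<_:%s>" % bnode_id


# Lookup table so a store could be configured per endpoint type.
BNODE_SYNTAX = {
    "4store": bnode_to_4store,
    "virtuoso": bnode_to_virtuoso,
    "jena-arq": bnode_to_jena_arq,
}


def render_bnode(endpoint_kind, bnode_id):
    """Format a blank node identifier for the given endpoint kind."""
    try:
        formatter = BNODE_SYNTAX[endpoint_kind]
    except KeyError:
        raise ValueError("no blank node syntax known for %r" % endpoint_kind)
    return formatter(bnode_id)
```

Because each formatter is just a callable, any of them could be handed to the store directly, which is the shape the rest of the discussion converges on.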

So I think turning bNodeAsURI into a function is a great idea, so the SPARQLStore class can work with these various different SPARQL engines.

In terms of results, they are returned in XML form, and blank nodes appear as ns:bnode elements. The CastToTerm function already handles those and converts them to rdflib.BNode() instances.

The patch I submitted should make SPARQLUpdateStore work the same as SPARQLStore; I haven't actually tested that it does, though.

I'm happy to rework this to take a function instead of a boolean. I wonder if you can advise me on how the API should look, though? I thought about renaming 'bNodeAsURI' to 'bnode_as_uri' already, but I thought that it would cause an API break for anyone who is currently calling plugins.sparqlstore.SPARQLStore(..., bNodeAsURI=False, ...). Renaming it bNodeToURI would cause the same thing. So maybe it's best to keep the name the same, and allow it to be either False, or a function that returns the blank node as a string?

@iherman
Contributor

iherman commented Aug 23, 2015

Worth noting that the BNode.skolemize() and URIRef.deskolemize() methods have been added to RDFLib, implementing the skolemization approach described in the RDF 1.1 spec. This is of course independent from any SPARQL mapping; not sure whether the SPARQL engines recognize these forms these days.

Ivan
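
For readers unfamiliar with the scheme, the RDF 1.1 approach replaces a blank node with an IRI under the well-known path `/.well-known/genid/`. The sketch below illustrates that mapping in plain Python. It is not RDFLib's implementation of `BNode.skolemize()`; the function names and the `http://example.org` authority are assumptions made for illustration.

```python
# Plain-Python illustration of RDF 1.1 skolem IRIs (not rdflib's own code).
GENID_PATH = "/.well-known/genid/"


def skolemize(bnode_id, authority="http://example.org"):
    """Build a skolem IRI for a blank node identifier.

    The authority is a placeholder; a real system would use a domain it controls.
    """
    return authority.rstrip("/") + GENID_PATH + bnode_id


def deskolemize(iri):
    """Return the blank node identifier, or None if iri is not a skolem IRI."""
    marker_pos = iri.find(GENID_PATH)
    if marker_pos == -1:
        return None
    bnode_id = iri[marker_pos + len(GENID_PATH):]
    return bnode_id or None
```

A round trip through both functions returns the original identifier, which is what lets a client send skolem IRIs to an endpoint and map them back to blank nodes afterwards.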


@uholzer
Contributor

uholzer commented Aug 23, 2015

Suggestion: We add two optional arguments node_to_sparql and node_from_result to SPARQLStore's constructor. Both take a function which takes a node and returns a node. All nodes are converted using these functions, not only BNodes. We then predefine two sets of functions: One that just handles skolemization using well-known IRIs, and one that uses <bnode:>-URIs. By default, we use a function that raises an exception when a blank node is encountered for node_to_sparql, and the identity for node_from_result.
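
A rough sketch of how the two hooks could fit together is shown below. Everything in it is hypothetical: `BNode` and `URIRef` are tiny stand-ins for the rdflib classes so the example runs on its own, `HookedStore` is not the real SPARQLStore, and `node_to_sparql` returns a string here, which is the variant the thread settles on afterwards.

```python
class BNode(str):
    """Stand-in for rdflib.BNode (assumption: only the identifier matters)."""


class URIRef(str):
    """Stand-in for rdflib.URIRef."""


def strict_node_to_sparql(node):
    # Proposed default: refuse blank nodes, format everything else as an IRI.
    if isinstance(node, BNode):
        raise ValueError("blank nodes cannot be sent to this endpoint: %r" % str(node))
    return "<%s>" % node


def bnode_uri_node_to_sparql(node):
    # Predefined alternative for endpoints that accept <bnode:...> URIs.
    if isinstance(node, BNode):
        return "<bnode:%s>" % node
    return strict_node_to_sparql(node)


def identity(node):
    # Proposed default for node_from_result: hand the node back unchanged.
    return node


class HookedStore:
    """Hypothetical store that routes every node through the two hooks."""

    def __init__(self, node_to_sparql=strict_node_to_sparql,
                 node_from_result=identity):
        self.node_to_sparql = node_to_sparql
        self.node_from_result = node_from_result

    def triple_pattern(self, s, p, o):
        # All three terms are converted, not only blank nodes.
        return " ".join(self.node_to_sparql(n) for n in (s, p, o)) + " ."
```

With the defaults, `HookedStore().triple_pattern(...)` raises as soon as a blank node appears; passing `node_to_sparql=bnode_uri_node_to_sparql` opts in to the 4store behaviour without changing the store itself.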

The argument bNodeAsURI currently does not do anything. And it seems that it never did. It may be that SPARQLWrapper once interpreted this option, but the current version doesn't seem to. So, I would just document it as ignored and deprecated, while keeping it as an optional argument to the constructor.
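
One possible way to keep an ignored argument while flagging it is sketched below. `LegacyArgStore` is a hypothetical class, not the actual SPARQLStore constructor, and the `None` sentinel is an assumption used to tell "not passed" apart from an explicit `False`.

```python
import warnings


class LegacyArgStore:
    """Hypothetical store that still accepts the unused bNodeAsURI argument."""

    def __init__(self, endpoint=None, bNodeAsURI=None):
        # The value is never used; warn only when a caller actually passes it.
        if bNodeAsURI is not None:
            warnings.warn(
                "bNodeAsURI is ignored and deprecated",
                DeprecationWarning,
                stacklevel=2,
            )
        self.endpoint = endpoint
```

Existing calls such as `LegacyArgStore(endpoint, bNodeAsURI=False)` keep working and only emit a warning, so no caller breaks.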

@ssssam
Contributor Author

ssssam commented Aug 23, 2015

Marking bNodeAsURI as deprecated and adding new parameters for the new functionality sounds sensible indeed.

I think node_to_sparql should return a string, rather than an rdflib.term.Node. It has to become a string to be embedded in a SPARQL query in any case.

I'm not quite sure of the purpose of node_from_result -- how would it relate to the current CastToTerm function? Would it override it, or extend it?

@uholzer
Contributor

uholzer commented Aug 24, 2015

String as return value of node_to_sparql: agreed.

node_from_result: Yes, this overrides CastToTerm, I would say. The danger is, of course, that we probably expose too many internals, leading to backwards-incompatible changes later on.

@joernhees
Member

please have a look at #513

joernhees added a commit to joernhees/rdflib that referenced this pull request Aug 25, 2015
@joernhees joernhees merged commit 0aac7b0 into RDFLib:master Aug 27, 2015
joernhees added a commit that referenced this pull request Aug 27, 2015
* master:
  PEP8: code cleanup
  SPARQLStore: fixed dangling { in regexp
  SPARQLStore deprecated unused bNodeAsURI arg
  SPARQLStore: added node_to_sparql and node_from_result args, closes #512
  PEP8 deprecated sparqlstore.CastToTerm, TraverseSPARQLResultDOM and localName methods
  sparqlstore: Optionally support blank nodes as <bnode:> URIs
joernhees added a commit that referenced this pull request Aug 27, 2015
* master:
  test changed to use _node_from_result instead of CastToTerm
  PEP8: code cleanup
  SPARQLStore: fixed dangling { in regexp
  SPARQLStore deprecated unused bNodeAsURI arg
  SPARQLStore: added node_to_sparql and node_from_result args, closes #512
  PEP8 deprecated sparqlstore.CastToTerm, TraverseSPARQLResultDOM and localName methods
  sparqlstore: Optionally support blank nodes as <bnode:> URIs
joernhees added a commit to joernhees/rdflib that referenced this pull request Mar 9, 2016
initBindings, contexts, addN, remove, add_graph and remove_graph now call the
node_to_sparql customizable function. Some support for BNode graph names added.

Add-on for RDFLib#513, see also RDFLib#511, RDFLib#512
joernhees added a commit to joernhees/rdflib that referenced this pull request Mar 9, 2016
query (initBindings), contexts, addN, remove, add_graph and remove_graph now call
node_to_sparql. Some support for BNode graph names added.

Add-on for RDFLib#513, see also RDFLib#511, RDFLib#512

4 participants