Signing named graphs #682

LibrEars · 2017-01-12T09:23:36Z

Hi all,

I would like to store my experiment data with RDFLib. Moreover I want to sign the data of every single experiment to make it sharable for the future. My approach is to use one named graph for each experiment, hash this graph and sign the hash.

It seems that I don't really understand the concept of isomorphic graphs. Why is the hash of the isomorphic named graph the same as the hash of its isomorphic conjunctive graph?

Should I sign _TripleCanonicalizer(gmary).to_hash() instead, or will I run into problems with blank nodes with this approach?

Here is some code to clarify what I want to do:

from rdflib import Namespace, Literal, URIRef, BNode
from rdflib.graph import Graph, ConjunctiveGraph
from rdflib.plugins.memory import IOMemory

from rdflib.compare import to_isomorphic, _TripleCanonicalizer


ns = Namespace("http://love.com#")

mary = BNode()
john = URIRef("http://love.com/lovers/john#")

cmary=URIRef("http://love.com/lovers/mary#")
cjohn=URIRef("http://love.com/lovers/john#")

store = IOMemory()

g = ConjunctiveGraph(store=store)
g.bind("love",ns)

gmary = Graph(store=store, identifier=cmary)

gmary.add((mary, ns['hasName'], Literal("Mary")))
gmary.add((mary, ns['loves'], john))

gjohn = Graph(store=store, identifier=cjohn)
gjohn.add((john, ns['hasName'], Literal("John")))

print("The internal hash of a named graph is the same as the internal hash of the conjunctive graph: " +
      str(to_isomorphic(g).internal_hash() == to_isomorphic(gmary).internal_hash()) + "\n")

# Prints 'True'

print("The _TripleCanonicalizer hash of a named graph is the same as that of the conjunctive graph: " +
      str(_TripleCanonicalizer(g).to_hash() == _TripleCanonicalizer(gmary).to_hash()))

# Prints 'False'


# Example of how I plan to verify the signature of signed graphs
# (mary_public_keys, wot, gpg and RDFS are not defined in this snippet):
for h in g.objects(mary_public_keys, wot.signed):
    # First verify the signature
    if gpg.verify(str(h)):
        # Second compare hash
        sigHash = gpg.decrypt(h).data.decode("utf-8").strip() # gpg.decrypt(h): bytes --> string
        
        identifier = str(g.value(h, RDFS.label))
        signedG = g.get_context(identifier)
        realHash = str(to_isomorphic(signedG).internal_hash())  # Gives the wrong hash?
                                                                # (What happens with two existing equal identifiers/contexts?)
        print(sigHash)
        print(realHash)
        
        if sigHash == realHash:
            print("Graph verified")
        
        else:
            print("Signature verified but graph has changed")
    
    else:
        print("Signature verification failed")

Cheers,
LibrEars

nicholascar · 2017-01-30T00:50:16Z

I'm interested in this too. In the past (early rdflib days) I implemented my own graph hasher and wrote my own code to serialise the graph with deterministic blank node names. I'll be happy to see an answer here too!

joernhees · 2017-01-30T11:11:41Z

i briefly looked into this before, but maybe @jimmccusker could have a look...

seems to be a bug in to_isomorphic:

In [5]: list(g)
Out[5]:
[(rdflib.term.BNode('N387ffec6cfcf427499ff7c3a00db24dc'),
  rdflib.term.URIRef(u'http://love.com#loves'),
  rdflib.term.URIRef(u'http://love.com/lovers/john#')),
 (rdflib.term.BNode('N387ffec6cfcf427499ff7c3a00db24dc'),
  rdflib.term.URIRef(u'http://love.com#hasName'),
  rdflib.term.Literal(u'Mary')),
 (rdflib.term.URIRef(u'http://love.com/lovers/john#'),
  rdflib.term.URIRef(u'http://love.com#hasName'),
  rdflib.term.Literal(u'John'))]

In [6]: list(gmary)
Out[6]:
[(rdflib.term.BNode('N387ffec6cfcf427499ff7c3a00db24dc'),
  rdflib.term.URIRef(u'http://love.com#loves'),
  rdflib.term.URIRef(u'http://love.com/lovers/john#')),
 (rdflib.term.BNode('N387ffec6cfcf427499ff7c3a00db24dc'),
  rdflib.term.URIRef(u'http://love.com#hasName'),
  rdflib.term.Literal(u'Mary'))]

In [7]: list(gjohn)
Out[7]:
[(rdflib.term.URIRef(u'http://love.com/lovers/john#'),
  rdflib.term.URIRef(u'http://love.com#hasName'),
  rdflib.term.Literal(u'John'))]

In [8]: list(to_isomorphic(g))
Out[8]:
[(rdflib.term.BNode('N387ffec6cfcf427499ff7c3a00db24dc'),
  rdflib.term.URIRef(u'http://love.com#loves'),
  rdflib.term.URIRef(u'http://love.com/lovers/john#')),
 (rdflib.term.BNode('N387ffec6cfcf427499ff7c3a00db24dc'),
  rdflib.term.URIRef(u'http://love.com#hasName'),
  rdflib.term.Literal(u'Mary')),
 (rdflib.term.URIRef(u'http://love.com/lovers/john#'),
  rdflib.term.URIRef(u'http://love.com#hasName'),
  rdflib.term.Literal(u'John'))]

In [9]: list(to_isomorphic(gmary))
Out[9]:
[(rdflib.term.BNode('N387ffec6cfcf427499ff7c3a00db24dc'),
  rdflib.term.URIRef(u'http://love.com#loves'),
  rdflib.term.URIRef(u'http://love.com/lovers/john#')),
 (rdflib.term.BNode('N387ffec6cfcf427499ff7c3a00db24dc'),
  rdflib.term.URIRef(u'http://love.com#hasName'),
  rdflib.term.Literal(u'Mary')),
 (rdflib.term.URIRef(u'http://love.com/lovers/john#'),
  rdflib.term.URIRef(u'http://love.com#hasName'),
  rdflib.term.Literal(u'John'))]

is it maybe getting confused by the store being re-used?

joernhees · 2017-01-30T11:14:59Z

apart from that, as sha256 is used as a checksum, i think this would be a good approach to sign graphs, yes
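To make the hash-then-sign idea concrete, here is a minimal sketch using only the Python standard library. It is not the GPG flow from the original post: HMAC with a shared secret stands in for a real public-key signature, and `SECRET_KEY`, `sign_digest` and `verify_digest` are hypothetical names chosen for illustration. The integer digest is a placeholder for whatever `to_isomorphic(graph).internal_hash()` returns.

```python
import hashlib
import hmac

# Hypothetical shared secret. With GPG this role is played by a key pair,
# so treat this only as a stand-in for a real signature scheme.
SECRET_KEY = b"demo-key-not-for-production"


def sign_digest(graph_digest: int, key: bytes = SECRET_KEY) -> str:
    """Return a hex signature over the decimal form of a graph digest."""
    message = str(graph_digest).encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify_digest(graph_digest: int, signature: str, key: bytes = SECRET_KEY) -> bool:
    """Recompute the signature for the digest and compare in constant time."""
    expected = sign_digest(graph_digest, key)
    return hmac.compare_digest(expected, signature)


# Placeholder for the digest computed when the experiment was recorded.
stored_digest = 1234567890
signature = sign_digest(stored_digest)

# Later: recompute the digest from the graph and check it against the signature.
unchanged_ok = verify_digest(stored_digest, signature)
changed_ok = verify_digest(stored_digest + 1, signature)
```

The point of the sketch is only the order of operations: the signature covers the graph-level digest, so verification means recomputing the digest from the current graph and checking it against what was signed.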

jpmccu · 2017-01-30T11:27:16Z

To answer the original question, the isomorphic graph has a special method called graph_digest() that will output a graph-level hash using the Sayers and Karp algorithm. It looks like the blank node has the same context (surrounding triples, grounding out at literals or URIs) in all the graphs, so they are getting the same BNode IDs. So the canonicalized graphs should have the same ID for the blank node version of Mary. I'll look into why they are the same as the non-canonicalized BNodes though. It could have something to do with the reuse of the Mary BNode across graphs.

joernhees · 2017-01-30T11:33:46Z

notice how list(gmary) contains fewer triples than list(to_isomorphic(gmary))

jpmccu · 2017-01-30T11:36:39Z

Ah, yes, sorry, still early here. Will investigate ASAP.

LibrEars · 2017-02-16T20:00:11Z

If you are interested in my approach for experiment-data management you can have a look at Linked-data-for-scientists-with-python.
There is also a quick way for graph visualization included.

jpmccu · 2017-02-16T21:15:22Z

The simple fix is for to_isomorphic() to not use the same store (which is what I do above). I tried using the same store but with an identifier, and the problem persists. The downside is that the triples are duplicated before actually being canonicalized.

Added test for Issue #682 and fixed.

joernhees added the bug Something isn't working label Jan 30, 2017

joernhees added this to the rdflib 5.0.0 milestone Jan 30, 2017

jpmccu mentioned this issue Feb 16, 2017

Added test for Issue #682 and fixed. #718

Merged

joernhees added the in-resolution label Feb 17, 2017

joernhees added a commit that referenced this issue Feb 20, 2017

Merge pull request #718 from jimmccusker/issue682

3758089

Added test for Issue #682 and fixed.

white-gecko modified the milestones: rdflib 5.0.0, rdflib 5.1.0 Apr 6, 2020

white-gecko modified the milestones: rdflib 5.1.0, rdflib 6.0.0 May 1, 2020

ghost added the id-as-cntxt tracking related issues label Dec 24, 2021

white-gecko modified the milestones: rdflib 6.x.x, 2022 June release Jun 20, 2022
