Start support for mypy --strict #1515

ajnelson-nist · 2021-12-17T01:20:05Z

This patch series attempts to begin a "80/20" strategy suggested on other Issue threads (cited in References) around finding "most commonly used APIs" to address issues introduced with the addition of the py.typed file.

The first patches are the minimal necessary to get mypy --strict to pass on the test program test/test_issue1447.py, which exercises (IIRC, all of) the rdflib functions I've used across several projects. Overall, the patch series adds annotations to methods under rdflib/, annotations to a few existing tests under test/, a new test under test/, and a practice of running the new and just-revised tests to mypy --strict in the various testing infrastructures. Feedback is welcome on some of the type-supporting decisions (especially what I call out as worth squinting eyes) and the sustainability of the testing practice.

I have a few outstanding questions that would be good to answer before merging this PR, and/or give their own Issues if you'd rather merge sooner:

I guessed by name pattern on part of integrating the new test. I assume Issue 1447 is the right reference, though we can start different Issues if it'd be better to focus more tests on specific things. On the other hand, the 1447 test script could also be a "Feature request" zone where others could contribute (even trivial) function calls that meet their own needs for strict type review.
How, and especially where, should mypy --strict be called? Would docker-compose.tests.yml get another command: line to run mypy --strict on just that file? Issue 1513 came up as I was testing things, and my initial two guesses of which mypy call to revise weren't what ended up being revised, whereas my draft of this PR noted the docker script that has its own mypy step, and I missed that Github Actions seems to be what runs the CI now. I'm proceeding in this PR with just modifying Github Actions.
Inlined in the 1447 test program - is there a better rdflib class name or type designation for Nodes that are not Literals, analogous to SHACL's sh:BlankNodeOrIRI? I frequently find the need to define a Union type like I did in the test program.
Inlined in the test program - it's not clear to me why prepareQuery is not exposed to mypy via rdflib/plugins/sparql/__init__py, while the test program itself would run just fine calling rdflib.plugins.sparql.prepareQuery(). Any ideas?
Squinting review request - The hardest part of this patch series to draft, and now probably the one worth the sternest review, was getting Literal.toPython() to work. The first patch makes that the only .toPython method that is typed returning Any, while the rest return str. It's possible to make it return a Union of types from the various mapping dictionaries, but that causes extra work (either assert or typing.cast) for any caller trying to pass mypy --strict. As written in the first patch, the typing and value-testing requires minimal amount of annotation work. Does this practice pass your review, though?

Test status:

pytest, mypy, and black continue to pass as well as they did, as parameterized in .github/workflows/validate.yml and Makefile, before the addition of this patch. For review notes:

black currently reports 4 formatting errors in files not touched by this patch series, which I see is || true'd out in the Github Actions CI.
test/test_xmlliterals.py::testHTML currently fails in the master branch on my laptop, though, I'm not sure why it's passing on the Github Actions CI.

I'll do catch-up merges if someone lets me know that either is fixed, even a rebase if you prefer seeing PRs with all patches bearing a green checkmark from CI.

References:

Disclaimer:

Participation by NIST in the creation of the documentation of mentioned software is not intended to imply a recommendation or endorsement by the National Institute of Standards and Technology, nor is it intended to imply that any specific software is necessarily the best available for the purpose.

This patch is believed to be part of what would be necessary to resolve Issue 1447. This patch addresses type errors raised by `mypy` pertaining to use of untyped methods within typed contexts. These signatures are sufficient to pass a test application (coming in a following patch) that exercises: * creation of a graph; * use of `Graph.add()` to supply triples containing `URIRef`, `BNode`, and `Literal` objects; * use of `Graph.triples()`; * use of `Graph.query()` with a string argument; * use of `Graph.query()` with a `Query` argument; * use of `Graph.query()` to return query results with different bound-variable lengths (i.e. different tuple lengths); and * use of several `.toPython()` methods, including `Literal.toPython()`. To address issues with the various `Node` subclasses appearing to be untyped on initialization, this patch introduces `__init__` methods to satisfy `mypy` needing a type signature. To support any future initialization logic, `super()` is used to support the Method Resolution Order chain. The alternative approach of trying to assign return types to `__new__` didn't seem to work for two reasons that would require more significant design revision: * `BNode.__new__()` returns an `Identifier`, and `mypy` complained that this is an incompatible return type. Attempting `typing.cast()` didn't seem to resolve this problem. * `Literal.__new__()` was raising this `mypy --strict` error: ``` "Literal" has no attribute "_value"; maybe "value"? ``` I can't think of a practice aside from exposing `_value` as a slot that would convince `mypy` that the field exists. Hence, the classes under `rdflib/term.py` got `__init__()` methods. References: * RDFLib#1447 * https://stackoverflow.com/a/33533514 describes PEPs 484 and 563 relative to this patch. Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>

Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>

…ypes This test is anchored off of Issue 1447, trying to identify "common" functions that would be used in a script exercising multiple SPARQL queries that return Python values. References: * RDFLib#1447 * python/typing#911 Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>

…nsumer Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>

Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>

aucampia · 2021-12-17T01:28:14Z

Thanks for this, I will have a look at it tomorrow night.

ajnelson-nist · 2021-12-17T14:32:06Z

Well, that's an unfortunate series of X's.

The call to mypy that I added was only supposed to review four of the scripts under test/. On my local machine, those are the only four reviewed. In the Github Action, everything's reviewed.

I determined the difference is in how the package install is run:

python setup.py install triggers the fail mode behavior.
pip install . triggers the fail mode behavior.
pip install --editable . does not trigger the fail mode behavior.
pip install --editable ${some_git_submodule_dir}/rdflib does not trigger the fail mode behavior.
pip install ${some_git_submodule_dir}/rdflib does not trigger the fail mode behavior.

Of course, all of my testing directly with this repository was with the last three modes. (Modes 4 and 5 were how I tested these modifications in other rdflib-using repositories I'd like to eventually test with mypy --strict.)

So, I guess I need to highlight my original question 2 - how and where would it be appropriate---and heck, even functional---for mypy --strict to be called to review test applications?

ajnelson-nist · 2021-12-17T14:41:07Z

Evidence of the pip install -e flag getting this patch series to work is in this CI-debugging PR.

ajnelson-nist · 2021-12-17T18:39:27Z

Also, I just caught that PR 1498 touches on some of the same content within Literal. Thanks to @veyndan for the _Any member definition, it didn't occur to me to try that.

@aucampia , I'd be fine revising this patch series if 1498 is merged first.

veyndan · 2021-12-17T22:39:10Z

Thanks to @veyndan for the _Any member definition, it didn't occur to me to try that.

It was actually @aucampia who added that bit (a65e710) 👏

aucampia · 2021-12-18T01:18:59Z

I guessed by name pattern on part of integrating the new test. I assume Issue 1447 is the right reference, though we can start different Issues if it'd be better to focus more tests on specific things. On the other hand, the 1447 test script could also be a "Feature request" zone where others could contribute (even trivial) function calls that meet their own needs for strict type review.

I'm not entirely sure if this pattern is considered ideal - I know it has been used but there has also been some resistance to it, and I would say I somewhat agree that it may be better to try and decouple tests from the issue they were in. I think a name like test_strict_typing.py may be more appropriate.

I made a pull request to update the developers guide on where tests should go (#1517) and will run it past others to see what the consensus is on this. I think for now it is fine, but the file name may need to change later.

How, and especially where, should mypy --strict be called? Would docker-compose.tests.yml get another command: line to run mypy --strict on just that file? Issue 1513 came up as I was testing things, and my initial two guesses of which mypy call to revise weren't what ended up being revised, whereas my draft of this PR noted the docker script that has its own mypy step, and I missed that Github Actions seems to be what runs the CI now. I'm proceeding in this PR with just modifying Github Actions.

I think ideally this should be setup in setup.cfg using module specific mypy config: https://mypy.readthedocs.io/en/stable/config_file.html#per-module-and-global-options

So that will entail something like this in setup.cfg:

[mypy]
python_version = 3.7
warn_unused_configs = True
ignore_missing_imports = True
disallow_subclassing_any = False
warn_unreachable = True

[mypy-test.test_issue1447]
strict = True

I have tried it thought and it is not quite doing what I'm expecting, there are some other fallbacks but I will try a couple of things to get it working.

Will respond to other points when I can, hopefully tomorrow.

aucampia · 2021-12-19T18:06:10Z

How, and especially where, should mypy --strict be called? Would docker-compose.tests.yml get another command: line to run mypy --strict on just that file? Issue 1513 came up as I was testing things, and my initial two guesses of which mypy call to revise weren't what ended up being revised, whereas my draft of this PR noted the docker script that has its own mypy step, and I missed that Github Actions seems to be what runs the CI now. I'm proceeding in this PR with just modifying Github Actions.

@ajnelson-nist I made a PR against your branch to make the validation pass with the selective files running in strict mode: ajnelson-nist#8

Would recommend a rebase onto master though as there has been a couple of changes related to mypy and typing since your branch.

- Changed all mypy invocation to not include paths and moved the paths to the mypy section in setup.cfg. This removes redudancy in directory specification and reduces the potential that directories are changed in one invocation but not others. - Ignore mypy errors in test directory. - Enable strict mypy config with in-file config comments in the following files: - test/test_issue1447.py - test/test_issue_git_336.py - test/test_literal.py - test/test_parse_file_guess_format.py

rdflib/namespace/__init__.py

aucampia · 2021-12-19T19:07:16Z

Please merge ajnelson-nist#9 so that you are adding typing to __new__ instead of adding __init__

rdflib/term.py

test/test_issue1447.py

aucampia · 2021-12-19T19:15:23Z

4. Inlined in the test program - it's not clear to me why prepareQuery is not exposed to mypy via rdflib/plugins/sparql/__init__py, while the test program itself would run just fine calling rdflib.plugins.sparql.prepareQuery(). Any ideas?

mypy has an option:

  --no-implicit-reexport    Treat imports as private unless aliased (inverse:
                            --implicit-reexport)

This is enabled in --strict mode:

  --strict                  Strict mode; enables the following flags: --warn-
                            unused-configs, --disallow-any-generics,
                            --disallow-subclassing-any, --disallow-untyped-
                            calls, --disallow-untyped-defs, --disallow-
                            incomplete-defs, --check-untyped-defs, --disallow-
                            untyped-decorators, --no-implicit-optional,
                            --warn-redundant-casts, --warn-unused-ignores,
                            --warn-return-any, --no-implicit-reexport,
                            --strict-equality

For more info see https://mypy.readthedocs.io/en/stable/config_file.html#confval-implicit_reexport

aucampia · 2021-12-19T19:18:16Z

Inlined in the 1447 test program - is there a better rdflib class name or type designation for Nodes that are not Literals, analogous to SHACL's sh:BlankNodeOrIRI? I frequently find the need to define a Union type like I did in the test program.

Maybe:

Subject = typing.Union[rdflib.BNode, rdflib.URIRef]

After all, from https://www.w3.org/TR/rdf11-concepts/#section-triples

An RDF triple consists of three components:

the subject, which is an IRI or a blank node

the predicate, which is an IRI

the object, which is an IRI, a literal or a blank node

aucampia · 2021-12-19T19:28:30Z

5. Squinting review request - The hardest part of this patch series to draft, and now probably the one worth the sternest review, was getting Literal.toPython() to work. The first patch makes that the only .toPython method that is typed returning Any, while the rest return str. It's possible to make it return a Union of types from the various mapping dictionaries, but that causes extra work (either assert or typing.cast) for any caller trying to pass mypy --strict. As written in the first patch, the typing and value-testing requires minimal amount of annotation work. Does this practice pass your review, though?

For now typing Literal.toPython() as returning Any is the only option, and maybe the right option also as typing it as more specific types seem to fail. It is anyway not wrong, just over broad at worst, but at best it is maybe correct.

We need to add a lot more typing to help figure out the tightest constraints that make sense, but your changes are a step in the right direction and best to keep them minimal and merge them. I have a bunch more typing changes that are not in PRs yet, and there are a lot more to add, but it is easier if we make them in small batches - it will take some time getting to the point where everything works in strict mode.

aucampia · 2021-12-19T19:31:47Z

If you give me access to your branch then I can do some rebases and merge the PRs I made, up to you of course - don't mind either way - but I want to keep this mergeable so we can get it in as soon as we are ready and so it does not hold up other typing changes that will be made.

aucampia · 2021-12-19T19:45:43Z

I think in the long run we should just get mypy passing on test directory without strict, and then generally start adding additional errrors/warnings - not something to add here, but just a general view on what to do to get things moving along.

Fix mypy settings to run select test files in strict mode

Type __new__ instead of adding __init__

This follows through with Iwan's suggestion to type `__new__` instead of adding `__init__`. `Node.__init__` becomes a vestigial addition with his patch. Reported-by: Iwan Aucamp <aucampia@gmail.com> Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>

Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>

Reported-by: Iwan Aucamp <aucampia@gmail.com> Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>

test/test_issue1447.py

aucampia · 2021-12-21T17:16:42Z

I'm happy with this change, something which is maybe debatable is on which test files strict mode makes sense. I'm fine with it on the ones you enabled it for but would just want the other maintainers to give some feedback there. (CC: @ashleysommer @nicholascar @white-gecko ).

As I said in a comment before, I want to make mypy run on all test in normal non strict mode, basically all that will be needed for this is removal of ignore_errors = True (this) after fixing the issues raised - but it would be good to have it run on strict mode on at least some tests specifically designed to verifying typing, and for that I think test/test_typing.py is ideal. I will look at that after this PR is merged as I think the setup here provides a good basis for it.

Reported-by: Iwan Aucamp <aucampia@gmail.com> Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>

Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>

This is just to fit with practice used in other places in this repository. This patch has no impact on function. Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>

ajnelson-nist · 2021-12-21T17:37:52Z

I've adjusted a few more style matters, but otherwise think this patch series is ready for review and merge.

@aucampia , I agree with the general direction of eventually removing ignore_errors.

ajnelson-nist · 2021-12-21T17:38:56Z

Also, it seems I'm getting some CI chaos from networking-stack tests. Hopefully that doesn't confuse review.

nicholascar

Definitely steps in the right direction! Yes the literal typing is still not great. Last year I added a whole bunch more tests for literal types but not mypy or type hinting. It gets really hard to cater for all XSD datatypes AND Python datatypes AND custom datatypes since there's no guarentee that they all line up.

ajnelson-nist added 14 commits December 16, 2021 18:59

Have prepareQuery() pass with mypy-strict-reviewed consumers

d63bc91

Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>

Test Literals and SPARQL .prepareQuery() with mypy-strict-reviewed co…

2d1d53c

…nsumer Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>

Avoid use of built-in keyword as variable name

3f4d1d0

Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>

Add boolean sample to round-trip test

15b1710

Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>

Avoid implicit re-export that raises mypy --strict error

59b1a37

Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>

Test Literals further with mypy-strict-reviewed consumer

a210552

Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>

Make Graph.bind() pass with mypy-strict-reviewed consumers

41fc12e

Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>

Make Namespace() pass with mypy-strict-reviewed consumers

b6aa255

Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>

Test Graph.bind(), Namespace() with mypy-strict-reviewed consumer

d7803dc

Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>

Make util.guess_format() pass with mypy-strict-reviewed consumers

cbe4745

Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>

Add direct test of util.guess_format()

fe028e1

Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>

Test util.guess_format() with mypy-strict-reviewed consumer

53ca084

Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>

aucampia reviewed Dec 19, 2021

View reviewed changes

rdflib/namespace/__init__.py Outdated Show resolved Hide resolved

aucampia reviewed Dec 19, 2021

View reviewed changes

rdflib/term.py Outdated Show resolved Hide resolved

aucampia reviewed Dec 19, 2021

View reviewed changes

test/test_issue1447.py Outdated Show resolved Hide resolved

aucampia mentioned this pull request Dec 21, 2021

Add typing for parsers #1522

Merged

ajnelson-nist and others added 8 commits December 21, 2021 11:01

Merge branch 'master' into start_updates_for_mypy_strict

3576d5f

Merge pull request #8 from aucampia/iwana-20211219T1900-fix_mypy

f080da9

Fix mypy settings to run select test files in strict mode

Type __new__ instead of adding __init__

03d0971

Merge pull request #9 from aucampia/iwana-20211219T2004-type_new

afbb7ae

Type __new__ instead of adding __init__

Revert addition of Node.__init__

2c6821c

This follows through with Iwan's suggestion to type `__new__` instead of adding `__init__`. `Node.__init__` becomes a vestigial addition with his patch. Reported-by: Iwan Aucamp <aucampia@gmail.com> Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>

Reformat per configured 'black'

1440514

Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>

Use imported symbol

480bdf6

Reported-by: Iwan Aucamp <aucampia@gmail.com> Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>

Rely on pytest instead of "standalone" test-call mode

7bf70e4

Reported-by: Iwan Aucamp <aucampia@gmail.com> Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>

aucampia reviewed Dec 21, 2021

View reviewed changes

test/test_issue1447.py Outdated Show resolved Hide resolved

ajnelson-nist mentioned this pull request Dec 21, 2021

Name for union of URIRef and BNode? #1526

Closed

ajnelson-nist added 4 commits December 21, 2021 12:24

Rename test program and inline documentation on purpose

bc35352

Reported-by: Iwan Aucamp <aucampia@gmail.com> Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>

Fix typo

fb100cf

Align symbol import style

7c0665f

Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>

Use symbol-import style

b82a504

This is just to fit with practice used in other places in this repository. This patch has no impact on function. Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>

aucampia approved these changes Dec 21, 2021

View reviewed changes

aucampia requested review from nicholascar, ashleysommer and white-gecko December 21, 2021 19:49

more guess_format tests

0fd66ce

nicholascar approved these changes Dec 22, 2021

View reviewed changes

nicholascar merged commit 535eeae into RDFLib:master Dec 22, 2021

ajnelson-nist deleted the start_updates_for_mypy_strict branch August 4, 2022 19:56

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Start support for mypy --strict #1515

Start support for mypy --strict #1515

ajnelson-nist commented Dec 17, 2021

aucampia commented Dec 17, 2021

ajnelson-nist commented Dec 17, 2021 •

edited

Loading

ajnelson-nist commented Dec 17, 2021

ajnelson-nist commented Dec 17, 2021

veyndan commented Dec 17, 2021

aucampia commented Dec 18, 2021

aucampia commented Dec 19, 2021

aucampia commented Dec 19, 2021

aucampia commented Dec 19, 2021

aucampia commented Dec 19, 2021

aucampia commented Dec 19, 2021

aucampia commented Dec 19, 2021

aucampia commented Dec 19, 2021

aucampia commented Dec 21, 2021 •

edited

Loading

ajnelson-nist commented Dec 21, 2021

ajnelson-nist commented Dec 21, 2021

nicholascar left a comment

Start support for mypy --strict #1515

Start support for mypy --strict #1515

Conversation

ajnelson-nist commented Dec 17, 2021

aucampia commented Dec 17, 2021

ajnelson-nist commented Dec 17, 2021 • edited Loading

ajnelson-nist commented Dec 17, 2021

ajnelson-nist commented Dec 17, 2021

veyndan commented Dec 17, 2021

aucampia commented Dec 18, 2021

aucampia commented Dec 19, 2021

aucampia commented Dec 19, 2021

aucampia commented Dec 19, 2021

aucampia commented Dec 19, 2021

aucampia commented Dec 19, 2021

aucampia commented Dec 19, 2021

aucampia commented Dec 19, 2021

aucampia commented Dec 21, 2021 • edited Loading

ajnelson-nist commented Dec 21, 2021

ajnelson-nist commented Dec 21, 2021

nicholascar left a comment

Choose a reason for hiding this comment

ajnelson-nist commented Dec 17, 2021 •

edited

Loading

aucampia commented Dec 21, 2021 •

edited

Loading