Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-44594: fix (Async)ExitStack handling of __context__ #27089

Merged
merged 4 commits into from
Oct 4, 2021

Conversation

belm0
Copy link
Contributor

@belm0 belm0 commented Jul 12, 2021

Make enter_[async_]context(foo()) equivalent to [async] with foo() regarding __context__ when an exception is raised.

Previously, exceptions would be caught and re-raised with the wrong context when explicitly overriding __context__ with None. In other words, in usage like the following where we explicitly try to clear the context to None on an exception in flight, (Async)ExitStack will overwrite that with the context of the old exception, while the real (async) with will not:

exc = foo()
try:
    raise exc
finally:
    exc.__context__ = None  # ExitStack will stomp on this value

This bug seems to have existed from the first version of (Async)ExitStack. Namely, the _fix_exception_context() code was trying to substitute a suitable value for __context__ when the exception was raised with a context of None. But this handling was not needed, since raising with __context__ of None will be replaced with a suitable context by Python's machinery before (Async)ExitStack accesses the exception via sys.exc_info(). Since this handling in (Async)ExitStack had no purpose and interferes with the valid use case of clearing __context__ on an in-flight exception, this PR removes it.

Included are unit tests verifying that (Async)ExitStack now match the behavior of (async) with when clearing the context of an in-flight exception to None.

TODO:

  • unit tests
  • make sure this PR is backported to Python 3.10 and older

https://bugs.python.org/issue44594

Make enter_context(foo()) / enter_async_context(foo()) equivalent to
`[async] with foo()` regarding __context__ when an exception is raised.

Previously exceptions would be caught and re-raised with the wrong
context when explicitly overriding __context__ with None.

TODO: unit tests
Comment on lines -534 to +537
if exc_context is old_exc:
if exc_context is None or exc_context is old_exc:
# Context is already set correctly (see issue 20317)
return
if exc_context is None or exc_context is frame_exc:
if exc_context is frame_exc:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that an exception raised with __context__ of None will already have __context__ populated before ExitStack receives it. Therefore this existing handling of None was not necessary and not handling the intended case.

When __context__ is None here, it means that it was explicitly set this way at the raise site (via a finally block). So we should not override it to old_exc. This is in line with the behavior of actual nested with statements.

@belm0 belm0 marked this pull request as ready for review July 12, 2021 10:33
@belm0
Copy link
Contributor Author

belm0 commented Jul 14, 2021

@ncoghlan I noticed you've stewarded the exception chaining in ExitStack, would you have time to review this PR?

noting:

  • I think it's agreed that ExitStack should behave like with blocks as much as possible
  • the change proposed in this PR did not break existing tests. I don't think there is a use case for the implementation as it is (see comment), but if there is one it should be represented in tests and confirmed against with.
  • new unit test confirms the proposed change matches with

@belm0
Copy link
Contributor Author

belm0 commented Jul 27, 2021

comments from graingert:

How does this interact with the StopIteration RuntimeError promotion cause check in @contextmanager?
This thing https://github.com/python/cpython/blob/main/Lib/contextlib.py#L171
I think try raising a StopIteration error in such a way it will become a RuntimeError that then has its cause removed?

@graingert graingert mentioned this pull request Jul 27, 2021
@njsmith
Copy link
Contributor

njsmith commented Oct 3, 2021

This has apparently been causing a lot of pain for some trio users: python-trio/trio#2001

This is some very tricky code, but having finally found the spoons to read it over carefully, I think @belm0's analysis is right.

To summarize: ExitStack.__exit__ runs a series of cleanup functions in a loop. And ExitStack wants to simulate a set of nested with blocks, where if one cleanup function raises an exception, then it sets the exception context for the next cleanup function, so __context__ chaining works as expected.

However, the cleanup functions are actually all called directly from ExitStack.__exit__'s context. So they all run with excinfo set to whatever exception was raised into ExitStack.__exit__, and that's what the interpreter's automatic __context__ propagation attaches to exceptions inside the cleanup functions.

So the job of _fix_exception_context is to find the __context__ that the interpreter set, and replace it with the one that it would have been set if we had really used nested with blocks. It does this by walking the context chain until it finds the __context__ that was set by the interpreter (or at least, looks like it was set by the interpreter), and then replaces it. The intuition is that since we're trying to simulate the interpreter's built-in __context__ chaining, the easiest way to do that is to let the interpreter do its thing -- by definition, it knows how to do it properly! -- and then just fix up the value after the fact.

However, if _fix_exception_context can't find any __context__ that was set by the interpreter, then it eventually hits an exception with __context__ = None, denoting the end of the context chain. And currently, when it sees this, it overwrites that None with the context exception. Which... doesn't really make any sense. If the interpreter didn't already set the frame_exc as the __context__, then we shouldn't either. I guess this case was added because you need to do something when you hit the end of the chain, and maybe there was some sense that it's better to be err on the side of attaching more context rather than less? But, this defensive coding now turns out to be causing problems, and I think we can trust the interpreter to do things properly. So simply removing overwrite for the None case is correct.

@njsmith
Copy link
Contributor

njsmith commented Oct 4, 2021

BTW, this whole strategy of fixing up the exception metadata after the fact is inherently heuristic, and has some very subtle corner cases. For example, consider this cleanup function:

def cleanup():
    try:
        raise OSError   # the original exception
    except OSError as exc:
        raise MyLibError from exc   # a wrapper exception, whose __cause__ points to the original exception

Here, we want to get a display like: MyLibError was directly caused by OSError, which was raised in the context of [whatever the previous cleanup function raised]. But... the OSError will initially have its __context__ set to the wrong thing. And _fix_exception_context doesn't know how to walk the __cause__ chain, so it won't find it to replace with the correct thing.

...Except this case actually works! The reason is that the interpreter will also set MyLibError.__context__ to point to the OSError. Normally this is a total no-op, because __context__ is ignored when __cause__ is set. But _fix_exception_context walks up MyLibError.__context__ even though it should be ignored, and accidentally fixes up MyLibError.__cause__ in the process, because MyLibError.__context__ and MyLibError.__cause__ both point to the same object.

OTOH, if we slightly tweak the cleanup function, in a way that doesn't look like it should affect anything...

def cleanup2():
    try:
        raise OSError
    except OSError as exc:
        saved_exc = exc
    raise MyLibError from saved_exc

...then now _fix_exception_context will get confused, and we'll end up with incorrect exception metadata.

It might be better to trick the interpreter into using the correct excinfo from the start, so no cleanup is necessary. It's obnoxious to set up though, because there isn't any built-in way to set excinfo without also mutating the exception object. I think you'd need something like:

            saved_ctx = pending_raise.__context__
            saved_tb = pending_raise.__traceback__
            try:
                raise pending_raise
           except type(pending_raise):
                pending_raise.__context__ = saved_ctx
                pending_raise.__traceback__ = saved_tb
                # invoke the next __exit__ callback
                if cb(*exc_details):
                    suppressed_exc = True
                    pending_raise = False
                    exc_details = (None, None, None)

...And also a separate branch to handle the case where pending_raise is None.

.......ANYWAY though, none of that actually affects this PR, it's just musing on further possible improvements while I happen to have this swapped into my brain.

@njsmith
Copy link
Contributor

njsmith commented Oct 4, 2021

Filed https://bugs.python.org/issue45361 about a possible improvement to fix this once and for all

@njsmith njsmith added needs backport to 3.9 only security fixes needs backport to 3.10 only security fixes labels Oct 4, 2021
@njsmith njsmith merged commit e6d1aa1 into python:main Oct 4, 2021
@miss-islington
Copy link
Contributor

Thanks @belm0 for the PR, and @njsmith for merging it 🌮🎉.. I'm working now to backport this PR to: 3.9, 3.10.
🐍🍒⛏🤖

@miss-islington
Copy link
Contributor

Sorry, @belm0 and @njsmith, I could not cleanly backport this to 3.10 due to a conflict.
Please backport using cherry_picker on command line.
cherry_picker e6d1aa1ac65b6908fdea2c70ec3aa8c4f1dffcb5 3.10

@miss-islington
Copy link
Contributor

Sorry @belm0 and @njsmith, I had trouble checking out the 3.9 backport branch.
Please backport using cherry_picker on command line.
cherry_picker e6d1aa1ac65b6908fdea2c70ec3aa8c4f1dffcb5 3.9

belm0 added a commit to belm0/cpython that referenced this pull request Oct 5, 2021
…ngh-27089)

* bpo-44594: fix (Async)ExitStack handling of __context__

Make enter_context(foo()) / enter_async_context(foo()) equivalent to
`[async] with foo()` regarding __context__ when an exception is raised.

Previously exceptions would be caught and re-raised with the wrong
context when explicitly overriding __context__ with None..
(cherry picked from commit e6d1aa1)

Co-authored-by: John Belmonte <john@neggie.net>
belm0 added a commit to belm0/cpython that referenced this pull request Oct 5, 2021
…gh-27089)

* bpo-44594: fix (Async)ExitStack handling of __context__

Make enter_context(foo()) / enter_async_context(foo()) equivalent to
`[async] with foo()` regarding __context__ when an exception is raised.

Previously exceptions would be caught and re-raised with the wrong
context when explicitly overriding __context__ with None..
(cherry picked from commit e6d1aa1)

Co-authored-by: John Belmonte <john@neggie.net>
@belm0 belm0 deleted the exit_stack_context branch October 5, 2021 06:11
miss-islington pushed a commit that referenced this pull request Oct 5, 2021
) (GH-28730)

Make enter_context(foo()) / enter_async_context(foo()) equivalent to
`[async] with foo()` regarding __context__ when an exception is raised.

Previously exceptions would be caught and re-raised with the wrong
context when explicitly overriding __context__ with None..
(cherry picked from commit e6d1aa1)

Co-authored-by: John Belmonte <john@neggie.net>

Automerge-Triggered-By: GH:njsmith
miss-islington pushed a commit that referenced this pull request Oct 5, 2021
…) (GH-28731)

Make enter_context(foo()) / enter_async_context(foo()) equivalent to
`[async] with foo()` regarding __context__ when an exception is raised.

Previously exceptions would be caught and re-raised with the wrong
context when explicitly overriding __context__ with None..
(cherry picked from commit e6d1aa1)

Co-authored-by: John Belmonte <john@neggie.net>

Automerge-Triggered-By: GH:njsmith
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs backport to 3.9 only security fixes needs backport to 3.10 only security fixes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants