Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-107557: Remove unnecessary SAVE_IP instructions #108583

Merged
merged 12 commits into from
Aug 29, 2023

Conversation

gvanrossum
Copy link
Member

@gvanrossum gvanrossum commented Aug 28, 2023

As a final optimization pass, squeeze out any SAVE_IP instructions that are not needed.

For example, if we have

SAVE_IP
LOAD_FAST
SAVE_IP
LOAD_GLOBAL

we can remove the first SAVE_IP, since LOAD_FAST can neither deoptimize nor error out; but we cannot remove the second SAVE_IP, since LOAD_GLOBAL can error out.

This pass requires new per-opcode flags, HAS_ERROR_FLAG and HAS_DEOPT_FLAG, which must be defined for regular ops as well as for uops. As a bonus we also remove redundant NOP opcodes.

(It may seem suboptimal to call squeeze_stubs() twice -- once after constructing an initial trace if it has stubs, once after removing unneeded uops; but to avoid this we'd need to change a bunch of APIs to return where the gap is. This seems simpler.)

@gvanrossum
Copy link
Member Author

(@Fidget-Spinner I'm not sure if reusing your gh-107557 as the issue in the PR subject is right. I've mostly been skipping news for optimizer/executor based changes because they don't matter to end users and only a few core devs even care. Those interested can look at the commit messages.)

@gvanrossum gvanrossum marked this pull request as ready for review August 28, 2023 22:14
Python/optimizer.c Show resolved Hide resolved
Python/optimizer.c Outdated Show resolved Hide resolved
@@ -372,6 +372,32 @@ static PyTypeObject UOpExecutor_Type = {
.tp_as_sequence = &uop_as_sequence,
};

static int
squeeze_stubs(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
squeeze_stubs(
shift_stubs(

(the name implied to me that something happens to the size of the stubs).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about move_stubs() (echoing that it uses memmove()).

Python/optimizer.c Show resolved Hide resolved
need_ip = true;
}
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assert here that last_instr >= 0?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I can just initialize it to 0 though. If the trace is empty we shouldn't get here but the code here won't do anything.

or variable_used(instr, "error")
or variable_used(instr, "pop_1_error")
or variable_used(instr, "exception_unwind")
or variable_used(instr, "resume_with_error")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could make variable_used accept a sequence of strings to check.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could, but why bother? Plus, I'd kind of like variable_used(instr, X, Y, Z) to look for the sequence of identifiers X Y Z. (And I'd also like to allow keywords, e.g. goto.)

Copy link
Member Author

@gvanrossum gvanrossum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the reviews! I've changed the function name, shortened the comment, and changed the initialization of last_instr. I'll merge once tests pass.

@@ -372,6 +372,32 @@ static PyTypeObject UOpExecutor_Type = {
.tp_as_sequence = &uop_as_sequence,
};

static int
squeeze_stubs(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about move_stubs() (echoing that it uses memmove()).

need_ip = true;
}
}
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I can just initialize it to 0 though. If the trace is empty we shouldn't get here but the code here won't do anything.

or variable_used(instr, "error")
or variable_used(instr, "pop_1_error")
or variable_used(instr, "exception_unwind")
or variable_used(instr, "resume_with_error")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could, but why bother? Plus, I'd kind of like variable_used(instr, X, Y, Z) to look for the sequence of identifiers X Y Z. (And I'd also like to allow keywords, e.g. goto.)

@gvanrossum gvanrossum enabled auto-merge (squash) August 29, 2023 16:15
@gvanrossum gvanrossum merged commit 4f22152 into python:main Aug 29, 2023
23 checks passed
@bedevere-bot
Copy link

⚠️⚠️⚠️ Buildbot failure ⚠️⚠️⚠️

Hi! The buildbot AMD64 Windows10 3.x has failed when building commit 4f22152.

What do you need to do:

  1. Don't panic.
  2. Check the buildbot page in the devguide if you don't know what the buildbots are or how they work.
  3. Go to the page of the buildbot that failed (https://buildbot.python.org/all/#builders/146/builds/6164) and take a look at the build logs.
  4. Check if the failure is related to this commit (4f22152) or if it is a false positive.
  5. If the failure is related to this commit, please, reflect that on the issue and make a new Pull Request with a fix.

You can take a look at the buildbot page here:

https://buildbot.python.org/all/#builders/146/builds/6164

Failed tests:

  • test.test_concurrent_futures.test_shutdown

Summary of the results of the build (if available):

== Tests result: FAILURE then SUCCESS ==

426 tests OK.

10 slowest tests:

  • test_math: 6 min 21 sec
  • test_peg_generator: 2 min 37 sec
  • test_wmi: 2 min 20 sec
  • test_tokenize: 2 min 5 sec
  • test.test_multiprocessing_spawn.test_processes: 2 min 2 sec
  • test_unparse: 2 min 2 sec
  • test_compileall: 1 min 31 sec
  • test_mmap: 1 min 29 sec
  • test_pickle: 1 min 26 sec
  • test_unicodedata: 1 min 25 sec

37 tests skipped:
test.test_asyncio.test_unix_events
test.test_multiprocessing_fork.test_manager
test.test_multiprocessing_fork.test_misc
test.test_multiprocessing_fork.test_processes
test.test_multiprocessing_fork.test_threads
test.test_multiprocessing_forkserver.test_manager
test.test_multiprocessing_forkserver.test_misc
test.test_multiprocessing_forkserver.test_processes
test.test_multiprocessing_forkserver.test_threads test_curses
test_dbm_gnu test_dbm_ndbm test_devpoll test_epoll test_fcntl
test_fork1 test_gdb test_grp test_ioctl test_kqueue test_openpty
test_perf_profiler test_perfmaps test_poll test_posix test_pty
test_pwd test_readline test_resource test_syslog
test_threadsignals test_wait3 test_wait4 test_xxlimited
test_xxtestfuzz test_zipfile64 test_zoneinfo

1 re-run test:
test.test_concurrent_futures.test_shutdown

Total duration: 22 min 32 sec

Click to see traceback logs
Traceback (most recent call last):
  File "D:\buildarea\3.x.bolen-windows10\build\Lib\test\support\__init__.py", line 203, in _force_run
    return func(*args)
           ^^^^^^^^^^^
PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'D:\\buildarea\\3.x.bolen-windows10\\build\\build\\test_python_7492�\\test_python_worker_10044�'


Traceback (most recent call last):
  File "D:\buildarea\3.x.bolen-windows10\build\Lib\runpy.py", line 198, in _run_module_as_main
    return _run_code(code, main_globals, None,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\buildarea\3.x.bolen-windows10\build\Lib\runpy.py", line 88, in _run_code
    exec(code, run_globals)
  File "D:\buildarea\3.x.bolen-windows10\build\Lib\test\__main__.py", line 2, in <module>
    main()
  File "D:\buildarea\3.x.bolen-windows10\build\Lib\test\libregrtest\main.py", line 826, in main
    Regrtest().main(tests=tests, **kwargs)
  File "D:\buildarea\3.x.bolen-windows10\build\Lib\test\libregrtest\main.py", line 756, in main
    with os_helper.temp_cwd(test_cwd, quiet=True):
  File "D:\buildarea\3.x.bolen-windows10\build\Lib\contextlib.py", line 159, in __exit__
    self.gen.throw(value)
  File "D:\buildarea\3.x.bolen-windows10\build\Lib\test\support\os_helper.py", line 531, in temp_cwd
    with temp_dir(path=name, quiet=quiet) as temp_path:
  File "D:\buildarea\3.x.bolen-windows10\build\Lib\contextlib.py", line 159, in __exit__
    self.gen.throw(value)
  File "D:\buildarea\3.x.bolen-windows10\build\Lib\test\support\os_helper.py", line 485, in temp_dir
    rmtree(path)
  File "D:\buildarea\3.x.bolen-windows10\build\Lib\test\support\os_helper.py", line 442, in rmtree
    _rmtree(path)
  File "D:\buildarea\3.x.bolen-windows10\build\Lib\test\support\os_helper.py", line 385, in _rmtree
    _waitfor(_rmtree_inner, path, waitall=True)
  File "D:\buildarea\3.x.bolen-windows10\build\Lib\test\support\os_helper.py", line 330, in _waitfor
    func(pathname)
  File "D:\buildarea\3.x.bolen-windows10\build\Lib\test\support\os_helper.py", line 382, in _rmtree_inner
    _force_run(fullname, os.rmdir, fullname)
  File "D:\buildarea\3.x.bolen-windows10\build\Lib\test\support\__init__.py", line 214, in _force_run
    return func(*args)
           ^^^^^^^^^^^
PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'D:\\buildarea\\3.x.bolen-windows10\\build\\build\\test_python_7492�\\test_python_worker_10044�'


Traceback (most recent call last):
  File "D:\buildarea\3.x.bolen-windows10\build\Lib\test\support\os_helper.py", line 480, in temp_dir
    yield path
  File "D:\buildarea\3.x.bolen-windows10\build\Lib\test\support\os_helper.py", line 533, in temp_cwd
    yield cwd_dir
  File "D:\buildarea\3.x.bolen-windows10\build\Lib\test\libregrtest\main.py", line 762, in main
    self._main(tests, kwargs)
  File "D:\buildarea\3.x.bolen-windows10\build\Lib\test\libregrtest\main.py", line 821, in _main
    sys.exit(0)
SystemExit: 0

@gvanrossum gvanrossum deleted the omit-save-ip branch August 29, 2023 17:30
@gvanrossum
Copy link
Member Author

Benchmarking results show this would be 4% slower than main if the optimizer was always on (it isn't).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants