-
-
Notifications
You must be signed in to change notification settings - Fork 30k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bpo-29735: Optimize partial_call(): avoid tuple #516
Conversation
@Haypo, thanks for your PR! By analyzing the history of the files in this pull request, we identified @serhiy-storchaka, @rhettinger, @benjaminp, @abalkin and @1st1 to be potential reviewers. |
New benchmark results: http://bugs.python.org/issue29735#msg289579 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May be the optimization needs more test cases for covering all code.
Objects/call.c
Outdated
} | ||
else if (PyCFunction_Check(callable)) { | ||
PyMethodDef *method = ((PyCFunctionObject*)callable)->m_ml; | ||
int flags = method->ml_flags & ~(METH_CLASS | METH_STATIC | METH_COEXIST); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think clearing these flags is not needed.
It would be better to use PyCFunction_GET_FLAGS()
.
In sum this function can be written as:
return PyFunction_Check(callable) ||
(PyCFunction_Check(callable) &&
!(PyCFunction_GET_FLAGS(callable) & METH_VARARGS));
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"I think clearing these flags is not needed. It would be better to use PyCFunction_GET_FLAGS()."
Right, done. But I prefer explicit if().
@@ -110,6 +111,8 @@ partial_new(PyTypeObject *type, PyObject *args, PyObject *kw) | |||
return NULL; | |||
} | |||
|
|||
pto->use_fastcall = _PyObject_HasFastCall(func); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pto->fn
can be changed by __setstate__()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, nice catch. Fixed.
Modules/_functoolsmodule.c
Outdated
stack = &PyTuple_GET_ITEM(args, 0); | ||
nargs = PyTuple_GET_SIZE(args); | ||
argappl = NULL; | ||
Py_INCREF(args); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since concatenating with empty tuple is optimized, perhaps these two special cases no longer needed. -10 lines of code. Could you tests this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tested: removing the two special case (size==0) has no impact on performance, so I removed it.
Modules/_functoolsmodule.c
Outdated
nargs2 = pto_nargs; | ||
} | ||
else { | ||
nargs2 = pto_nargs + nargs; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be used for all branches. -2 lines of code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
Modules/_functoolsmodule.c
Outdated
} | ||
} | ||
|
||
ret = _PyObject_FastCallDict(pto->fn, stack, nargs2, kwargs2); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there are keyword arguments _PyObject_FastCallDict
allocates new array for arguments. I don't know whether it is worth to optimize this case and unpack keyword arguments here. This could complicate the code too much and I don't know how large is a benefit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a corner case: dict.update() really expects a dict.
Your proposed optimizations seem complex for a little gain.
Modules/_functoolsmodule.c
Outdated
nargs * sizeof(PyObject*)); | ||
} | ||
|
||
if (PyDict_GET_SIZE(pto->kw) == 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems the code for merging two dicts is duplicated. Could it be shared?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
Modules/_functoolsmodule.c
Outdated
@@ -322,6 +366,7 @@ partial_setstate(partialobject *pto, PyObject *state) | |||
Py_INCREF(dict); | |||
|
|||
Py_SETREF(pto->fn, fn); | |||
pto->use_fastcall = _PyObject_HasFastCall(fn); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it before Py_SETREF()
. The partial object should be consistent at the time of destroying old callable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
Please check that new code is covered by tests. Otherwise add new tests. |
* Add _PyObject_HasFastCall() * partial_call() now avoids temporary tuple to pass positional arguments if the callable supports the FASTCALL calling convention for positional arguments. * Fix also a performance regression in partial_call() if the callable doesn't support FASTCALL.
All code paths of partial_fastcall() and partial_call_impl() are tested by test_functools. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hope it is worth the complication of the code.
Thank you for your review @serhiy-storchaka ! |
Bumps [gidgethub](https://github.com/brettcannon/gidgethub) from 5.0.1 to 5.1.0. - [Release notes](https://github.com/brettcannon/gidgethub/releases) - [Changelog](https://github.com/brettcannon/gidgethub/blob/main/docs/changelog.rst) - [Commits](gidgethub/gidgethub@5.0.1...v5.1.0) --- updated-dependencies: - dependency-name: gidgethub dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Avoid temporary tuple to pass positional arguments.