-
-
Notifications
You must be signed in to change notification settings - Fork 30k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bpo-36030: Add _PyTuple_FromArray() function #11954
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer this PR other your 2 other PRs. But please make the API private, or even internal.
Objects/tupleobject.c
Outdated
@@ -419,14 +431,32 @@ tupleitem(PyTupleObject *a, Py_ssize_t i) | |||
return a->ob_item[i]; | |||
} | |||
|
|||
static PyObject * | |||
tuple_from_array(PyObject *const *src, Py_ssize_t n, Py_ssize_t step) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dislike step. I prefer to leave tuplesubscript() unchanged and avoid step complication here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason why this complexity better fits for tuplesubscript()
rather than for tuple_from_array()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer to keep tuple_from_array() simple. The compiler should be able to emit more efficient code without step.
The whole purpose of the change is performance, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I'm not mistaken compiler does its job and compile PyTuple_FromArray()
efficiently:
444 PyObject *item = src[cur];
0x00000000000c1128 <+200>: mov (%r12,%rax,8),%rdx
445 Py_INCREF(item);
446 dst[i] = item;
0x00000000000c1130 <+208>: mov %rdx,0x18(%rbx,%rax,8)
0x00000000000c1135 <+213>: add $0x1,%rax
0x00000000000c1139 <+217>: cmp %rax,%rbp
0x00000000000c113c <+220>: jne 0xc1128 <PyTuple_FromArray+200>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, others may have a different opinion, but I still don't think that adding a step argument is worth it for a single call site: tuplesubscript(). I would prefer a very straightforward tuple_from_array() without step argument: well, it becomes _PyTuple_FromArray() in this case. No need for a subfunction.
@@ -1276,45 +1276,19 @@ PyObject_CallFunctionObjArgs(PyObject *callable, ...) | |||
_Py_NO_INLINE PyObject * | |||
_PyStack_AsTuple(PyObject *const *stack, Py_ssize_t nargs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vstinner
Does it make sense to have _PyStack_AsTuple()
and _PyStack_AsTupleSlice()
after adding PyTuple_FromArray()
? Might make sense to remove them or replace with macros.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can remove _PyStack_AsTuple, but please do that in a second PR (once this one is merged, if it's merged ;-)).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would remove both _PyStack_AsTuple
and _PyStack_AsTupleSlice
. The former is just an alias of _PyTuple_FromArray
, and the latter is an alias with trivial arguments transformation, and is used only once.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current impementation is tuple_new_internal() in unsafe: that's a blocker issue for me, see my comment.
I'm just a little bit afraid by usage of tuple_new_internal(). I would free more comfortable if you can split this PR in two parts:
- Modify this PR to only add _PyTuple_FromArray() and use _PyTuple_FromArray(): "simplify the code"
- Once the first PR is merged, modify tupleobject.c to use tuple_new_internal(): your micro-optimization
I also proposed a 3rd PR to remove _PyStack_AsTuple() and replace it with _PyTuple_FromArray().
A simpler PR would be easy to review and it would be easier to take a decision.
Objects/tupleobject.c
Outdated
@@ -419,14 +431,32 @@ tupleitem(PyTupleObject *a, Py_ssize_t i) | |||
return a->ob_item[i]; | |||
} | |||
|
|||
static PyObject * | |||
tuple_from_array(PyObject *const *src, Py_ssize_t n, Py_ssize_t step) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, others may have a different opinion, but I still don't think that adding a step argument is worth it for a single call site: tuplesubscript(). I would prefer a very straightforward tuple_from_array() without step argument: well, it becomes _PyTuple_FromArray() in this case. No need for a subfunction.
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase |
ffbca9e
to
bc98990
Compare
I have made the requested changes; please review again @vstinner |
Thanks for making the requested changes! @vstinner: please review the changes made to this pull request. |
@serhiy-storchaka, @methane, @rhettinger: Would you mind to review this change? IMHO the change is good, but without "tuple_from_array(PyObject *const *src, Py_ssize_t n, Py_ssize_t step)": without the step argument. I asked @sir-sigurd to revert his optimization to not initialize items to NULL: he reverted it to make the PR simpler. He will propose a separated PR for this micro-optimization, once this one is merged. IMHO this change is good because the API _PyTuple_FromArray() is easy to use, it removes a lot of code, and may open the way to further optimization later (that should be discussed later). Ah, and the new function is private ("internal"), so it can be reverted anytime if needed. So the last thing is to remove the step argument. IMHO it's not worth it, it was used once in tuplesubscript. Keeping the existing tuplesubscript() code is better to avoid to make _PyTuple_FromArray() more complex and maybe slower (it's hard to check if the machine code is the most efficient on all architectures). But I expected than hardcoded step=1 helps the compiler and the CPU to run faster. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 if remove _PyStack_AsTuple
and _PyStack_AsTupleSlice
.
@@ -1276,45 +1276,19 @@ PyObject_CallFunctionObjArgs(PyObject *callable, ...) | |||
_Py_NO_INLINE PyObject * | |||
_PyStack_AsTuple(PyObject *const *stack, Py_ssize_t nargs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would remove both _PyStack_AsTuple
and _PyStack_AsTupleSlice
. The former is just an alias of _PyTuple_FromArray
, and the latter is an alias with trivial arguments transformation, and is used only once.
Yeah, I agree with that, but I asked @sir-sigurd to do that in a separated PR to keep this PR short and easier to review (see the length of the discussion of this PR ;-)). |
It removes 3x lines code. I agree this has net win. +1. |
@sir-sigurd: so please remove the "step" parameter. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Thanks @sir-sigurd, I merged your PR! You can now work on a PR to remove _PyStack_AsTuple() (replace it with _PyTuple_FromArray) and maybe also remove _PyStack_AsTupleSlice. @serhiy-storchaka proposed to remove both. I also would like a PR with your micro-optimization, it should now be way smaller, so easier to review. |
@@ -11,6 +11,7 @@ extern "C" { | |||
#include "tupleobject.h" | |||
|
|||
#define _PyTuple_ITEMS(op) (_PyTuple_CAST(op)->ob_item) | |||
PyAPI_FUNC(PyObject *) _PyTuple_FromArray(PyObject *const *, Py_ssize_t); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vstinner Was there a reason to export this as an API function, iso. using the extern
keyword?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I only started to no longer export symbols by using "extern" recently. Previously, I kept the habit of exporting symbols using PyAPI_FUNC/PyAPI_DATA. You can propose a PR to use extern.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. Would that affect the Stable API, @encukou?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, nothing in Include/internal/
is in the limited API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, thanks Petr.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vstinner this is somewhat alternative to #11927.
https://bugs.python.org/issue36030