bpo-30747: Attempt to fix atomic load/store #2383

Paxxi · 2017-06-24T20:16:28Z

_Py_atomic_* are currently not implemented as atomic operations
when building with MSVC. This patch attempts to implement parts
of the functionality required.

https://bugs.python.org/issue30747

mention-bot · 2017-06-24T20:16:29Z

@Paxxi, thanks for your PR! By analyzing the history of the files in this pull request, we identified @jyasskin, @benjaminp and @akheron to be potential reviewers.

the-knights-who-say-ni · 2017-06-24T20:16:30Z

Hello, and thanks for your contribution!

I'm a bot set up to make sure that the project can legally accept your contribution by verifying you have signed the PSF contributor agreement (CLA).

Unfortunately our records indicate you have not signed the CLA. For legal reasons we need you to sign this before we can look at your contribution. Please follow the steps outlined in the CPython devguide to rectify this issue.

Thanks again for your contribution, and we look forward to reviewing it!

pitrou · 2017-06-29T18:47:21Z

It seems you broke the build on Unix:
https://travis-ci.org/python/cpython/jobs/246629729#L1145

Parser/myreadline.c:311:33: error: implicit declaration of function '_Py_atomic_load_relaxed' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
    if (_PyOS_ReadlineTState == PyThreadState_GET()) {
                                ^
./Include/pystate.h:252:31: note: expanded from macro 'PyThreadState_GET'
             ((PyThreadState*)_Py_atomic_load_relaxed(&_PyThreadState_Current))
                              ^

pitrou · 2017-06-29T18:47:55Z

Include/pyatomic.h

@@ -87,8 +93,9 @@ typedef struct _Py_atomic_int {
            || (ORDER) == __ATOMIC_CONSUME),                  \
     __atomic_load_n(&(ATOMIC_VAL)->_value, ORDER))

-#else
-
+/* Only support GCC (for expression statements) and x86 (for simple


Perhaps remove this comment?

I was thinking of updating it to reflect the current state.

Looks like you forgot to update it :-)

pitrou · 2017-06-29T18:48:33Z

Include/pyatomic.h

@@ -230,8 +321,6 @@ _Py_ANNOTATE_MEMORY_ORDER(const volatile void *address, _Py_memory_order order)
 #define _Py_atomic_load_explicit(ATOMIC_VAL, ORDER) \
    ((ATOMIC_VAL)->_value)

-#endif  /* !gcc x86 */
-#endif


Looks like you shouldn't have moved the second #endif below.
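One plausible reading of the Unix build break is that a misplaced #endif left the convenience macros inside a branch that GCC and Clang builds never take. The sketch below shows the intended shape under that assumption: every platform branch defines the same primitive, the chain is closed once, and the shared wrapper comes after it. All DEMO_* and demo_* names are hypothetical and are not the real pyatomic.h contents.

```c
/* Hypothetical miniature of a per-compiler header; not the real pyatomic.h. */
typedef struct { int _value; } demo_atomic_int;

#if defined(__GNUC__) || defined(__clang__)
/* Branch 1: GCC/Clang builtins. */
#define DEMO_ORDER_RELAXED __ATOMIC_RELAXED
#define DEMO_LOAD_EXPLICIT(ATOMIC_VAL, ORDER) \
    __atomic_load_n(&(ATOMIC_VAL)->_value, ORDER)
#elif defined(_MSC_VER)
/* Branch 2: MSVC. A volatile read stands in for the real implementation. */
#define DEMO_ORDER_RELAXED 0
#define DEMO_LOAD_EXPLICIT(ATOMIC_VAL, ORDER) \
    (*(volatile int *)&(ATOMIC_VAL)->_value)
#else
/* Branch 3: plain, non-atomic fallback. */
#define DEMO_ORDER_RELAXED 0
#define DEMO_LOAD_EXPLICIT(ATOMIC_VAL, ORDER) ((ATOMIC_VAL)->_value)
#endif  /* closes the whole chain, before the shared wrappers */

/* Shared wrapper: defined after the #endif so every branch can use it. */
#define DEMO_LOAD_RELAXED(ATOMIC_VAL) \
    DEMO_LOAD_EXPLICIT(ATOMIC_VAL, DEMO_ORDER_RELAXED)

static int demo_read(demo_atomic_int *v)
{
    return DEMO_LOAD_RELAXED(v);
}
```

If the closing #endif slid below DEMO_LOAD_RELAXED, only the last branch would define the wrapper, and other compilers would report an implicit function declaration like the one in the Travis log.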

pitrou · 2017-06-29T18:50:47Z

Include/pyatomic.h

+#define _Py_atomic_store_64bit(ATOMIC_VAL, NEW_VAL, ORDER) \
+    switch (ORDER) { \
+    case _Py_memory_order_acquire: \
+      _InterlockedExchange64_HLEAcquire((__int64 volatile*)ATOMIC_VAL, (__int64)NEW_VAL); \


Is it a runtime check whether HLE is available or not, or is it a compile-time check? If compile-time, it means we may mistakenly produce builds that won't run on non-HLE processors. @zooba

According to the documentation, these will use HLEAcquire if the CPU supports it and otherwise behave like a regular _InterlockedExchange64. Docs here: https://msdn.microsoft.com/en-us/library/1s26w950.aspx

specifically this bit

On Intel platforms that support Hardware Lock Elision (HLE) instructions, the intrinsics with _HLEAcquire and _HLERelease suffixes include a hint to the processor that can accelerate performance by eliminating a lock write step in hardware. If these intrinsics are called on platforms that do not support HLE, the hint is ignored.

I understand that. My question is whether the CPU detection happens at compile time or at run time. I can't find any information online...

There's no detection happening. Intel seems to have found a way to make it backwards compatible, so older CPUs will ignore the extra hint.

I tested this quickly and here's the generated assembly for the HLE and non HLE versions

; 12 : auto a = _InterlockedExchange_HLEAcquire(&test, 1);
    mov eax, 1
    lea ecx, DWORD PTR _test$[ebp]
    xacquire xchg DWORD PTR [ecx], eax
    mov DWORD PTR _a$[ebp], eax

; 13 : auto b = _InterlockedExchange(&test, 2);
    mov eax, 2
    lea ecx, DWORD PTR _test$[ebp]
    xchg DWORD PTR [ecx], eax
    mov DWORD PTR _b$[ebp], eax

That's great, thank you!
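To make the point above concrete, here is a minimal sketch of an acquire exchange that uses the HLE-suffixed intrinsic on MSVC/x86 with no CPU detection at all. The function name demo_exchange_acquire is hypothetical, and the non-MSVC branch assumes GCC or Clang and uses their __atomic builtin purely so the example compiles elsewhere.

```c
#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
#include <intrin.h>

/* MSVC on x86/x64: the intrinsic emits "xacquire xchg". Per the MSDN text
 * quoted above, processors without HLE ignore the hint. */
static long demo_exchange_acquire(volatile long *target, long value)
{
    return _InterlockedExchange_HLEAcquire(target, value);
}
#else
/* Assumption: GCC or Clang. Plain acquire exchange, no HLE hint. */
static long demo_exchange_acquire(volatile long *target, long value)
{
    return __atomic_exchange_n(target, value, __ATOMIC_ACQUIRE);
}
#endif
```

Either way the call stores the new value and returns the previous one, so callers do not need to know whether the hint was honoured.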

pitrou · 2017-06-29T18:53:54Z

Include/pyatomic.h

+  }
+
+#define _Py_atomic_signal_fence(/*memory_order*/ ORDER) ((void)0)
+#define _Py_atomic_thread_fence(/*memory_order*/ ORDER) ((void)0)


Don't you need to copy what the GCC version does for these two functions?

I'm not entirely sure yet, I'm thinking that it's not required but I'm looking into it further to be sure.

pitrou · 2017-06-29T18:54:08Z

Include/pyatomic.h

+    _Py_atomic_store_32bit(ATOMIC_VAL._value, NEW_VAL, ORDER) } 
+
+#define _Py_atomic_load_explicit(ATOMIC_VAL, ORDER) \
+    ((ATOMIC_VAL)->_value)


Same comment as for the fence functions.

I'm working on it. Will be updating this PR with the load implementation and the bits for MSVC/ARM as well. Had to spend this week debugging another issue so I'm planning on having everything done around Monday/Tuesday next week.

Paxxi · 2017-07-05T09:31:31Z

This should be ready now. Test suite passes for x86/x64, completely untested for ARM as I don't have any suitable environment.

I don't really know how to validate that they're correctly atomic. The operations aren't really suitable to implement a ref count or a mutex as a test case so I'm open to ideas of any way to test this.

pitrou · 2017-07-08T15:23:54Z

I don't really know how to validate that they're correctly atomic.

I don't think we need to. The potential issues here are extreme edge case race conditions that may happen once a month on heavily-loaded production systems. There's little we can check for in the test suite, IMHO.

Longer term, one possibility would be to build Python with thread sanitizer. That's not compatible with MSVC AFAIK, though.

pitrou · 2017-07-08T15:24:28Z

@Paxxi, partially related, but have you seen PR #2417 and do you have any opinion about it?

pitrou · 2017-07-09T17:08:34Z

Include/pyatomic.h

+}
+
+#define _Py_atomic_signal_fence(/*memory_order*/ ORDER) ((void)0)
+#define _Py_atomic_thread_fence(/*memory_order*/ ORDER) ((void)0)


On gcc/x86, _Py_atomic_thread_fence issues an mfence assembler instruction. Why doesn't it do the same here? Since it's the same CPU architecture, it should probably issue the same instruction...

And apparently MSVC has a MemoryBarrier macro.

These macros aren't used anywhere in the code base except in pyatomic.h for the gcc implementation so I skipped them.

Can implement them using the MemoryBarrier macro if required.

I see. Then it's simpler to remove them, I think.
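For reference, had the fence macros been kept, a MemoryBarrier-based version might have looked roughly like the sketch below. This is only an illustration: DEMO_THREAD_FENCE and the demo_* helpers are hypothetical names, and the non-MSVC branch assumes GCC or Clang.

```c
#if defined(_MSC_VER)
#include <windows.h>   /* provides the MemoryBarrier() macro */
#define DEMO_THREAD_FENCE() MemoryBarrier()
#else
/* Assumption: GCC or Clang. */
#define DEMO_THREAD_FENCE() __atomic_thread_fence(__ATOMIC_SEQ_CST)
#endif

static int demo_payload;
/* volatile int for brevity; real code would use a proper atomic type. */
static volatile int demo_ready;

/* Writer side: make the payload visible before raising the flag. */
static void demo_publish(int value)
{
    demo_payload = value;
    DEMO_THREAD_FENCE();
    demo_ready = 1;
}

/* Reader side: returns -1 until the flag has been raised. */
static int demo_consume(void)
{
    if (!demo_ready) {
        return -1;
    }
    DEMO_THREAD_FENCE();
    return demo_payload;
}
```

A single-threaded run only shows that the macro expands and the values survive; the ordering effect itself is not observable without concurrent threads.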

pitrou · 2017-07-09T17:14:21Z

Include/pyatomic.h

+#define _Py_atomic_thread_fence(/*memory_order*/ ORDER) ((void)0)
+
+#define _Py_atomic_store_explicit(ATOMIC_VAL, NEW_VAL, ORDER) \
+  if (sizeof(*ATOMIC_VAL._value) == 8) { \


Instead of pseudo-runtime checks, you may want to use a true compile-time check using e.g. SIZEOF_VOID_P.

Perhaps that can help you get rid of some dummy definitions, by the way (for example the 64-bit load/store macros on 32-bit builds).

Can't do it at compile time since the macro is used for both _Py_atomic_address and _Py_atomic_int which will be different sizes on 64-bit Windows. int will still be 4 bytes so the runtime check is required to handle it correctly.
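The trade-off discussed here can be sketched as follows: one store macro serves two wrapper types of different widths, so it branches on sizeof of the wrapped value, which the compiler can fold to a constant. The names (DEMO_STORE, demo_store_32, demo_store_64, and the two struct types) are hypothetical, and the GCC/Clang __atomic builtins stand in for the MSVC interlocked intrinsics used in the patch. A 64-bit target is assumed for the 8-byte case.

```c
#include <stdint.h>

/* Hypothetical stand-ins for _Py_atomic_address and _Py_atomic_int. */
typedef struct { uintptr_t _value; } demo_atomic_address;
typedef struct { int _value; } demo_atomic_int;

/* Assumption: GCC or Clang builtins in place of Interlocked* intrinsics. */
static void demo_store_32(volatile int32_t *p, int32_t v)
{
    __atomic_store_n(p, v, __ATOMIC_SEQ_CST);
}

static void demo_store_64(volatile int64_t *p, int64_t v)
{
    __atomic_store_n(p, v, __ATOMIC_SEQ_CST);
}

/* The sizeof test is a constant expression, so only one arm survives
 * optimization, but both arms must still compile for both types. */
#define DEMO_STORE(ATOMIC_VAL, NEW_VAL) \
    do { \
        if (sizeof((ATOMIC_VAL)->_value) == 8) { \
            demo_store_64((volatile int64_t *)&(ATOMIC_VAL)->_value, \
                          (int64_t)(NEW_VAL)); \
        } else { \
            demo_store_32((volatile int32_t *)&(ATOMIC_VAL)->_value, \
                          (int32_t)(NEW_VAL)); \
        } \
    } while (0)
```

Because the untaken arm is still compiled, a compiler may warn about the casts in it even though that code can never run.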

pitrou · 2017-07-09T17:16:34Z

Include/pyatomic.h

+
+
+#if defined(_M_X64) 
+#define _Py_atomic_store_64bit(ATOMIC_VAL, NEW_VAL, ORDER) \


Ok, I think the load and store macros are ok as defined, but that's because MSDN says "Most of the interlocked functions provide full memory barriers on all Windows platforms". Perhaps you can add a comment to that effect?

pitrou · 2017-07-09T17:18:25Z

@zooba, do you think you can get some expert at Microsoft to take a look at this?
Otherwise, while I can't vouch for the correctness, this can't be worse than the status quo, as long as it compiles fine :-)

Paxxi · 2017-07-13T17:13:17Z

Added comment about interlocked functions and removed the fence macros as they're not used.

pitrou · 2017-07-17T09:53:52Z

Include/pyatomic.h

+    _Py_atomic_load_64bit(ATOMIC_VAL._value, ORDER) : \
+    _Py_atomic_load_32bit(ATOMIC_VAL._value, ORDER) \
+  )
+  _WriteBarrier


Are you sure about this? According to MSDN, those barrier intrinsics are only defined on x86 and x86-64.

Also, the macro doesn't look like it would expand correctly...

No, that's a mistake. No idea how that snuck in there.
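As an illustration of the expansion concern, a load macro built on the conditional operator has to be a single parenthesized expression with nothing trailing it. The sketch below uses hypothetical names (DEMO_LOAD, demo_load_32, demo_load_64, and the demo_ld_* types) and assumes GCC or Clang builtins on a 64-bit target rather than the MSVC intrinsics from the patch.

```c
#include <stdint.h>

/* Hypothetical stand-ins for _Py_atomic_address and _Py_atomic_int. */
typedef struct { uintptr_t _value; } demo_ld_address;
typedef struct { int _value; } demo_ld_int;

/* Assumption: GCC or Clang builtins in place of the MSVC implementation. */
static int64_t demo_load_64(volatile int64_t *p)
{
    return __atomic_load_n(p, __ATOMIC_SEQ_CST);
}

static int32_t demo_load_32(volatile int32_t *p)
{
    return __atomic_load_n(p, __ATOMIC_SEQ_CST);
}

/* One parenthesized expression: usable on the right-hand side of an
 * assignment or inside a comparison. Any token placed after the closing
 * parenthesis would be pasted into every use site. */
#define DEMO_LOAD(ATOMIC_VAL) \
    ((sizeof((ATOMIC_VAL)->_value) == 8) \
        ? demo_load_64((volatile int64_t *)&(ATOMIC_VAL)->_value) \
        : (int64_t)demo_load_32((volatile int32_t *)&(ATOMIC_VAL)->_value))
```

Both arms are widened to the same 64-bit type so the expression has one well-defined type regardless of which wrapper is passed in.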

pitrou · 2017-07-17T13:36:55Z

@zooba, do you want to review this pre-merge or should it go ahead?

pitrou · 2017-08-06T12:05:47Z

I think this is ready to merge. Would you like to add a NEWS entry using the "blurb" utility?

Paxxi · 2017-08-07T06:52:15Z

I'm not much of a writer so I'd rather skip writing anything about it. Feel free to do it on my behalf if it's considered newsworthy enough.

pitrou · 2017-08-08T10:04:00Z

Well, I'm not sure how to edit your branch, perhaps you can try applying the following patch to it:

$ git diff --cached 
diff --git a/Misc/NEWS.d/next/Core and Builtins/2017-08-08-12-00-29.bpo-30747.g2kZRT.rst b/Misc/NEWS.d/next/Core and Builtins/2017-08-08-12-00-29.bpo-30747.g2kZRT.rst
new file mode 100644
index 0000000000..04a726a7e6
--- /dev/null
+++ b/Misc/NEWS.d/next/Core and Builtins/2017-08-08-12-00-29.bpo-30747.g2kZRT.rst       
@@ -0,0 +1,2 @@
+Add a non-dummy implementation of _Py_atomic_store and _Py_atomic_load on
+MSVC.

Paxxi · 2017-08-08T11:30:48Z

Thanks! I added it to the commit now.

The default setting on GitHub, AFAIK, is that repo owners can push directly to a contributor's branch for a PR, but I didn't know it needed to be part of the commit.

Should be all set now.

pitrou · 2017-08-12T09:19:27Z

Ok, thank you!

* bpo-9566: Silence warnings from pyatomic.h macros. Apparently MSVC is too stupid to understand that the alternate branch is not taken and emits a warning for it. Warnings added in #2383.
* bpo-9566: A better fix for the pyatomic.h warning
* bpo-9566: Remove a slash

the-knights-who-say-ni added the CLA not signed label Jun 24, 2017

Mariatta removed the CLA not signed label Jun 27, 2017

the-knights-who-say-ni added the CLA signed label Jun 27, 2017

pitrou reviewed Jun 29, 2017

Paxxi force-pushed the bop-30747 branch from 55788e5 to 1a55cc9 Compare June 30, 2017 16:11

pitrou reviewed Jul 9, 2017

Paxxi force-pushed the bop-30747 branch from 1a55cc9 to 5b8e93b Compare July 13, 2017 17:10

pitrou reviewed Jul 17, 2017

Paxxi force-pushed the bop-30747 branch from 5b8e93b to 64067b7 Compare July 17, 2017 12:50

bpo-30747: Attempt to fix atomic load/store

b02cb52

_Py_atomic_* are currently not implemented as atomic operations when building with MSVC. This patch attempts to implement parts of the functionality required.

Paxxi force-pushed the bop-30747 branch from 64067b7 to b02cb52 Compare August 8, 2017 11:28

pitrou merged commit e664d7f into python:master Aug 12, 2017

segevfiner added a commit to segevfiner/cpython that referenced this pull request Aug 18, 2017

bpo-9566: Silence warnings from pyatomic.h macros

a1479b9

Apparently MSVC is too stupid to understand that the alternate branch is not taken and emits a warning for it. Warnings added in python#2383

segevfiner mentioned this pull request Aug 18, 2017

bpo-9566 & bpo-30747: Silence warnings from pyatomic.h macros #3140

Merged

ericsnowcurrently mentioned this pull request Mar 1, 2019

bpo-33608: Simplify DISPATCH by hoisting eval_breaker ahead of time. #12062

Merged


