Commit

Python: Fix hang at application shutdown
garbear committed Jun 27, 2024
1 parent 7508b87 commit 0f5db1a
Showing 1 changed file with 6 additions and 0 deletions.
6 changes: 6 additions & 0 deletions xbmc/interfaces/python/XBPython.cpp
@@ -57,7 +57,13 @@ XBPython::~XBPython()
#if PY_VERSION_HEX >= 0x03070000
if (Py_IsInitialized())
{
// Switch to the main interpreter thread before finalizing
PyThreadState_Swap(PyInterpreterState_ThreadHead(PyInterpreterState_Main()));

// Clear all loaded modules to prevent circular references
PyObject* modules = PyImport_GetModuleDict();
PyDict_Clear(modules);

Py_Finalize();
}
#endif
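
The `PY_VERSION_HEX >= 0x03070000` guard above compares against a packed version number. As a rough sketch of how that value is laid out (major, minor and micro in the top three bytes, per CPython's documented encoding), the helper below decodes it. `PyVersion`, `DecodeVersionHex` and `PassesGuard` are hypothetical names used only for illustration and are not part of Kodi or CPython.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type: the three numeric fields of a PY_VERSION_HEX-style value.
struct PyVersion
{
  unsigned major;
  unsigned minor;
  unsigned micro;
};

// Decode major/minor/micro from the top three bytes of the packed value.
constexpr PyVersion DecodeVersionHex(std::uint32_t hex)
{
  return PyVersion{(hex >> 24) & 0xFFu, (hex >> 16) & 0xFFu, (hex >> 8) & 0xFFu};
}

// True when `hex` meets the 0x03070000 threshold used by the guard in the diff.
constexpr bool PassesGuard(std::uint32_t hex)
{
  return hex >= 0x03070000u;
}
```

Under this encoding `DecodeVersionHex(0x03070000u)` yields 3.7.0, so the guarded block is compiled for Python 3.7 and later, which includes the 3.12.x builds discussed in the comments.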

15 comments on commit 0f5db1a

@graysky2
Contributor

@garbear - Unfortunately, this commit does not fix the issue, at least on Arch Linux (python 3.12.4-1).

[Jul 4 15:00] LanguageInvoker[104543]: segfault at 7f6982a440d0 ip 00007f69a736ddd1 sp 00007f696bdff4a8 error 4 in libpython3.12.so.1.0[7f69a727c000+27b000] likely on CPU 1 (core 1, socket 0)
[  +0.000013] Code: 48 85 ff 75 d9 41 83 ac 24 98 00 00 00 01 e9 c8 fc ff ff 66 0f 1f 44 00 00 f3 0f 1e fa 48 8b 57 f0 48 85 d2 74 26 48 8b 4f f8 <48> 8b 42 08 48 83 e1 fc 83 e0 03 48 09 c8 48 89 11 48 89 42 08 48

Do you need me to open a fresh issue?
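
When comparing kernel segfault lines like the one above across runs, it can help to reduce each line to the offset of the faulting instruction inside the mapped library, since absolute addresses change between runs. The following is a minimal sketch, assuming the line keeps the `ip <hex> ... [<base>+<size>]` shape shown here; `SegfaultInfo` and `ParseSegfaultLine` are hypothetical names, not an existing tool.

```cpp
#include <cassert>
#include <cstdint>
#include <regex>
#include <string>

// Hypothetical result type for one kernel "segfault at ..." log line.
struct SegfaultInfo
{
  std::uint64_t ip;      // faulting instruction pointer
  std::uint64_t mapBase; // start of the mapping named in brackets
  std::uint64_t offset;  // ip - mapBase
};

// Extract the instruction pointer and the mapping base from the line and
// compute their difference. Returns false if the line has another shape.
bool ParseSegfaultLine(const std::string& line, SegfaultInfo& out)
{
  static const std::regex pattern(R"(ip ([0-9a-f]+).*\[([0-9a-f]+)\+[0-9a-f]+\])");
  std::smatch match;
  if (!std::regex_search(line, match, pattern))
    return false;

  const std::uint64_t ip = std::stoull(match[1].str(), nullptr, 16);
  const std::uint64_t base = std::stoull(match[2].str(), nullptr, 16);
  if (ip < base)
    return false;

  out.ip = ip;
  out.mapBase = base;
  out.offset = ip - base;
  return true;
}
```

For the line quoted above, `ip` is 0x7f69a736ddd1 and the mapping starts at 0x7f69a727c000, so the offset works out to 0xf1dd1. Two reports with the same library build and the same offset are very likely faulting at the same instruction.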

@garbear
Member Author

@garbear garbear commented on 0f5db1a Jul 4, 2024

Is the segfault caused by PyDict_Clear()? Go ahead and open an issue and I'll look into it.

@graysky2
Contributor

Thanks @garbear. Would you mind looking at the issue I created, which predates this commit and which others have added to, or do you need me to create a fresh issue here? I am happy to if needed.

https://gitlab.archlinux.org/archlinux/packaging/packages/python/-/issues/11

@garbear
Member Author

@garbear garbear commented on 0f5db1a Jul 4, 2024

It looks like the segfault is a separate issue. This change fixes a hang (not a crash) that happens in Py_Finalize(). I don't see that function entered in the stack traces on the issue you linked.

Though it's possible that explicitly clearing the modules works around the Python bug somehow.

@graysky2
Contributor

@graysky2 graysky2 commented on 0f5db1a Jul 4, 2024

@garbear - all of the data in that report, including the stack traces from @jelly, comes from Kodi builds that predate this commit.

@jelly - now that kodi-20-7 includes this commit and has hit the repos, would you mind testing on your system to confirm my findings and potentially providing another stack trace as you did in https://gitlab.archlinux.org/archlinux/packaging/packages/python/-/issues/11

@graysky2
Contributor

@garbear - is this helpful?

Stack trace of thread 1617:
#0  0x00007122a196ddd1 PyObject_GC_UnTrack (libpython3.12.so.1.0 + 0x16ddd1)
#1  0x00007122a1a85970 n/a (libpython3.12.so.1.0 + 0x285970)
#2  0x00007122a1983cbf n/a (libpython3.12.so.1.0 + 0x183cbf)
#3  0x00007122a1972e0a n/a (libpython3.12.so.1.0 + 0x172e0a)
#4  0x00007122a1a09209 _PyModule_ClearDict (libpython3.12.so.1.0 + 0x209209)
#5  0x00007122a1a85586 n/a (libpython3.12.so.1.0 + 0x285586)
#6  0x00007122a1a911c5 Py_EndInterpreter (libpython3.12.so.1.0 + 0x2911c5)
#7  0x00006544c49b05b8 _ZN14CPythonInvoker15onExecutionDoneEv (kodi.bin + 0xa3d5b8)
#8  0x00006544c5538556 _ZThn40_N22CLanguageInvokerThread6OnExitEv (kodi.bin + 0x15c5556)
#9  0x00006544c4dec6e8 _ZN7CThread6ActionEv (kodi.bin + 0xe796e8)
#10 0x00006544c4df7035 n/a (kodi.bin + 0xe84035)
#11 0x000071229fce0c84 execute_native_thread_routine (libstdc++.so.6 + 0xe0c84)
#12 0x000071229faa6ded n/a (libc.so.6 + 0x92ded)
#13 0x000071229fb2a0dc n/a (libc.so.6 + 0x1160dc)

Stack trace of thread 1601:
#0  0x000071229faa34e9 n/a (libc.so.6 + 0x8f4e9)
#1  0x000071229faa6552 pthread_cond_clockwait (libc.so.6 + 0x92552)
#2  0x00006544c489c74e n/a (kodi.bin + 0x92974e)
#3  0x00006544c56d4a9b _ZN8ActiveAE9CActiveAE7ProcessEv (kodi.bin + 0x1761a9b)
#4  0x00006544c4dec6df _ZN7CThread6ActionEv (kodi.bin + 0xe796df)
#5  0x00006544c4df7035 n/a (kodi.bin + 0xe84035)
#6  0x000071229fce0c84 execute_native_thread_routine (libstdc++.so.6 + 0xe0c84)
#7  0x000071229faa6ded n/a (libc.so.6 + 0x92ded)
#8  0x000071229fb2a0dc n/a (libc.so.6 + 0x1160dc)

Stack trace of thread 1618:
#0  0x000071229fb1c39d __poll (libc.so.6 + 0x10839d)
#1  0x00007122a1083340 n/a (libavahi-common.so.3 + 0x3340)
#2  0x00007122a1085ed1 avahi_simple_poll_run (libavahi-common.so.3 + 0x5ed1)
#3  0x00007122a10860b1 avahi_simple_poll_iterate (libavahi-common.so.3 + 0x60b1)
#4  0x00007122a10862c6 avahi_simple_poll_loop (libavahi-common.so.3 + 0x62c6)
#5  0x00007122a108634f n/a (libavahi-common.so.3 + 0x634f)
#6  0x000071229faa6ded n/a (libc.so.6 + 0x92ded)
#7  0x000071229fb2a0dc n/a (libc.so.6 + 0x1160dc)

Stack trace of thread 1598:
#0  0x000071229faa34e9 n/a (libc.so.6 + 0x8f4e9)
#1  0x000071229faa5ed9 pthread_cond_wait (libc.so.6 + 0x91ed9)
#2  0x0000712291c9d47e n/a (iris_dri.so + 0x9d47e)
#3  0x0000712291c7b22c n/a (iris_dri.so + 0x7b22c)
#4  0x0000712291c9d3ad n/a (iris_dri.so + 0x9d3ad)
#5  0x000071229faa6ded n/a (libc.so.6 + 0x92ded)
#6  0x000071229fb2a0dc n/a (libc.so.6 + 0x1160dc)

Stack trace of thread 1600:
#0  0x000071229faa34e9 n/a (libc.so.6 + 0x8f4e9)
#1  0x000071229faa5ed9 pthread_cond_wait (libc.so.6 + 0x91ed9)
#2  0x0000712291c9d47e n/a (iris_dri.so + 0x9d47e)
#3  0x0000712291c7b22c n/a (iris_dri.so + 0x7b22c)
#4  0x0000712291c9d3ad n/a (iris_dri.so + 0x9d3ad)
#5  0x000071229faa6ded n/a (libc.so.6 + 0x92ded)
#6  0x000071229fb2a0dc n/a (libc.so.6 + 0x1160dc)

Stack trace of thread 1610:
#0  0x000071229faf3b87 getdents64 (libc.so.6 + 0xdfb87)
#1  0x000071229faf3c17 readdir64 (libc.so.6 + 0xdfc17)
#2  0x00007122a0bff741 n/a (libudev.so.1 + 0x19741)
#3  0x00007122a0beee99 n/a (libudev.so.1 + 0x8e99)
#4  0x00007122a0bf4c95 udev_enumerate_scan_devices (libudev.so.1 + 0xec95)
#5  0x00007122a100d2c6 _ZN3CEC23CUSBCECAdapterDetection16FindAdaptersUdevEPNS_22cec_adapter_descriptorEhPKc (li>
#6  0x00007122a100e808 _ZN3CEC23CUSBCECAdapterDetection12FindAdaptersEPNS_22cec_adapter_descriptorEhPKc (libcec>
#7  0x00007122a0ffe9be _ZN3CEC15CAdapterFactory14DetectAdaptersEPNS_22cec_adapter_descriptorEhPKc (libcec.so.6 >
#8  0x00007122a0ff6fee _ZN3CEC7CLibCEC14DetectAdaptersEPNS_22cec_adapter_descriptorEhPKcb (libcec.so.6 + 0x40fe>
#9  0x00006544c538e3a8 _ZN11PERIPHERALS17CPeripheralBusCEC17PerformDeviceScanERNS_21PeripheralScanResultsE (kod>
#10 0x00006544c5387414 _ZN11PERIPHERALS14CPeripheralBus14ScanForDevicesEv (kodi.bin + 0x1414414)
#11 0x00006544c538738c _ZN11PERIPHERALS14CPeripheralBus7ProcessEv (kodi.bin + 0x141438c)
#12 0x00006544c4dec6df _ZN7CThread6ActionEv (kodi.bin + 0xe796df)
#13 0x00006544c4df7035 n/a (kodi.bin + 0xe84035)
#14 0x000071229fce0c84 execute_native_thread_routine (libstdc++.so.6 + 0xe0c84)
#15 0x000071229faa6ded n/a (libc.so.6 + 0x92ded)
#16 0x000071229fb2a0dc n/a (libc.so.6 + 0x1160dc)

Stack trace of thread 1603:
#0  0x000071229fb1c39d __poll (libc.so.6 + 0x10839d)
#1  0x00006544c4a72c85 _ZN15CFDEventMonitor7ProcessEv (kodi.bin + 0xaffc85)
#2  0x00006544c4dec6df _ZN7CThread6ActionEv (kodi.bin + 0xe796df)
#3  0x00006544c4df7035 n/a (kodi.bin + 0xe84035)
#4  0x000071229fce0c84 execute_native_thread_routine (libstdc++.so.6 + 0xe0c84)
#5  0x000071229faa6ded n/a (libc.so.6 + 0x92ded)
#6  0x000071229fb2a0dc n/a (libc.so.6 + 0x1160dc)

Stack trace of thread 1597:
#0  0x000071229faa34e9 n/a (libc.so.6 + 0x8f4e9)
#1  0x000071229faa5ed9 pthread_cond_wait (libc.so.6 + 0x91ed9)
#2  0x0000712291c9d47e n/a (iris_dri.so + 0x9d47e)
#3  0x0000712291c7b22c n/a (iris_dri.so + 0x7b22c)
#4  0x0000712291c9d3ad n/a (iris_dri.so + 0x9d3ad)
#5  0x000071229faa6ded n/a (libc.so.6 + 0x92ded)
#6  0x000071229fb2a0dc n/a (libc.so.6 + 0x1160dc)

Stack trace of thread 1587:
#0  0x000071229faa3595 n/a (libc.so.6 + 0x8f595)
#1  0x000071229faaa18e n/a (libc.so.6 + 0x9618e)
#2  0x00006544c49b1220 _ZN14CPythonInvoker4stopEb (kodi.bin + 0xa3e220)
#3  0x00006544c5544033 _ZN22CLanguageInvokerThread4stopEb (kodi.bin + 0x15d1033)
#4  0x00006544c553b0ed _ZN24CScriptInvocationManager4StopEib (kodi.bin + 0x15c80ed)
#5  0x00006544c50c1ddc _ZN5ADDON20CServiceAddonManager4StopEv (kodi.bin + 0x114eddc)
#6  0x00006544c4fb7896 _ZN12CApplication4StopEi (kodi.bin + 0x1044896)
#7  0x00006544c4fb80cb _ZN12CApplication20OnApplicationMessageEPN4KODI9MESSAGING13ThreadMessageE (kodi.bin + 0x>
#8  0x00006544c4e4b116 _ZN4KODI9MESSAGING21CApplicationMessenger14ProcessMessageEPNS0_13ThreadMessageE (kodi.bi>
#9  0x00006544c4e4b39b _ZN4KODI9MESSAGING21CApplicationMessenger15ProcessMessagesEv (kodi.bin + 0xed839b)
#10 0x00006544c4fbf8ee _ZN12CApplication7ProcessEv (kodi.bin + 0x104c8ee)
#11 0x00006544c4fb0d29 _ZN12CApplication3RunEv (kodi.bin + 0x103dd29)
#12 0x00006544c48162b7 main (kodi.bin + 0x8a32b7)
#13 0x000071229fa39c88 n/a (libc.so.6 + 0x25c88)
#14 0x000071229fa39d4c __libc_start_main (libc.so.6 + 0x25d4c)
#15 0x00006544c4876355 _start (kodi.bin + 0x903355)

Stack trace of thread 1613:
#0  0x000071229faa34e9 n/a (libc.so.6 + 0x8f4e9)
#1  0x000071229faa6552 pthread_cond_clockwait (libc.so.6 + 0x92552)
#2  0x00006544c4de90fc _ZN6CTimer7ProcessEv (kodi.bin + 0xe760fc)
#3  0x00006544c4dec6df _ZN7CThread6ActionEv (kodi.bin + 0xe796df)
#4  0x00006544c4df7035 n/a (kodi.bin + 0xe84035)
#5  0x000071229fce0c84 execute_native_thread_routine (libstdc++.so.6 + 0xe0c84)
#6  0x000071229faa6ded n/a (libc.so.6 + 0x92ded)
#7  0x000071229fb2a0dc n/a (libc.so.6 + 0x1160dc)

Stack trace of thread 1599:
#0  0x000071229faa34e9 n/a (libc.so.6 + 0x8f4e9)
#1  0x000071229faa5ed9 pthread_cond_wait (libc.so.6 + 0x91ed9)
#2  0x0000712291c9d47e n/a (iris_dri.so + 0x9d47e)
#3  0x0000712291c7b22c n/a (iris_dri.so + 0x7b22c)
#4  0x0000712291c9d3ad n/a (iris_dri.so + 0x9d3ad)
#5  0x000071229faa6ded n/a (libc.so.6 + 0x92ded)
#6  0x000071229fb2a0dc n/a (libc.so.6 + 0x1160dc)

Stack trace of thread 1602:
#0  0x000071229faa34e9 n/a (libc.so.6 + 0x8f4e9)
#1  0x000071229faa6552 pthread_cond_clockwait (libc.so.6 + 0x92552)
#2  0x00006544c489c74e n/a (kodi.bin + 0x92974e)
#3  0x00006544c56cd846 _ZN8ActiveAE13CActiveAESink7ProcessEv (kodi.bin + 0x175a846)
#4  0x00006544c4dec6df _ZN7CThread6ActionEv (kodi.bin + 0xe796df)
#5  0x00006544c4df7035 n/a (kodi.bin + 0xe84035)
#6  0x000071229fce0c84 execute_native_thread_routine (libstdc++.so.6 + 0xe0c84)
#7  0x000071229faa6ded n/a (libc.so.6 + 0x92ded)
#8  0x000071229fb2a0dc n/a (libc.so.6 + 0x1160dc)

Stack trace of thread 1590:
#0  0x000071229faa34e9 n/a (libc.so.6 + 0x8f4e9)
#1  0x000071229faa5ed9 pthread_cond_wait (libc.so.6 + 0x91ed9)
#2  0x000071229fcd5e61 __gthread_cond_wait (libstdc++.so.6 + 0xd5e61)
#3  0x00006544c4deba59 n/a (kodi.bin + 0xe78a59)
#4  0x00006544c557843f _ZN12ANNOUNCEMENT20CAnnouncementManager7ProcessEv (kodi.bin + 0x160543f)
#5  0x00006544c4dec6df _ZN7CThread6ActionEv (kodi.bin + 0xe796df)
#6  0x00006544c4df7035 n/a (kodi.bin + 0xe84035)
#7  0x000071229fce0c84 execute_native_thread_routine (libstdc++.so.6 + 0xe0c84)
#8  0x000071229faa6ded n/a (libc.so.6 + 0x92ded)
#9  0x000071229fb2a0dc n/a (libc.so.6 + 0x1160dc)

Stack trace of thread 1608:
#0  0x000071229faa34e9 n/a (libc.so.6 + 0x8f4e9)
#1  0x000071229faa6242 pthread_cond_timedwait (libc.so.6 + 0x92242)
#2  0x00007122a0fe66f7 _ZN3CEC10CCECClient7ProcessEv (libcec.so.6 + 0x306f7)
#3  0x00007122a0fdc65f _ZN10P8PLATFORM7CThread13ThreadHandlerEPv (libcec.so.6 + 0x2665f)
#4  0x000071229faa6ded n/a (libc.so.6 + 0x92ded)
#5  0x000071229fb2a0dc n/a (libc.so.6 + 0x1160dc)

Stack trace of thread 1612:
#0  0x000071229faa34e9 n/a (libc.so.6 + 0x8f4e9)
#1  0x000071229faa6552 pthread_cond_clockwait (libc.so.6 + 0x92552)
#2  0x00006544c489c74e n/a (kodi.bin + 0x92974e)
#3  0x00006544c5371021 _ZN11PERIPHERALS13CEventScanner7ProcessEv (kodi.bin + 0x13fe021)
#4  0x00006544c4dec6df _ZN7CThread6ActionEv (kodi.bin + 0xe796df)
#5  0x00006544c4df7035 n/a (kodi.bin + 0xe84035)
#6  0x000071229fce0c84 execute_native_thread_routine (libstdc++.so.6 + 0xe0c84)
#7  0x000071229faa6ded n/a (libc.so.6 + 0x92ded)
#8  0x000071229fb2a0dc n/a (libc.so.6 + 0x1160dc)

Stack trace of thread 1591:
#0  0x000071229faa34e9 n/a (libc.so.6 + 0x8f4e9)
#1  0x000071229faa6552 pthread_cond_clockwait (libc.so.6 + 0x92552)
#2  0x00006544c489c74e n/a (kodi.bin + 0x92974e)
#3  0x00006544c4a76c28 _ZN5CLirc7ProcessEv (kodi.bin + 0xb03c28)
#4  0x00006544c4dec6df _ZN7CThread6ActionEv (kodi.bin + 0xe796df)
#5  0x00006544c4df7035 n/a (kodi.bin + 0xe84035)
#6  0x000071229fce0c84 execute_native_thread_routine (libstdc++.so.6 + 0xe0c84)
#7  0x000071229faa6ded n/a (libc.so.6 + 0x92ded)
#8  0x000071229fb2a0dc n/a (libc.so.6 + 0x1160dc)

Stack trace of thread 1596:
#0  0x000071229fb2a4e2 epoll_wait (libc.so.6 + 0x1164e2)
#1  0x00006544c4a7d038 _ZN16CLibInputHandler7ProcessEv (kodi.bin + 0xb0a038)
#2  0x00006544c4dec6df _ZN7CThread6ActionEv (kodi.bin + 0xe796df)
#3  0x00006544c4df7035 n/a (kodi.bin + 0xe84035)
#4  0x000071229fce0c84 execute_native_thread_routine (libstdc++.so.6 + 0xe0c84)
#5  0x000071229faa6ded n/a (libc.so.6 + 0x92ded)
#6  0x000071229fb2a0dc n/a (libc.so.6 + 0x1160dc)

Stack trace of thread 1609:
#0  0x000071229fb1c39d __poll (libc.so.6 + 0x10839d)
#1  0x00006544c4a6170c _ZN11PERIPHERALS17CPeripheralBusUSB13WaitForUpdateEv (kodi.bin + 0xaee70c)
#2  0x00006544c4a6db51 _ZN11PERIPHERALS17CPeripheralBusUSB7ProcessEv (kodi.bin + 0xafab51)
#3  0x00006544c4dec6df _ZN7CThread6ActionEv (kodi.bin + 0xe796df)
#4  0x00006544c4df7035 n/a (kodi.bin + 0xe84035)
#5  0x000071229fce0c84 execute_native_thread_routine (libstdc++.so.6 + 0xe0c84)
#6  0x000071229faa6ded n/a (libc.so.6 + 0x92ded)
#7  0x000071229fb2a0dc n/a (libc.so.6 + 0x1160dc)

Stack trace of thread 1611:
#0  0x000071229faa34e9 n/a (libc.so.6 + 0x8f4e9)
#1  0x000071229faa6552 pthread_cond_clockwait (libc.so.6 + 0x92552)
#2  0x00006544c489c74e n/a (kodi.bin + 0x92974e)
#3  0x00006544c53873c7 _ZN11PERIPHERALS14CPeripheralBus7ProcessEv (kodi.bin + 0x14143c7)
#4  0x00006544c4dec6df _ZN7CThread6ActionEv (kodi.bin + 0xe796df)
#5  0x00006544c4df7035 n/a (kodi.bin + 0xe84035)
#6  0x000071229fce0c84 execute_native_thread_routine (libstdc++.so.6 + 0xe0c84)
#7  0x000071229faa6ded n/a (libc.so.6 + 0x92ded)
#8  0x000071229fb2a0dc n/a (libc.so.6 + 0x1160dc)
ELF object binary architecture: AMD x86-64
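
The Kodi frames in the trace above are printed as mangled C++ symbols. As a small sketch for reading them, the helper below uses `abi::__cxa_demangle` from `<cxxabi.h>` (available with GCC and Clang); `Demangle` is a hypothetical wrapper name, and the same result can be obtained with the `c++filt` command-line tool.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Return the demangled form of an Itanium-ABI symbol name, or the input
// unchanged if the demangler rejects it.
std::string Demangle(const std::string& mangled)
{
  int status = 0;
  char* raw = abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status);
  if (status != 0 || raw == nullptr)
    return mangled;

  std::string result(raw);
  std::free(raw); // the demangler allocates the buffer with malloc
  return result;
}
```

For example, `Demangle("_ZN14CPythonInvoker15onExecutionDoneEv")` gives `CPythonInvoker::onExecutionDone()`, the Kodi function that calls `Py_EndInterpreter` in the crashing thread.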

@garbear
Member Author

@garbear garbear commented on 0f5db1a Jul 5, 2024

Yes, I see:

Stack trace of thread 1617:
#0  0x00007122a196ddd1 PyObject_GC_UnTrack (libpython3.12.so.1.0 + 0x16ddd1)
#1  0x00007122a1a85970 n/a (libpython3.12.so.1.0 + 0x285970)
#2  0x00007122a1983cbf n/a (libpython3.12.so.1.0 + 0x183cbf)
#3  0x00007122a1972e0a n/a (libpython3.12.so.1.0 + 0x172e0a)
#4  0x00007122a1a09209 _PyModule_ClearDict (libpython3.12.so.1.0 + 0x209209)
#5  0x00007122a1a85586 n/a (libpython3.12.so.1.0 + 0x285586)
#6  0x00007122a1a911c5 Py_EndInterpreter (libpython3.12.so.1.0 + 0x2911c5)
#7  0x00006544c49b05b8 _ZN14CPythonInvoker15onExecutionDoneEv (kodi.bin + 0xa3d5b8)
#8  0x00006544c5538556 _ZThn40_N22CLanguageInvokerThread6OnExitEv (kodi.bin + 0x15c5556)
#9  0x00006544c4dec6e8 _ZN7CThread6ActionEv (kodi.bin + 0xe796e8)
#10 0x00006544c4df7035 n/a (kodi.bin + 0xe84035)
#11 0x000071229fce0c84 execute_native_thread_routine (libstdc++.so.6 + 0xe0c84)
#12 0x000071229faa6ded n/a (libc.so.6 + 0x92ded)
#13 0x000071229fb2a0dc n/a (libc.so.6 + 0x1160dc)

Looks like it's crashing when trying to garbage collect something when cleaning the module dict. So this commit doesn't fix the crash, it just makes it happen earlier in the code that I added, and shows exactly what the problem is.

According to ChatGPT:

The crash you're experiencing is likely due to the added PyDict_Clear(modules); line, which clears all the loaded Python modules' dictionaries, including potentially critical ones used by the interpreter itself. This could lead to undefined behavior, such as trying to untrack or finalize objects that have already been partially cleaned up.

Here's a refined approach to avoid the crash:

Ensure Proper Module Cleanup: Instead of clearing all the module dictionaries, carefully remove the references that might be causing the circular dependencies. You can iterate over the modules and selectively clear them.

Finalize Only When Safe: Ensure that no other Python-related operations are occurring while finalizing.

It recommends the following change:

if (Py_IsInitialized())
{
  // Switch to the main interpreter thread before finalizing
  PyThreadState_Swap(PyInterpreterState_ThreadHead(PyInterpreterState_Main()));

  // Clear modules safely
  PyObject *modules = PyImport_GetModuleDict();
  PyObject *key, *value;
  Py_ssize_t pos = 0;

  while (PyDict_Next(modules, &pos, &key, &value))
  {
    if (value != NULL && PyModule_Check(value))
    {
      PyObject *dict = PyModule_GetDict(value);
      if (dict != NULL)
      {
        PyDict_Clear(dict);
      }
    }
  }

  Py_Finalize();
}

The added safety checks may fix the crash on Kodi's end. If not, it could be a problem with the interpreter, as I see lots of talk about backporting crash fixes.

@graysky2
Contributor

Nice. Are you planning to commit the code you're proposing? I am happy to build/test.

@garbear
Member Author

@garbear garbear commented on 0f5db1a Jul 5, 2024

Can you test the change? The problem looks to be in the garbage collector, and while that change adds some safety it doesn't address any garbage-collector-specific problems. So I doubt the change will fix the problem, but it's worth a test. If things improve then I'll PR it.

@graysky2
Contributor

@graysky2 graysky2 commented on 0f5db1a Jul 6, 2024 via email

@garbear
Member Author

@garbear garbear commented on 0f5db1a Jul 6, 2024

Can you build Kodi with this patch?

diff --git a/xbmc/interfaces/python/XBPython.cpp b/xbmc/interfaces/python/XBPython.cpp
index 5fdd17610f..e47e8d8f34 100644
--- a/xbmc/interfaces/python/XBPython.cpp
+++ b/xbmc/interfaces/python/XBPython.cpp
@@ -60,9 +60,21 @@ XBPython::~XBPython()
     // Switch to the main interpreter thread before finalizing
     PyThreadState_Swap(PyInterpreterState_ThreadHead(PyInterpreterState_Main()));

-    // Clear all loaded modules to prevent circular references
+    // Clear modules safely
     PyObject* modules = PyImport_GetModuleDict();
-    PyDict_Clear(modules);
+    PyObject* key;
+    PyObject* value;
+    Py_ssize_t pos = 0;
+
+    while (PyDict_Next(modules, &pos, &key, &value))
+    {
+      if (value != nullptr && PyModule_Check(value))
+      {
+        PyObject* dict = PyModule_GetDict(value);
+        if (dict != nullptr)
+          PyDict_Clear(dict);
+      }
+    }

     Py_Finalize();
   }

@graysky2
Contributor

It still has the problem:

% dmesg
...
[Jul 6 04:27] LanguageInvoker[205484]: segfault at 75aa117a20d0 ip 000075aa35b6ddd1 sp 000075a9fd5ff4a8 error 4 in libpython3.12.so.1.0[75aa35a7c000+27b000] likely on CPU 3 (core 3, socket 0)
[  +0.000013] Code: 48 85 ff 75 d9 41 83 ac 24 98 00 00 00 01 e9 c8 fc ff ff 66 0f 1f 44 00 00 f3 0f 1e fa 48 8b 57 f0 48 85 d2 74 26 48 8b 4f f8 <48> 8b 42 08 48 83 e1 fc 83 e0 03 48 09 c8 48 89 11 48 89 42 08 48

and

Stack trace of thread 205484:
#0  0x000075aa35b6ddd1 PyObject_GC_UnTrack (libpython3.12.so.1.0 + 0x16ddd1)
#1  0x000075aa35c85970 n/a (libpython3.12.so.1.0 + 0x285970)
#2  0x000075aa35b83cbf n/a (libpython3.12.so.1.0 + 0x183cbf)
#3  0x000075aa35b72e0a n/a (libpython3.12.so.1.0 + 0x172e0a)
#4  0x000075aa35c09209 _PyModule_ClearDict (libpython3.12.so.1.0 + 0x209209)
#5  0x000075aa35c85586 n/a (libpython3.12.so.1.0 + 0x285586)
#6  0x000075aa35c911c5 Py_EndInterpreter (libpython3.12.so.1.0 + 0x2911c5)
#7  0x00005c82c233fd6b _ZN14CPythonInvoker15onExecutionDoneEv (kodi.bin + 0x1019d6b)
#8  0x00005c82c31ba876 _ZThn40_N22CLanguageInvokerThread6OnExitEv (kodi.bin + 0x1e94876)
#9  0x00005c82c286fa68 _ZN7CThread6ActionEv (kodi.bin + 0x1549a68)
#10 0x00005c82c286fd24 n/a (kodi.bin + 0x1549d24)
#11 0x000075aa340e0c84 execute_native_thread_routine (libstdc++.so.6 + 0xe0c84)
#12 0x000075aa33ea6ded n/a (libc.so.6 + 0x92ded)
#13 0x000075aa33f2a0dc n/a (libc.so.6 + 0x1160dc)

Stack trace of thread 205427:
#0  0x000075aa33ea34e9 n/a (libc.so.6 + 0x8f4e9)
#1  0x000075aa33ea5ed9 pthread_cond_wait (libc.so.6 + 0x91ed9)
#2  0x000075aa2929d3ee n/a (iris_dri.so + 0x9d3ee)
#3  0x000075aa2927b19c n/a (iris_dri.so + 0x7b19c)
#4  0x000075aa2929d31d n/a (iris_dri.so + 0x9d31d)
#5  0x000075aa33ea6ded n/a (libc.so.6 + 0x92ded)
#6  0x000075aa33f2a0dc n/a (libc.so.6 + 0x1160dc)

Stack trace of thread 205429:
#0  0x000075aa33ea34e9 n/a (libc.so.6 + 0x8f4e9)
#1  0x000075aa33ea6552 pthread_cond_clockwait (libc.so.6 + 0x92552)
#2  0x00005c82c2331fd6 n/a (kodi.bin + 0x100bfd6)
#3  0x00005c82c33fb4a3 _ZN8ActiveAE9CActiveAE7ProcessEv (kodi.bin + 0x20d54a3)
#4  0x00005c82c286fa5f _ZN7CThread6ActionEv (kodi.bin + 0x1549a5f)
#5  0x00005c82c286fd24 n/a (kodi.bin + 0x1549d24)
#6  0x000075aa340e0c84 execute_native_thread_routine (libstdc++.so.6 + 0xe0c84)
#7  0x000075aa33ea6ded n/a (libc.so.6 + 0x92ded)
#8  0x000075aa33f2a0dc n/a (libc.so.6 + 0x1160dc)

Stack trace of thread 205424:
#0  0x000075aa33ea34e9 n/a (libc.so.6 + 0x8f4e9)
#1  0x000075aa33ea5ed9 pthread_cond_wait (libc.so.6 + 0x91ed9)
#2  0x000075aa2929d3ee n/a (iris_dri.so + 0x9d3ee)
#3  0x000075aa2927b19c n/a (iris_dri.so + 0x7b19c)
#4  0x000075aa2929d31d n/a (iris_dri.so + 0x9d31d)
#5  0x000075aa33ea6ded n/a (libc.so.6 + 0x92ded)
#6  0x000075aa33f2a0dc n/a (libc.so.6 + 0x1160dc)

Stack trace of thread 205414:
#0  0x000075aa33ea34e9 n/a (libc.so.6 + 0x8f4e9)
#1  0x000075aa33ea6552 pthread_cond_clockwait (libc.so.6 + 0x92552)
#2  0x00005c82c2331fd6 n/a (kodi.bin + 0x100bfd6)
#3  0x00005c82c23408a1 _ZN14CPythonInvoker4stopEb (kodi.bin + 0x101a8a1)
#4  0x00005c82c31caa93 _ZN22CLanguageInvokerThread4stopEb (kodi.bin + 0x1ea4a93)
#5  0x00005c82c31be64e _ZN24CScriptInvocationManager4StopEib (kodi.bin + 0x1e9864e)
#6  0x00005c82c2c1c0fc _ZN5ADDON20CServiceAddonManager4StopEv (kodi.bin + 0x18f60fc)
#7  0x00005c82c2aa4360 _ZN12CApplication4StopEi (kodi.bin + 0x177e360)
#8  0x00005c82c2aa4bcb _ZN12CApplication20OnApplicationMessageEPN4KODI9MESSAGING13ThreadMessageE (kodi.bin + 0x>
#9  0x00005c82c28da8a5 _ZN4KODI9MESSAGING21CApplicationMessenger14ProcessMessageEPNS0_13ThreadMessageE (kodi.bi>
#10 0x00005c82c28e3c31 _ZN4KODI9MESSAGING21CApplicationMessenger15ProcessMessagesEv (kodi.bin + 0x15bdc31)
#11 0x00005c82c2aaafae _ZN12CApplication7ProcessEv (kodi.bin + 0x1784fae)
#12 0x00005c82c2a96f41 _ZN12CApplication3RunEv (kodi.bin + 0x1770f41)
#13 0x00005c82c22b7d17 main (kodi.bin + 0xf91d17)
#14 0x000075aa33e39c88 n/a (libc.so.6 + 0x25c88)
#15 0x000075aa33e39d4c __libc_start_main (libc.so.6 + 0x25d4c)
#16 0x00005c82c2117325 _start (kodi.bin + 0xdf1325)

Stack trace of thread 205480:
#0  0x000075aa33ea34e9 n/a (libc.so.6 + 0x8f4e9)
#1  0x000075aa33ea6552 pthread_cond_clockwait (libc.so.6 + 0x92552)
#2  0x00005c82c2331fd6 n/a (kodi.bin + 0x100bfd6)
#3  0x00005c82c2865731 _ZN6CTimer7ProcessEv (kodi.bin + 0x153f731)
#4  0x00005c82c286fa5f _ZN7CThread6ActionEv (kodi.bin + 0x1549a5f)
#5  0x00005c82c286fd24 n/a (kodi.bin + 0x1549d24)
#6  0x000075aa340e0c84 execute_native_thread_routine (libstdc++.so.6 + 0xe0c84)
#7  0x000075aa33ea6ded n/a (libc.so.6 + 0x92ded)
#8  0x000075aa33f2a0dc n/a (libc.so.6 + 0x1160dc)

Stack trace of thread 205428:
#0  0x000075aa33ea34e9 n/a (libc.so.6 + 0x8f4e9)
#1  0x000075aa33ea5ed9 pthread_cond_wait (libc.so.6 + 0x91ed9)
#2  0x000075aa2929d3ee n/a (iris_dri.so + 0x9d3ee)
#3  0x000075aa2927b19c n/a (iris_dri.so + 0x7b19c)
#4  0x000075aa2929d31d n/a (iris_dri.so + 0x9d31d)
#5  0x000075aa33ea6ded n/a (libc.so.6 + 0x92ded)
#6  0x000075aa33f2a0dc n/a (libc.so.6 + 0x1160dc)

Stack trace of thread 205425:
#0  0x000075aa33ea34e9 n/a (libc.so.6 + 0x8f4e9)
#1  0x000075aa33ea5ed9 pthread_cond_wait (libc.so.6 + 0x91ed9)
#2  0x000075aa2929d3ee n/a (iris_dri.so + 0x9d3ee)
#3  0x000075aa2927b19c n/a (iris_dri.so + 0x7b19c)
#4  0x000075aa2929d31d n/a (iris_dri.so + 0x9d31d)
#5  0x000075aa33ea6ded n/a (libc.so.6 + 0x92ded)
#6  0x000075aa33f2a0dc n/a (libc.so.6 + 0x1160dc)

Stack trace of thread 205417:
#0  0x000075aa33ea34e9 n/a (libc.so.6 + 0x8f4e9)
#1  0x000075aa33ea5ed9 pthread_cond_wait (libc.so.6 + 0x91ed9)
#2  0x000075aa340d5e61 __gthread_cond_wait (libstdc++.so.6 + 0xd5e61)
#3  0x00005c82c286f08c n/a (kodi.bin + 0x154908c)
#4  0x00005c82c322a9ef _ZN12ANNOUNCEMENT20CAnnouncementManager7ProcessEv (kodi.bin + 0x1f049ef)
#5  0x00005c82c286fa5f _ZN7CThread6ActionEv (kodi.bin + 0x1549a5f)
#6  0x00005c82c286fd24 n/a (kodi.bin + 0x1549d24)
#7  0x000075aa340e0c84 execute_native_thread_routine (libstdc++.so.6 + 0xe0c84)
#8  0x000075aa33ea6ded n/a (libc.so.6 + 0x92ded)
#9  0x000075aa33f2a0dc n/a (libc.so.6 + 0x1160dc)

Stack trace of thread 205426:
#0  0x000075aa33ea34e9 n/a (libc.so.6 + 0x8f4e9)
#1  0x000075aa33ea5ed9 pthread_cond_wait (libc.so.6 + 0x91ed9)
#2  0x000075aa2929d3ee n/a (iris_dri.so + 0x9d3ee)
#3  0x000075aa2927b19c n/a (iris_dri.so + 0x7b19c)
#4  0x000075aa2929d31d n/a (iris_dri.so + 0x9d31d)
#5  0x000075aa33ea6ded n/a (libc.so.6 + 0x92ded)
#6  0x000075aa33f2a0dc n/a (libc.so.6 + 0x1160dc)

Stack trace of thread 205430:
#0  0x000075aa33f1c39d __poll (libc.so.6 + 0x10839d)
#1  0x000075aa35336223 n/a (libasound.so.2 + 0x60223)
#2  0x000075aa35337275 n/a (libasound.so.2 + 0x61275)
#3  0x000075aa3533791a n/a (libasound.so.2 + 0x6191a)
#4  0x00005c82c3416b1b _ZN11CAESinkALSA10AddPacketsEPPhjj (kodi.bin + 0x20f0b1b)
#5  0x00005c82c3428255 _ZN8ActiveAE13CActiveAESink13OutputSamplesEPNS_13CSampleBufferE (kodi.bin + 0x2102255)
#6  0x00005c82c342a5b9 _ZN8ActiveAE13CActiveAESink12StateMachineEiPN5Actor8ProtocolEPNS1_7MessageE (kodi.bin + >
#7  0x00005c82c342a89a _ZN8ActiveAE13CActiveAESink7ProcessEv (kodi.bin + 0x210489a)
#8  0x00005c82c286fa5f _ZN7CThread6ActionEv (kodi.bin + 0x1549a5f)
#9  0x00005c82c286fd24 n/a (kodi.bin + 0x1549d24)
#10 0x000075aa340e0c84 execute_native_thread_routine (libstdc++.so.6 + 0xe0c84)
#11 0x000075aa33ea6ded n/a (libc.so.6 + 0x92ded)
#12 0x000075aa33f2a0dc n/a (libc.so.6 + 0x1160dc)

Stack trace of thread 205479:
#0  0x000075aa33ea34e9 n/a (libc.so.6 + 0x8f4e9)
#1  0x000075aa33ea6552 pthread_cond_clockwait (libc.so.6 + 0x92552)
#2  0x00005c82c2331fd6 n/a (kodi.bin + 0x100bfd6)
#3  0x00005c82c2f99fd1 _ZN11PERIPHERALS13CEventScanner7ProcessEv (kodi.bin + 0x1c73fd1)
#4  0x00005c82c286fa5f _ZN7CThread6ActionEv (kodi.bin + 0x1549a5f)
#5  0x00005c82c286fd24 n/a (kodi.bin + 0x1549d24)
#6  0x000075aa340e0c84 execute_native_thread_routine (libstdc++.so.6 + 0xe0c84)
#7  0x000075aa33ea6ded n/a (libc.so.6 + 0x92ded)
#8  0x000075aa33f2a0dc n/a (libc.so.6 + 0x1160dc)

Stack trace of thread 205431:
#0  0x000075aa33f1c39d __poll (libc.so.6 + 0x10839d)
#1  0x00005c82c23ff88b _ZN15CFDEventMonitor7ProcessEv (kodi.bin + 0x10d988b)
#2  0x00005c82c286fa5f _ZN7CThread6ActionEv (kodi.bin + 0x1549a5f)
#3  0x00005c82c286fd24 n/a (kodi.bin + 0x1549d24)
#4  0x000075aa340e0c84 execute_native_thread_routine (libstdc++.so.6 + 0xe0c84)
#5  0x000075aa33ea6ded n/a (libc.so.6 + 0x92ded)
#6  0x000075aa33f2a0dc n/a (libc.so.6 + 0x1160dc)

Stack trace of thread 205478:
#0  0x000075aa33ea34e9 n/a (libc.so.6 + 0x8f4e9)
#1  0x000075aa33ea6552 pthread_cond_clockwait (libc.so.6 + 0x92552)
#2  0x00005c82c2331fd6 n/a (kodi.bin + 0x100bfd6)
#3  0x00005c82c2fb9787 _ZN11PERIPHERALS14CPeripheralBus7ProcessEv (kodi.bin + 0x1c93787)
#4  0x00005c82c286fa5f _ZN7CThread6ActionEv (kodi.bin + 0x1549a5f)
#5  0x00005c82c286fd24 n/a (kodi.bin + 0x1549d24)
#6  0x000075aa340e0c84 execute_native_thread_routine (libstdc++.so.6 + 0xe0c84)
#7  0x000075aa33ea6ded n/a (libc.so.6 + 0x92ded)
#8  0x000075aa33f2a0dc n/a (libc.so.6 + 0x1160dc)

Stack trace of thread 205477:
#0  0x000075aa33f1c39d __poll (libc.so.6 + 0x10839d)
#1  0x00005c82c23f141c _ZN11PERIPHERALS17CPeripheralBusUSB13WaitForUpdateEv (kodi.bin + 0x10cb41c)
#2  0x00005c82c23f1671 _ZN11PERIPHERALS17CPeripheralBusUSB7ProcessEv (kodi.bin + 0x10cb671)
#3  0x00005c82c286fa5f _ZN7CThread6ActionEv (kodi.bin + 0x1549a5f)
#4  0x00005c82c286fd24 n/a (kodi.bin + 0x1549d24)
#5  0x000075aa340e0c84 execute_native_thread_routine (libstdc++.so.6 + 0xe0c84)
#6  0x000075aa33ea6ded n/a (libc.so.6 + 0x92ded)
#7  0x000075aa33f2a0dc n/a (libc.so.6 + 0x1160dc)

Stack trace of thread 205422:
#0  0x000075aa33f2a4e2 epoll_wait (libc.so.6 + 0x1164e2)
#1  0x00005c82c2418f50 _ZN16CLibInputHandler7ProcessEv (kodi.bin + 0x10f2f50)
#2  0x00005c82c286fa5f _ZN7CThread6ActionEv (kodi.bin + 0x1549a5f)
#3  0x00005c82c286fd24 n/a (kodi.bin + 0x1549d24)
#4  0x000075aa340e0c84 execute_native_thread_routine (libstdc++.so.6 + 0xe0c84)
#5  0x000075aa33ea6ded n/a (libc.so.6 + 0x92ded)
#6  0x000075aa33f2a0dc n/a (libc.so.6 + 0x1160dc)
ELF object binary architecture: AMD x86-64

@garbear
Member Author

@garbear garbear commented on 0f5db1a Jul 6, 2024

It looks like a Python issue in the garbage collector, possibly one that arose when we transitioned to multiple interpreters. The commit here doesn't expose the issue, it just makes it occur sooner, so I'm not sure what I can do. Have you tried backporting all the crash fixes I've seen go upstream?

@graysky2
Contributor

No, if you can point them out, I can try building python 3.12.4 + these fixes.

@squat80

@squat80 squat80 commented on 0f5db1a Jul 6, 2024

Hi guys,
FYI, I patched Python 3.12.3 with PR python/cpython#118618 and it seems to have fixed the issue for me.
The segfault on exit was systematic before.

Running Gentoo + Python 3.12.3 (3.12.3-r1) + Kodi 21.0 (21.0-r2).

Tried garbear's patch: fail (could patch and build, but crashed on exit)
Tried the Python PR's patch on 3.12.4: fail (could not patch)
Tried the Python PR's patch on 3.12.3: fail on ABI changes (could not patch)
Removed the changes concerning Doc/data/python3.12.abi from the patch file and it applied successfully
Rebuilt Python with that patch, and Kodi (without garbear's patch) => I have not been able to reproduce the issue since
Here's the patch file for Python 3.12.3 if anybody needs it:
gh-118618-use-pointer-for-interp-obmalloc-state.patch.gz

EDIT: I reworked the patch file (it was a CRLF issue in the patch process), and I can now apply the patch for the full PR!
Here are both patch files I used:
gh-118618-use-pointer-for-interp-obmalloc-state.patch.tar.gz
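
The comment above does not say where the CRLF line endings came from or how they were removed. As a minimal sketch of one way to normalize a patch before applying it, assuming the patch text has already been read into a string, the hypothetical helper `NormalizeLineEndings` below rewrites every CRLF pair as a single LF:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Return a copy of `text` with every "\r\n" pair replaced by "\n".
// A lone '\r' that is not followed by '\n' is kept as-is.
std::string NormalizeLineEndings(const std::string& text)
{
  std::string out;
  out.reserve(text.size());
  for (std::size_t i = 0; i < text.size(); ++i)
  {
    // Skip the '\r' of a CRLF pair; the '\n' is copied on the next pass.
    if (text[i] == '\r' && i + 1 < text.size() && text[i + 1] == '\n')
      continue;
    out.push_back(text[i]);
  }
  return out;
}
```

Tools such as `dos2unix` do the same job from the command line. Mixed line endings are a common reason for patch hunks being rejected even when the content otherwise matches.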
