bpo-35813: Tests and docs for shared_memory #11816
:mod:`multiprocessing.shared_memory` --- Provides shared memory for direct access across processes
===================================================================================================

.. module:: multiprocessing.shared_memory
   :synopsis: Provides shared memory for direct access across processes.

**Source code:** :source:`Lib/multiprocessing/shared_memory.py`

.. versionadded:: 3.8

.. index::
   single: Shared Memory
   single: POSIX Shared Memory
   single: Named Shared Memory

--------------

This module provides a class, :class:`SharedMemory`, for the allocation
and management of shared memory to be accessed by one or more processes
on a multicore or symmetric multiprocessor (SMP) machine.  To assist with
the life-cycle management of shared memory, especially across distinct
processes, a :class:`~multiprocessing.managers.BaseManager` subclass,
:class:`SharedMemoryManager`, is also provided in the
``multiprocessing.managers`` module.

In this module, shared memory refers to "System V style" shared memory blocks
(though it is not necessarily implemented explicitly as such) and does not
refer to "distributed shared memory".  This style of shared memory permits
distinct processes to potentially read and write to a common (or shared)
region of volatile memory.  Processes are conventionally limited to having
access only to their own process memory space, but shared memory permits
the sharing of data between processes, avoiding the need to instead send
messages between processes containing that data.  Sharing data directly via
memory can provide significant performance benefits compared to sharing data
via disk or socket or other communications requiring the
serialization/deserialization and copying of data.

.. class:: SharedMemory(name=None, create=False, size=0)

   Creates a new shared memory block or attaches to an existing shared
   memory block.  Each shared memory block is assigned a unique name.
   In this way, one process can create a shared memory block with a
   particular name and a different process can attach to that same shared
   memory block using that same name.

   As a resource for sharing data across processes, shared memory blocks
   may outlive the original process that created them.  When one process
   no longer needs access to a shared memory block that might still be
   needed by other processes, the :meth:`close()` method should be called.
   When a shared memory block is no longer needed by any process, the
   :meth:`unlink()` method should be called to ensure proper cleanup.

   *name* is the unique name for the requested shared memory, specified as
   a string.  When creating a new shared memory block, if ``None`` (the
   default) is supplied for the name, a novel name will be generated.

   *create* controls whether a new shared memory block is created (``True``)
   or an existing shared memory block is attached (``False``).

   *size* specifies the requested number of bytes when creating a new shared
   memory block.  Because some platforms choose to allocate chunks of memory
   based upon that platform's memory page size, the exact size of the shared
   memory block may be larger than or equal to the size requested.  When
   attaching to an existing shared memory block, the *size* parameter is
   ignored.

   .. method:: close()

      Closes access to the shared memory from this instance.  In order to
      ensure proper cleanup of resources, all instances should call
      ``close()`` once the instance is no longer needed.  Note that calling
      ``close()`` does not cause the shared memory block itself to be
      destroyed.

   .. method:: unlink()

      Requests that the underlying shared memory block be destroyed.  In
      order to ensure proper cleanup of resources, ``unlink()`` should be
      called once (and only once) across all processes which have need
      for the shared memory block.  After requesting its destruction, a
      shared memory block may or may not be immediately destroyed and
      this behavior may differ across platforms.  Attempts to access data
      inside the shared memory block after ``unlink()`` has been called may
      result in memory access errors.  Note: the last process relinquishing
      its hold on a shared memory block may call ``unlink()`` and
      :meth:`close()` in either order.

   .. attribute:: buf

      A memoryview of the contents of the shared memory block.

   .. attribute:: name

      Read-only access to the unique name of the shared memory block.

   .. attribute:: size

      Read-only access to the size in bytes of the shared memory block.
      When attaching to an existing shared memory block with ``size=0``
      (the default), the actual size of that existing block is discovered
      and reported here.


The following example demonstrates low-level use of :class:`SharedMemory`
instances::

   >>> from multiprocessing import shared_memory
   >>> shm_a = shared_memory.SharedMemory(create=True, size=10)
   >>> type(shm_a.buf)
   <class 'memoryview'>
   >>> buffer = shm_a.buf
   >>> len(buffer)
   10
   >>> buffer[:4] = bytearray([22, 33, 44, 55])  # Modify multiple at once
   >>> buffer[4] = 100                           # Modify single byte at a time
   >>> # Attach to an existing shared memory block
   >>> shm_b = shared_memory.SharedMemory(shm_a.name)
   >>> import array
   >>> array.array('b', shm_b.buf[:5])  # Copy the data into a new array.array
   array('b', [22, 33, 44, 55, 100])
   >>> shm_b.buf[:5] = b'howdy'  # Modify via shm_b using bytes
   >>> bytes(shm_a.buf[:5])      # Access via shm_a
   b'howdy'
   >>> shm_b.close()   # Close each SharedMemory instance
   >>> shm_a.close()
   >>> shm_a.unlink()  # Call unlink only once to release the shared memory

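As noted above, the platform may allocate shared memory in multiples of its
memory page size, so the block actually obtained can be larger than requested.
The following is a minimal sketch (not part of the original example set)
illustrating this; only the size comparison is guaranteed to hold across
platforms:

```python
# A newly created block requests 10 bytes; an attached instance reports
# the actual allocated size, which is at least that much (some platforms
# round allocations up to a multiple of the memory page size).
from multiprocessing import shared_memory

shm_a = shared_memory.SharedMemory(create=True, size=10)
shm_b = shared_memory.SharedMemory(name=shm_a.name)  # size arg is ignored here
attached_size = shm_b.size  # at least 10; possibly rounded up by the platform

shm_b.close()    # each instance closes its own access
shm_a.close()
shm_a.unlink()   # exactly one process requests destruction of the block
```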
The following example demonstrates a practical use of the :class:`SharedMemory`
class with `NumPy arrays <https://www.numpy.org/>`_, accessing the
same ``numpy.ndarray`` from two distinct Python shells:

.. doctest::
   :options: +SKIP

   >>> # In the first Python interactive shell
   >>> import numpy as np
   >>> a = np.array([1, 1, 2, 3, 5, 8])  # Start with an existing NumPy array
   >>> from multiprocessing import shared_memory
   >>> shm = shared_memory.SharedMemory(create=True, size=a.nbytes)
   >>> # Now create a NumPy array backed by shared memory
   >>> b = np.ndarray(a.shape, dtype=a.dtype, buffer=shm.buf)
   >>> b[:] = a[:]  # Copy the original data into shared memory
   >>> b
   array([1, 1, 2, 3, 5, 8])
   >>> type(b)
   <class 'numpy.ndarray'>
   >>> type(a)
   <class 'numpy.ndarray'>
   >>> shm.name  # We did not specify a name so one was chosen for us
   'psm_21467_46075'

   >>> # In either the same shell or a new Python shell on the same machine
   >>> import numpy as np
   >>> from multiprocessing import shared_memory
   >>> # Attach to the existing shared memory block
   >>> existing_shm = shared_memory.SharedMemory(name='psm_21467_46075')
   >>> # Note that a.shape is (6,) and a.dtype is np.int64 in this example
   >>> c = np.ndarray((6,), dtype=np.int64, buffer=existing_shm.buf)
   >>> c
   array([1, 1, 2, 3, 5, 8])
   >>> c[-1] = 888
   >>> c
   array([ 1, 1, 2, 3, 5, 888])

   >>> # Back in the first Python interactive shell, b reflects this change
   >>> b
   array([ 1, 1, 2, 3, 5, 888])

   >>> # Clean up from within the second Python shell
   >>> del c  # Unnecessary; merely emphasizing the array is no longer used
   >>> existing_shm.close()

   >>> # Clean up from within the first Python shell
   >>> del b  # Unnecessary; merely emphasizing the array is no longer used
   >>> shm.close()
   >>> shm.unlink()  # Free and release the shared memory block at the very end

.. class:: SharedMemoryManager([address[, authkey]])

   A subclass of :class:`~multiprocessing.managers.BaseManager` which can be
   used for the management of shared memory blocks across processes.

   A call to :meth:`~multiprocessing.managers.BaseManager.start` on a
   :class:`SharedMemoryManager` instance causes a new process to be started.
   This new process's sole purpose is to manage the life cycle
   of all shared memory blocks created through it.  To trigger the release
   of all shared memory blocks managed by that process, call
   :meth:`~multiprocessing.managers.BaseManager.shutdown()` on the instance.
   This triggers a :meth:`SharedMemory.unlink()` call on all of the
   :class:`SharedMemory` objects managed by that process and then
   stops the process itself.  By creating ``SharedMemory`` instances
   through a ``SharedMemoryManager``, we avoid the need to manually track
   and trigger the freeing of shared memory resources.

   This class provides methods for creating and returning :class:`SharedMemory`
   instances and for creating a list-like object (:class:`ShareableList`)
   backed by shared memory.

   Refer to :class:`multiprocessing.managers.BaseManager` for a description
   of the inherited *address* and *authkey* optional input arguments and how
   they may be used to connect to an existing ``SharedMemoryManager`` service
   from other processes.

   .. method:: SharedMemory(size)

      Create and return a new :class:`SharedMemory` object with the
      specified ``size`` in bytes.

   .. method:: ShareableList(sequence)

      Create and return a new :class:`ShareableList` object, initialized
      by the values from the input ``sequence``.


The following example demonstrates the basic mechanisms of a
:class:`SharedMemoryManager`:

.. doctest::
   :options: +SKIP

   >>> from multiprocessing.managers import SharedMemoryManager
   >>> smm = SharedMemoryManager()
   >>> smm.start()  # Start the process that manages the shared memory blocks
   >>> sl = smm.ShareableList(range(4))
   >>> sl
   ShareableList([0, 1, 2, 3], name='psm_6572_7512')
   >>> raw_shm = smm.SharedMemory(size=128)
   >>> another_sl = smm.ShareableList('alpha')
   >>> another_sl
   ShareableList(['a', 'l', 'p', 'h', 'a'], name='psm_6572_12221')
   >>> smm.shutdown()  # Calls unlink() on sl, raw_shm, and another_sl

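The inherited *address* and *authkey* arguments can also be used to attach to
an already-running manager service from another process.  The following is a
hedged sketch of that pattern, relying on the inherited
:meth:`~multiprocessing.managers.BaseManager.connect` method; the address and
key values below are illustrative, not part of the original documentation:

```python
from multiprocessing.managers import SharedMemoryManager

# Port 0 asks the OS for any free port; smm.address reports the real one.
smm = SharedMemoryManager(address=('127.0.0.1', 0), authkey=b'example-key')
smm.start()
host, port = smm.address

# A second process would attach with connect() rather than start(),
# supplying the same address and authkey pair:
attached = SharedMemoryManager(address=(host, port), authkey=b'example-key')
attached.connect()
sl = attached.ShareableList([1, 2, 3])  # tracked by the running service
values = list(sl)
sl.shm.close()

smm.shutdown()  # releases every block created through this manager
```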
The following example depicts a potentially more convenient pattern for using
:class:`SharedMemoryManager` objects via the :keyword:`with` statement to
ensure that all shared memory blocks are released after they are no longer
needed:

.. doctest::
   :options: +SKIP

   >>> with SharedMemoryManager() as smm:
   ...     sl = smm.ShareableList(range(2000))
   ...     # Divide the work among two processes, storing partial results in sl
   ...     p1 = Process(target=do_work, args=(sl, 0, 1000))
   ...     p2 = Process(target=do_work, args=(sl, 1000, 2000))
   ...     p1.start()
   ...     p2.start()  # A multiprocessing.Pool might be more efficient
   ...     p1.join()
   ...     p2.join()   # Wait for all work to complete in both processes
   ...     total_result = sum(sl)  # Consolidate the partial results now in sl

When using a :class:`SharedMemoryManager` in a :keyword:`with` statement, the
shared memory blocks created using that manager are all released when the
:keyword:`with` statement's code block finishes execution.

.. class:: ShareableList(sequence=None, *, name=None)

   Provides a mutable list-like object where all values stored within are
   stored in a shared memory block.  This constrains storable values to
   only the ``int``, ``float``, ``bool``, ``str`` (less than 10M bytes each),
   ``bytes`` (less than 10M bytes each), and ``None`` built-in data types.
   It also notably differs from the built-in ``list`` type in that these
   lists can not change their overall length (i.e. no append, insert, etc.)
   and do not support the dynamic creation of new :class:`ShareableList`
   instances via slicing.

   *sequence* is used in populating a new ``ShareableList`` full of values.
   Set to ``None`` to instead attach to an already existing
   ``ShareableList`` by its unique shared memory name.

   *name* is the unique name for the requested shared memory, as described
   in the definition for :class:`SharedMemory`.  When attaching to an
   existing ``ShareableList``, specify its shared memory block's unique
   name while leaving ``sequence`` set to ``None``.

   .. method:: count(value)

      Returns the number of occurrences of ``value``.

   .. method:: index(value)

      Returns first index position of ``value``.  Raises :exc:`ValueError`
      if ``value`` is not present.

   .. attribute:: format

      Read-only attribute containing the :mod:`struct` packing format used
      by all currently stored values.

   .. attribute:: shm

      The :class:`SharedMemory` instance where the values are stored.

The following example demonstrates basic use of a :class:`ShareableList`
instance::

   >>> from multiprocessing import shared_memory
   >>> a = shared_memory.ShareableList(['howdy', b'HoWdY', -273.154, 100, None, True, 42])
   >>> [ type(entry) for entry in a ]
   [<class 'str'>, <class 'bytes'>, <class 'float'>, <class 'int'>, <class 'NoneType'>, <class 'bool'>, <class 'int'>]
   >>> a[2]
   -273.154
   >>> a[2] = -78.5
   >>> a[2]
   -78.5
   >>> a[2] = 'dry ice'  # Changing data types is supported as well
   >>> a[2]
   'dry ice'
   >>> a[2] = 'larger than previously allocated storage space'
   Traceback (most recent call last):
     ...
   ValueError: exceeds available storage for existing str
   >>> a[2]
   'dry ice'
   >>> len(a)
   7
   >>> a.index(42)
   6
   >>> a.count(b'howdy')
   0
   >>> a.count(b'HoWdY')
   1
   >>> a.shm.close()
   >>> a.shm.unlink()
   >>> del a  # Use of a ShareableList after call to unlink() is unsupported

The following example depicts how one, two, or many processes may access the
same :class:`ShareableList` by supplying the name of the shared memory block
behind it::

   >>> b = shared_memory.ShareableList(range(5))         # In a first process
   >>> c = shared_memory.ShareableList(name=b.shm.name)  # In a second process
   >>> c
   ShareableList([0, 1, 2, 3, 4], name='...')
   >>> c[-1] = -999
   >>> b[-1]
   -999
   >>> b.shm.close()
   >>> c.shm.close()
   >>> c.shm.unlink()