This library lets you create arrays backed by mmap. The interface is designed to closely follow the array module in the standard library.
It also gives you a memory space that can be shared between different processes. This can help work around the concurrency limitations present in some Python implementations.
This library was originally created to support very large precomputed lookup tables used across multiple threads/processes. Because the tables are so large, creating a copy for each process would have been very expensive in terms of memory usage. With an mmap backing that is accessed read-only, multiple processes can read from the same memory, which saves memory.
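The lookup-table idea can be sketched with the standard library alone. The example below does not use this library: it writes a table to a temporary file and lets a separate Python process map it read-only and fetch one entry. The names `READER_SOURCE`, `build_table`, `read_in_child` and `demo` are hypothetical, and the table of squares is only illustrative data.

```python
import array
import mmap
import os
import subprocess
import sys
import tempfile

# Source for a separate reader process: it maps the file read-only and
# prints a single entry instead of loading its own copy of the table.
READER_SOURCE = """
import mmap, sys
path, index = sys.argv[1], int(sys.argv[2])
with open(path, "rb") as fd:
    with mmap.mmap(fd.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        with memoryview(mm) as raw, raw.cast("I") as table:
            print(table[index])
"""


def build_table(path, size):
    """Write a table of squares as native unsigned ints (typecode 'I')."""
    with open(path, "wb") as fd:
        array.array("I", (i * i for i in range(size))).tofile(fd)


def read_in_child(path, index):
    """Ask a fresh Python process for one table entry."""
    result = subprocess.run(
        [sys.executable, "-c", READER_SOURCE, path, str(index)],
        capture_output=True, text=True, check=True,
    )
    return int(result.stdout)


def demo():
    """Build a temporary table and read three entries from child processes."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        build_table(path, 1000)
        return [read_in_child(path, i) for i in (3, 7, 999)]
    finally:
        os.remove(path)
```

Each reader only touches the pages it needs, and the operating system can share those pages between all processes that map the same file.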
There is another use case in interprocess communication: if you want quick-and-dirty shared memory between Python processes and processes written in other languages, you can use this library to share a flat memory space between them.
If you just need some simple shared memory and don't want to, or can't, bring in a more complicated dependency, this might be what you need. For more complicated concurrency tasks there may be more suitable libraries.
If you don't provide a file for the mmap backing, an anonymous mmap is created to back the array.
import mmap_backed_array
arr = mmap_backed_array.mmaparray('I', [1, 2, 3, 4])
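For readers unfamiliar with anonymous mappings, here is a minimal standard-library sketch of the mechanism. It does not use this library, and it only assumes that the library does something comparable internally; the variable names are illustrative.

```python
import array
import mmap

values = array.array("I", [1, 2, 3, 4])
size = len(values) * values.itemsize

# fileno=-1 requests an anonymous mapping that is not tied to any file.
backing = mmap.mmap(-1, size)
backing[:size] = values.tobytes()

# View the raw bytes as unsigned ints without copying them.
with memoryview(backing) as raw, raw.cast("I") as view:
    view[0] = 10              # writes straight into the mapping
    as_list = view.tolist()   # copy out before the views are released

backing.close()
```

The memoryview objects are released before `backing.close()` because a mapping cannot be closed while buffers exported from it are still alive.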
You can also provide a file-backed mmap as the backing.
import mmap_backed_array
import mmap
with open("mmap_file", 'rb') as fd:
    mmap_backing = mmap.mmap(
        fd.fileno(), 0, access=mmap.ACCESS_READ
    )
    arr = mmap_backed_array.mmaparray('I', mmap=mmap_backing)
Note that this file can be shared with other processes, including ones that are not written in Python.
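A backing file has to exist and be non-empty before it can be mapped. The sketch below shows one way to produce such a file with the standard library and what a reader in another language would find in it: a flat run of native-endian items with no header. The helper name `write_backing_file` is hypothetical, and a temporary directory stands in for wherever you keep the real file.

```python
import array
import os
import struct
import sys
import tempfile


def write_backing_file(path, values, typecode="I"):
    """Store values as a flat run of native-endian items, with no header."""
    with open(path, "wb") as fd:
        array.array(typecode, values).tofile(fd)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "mmap_file")
    write_backing_file(path, [1, 2, 3, 4])
    with open(path, "rb") as fd:
        raw = fd.read()

item_size = array.array("I").itemsize   # usually 4 bytes for 'I'
count = len(raw) // item_size

# Decode the bytes the way a non-Python reader on the same machine would:
# 'count' unsigned ints in native byte order and size.
decoded = struct.unpack(f"@{count}I", raw)
byte_order = sys.byteorder              # 'little' or 'big'
```

Because the layout depends on the machine's byte order and the item size of the typecode, processes on different architectures would need to agree on both.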
The API is designed to be as close to the standard library array module API as possible, so using an mmaparray should feel like using array.array. Another goal is to make interoperating with array.array as easy as possible.
Major functionality, including (but not limited to) append, extend, pop and tobytes, is supported.
There may currently be some slight API incompatibilities. If anything substantial in the array API is not implemented, please open an issue.
For example:
>>> from mmap_backed_array import mmaparray
>>> arr = mmaparray('I')
>>> arr.append(1)
>>> arr
array('I', [1])
>>> arr.extend([2,3,4])
>>> arr
array('I', [1, 2, 3, 4])
You can also use the standard library arrays easily with the mmap backed arrays:
>>> from mmap_backed_array import mmaparray
>>> mmap_array = mmaparray('I', (1, 1, 1, 1))
>>> mmap_array
array('I', [1, 1, 1, 1])
>>> import array
>>> mmap_array[2:4] = array.array('I', (2, 2))
>>> mmap_array
array('I', [1, 1, 2, 2])
Because values are stored directly in the array's memory, the typecodes must match, just as with the standard library array:
>>> mmap_array.typecode
'I'
>>> mmap_array[2:4] = array.array('b', (3, 3))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/janis/mmap_backed_array/mmap_backed_array/mmap_array.py", line 302, in __setitem__
    'Can only assign array of same type to array slice'
TypeError: Can only assign array of same type to array slice
TypeError: Can only assign array of same type to array slice
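The standard library array enforces the same rule, and the usual way around it is to re-encode the values with the target's typecode first. The sketch below uses only array.array. The same conversion should work for an mmaparray target, since assigning an array.array of the matching typecode to a slice is shown above.

```python
import array

target = array.array("I", [1, 1, 1, 1])
source = array.array("b", [3, 3])

try:
    target[2:4] = source   # typecodes 'I' and 'b' differ
except TypeError as exc:
    error = exc            # the slice assignment is rejected
else:
    error = None

# Re-encode the values with the target's typecode before assigning.
target[2:4] = array.array(target.typecode, source.tolist())
```

Going through `tolist()` converts each item to a Python int, so values that do not fit the target typecode (for example negative numbers into 'I') raise an OverflowError instead of being silently reinterpreted.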