Description
Not an urgent matter, as I figured out a way around the issue; I am reporting it here to make sure I am not misunderstanding something, and in case someone else runs into it. Also, this behavior appeared within the last month or so, so maybe it is still unreported?
I am working on a Windows machine (Anaconda, Python 3.10) and ran into trouble using multiprocessing in every scenario I tried. Even a script as simple as this:
from pathlib import Path
from datetime import datetime
from spikeinterface.core import load_extractor
data_path = Path(r"...\test-data")
test_data = load_extractor(data_path)
temp_path = data_path.parent / f"test_dataset_resaved_{datetime.now().strftime('%Y%m%d-%H%M%S')}"
test_data.save(folder=temp_path, n_jobs=-1)
would result in the following error:
write_binary_recording with n_jobs = 20 and chunk_size = 30000
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\SNeurobiology\miniconda3\envs\ephys-env\lib\multiprocessing\spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "C:\Users\SNeurobiology\miniconda3\envs\ephys-env\lib\multiprocessing\spawn.py", line 125, in _main
    prepare(preparation_data)
  File "C:\Users\SNeurobiology\miniconda3\envs\ephys-env\lib\multiprocessing\spawn.py", line 236, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "C:\Users\SNeurobiology\miniconda3\envs\ephys-env\lib\multiprocessing\spawn.py", line 287, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
  File "C:\Users\SNeurobiology\miniconda3\envs\ephys-env\lib\runpy.py", line 289, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "C:\Users\SNeurobiology\miniconda3\envs\ephys-env\lib\runpy.py", line 96, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "C:\Users\SNeurobiology\miniconda3\envs\ephys-env\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "c:\Users\SNeurobiology\code\ephys-preprocessing\benchmarking\debug_compression.py", line 10, in <module>
    test_data.save(folder=temp_path, n_jobs=-1)
  File "C:\Users\SNeurobiology\code\spikeinterface\src\spikeinterface\core\base.py", line 760, in save
    loaded_extractor = self.save_to_folder(**kwargs)
  File "C:\Users\SNeurobiology\code\spikeinterface\src\spikeinterface\core\base.py", line 838, in save_to_folder
    cached = self._save(folder=folder, verbose=verbose, **save_kwargs)
  File "C:\Users\SNeurobiology\code\spikeinterface\src\spikeinterface\core\baserecording.py", line 462, in _save
    write_binary_recording(self, file_paths=file_paths, dtype=dtype, **job_kwargs)
  File "C:\Users\SNeurobiology\code\spikeinterface\src\spikeinterface\core\core_tools.py", line 314, in write_binary_recording
    executor.run()
  File "C:\Users\SNeurobiology\code\spikeinterface\src\spikeinterface\core\job_tools.py", line 391, in run
    results = executor.map(function_wrapper, all_chunks)
  File "C:\Users\SNeurobiology\miniconda3\envs\ephys-env\lib\concurrent\futures\process.py", line 766, in map
    results = super().map(partial(_process_chunk, fn),
  File "C:\Users\SNeurobiology\miniconda3\envs\ephys-env\lib\concurrent\futures\_base.py", line 610, in map
    fs = [self.submit(fn, *args) for args in zip(*iterables)]
  File "C:\Users\SNeurobiology\miniconda3\envs\ephys-env\lib\concurrent\futures\_base.py", line 610, in <listcomp>
    fs = [self.submit(fn, *args) for args in zip(*iterables)]
  File "C:\Users\SNeurobiology\miniconda3\envs\ephys-env\lib\concurrent\futures\process.py", line 737, in submit
    self._adjust_process_count()
  File "C:\Users\SNeurobiology\miniconda3\envs\ephys-env\lib\concurrent\futures\process.py", line 697, in _adjust_process_count
    self._spawn_process()
  File "C:\Users\SNeurobiology\miniconda3\envs\ephys-env\lib\concurrent\futures\process.py", line 714, in _spawn_process
    p.start()
  File "C:\Users\SNeurobiology\miniconda3\envs\ephys-env\lib\multiprocessing\process.py", line 121, in start
    self._popen = self._Popen(self)
  File "C:\Users\SNeurobiology\miniconda3\envs\ephys-env\lib\multiprocessing\context.py", line 336, in _Popen
    return Popen(process_obj)
  File "C:\Users\SNeurobiology\miniconda3\envs\ephys-env\lib\multiprocessing\popen_spawn_win32.py", line 45, in __init__
    prep_data = spawn.get_preparation_data(process_obj._name)
  File "C:\Users\SNeurobiology\miniconda3\envs\ephys-env\lib\multiprocessing\spawn.py", line 154, in get_preparation_data
    _check_not_importing_main()
  File "C:\Users\SNeurobiology\miniconda3\envs\ephys-env\lib\multiprocessing\spawn.py", line 134, in _check_not_importing_main
    raise RuntimeError('''
RuntimeError:
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.
The workaround was to refactor my script as follows:
from pathlib import Path
from datetime import datetime
from spikeinterface.core import load_extractor
data_path = Path(r"...\test-data")
test_data = load_extractor(data_path)
temp_path = data_path.parent / f"test_dataset_resaved_{datetime.now().strftime('%Y%m%d-%H%M%S')}"
if __name__ == "__main__":
    test_data.save(folder=temp_path, n_jobs=-1)
And now it works. Still, I can't understand why I would need to protect my script from being imported; the inner workings of multiprocessing are not entirely clear to me, and maybe I am missing something obvious.
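For what it's worth, my current understanding (which may well be wrong, hence this report) is that on Windows multiprocessing can only use the "spawn" start method: every worker starts a fresh interpreter and re-imports the main script to rebuild its state. Without the guard, the module-level save(..., n_jobs=-1) call runs again during that re-import, so each worker tries to spawn its own pool of workers, which seems to be exactly what _check_not_importing_main in the traceback above protects against. A minimal sketch independent of spikeinterface (the file name and the work function are made up for illustration):

# spawn_demo.py -- hypothetical minimal example, not spikeinterface code
import multiprocessing as mp
import os

# Under the "spawn" start method this line runs once in the parent and
# once more in every child, because each child re-imports the main module.
print(f"importing module in pid {os.getpid()}, __name__ = {__name__!r}")

def work(x):
    return x * 2

if __name__ == "__main__":
    # Creating the pool under the guard keeps it out of the re-import;
    # without the guard, every child would try to create its own pool
    # and raise the "bootstrapping phase" RuntimeError shown above.
    with mp.Pool(2) as pool:
        print(pool.map(work, [1, 2, 3]))

On Windows the children import the script with __name__ set to "__mp_main__", so the guarded block only ever runs in the parent.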
This started after I pulled the latest version of the package; when I was playing with it approximately one month ago I do not remember encountering this issue, even though I was definitely using multiprocessing already.
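In case it is useful to others: if the understanding above is correct, the module-level load_extractor call is also re-executed in every spawned worker, so presumably the fully guarded version below avoids re-loading the dataset once per worker (the same script as above, just with everything moved under the guard):

from pathlib import Path
from datetime import datetime
from spikeinterface.core import load_extractor

if __name__ == "__main__":
    # Everything lives under the guard, so spawned workers re-import the
    # module without re-loading the data or re-triggering save().
    data_path = Path(r"...\test-data")
    test_data = load_extractor(data_path)
    temp_path = data_path.parent / f"test_dataset_resaved_{datetime.now().strftime('%Y%m%d-%H%M%S')}"
    test_data.save(folder=temp_path, n_jobs=-1)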
Thank you so very much for the amazing package!