Description
We regularly hit the same ProcessPool-related failure in our Windows test runs.
Traceback here for our records:

    Traceback (most recent call last):
      File "D:\a\spikeinterface\spikeinterface\src\spikeinterface\sorters\basesorter.py", line 270, in run_from_folder
        SorterClass._run_from_folder(sorter_output_folder, sorter_params, verbose)
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "D:\a\spikeinterface\spikeinterface\src\spikeinterface\sorters\internal\tridesclous2.py", line 186, in _run_from_folder
        labels_set, clustering_label, extra_out = find_cluster_from_peaks(
                                                  ~~~~~~~~~~~~~~~~~~~~~~~^
            recording, peaks, method="tdc_clustering", method_kwargs=clustering_kwargs, extra_outputs=True, **job_kwargs
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        )
        ^
      File "D:\a\spikeinterface\spikeinterface\src\spikeinterface\sortingcomponents\clustering\main.py", line 44, in find_cluster_from_peaks
        outputs = method_class.main_function(recording, peaks, params, job_kwargs=job_kwargs)
      File "D:\a\spikeinterface\spikeinterface\src\spikeinterface\sortingcomponents\clustering\tdc.py", line 157, in main_function
        post_split_label, split_count = split_clusters(
                                        ~~~~~~~~~~~~~~^
            original_labels,
            ^^^^^^^^^^^^^^^^
            ...<26 lines>...
            **job_kwargs,
            ^^^^^^^^^^^^^
        )
        ^
      File "D:\a\spikeinterface\spikeinterface\src\spikeinterface\sortingcomponents\clustering\split.py", line 108, in split_clusters
        is_split, local_labels, peak_indices, sub_folder = res.result()
                                                           ~~~~~~~~~~^^
      File "C:\hostedtoolcache\windows\Python\3.13.2\x64\Lib\concurrent\futures\_base.py", line 456, in result
        return self.__get_result()
               ~~~~~~~~~~~~~~~~~^^
      File "C:\hostedtoolcache\windows\Python\3.13.2\x64\Lib\concurrent\futures\_base.py", line 401, in __get_result
        raise self._exception
    concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.

I see this on my own Windows workstation a lot. If the computer sleeps for a moment or the network connection is weak, the run fails. I don't know whether there is a real fix for this, so one thing we could do instead is add pytest-repeat to our test suite and, on Windows only, automatically retry some of these tests, so that we only spend time rerunning those tests rather than the whole test suite. Thoughts @h-mayorquin, @alejoe91? And for @samuelgarcia: this would be more environmentally friendly, since it reduces the compute time we use by focusing on the few tests we know fail intermittently.
We could set a repeat count of two on just some of these tests (I think this can be done per test, but it may turn out to be a global setting that retries any failing test twice, which would be more expensive). If there is interest I can read up on this more; if not, we can just keep rerunning the whole suite each time we hit this failure...