pytest-repeat for Windows flakiness? #3868

Open
@zm711

We regularly hit the same failure in our Windows test runs, related to ProcessPool.

Traceback, for our records:

Traceback (most recent call last):
  File "D:\a\spikeinterface\spikeinterface\src\spikeinterface\sorters\basesorter.py", line 270, in run_from_folder
    SorterClass._run_from_folder(sorter_output_folder, sorter_params, verbose)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\a\spikeinterface\spikeinterface\src\spikeinterface\sorters\internal\tridesclous2.py", line 186, in _run_from_folder
    labels_set, clustering_label, extra_out = find_cluster_from_peaks(
                                              ~~~~~~~~~~~~~~~~~~~~~~~^
        recording, peaks, method="tdc_clustering", method_kwargs=clustering_kwargs, extra_outputs=True, **job_kwargs
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "D:\a\spikeinterface\spikeinterface\src\spikeinterface\sortingcomponents\clustering\main.py", line 44, in find_cluster_from_peaks
    outputs = method_class.main_function(recording, peaks, params, job_kwargs=job_kwargs)
  File "D:\a\spikeinterface\spikeinterface\src\spikeinterface\sortingcomponents\clustering\tdc.py", line 157, in main_function
    post_split_label, split_count = split_clusters(
                                    ~~~~~~~~~~~~~~^
        original_labels,
        ^^^^^^^^^^^^^^^^
    ...<26 lines>...
        **job_kwargs,
        ^^^^^^^^^^^^^
    )
    ^
  File "D:\a\spikeinterface\spikeinterface\src\spikeinterface\sortingcomponents\clustering\split.py", line 108, in split_clusters
    is_split, local_labels, peak_indices, sub_folder = res.result()
                                                       ~~~~~~~~~~^^
  File "C:\hostedtoolcache\windows\Python\3.13.2\x64\Lib\concurrent\futures\_base.py", line 456, in result
    return self.__get_result()
           ~~~~~~~~~~~~~~~~~^^
  File "C:\hostedtoolcache\windows\Python\3.13.2\x64\Lib\concurrent\futures\_base.py", line 401, in __get_result
    raise self._exception
concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.

I see this a lot on my own Windows workstation: if the computer sleeps for a moment or the network connection is weak, the run fails. I don't know whether there is a real fix for this, so one option would be to add pytest-repeat to our test suite and, on Windows only, automatically retry some of these tests. That way we only spend time rerunning those tests rather than the whole test suite. Thoughts, @h-mayorquin, @alejoe91? And for @samuelgarcia: this would be more environmentally friendly, since we would cut the compute time we use by rerunning only the few tests we know fail intermittently.

We could allow up to two attempts for some of these tests. I think this can be set per test, but it may turn out to be a global setting that retries any failing test, which would be more expensive. If there is interest I can read up on this more; if not, we can just keep rerunning the whole suite each time we hit this failure...
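To make the per-test idea concrete, here is a minimal sketch that uses only the standard library. The decorator name `retry_on_broken_pool`, its parameters, and the example test name are all hypothetical and are not part of spikeinterface or any pytest plugin. It reruns a test only when `BrokenProcessPool` is raised, and by default only on Windows, so genuine failures still surface on the first attempt. One thing worth double-checking against the plugin docs: as far as I understand, pytest-repeat (`--count`, `@pytest.mark.repeat`) runs a test a fixed number of times regardless of outcome, whereas pytest-rerunfailures (`--reruns`, `@pytest.mark.flaky(reruns=...)`) reruns only on failure and can be applied per test, so the latter may be the closer fit.

```python
import functools
import sys
from concurrent.futures.process import BrokenProcessPool


def retry_on_broken_pool(max_attempts=2, only_on_windows=True):
    """Hypothetical helper: rerun a test if the process pool died mid-run.

    Only BrokenProcessPool triggers a retry; any other exception
    propagates immediately so real bugs are not masked.
    """

    def decorator(test_func):
        @functools.wraps(test_func)
        def wrapper(*args, **kwargs):
            # Outside Windows (by default) the test gets a single attempt.
            retry_here = sys.platform == "win32" or not only_on_windows
            attempts = max_attempts if retry_here else 1
            for attempt in range(1, attempts + 1):
                try:
                    return test_func(*args, **kwargs)
                except BrokenProcessPool:
                    if attempt == attempts:
                        raise  # out of attempts: report the failure

        return wrapper

    return decorator


# Hypothetical usage on one of the tests known to fail intermittently.
@retry_on_broken_pool(max_attempts=2)
def test_sorter_with_process_pool():
    pass  # the real test body would run the sorter here
```

Only the decorated tests pay the retry cost, which addresses the per-test versus global concern: nothing else in the suite is rerun.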
