Issue with multithreading SpykingCircus2 #3678

Open
@Djoels

Description

Quite often, when I start a sorting job with spykingcircus2, I end up with a lot of "defunct" (zombie) processes. The output of the ps command looks like this:

    PID TTY      STAT   TIME COMMAND
   3213 ?        Ss     0:00 /anaconda/envs/azureml_py38/bin/python /usr/local/bin/EDAT_Engine/engine.p
   3216 ?        Ssl  139:56 /anaconda/envs/jupyter_env/bin/python3.10 /anaconda/envs/jupyter_env/bin/j
   4808 pts/0    Ss+    0:00 /bin/bash -l
   5337 pts/0    Sl     0:48 python automate_sorting.py --sorter spykingcircus2
  28372 pts/0    Sl    13:11 python hpoptuna.py --sorter spykingcircus2 --dataset hgt19 --ma
  29653 pts/0    S      0:00 /anaconda/envs/boc_minimal/bin/python -c from multiprocessing.resource_tra
  30090 pts/0    S      0:00 python hpoptuna.py --sorter spykingcircus2 --dataset hgt19 --ma
  30092 pts/0    Z      1:24 [python] <defunct>
  30094 pts/0    Z      1:22 [python] <defunct>
  30096 pts/0    Z      1:21 [python] <defunct>
  30098 pts/0    Z      1:22 [python] <defunct>
  30100 pts/0    Z      1:22 [python] <defunct>
  30102 pts/0    Z      1:22 [python] <defunct>
  30104 pts/0    Z      1:23 [python] <defunct>
  30106 pts/0    Z      1:23 [python] <defunct>
  30108 pts/0    Z      1:22 [python] <defunct>
  30110 pts/0    Z      1:22 [python] <defunct>
  30112 pts/0    Z      1:23 [python] <defunct>
  30114 pts/0    Z      1:25 [python] <defunct>
  30116 pts/0    Z      1:23 [python] <defunct>
  30118 pts/0    Z      1:24 [python] <defunct>
  30120 pts/0    Z      1:23 [python] <defunct>
  30122 pts/0    Z      1:23 [python] <defunct>
  30124 pts/0    Z      1:22 [python] <defunct>
  30126 pts/0    Z      1:22 [python] <defunct>
  30128 pts/0    Z      1:23 [python] <defunct>
  30130 pts/0    Z      1:25 [python] <defunct>
  30132 pts/0    Z      1:22 [python] <defunct>
  30134 pts/0    Z      1:21 [python] <defunct>
  30136 pts/0    Z      1:24 [python] <defunct>
  30138 pts/0    Z      1:23 [python] <defunct>
  30140 pts/0    Z      1:24 [python] <defunct>
  30142 pts/0    Z      1:24 [python] <defunct>
  30144 pts/0    Z      1:23 [python] <defunct>
  30146 pts/0    Z      1:22 [python] <defunct>
  30148 pts/0    Z      1:20 [python] <defunct>
  30150 pts/0    Z      1:21 [python] <defunct>
  30152 pts/0    Z      1:22 [python] <defunct>
  30154 pts/0    Z      1:23 [python] <defunct>

This is an Azure Linux environment. Not sure what's going on.
When I kill the process running the sorting (here PID 30090), it immediately proceeds with the sorting.
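
To make the state of the workers easier to report, here is a minimal sketch that pulls the zombie PIDs out of a ps listing like the one above. It assumes the five-column layout shown (PID, TTY, STAT, TIME, COMMAND) and treats any STAT value starting with Z as defunct. The helper name find_defunct_pids and the SAMPLE text are illustrative only and not part of SpikeInterface.

```python
# Sketch: list the PIDs of defunct (zombie) processes in `ps` output.
# Assumes the column layout "PID TTY STAT TIME COMMAND" shown in this issue.

SAMPLE = """\
    PID TTY      STAT   TIME COMMAND
  30090 pts/0    S      0:00 python hpoptuna.py --sorter spykingcircus2
  30092 pts/0    Z      1:24 [python] <defunct>
  30094 pts/0    Z      1:22 [python] <defunct>
"""


def find_defunct_pids(ps_text):
    """Return the PIDs whose STAT column starts with 'Z' (zombie)."""
    pids = []
    for line in ps_text.splitlines():
        # Split into at most 5 fields so the command keeps its spaces.
        fields = line.split(None, 4)
        # Skip the header row and any short or malformed line.
        if len(fields) < 5 or not fields[0].isdigit():
            continue
        pid, _tty, stat, _time, _command = fields
        if stat.startswith("Z"):
            pids.append(int(pid))
    return pids
```

Calling find_defunct_pids(SAMPLE) returns the two zombie PIDs from the sample and leaves out the sleeping parent (30090).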

Found 1497965 spikes
scipy.optimize.least_squares error: Initial guess is outside of provided bounds
scipy.optimize.least_squares error: Initial guess is outside of provided bounds
scipy.optimize.least_squares error: Initial guess is outside of provided bounds
scipy.optimize.least_squares error: Initial guess is outside of provided bounds
Kept 61 units after final merging
...

The output makes me think it didn't go beyond this line:

"sparsity": {"method": "snr", "amplitude_mode": "peak_to_peak", "threshold": 0.25},

I will try playing around with the mp_context parameter.
If this issue looks familiar and anyone has tips, feel free to share them.
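
For reference, a rough sketch of what switching the multiprocessing context looks like in plain Python, assuming mp_context refers to the standard multiprocessing start method ("fork", "spawn" or "forkserver"). It uses only the standard library, so it shows the mechanism rather than how SpikeInterface wires it up internally. The helper names pick_start_method and run_in_pool are hypothetical.

```python
# Sketch: run a worker pool under an explicitly chosen start method.
import multiprocessing as mp
from concurrent.futures import ProcessPoolExecutor


def pick_start_method(preferred=("spawn", "forkserver", "fork")):
    """Return the first start method in `preferred` that this platform offers."""
    available = mp.get_all_start_methods()
    for method in preferred:
        if method in available:
            return method
    raise RuntimeError(f"none of {preferred} is available (have {available})")


def run_in_pool(func, items, method, n_workers=2):
    """Map `func` over `items` in a process pool that uses `method`."""
    ctx = mp.get_context(method)
    # Leaving the `with` block shuts the pool down and waits for the
    # workers to exit, so they are reaped instead of lingering.
    with ProcessPoolExecutor(max_workers=n_workers, mp_context=ctx) as pool:
        return list(pool.map(func, items))
```

With "spawn" or "forkserver", the call to run_in_pool should sit under an `if __name__ == "__main__":` guard in the launching script, and func has to be importable at module level. Comparing a run with "spawn" against one with "fork" may help tell whether the hang depends on the start method.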

This may have to do with the custom environment.

Metadata

Labels: concurrency (related to parallel processing), sorters (related to sorters module)
