Description
We've seen two kinds of errors running CITE-seq-Count v1.4.2. Both seem related to the number of threads we choose for a given input. When running a very large FASTQ input (150M reads) with 1 or 2 threads, we get errors about values exceeding the maximum integer value. When running the same input with more threads, those errors go away. I assume this has something to do with splitting the data into child jobs: run as a single job, the input hits this error; run multithreaded (presumably as split jobs), it doesn't. Sorry, I don't have a stack trace on hand, but I can dig one out.
The second type of error (stack trace below) occurs when running a smaller input with a high thread count (12 here). The error is raised in getKneeEstimateDistance(). When I repeated the same command with one thread, there was no error. I don't know exactly what this code is doing, but perhaps when the input is split too finely, each child job doesn't have enough information for this code path to work?
17 Jul 2019 08:31:14,730 DEBUG: Processed 1,000,000 reads in 2.0 minutes, 20.88 seconds. Total reads: 1,000,000 in child 38287
17 Jul 2019 08:31:24,293 DEBUG: Mapping done for process 38287. Processed 1,603,942 reads
17 Jul 2019 08:32:31,010 DEBUG: /home/groups/prime-seq/pipeline_tools/bin/primeseq-python/lib/python3.5/site-packages/umi_tools/whitelist_methods.py:283: RuntimeWarning: invalid value encountered in sqrt
17 Jul 2019 08:32:31,019 DEBUG: lineVecNorm = lineVec / np.sqrt(np.sum(lineVec**2))
17 Jul 2019 08:32:31,024 DEBUG: Mapping done
17 Jul 2019 08:32:31,030 DEBUG: Merging results
17 Jul 2019 08:32:31,038 DEBUG: Correcting cell barcodes
17 Jul 2019 08:32:31,044 DEBUG: Finding a whitelist
17 Jul 2019 08:32:31,048 DEBUG: Traceback (most recent call last):
17 Jul 2019 08:32:31,054 DEBUG: File "/home/groups/prime-seq/pipeline_tools/bin/primeseq-python/bin/CITE-seq-Count", line 10, in <module>
17 Jul 2019 08:32:31,058 DEBUG: sys.exit(main())
17 Jul 2019 08:32:31,065 DEBUG: File "/home/groups/prime-seq/pipeline_tools/bin/primeseq-python/lib/python3.5/site-packages/cite_seq_count/main.py", line 352, in main
17 Jul 2019 08:32:31,070 DEBUG: collapsing_threshold=args.bc_threshold)
17 Jul 2019 08:32:31,076 DEBUG: File "/home/groups/prime-seq/pipeline_tools/bin/primeseq-python/lib/python3.5/site-packages/cite_seq_count/processing.py", line 310, in correct_cells
17 Jul 2019 08:32:31,086 DEBUG: plotfile_prefix=False)
17 Jul 2019 08:32:31,093 DEBUG: File "/home/groups/prime-seq/pipeline_tools/bin/primeseq-python/lib/python3.5/site-packages/umi_tools/whitelist_methods.py", line 447, in getCellWhitelist
17 Jul 2019 08:32:31,097 DEBUG: cell_barcode_counts, cell_number, plotfile_prefix)
17 Jul 2019 08:32:31,102 DEBUG: File "/home/groups/prime-seq/pipeline_tools/bin/primeseq-python/lib/python3.5/site-packages/umi_tools/whitelist_methods.py", line 322, in getKneeEstimateDistance
17 Jul 2019 08:32:31,106 DEBUG: raise ValueError("Something's gone wrong here!!")
17 Jul 2019 08:32:31,112 DEBUG: ValueError: Something's gone wrong here!!
17 Jul 2019 08:32:33,030 WARN : process exited with non-zero value: 1
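
For reference, here is a minimal sketch of one generic way the expression quoted in the RuntimeWarning above can produce "invalid value encountered in sqrt". This is only an assumption, not a diagnosis of umi_tools: I have not checked what dtype or values `lineVec` actually holds there. The input values below are hypothetical, picked so that squaring overflows a 64-bit integer; the sum of squares then wraps to a negative number and the square root becomes NaN.

```python
import warnings

import numpy as np


def normalize(line_vec):
    # Same expression as the line quoted in the warning in the log above.
    return line_vec / np.sqrt(np.sum(line_vec ** 2))


# Hypothetical values (not taken from the real run): the first element is
# just large enough that its square no longer fits in a signed 64-bit integer.
line_vec = np.array([3_037_000_500, 1], dtype=np.int64)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Integer array arithmetic wraps silently, so the sum of squares is
    # negative and np.sqrt emits the RuntimeWarning and returns NaN.
    int_result = normalize(line_vec)

messages = [str(w.message) for w in caught]

# The same values as floats do not wrap, so the result stays finite.
float_result = normalize(line_vec.astype(np.float64))

print(int_result)
print(messages)
print(float_result)
```

If something like this were the cause, both errors could share a root (integer overflow), but that would need to be confirmed against the actual values passed into getKneeEstimateDistance().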