lumeohq/deepstream-encoder-segfault

DeepStream SEGFAULT

This is a minimal reproducible example of the segfault issue with DeepStream.

To run once:

./run.sh

To run the test in a loop with different combinations of thread and encoder counts:

sudo apt install python3-tabulate
./table.py
  • image nvcr.io/nvidia/deepstream:7.1-triton-multiarch
  • GPU:
    • 3070ti
    • 4060ti
  • Drivers affected:
    • 550.120
    • 560.35.05
    • 570.144
  • Drivers not affected:
    • 535.183

The test looks like this:

  • run.sh starts the process
    • the process creates THREAD_COUNT threads
    • each thread runs a loop which stops after 5 seconds
      • each loop iteration:
        • creates a pipeline with ENCODERS_PER_PIPELINE encoders
        • starts the pipeline
        • once the pipeline has started or failed to start, stops/unrefs it
        • continues with the next iteration if 5 seconds have not elapsed yet
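
The structure of the test can be sketched in Python as follows. This is only an illustration of the loop described above, not the code in this repository: run_pipeline_once is a hypothetical callable standing in for "create, start, then stop/unref the pipeline", and the names worker and run_test are made up for this sketch.

```python
import threading
import time


def worker(run_pipeline_once, duration_s, results):
    """Start and stop pipelines in a loop until duration_s has elapsed."""
    deadline = time.monotonic() + duration_s
    iterations = 0
    while True:
        try:
            # Hypothetical callable: create the pipeline, start it, stop/unref it.
            run_pipeline_once()
        except RuntimeError:
            # A pipeline that fails to start is tolerated; the loop goes on.
            pass
        iterations += 1
        if time.monotonic() >= deadline:
            break
    results.append(iterations)


def run_test(run_pipeline_once, thread_count, duration_s=5.0):
    """Run thread_count workers in parallel and return their iteration counts."""
    results = []
    threads = [
        threading.Thread(target=worker, args=(run_pipeline_once, duration_s, results))
        for _ in range(thread_count)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

With a no-op callable, run_test(lambda: None, thread_count=7) returns one iteration count per thread after roughly 5 seconds.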

Pipeline looks like this:

videotestsrc ! nvvideoconvert ! nvv4l2h264enc ! fakesink  videotestsrc ! nvvideoconvert ! nvv4l2h264enc ! fakesink

Note that the pipeline contains the videotestsrc ! nvvideoconvert ! nvv4l2h264enc ! fakesink branch twice. This corresponds to ENCODERS_PER_PIPELINE=2.
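
As a rough illustration of how such a description scales with the encoder count, the branch can simply be repeated. The helper name pipeline_description is hypothetical and the actual program may build its pipeline differently:

```python
# One independent branch with a single hardware encoder.
BRANCH = "videotestsrc ! nvvideoconvert ! nvv4l2h264enc ! fakesink"


def pipeline_description(encoders_per_pipeline: int) -> str:
    """Repeat the encoder branch once per requested encoder."""
    if encoders_per_pipeline < 1:
        raise ValueError("encoders_per_pipeline must be at least 1")
    # Branches are not linked to each other, so whitespace is enough to separate them.
    return "  ".join([BRANCH] * encoders_per_pipeline)
```

Calling pipeline_description(2) reproduces the two-branch description shown above.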

If the pipeline starts successfully, we stop it and continue with the next iteration, so each thread starts and stops pipelines in a loop. In 5 seconds, each thread may start and stop about a hundred pipelines.

Of course, a pipeline with 14 encoders won't run well because the 3070ti only has 8 encoder sessions, but it shouldn't segfault in any case. If the pipeline fails to start, the GStreamer error is caught, the pipeline is stopped/unreffed, and the loop continues.

If the program runs for 5 seconds, we consider it a successful run, stop the threads gracefully and terminate with exit code 0. As long as it doesn't segfault, we count the run as a success, even in the presence of GStreamer errors.

If the program crashes, we just capture exit code 139.
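
Exit code 139 is the shell convention of 128 plus the signal number, and SIGSEGV is signal 11. The sketch below shows one way to recognise it from Python, where subprocess reports a signal death as a negative return code instead. The helper names are hypothetical and not taken from table.py:

```python
import signal


def shell_exit_code(returncode: int) -> int:
    """Convert a subprocess-style return code to the shell's 128 + N convention."""
    # subprocess reports "killed by signal N" as -N; shells report 128 + N.
    return 128 - returncode if returncode < 0 else returncode


def is_segfault(returncode: int) -> bool:
    """True if the code means the process was terminated by SIGSEGV."""
    return shell_exit_code(returncode) == 128 + signal.SIGSEGV
```

Both is_segfault(139) and is_segfault(-11) are true, so the check works whether the test is launched through a shell script or directly.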

The table.py script prints a table showing how often the program crashes for each combination of thread count and encoder count. To run it, install tabulate: sudo apt install python3-tabulate.

Let's look at the cell Thr=7 / Enc=2 as an example. {0: 11, 139: 57} means that the test returned

  • 0 exit code (success) 11 times
  • 139 exit code (segfault) 57 times
+-------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+
|   Thr \ Enc | 1                | 2                | 3                | 4                | 5                | 6                | 7                | 8                | 9                | 10               | 11               | 12               |
+=============+==================+==================+==================+==================+==================+==================+==================+==================+==================+==================+==================+==================+
|           1 | {0: 69}          | {0: 69}          | {0: 69}          | {0: 69}          | {0: 69}          | {0: 69}          | {0: 69}          | {0: 69}          | {0: 69}          | {0: 9}           | {0: 69}          | {0: 69}          |
+-------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+
|           2 | {0: 69}          | {0: 69}          | {0: 66, 139: 3}  | {0: 67, 139: 2}  | {0: 69}          | {0: 50, 139: 19} | {0: 61, 139: 8}  | {0: 59, 139: 10} | {0: 56, 139: 13} | {0: 0, 139: 9}   | {0: 52, 139: 17} | {0: 45, 139: 24} |
+-------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+
|           3 | {0: 55, 139: 14} | {0: 67, 139: 2}  | {0: 51, 139: 18} | {0: 37, 139: 32} | {0: 26, 139: 43} | {0: 37, 139: 32} | {0: 31, 139: 38} | {0: 33, 139: 36} | {0: 33, 139: 36} | {0: 1, 139: 38}  | {0: 30, 139: 39} | {0: 27, 139: 42} |
+-------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+
|           4 | {0: 59, 139: 10} | {0: 63, 139: 6}  | {0: 28, 139: 41} | {0: 18, 139: 51} | {0: 21, 139: 48} | {0: 18, 139: 51} | {0: 13, 139: 56} | {0: 26, 139: 43} | {0: 23, 139: 46} | {0: 2, 139: 37}  | {0: 26, 139: 43} | {0: 31, 139: 38} |
+-------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+
|           5 | {0: 65, 139: 4}  | {0: 21, 139: 48} | {0: 12, 139: 57} | {0: 24, 139: 45} | {0: 16, 139: 53} | {0: 22, 139: 47} | {0: 21, 139: 48} | {0: 47, 139: 22} | {0: 41, 139: 28} | {0: 7, 139: 42}  | {0: 56, 139: 13} | {0: 68, 139: 1}  |
+-------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+
|           6 | {0: 66, 139: 3}  | {0: 16, 139: 53} | {0: 19, 139: 50} | {0: 18, 139: 51} | {0: 29, 139: 40} | {0: 32, 139: 37} | {0: 44, 139: 25} | {0: 42, 139: 27} | {0: 56, 139: 13} | {0: 8, 139: 1}   | {0: 65, 139: 3}  | {0: 68}          |
+-------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+
|           7 | {0: 62, 139: 6}  | {0: 11, 139: 57} | {0: 25, 139: 43} | {0: 24, 139: 44} | {0: 33, 139: 35} | {0: 38, 139: 30} | {0: 46, 139: 22} | {0: 65, 139: 3}  | {0: 66, 139: 2}  | {0: 8}           | {0: 68}          | {0: 68}          |
+-------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+
|           8 | {0: 68}          | {0: 7, 139: 61}  | {0: 22, 139: 46} | {0: 31, 139: 37} | {0: 40, 139: 28} | {0: 46, 139: 22} | {0: 68}          | {0: 66, 139: 2}  | {0: 68}          | {0: 8}           | {0: 68}          | {0: 68}          |
+-------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+
|           9 | {0: 46, 139: 22} | {0: 18, 139: 50} | {0: 34, 139: 34} | {0: 33, 139: 35} | {0: 41, 139: 27} | {0: 41, 139: 27} | {0: 67, 139: 1}  | {0: 68}          | {0: 68}          | {0: 8}           | {0: 68}          | {0: 68}          |
+-------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+
|          10 | {0: 53, 139: 15} | {0: 20, 139: 48} | {0: 29, 139: 39} | {0: 48, 139: 20} | {0: 46, 139: 22} | {0: 67, 139: 1}  | {0: 68}          | {0: 67, 139: 1}  | {0: 68}          | {0: 8}           | {0: 68}          | {0: 68}          |
+-------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+
|          11 | {0: 43, 139: 25} | {0: 21, 139: 47} | {0: 39, 139: 29} | {0: 43, 139: 25} | {0: 60, 139: 8}  | {0: 68}          | {0: 68}          | {0: 68}          | {0: 68}          | {0: 8}           | {0: 68}          | {0: 67, 139: 1}  |
+-------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+
|          12 | {0: 44, 139: 24} | {0: 41, 139: 27} | {0: 42, 139: 26} | {0: 48, 139: 20} | {0: 65, 139: 3}  | {0: 68}          | {0: 68}          | {0: 68}          | {0: 68}          | {0: 8}           | {0: 67, 139: 1}  | {0: 66, 139: 2}  |
+-------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+
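
To read a cell as a crash probability, divide the number of segfault exits by the total number of runs. The snippet below is a small sketch of that arithmetic using the Thr=7 / Enc=2 cell. The function name crash_rate is hypothetical, and this is not necessarily how table.py aggregates its results:

```python
from collections import Counter

SEGFAULT_EXIT_CODE = 139


def crash_rate(cell) -> float:
    """Fraction of runs in a cell that ended with the segfault exit code."""
    total = sum(cell.values())
    return cell.get(SEGFAULT_EXIT_CODE, 0) / total if total else 0.0


# Rebuild the Thr=7 / Enc=2 cell from a list of per-run exit codes.
exit_codes = [0] * 11 + [139] * 57
cell = Counter(exit_codes)

print(dict(cell))                 # {0: 11, 139: 57}
print(f"{crash_rate(cell):.0%}")  # 84%
```

For that cell, 57 of 68 runs crashed, which is roughly 84%.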

When the program is run under gdb, the stack trace leads to libnvcuvid.so:

...
Running pipeline succeeded
Thread 7 runtime: 1 seconds, Result: succeeded
Thread 7, iteration 3
Configuration value parsed from env: ENCODERS_PER_PIPELINE=2
Pipeline switched to READY state
Pipeline switched to READY state
Running pipeline failed
Thread 5 runtime: 1 seconds, Result: failed
Thread 5, iteration 4
Configuration value parsed from env: ENCODERS_PER_PIPELINE=2
Pipeline switched to READY state
Failed to query video capabilities: Invalid argument
Failed to query video capabilities: Invalid argument
Failed to query video capabilities: Invalid argument
Pipeline switched to READY state
Failed to query video capabilities: Invalid argument
Pipeline switched to READY state
Failed to query video capabilities: Invalid argument
Pipeline switched to READY state
[New Thread 0x732e0f400640 (LWP 239)]
[New Thread 0x732e24e00640 (LWP 240)]

Thread 32 "videotestsrc2:s" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x732e31000640 (LWP 222)]
0x0000732e71417941 in ?? () from /usr/lib/x86_64-linux-gnu/libnvcuvid.so.1
(gdb) bt
#0  0x0000732e71417941 in  () at /usr/lib/x86_64-linux-gnu/libnvcuvid.so.1
#1  0x0000732e71417a7a in  () at /usr/lib/x86_64-linux-gnu/libnvcuvid.so.1
#2  0x0000732e7148d326 in  () at /usr/lib/x86_64-linux-gnu/libnvcuvid.so.1
#3  0x0000732e7148da8d in  () at /usr/lib/x86_64-linux-gnu/libnvcuvid.so.1
#4  0x0000732e9d285ac3 in  () at /usr/lib/x86_64-linux-gnu/libc.so.6
#5  0x0000732e9d316a04 in clone () at /usr/lib/x86_64-linux-gnu/libc.so.6
