OCTproZ - Performance Information

The processing rate depends strongly on the size of the raw data, the computer hardware, and the resources consumed by background or system processes. With modern computer hardware and typical OCT data dimensions, OCTproZ achieves A-scan rates in the MHz range.

A test data set with 12 bits per sample, 1024 samples per raw A-scan, 512 A-scans per B-scan and 256 B-scans per volume was used to measure the performance on different systems:

	Office Computer	Lab Computer	Gaming Computer
CPU	Intel® Core i5-7500	AMD Ryzen™ Threadripper 1900X	AMD Ryzen™ 5 1600
RAM	16 GB	32 GB	16 GB
GPU	NVIDIA Quadro K620	NVIDIA GeForce GTX 1080 Ti	NVIDIA GeForce GTX 1080
Operating system	Windows 10	Ubuntu 16.04	Windows 10
A-scan rate with 3D view	~ 250 kHz (~ 1.9 volumes/s)	~ 4.0 MHz (~ 30 volumes/s)	~ 1.9 MHz (~ 15 volumes/s)
A-scan rate without 3D view	~ 300 kHz (~ 2.2 volumes/s)	~ 4.8 MHz (~ 36 volumes/s)	~ 2.4 MHz (~ 18 volumes/s)

Embedded System	NVIDIA Jetson Nano
CPU	ARMv8 Processor rev 1(v8l) x 4
RAM	4 GB
GPU	NVIDIA Tegra X1 (128-core Maxwell)
Operating system	Ubuntu 18.04 (JetPack 4.4.1)
A-scan rate with 3D view	~ 27 kHz (~ 0.2 volumes/s)
A-scan rate without 3D view	~ 116 kHz (~ 0.89 volumes/s)

Office Computer, Lab Computer:
The performance was measured with the full processing pipeline of OCTproZ v1.0.0. The same performance is expected with OCTproZ v1.2.0 if live sinusoidal scan distortion correction is disabled.

Gaming Computer:
The performance was measured with OCTproZ v1.2.0 with live sinusoidal scan distortion correction disabled.
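
The volume rates quoted in the tables above follow from dividing the A-scan rate by the number of A-scans per volume (512 A-scans per B-scan times 256 B-scans per volume for the test data set). The following is a minimal sketch of that arithmetic. The function name `volumes_per_second` is hypothetical and not part of OCTproZ, and the input rates are the rounded values from the tables, so the results agree with the tabulated volume rates only to within rounding.

```python
def volumes_per_second(a_scan_rate_hz, a_scans_per_b_scan=512, b_scans_per_volume=256):
    """Convert an A-scan rate in Hz to a volume rate in volumes per second."""
    a_scans_per_volume = a_scans_per_b_scan * b_scans_per_volume
    return a_scan_rate_hz / a_scans_per_volume


# Rounded A-scan rates without 3D view, taken from the tables above.
measured_rates_hz = [
    ("Office Computer", 300e3),
    ("Lab Computer", 4.8e6),
    ("Gaming Computer", 2.4e6),
    ("Jetson Nano", 116e3),
]

for label, rate_hz in measured_rates_hz:
    print(f"{label}: {volumes_per_second(rate_hz):.2f} volumes/s")
```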

The following parameters were used with Virtual OCT System and OCTproZ to determine the performance:

	Office Computer	Lab Computer	Gaming Computer	Jetson Nano
Virtual OCT System Settings
bit depth [bits]	12	12	12	12
Samples per raw A-scan	1024	1024	1024	1024
A-scans per B-scan	512	512	512	512
B-scans per buffer	32	256	256	32
Buffers per volume	8	1	1	8
Buffers to read from file	16	2	2	16
Wait after file read [us]	100	100	100	100
OCTproZ Settings
Bit shift sample values by 4	enabled	enabled	enabled	enabled
Flip every second B-scan	enabled	enabled	enabled	enabled
k-linearization	enabled	enabled	enabled	enabled
Dispersion Compensation	enabled	enabled	enabled	enabled
Windowing	enabled	enabled	enabled	enabled
Fixed-Pattern Noise Removal	enabled	enabled	enabled	enabled
B-scans for noise determination:	1	26	1	1
once at start of measurement	enabled	enabled	enabled	enabled
continuously	disabled	disabled	disabled	disabled
Sinusoidal scan correction	disabled	disabled	disabled	disabled
Log scaling	enabled	enabled	enabled	enabled
Stream Processed Data to Ram	enabled	disabled	disabled	enabled

How to Determine Performance

OCTproZ provides live performance information in the "Processing" tab of the sidebar. The live performance estimate is updated every 5 seconds.

It is also possible to use the NVIDIA Visual Profiler to analyze performance in more detail.

For example, the following screenshot from the NVIDIA Visual Profiler shows the performance analysis of the Lab Computer measurement (without 3D live view) from the table at the beginning of this document:

The individual kernels are labeled with letters:
a) data conversion
b) kernel that combines k-linearization, windowing and dispersion compensation
c) IFFT
d) subtraction step of fixed pattern noise removal
e) truncate and logarithm
f) backward scan correction
g) copy B-scan frame to display buffer
h) copy en face view to display buffer

Additional Information

Processing happens in batches. One batch equals one buffer, and the buffer size affects processing performance. If the buffer is too small, processing may be slower than the hardware allows. If it is too large, the application may crash, because a larger buffer requires more GPU memory and can exceed the memory available on the GPU.
The optimal buffer size for a specific GPU needs to be determined experimentally.
In Virtual OCT System, the buffer size can be changed via bit depth, Samples per raw A-scan, A-scans per B-scan and B-scans per buffer.
Buffer size in bytes = ceil(bitDepth/8) * SamplesPerRawAscan * AscansPerBscan * BscansPerBuffer
When B-scans per buffer is changed in Virtual OCT System, you should also change Buffers per volume and Buffers to read from file accordingly.
If OCTproZ crashes after setting the parameters in Virtual OCT System and starting the processing, try reducing the buffer size (for example, instead of B-scans per buffer: 256, Buffers per volume: 1, Buffers to read from file: 2, you could try B-scans per buffer: 128, Buffers per volume: 2, Buffers to read from file: 4).
In Virtual OCT System, a value greater than 2 for Buffers to read from file results in a slower processing rate displayed by OCTproZ, because Virtual OCT System takes more time to provide the raw data when more than two buffers are read from a file. The processing itself is not slowed down; only the time between two batches increases.
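
The buffer size formula and the adjustment of the related settings can be sketched as follows. This is an illustration only: the function names `buffer_size_bytes` and `rescale_settings` are hypothetical and not part of OCTproZ or Virtual OCT System. The result is the size of the raw data buffer. The total GPU memory that OCTproZ allocates for processing is larger and is not derived here.

```python
import math


def buffer_size_bytes(bit_depth, samples_per_raw_a_scan, a_scans_per_b_scan, b_scans_per_buffer):
    """Raw buffer size in bytes, following the formula given above."""
    bytes_per_sample = math.ceil(bit_depth / 8)  # 12-bit samples occupy 2 bytes
    return bytes_per_sample * samples_per_raw_a_scan * a_scans_per_b_scan * b_scans_per_buffer


def rescale_settings(b_scans_per_buffer, buffers_per_volume, buffers_to_read_from_file, factor):
    """Shrink the buffer by `factor` and increase both buffer counts to match."""
    if b_scans_per_buffer % factor != 0:
        raise ValueError("B-scans per buffer must be divisible by the factor")
    return (
        b_scans_per_buffer // factor,
        buffers_per_volume * factor,
        buffers_to_read_from_file * factor,
    )


# Lab Computer settings from the table above: 256 B-scans per buffer.
size = buffer_size_bytes(12, 1024, 512, 256)
print(f"Buffer size: {size} bytes ({size / 2**20:.0f} MiB)")

# Halving the buffer, as in the crash-recovery example above.
print(rescale_settings(256, 1, 2, factor=2))
```

With the Office Computer and Jetson Nano settings (32 B-scans per buffer), the same formula yields a buffer one eighth of that size.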

For performance measurement, you can use the provided test data set. To replicate the measurements above, set Samples per raw A-scan to 1024. This causes the resulting OCT images to look distorted, because the test data set was recorded with 1664 samples per raw A-scan. This is expected behavior and does not invalidate the performance measurement.

Performance and buffer size

The following bar graph shows the A-scan rate for different buffer sizes. The Gaming Computer setup without 3D live view described above was used. To change the buffer size, Buffers to read from file was kept at a value of 2, and only B-scans per buffer and Buffers per volume were changed.
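
A sweep of this kind can be planned with a short script. The sketch below lists every combination of B-scans per buffer and Buffers per volume that keeps the volume at 256 B-scans, together with the resulting raw buffer size. The function name `sweep_configurations` is hypothetical, and the list is not necessarily the set of buffer sizes shown in the graph.

```python
import math


def sweep_configurations(b_scans_per_volume=256, bit_depth=12,
                         samples_per_raw_a_scan=1024, a_scans_per_b_scan=512):
    """Return (B-scans per buffer, Buffers per volume, buffer size in MiB) tuples."""
    bytes_per_b_scan = math.ceil(bit_depth / 8) * samples_per_raw_a_scan * a_scans_per_b_scan
    configurations = []
    for b_scans_per_buffer in range(b_scans_per_volume, 0, -1):
        # Only keep buffer sizes that divide the volume into whole buffers.
        if b_scans_per_volume % b_scans_per_buffer == 0:
            buffers_per_volume = b_scans_per_volume // b_scans_per_buffer
            size_mib = b_scans_per_buffer * bytes_per_b_scan / 2**20
            configurations.append((b_scans_per_buffer, buffers_per_volume, size_mib))
    return configurations


for b_scans, buffers, size_mib in sweep_configurations():
    print(f"B-scans per buffer: {b_scans:3d}, Buffers per volume: {buffers:3d}, "
          f"buffer size: {size_mib:g} MiB")
```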
