USBD: CDC ACM performance regression #99779

@JordanYates

Describe the bug

The CDC ACM class in the new USB device stack ("USB next") has significantly lower throughput than the legacy stack.
Measured write throughput on an nRF52840 DK is roughly half: about 1.8 Mbps, versus 3.4 Mbps with the legacy stack.

Regression

  • This is a regression.

Steps to reproduce

  1. Cherry-pick the following commit: Embeint@ed075ab
  2. Compile zephyr/samples/usb/cdc_acm and zephyr/samples/usb/legacy/cdc_acm
  3. Run the serial_test.py script against both applications

Relevant log output

# New stack
Read 131072 in 0.58 seconds (1793288 kbps)
Read 354304 in 1.58 seconds (1792186 kbps)
Read 578560 in 2.58 seconds (1792059 kbps)
Read 802816 in 3.58 seconds (1792067 kbps)
Read 1027072 in 4.58 seconds (1792077 kbps)
Read 1250304 in 5.58 seconds (1792059 kbps)
Read 1474560 in 6.58 seconds (1792048 kbps)

# Legacy stack
Read 153600 in 0.50 seconds (2465198 kbps)
Read 590848 in 1.50 seconds (3153578 kbps)
Read 1027072 in 2.50 seconds (3288777 kbps)
Read 1464320 in 3.50 seconds (3348318 kbps)
Read 1901568 in 4.50 seconds (3383012 kbps)
Read 2339840 in 5.50 seconds (3404404 kbps)
Read 2776064 in 6.50 seconds (3418291 kbps)
Read 3213312 in 7.50 seconds (3428161 kbps)

Impact

Functional Limitation – Some features not working as expected, but system usable.

Environment

  • Zephyr v4.3.0

Additional Context

A similar performance issue, but for MSC rather than CDC ACM: #93704

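Possibly related to the double memcpy in the new driver's TX path, which does not appear to exist in the legacy driver. Outgoing data is first copied into the TX ring buffer and then copied again from the ring buffer into a net_buf: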

	/* First copy: caller data into the TX ring buffer */
	key = k_spin_lock(&data->lock);
	done = ring_buf_put(data->tx_fifo.rb, tx_data, len);
	k_spin_unlock(&data->lock, key);

	/* Second copy: ring buffer into the net_buf for the bulk IN endpoint */
	buf = cdc_acm_buf_alloc(c_data, cdc_acm_get_bulk_in(c_data));
	if (buf == NULL) {
		atomic_clear_bit(&data->state, CDC_ACM_TX_FIFO_BUSY);
		cdc_acm_work_schedule(&data->tx_fifo_work, K_MSEC(1));
		return;
	}
	len = ring_buf_get(data->tx_fifo.rb, buf->data, buf->size);
	net_buf_add(buf, len);

Further testing indicates that the double memcpy is the problem. Replacing the second copy with the following snippet, which discards the FIFO contents while still transmitting the same number of bytes, recovers all of the lost throughput and slightly more.

	// len = ring_buf_get(data->tx_fifo.rb, buf->data, buf->size);
	len = ring_buf_size_get(data->tx_fifo.rb);
	ring_buf_reset(data->tx_fifo.rb);
	net_buf_add(buf, len);

# New stack, second copy skipped
Read 204800 in 0.56 seconds (2926989 kbps)
Read 664576 in 1.56 seconds (3406465 kbps)
Read 1123328 in 2.56 seconds (3510212 kbps)
Read 1583104 in 3.56 seconds (3556312 kbps)
Read 2041856 in 4.56 seconds (3582105 kbps)
Read 2501632 in 5.56 seconds (3599102 kbps)
Read 2960384 in 6.56 seconds (3610476 kbps)

Labels

Regression, area: USB, bug
