`iobuf::append` improvements part 2 #25540

ballard26 · 2025-03-25T00:38:38Z

Changes iobuf::append(iobuf&&) to reduce fragment allocations by re-using fragments from the moved iobuf. Also reduces the number of duplicate checks. See commit messages for benchmarking results.

Backports Required

Release Notes

none

ballard26 · 2025-03-25T00:51:29Z

src/v/bytes/iobuf.h

        }

-        append(std::move(f).unoptimized_release());
+        append(f);


Note that since we're not trimming f here test_next_chunk_allocation_append_iobuf will fail as is since the appended data will now map to 322 fragments instead of 323. If we trim f the test will pass and the behavior will be identical to what it is today. However, I really don't think that behavior is optimal since if we are taking ownership of the fragments we may as well use any free space rather than just trimming it away. Thoughts?

I think yes we should change the test instead of realloc here.

Brandon do you know why we were trimming before here? Seems weird.

I don't believe the trimming was ever intentional. However, before this PR appending an existing io_fragment to an iobuf would entail;

The underlying temporary_buffer in the io_fragment to be appended was shared up to the used bytes. This is basically what is acting as a trim.

A new io_fragment was allocated and given the shared temporary_buffer.

This new fragment was appended to the iobuf

src/v/bytes/iobuf.h

vbotbuildovich · 2025-03-25T04:37:02Z

CI test results

test results on build#63618

test_id	test_kind	job_url	test_status	passed
rptest.tests.data_migrations_api_test.DataMigrationsApiTest.test_migrated_topic_data_integrity.transfer_leadership=True.params=.cancellation.dir.in.stage.preparing.use_alias.False	ducktape	https://buildkite.com/redpanda/redpanda/builds/63618#0195cb19-3e37-4a61-b714-3ba701e233cb	FLAKY	1/5
rptest.tests.data_migrations_api_test.DataMigrationsApiTest.test_migrated_topic_data_integrity.transfer_leadership=True.params=.cancellation.dir.in.stage.preparing.use_alias.True	ducktape	https://buildkite.com/redpanda/redpanda/builds/63618#0195cb19-3e37-4833-9a76-ec8374af5b37	FLAKY	1/2
rptest.tests.datalake.cluster_restore_test.DatalakeClusterRestoreTest.test_restore_partition_spec.cloud_storage_type=CloudStorageType.S3.catalog_type=CatalogType.NESSIE	ducktape	https://buildkite.com/redpanda/redpanda/builds/63618#0195cb19-3e37-4a61-b714-3ba701e233cb	FLAKY	1/2
rptest.tests.datalake.schema_evolution_test.SchemaEvolutionE2ETests.test_dropped_column_no_collision.cloud_storage_type=CloudStorageType.S3.query_engine=QueryEngineType.SPARK.produce_mode=ProducerType.AVRO.catalog_type=CatalogType.NESSIE	ducktape	https://buildkite.com/redpanda/redpanda/builds/63618#0195cb19-3e36-432d-8143-60ae359f93d2	FLAKY	1/3
rptest.tests.datalake.schema_evolution_test.SchemaEvolutionE2ETests.test_dropped_column_no_collision.cloud_storage_type=CloudStorageType.S3.query_engine=QueryEngineType.SPARK.produce_mode=ProducerType.PROTO2.catalog_type=CatalogType.NESSIE	ducktape	https://buildkite.com/redpanda/redpanda/builds/63618#0195cb19-3e38-4a2c-8dd2-ba661d9cd05a	FLAKY	1/2
rptest.tests.datalake.schema_evolution_test.SchemaEvolutionE2ETests.test_dropped_column_no_collision.cloud_storage_type=CloudStorageType.S3.query_engine=QueryEngineType.SPARK.produce_mode=ProducerType.PROTO2.catalog_type=CatalogType.REST_JDBC	ducktape	https://buildkite.com/redpanda/redpanda/builds/63618#0195cb19-3e37-4a61-b714-3ba701e233cb	FLAKY	1/3
rptest.tests.datalake.schema_evolution_test.SchemaEvolutionE2ETests.test_dropped_column_no_collision.cloud_storage_type=CloudStorageType.S3.query_engine=QueryEngineType.SPARK.produce_mode=ProducerType.PROTO3.catalog_type=CatalogType.REST_JDBC	ducktape	https://buildkite.com/redpanda/redpanda/builds/63618#0195cb19-3e36-432d-8143-60ae359f93d2	FLAKY	1/11
rptest.tests.datalake.schema_evolution_test.SchemaEvolutionE2ETests.test_dropped_column_no_collision.cloud_storage_type=CloudStorageType.S3.query_engine=QueryEngineType.TRINO.produce_mode=ProducerType.AVRO.catalog_type=CatalogType.REST_JDBC	ducktape	https://buildkite.com/redpanda/redpanda/builds/63618#0195cb19-3e38-4a2c-8dd2-ba661d9cd05a	FLAKY	1/3
rptest.tests.datalake.schema_evolution_test.SchemaEvolutionE2ETests.test_dropped_column_no_collision.cloud_storage_type=CloudStorageType.S3.query_engine=QueryEngineType.TRINO.produce_mode=ProducerType.PROTO2.catalog_type=CatalogType.NESSIE	ducktape	https://buildkite.com/redpanda/redpanda/builds/63618#0195cb19-3e36-432d-8143-60ae359f93d2	FLAKY	1/19
rptest.tests.datalake.schema_evolution_test.SchemaEvolutionE2ETests.test_dropped_column_no_collision.cloud_storage_type=CloudStorageType.S3.query_engine=QueryEngineType.TRINO.produce_mode=ProducerType.PROTO2.catalog_type=CatalogType.REST_JDBC	ducktape	https://buildkite.com/redpanda/redpanda/builds/63618#0195cb19-3e37-4833-9a76-ec8374af5b37	FLAKY	1/2
rptest.tests.datalake.schema_evolution_test.SchemaEvolutionE2ETests.test_dropped_column_no_collision.cloud_storage_type=CloudStorageType.S3.query_engine=QueryEngineType.TRINO.produce_mode=ProducerType.PROTO3.catalog_type=CatalogType.NESSIE	ducktape	https://buildkite.com/redpanda/redpanda/builds/63618#0195cb19-3e38-4a2c-8dd2-ba661d9cd05a	FLAKY	1/13
rptest.tests.random_node_operations_test.RandomNodeOperationsTest.test_node_operations.enable_failures=True.mixed_versions=False.with_tiered_storage=True.with_iceberg=True.with_chunked_compaction=True.cloud_storage_type=CloudStorageType.S3	ducktape	https://buildkite.com/redpanda/redpanda/builds/63618#0195cb05-b82f-49c9-b63a-2a29317b32c4	FLAKY	1/2
rptest.tests.scaling_up_test.ScalingUpTest.test_scaling_up_with_recovered_topic	ducktape	https://buildkite.com/redpanda/redpanda/builds/63618#0195cb05-b82d-4274-9e6c-e6ab42988505	FLAKY	1/4
test_bytes_rpunit.test_bytes_rpunit	unit	https://buildkite.com/redpanda/redpanda/builds/63618#0195cabf-e8e9-4f54-8664-84aac46366d5	FAIL	0/2
test_bytes_rpunit.test_bytes_rpunit	unit	https://buildkite.com/redpanda/redpanda/builds/63618#0195cabf-e8ea-44c5-aaf6-c7e7d81f5a0a	FAIL	0/2

test results on build#64009

test_id	test_kind	job_url	test_status	passed
rptest.tests.cloud_storage_chunk_read_path_test.CloudStorageChunkReadTest.test_read_when_segment_size_smaller_than_chunk_size	ducktape	https://buildkite.com/redpanda/redpanda/builds/64009#0195efc8-4e6e-4071-a21f-5486fc9b1c90	FLAKY	1/2
rptest.tests.data_migrations_api_test.DataMigrationsApiTest.test_migrated_topic_data_integrity.transfer_leadership=True.params=.cancellation.dir.in.stage.preparing.use_alias.True	ducktape	https://buildkite.com/redpanda/redpanda/builds/64009#0195efc8-4e6c-413b-8c03-14906f5afb69	FLAKY	1/2
storage_e2e_rpfixture.storage_e2e_rpfixture	unit	https://buildkite.com/redpanda/redpanda/builds/64009#0195ef85-c1c1-4fea-9567-446fb6c6d6dc	FLAKY	1/2

test results on build#64059

test_id	test_kind	job_url	test_status	passed
rptest.tests.data_migrations_api_test.DataMigrationsApiTest.test_migrated_topic_data_integrity.transfer_leadership=True.params=.cancellation.dir.in.stage.preparing.use_alias.False	ducktape	https://buildkite.com/redpanda/redpanda/builds/64059#0195f333-2d21-43e2-b6f3-08a945db92e5	FLAKY	1/5
rptest.tests.datalake.simple_connect_test.RedpandaConnectIcebergTest.test_translating_avro_serialized_records.cloud_storage_type=CloudStorageType.S3	ducktape	https://buildkite.com/redpanda/redpanda/builds/64059#0195f333-2d20-4539-8710-365aefc1845d	FLAKY	1/2

StephanDollberg · 2025-03-28T09:44:46Z

It seems to me that the iobuf microbench space is too small for all the various cases that exist to append things to an iobuf. Feels like we are potentially missing out on cases.

travisdowns · 2025-04-01T18:16:46Z

This is selected to backport to 25.1.x? Is that right?

travisdowns · 2025-04-01T18:23:42Z

src/v/bytes/iobuf.h

@@ -251,6 +251,7 @@ class iobuf {
    size_t last_allocation_size() const;

    bool try_copy_append(const char* ptr, size_t size);
+    void append(fragment*);


missing comment, especially necessary here as there is a raw pointer involved

Good point, will add one.

ballard26 · 2025-04-01T18:26:38Z

This is selected to backport to 25.1.x? Is that right?

No, I believe we decide against backporting it in our last meeting.

rockwotj

LGTM outside of the comment about docs. Not sure if it was discussed out of band about adding more benchmark patterns per Stephan's comment

travisdowns · 2025-04-01T18:37:54Z

src/v/bytes/iobuf.h

    }
-    o.clear();
+    o._size = 0;


This drops the the update_generation() call which we shouldn't do (I guess there is no meta test for this since one didn't fail here).

Is this noticeably faster than clear()? If so maybe we just add the update_generation() call back in explicitly.

Let me check if it's any faster.

The effects appear to be negligible, will switch back to using iobuf::clear()

Yeah this seems more robust to future changes also.

Instead of allocating a new fragment this PR re-uses the allocated fragments from the iobuf being appended. This reduces the iobuf::append benchmark from; | test | iterations | median | mad | min | max | allocs | tasks | inst | | - | - | - | - | - | - | - | - | - | | iobuf.append_bench_small | 33010000 | 8.845ns | 0.009ns | 8.834ns | 8.879ns | 0.008 | 0.000 | 180.1 | | iobuf.append_bench_medium | 9000 | 2.321us | 15.924ns | 2.305us | 2.595us | 0.627 | 0.000 | 2257.0 | | iobuf.append_bench_large | 1000 | 7.376us | 32.411ns | 7.344us | 7.454us | 4.000 | 0.000 | 6345.5 | To; | test | iterations | median | mad | min | max | allocs | tasks | inst | | - | - | - | - | - | - | - | - | - | | iobuf.append_bench_small | 32347000 | 8.838ns | 0.005ns | 8.829ns | 8.854ns | 0.008 | 0.000 | 174.1 | | iobuf.append_bench_medium | 9000 | 2.151us | 3.486ns | 2.132us | 2.269us | 0.626 | 0.000 | 2251.5 | | iobuf.append_bench_large | 1000 | 7.313us | 13.044ns | 7.240us | 7.368us | 1.750 | 0.000 | 6039.5 |

iobuf::append(fragment) already handles trimming the last fragment before appending. Therefore there is no need to repeat this work in iobuf::append(iobuf&&) or iobuf::append(temporary_buffer). This reduces the append microbench from; | test | iterations | median | mad | min | max | allocs | tasks | inst | | - | - | - | - | - | - | - | - | - | | iobuf.append_bench_small | 32347000 | 8.838ns | 0.005ns | 8.829ns | 8.854ns | 0.008 | 0.000 | 174.1 | | iobuf.append_bench_medium | 9000 | 2.151us | 3.486ns | 2.132us | 2.269us | 0.626 | 0.000 | 2251.5 | | iobuf.append_bench_large | 1000 | 7.313us | 13.044ns | 7.240us | 7.368us | 1.750 | 0.000 | 6039.5 | To; | test | iterations | median | mad | min | max | allocs | tasks | inst | | - | - | - | - | - | - | - | - | - | | iobuf.append_bench_small | 31733000 | 8.226ns | 0.003ns | 8.217ns | 8.253ns | 0.008 | 0.000 | 160.1 | | iobuf.append_bench_medium | 9000 | 2.146us | 18.402ns | 2.124us | 2.247us | 0.626 | 0.000 | 2237.9 | | iobuf.append_bench_large | 1000 | 7.346us | 14.547ns | 7.320us | 7.362us | 1.750 | 0.000 | 6017.9 |

…iobuf

travisdowns · 2025-04-02T14:22:02Z

@StephanDollberg

It seems to me that the iobuf microbench space is too small for all the various cases that exist to append things to an iobuf. Feels like we are potentially missing out on cases.

This is true (that we lack test cases), but is it a problem for this change? It looks to me like this isn't really a "tradeoff" type change, but just strictly better. Are you thinking about the change from trim to no-trim in append(iobuf&&)? It looks to me like the old trim was spurious.

StephanDollberg · 2025-04-02T15:00:22Z

it looks to me like this isn't really a "tradeoff" type change, but just strictly better

I mean I am no expert in this code but that's immediately obvious from the change? Especially given that with all the iobuf stuff we have run into a lot of unexpected things already.

Also more fundamentally I think we should write the benchmarks/tests first before we start optimizing things.

ballard26 · 2025-04-02T17:49:19Z

It seems to me that the iobuf microbench space is too small for all the various cases that exist to append things to an iobuf. Feels like we are potentially missing out on cases.

Also more fundamentally I think we should write the benchmarks/tests first before we start optimizing things.

Was chatting with Stephan about these points in DMs. My understanding is as follows;

I implemented microbenchmarks for the main code paths(copy and zero-copy) foriobuf::append(iobuf&&) in a previous PR. These microbenchmarks should be sufficient to gauge the changes made in this PR since the same function is being optimized.
In general there is a lack of microbenchmark coverage for iobuf; only 6 for a type that is fundamental to RP. As an example there are at least 3 other iobuf::append functions that have no microbenchmarks. Hence as a follow-up PR it may be worthwhile to expand the coverage.

vbotbuildovich · 2025-04-02T19:21:00Z

/backport v25.1.x

ballard26 requested review from dotnwat and travisdowns March 25, 2025 00:38

github-actions bot added the area/redpanda label Mar 25, 2025

ballard26 changed the title ~~iobuf::append improvements p2~~ iobuf::append improvements part 2 Mar 25, 2025

ballard26 commented Mar 25, 2025

View reviewed changes

src/v/bytes/iobuf.h Show resolved Hide resolved

piyushredpanda requested review from StephanDollberg and rockwotj March 28, 2025 03:22

travisdowns reviewed Apr 1, 2025

View reviewed changes

rockwotj previously approved these changes Apr 1, 2025

View reviewed changes

travisdowns reviewed Apr 1, 2025

View reviewed changes

ballard26 added 3 commits April 1, 2025 15:32

bytes/tests: fix fragment count in test_next_chunk_allocation_append_…

e67f565

…iobuf

ballard26 dismissed rockwotj’s stale review via e67f565 April 1, 2025 19:32

ballard26 force-pushed the iobuf-append-imprv-2 branch from 66e0b98 to e67f565 Compare April 1, 2025 19:32

ballard26 requested a review from travisdowns April 1, 2025 19:33

rockwotj approved these changes Apr 2, 2025

View reviewed changes

travisdowns approved these changes Apr 2, 2025

View reviewed changes

ballard26 merged commit 580f901 into redpanda-data:dev Apr 2, 2025
19 checks passed

vbotbuildovich mentioned this pull request Apr 2, 2025

[v25.1.x] iobuf::append improvements part 2 #25703

Merged

iobuf::append improvements part 2 #25540

iobuf::append improvements part 2 #25540

Uh oh!

Conversation

ballard26 commented Mar 25, 2025

Backports Required

Release Notes

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

vbotbuildovich commented Mar 25, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

CI test results

Uh oh!

StephanDollberg commented Mar 28, 2025

Uh oh!

travisdowns commented Apr 1, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ballard26 commented Apr 1, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

rockwotj left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

travisdowns commented Apr 2, 2025

Uh oh!

StephanDollberg commented Apr 2, 2025

Uh oh!

ballard26 commented Apr 2, 2025

Uh oh!

Uh oh!

vbotbuildovich commented Apr 2, 2025

Uh oh!

Uh oh!

`iobuf::append` improvements part 2 #25540

`iobuf::append` improvements part 2 #25540

vbotbuildovich commented Mar 25, 2025 •

edited

Loading

ballard26 commented Apr 1, 2025 •

edited

Loading