feat(fuzz): add bulk decompression fuzz targets (MPPC, NCRUSH, XCRUSH)#1285
feat(fuzz): add bulk decompression fuzz targets (MPPC, NCRUSH, XCRUSH)#1285Greg Lamberson (glamberson) wants to merge 3 commits into
Conversation
Adds three algorithm-pinned panic-only targets (`bulk_mppc`,
`bulk_ncrush`, `bulk_xcrush`) plus a round-trip target
(`bulk_round_trip`) exercising the symmetric `BulkCompressor`
compress + decompress pairing. Pattern matches existing
`rle_decompression` and `bitmap_stream` codec targets, and
gateway's `jet_message` round-trip shape.
The bulk crate's output buffer is structurally capped at 64 KB
(`OUTPUT_BUFFER_SIZE` in `bulk.rs`), so memory-bomb is not the
dominant risk. The high-risk bug classes here are bit-stream
alignment errors, dictionary-overrun copies, sliding-window
underruns, and livelock on malformed symbol streams. Livelock
is caught by adding `-timeout=10` to the cargo fuzz invocation
in `xtask/src/fuzz.rs`.
Closes the F13 Core Tier fuzz coverage gap for `ironrdp-bulk`
per ARCHITECTURE.md ("all these crates must be fuzzed").
Regression-replay tests added in `ironrdp-testsuite-core`. Seed
corpora are minimal (zero-byte seed per target); fuzz-driven
corpora grow organically and minimized crashers will land in
the same directories per existing convention.
Assisted-by: Claude (Anthropic) <noreply@anthropic.com>
Signed-off-by: Greg Lamberson <greg@lamco.io>
I think it would be better to use the minimized crashers as there is low interest into keeping bigger artifacts in the repository history. |
There was a problem hiding this comment.
Pull request overview
This PR expands the fuzzing surface for ironrdp-bulk by adding new bulk decompression fuzz targets (MPPC/NCRUSH/XCRUSH) and a compressor→decompressor round-trip oracle, plus wiring them into the existing xtask fuzz workflow and regression replay tests.
Changes:
- Added 4 new fuzz targets (
bulk_mppc,bulk_ncrush,bulk_xcrush,bulk_round_trip) and documented them. - Implemented bulk decompression and round-trip oracles in
ironrdp-fuzzing, and added regression replay plumbing + seed files inironrdp-testsuite-core. - Updated
cargo xtask fuzz runto pass-timeout=10to libFuzzer runs.
Reviewed changes
Copilot reviewed 11 out of 17 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
xtask/src/main.rs |
Registers the new fuzz targets in the workspace target list. |
xtask/src/fuzz.rs |
Adds libFuzzer -timeout=10 to xtask fuzz run invocations. |
fuzz/README.md |
Documents the new bulk fuzz targets and their input shapes. |
fuzz/fuzz_targets/bulk_mppc.rs |
Adds MPPC decompression fuzz entrypoint. |
fuzz/fuzz_targets/bulk_ncrush.rs |
Adds NCRUSH decompression fuzz entrypoint. |
fuzz/fuzz_targets/bulk_xcrush.rs |
Adds XCRUSH decompression fuzz entrypoint. |
fuzz/fuzz_targets/bulk_round_trip.rs |
Adds BulkCompressor compress→decompress round-trip fuzz entrypoint. |
fuzz/Cargo.toml |
Registers the new fuzz target binaries. |
fuzz/Cargo.lock |
Locks new dependency graph including ironrdp-bulk. |
crates/ironrdp-fuzzing/src/oracles/mod.rs |
Implements bulk decompression and round-trip oracles. |
crates/ironrdp-fuzzing/Cargo.toml |
Adds ironrdp-bulk dependency to the fuzzing/oracle crate. |
crates/ironrdp-testsuite-core/tests/fuzz_regression.rs |
Adds regression replay tests for the new bulk oracles. |
crates/ironrdp-testsuite-core/test_data/fuzz_regression/* |
Adds seed files/directories for new regression replay targets. |
Cargo.lock |
Locks workspace dependency graph including ironrdp-bulk under ironrdp-fuzzing. |
Comments suppressed due to low confidence (2)
crates/ironrdp-fuzzing/src/oracles/mod.rs:49
- This oracle constructs a fresh
BulkCompressoron every fuzz input. SinceBulkCompressor::newallocates large algorithm state (including XCRUSH), this adds substantial per-iteration overhead and can slow fuzzing. Consider reusing a single compressor instance and resetting it between cases to keep runs deterministic.
pub fn bulk_decompress_ncrush(data: &[u8]) {
use ironrdp_bulk::{BulkCompressor, CompressionType, flags};
let Ok(mut bulk) = BulkCompressor::new(CompressionType::Rdp6) else {
return;
};
let _ = bulk.decompress(data, flags::PACKET_COMPRESSED | 0x02);
}
crates/ironrdp-fuzzing/src/oracles/mod.rs:58
- This oracle constructs a fresh
BulkCompressoron every fuzz input.BulkCompressor::newallocates large internal state (notably XCRUSH’s history/chunk tables), so per-iteration construction can substantially reduce fuzzing throughput. Consider keeping a reused instance (thread-local) and resetting between inputs.
pub fn bulk_decompress_xcrush(data: &[u8]) {
use ironrdp_bulk::{BulkCompressor, CompressionType, flags};
let Ok(mut bulk) = BulkCompressor::new(CompressionType::Rdp61) else {
return;
};
let _ = bulk.decompress(data, flags::PACKET_COMPRESSED | 0x03);
}
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
bulk_round_trip previously returned early when compress() cleared PACKET_COMPRESSED, treating the uncompressed case as a vacuous round-trip. That skipped coverage of the decompressor's pass-through branch (decompress with the flag clear returns src_data unchanged), which is real wire-visible code. Select the payload from src or sender.compressed_data based on the flag and run decompress unconditionally so the pass-through branch is now fuzzed alongside the real compressed path. Also drop the to_vec() copy: sender.compressed_data returns a borrow that stays valid through the receiver.decompress call. Signed-off-by: Greg Lamberson <greg@lamco.io>
Signed-off-by: Greg Lamberson <greg@lamco.io>
Summary
Adds fuzz coverage for
ironrdp-bulkdecompression. Closes a Core Tier fuzz gap per ARCHITECTURE.md ("all these crates must be fuzzed"). Three algorithm-pinned panic-only targets (bulk_mppc,bulk_ncrush,bulk_xcrush) plus one round-trip target (bulk_round_trip) exercising the symmetricBulkCompressorcompress + decompress pairing.Filed against #1124 acceptance criteria ("Oracles assert invariants (round-trip equality and/or framing invariants)") per the coordination comment on #1120 (#1120 (comment)).
Approach
Each fuzz target binary is a tiny wrapper (3 lines) calling an oracle in
crates/ironrdp-fuzzing/src/oracles/mod.rs. Algorithm dispatch is driven byflags::PACKET_COMPRESSEDplus the algorithm low-bit perCompressionType::from_flags:bulk_mppc: first input byte selects RDP4 (low bit clear, 8 KB history) vs RDP5 (low bit set, 64 KB history) so a single corpus exercises both MPPC modes via libFuzzer mutation across the byte boundary.bulk_ncrush: pinned to RDP6 NCRUSH.bulk_xcrush: pinned to RDP6.1 XCRUSH (2 MB sliding window).bulk_round_trip: first input byte selects algorithm; the remaining bytes are uncompressed input. The oracle callsBulkCompressor::compress, then constructs a freshBulkCompressorfor the receive side, decompresses, and asserts byte-equality. A fresh decompressor per call avoids sliding-window state leaking between fuzz iterations.Pattern matches the existing
rle_decompressionandbitmap_streamcodec targets inironrdp-graphics, anddevolutions-gateway'sjet_messagebyte-for-byte round-trip shape.-timeout=10added to thecargo xtask fuzz runinvocation inxtask/src/fuzz.rs. Bulk decompressors are bit-stream parsers that iterate symbol-by-symbol over a variable-length input until a terminator is observed; adversarial inputs can construct streams that never terminate cleanly. The default libFuzzer timeout (1200 seconds) is too loose to catch livelock during the 5-second CI smoke run. 10 seconds is generous for any legitimate input across the workspace (PDU decoders complete in microseconds) and tight enough to surface livelock as a per-input timeout report. All four bulk targets and all existing targets ran clean locally with the new flag.Regression-replay tests added under
crates/ironrdp-testsuite-core/test_data/fuzz_regression/. Seed corpora are minimal (one zero-byte seed per target); fuzz-driven corpora grow organically and minimized crashers will land in the same directories per existing convention.Alternatives considered
A
cap::Cap-based memory-budget oracle was considered for the bulk targets. After surveying Devolutions org fuzz practice acrossdevolutions-gateway,sspi-rs, andpicky-rs, no precedent exists forcap::Capallocator wrapping or in-process memory tracking in any of those repos. After analyzing theironrdp-bulkbug profile specifically, the crate's output buffer is structurally capped at 64 KB (OUTPUT_BUFFER_SIZEinbulk.rs) and per-call allocation is near-zero by design (sliding windows are pre-allocated atBulkCompressor::new). Memory-bomb is not the dominant risk on this crate. The dominantironrdp-bulkbug classes are bit-stream alignment errors, dictionary-overrun copies, sliding-window underruns, and livelock; the chosen oracle stack (panic + round-trip +-timeout) is matched to those.The memory-budget oracle remains the right shape for
ironrdp-egfxZGFX andironrdp-graphicscodec decoders where output sizing is field-driven (a 100-byte ClearCodec PDU declaring a 65535x65535 tile demands ~16 GB of RAM; historical MSTSC ClearCodec CVE cluster CVE-2020-0610/-0611/-0612/-0655/-0660 sits on this surface). That work is reserved for a separately-filed sibling issue per the coordination comment on #1120.A per-target
bulk_mppc_rdp4/bulk_mppc_rdp5split was considered instead of the first-byte mode selector. The single-target shape was chosen for parity with the other bulk targets and to let libFuzzer mutate across the algorithm boundary in one corpus. Happy to split if reviewers prefer.Differential round-trip against an alternative implementation is deferred.
Trade-offs accepted
The panic-only oracle does not catch silent logic bugs where a decompressor returns wrong output without panicking. The round-trip oracle catches a substantial portion of that class (any compress-then-decompress asymmetry is caught) but does not exhaustively cover decode-of-adversarial-input behaviour, where the input was not produced by a matching compressor.
-timeout=10is uniform across all fuzz targets, not bulk-specific. PDU decoders should complete in microseconds, so 10 seconds is far above any legitimate input cost across the workspace. If a non-bulk target ever needs more than 10 seconds for legitimate input, the value is a one-line change inxtask/src/fuzz.rs.Regression-replay seeds are zero-byte placeholders, not minimized crashers. Crashes surfaced during fuzz runs will land in these directories per existing convention (the same pattern as
cliprdr_format/which has only crash-* and minimized-from-* files).Verification
Local:
cargo xtask check fmt— cleancargo xtask check lints— cleancargo xtask check tests— 6 fuzz regression tests pass (the 2 existing + 4 new)cargo xtask check typos— cleanCI: awaiting on push.