Skip to content

Conversation

@weiji14
Copy link
Member

@weiji14 weiji14 commented Nov 22, 2025

Benchmark reading a sample Sentinel-2 TCI file, converted to have LZW compression and predictor: 2 (horizontal differencing). Using cargo-codspeed for the benchmarking on GitHub Actions CI.

Adapted from weiji14/foss4g2025#3 and weiji14/foss4g2025#2, but with a smaller GeoTIFF from a different area.

TODO:

  • Initial port over
  • Split benchmarks to be more unit-like. One to capture network part, another to capture decoding.

References:

@weiji14 weiji14 self-assigned this Nov 22, 2025
@github-actions github-actions bot added the test label Nov 22, 2025
@weiji14 weiji14 changed the title test: Setup Rust benchmark tests test: Setup Rust benchmarks Nov 22, 2025
@weiji14
Copy link
Member Author

weiji14 commented Nov 22, 2025

Still need to install the GitHub App (https://codspeed.io/docs#connect-your-repository) for the benchmark reports to be uploaded to CodSpeed.

Comment on lines +65 to +76
// Do actual decoding of TIFF tile data (multi-threaded using rayon)
let pool = ThreadPoolBuilder::new()
.num_threads(4)
.build()
.map_err(|err| AsyncTiffError::External(Box::new(err)))?;

let tile_bytes: Vec<u8> = pool.install(|| {
tiles
.into_par_iter()
.flat_map_iter(|tile| tile.decode(&decoder_registry).unwrap())
.collect()
});
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rayon probably defaults to LIFO instead of FIFO here (xref #41). Will change it to use a FIFO scope in #133 (need to figure out how though).

@codspeed-hq
Copy link

codspeed-hq bot commented Nov 22, 2025

CodSpeed Performance Report

Congrats! CodSpeed is installed 🎉

🆕 1 new benchmark was detected.

You will start to see performance impacts in the reports once the benchmarks are run from your default branch.

Detected benchmark

@weiji14 weiji14 marked this pull request as ready for review November 22, 2025 23:45
let (x_count, y_count) = ifd.tile_count().ok_or(AsyncTiffError::General(
"unable to get IFD count".to_string(),
))?;
// dbg!(x_count, y_count); // 43 * 43 = 1849
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Convert into assert_eq or remove?

fn read_tiff(fpath: &str) -> AsyncTiffResult<()> {
let abs_path: PathBuf = std::path::Path::new(fpath).canonicalize()?;
let tif_url: Url = Url::from_file_path(abs_path).expect("Failed to parse url: {abs_path}");
let (store, path): (Box<dyn ObjectStore>, Path) = parse_url(&tif_url)?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know whether this or just using LocalFileSystem::new is easier. Up to you

use reqwest::Url;
use tokio::runtime;

fn read_tiff(fpath: &str) -> AsyncTiffResult<()> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we split this up into a few helper functions? Maybe start with splitting

fn open_tiff(path: &str) -> ObjectReader

hopefully as we expand the benchmarks we can reuse that part

let tiles: Vec<Tile> = runtime
.block_on(async {
// Read metadata header
let prefetch_reader = PrefetchBuffer::new(reader.clone(), 32 * 1024).await?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we merge #140 first, we'll have to change this import.


// Get list of tiles in TIFF file stream (using tokio async runtime)
let tiles: Vec<Tile> = runtime
.block_on(async {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be great to move this entire async fn into a top-level function, and then just call

let tiles = runtime.block_on(read_tiles(reader, path))

or something like that

.flat_map_iter(|tile| tile.decode(&decoder_registry).unwrap())
.collect()
});
assert_eq!(tile_bytes.len(), 363528192); // should be 361681200, why not?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm what's happening here? Can you explain why you expect the number to be different?

Is it possible there's an issue with whether this is compressed/decompressed? Or whether there's a step in the decoding missing?

Copy link
Member Author

@weiji14 weiji14 Nov 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be 10980x10980x3bands = 361681200 bytes (since this is u8 dtype, so 1 byte per pixel). The number is not lower, but higher, so the decompression should have worked, but not sure what's causing the deviation. I should probably try to find a way to piece the bytes together and display it to see what's going on.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe read it in python to inspect it?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants