Skip to content

Inconsistent results between first and second product.load() call (temporary file reuse issue) when resampling is needed #301

@cmayet

Description

@cmayet

Hi !

I’ve encountered an inconsistency when calling product.load() twice with the same parameters.
EOReader reuses the temporary file on the second call, but the returned data is not identical between calls (shape mismatch of two pixels, np.testing.assert_equal fails !). This happens only when loading bands needing resampling (not in the product's native resolution)

Summary

  • EOReader uses temporary files to store intermediate rasters (cropped, resampled, reflectance-applied, etc.)
  • When calling product.load() twice with identical parameters:
  • The first call computes the data
  • The second call reloads from temporary files
  • However, the outputs are not strictly identical when resampling occurs

Code To Reproduce

import logging
import os

import numpy as np
from eodag import EODataAccessGateway, setup_logging
from eoreader.bands import *
from eoreader.env_vars import USE_DASK
from eoreader.reader import Reader
from rasterio.windows import Window
from sertit import logs


os.environ[USE_DASK] = "0"


# eoreader_logger
logger = logging.getLogger("eoreader")
logs.init_logger(logger)

# eodag logger
setup_logging(1)
dag = EODataAccessGateway()
search_results = dag.search(
    provider="cop_dataspace",
    collection="S2_MSI_L1C",
    start="2025-08-26",
    end="2025-08-27",
    **{"grid:code": "MGRS-28PCC"},
)

logger.info(f"Working on product {search_results[0].properties['id']}")


path = dag.download(
    search_results[0],
)


# Custom user defined read window (is a rasterio Window instance)
window = Window(3208, 2047, 932, 4322)
bounds = (332080.0, 1736310.0, 341400.0, 1779530.0)

band = SWIR_1 # (needs resampling)
read_window = bounds

reader = Reader()
with reader.open(
    path, remove_tmp=True, output_path="/tmp/eoreader_reload_from_tmp/"
) as product:
    ds_init = product.load(band, window=read_window, to_reflectance=False)
    ds_reload = product.load(band, window=read_window, to_reflectance=False)
    np.testing.assert_equal(ds_init[band].data, ds_reload[band].data)

Output

AssertionError: 
Arrays are not equal

(shapes (1, 4324, 932), (1, 4322, 932) mismatch)
 ACTUAL: array([[[1729.    , 1721.25  , 1705.75  , ..., 1897.25  , 1949.75  ,
         1976.    ],
        [1735.5   , 1727.4375, 1711.3125, ..., 1892.25  , 1954.75  ,...
 DESIRED: array([[[1735.5   , 1727.4375, 1711.3125, ..., 1892.25  , 1954.75  ,
         1986.    ],
        [1748.5   , 1739.8125, 1722.4375, ..., 1882.25  , 1964.75  ,...

This works as expected when loading e.g GREEN band.
When using a pixel window (set reading_window to window in the code snippet above), things get worse (but this might be related to #300 :

AssertionError: 
Arrays are not equal

(shapes (1, 8644, 1864), (1, 4322, 0) mismatch)
 ACTUAL: array([[[6117.    , 6082.5   , 6013.5   , ..., 6477.    , 6487.    ,
         6492.    ],
        [6118.7544, 6082.549 , 6010.137 , ..., 6467.2046, 6477.107 ,...
 DESIRED: array([], shape=(1, 4322, 0), dtype=float32)

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No fields configured for Bug.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions