Hey @remi-braun, hope this finds you well. I'm just passing along two findings from a security scan I ran on eoreader using a popular LLM.
Below are the LLM-generated descriptions of the vulnerabilities.
- XXE in STAC metadata XML parsing (stac_product.py:294)
    def _read_mtd_xml_stac(self, mtd_url, **kwargs) -> (etree._Element, dict):
        ...
        mtd_str = self.read_href(mtd_url, clients=self.clients)
        root = etree.fromstring(mtd_str)
mtd_url is a metadata href taken from a STAC Item that the user fetched over the network. The bytes returned by read_href are therefore attacker-influenced.
Reach
Triggered through the documented entry point:
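- User calls Reader().open(url), where url points to a STAC catalog or item (reader.py:605-620).
- The JSON is parsed into a pystac.Item. Asset hrefs become mtd_url.
- read_href(mtd_url) fetches the metadata XML.
- etree.fromstring(mtd_str) parses it with lxml's default parser, since no hardened XMLParser is passed in. Whether external entities and network fetches are actually resolved depends on the installed lxml version and its parser defaults.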
This affects every STAC product variant that inherits from StacProduct: s2_e84, s2_mpc, hls, s1_rtc_asf, s1_rtc_mpc, and any future STAC subclass.
Impact
A hostile STAC catalog (or a hostile URL pasted by the user) can ship XML that:
- Reads local files via <!ENTITY xxe SYSTEM "file:///etc/passwd"> and exposes them through the parsed product attributes.
- SSRFs internal services via entity URLs (cloud metadata endpoints, RFC1918 hosts).
- Exhausts memory through entity expansion (billion laughs).
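One possible mitigation, sketched here with the standard library only, is to refuse any metadata document that declares a DTD before it is handed to the real parser, since all three attacks above need a DOCTYPE. The names reject_doctype and DoctypeForbidden are hypothetical and not part of eoreader, and this assumes the legitimate STAC metadata XML never carries a DOCTYPE. Passing a hardened parser such as etree.XMLParser(resolve_entities=False, no_network=True) to etree.fromstring would be the lxml-side equivalent.

```python
from xml.parsers import expat


class DoctypeForbidden(ValueError):
    """Raised when a metadata document declares a DTD (hypothetical helper type)."""


def reject_doctype(xml_bytes: bytes) -> bytes:
    """Return xml_bytes unchanged, or raise if it contains a DOCTYPE declaration.

    Hypothetical pre-parse check, not an eoreader function.
    """

    def _on_doctype(name, system_id, public_id, has_internal_subset):
        # Fires when the DOCTYPE starts, before any entity is declared or expanded.
        raise DoctypeForbidden(f"DOCTYPE {name!r} is not allowed in metadata XML")

    parser = expat.ParserCreate()
    parser.StartDoctypeDeclHandler = _on_doctype
    parser.Parse(xml_bytes, True)  # malformed XML raises expat.ExpatError
    return xml_bytes
```

In _read_mtd_xml_stac, the check could run on mtd_str right before etree.fromstring, so a hostile document is rejected before entity handling ever starts.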
- Security: eval() reachable on unvalidated input via public compute_index API (bands/indices.py:151)
eoreader/bands/indices.py:151 calls eval(index)(bands) where index is a function parameter. The function compute_index is publicly exported (eoreader/bands/__init__.py:98, 114). A caller that passes an attacker-controlled string to compute_index gets remote code execution.
Vulnerable code
    def compute_index(index: str, bands: dict, **kwargs) -> xr.DataArray:
        ...
        if hasattr(spyndex.indices, index):
            ...
        elif index in EOREADER_DERIVATIVES:
            ...
        else:
            index_arr = eval(index)(bands)
The else branch is reached when the input string is not in the spyndex catalog and not in EOREADER_DERIVATIVES. There is no allowlist gating the eval call inside compute_index itself.
Reach
The internal call path through Product.load(...) is safe: product.py:1178 filters band names through is_index(band) (which checks str(index) in get_all_index_names()) before they reach compute_index.
The risk is the public function. compute_index is exported in eoreader.bands.__all__ and documented as taking an index name string. A downstream application that passes user-controlled strings (config file, web request, CLI arg, notebook input) without re-implementing is_index first gets RCE. A payload such as __import__('os').system('...') fails the hasattr and dict-membership checks and falls into eval.
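To make the fall-through concrete, here is a harmless sketch of the same three-way branch. The fake_spyndex_indices and fake_derivatives objects are hypothetical stand-ins for spyndex.indices and EOREADER_DERIVATIVES, and nothing is evaluated: the function only reports which branch a given string would take.

```python
from types import SimpleNamespace

# Hypothetical stand-ins for spyndex.indices and EOREADER_DERIVATIVES.
fake_spyndex_indices = SimpleNamespace(NDVI=object())
fake_derivatives = {"SOME_DERIVATIVE": object()}


def branch_taken(index: str) -> str:
    """Mirror the if/elif/else shape of compute_index without calling eval."""
    if hasattr(fake_spyndex_indices, index):
        return "spyndex"
    elif index in fake_derivatives:
        return "derivative"
    else:
        # In the real function, this is where eval(index)(bands) runs.
        return "eval"


payload = "__import__('os').system('...')"
print(branch_taken("NDVI"))   # known name, handled by the first branch
print(branch_taken(payload))  # unknown string, lands in the eval branch
```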
Impact
Arbitrary code execution in the host process whenever an unfiltered string reaches compute_index. The function's docstring does not warn that it evaluates strings, so a developer reading the API docs has no signal that pre-validation is required.
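A possible fix is to replace the eval call with an explicit lookup, so that unknown names raise instead of being evaluated. This is a minimal sketch under assumptions: INDEX_REGISTRY, resolve_index and _example_ndvi are hypothetical names, and how eoreader actually stores its own index functions is not shown in the scan output.

```python
from typing import Callable, Dict, Mapping

IndexFn = Callable[[Mapping[str, float]], float]


def _example_ndvi(bands: Mapping[str, float]) -> float:
    """Toy index on scalar bands, for illustration only."""
    return (bands["NIR"] - bands["RED"]) / (bands["NIR"] + bands["RED"])


# Hypothetical allowlist: only names registered here can ever be called.
INDEX_REGISTRY: Dict[str, IndexFn] = {"EXAMPLE_NDVI": _example_ndvi}


def resolve_index(index: str) -> IndexFn:
    """Return the registered function for index, or raise on anything else."""
    try:
        return INDEX_REGISTRY[index]
    except KeyError:
        raise ValueError(f"Unknown index name: {index!r}") from None
```

With this shape, the else branch would call resolve_index(index)(bands), and a string like __import__('os').system('...') fails with ValueError instead of running.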