test: improve coverage on test retrieval#4
Conversation
WalkthroughThis update significantly expands the test coverage in the Changes
Possibly related PRs
Poem
✨ Finishing Touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #4 +/- ##
===========================================
+ Coverage 65.42% 81.23% +15.81%
===========================================
Files 8 8
Lines 778 794 +16
Branches 111 112 +1
===========================================
+ Hits 509 645 +136
+ Misses 221 116 -105
+ Partials 48 33 -15 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (5)
tests/test_ipfs_retrieval.py (5)
19-21: Consider limiting tests to public interfaces.Testing private functions (
_get_ipfs_store,fetch_json_from_cid, and_get_host) can be a code smell. If you need to test them extensively, consider making them part of the public API or reaffirm the design that justifies direct testing of these internals.
526-527:mock_exists_falseis declared but not used.Consider removing it if no test scenario requires returning
Falsefromos.path.exists:-def mock_exists_false(*args, **kwargs) -> bool: - return False
602-640: Consider using a singlewith patch(...)block to avoid.start()/.stop().You can simplify by using the context manager directly:
- mock_json_load = patch( - "json.load", - side_effect=json.JSONDecodeError("err", "doc", 0), - ).start() ... - mock_json_load.stop() + with patch( + "json.load", + side_effect=json.JSONDecodeError("err", "doc", 0) + ): + update_cache_if_changed(new_data)
646-668: Combine nestedwithstatements into a single context.According to Ruff (SIM117), consider merging nested blocks to reduce indentation. For example:
-def test_get_ipns_name_hash_local_cache_malformed_json_during_fallback(): - with patch("requests.get", side_effect=requests.RequestException("Simulated error")): - with patch("os.path.exists", return_value=True): - with patch("dclimate_zarr_client.ipfs_retrieval.update_cache_if_changed"): - with patch("dclimate_zarr_client.ipfs_retrieval.open", mock_open(read_data="INVALID JSON!!"), create=True) as mock_file_open: - with patch("json.load", side_effect=json.JSONDecodeError("err", "doc", 0)): - with pytest.raises(DatasetNotFoundError, match="Invalid dataset name"): - get_ipns_name_hash("cpc-precip-conus") +def test_get_ipns_name_hash_local_cache_malformed_json_during_fallback(): + with patch("requests.get", side_effect=requests.RequestException("Simulated error")), \ + patch("os.path.exists", return_value=True), \ + patch("dclimate_zarr_client.ipfs_retrieval.update_cache_if_changed"), \ + patch("dclimate_zarr_client.ipfs_retrieval.open", mock_open(read_data="INVALID JSON!!"), create=True) as mock_file_open, \ + patch("json.load", side_effect=json.JSONDecodeError("err", "doc", 0)): + with pytest.raises(DatasetNotFoundError, match="Invalid dataset name"): + get_ipns_name_hash("cpc-precip-conus")🧰 Tools
🪛 Ruff (0.8.2)
648-651: Use a single
withstatement with multiple contexts instead of nestedwithstatements(SIM117)
661-666: Use a single
withstatement with multiple contexts instead of nestedwithstatementsCombine
withstatements(SIM117)
672-693: Consider consolidating the nestedwithstatements here as well.Similarly, you can merge the multiple
withcalls for clarity:-def test_get_ipns_name_hash_local_cache_empty_during_fallback(): - with patch("requests.get", side_effect=requests.RequestException("Simulated error")): - with patch("os.path.exists", return_value=True): - with patch("dclimate_zarr_client.ipfs_retrieval.update_cache_if_changed"): - with patch("dclimate_zarr_client.ipfs_retrieval.open", mock_open(read_data=""), create=True) as mock_file_open: - with patch("json.load", side_effect=json.JSONDecodeError("Expecting value", "", 0)): - with pytest.raises(DatasetNotFoundError, match="Invalid dataset name"): - get_ipns_name_hash("cpc-precip-conus") +def test_get_ipns_name_hash_local_cache_empty_during_fallback(): + with patch("requests.get", side_effect=requests.RequestException("Simulated error")), \ + patch("os.path.exists", return_value=True), \ + patch("dclimate_zarr_client.ipfs_retrieval.update_cache_if_changed"), \ + patch("dclimate_zarr_client.ipfs_retrieval.open", mock_open(read_data=""), create=True) as mock_file_open, \ + patch("json.load", side_effect=json.JSONDecodeError("Expecting value", "", 0)): + with pytest.raises(DatasetNotFoundError, match="Invalid dataset name"): + get_ipns_name_hash("cpc-precip-conus")🧰 Tools
🪛 Ruff (0.8.2)
674-677: Use a single
withstatement with multiple contexts instead of nestedwithstatementsCombine
withstatements(SIM117)
685-691: Use a single
withstatement with multiple contexts instead of nestedwithstatementsCombine
withstatements(SIM117)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
tests/test_ipfs_retrieval.py(8 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
tests/test_ipfs_retrieval.py (2)
dclimate_zarr_client/ipfs_retrieval.py (5)
_get_ipfs_store(34-52)fetch_json_from_cid(56-95)_get_host(436-449)update_cache_if_changed(500-511)get_ipns_name_hash(514-557)dclimate_zarr_client/dclimate_zarr_errors.py (3)
StacCatalogError(65-66)IpfsConnectionError(61-62)DatasetNotFoundError(25-26)
🪛 Ruff (0.8.2)
tests/test_ipfs_retrieval.py
648-651: Use a single with statement with multiple contexts instead of nested with statements
(SIM117)
661-666: Use a single with statement with multiple contexts instead of nested with statements
Combine with statements
(SIM117)
674-677: Use a single with statement with multiple contexts instead of nested with statements
Combine with statements
(SIM117)
685-691: Use a single with statement with multiple contexts instead of nested with statements
Combine with statements
(SIM117)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: build
🔇 Additional comments (23)
tests/test_ipfs_retrieval.py (23)
3-4: Imports for type hinting and mocking look good.
8-8: CID import is correct.
23-23: Importing IPFSStore is appropriate.
36-38: Type alias definitions are clear and help readability.
44-53: Test for default store creation appears correct.
55-69: Argument override test is thorough.
71-83: Environment variable usage test is solid.
85-98: Mixed args/env test logic works as expected.
103-111: mock_ipfs_store fixture is well-structured.
113-130: Successful fetch test logic is correct.
132-146: Prefix stripping test for /ipfs/ is correct.
148-156: CID decoding failure test is correct.
158-168: Test for empty load return is valid.
170-180: Invalid JSON test is valid and ensures error handling.
182-192: Timeout scenario test is well implemented.
194-212: Connection error tests cover various failure modes thoroughly.
214-225: Generic load error test is correctly raising StacCatalogError.
230-239: _get_host default logic test is correct.
241-245: Test for IPFS_HOST environment variable usage is correct.
522-523:mock_exists_trueis used correctly where needed.
531-538: Test for no-update scenario is correct and confirms read-only usage.
548-554: Update scenario test is valid, ensuring correct write logic.
575-588: File not found scenario test is thorough.
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (5)
dclimate_zarr_client/ipfs_retrieval.py (2)
212-223: Refine or simplify the JSONDecodeError handling logic.
This block robustly handles JSON decode errors on the second attempt, constructing an informative error message that appends details from the first attempt. If similar logic recurs in other branches, consider extracting a helper method for consistent error reporting.
317-317: Clarify the log message vs. appending logic.
Currently, the log line indicates "skipping" the child link, but the code then appends it tocollections_to_visit. This can be confusing to maintainers. Here is a proposed tweak to the log message:-logger.warning(f"Skipping child link with unexpected string href format (expected dict): {link}") +logger.warning(f"Processing legacy string-format child link: {link}")tests/test_ipfs_retrieval.py (3)
30-33: Consider removing stale commented-out lines.
These lines reference a potential fixture usage that is now commented out. If no longer needed, remove them to keep the file clean.
1067-1095: test_get_ipns_name_hash_local_cache_malformed_json_during_fallback.
Multiple nestedwithstatements degrade readability slightly, as flagged by static analysis. Consider combining contexts:-with patch("requests.get", ...): - with patch("os.path.exists", ...): - with patch("dclimate_zarr_client.ipfs_retrieval.update_cache_if_changed"): - ... +with patch("requests.get", ...), \ + patch("os.path.exists", ...), \ + patch("dclimate_zarr_client.ipfs_retrieval.update_cache_if_changed"): + ...This is a minor style improvement. Otherwise, the fallback logic is tested thoroughly.
🧰 Tools
🪛 Ruff (0.8.2)
1073-1076: Use a single
withstatement with multiple contexts instead of nestedwithstatements(SIM117)
1086-1091: Use a single
withstatement with multiple contexts instead of nestedwithstatementsCombine
withstatements(SIM117)
1097-1119: test_get_ipns_name_hash_local_cache_empty_during_fallback.
Same nestedwithstyle comment applies here. Tests are otherwise solid, verifying the correct exception for empty local cache.🧰 Tools
🪛 Ruff (0.8.2)
1099-1102: Use a single
withstatement with multiple contexts instead of nestedwithstatementsCombine
withstatements(SIM117)
1110-1116: Use a single
withstatement with multiple contexts instead of nestedwithstatementsCombine
withstatements(SIM117)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
dclimate_zarr_client/ipfs_retrieval.py(3 hunks)tests/test_ipfs_retrieval.py(6 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
dclimate_zarr_client/ipfs_retrieval.py (1)
dclimate_zarr_client/dclimate_zarr_errors.py (2)
StacCatalogError(65-66)IpfsConnectionError(61-62)
🪛 Ruff (0.8.2)
tests/test_ipfs_retrieval.py
1073-1076: Use a single with statement with multiple contexts instead of nested with statements
(SIM117)
1086-1091: Use a single with statement with multiple contexts instead of nested with statements
Combine with statements
(SIM117)
1099-1102: Use a single with statement with multiple contexts instead of nested with statements
Combine with statements
(SIM117)
1110-1116: Use a single with statement with multiple contexts instead of nested with statements
Combine with statements
(SIM117)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: build
🔇 Additional comments (34)
dclimate_zarr_client/ipfs_retrieval.py (1)
206-206: Looks good for JSON parsing on retry.
This line correctly callsresponse.json()and is consistent with your approach of catchingJSONDecodeErrorin subsequent blocks.tests/test_ipfs_retrieval.py (33)
3-4: Imports and testing private members.
You're importing additional modules (typing.Dict,typing.Any,CID, etc.) and referencing private/internal functions from the package under test. This is acceptable in Python land, though it's worth noting that testing internal APIs can signal that these functionalities might be candidates for public exposure or refactoring.Also applies to: 8-8, 13-13, 20-20, 22-22, 24-24
34-38: No issues with custom type aliasing.
DefiningMonkeyPatch,MockIPFSStore, andMockResponsefosters clarity in your tests. No concerns here.
40-53: Fixtures look well-structured.
These fixtures (mock_ipfs_storeandmock_requests_get) are clear, enabling consistent test mocking. Good approach for controlling dependencies.
56-62: Helper function for cache path.
get_cache_path()is straightforward. No issues.
64-65: Section heading clarity.
The commentary sets up the_get_ipfs_storetest section. Nothing to change here.
67-77: Test _get_ipfs_store_defaults.
Verifies store creation with defaults when no args are passed. Looks good.
79-93: Test _get_ipfs_store_args_override_env.
Checks argument precedence over env vars, ensuring correct store config. Nicely done.
95-106: Test _get_ipfs_store_env_vars.
Covers usage of environment variables. Good thoroughness.
108-121: Test _get_ipfs_store_mixed_args_env.
Ensures a mix of argument and environment usage. No issues noted.
123-124: Section heading for fetch_json_from_cid tests.
These lines simply mark a new section.
126-139: Test successful fetching from CID.
Verifies normal JSON decode. This covers a critical happy path. Good.
141-153: Test CID with /ipfs/ prefix.
Ensures prefix stripping logic is correct and tested.
155-163: Test invalid CID string.
Properly raisesStacCatalogErrorbefore attempting any IPFS load.
165-174: Test empty bytes from load call.
VerifiesStacCatalogErrorif no data is returned, which is helpful to catch partially available IPFS content.
176-185: Test JSON decode error.
ChecksStacCatalogErrorupon invalid JSON. Straightforward coverage.
187-196: Test Timeout exception.
Ensures correctIpfsConnectionErroris thrown if requests time out.
198-216: Test various connection error messages.
This set of sub-tests is comprehensive, matching typical IPFS gateway issues.
218-229: Test generic load error.
Correctly re-raises issues asStacCatalogError.
231-241: Test _get_host_default.
Confirms fallback toDEFAULT_HOSTabsent env vars. Looks good.
243-248: Test _get_host_from_env.
Verifies environment override. LGTM.
250-250: Section heading for fetch_json_from_ipns tests.
No content to review beyond the comment marker.
253-439: Class TestFetchJsonFromIpnsErrors.
These tests meticulously cover all error paths and retry logic, including timeouts, connection errors, JSON decode errors, and HTTP failures. The detailed coverage is commendable.
444-455: Root catalog fetch error test.
Validates raisingStacCatalogErrorif the initial root catalog fetch fails. Nicely done.
457-468: Invalid root catalog format test.
Properly checks fortype: Catalogand raises error if missing.
470-485: No child links test.
Ensures aStacCatalogErrorif no valid child links exist.
487-532: Invalid child link formats test.
Comprehensive scenario that ensures skipping or warning on malformed links. Good coverage.
534-596: Collection fetch error scenario.
Valid test ensuring errors on certain collections don't prevent discovering valid ones. This improves robustness.
598-621: Invalid collection format scenario.
Checks skipping of incomplete or invalid collection data.
623-678: Item fetch error scenario.
Verifies that one failing item won't block the rest. Nicely handled.
944-953: Helper functions for mock_exists.
These are straightforward, returningTrueorFalse; no issues.
956-972: test_update_cache_no_update.
Correct logic verifying no write occurs when the file is already up-to-date.
973-999: test_update_cache_update.
Checks the file is opened for writing when new data differs. Implementation is correct.
1000-1027: test_update_cache_file_not_found & test_update_cache_decode_error.
Confirms correctly handling missing or malformed cache files, then writing updated content. Good coverage.
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (2)
tests/test_ipfs_retrieval.py (2)
11-11: Consider using alternative import style.While the current import works correctly, you could make it more explicit.
-import dclimate_zarr_client.ipfs_retrieval as ipfs_retrieval +from dclimate_zarr_client import ipfs_retrieval🧰 Tools
🪛 Pylint (3.3.7)
[refactor] 11-11: Use 'from dclimate_zarr_client import ipfs_retrieval' instead
(R0402)
1465-1487: Consider combining nested context managers for cleaner code.The nested
withstatements could be combined for improved readability.- with patch( - "requests.get", side_effect=requests.RequestException("Simulated error") - ): - with patch("os.path.exists", return_value=True): - # Mock update_cache_if_changed to avoid file writes during test setup if called early - with patch("dclimate_zarr_client.ipfs_retrieval.update_cache_if_changed"): - # Mock open specifically within the fallback block - with patch( - "dclimate_zarr_client.ipfs_retrieval.open", - mock_open(read_data="INVALID JSON!!"), - create=True, # Allow create=True if needed by mock_open internals - ) as mock_file_open: - # Mock json.load raising the error when called by get_ipns_name_hash - with patch( - "json.load", side_effect=json.JSONDecodeError("err", "doc", 0) - ): + with ( + patch("requests.get", side_effect=requests.RequestException("Simulated error")), + patch("os.path.exists", return_value=True), + patch("dclimate_zarr_client.ipfs_retrieval.update_cache_if_changed"), + patch( + "dclimate_zarr_client.ipfs_retrieval.open", + mock_open(read_data="INVALID JSON!!"), + create=True, + ) as mock_file_open, + patch("json.load", side_effect=json.JSONDecodeError("err", "doc", 0)) + ):🧰 Tools
🪛 Ruff (0.11.9)
1465-1468: Use a single
withstatement with multiple contexts instead of nestedwithstatements(SIM117)
1478-1483: Use a single
withstatement with multiple contexts instead of nestedwithstatementsCombine
withstatements(SIM117)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
dclimate_zarr_client/ipfs_retrieval.py(7 hunks)tests/test_geotemporal_utils.py(1 hunks)tests/test_ipfs_retrieval.py(6 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- dclimate_zarr_client/ipfs_retrieval.py
🧰 Additional context used
🧬 Code Graph Analysis (2)
tests/test_geotemporal_utils.py (2)
dclimate_zarr_client/geotemporal_data.py (2)
_haversine(651-685)_check_input_parameters(688-722)dclimate_zarr_client/dclimate_zarr_errors.py (2)
InvalidTimePeriodError(38-40)InvalidAggregationMethodError(33-35)
tests/test_ipfs_retrieval.py (2)
dclimate_zarr_client/ipfs_retrieval.py (10)
fetch_json_from_ipns(99-255)get_dataset_hamt_cid_from_stac(259-460)get_ipns_name_hash(541-584)_get_dataset_by_ipfs_cid(628-709)update_cache_if_changed(527-538)_get_single_metadata(479-491)list_datasets(763-928)_get_ipfs_store(34-52)fetch_json_from_cid(56-95)_get_host(463-476)dclimate_zarr_client/dclimate_zarr_errors.py (3)
DatasetNotFoundError(25-26)IpfsConnectionError(61-62)StacCatalogError(65-66)
🪛 Pylint (3.3.7)
tests/test_ipfs_retrieval.py
[refactor] 11-11: Use 'from dclimate_zarr_client import ipfs_retrieval' instead
(R0402)
[refactor] 568-589: Unnecessary "elif" after "raise", remove the leading "el" from "elif"
(R1720)
[refactor] 647-673: Unnecessary "elif" after "return", remove the leading "el" from "elif"
(R1705)
[refactor] 923-944: Unnecessary "elif" after "raise", remove the leading "el" from "elif"
(R1720)
[refactor] 973-1013: Unnecessary "elif" after "return", remove the leading "el" from "elif"
(R1705)
🪛 Ruff (0.11.9)
tests/test_ipfs_retrieval.py
1465-1468: Use a single with statement with multiple contexts instead of nested with statements
(SIM117)
1478-1483: Use a single with statement with multiple contexts instead of nested with statements
Combine with statements
(SIM117)
1491-1494: Use a single with statement with multiple contexts instead of nested with statements
Combine with statements
(SIM117)
1502-1508: Use a single with statement with multiple contexts instead of nested with statements
Combine with statements
(SIM117)
🔇 Additional comments (3)
tests/test_geotemporal_utils.py (1)
1-33: Well-structured test coverage for geotemporal utilities.The tests appropriately cover both single-point and array-based distance calculations for
_haversine, as well as error handling for invalid parameters in_check_input_parameters. The use ofpytest.approxandnp.allclosewith appropriate tolerances is correct for floating-point comparisons.tests/test_ipfs_retrieval.py (2)
689-691: Good practice: Clearing cache for test isolation.The explicit cache clearing ensures test isolation and prevents false positives from cached data. This is an important fix that prevents test interdependencies.
255-445: Excellent comprehensive error handling test coverage.The
TestFetchJsonFromIpnsErrorsclass provides thorough coverage of the retry logic and various failure scenarios, including timeouts, connection errors, JSON decode errors, and HTTP errors. The test structure with predictable mocks and clear assertions makes the retry behavior easy to understand and verify.
Summary by CodeRabbit
/ipfs/-prefixed hrefs.