-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Broken Links when Exporting #194
base: dev
Are you sure you want to change the base?
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## dev #194 +/- ##
==========================================
- Coverage 87.11% 86.88% -0.23%
==========================================
Files 5 5
Lines 1172 1182 +10
Branches 286 289 +3
==========================================
+ Hits 1021 1027 +6
- Misses 100 103 +3
- Partials 51 52 +1 ☔ View full report in Codecov by Sentry. |
More Notes: If FileA contains an external link to a dataset in FileX, then FileB should also contain an external link to the dataset in FileX. |
Goal for this PR:
|
Just in case this is relevant for this PR. The following test cases mirror tests from HDMF but were disabled in the hdfm_zarr test suite because links on export didn't fully work. If this PR fixes this, then we should also look at updating these tests. hdmf-zarr/tests/unit/base_tests_zarrio.py Lines 1329 to 1521 in 8ca5787
hdmf-zarr/tests/unit/test_io_convert.py Lines 993 to 1069 in 8ca5787
|
Good to know. I believe my tests are similar if not the same ones. Thanks for pointing this out so we don't have duplicates. |
Motivation
What was the reasoning behind this change? Please explain the changes briefly.
Export is not supposed to create links to the prior file, rather it is just mean to have the option to preserve them. This means if File A has links to some File C, then when we export File A to File B, File B will also have links the File C.
Problem 1: HDMF-Zarr is missing the logic in HDMF within write_dataset that has conditionals for link in terms of export.
Problem 2: When links are creating (let's say when they are supposed to be created), they are using absolute paths. They are supposed to use relative paths. Both can break when you move things, but absolute paths will always break.
Problem 3: When we create a reference, the source path is shorthand with ".", to represent the file it is currently in. We need to add logic in resolve_ref to handle links.
What to do while this is being fixed:
Always use 'write_args={'link_data': False}'.
I will divide the problem into stages:
Stage 1 (PR 1:) Add updated export logic into write_dataset
Stage 2 (PR 2:) Add logic into resolve_ref to resolve references in links
Stage 3 (PR 3:) Edge case Test Suite
How to test the behavior?
Checklist
ruff
from the source directory.