Do we need to test support for legacy formats? #857

@dagewa

I was having a look at the size of the DIALS data-files, to see if there is anything we can do to avoid problems like the download timeouts that affect CI tests. The largest sub-directory is image_examples/. The three largest files alone take up over 70 MB:

-rw-rw-r--  1 fcx32934 fcx32934  33M Oct 27 11:02 APS_22ID-mar300.0001
-rw-rw-r--  1 fcx32934 fcx32934  20M Apr  7  2025 APS_19ID-q315_unbinned_a.0001.img.bz2
-rw-rw-r--  1 fcx32934 fcx32934  18M Oct 27 11:02 MacScience-reallysurprise_001.ipf
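
For anyone wanting to repeat this survey on their own checkout, a listing like the one above could be produced with a short script. This is only a minimal sketch and not part of the DIALS tooling: the function name `largest_files` is hypothetical, and it assumes the data files are available as a local directory.

```python
from pathlib import Path


def largest_files(directory, n=3):
    """Return the n largest regular files under directory.

    Hypothetical helper: yields (size_in_bytes, path) pairs, largest first.
    """
    sizes = [
        (path.stat().st_size, path)
        for path in Path(directory).rglob("*")
        if path.is_file()
    ]
    # Sort on size only, so ties never fall back to comparing paths.
    return sorted(sizes, key=lambda item: item[0], reverse=True)[:n]
```

Calling `largest_files("image_examples")` from the top of the data-files checkout would give the candidates to look at first.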

We could save a bit of space by compressing two of these, but before doing that I'd like to explore what value there is in keeping these files. They are used in test_experiment_files.py (and test_filecache.py in the case of MacScience-reallysurprise_001.ipf).
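
To put a number on "a bit of space", the potential saving could be measured before committing to anything. The sketch below assumes bz2 would be the scheme used, only because one of the three files is already stored as .bz2; the helper name `bz2_saving` is hypothetical, and it reads the whole file into memory, which is acceptable at these sizes.

```python
import bz2
from pathlib import Path


def bz2_saving(path):
    """Return (original_bytes, compressed_bytes) for one file.

    Hypothetical helper: compresses in memory and writes nothing to disk.
    """
    data = Path(path).read_bytes()
    return len(data), len(bz2.compress(data))
```

Running this on the two uncompressed files would show whether compression alone is worth the effort, independently of the question of whether to keep them.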

We have good support for old file formats in dxtbx, yet it is far from complete. If we were actually aiming for comprehensive support, I would be in favour of keeping these files and finding examples from all the other missing instruments. However, I think the work involved in making dxtbx truly comprehensive is far beyond our resources. In that case, is there really any value in testing support for a smattering of legacy file formats?

I picked these three files only because they are the largest, but there are many other images from legacy detectors in this directory that I think have limited value.
