Frame decompression error #14

Open
ackagel opened this issue Mar 4, 2022 · 11 comments

@ackagel

ackagel commented Mar 4, 2022

I keep encountering this error:

Error uncompressing frame, error code: 18446744073709551606. File is either corrupted, or in a (yet) unsupported variant of the format.

when querying frames beyond some frame id in some tdf files (e.g. D.query(frames=[bad_frame], columns=('intensity', ...))). The error code reads like a u64 underflow (-10 or -8, I think). Here's some of the tdf metadata for context:

In [1]: tdf_tables.table2dict(os.path.join(path, 'analysis.tdf'), 'GlobalMetadata')

Out [1]: {'Key': array(['SchemaType', 'SchemaVersionMajor', 'SchemaVersionMinor',
        'AcquisitionSoftwareVendor', 'InstrumentVendor', 'ClosedProperly',
        'TimsCompressionType', 'MaxNumPeaksPerScan', 'AnalysisId',
        'DigitizerNumSamples', 'MzAcqRangeLower', 'MzAcqRangeUpper',
        'AcquisitionSoftware', 'AcquisitionSoftwareVersion',
        'AcquisitionFirmwareVersion', 'AcquisitionDateTime',
        'InstrumentName', 'InstrumentFamily', 'InstrumentRevision',
        'InstrumentSourceType', 'InstrumentSerialNumber', 'OperatorName',
        'Description', 'SampleName', 'MethodName', 'DenoisingEnabled',
        'PeakWidthEstimateValue', 'PeakWidthEstimateType',
        'PeakListIndexScaleFactor', 'OneOverK0AcqRangeLower',
        'OneOverK0AcqRangeUpper', 'MaldiApplicationType', 'RunId',
        'TargetId', 'Geometry', 'ImagingAreaMinXIndexPos',
        'ImagingAreaMaxXIndexPos', 'ImagingAreaMinYIndexPos',
        'ImagingAreaMaxYIndexPos'], dtype='<U26'),
 'Value': array(['TDF', '3', '6', 'Bruker', 'Bruker', '1', '2', '1014',
        '00000000-0000-0000-0000-000000000000', '447844', '800.000000',
        '4000.000000', 'timsTOF', '3.0.20',
        'I4IP-12.67.1.179; IPPT-12.67.1.179; IPET-12.67.1.179; FXM3-0.0.1.6; MXMC-0.0.4.2; MXIF-0.0.2.0; MXI2-NOT_PRESENT; MXRF-0.0.1.1; RFXS-0.1.3.1; RFXE-NOT_PRESENT',
        '2022-02-15T14:47:39.295-08:00', 'timsTOF fleX MALDI 2', '9', '2',
        '1', '1877407.00348', 'Admin', '', 'Glycans First',
        'timsTOFflex_startup TIMS ON MALDI.m', '0', '0.000025', '1', '1',
        '0.800000', '2.950000', 'Imaging', '', 'T_0235380_1002526_1',
        'Imaging_Run', '702', '1222', '103', '360'], dtype='<U158')}
@michalsta
Owner

Hi,

The error code has not overflowed; libzstd actually reports errors as high unsigned ints. Having said that, at first glance there doesn't seem to be anything obviously wrong with the metadata, so it's hard to tell what's happening without getting a look at the file itself. Can you upload it somewhere, or do you need to keep it confidential?
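
To make the "high unsigned int" point concrete, here is a minimal sketch of turning the reported number back into a small error number. It assumes the value is a zstd return code stored in a 64-bit size_t, i.e. (size_t)(-n). The helper name and the two-entry name table are illustrative only; the names are quoted from zstd_errors.h as commonly shipped, so check them against the header of the zstd version actually in use.

```python
SIZE_T_BITS = 64  # assumption: 64-bit build, so size_t is 64 bits wide

# Partial, illustrative table; verify against your own zstd_errors.h.
ZSTD_ERROR_NAMES = {
    10: "prefix_unknown",
    20: "corruption_detected",
}


def decode_size_t_error(code: int, bits: int = SIZE_T_BITS) -> int:
    """Return n for a code that was produced as (size_t)(-n)."""
    if not 0 <= code < (1 << bits):
        raise ValueError("code does not fit in an unsigned integer of that width")
    # Two's-complement negation within the given width.
    return ((1 << bits) - code) % (1 << bits)


n = decode_size_t_error(18446744073709551606)
print(n, ZSTD_ERROR_NAMES.get(n, "unknown"))
```

For the code quoted in the report, the decoded value is 10, which matches the "-10" reading rather than "-8".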

@MatteoLacki
Collaborator

Hello,

What is the value of bad_frame, and what are the minimum and maximum frame ids in the dataset?
Just want to eliminate the usual suspects first :)

Best wishes

@ackagel
Author

ackagel commented Mar 4, 2022

@michalsta This particular run should probably stay confidential, but it sounds like our team can put together a run that we can share sometime next week. For context, this same error code pops up on pretty much all our tdf files so far, so the issue will likely reproduce on the shared run.

@MatteoLacki The minimum frame id is 1, while the maximum ranges between 70k and 110k depending on the file. The first bad frame typically shows up around 3,000-8,000, and then all subsequent frames throw this error. The intensities/ion mobilities from the earlier 'error-free' frames look correct and don't seem to be corrupted.

@ackagel
Author

ackagel commented Mar 25, 2022

@michalsta Sorry for the delay, I can share some offending tdf/tdf_bin files now. Is sharing via Google Drive alright?

@michalsta
Owner

Yes, absolutely. Just post/send me the link

@ackagel
Author

ackagel commented Mar 29, 2022

@michalsta
Owner

ok, that's weird. I downloaded these and... works for me ;) So. Let's take a few shots in the dark:

Maybe there's some zstd version mismatch, and it somehow uses the system-wide zstd instead of the built-in one?
What do you get when you run: python -c "import zstd; print(zstd.version())"
1.5.1.0 for me.

What's your system (not Python) zstd version? For example:

$ ls /usr/lib/libzstd.so -l
lrwxrwxrwx 1 root root 16 Feb 19 19:04 /usr/lib/libzstd.so -> libzstd.so.1.5.2

Maybe something got changed in transit? What's the md5sum of the two files you sent? For me it's

2c41f1053df0315db5c331d3979f37d5  analysis.tdf
3af802550b9c7f3f2643899257a68f04  analysis.tdf_bin

Could you check out the newest opentims and install the Python version from the devel branch? It will install a script, "opentims_verify.py". Could you run it on your machine, both with and without the -c option?
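
Since md5sum is not available by default on every platform, the checksum comparison above can also be done from Python. This is a minimal sketch using only the standard library; the function name md5_of_file is made up for illustration and is not part of opentims.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Read in chunks so that a multi-gigabyte tdf_bin does not
        # have to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Example (file names as used in this thread):
# for name in ("analysis.tdf", "analysis.tdf_bin"):
#     print(md5_of_file(name), name)
```

The printed digests can then be compared directly with the two values listed above.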

@michalsta
Owner

Also, can you check which frame number is the crashy one?
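
One way to locate the first crashing frame without reading all 70k-110k frames is a binary search. This is only a sketch, and it relies on the behaviour described earlier in the thread: once one frame fails, all subsequent frames fail too. The function first_bad_frame and the frame_ok callback are hypothetical names; the opentimspy wiring in the comments is an assumption (import path, min_frame/max_frame attributes, and the failure surfacing as a Python exception) and has not been run here.

```python
def first_bad_frame(frame_ok, lo, hi):
    """Smallest frame id in [lo, hi] for which frame_ok(id) is False.

    Returns None if frame hi reads fine. Assumes that every frame after
    the first bad one is bad as well.
    """
    if frame_ok(hi):
        return None
    # Invariant: frame hi is bad, and every frame below lo is good.
    while lo < hi:
        mid = (lo + hi) // 2
        if frame_ok(mid):
            lo = mid + 1
        else:
            hi = mid
    return lo


# Hypothetical wiring to opentimspy (assumptions, not verified here):
# from opentimspy import OpenTIMS
# D = OpenTIMS(path)
# def frame_ok(frame_id):
#     try:
#         D.query(frames=[frame_id], columns=("intensity",))
#         return True
#     except Exception:
#         return False
# print(first_bad_frame(frame_ok, D.min_frame, D.max_frame))
```

The search needs about 17 reads for 100k frames, compared with a linear scan that has to decompress every frame up to the failure point.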

@ackagel
Author

ackagel commented Mar 31, 2022

Good to hear it works on your end! I'll try extracting on a different system for now, probably a Linux machine, since this is sounding more and more like a "Windows is cranky about something ambiguous" problem.

As for the zstd version, it looks like no zstd module is installed for Python, and pip installing zstd didn't improve anything. I'm pretty sure I don't have a system zstd installed (if I do, I'm not sure where that DLL would be on Windows). The md5sums match, and opentims_verify.py also crashes on the problem frame. It looks like the first crashy frame is 8622.

@michalsta
Owner

Ah, it's on Windows. Okay, now I can reproduce it. Will have a look.

@simondoer

simondoer commented Apr 12, 2023

Hello, is there any progress on this problem? I am getting the same error for my files from DIA experiments. From a certain frame onward, so far always at around 42000 of 67700, all subsequent frames cause the error. I also run opentimspy on Windows.
