Skip to content

Conversation

@jspaezp
Copy link
Contributor

@jspaezp jspaezp commented Oct 16, 2025

Well this is not really meant to be a "real" PR (but it can become one).

I guess I added a lot of stuff that worked for me and wanted to know if any of it would be of interest.

  1. Temperature aware calibration... enough said ... seems to work on my data ... I have no idea if it will work in other data types.
    • Also modified the test files to include data for it.
  2. Buffered read.
    • I wanted to reduce the number of possible calls to the global allocator, so adding the option to pass a buffer means that I can read multiple frames with the same memory (instead of creating + destroyig vecs left and right)
    • Speed difference is not huge in my tests...
  3. TimsTofFile/Path api -> exposed it and allowed getting the metadata + frame reader from it (its nice to validate once that it is a real tims tof path and then dispatch instead of one validation per data type needed)
  4. Frame struct: Split the data into a FrameMeta and a FramePeaks (frame composes from them), and the framereader reads the meta for each (instead of having a partially built Frame with empty vecs)

/// have index 2 but its empty!
///
/// Then this index can be converted using the Scan2ImConverter.convert
fn expand_mobility_iter(&self) -> impl Iterator<Item = u16> + '_ {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think than an option to expand the peaks should be inside timsrust or at least this function

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have this set up internally with a more generic with a SparseVec structure though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good to me.

// These two are here so we can implement optimized versions for iterators if needed.
fn convert_iter(
&self,
values: impl Iterator<Item = f64>,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd probably go for &[f64] rather than an impl

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the reason I made it this way was for things like the scan index, where we dont have a materialized slice.

Comment on lines +16 to +25
pub(crate) fn decompress_reset(
&mut self,
compressed_bytes: &[u8],
) -> Result<(), TdfBlobError> {
self.bytes.clear();
copy_decode(compressed_bytes, &mut self.bytes)
.map_err(|_| TdfBlobError::Decompression)?;
Self::check_len(self.bytes.as_slice())?;
Ok(())
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have some stats on how much better recycling allocated spae works compared to making new?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

real answer ... no ... if my memory serves me correctly it was less than 5% in my laptop. But I believe that would greatly depend on what allocator is being used (I have seen massive heap allocation overhead in windows but not mac, and using mimalloc in windows makes it go away, whilst it makes mac slower ... )

Comment on lines +15 to +16
pub peaks: FramePeaks,
pub meta: FrameMeta,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a very good idea that I also have been wanting to implement for awhile already

/// have index 2 but its empty!
///
/// Then this index can be converted using the Scan2ImConverter.convert
fn expand_mobility_iter(&self) -> impl Iterator<Item = u16> + '_ {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed

/// have index 2 but its empty!
///
/// Then this index can be converted using the Scan2ImConverter.convert
fn expand_mobility_iter(&self) -> impl Iterator<Item = u16> + '_ {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have this set up internally with a more generic with a SparseVec structure though.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I probably want to put the actual reading elsewhere. I understand the tight coupling between the reader and metadata, but might opt to move the metadata definition to the reader rather than the otherway around...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No preference... the main thing here was that the reader struct is never constructed, so it was kind of annoying to 'import' two structs for no reason.

Comment on lines +38 to +41
pub dia_windows: Option<Vec<Arc<QuadrupoleSettings>>>,
pub frame_metas: Vec<FrameMeta>,
pub acquisition: AcquisitionType,
pub compression_type: u8,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, I feel structs shoud not have pub fields. I actually need to revisit all those that still have this... (in particular frame)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok I think ... we can break this up into two things...

  1. Should I have access to the data? -> I believe the answer is yes, I do not believe I should need to read the data to know how many MS1 frames I have, or how many quad types I am using. If you do not want to have that be pub that is cool, but a getter would be great to have.
  2. Should I have "write-access" to the field. -> Well ... I think that if I am writing to it its one me to not mess it up.

Comment on lines +158 to +159
/// Attempts to find the frame using the instrument index and
/// returns it if found.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to make explicit that the indices indeed are different in terms of accession (0-offset rather than 1)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was not sure if that was the difference and if there could be gaps (which is why I used bin search instead of subtraction :P)

frame_buffer: &mut Frame,
blob_buffer: &mut TdfBlob,
) -> Result<(), FrameReaderError> {
frame_buffer.peaks.clear();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At a very quick glance I find it difficult to see if/how this maintains multithreaded compatability without getting overly complex. Care to elaborate? If not, I'll try to look up the details later

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the real answer is you dont :P if the user (me) wants to use this interface its on them to figure out the multi-threaded mutable state. (I use a per-thread buffer, but I can see someone else using an Arc<Mutex<Queue>>). You gain in alloc overhead but loose in threaded logic complexity as a user.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants