-
Notifications
You must be signed in to change notification settings - Fork 147
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Option: output sparse binary dose file instead of 3ddose in DOSXYZnrc #405
Conversation
Looks pretty cool Marc! Did you investigate how well your 3ddose files compress? In egs_brachy we have an option to output gzipped 3ddose (and egsdat) files (via gzstream) instead of text. IIRC the gzipped files were roughly an order of magnitude smaller so not as impressive as your gains here. If you have lots of zero dose voxels though you may get better compression than our typical case. |
Hey Randy!
Not too bad! For my specific use case, I'd still want to use a binary file because everytime I do an optimisation run, I need to read in those 15000 dose files. As you can imagine, 3ddose files are extremely inefficient to parse compared to binary files. edit: This is another test with realistic 6 MeV/20MeV 1x1 cm^2 beamlets on a 120x53x100 phantom (so, small end of the phantoms I deal with)
|
Thanks for posting your numbers! Your technique is still a pretty nice gain over just gzipping. I wonder if you could make it more generally applicable by storing start and end indices (or block sizes) for blocks of non zero doses instead of storing the individual voxel indices directly Something like:
Worst case for both methods (all voxels non zero for yours, every other voxel non zero for mine) stores Nvox indices so I guess it would be fairly dependent on how the zero doses are distributed throughout your dose array. |
I think this option for sparse binary output is long overdue! I've coded a similar solution for my own purposes before, due to the recombination of 3ddose files to be the slowest part of certain highly parallel simulations. I think Randle's suggestion of storing pairs makes sense, since dose usually occurs in contiguous regions. |
It would be interesting to profile the different methods vs their savings in space. Doing blocks of indices would only save space on the indices, so you'd be optimising the 1/3 of the file size that can be optimised (ignoring the header). The cost would be an increase in complexity for writing/reading those files. Having 3 neat blocks means you can read the entire dose content of the file with 3 freads (or a single fread since they're all 4 bytes but then there's more work to do in the background). Since the index-pair format would implicitly optimise the format for storage over write/load performance, I'd be especially curious about the difference between a gzipped version of the flat indices array format vs the index-block format. Another thought is that while dose does indeed occur in contiguous regions physically, it doesn't necessarily occur in contiguous memory regions. Linearized indices are only "optimised" for a beam incident normal to the xy plane (though my bias is showing there, brachy dose distributions would have different sparseness properties :P). |
I'm sitting down and implementing the "RandyDos" file format now, will have some benchmarks soon! |
OK so I implemented the randydos file format in another branch. Here's some numbers for a monoenergetic 16 MeV electron beam incident on a the xy plane of a 128x128x128 phantom
randydos is 2.1% the file size of 3ddose, and bindos is 2.8% I then made a python class which deserializes each of these files into the same format. I print some statistics for (basic) validation
For the randydos, the number of index pairs was 33862, so 67724 ints to store instead of 249672, a nice reduction. then I went into ipython to see their load times
This is far from scientific though since I can't guarantee that my from_randydos method is the most efficient way to deserialize a randydos. I'm not used to reading binary files from Python either so who knows how else this could be done. See this gist for the code I wrote for each file type. |
Nice work! voxel_indices2 = []
for vstart, vend in zip(blocks[::2], blocks[1::2]):
voxel_indices2.extend(range(vstart, vend))
voxel_indices3 = [vi for vstart, vend in zip(blocks[::2], blocks[1::2]) for vi in range(vstart, vend)] Either of those solutions for getting your voxel indices would be a bit more idiomatic in Python (may or may not be quicker, vi3 is probably quickest) but you could also just fill your doses array rather than creating the intermediate voxel indices array: # not tested :p
doses = np.zeros(num_voxels[0] * num_voxels[1] * num_voxels[2])
uncerts = np.zeros(num_voxels[0] * num_voxels[1] * num_voxels[2])
processed = 0
for block_start, block_end in zip(blocks[::2], blocks[1::2]):
block_size = block_end - block_start
doses[block_start:block_end] = nonzero_doses[processed:processed + block_size]
uncerts[block_start:block_end] = nonzero_doses[processed:processed + block_size]
processed += block_size That said, you need to be cautious extrapolating the performance of Python to (Fortran|C|C++|etc)! |
I'll test the performance, my logic was to use as much of numpy as possible over python loops because of the speed difference (only chose python because of how simple it is to profile), but that could be premature optimisation! Thanks for the snippets :D edit: Filling the dose arrays directly:
vi3:
|
C++11 version that I rapidly cooked up, don't mind the code quality :P
edit: forgot to compile with -O3 I think it more or less settles the question. We've only tested with one dose in the optimal orientation, but performance is identical so they can't be too far apart in the worst case. It's probably worth it to go with randydos, I'll amend the PR later today. |
Thanks for running all the numbers...that'll be a nice addition to EGSnrc :) |
I've updated the pull request and added some documentation / GUI input. I have absolutely no idea what I'm doing for the GUI, but I just copied whatever was there for ihowfarless and it seems to work nicely. Oh and I don't have all the packages to compile the .tex document so I don't know if it compiles after my changes. Some things left to discuss would probably be:
edit: not sure why CI is failing, would be nice to see the log but it's just blank. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Works as expected, backwards compatible. Just needs some minor changes as noted.
Deferring this to Release 2022. This ought to be merged in develop as soon as possible after Release 2021. @marenaud, Are there any updates at your end: have you or others been using this without issue since implementation? |
Sorry for the late reply. I've used it since implementation with no issues, but it's been a good amount of time since I've generated my last bindos file :( |
Ok thanks for the update. We will go ahead and merge this in develop after the 2021 release tomorrow! |
21c8b2c
to
dd1e5f4
Compare
Polished up the commits, empty |
dd1e5f4
to
baf88af
Compare
Rebased on |
baf88af
to
4fc47e6
Compare
Rebased on develop for merging in release 2022. |
@mainegra please review when you have a chance. |
200fd27
to
a4707bd
Compare
Added author Marc-Andre Renaud in files that were modified to implement the |
Change the binary format to the one suggested by Randle Taylor, and add information about the sparse matrix binary dose format in the DOSXYZnrc documentation.
Add an option to select the dose output format in the gui. Just emulating what was done for the howfarless option.
a4707bd
to
1e33700
Compare
Add the contributor name in the same commit where the file is mofidied, and squashed the small commit to tweak the dosxyznrc gui menu item strings. |
Sparse binary 3ddose output
Just throwing this pull request up here to see if there is any interest. It was a fairly simple change to make.
Use case
As part of my research, I use DOSXYZnrc to generate electron beamlets for mixed electron-photon optimisation. A full set of electron beamlets for 5 gantry angles for a single chest wall patient (15,000 beamlets total) with 3mm^3 voxels takes up 2.1 TB of disk space when they're stored as 3ddose files. Just the process of converting beamlets to a nicer format takes a long time. When stored as sparse binary files, the same set of beamlets takes ~14 GB. Being able to output directly to binary dose saves tons of space and time.
The format
The format is similar the 3ddose format, except that voxels with 0 dose are omitted. The binary file is constructed as:
The format was chosen to allow codes reading in the binary dose to avoid having to read in uncertainties if they're not needed. Dose/uncert values are stored as floats to save space since there's no accumulation of floating point errors to worry about as this is a storage format.
I added a flag to toggle between 3ddose/binary dose output in the input file. Default value is 3ddose and the input file is kept backwards compatible for people who don't specify the flag.
Downsides
access='stream'
in fortran to write C-style binary files. Apparently this was added with fortran2003 so there could be compatibility issues with older compilers.Let me know if there is interest to include this in the official distribution. I can fill in the gaps in the documentation and add a checkbox to the GUI. Otherwise I'll just keep this on my branch.
Cheers,
Marc