Adding scalar modalities for HSC #17

Merged
EiffL merged 4 commits into main from hsc
May 24, 2025

@EiffL EiffL commented May 24, 2025

This pull request updates the codec system: it adds scalar modalities for HSC, extends codec initialization, and adds a new script for exporting codecs. It also adds test data for validation. The most significant changes are:

Codec System Enhancements:

  • Added new scalar modalities (AG, AR, AI, AZ, AY, MagG, MagR, MagI, MagZ, MagY, Shape11, Shape22, Shape12) in aion/modalities.py to support HSC extinction values, magnitudes, and shapes. These modalities are also added to the ScalarModalities list.
  • Updated the ScalarCodec initialization in aion/codecs/scalar.py to include an optional min_log_value parameter for better control over log-scaling behavior.
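
As a rough illustration of the second point, here is a minimal sketch of what a floor such as min_log_value could do in a log-scaled codec. It assumes the parameter is a positive lower clamp applied before the logarithm; that is an assumption, not something confirmed by aion/codecs/scalar.py. The helper names `log_scale` and `inverse_log_scale` are hypothetical.

```python
import math
from typing import Optional


def log_scale(x: float, min_log_value: Optional[float] = None) -> float:
    """Map x to log10 space. Hypothetical helper, not the AION API.

    Assumption: min_log_value, when given, is a positive floor applied
    before the logarithm so that zero or negative inputs stay finite.
    """
    if min_log_value is not None:
        x = max(x, min_log_value)
    if x <= 0:
        raise ValueError("log scaling needs a positive value or a floor")
    return math.log10(x)


def inverse_log_scale(y: float) -> float:
    """Undo log_scale for values that were at or above the floor."""
    return 10.0 ** y


print(log_scale(100.0))                    # plain log10, no floor
print(log_scale(0.0, min_log_value=1e-3))  # clamped to the floor first
```

Under this reading, leaving min_log_value unset keeps the previous behaviour, while setting it trades exact round-tripping of very small values for a bounded log range.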

New Script for Codec Export:

  • Introduced scripts/export_hsc_codecs.py, a comprehensive script for converting legacy HSC codecs to the AION format, uploading them to the HuggingFace Hub, and generating test data for validation. The script supports multiple operational modes such as testing, uploading, and skipping uploads.
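
To make the operational modes concrete, here is a small sketch of how a command-line script could expose a test mode and a skip-upload mode. The flag names `--test` and `--skip-upload`, the step names, and the helper `plan_steps` are all hypothetical; the actual interface of scripts/export_hsc_codecs.py may differ.

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    # Flag names are hypothetical; the real script may use different ones.
    parser = argparse.ArgumentParser(description="Export codecs (sketch)")
    parser.add_argument("--test", action="store_true",
                        help="convert and validate only, never upload")
    parser.add_argument("--skip-upload", action="store_true",
                        help="convert and write test data, but do not upload")
    return parser


def plan_steps(args: argparse.Namespace) -> List[str]:
    """Return the ordered steps a run would perform under these flags."""
    steps = ["convert"]
    if not (args.test or args.skip_upload):
        steps.append("upload")
    steps.append("generate_test_data")
    return steps


# Parse an explicit argument list so the sketch runs without a shell.
print(plan_steps(build_parser().parse_args(["--skip-upload"])))
```

Keeping the upload step behind an explicit opt-out like this makes a dry run possible without Hub credentials.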

Test Data Updates:

  • Added new test data files for the updated codecs, including input, encoded, and decoded batches for modalities like a_g, a_r, a_i, a_z, a_y, and others. These files are stored in tests/test_data/ and are tracked using Git LFS.

Configuration Updates:

  • Updated .gitattributes to ensure all .pt files are managed by Git LFS, streamlining handling of large model and test data files.
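
For reference, the conventional entry that `git lfs track "*.pt"` writes to .gitattributes is shown below. The exact line added in this PR is not quoted here, so treat this as the standard form rather than a copy of the diff.

```
*.pt filter=lfs diff=lfs merge=lfs -text
```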

@EiffL EiffL requested a review from Copilot May 24, 2025 13:42
Copilot AI left a comment

Pull Request Overview

This PR expands the HSC codec support by introducing new scalar modalities, enhances codec initialization with an optional log‐scaling parameter, and provides a one‐stop script for converting, uploading, and validating legacy HSC codecs. It also updates test fixtures and LFS configuration.

  • Added new HSC extinction, magnitude, and shape scalar modalities in aion/modalities.py.
  • Extended ScalarCodec to accept min_log_value and adjusted quantizer logic.
  • Introduced scripts/export_hsc_codecs.py to automate conversion, uploading to HuggingFace, and test‐data generation.
  • Updated test data under tests/test_data/ and enabled .pt files in Git LFS.
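
The first bullet above can be pictured with a small sketch that builds the thirteen modality names from the PR description and appends them to a registry list. The `ScalarModality` dataclass, its fields, and the `scalar_modalities` list are hypothetical stand-ins; the real classes in aion/modalities.py are likely structured differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalarModality:
    """Hypothetical stand-in for a scalar modality class."""
    class_name: str  # class name as listed in the PR description
    group: str       # "extinction", "magnitude", or "shape"


BANDS = ["G", "R", "I", "Z", "Y"]  # the five HSC bands

HSC_SCALARS = (
    [ScalarModality(f"A{b}", "extinction") for b in BANDS]
    + [ScalarModality(f"Mag{b}", "magnitude") for b in BANDS]
    + [ScalarModality(f"Shape{ij}", "shape") for ij in ("11", "22", "12")]
)

# Analogue of adding the new classes to the ScalarModalities list.
scalar_modalities = []
scalar_modalities.extend(HSC_SCALARS)

print([m.class_name for m in scalar_modalities])
```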

Reviewed Changes

Copilot reviewed 45 out of 45 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| aion/modalities.py | New HSC scalar modality classes registered in ScalarModalities. |
| aion/codecs/scalar.py | ScalarCodec now takes min_log_value and uses log quantizer. |
| scripts/export_hsc_codecs.py | New export script for legacy HSC codecs (convert, upload, test). |
| tests/test_data/*.pt | Added LFS‐tracked input/encoded/decoded batches for new modalities. |
| .gitattributes | Added *.pt to Git LFS tracking. |
| pyproject.toml | Added huggingface_hub[torch] to optional dependencies. |

Comments suppressed due to low confidence (1)

aion/codecs/scalar.py:83

  • The ScalarCodec constructor now always uses ScalarLogReservoirQuantizer, which changes its original behavior of linear quantization. Consider restoring ScalarReservoirQuantizer for ScalarCodec and reserving ScalarLogReservoirQuantizer for LogScalarCodec.
self._quantizer = ScalarLogReservoirQuantizer(
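
To show why the choice between linear and log quantization changes behaviour, here is a toy comparison. It does not reproduce the reservoir quantizers named above; it swaps in fixed equal-width bins, and the functions `linear_bin` and `log_bin` are hypothetical.

```python
import math


def linear_bin(x: float, lo: float, hi: float, n_bins: int) -> int:
    """Index of x among n_bins equal-width bins on [lo, hi]."""
    frac = (x - lo) / (hi - lo)
    return min(n_bins - 1, max(0, int(frac * n_bins)))


def log_bin(x: float, lo: float, hi: float, n_bins: int) -> int:
    """Same binning applied to log10(x); requires lo > 0 and x > 0."""
    return linear_bin(math.log10(x), math.log10(lo), math.log10(hi), n_bins)


values = [0.02, 0.2, 2.0, 20.0]
print([linear_bin(v, 0.01, 100.0, 8) for v in values])  # [0, 0, 0, 1]
print([log_bin(v, 0.01, 100.0, 8) for v in values])     # [0, 2, 4, 6]
```

With linear bins the three smallest values share one code, while log bins separate all four, so switching a codec from one scheme to the other changes the tokens it emits for the same input.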

@EiffL EiffL left a comment

LGTM

@EiffL EiffL merged commit a01a9ac into main May 24, 2025
2 checks passed
@EiffL EiffL deleted the hsc branch May 24, 2025 13:58