Feature/calibration data device #2421

Open
avtc wants to merge 7 commits into ModelCloud:main from avtc:feature/calibration-data-device

@avtc
Contributor

@avtc avtc commented Feb 19, 2026

@Qubitium Hi, this feature lets you specify in the config the device where calibration data inputs/outputs are stored. This makes it possible to use more calibration data samples for quantization, because the calibration data can be placed on a device other than cuda:0, which already holds all the layer modules.

Before this feature, the initial calibration data was stored on the CPU, and after the first pass it was stored on DEVICE_0 (usually cuda:0).
With this feature, if calibration_data_device is not set, the original behavior is preserved.
calibration_data_device can be set to "cpu", "cuda:1" (or any other torch device), or "balanced". In "balanced" mode the calibration data is distributed across the available compute devices: DEVICE_0 .. DEVICE_N.
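
The three modes described above can be illustrated with a small standalone sketch. This is not the code from this PR: the function name `resolve_calibration_device`, its parameters, and the round-robin placement used for "balanced" are all assumptions made for illustration (the description only says the data is distributed across DEVICE_0 .. DEVICE_N). Device names are plain strings so the example runs without torch.

```python
from typing import List, Optional


def resolve_calibration_device(
    setting: Optional[str],
    sample_index: int,
    compute_devices: List[str],
    after_first_pass: bool = True,
) -> str:
    """Pick the device name on which one calibration sample would be stored.

    Hypothetical helper: `setting` stands in for calibration_data_device,
    and `compute_devices` stands in for DEVICE_0 .. DEVICE_N.
    """
    if not compute_devices:
        raise ValueError("at least one compute device is required")

    if setting is None:
        # Unset: keep the old behavior described in the PR, i.e. CPU for the
        # initial data and DEVICE_0 once the first pass has run.
        return compute_devices[0] if after_first_pass else "cpu"

    if setting == "balanced":
        # Assumption: round-robin by sample index. The PR only states that
        # the data is distributed across the available compute devices.
        return compute_devices[sample_index % len(compute_devices)]

    # Any explicit device string ("cpu", "cuda:1", ...) is used as given.
    return setting


if __name__ == "__main__":
    devices = ["cuda:0", "cuda:1"]
    for i in range(4):
        print(i, resolve_calibration_device("balanced", i, devices))
```

With an explicit setting such as "cuda:1", every sample lands on that one device, which is what frees cuda:0 for the layer modules.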

P.S. I have used this feature several times before, but on an older branch. This PR is based on the latest master.
I have also fixed the examples in the config file for using the moe parameter, and fixed a sys.abiflags typo that broke the build.

Note: the handling of a layer whose modules are all excluded from quantization was also fixed, as the current main code did not seem to perform the forward replay for it.

I have run several small tests (the first few layers) on qwen3-30b-a3b to make sure nothing fails with auto_forward_data_parallel enabled and disabled, with calibration_data_device set to cpu, cuda:1, and balanced, and with it removed from the config.

@Qubitium
Collaborator

@avtc Thanks again for another gem! Can you whip up some unit tests so there is good test coverage on the diffs? Then I can run them on our GPUs and check for regressions.

@avtc avtc marked this pull request as draft February 23, 2026 15:08
@avtc avtc marked this pull request as ready for review February 23, 2026 18:38
@avtc
Contributor Author

avtc commented Feb 23, 2026

@Qubitium I have added tests with the help of GLM-5. Please review whether they are OK.

@Qubitium
Collaborator

@avtc Will be checking and merging in the next 48 hours.
