From NVIDIA Megatron-LM for visibility #18

Status: Open. Wants to merge 4,810 commits into base branch multi-query-attention.

Conversation

RaymondLi0 (Collaborator)

No description provided.

@RaymondLi0 RaymondLi0 changed the base branch from multi-query-attention to before-merge June 20, 2023 20:12
@RaymondLi0 RaymondLi0 changed the base branch from before-merge to multi-query-attention June 20, 2023 20:12
mathemakitten and others added 28 commits May 9, 2025 16:34
Co-authored-by: oliver könig <okoenig@nvidia.com>
Co-authored-by: Mcore Bot <mcore-bot@nvidia.com>
Inference functional test: 580M Minitron

See merge request ADLR/megatron-lm!2812
Co-authored-by: oliver könig <okoenig@nvidia.com>
Co-authored-by: Mcore Bot <mcore-bot@nvidia.com>
Invalidate cached SSM tensors if batch size changes during inference

See merge request ADLR/megatron-lm!3277
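The fix above follows a common pattern: cached inference state is only valid for the batch size it was allocated with, so it must be rebuilt when the batch size changes between requests. A minimal sketch of that pattern, with illustrative names (SSMStateCache, get_states are not Megatron-LM's actual API):

```python
class SSMStateCache:
    """Caches per-sequence SSM state tensors across inference steps."""

    def __init__(self):
        self._batch_size = None
        self._states = None

    def get_states(self, batch_size, state_dim):
        # Invalidate and reallocate when the batch size changes; reusing
        # states sized for a different batch would corrupt the computation.
        if self._batch_size != batch_size:
            self._states = [[0.0] * state_dim for _ in range(batch_size)]
            self._batch_size = batch_size
        return self._states
```

Repeated calls with the same batch size return the same cached object; a different batch size triggers reallocation.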
ci: Move unit test logic to file

See merge request ADLR/megatron-lm!3291
Adapt _write_item call to new signature with 'serialization_format'

See merge request ADLR/megatron-lm!3243
Co-authored-by: Russell Hewett <rhewett@nvidia.com>
Add in-process restart

See merge request ADLR/megatron-lm!2711
ci: Run on multiple clusters

See merge request ADLR/megatron-lm!3292
ci: Allow specific TE-ref

See merge request ADLR/megatron-lm!3302
ci(fix): Write logs to log_dir

See merge request ADLR/megatron-lm!3299
Address dist checkpointing PyT 24.08 failure

See merge request ADLR/megatron-lm!3253
ci(hotfix): Downstream pipeline

See merge request ADLR/megatron-lm!3307
Co-authored-by: Szymon Migacz <smigacz@nvidia.com>
MR feedback: added units for arguments, optional argparse flag to clear GPU...

See merge request ADLR/megatron-lm!3308
Co-authored-by: Zhiyu Li <zhiyul@NVIDIA.com>
Allow process group as optional argument for mamba class constructor

See merge request ADLR/megatron-lm!2966
sbak5 and others added 30 commits June 16, 2025 15:37
Revert `fork` to `spawn` based on stability issues in checkpointing

See merge request ADLR/megatron-lm!3450
Co-authored-by: Simon Layton <slayton@nvidia.com>
Add kitchen extension with per-layer configurable quantization configuration

See merge request ADLR/megatron-lm!3301
Add deprecation warning for legacy inference

See merge request ADLR/megatron-lm!3474
Change naming of original_max_position_embeddings to avoid conflicts

See merge request ADLR/megatron-lm!3181
Make cudagraph replay check more descriptive when it fails arg checks

See merge request ADLR/megatron-lm!3472
Co-authored-by: yaoyu-33 <yaoyu.094@gmail.com>
Co-authored-by: Mcore Bot <mcore-bot@nvidia.com>
M4 Taskforce: Disable T5 and encoder_and_decoder tests in CI for MCore Encoder Refactoring

See merge request ADLR/megatron-lm!3414
Quick fix for NeMo: handle alternate key names like 'pre_wd_mult' instead of 'wd_mult'

See merge request ADLR/megatron-lm!3444
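Handling an alternate key name like 'pre_wd_mult' alongside 'wd_mult' amounts to an alias lookup: try each accepted spelling in order and fall back to a default. A small illustrative sketch (the function name and default are assumptions, not the actual fix):

```python
# Accepted spellings for the weight-decay multiplier, canonical name first.
WD_ALIASES = ("wd_mult", "pre_wd_mult")

def get_wd_mult(param_config, default=1.0):
    # Return the value under the first alias present in the config.
    for key in WD_ALIASES:
        if key in param_config:
            return param_config[key]
    return default
```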
chore: Bump version 0.14.0

See merge request ADLR/megatron-lm!3477
Co-authored-by: Selvaraj Anandaraj <selvaraja@cw-dfw-cs-001-login-01.cm.cluster>
Co-authored-by: Selvaraj Anandaraj <selvaraja@login-ptyche02.ptyche.clusters.nvidia.com>
Added offloading support for MCore layers

See merge request ADLR/megatron-lm!3071
Co-authored-by: Shanmugam Ramasamy <shanmugamr@cw-dfw-cs-001-vscode-01.cm.cluster>
Co-authored-by: Mcore Bot <mcore-bot@nvidia.com>
Co-authored-by: Shanmugam Ramasamy <shanmugamr@shanmugamr-mlt.client.nvidia.com>
Bug fix to reset kv chunks assigned to -1 and avoid shuffling of new tokens

See merge request ADLR/megatron-lm!3437
chore: Add init to tools

See merge request ADLR/megatron-lm!3483
Fix unit test test_fp8_param.py blockwise scaling

See merge request ADLR/megatron-lm!3480
chore: Add init to examples

See merge request ADLR/megatron-lm!3492
build: Force pin down setuptools

See merge request ADLR/megatron-lm!3493
Pad input tensors and enable fp8 weights for fp8 inference

See merge request ADLR/megatron-lm!3341
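Padding inputs for FP8 inference usually means rounding a dimension up to a kernel-friendly multiple and trimming the padding from the output. A hedged sketch of the rounding step, assuming a multiple-of-16 constraint typical of FP8 GEMM kernels (the specific multiple and function name are assumptions, not taken from the MR):

```python
def pad_to_multiple(seq, multiple=16, pad_value=0):
    """Pad a sequence up to the next multiple; return (padded, pad_len)."""
    remainder = len(seq) % multiple
    if remainder == 0:
        return list(seq), 0
    pad_len = multiple - remainder
    # The caller slices off the last pad_len elements of the output.
    return list(seq) + [pad_value] * pad_len, pad_len
```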