Can't load GPT Model: ImportError: 'TP_REPLICATED_PARAMETER_PATTERNS' from 'deepspeed.checkpoint' #269

Open
@V3RGANz

Description

Loading the GPT model fails because it imports a name that does not exist in deepspeed:

  File "/workspace/megatron/megatron/model/gpt_model.py", line 27, in <module>
    from deepspeed.checkpoint import (
ImportError: cannot import name 'TP_REPLICATED_PARAMETER_PATTERNS' from 'deepspeed.checkpoint' (/usr/local/lib/python3.10/dist-packages/deepspeed/checkpoint/__init__.py)

This is the line that imports the constant:
https://github.com/microsoft/Megatron-DeepSpeed/blob/main/megatron/model/gpt_model.py#L30

However, the deepspeed package does not define this constant:
https://github.com/microsoft/DeepSpeed/blob/master/deepspeed/checkpoint/constants.py
I could not find where this constant comes from; it does not appear anywhere in the VCS history of the deepspeed package.
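
To check quickly whether a given deepspeed install exposes the constant, something like the sketch below may help. `has_export` is a hypothetical helper written for this report, not part of deepspeed or Megatron-DeepSpeed; it only assumes the module path and constant name shown in the traceback above.

```python
import importlib


def has_export(module_name: str, attr: str) -> bool:
    """Return True if `module_name` imports cleanly and exposes `attr`.

    Hypothetical helper for diagnosis only; returns False when the
    module itself cannot be imported.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False
    return hasattr(module, attr)


if __name__ == "__main__":
    # Names taken from the traceback in this issue.
    name = "TP_REPLICATED_PARAMETER_PATTERNS"
    found = has_export("deepspeed.checkpoint", name)
    print(f"deepspeed.checkpoint exports {name}: {found}")
```

Running this in the same environment as the failing job should show whether the installed version is the problem, without having to start Megatron itself.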

Megatron-DeepSpeed version: main branch
DeepSpeed version: 0.11.1
