Can't load GPT Model: ImportError: 'TP_REPLICATED_PARAMETER_PATTERNS' from 'deepspeed.checkpoint' #269
Open
opened on Oct 25, 2023
When I try to load the GPT model, it fails because it requires a deepspeed import that does not exist:
File "/workspace/megatron/megatron/model/gpt_model.py", line 27, in <module>
from deepspeed.checkpoint import (
ImportError: cannot import name 'TP_REPLICATED_PARAMETER_PATTERNS' from 'deepspeed.checkpoint' (/usr/local/lib/python3.10/dist-packages/deepspeed/checkpoint/__init__.py)
This is the line that imports the constant:
https://github.com/microsoft/Megatron-DeepSpeed/blob/main/megatron/model/gpt_model.py#L30
However, the deepspeed package does not contain this constant:
https://github.com/microsoft/DeepSpeed/blob/master/deepspeed/checkpoint/constants.py
I could not find where this constant comes from, because it never appears in the VCS history of the deepspeed package.
Megatron-DeepSpeed version: main branch
deepspeed version: 0.11.1
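To confirm the mismatch in a given environment, a small check like the one below can help. It is only a sketch using the standard library: the helper names `module_exports` and `installed_version` are made up for illustration and are not part of deepspeed or Megatron-DeepSpeed.

```python
import importlib
from importlib import metadata
from typing import Optional


def module_exports(module_name: str, attr: str) -> bool:
    """Hypothetical helper: True if `module_name` imports and exposes `attr`."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The package is not installed (or cannot be imported) here.
        return False
    return hasattr(module, attr)


def installed_version(dist_name: str) -> Optional[str]:
    """Hypothetical helper: installed version of a distribution, or None."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Names taken from the traceback above.
    print("deepspeed version:", installed_version("deepspeed"))
    print(
        "constant available:",
        module_exports("deepspeed.checkpoint", "TP_REPLICATED_PARAMETER_PATTERNS"),
    )
```

If the second line prints `False` while deepspeed itself is installed, the installed deepspeed does not provide the name that `gpt_model.py` tries to import.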