How is the FP16 model trained? #1985

Open
icestoneking opened this issue Aug 5, 2024 · 0 comments
Labels
question Further information is requested

Comments

@icestoneking

Notice: To help resolve issues more efficiently, please follow the template when raising an issue and fill in the details.

❓How is the FP16 model trained? Can I save the FP16 model as a normal model after training?

Before asking:

What is your question?

  1. Training with DDP:
    With ++train_conf.use_fp16=true, the final saved model is still FP32.
  2. Training with DeepSpeed fails with the following error:
    FunASR/funasr/models/sanm/attention.py", line 518, in forward
    [rank0]: inputs = inputs * mask
    [rank0]: ~~~~~~~^~~~~~
    [rank0]: RuntimeError: The size of tensor a (0) must match the size of tensor b (74) at non-singleton dimension 1
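
For context on item 1: under PyTorch automatic mixed precision, parameters are kept in FP32 and only selected operations run in FP16, so a checkpoint saved after such training is FP32 by design. Whether ++train_conf.use_fp16=true in FunASR is implemented this way is an assumption here, not something confirmed above. Under that assumption, a minimal sketch of converting a trained FP32 model to an FP16 checkpoint afterwards could look like this. The small `nn.Sequential` model and the helper name `to_fp16_state_dict` are hypothetical stand-ins, not FunASR code.

```python
import io

import torch
from torch import nn


def to_fp16_state_dict(model: nn.Module) -> dict:
    """Return a copy of the state dict with floating-point tensors cast to FP16.

    Hypothetical helper for illustration; integer buffers are left untouched.
    """
    return {
        name: tensor.half() if tensor.is_floating_point() else tensor
        for name, tensor in model.state_dict().items()
    }


def build_model() -> nn.Module:
    # Stand-in for a trained model (assumption: any nn.Module works the same way).
    return nn.Sequential(nn.Linear(8, 4), nn.ReLU(), nn.Linear(4, 2))


model = build_model()  # parameters are FP32, as after mixed-precision training
fp16_state = to_fp16_state_dict(model)

# Save to an in-memory buffer here; a file path works the same way.
buffer = io.BytesIO()
torch.save(fp16_state, buffer)
buffer.seek(0)
reloaded = torch.load(buffer)

# Load the FP16 weights into a model that has itself been cast to FP16.
half_model = build_model().half()
half_model.load_state_dict(reloaded)
```

The original model is left in FP32, so the same weights can still be saved as a regular FP32 checkpoint. Casting to FP16 after training can change numerical results slightly, so the converted model is worth validating before use.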

Code

What have you tried?

What's your environment?

  • OS (e.g., Linux):
  • FunASR Version (e.g., 1.0.0):
  • ModelScope Version (e.g., 1.11.0):
  • PyTorch Version (e.g., 2.0.0):
  • How you installed funasr (pip, source):
  • Python version:
  • GPU (e.g., V100M32)
  • CUDA/cuDNN version (e.g., cuda12.1):
  • Any other relevant information:
icestoneking added the question (Further information is requested) label on Aug 5, 2024