Support MUSA (Moore Threads GPU) backend in accelerate #2917

Merged: 1 commit, Jul 10, 2024

Conversation

@fmo-mt (Contributor) commented Jul 5, 2024

What does this PR do?

To train 🤗 Transformers models on MUSA (Moore Threads GPU), support must first be added to Accelerate; Trainer support then comes for free.

This PR adds MUSA support to Accelerate, following the same approach used for MLU support (#2552).
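
For context, the core of such a backend integration is an availability check plus device bookkeeping. Below is a minimal sketch of what the check might look like; the helper name and exact calls are illustrative, not necessarily what this PR adds:

```python
# Hypothetical sketch of a MUSA availability check, mirroring how other
# accelerator backends are typically detected. Assumes the torch_musa
# extension registers a "musa" device with PyTorch when imported.
import importlib.util


def is_musa_available() -> bool:
    """Return True if torch_musa is installed and a MUSA device is usable."""
    if importlib.util.find_spec("torch_musa") is None:
        return False
    import torch
    import torch_musa  # noqa: F401  # the import registers the backend

    return hasattr(torch, "musa") and torch.musa.is_available()
```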

1. Sample config after running the `accelerate config` command:

   ```yaml
   compute_environment: LOCAL_MACHINE
   debug: false
   distributed_type: MULTI_MUSA
   downcast_bf16: 'no'
   gpu_ids: 0,1,2,3,4,5,6,7
   machine_rank: 0
   main_training_function: main
   mixed_precision: 'no'
   num_machines: 1
   num_processes: 8
   rdzv_backend: static
   same_network: true
   tpu_env: []
   tpu_use_cluster: false
   tpu_use_sudo: false
   use_cpu: false
   ```
2. To train a bert-large-uncased model (a minimal usage sketch follows after this list):

   ```bash
   accelerate launch run_trainer.py \
       --model_name_or_path ./squad_finetuned_checkpoint \
       --dataset_name ./squad \
       --per_device_train_batch_size 24 \
       --learning_rate 3e-5 \
       --num_train_epochs 50 \
       --max_seq_length 384 \
       --doc_stride 128 \
       --lr_scheduler_type cosine \
       --output_dir ./bert-large-uncased |& tee bert-large-uncased.log
   ```

   Below are the output logs:

   ```text
   loading file vocab.txt
   loading file tokenizer.json
   loading file added_tokens.json
   loading file special_tokens_map.json
   loading file tokenizer_config.json
   loading configuration file ./squad_finetuned_checkpoint/config.json
   Model config BertConfig {
     "_name_or_path": "./squad_finetuned_checkpoint",
     "architectures": [
       "BertForQuestionAnswering"
     ],
     "attention_probs_dropout_prob": 0.1,
     "classifier_dropout": null,
     "hidden_act": "gelu",
     "hidden_dropout_prob": 0.1,
     "hidden_size": 1024,
     "initializer_range": 0.02,
     "intermediate_size": 4096,
     "layer_norm_eps": 1e-12,
     "max_position_embeddings": 512,
     "model_type": "bert",
     "num_attention_heads": 16,
     "num_hidden_layers": 24,
     "pad_token_id": 0,
     "position_embedding_type": "absolute",
     "transformers_version": "4.40.0",
     "type_vocab_size": 2,
     "use_cache": true,
     "vocab_size": 30522
   }
   07/05/2024 16:07:43 - INFO - torch.nn.parallel.distributed - Reducer buckets have been rebuilt in this iteration.
   [previous line repeated 6 more times, one per worker process]
   loss: 4.95459, lr: [2.999990586959975e-05, 2.999990586959975e-05]:   0%|          | 26/23100 [01:11<13:08:05,  2.05s/it]
   ```
3. About MUSA and Moore Threads GPU:
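
As a follow-up to the training example above: once Accelerate recognizes the backend, a standard `Accelerator`-based script should pick up the MUSA device without code changes. A minimal sketch, with illustrative model and hyperparameters, assuming torch_musa is installed:

```python
# Minimal sketch: Accelerator detects the available backend and moves the
# model/optimizer to it, so the training script stays device-agnostic.
import torch
from accelerate import Accelerator

accelerator = Accelerator()
model = torch.nn.Linear(1024, 2)
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)
model, optimizer = accelerator.prepare(model, optimizer)
print(accelerator.device)  # expected: e.g. "musa:0" on Moore Threads hardware
```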

@HuggingFaceDocBuilderDev commented

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@fmo-mt (Contributor, Author) commented Jul 8, 2024

@muellerzr @SunMarc Hi, buddies! Can you take a look at this PR, please?

@SunMarc (Member) left a comment


Thanks for the PR @fmo-mt! The integration looks very clean! Nice to see a new backend 🔥 Can you have a second look, @muellerzr? Also, I'm not sure if you are on the team working on torch_musa, but if that's the case, it would be great to spin up some runners on your side to make sure that we don't have failing Accelerate tests on MUSA hardware.

@muellerzr (Collaborator) left a comment


Same comment as Marc; very nice PR @fmo-mt!

@muellerzr (Collaborator) commented

For the quality check to pass, please run `pip install -e .[quality]; make style; make quality`.

@fmo-mt (Contributor, Author) commented Jul 9, 2024

> Thanks for the PR @fmo-mt! The integration looks very clean! Nice to see a new backend 🔥 Can you have a second look, @muellerzr? Also, I'm not sure if you are on the team working on torch_musa, but if that's the case, it would be great to spin up some runners on your side to make sure that we don't have failing Accelerate tests on MUSA hardware.

Yes, I'm currently working on torch_musa, and we have trained/fine-tuned models such as BERT and Mistral.

@fmo-mt (Contributor, Author) commented Jul 9, 2024

@muellerzr @SunMarc Oh, I fixed a typo and force-pushed with a rebase, which cleaned up the change history, but it seems the CI workflow needs to be activated by you 🥲

@SunMarc (Member) commented Jul 10, 2024

No issues! I'm merging!
