Modify device_id setting way to avoid ambiguity while setting device_id by env variable or yaml #425
Is the default value 0?
It is better to keep it at 7, which leaves card 0 free for distributed training.
The default value of device_id in yaml is suggested to be 7. If device_id is not specified in yaml, MindSpore will use device 0 by default.
I see, but in the table, the column represents the default number.
A default device_id of 7 in yaml is a bit strange. What if the user does not have 8 devices?
I see, but in the table, the column represents the default number.
If we have 'device_id' in yaml, the default value is 7; refer to configs/cls/mobilenetv3/cls_mv3.yaml.
I added more details to yaml_configuration.md to make this clearer.
A default device_id of 7 in yaml is a bit strange. What if the user does not have 8 devices?
As Rustam said, because it is inconvenient to assign different cards for distributed training on Ascend, the default device_id (for standalone training) in yaml should not be 0. At the same time, we don't know how many devices users have. So the compromise is to specify 'device_id=7' in yaml. If the user does not have 8 devices, an error is raised.
OK. But I still think this setting (defaulting to 7) is only convenient for us, and a bit strange for the user.
Thank you for your contribution to the MindOCR repo.
Before submitting this PR, please make sure:
Motivation
There are two ways to specify the device id for standalone training:
(1) Set the environment variable 'DEVICE_ID'.
(2) Set 'device_id' in the yaml config.
We give (1) the higher priority: (2) takes effect only when distribute=False (standalone training) and the environment variable 'DEVICE_ID' is NOT set.
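
The priority rule above can be sketched in a few lines of Python. This is only an illustration, not the code in this PR: the helper name `resolve_device_id` and its parameters are hypothetical, and treating an empty 'DEVICE_ID' as "not set" is an assumption.

```python
import os
from typing import Mapping, Optional


def resolve_device_id(
    yaml_device_id: Optional[int],
    distribute: bool,
    env: Optional[Mapping[str, str]] = None,
) -> Optional[int]:
    """Hypothetical helper: pick the device id according to the stated priority.

    Returns None when neither source applies, so the caller can leave the
    framework default in place.
    """
    env = os.environ if env is None else env

    # (1) The environment variable 'DEVICE_ID' has the higher priority.
    # Assumption: an empty value counts as "not set".
    env_value = env.get("DEVICE_ID")
    if env_value is not None and env_value.strip() != "":
        return int(env_value)

    # (2) 'device_id' from yaml applies only to standalone training
    # (distribute=False) when 'DEVICE_ID' is not set.
    if not distribute and yaml_device_id is not None:
        return int(yaml_device_id)

    return None
```

For example, `resolve_device_id(7, distribute=False, env={"DEVICE_ID": "3"})` picks the environment value, while the same call with `env={}` falls back to the yaml value.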