Modify Bert Docstring #914
Conversation
There are still a few small issues, but overall this is great!
- `start_logits` (Tensor):
    Labels for position (index) of the start of the labelled span. Positions are clamped to the sequence length.
    Positions outside of the sequence are not taken into account for computing the token classification loss.
    Its data type should be float32 and its shape is [batch_size, sequence_length].

- `end_logits` (Tensor):
    Labels for position (index) of the end of the labelled span. Positions are clamped to the sequence length.
    Positions outside of the sequence are not taken into account for computing the token classification loss.
    Its data type should be float32 and its shape is [batch_size, sequence_length].
The descriptions of start_logits and end_logits are wrong. They are not input labels; they are the logits the model outputs for every position of the input sequence.
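The distinction the reviewer draws can be sketched in a few lines: `start_logits`/`end_logits` are per-position model outputs, while the start/end positions are separate integer labels used only for the loss. This is a minimal numpy illustration with hypothetical shapes, not PaddleNLP's actual implementation:

```python
import numpy as np

batch_size, seq_len = 2, 8
rng = np.random.default_rng(0)

# start_logits / end_logits are *model outputs*: one score per input
# position, shape [batch_size, sequence_length], dtype float32.
start_logits = rng.standard_normal((batch_size, seq_len)).astype(np.float32)
end_logits = rng.standard_normal((batch_size, seq_len)).astype(np.float32)

# The predicted span is read off the logits, e.g. via argmax.
start_pred = start_logits.argmax(axis=-1)  # shape [batch_size]
end_pred = end_logits.argmax(axis=-1)

# *Labels* (hypothetically named start_positions here) are separate
# integer inputs, used only to compute the training loss:
start_positions = np.array([1, 3])

def cross_entropy(logits, labels):
    # log-softmax over the sequence dimension, then pick the label position
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

loss = cross_entropy(start_logits, start_positions)
```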
attention_mask (Tensor, optional):
    Mask used in multi-head attention to avoid performing attention on some unwanted positions,
    usually the paddings or the subsequent positions.
    The values should be either 0 or 1.

    - **1** for tokens that are **not masked**,
    - **0** for tokens that are **masked**.

    Its data type should be float32 and its shape is [batch_size, num_attention_heads, sequence_length, sequence_length].
    Defaults to `None`.
The dtype here can be int, float, or bool.
Any shape that can be broadcast to the corresponding shape is fine.
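The broadcasting point can be made concrete: a per-token mask of shape [batch_size, sequence_length] only needs two inserted axes to broadcast against the full [batch_size, num_heads, seq_len, seq_len] score tensor. A numpy sketch (an illustration of the general mechanism, not Paddle's internals):

```python
import numpy as np

# Attention scores: [batch_size, num_heads, seq_len, seq_len]
batch, heads, seq = 2, 4, 5
scores = np.zeros((batch, heads, seq, seq), dtype=np.float32)

# A per-token mask of shape [batch_size, seq_len], 1 = keep, 0 = padding.
token_mask = np.array([[1, 1, 1, 0, 0],
                       [1, 1, 1, 1, 0]])
# Insert singleton head and query axes: [batch, 1, 1, seq] now
# broadcasts against the full score tensor.
mask4d = token_mask[:, None, None, :]

# Convert to additive form: 0 where kept, a large negative where masked,
# so masked positions get ~zero weight after softmax.
additive = (1 - mask4d) * -1e9
masked_scores = scores + additive          # broadcasts to [2, 4, 5, 5]

weights = np.exp(masked_scores)
weights /= weights.sum(axis=-1, keepdims=True)
```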
done
What is the behavior when the dtype is float or bool, respectively?
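For context on why the question matters, frameworks commonly use one of two conventions for float vs. bool masks; which one applies here is exactly what the reviewer is asking, so the following is only an illustration of the two conventions, not a statement about Paddle's behavior:

```python
import numpy as np

scores = np.array([0.5, 1.0, 0.2, 0.3], dtype=np.float32)

# Convention A (bool/int mask): True/1 marks positions to keep; the
# framework turns it into a large-negative additive mask internally.
bool_mask = np.array([True, True, False, False])
masked_a = np.where(bool_mask, scores, -1e9)

# Convention B (float mask): the mask is added to the scores as-is, so
# callers pass 0.0 for kept positions and a large negative for masked ones.
float_mask = np.array([0.0, 0.0, -1e9, -1e9], dtype=np.float32)
masked_b = scores + float_mask

# Under these conventions the two forms are equivalent.
```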
LGTM