`docs/source/en/add_new_model.md` (1 addition, 1 deletion)
@@ -101,7 +101,7 @@ own regarding how code should be written :-)
1. The forward pass of your model should be fully written in the modeling file while being fully independent of other
models in the library. If you want to reuse a block from another model, copy the code and paste it with a
`# Copied from` comment on top (see [here](https://github.com/huggingface/transformers/blob/v4.17.0/src/transformers/models/roberta/modeling_roberta.py#L160)
-for a good example).
+for a good example and [there](pr_checks#check-copies) for more documentation on Copied from).
2. The code should be fully understandable, even by a non-native English speaker. This means you should pick
descriptive variable names and avoid abbreviations. As an example, `activation` is preferred to `act`.
One-letter variable names are strongly discouraged unless it's an index in a for loop.
`docs/source/en/pr_checks.md` (55 additions, 0 deletions)
@@ -142,3 +142,58 @@ Additional checks concern PRs that add new models, mainly that:
- All checkpoints used actually exist on the Hub
-->
### Check copies
Since the Transformers library is very opinionated with respect to model code, and each model should be fully implemented in a single file without relying on other models, we have added a mechanism that checks whether a copy of the code of a layer of a given model stays consistent with the original. This way, when there is a bug fix, we can see all other impacted models and choose to trickle down the modification or break the copy.
<Tip>
If a file is a full copy of another file, you should register it in the constant `FULL_COPIES` of `utils/check_copies.py`.
</Tip>
This mechanism relies on comments of the form `# Copied from xxx`. The `xxx` should contain the whole path to the class or function being copied below. For instance, `RobertaSelfOutput` is a direct copy of the `BertSelfOutput` class, so you can see [here](https://github.com/huggingface/transformers/blob/2bd7a27a671fd1d98059124024f580f8f5c0f3b5/src/transformers/models/roberta/modeling_roberta.py#L289) that it has the comment:
```py
# Copied from transformers.models.bert.modeling_bert.BertSelfOutput
```
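To make the placement concrete, here is a minimal sketch of where the comment sits in the copied file (not the actual Roberta code; the class body, an exact copy of `BertSelfOutput`, is elided):

```py
from torch import nn


# Copied from transformers.models.bert.modeling_bert.BertSelfOutput
class RobertaSelfOutput(nn.Module):
    ...  # exact copy of the body of `BertSelfOutput`
```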
Note that instead of applying this to a whole class, you can apply it to the relevant methods that are copied. For instance, [here](https://github.com/huggingface/transformers/blob/2bd7a27a671fd1d98059124024f580f8f5c0f3b5/src/transformers/models/roberta/modeling_roberta.py#L598) you can see how `RobertaPreTrainedModel._init_weights` is copied from the same method in `BertPreTrainedModel` with the comment:
```py
# Copied from transformers.models.bert.modeling_bert.BertPreTrainedModel._init_weights
```
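As a similar sketch (body elided), the comment then sits directly above the copied method, inside a class that is otherwise not a copy:

```py
from transformers import PreTrainedModel


class RobertaPreTrainedModel(PreTrainedModel):
    # Copied from transformers.models.bert.modeling_bert.BertPreTrainedModel._init_weights
    def _init_weights(self, module):
        ...  # exact copy of the body of `BertPreTrainedModel._init_weights`
```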
Sometimes the copy is exactly the same except for names: for instance, in `RobertaAttention` we use `RobertaSelfAttention` instead of `BertSelfAttention`, but other than that, the code is exactly the same. This is why `# Copied from` supports simple string replacements with the following syntax: `Copied from xxx with foo->bar`. This means the code is copied with all instances of `foo` replaced by `bar`. You can see how it's used [here](https://github.com/huggingface/transformers/blob/2bd7a27a671fd1d98059124024f580f8f5c0f3b5/src/transformers/models/roberta/modeling_roberta.py#L304C1-L304C86) in `RobertaAttention` with the comment:
```py
# Copied from transformers.models.bert.modeling_bert.BertAttention with Bert->Roberta
```
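To illustrate the effect of the replacement, here is a simplified sketch (the real `__init__` methods take more arguments) of how a line from the original ends up in the copy:

```py
# In `BertAttention` (the original):
#     self.self = BertSelfAttention(config)
# In `RobertaAttention` (the copy, after applying `Bert->Roberta`):
#     self.self = RobertaSelfAttention(config)
```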
Note that there shouldn't be any spaces around the arrow (unless that space is part of the pattern to replace of course).
You can add several patterns separated by a comma. For instance, `CamembertForMaskedLM` is a direct copy of `RobertaForMaskedLM` with two replacements: `Roberta` to `Camembert` and `ROBERTA` to `CAMEMBERT`. You can see [here](https://github.com/huggingface/transformers/blob/15082a9dc6950ecae63a0d3e5060b2fc7f15050a/src/transformers/models/camembert/modeling_camembert.py#L929) that this is done with the comment:
```py
# Copied from transformers.models.roberta.modeling_roberta.RobertaForMaskedLM with Roberta->Camembert, ROBERTA->CAMEMBERT
```
When the order matters (because one of the replacements might conflict with a previous one), keep in mind that the replacements are executed from left to right.
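For instance, with a hypothetical pair of overlapping patterns (not taken from a real model), the left-to-right order changes the result:

```py
# Hypothetical: Copied from ... with Bert->Roberta, RobertaLayer->CamembertLayer
#     `BertLayer` -> `RobertaLayer` (first pattern) -> `CamembertLayer` (second pattern)
# With the patterns swapped, `RobertaLayer` matches nothing in the original code,
# so `BertLayer` would only become `RobertaLayer`.
```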
<Tip>
If the replacements change the formatting (for instance, if you replace a short name with a very long name), the copy is checked after applying the auto-formatter.
</Tip>
When the patterns are just different casings of the same replacement (with uppercased and lowercased variants), another option is to add `all-casing`. [Here](https://github.com/huggingface/transformers/blob/15082a9dc6950ecae63a0d3e5060b2fc7f15050a/src/transformers/models/mobilebert/modeling_mobilebert.py#L1237) is an example in `MobileBertForSequenceClassification` with the comment:
```py
# Copied from transformers.models.bert.modeling_bert.BertForSequenceClassification with Bert->MobileBert all-casing
```
In this case, the code is copied from `BertForSequenceClassification` by replacing:
- `Bert` by `MobileBert` (for instance when using `MobileBertModel` in the init)
- `bert` by `mobilebert` (for instance when defining `self.mobilebert`)
- `BERT` by `MOBILEBERT` (in the constant `MOBILEBERT_INPUTS_DOCSTRING`)
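Putting it together, here is a sketch (simplified from the real classes) of the kind of lines these three replacements touch:

```py
# In `BertForSequenceClassification` (the original):
#     self.bert = BertModel(config)
# In `MobileBertForSequenceClassification` (the copy, with `Bert->MobileBert all-casing`):
#     self.mobilebert = MobileBertModel(config)
# and references to `BERT_INPUTS_DOCSTRING` become `MOBILEBERT_INPUTS_DOCSTRING`.
```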