How can I save a fine-tuned CrossEncoder to HF and then download it from HF? #2499
Comments
Hello! Indeed, the CrossEncoder does not have a native push_to_hub method yet. In the meantime, you can push the underlying model and tokenizer separately:

```python
from sentence_transformers import CrossEncoder

model = CrossEncoder("the/path/to/my/local/model")

# An example repo_id:
repo_id = "tomaarsen/my_cross_encoder"
model.model.push_to_hub(repo_id)
model.tokenizer.push_to_hub(repo_id)
```

And then you can load your model like so:

```python
from sentence_transformers import CrossEncoder

repo_id = "tomaarsen/my_cross_encoder"
model = CrossEncoder(repo_id)
```

I will be adding proper push_to_hub support.
I managed to push the tokenizer to HF separately; the full code is a bit long:
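The full code is elided above; a minimal sketch of pushing just the tokenizer, assuming it was saved in the local model directory (the helper name is illustrative, not from the thread):

```python
from transformers import AutoTokenizer


def push_tokenizer_to_hub(local_path: str, repo_id: str) -> None:
    """Upload only the tokenizer files (tokenizer config, vocab, etc.)
    to an existing Hub repo. Hypothetical helper for illustration."""
    tokenizer = AutoTokenizer.from_pretrained(local_path)
    tokenizer.push_to_hub(repo_id)


# Example (requires authenticating with the Hub first, e.g. `huggingface-cli login`):
# push_tokenizer_to_hub("path/to/local/model", "your-username/your-repo")
```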
That should indeed be equivalent. Looks good!
I'm happy to work on this if no one else has volunteered!
That would be great!
It looks like this issue was resolved by #2524. Is there any interest in adding a test case for the …
I'm looking for a way to share a fine-tuned cross-encoder with my teacher.
The CrossEncoder model does not have a native push_to_hub() method, so I decided to use a general approach:
Now the model is available on HF. The commit info looked like this:
CommitInfo(commit_url='https://huggingface.co/satyroffrost/crerankingeval-30e-4000-ms-marco-MiniLM-L-6-v2/commit/d81fe317cb037940e09db256d8a0926e80c358e5', commit_message='Upload BertForSequenceClassification', commit_description='', oid='d81fe317cb037940e09db256d8a0926e80c358e5', pr_url=None, pr_revision=None, pr_num=None)
Then I decided to verify that the model works:
and got the following error:
```
Traceback (most recent call last):
  Cell In[18], line 1
    cross_ft_model = CrossEncoder("satyroffrost/crerankingeval-30e-4000-ms-marco-MiniLM-L-6-v2")
  File ~\anaconda3\Lib\site-packages\sentence_transformers\cross_encoder\CrossEncoder.py:72 in __init__
    self.tokenizer = AutoTokenizer.from_pretrained(model_name, **tokenizer_args)
  File ~\anaconda3\Lib\site-packages\transformers\models\auto\tokenization_auto.py:745 in from_pretrained
    return tokenizer_class_fast.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
  File ~\anaconda3\Lib\site-packages\transformers\tokenization_utils_base.py:1838 in from_pretrained
    raise EnvironmentError(
OSError: Can't load tokenizer for 'satyroffrost/crerankingeval-30e-4000-ms-marco-MiniLM-L-6-v2'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure 'satyroffrost/crerankingeval-30e-4000-ms-marco-MiniLM-L-6-v2' is the correct path to a directory containing all relevant files for a BertTokenizerFast tokenizer.
```
I compared the local model folder with the uploaded HF model files; the uploaded repo does not include the tokenizer files. The uploaded model doesn't work on HF either. How can I correctly upload the model together with its tokenizer to HF and then use it from HF like model = CrossEncoder(path_to_hf)?