Releases: rubingshen/AugmentedSocialScientist
v2.2.1
v2.2.0
Add a `self.load_model()` method to load a locally saved model.
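A minimal sketch of how the new method might be called, assuming it takes the path of a model previously saved by `run_training()` (the exact signature is not shown in these notes; check the README):

```python
from AugmentedSocialScientist.models import Bert

bert = Bert()
# Hypothetical argument: assumes load_model() accepts the path of a locally
# saved model directory; the actual signature may differ.
bert.load_model("./models/my_saved_model")
```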
v2.1.0
Add an `add_special_tokens: bool` parameter to `encode()`.
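For illustration, a hedged sketch of passing the new flag; the other arguments to `encode()` (a list of texts and a list of labels) are assumed from typical usage and may differ from the actual signature:

```python
from AugmentedSocialScientist.models import Bert

bert = Bert()
texts = ["first example sentence", "second example sentence"]
labels = ["label_a", "label_b"]
# Assumed call pattern: the new boolean flag controls whether BERT's special
# tokens ([CLS], [SEP]) are added during tokenization.
loader = bert.encode(texts, labels, add_special_tokens=True)
```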
v2.0.2
Add type hints.
v2.0.1
Fix the "unexpected argument `model_name`" bug when using a custom model from Hugging Face.
New release v2
The package is rewritten in an object-oriented way for more readability, extensibility and flexibility.
- All models are grouped together in the module `AugmentedSocialScientist.models`. To use a model (see the README for the list):

  ```python
  from AugmentedSocialScientist.models import Bert

  bert = Bert()  # instantiation
  ```

  Everything else remains unchanged for users to train a model: `bert.encode()` to preprocess data, `bert.run_training()` to train, validate and save a model, `bert.predict_with_model()` to make predictions (see the end-to-end sketch after this list).
- Flexibility
  - To use a custom model from Hugging Face, set the `model_name` argument when instantiating the model. For example, to use the Danish BERT model `DJSammy/bert-base-danish-uncased_BotXO-ai` from Hugging Face:

    ```python
    from AugmentedSocialScientist.models import Bert

    bert = Bert(model_name="DJSammy/bert-base-danish-uncased_BotXO-ai")
    ```
  - Users can now set their own device by passing a custom `torch.device` object to the `device` argument when instantiating the model:

    ```python
    from AugmentedSocialScientist.models import Bert

    bert = Bert(device=...)  # your own device
    ```
  - The input classification labels can now also be textual labels. They will be automatically converted to the corresponding label ids (integers starting from 0) by the `encode()` method. The dictionary of labels `{label_name: label_id}` is printed during preprocessing and saved to the attribute `self.dict_labels`, as illustrated in the sketch after this list.
- All dependencies on TensorFlow and Keras are removed.
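As referenced above, a hedged end-to-end sketch of the v2 workflow, also illustrating textual labels. The method names (`encode()`, `run_training()`, `predict_with_model()`) and the `dict_labels` attribute come from these notes; the specific parameter names (`n_epochs`, `save_model_as`, the model path passed at prediction time) are assumptions for illustration and may differ from the actual signatures.

```python
from AugmentedSocialScientist.models import Bert

bert = Bert()

train_texts = ["the food was great", "terrible service", "lovely evening"]
train_labels = ["positive", "negative", "positive"]  # textual labels are accepted in v2
test_texts = ["awful experience", "wonderful meal"]
test_labels = ["negative", "positive"]

# encode() preprocesses the data; textual labels are converted to integer ids
# starting from 0, and the mapping is stored in bert.dict_labels.
train_loader = bert.encode(train_texts, train_labels)
test_loader = bert.encode(test_texts, test_labels)
print(bert.dict_labels)  # e.g. {'negative': 0, 'positive': 1}

# run_training() trains, validates and saves the model; n_epochs and
# save_model_as are assumed parameter names.
bert.run_training(train_loader, test_loader, n_epochs=3,
                  save_model_as="sentiment_model")

# predict_with_model() makes predictions; the second argument (the path of
# the saved model) is likewise an assumption.
pred_loader = bert.encode(["what a fantastic place"])
predictions = bert.predict_with_model(pred_loader, "./models/sentiment_model")
```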
New languages
Add models for other languages:
- `arabic_bert` for Arabic
- `chinese_bert` for Chinese
- `german_bert` for German
- `hindi_bert` for Hindi
- `italian_bert` for Italian
- `portuguese_bert` for Portuguese
- `russian_bert` for Russian
- `spanish_bert` for Spanish
- `swedish_bert` for Swedish
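These presumably follow the same pattern as `Bert` above; a minimal sketch, assuming each listed name can be imported from `AugmentedSocialScientist.models` and instantiated directly (the exact import names and casing are not confirmed by these notes, so check the README):

```python
# Assumption: the language-specific models are exposed under the names listed
# above; the actual import names may differ (see the README).
from AugmentedSocialScientist.models import arabic_bert

model = arabic_bert()  # then use encode(), run_training(), predict_with_model() as usual
```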
v1.0.1
Update `setup.py`.