- Docker - Docker installation guide
- TensorFlow - TensorFlow installation guide
- TensorFlow Serving - TensorFlow Serving installation guide
- HuggingFace - HuggingFace installation guide
*TensorFlow Serving requires Docker
*The parameters below refer to the "DistilBERT (embedding)" sample model
```sh
MODEL_SOURCE=$(pwd)/models/embedding/saved_model/1 MODEL_TARGET=/models/embedding/1 MODEL_NAME=embedding sh scripts/start_tf_serving.sh
```
Parameters:

- `MODEL_SOURCE`: path to the model on your local system.
- `MODEL_TARGET`: path to the model inside the Docker environment.
- `MODEL_NAME`: model name used by TF Serving; this name becomes part of the API URL.
When you are finished, run `docker ps` to list the active containers and `docker stop <container_id>` to stop the serving container.
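Once the container is up, you can check that the model is loaded before sending any predictions. A minimal sketch, assuming TF Serving's default REST port (8501) and the `embedding` model name from the example above:

```python
import requests

# TF Serving's model status endpoint; swap "embedding" for your MODEL_NAME.
STATUS_URL = "http://localhost:8501/v1/models/embedding"

response = requests.get(STATUS_URL)
response.raise_for_status()

# A servable model reports state "AVAILABLE" under model_version_status.
print(response.json())
```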
If you don't have a model to serve, you can create one of the sample models (a sketch of roughly what these exporters do follows the list):
- DistilBERT (embedding): `python sample_models/text_models.py get_distilbert_embedding`
- DistilBERT (sequence classification): `python sample_models/text_models.py get_distilbert_sequence_classification`
- DistilBERT (token classification - NER): `python sample_models/text_models.py get_distilbert_token_classification`
- DistilBERT (multiple choice): `python sample_models/text_models.py get_distilbert_multiple_choice`
- DistilBERT (question answering): `python sample_models/text_models.py get_distilbert_qa`
- DistilGPT2 (text generation): `python sample_models/text_models.py get_distilgpt2_text_generation`
- DistilBERT (custom): `python sample_models/text_models.py get_distilbert_custom`
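For reference, an exporter like `get_distilbert_embedding` might look roughly like the sketch below. This is an illustration of the SavedModel layout the start command expects, not the script's actual code; it assumes the `transformers` TensorFlow API:

```python
from transformers import AutoTokenizer, TFDistilBertModel

# Load the pretrained DistilBERT encoder with TensorFlow weights.
model = TFDistilBertModel.from_pretrained("distilbert-base-uncased")

# saved_model=True additionally exports a TF SavedModel under
# models/embedding/saved_model/1 -- the MODEL_SOURCE layout used above.
model.save_pretrained("models/embedding", saved_model=True)

# Save the matching tokenizer so TOKENIZER_PATH can point at it later.
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
tokenizer.save_pretrained("./tokenizers/distilbert-base-uncased")
```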
There are two ways to access the model and run inference:
- Use the notebook at `notebooks/text_inference.ipynb`
- Run the app script for your specific use case from the `gradio_apps` folder
Available use cases:

- Text:
  - Generic

    ```sh
    TF_URL="http://localhost:8501/v1/models/embedding:predict" TOKENIZER_PATH="./tokenizers/distilbert-base-uncased" python gradio_apps/text_app.py
    ```

  - Token classification (NER)

    ```sh
    TF_URL="http://localhost:8501/v1/models/token_classification:predict" TOKENIZER_PATH="./tokenizers/distilbert-base-uncased" python gradio_apps/text_ner_app.py
    ```

  - Multiple choice

    ```sh
    TF_URL="http://localhost:8501/v1/models/multiple_choice:predict" TOKENIZER_PATH="./tokenizers/distilbert-base-uncased" python gradio_apps/text_multiple_choice_app.py
    ```

  - Question answering

    ```sh
    TF_URL="http://localhost:8501/v1/models/qa:predict" TOKENIZER_PATH="./tokenizers/distilbert-base-uncased" python gradio_apps/text_qa_app.py
    ```

  - Text generation

    ```sh
    TF_URL="http://localhost:8501/v1/models/text_generation:predict" TOKENIZER_PATH="./tokenizers/distilgpt2" python gradio_apps/text_generation_app.py
    ```
*_To stay generic, predictions from the Gradio apps return the raw model outputs._
*The Gradio apps require you to define two environment variables:

- `TF_URL`: REST API URL provided by your TF Serving instance.
  - e.g. `"http://localhost:8501/v1/models/embedding:predict"` (swap `embedding` for your model's name)
- `TOKENIZER_PATH`: path to the tokenizer on your local system.
  - e.g. `"./tokenizers/distilbert-base-uncased"`
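The same two variables are all you need to call the model directly, without Gradio. A minimal sketch, assuming the `embedding` sample model and that its SavedModel exposes the usual `input_ids`/`attention_mask` inputs:

```python
import os

import requests
from transformers import AutoTokenizer

# Same settings the Gradio apps read; the defaults here match the
# "DistilBERT (embedding)" example.
TF_URL = os.environ.get("TF_URL", "http://localhost:8501/v1/models/embedding:predict")
TOKENIZER_PATH = os.environ.get("TOKENIZER_PATH", "./tokenizers/distilbert-base-uncased")

tokenizer = AutoTokenizer.from_pretrained(TOKENIZER_PATH)
encoded = tokenizer("Hello, TF Serving!", return_tensors="np")

# TF Serving's REST predict API takes {"instances": [...]}: one dict of
# named inputs per example.
payload = {
    "instances": [
        {
            "input_ids": encoded["input_ids"][0].tolist(),
            "attention_mask": encoded["attention_mask"][0].tolist(),
        }
    ]
}

response = requests.post(TF_URL, json=payload)
response.raise_for_status()

# Raw model outputs, exactly what the Gradio apps surface.
print(response.json()["predictions"])
```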