Conversion of Huggingface bigcode/santacoder to Nvidia Triton Inference server #17

michaelfeil · 2023-01-24T09:41:34Z

Thanks for publishing the model on Hugging Face. I would like to serve it with the Triton Inference Server in products like https://github.com/fauxpilot/fauxpilot.

Do you have a preferred way to convert it for the NVIDIA Triton Inference Server (e.g. with https://github.com/triton-inference-server/fastertransformer_backend), starting, for example, from the Hugging Face checkpoint?

from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "bigcode/santacoder",
    revision="no-fim",  # name of branch or commit hash
    trust_remote_code=True,  # the checkpoint ships custom model code
)

mayank31398 pushed a commit to mayank31398/BigCode-Megatron-LM that referenced this issue Jun 20, 2023

Curriculum learning support (bigcode-project#17)

4d6a08d

* initial commit * script fix

mayank31398 pushed a commit to mayank31398/BigCode-Megatron-LM that referenced this issue Jun 21, 2023

Curriculum learning support (bigcode-project#17)

71aa558

* initial commit * script fix
