Issues: tenstorrent/tt-inference-server
Add evals support for Meta Llama 3.x eval datasets in lm-evaluation-harness
  #81 (enhancement: New feature or request) opened Jan 28, 2025 by tstescoTT
setup.sh should check the host for sufficient disk space when setting up a new model
  #76 (enhancement) opened Jan 22, 2025 by tstescoTT
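A check along these lines could look as follows (sketched in Python for illustration; setup.sh would do the equivalent in bash). The path and the required-space threshold are placeholder assumptions, not values from the issue:

```python
# Hypothetical disk-space guard; the path and threshold are assumptions.
import shutil
import sys

def check_disk_space(path, required_gb):
    """Return True if the filesystem holding `path` has at least `required_gb` free."""
    free_gb = shutil.disk_usage(path).free / 1024**3
    if free_gb < required_gb:
        print(f"Error: only {free_gb:.1f} GB free at {path}; "
              f"need at least {required_gb} GB.", file=sys.stderr)
        return False
    print(f"Disk check passed: {free_gb:.1f} GB available at {path}.")
    return True

if not check_disk_space("/tmp", 1):  # placeholder path and threshold
    sys.exit(1)
```

Failing fast here, before any weights are downloaded, avoids a partially written model directory.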
setup.sh should detect an HF 401 error and print a clear message when a gated repo (e.g. the Llama 3.x repos) is not authorized
  #75 (enhancement) opened Jan 21, 2025 by tstescoTT
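A helper in this spirit could map the Hub's HTTP status onto an actionable message (the repo name, hint text, and curl command are illustrative assumptions):

```python
# Hypothetical helper: translate an HTTP status from the Hugging Face Hub
# into an actionable setup error. Repo name and hint text are illustrative.
import sys

def check_hf_access(repo_id, status_code):
    """Return True if `status_code` indicates the (possibly gated) repo is accessible."""
    if status_code == 200:
        print(f"OK: access to {repo_id} confirmed.")
        return True
    if status_code == 401:
        print(f"Error: HTTP 401 from Hugging Face for {repo_id}.\n"
              f"Hint: set a valid HF_TOKEN and request access to the gated repo.",
              file=sys.stderr)
        return False
    print(f"Error: unexpected HTTP {status_code} for {repo_id}.", file=sys.stderr)
    return False

# In setup.sh the status code would come from a real request, e.g. with curl:
#   curl -s -o /dev/null -w "%{http_code}" \
#        -H "Authorization: Bearer $HF_TOKEN" \
#        https://huggingface.co/api/models/meta-llama/Llama-3.1-8B
ok = check_hf_access("meta-llama/Llama-3.1-8B", 200)
```

Distinguishing 401 from other failures matters because the fix (requesting repo access and exporting a token) is on the user's side, not a transient network error.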
Verify device topology automatically
  #74 (enhancement) opened Jan 21, 2025 by tstescoTT
MESH_DEVICE management for Llama 3.x implementations
  #73 (enhancement) opened Jan 21, 2025 by tstescoTT
vLLM run script should pre-capture prefill + decode traces so TTFT on first completions is not unexpectedly high or stalled
  #56 (enhancement) opened Dec 12, 2024 by tstescoTT
Provide example chat template usage
  #36 (documentation: Improvements or additions to documentation; enhancement) opened Nov 15, 2024 by tstescoTT
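Such an example could show chat-template rendering by hand. With Hugging Face transformers installed, `tokenizer.apply_chat_template(...)` does this from the model's bundled template; the manual version below follows the Llama 3.x tag convention and is shown purely for illustration:

```python
# Minimal illustration of rendering a chat template by hand.
# Tag format follows the Llama 3.x convention; for real use, prefer the
# tokenizer's own apply_chat_template so the template matches the model.

def apply_llama3_chat_template(messages):
    """Render a list of {role, content} dicts into a Llama 3.x style prompt."""
    parts = ["<|begin_of_text|>"]
    for msg in messages:
        parts.append(f"<|start_header_id|>{msg['role']}<|end_header_id|>\n\n"
                     f"{msg['content']}<|eot_id|>")
    # Cue the model to generate the assistant turn next
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

prompt = apply_llama3_chat_template([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is Tenstorrent?"},
])
```

Documenting the exact special tokens helps clients debug malformed prompts, since a missing header or end-of-turn token silently degrades completions.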
Add status messaging and an endpoint so client-side users can reason about model initialization and lifecycle
  #17 (enhancement) opened Sep 26, 2024 by tstescoTT
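One shape this could take is a small lifecycle tracker whose serialized state a GET /status handler returns; the state names and JSON fields below are assumptions, not from the server's code:

```python
# Hypothetical model-lifecycle tracker; state names and payload fields are
# assumptions, not taken from the server's actual implementation.
import json
from enum import Enum

class ModelStatus(Enum):
    INITIALIZING = "initializing"   # weights loading, caches compiling
    READY = "ready"                 # accepting completion requests
    ERROR = "error"                 # unrecoverable failure, see detail

class StatusTracker:
    def __init__(self):
        self.status = ModelStatus.INITIALIZING
        self.detail = "loading model weights"

    def set(self, status, detail=""):
        self.status, self.detail = status, detail

    def to_response(self):
        # Payload a GET /status handler could return to clients
        return json.dumps({"status": self.status.value, "detail": self.detail})

tracker = StatusTracker()
tracker.set(ModelStatus.READY, "model initialized")
```

With this, clients can poll the endpoint and hold requests until the state reaches "ready" instead of guessing from connection failures.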
Capture tt-metal and tt-NN loguru logs in the inference server's Python log files
  #13 (enhancement) opened Sep 25, 2024 by tstescoTT
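The forwarding pattern could be a sink that re-emits records into the server's standard `logging` setup; with loguru installed this would be registered via `loguru.logger.add(loguru_sink)`. The sketch below uses only the stdlib, and the log file name is an assumption:

```python
# Sketch: forward third-party log records into the inference server's own
# Python log file. With loguru, register via loguru.logger.add(loguru_sink).
import logging

server_logger = logging.getLogger("inference_server")
server_logger.setLevel(logging.DEBUG)
handler = logging.FileHandler("inference_server.log")  # assumed log path
handler.setFormatter(
    logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s"))
server_logger.addHandler(handler)

def loguru_sink(message):
    """Sink for loguru.logger.add(): re-emit the record via stdlib logging."""
    record = getattr(message, "record", None)  # loguru messages carry .record
    level = record["level"].name if record else "INFO"
    text = record["message"] if record else str(message).rstrip("\n")
    server_logger.log(getattr(logging, level, logging.INFO), text)
```

Routing everything through one handler keeps tt-metal/tt-NN output interleaved with the server's own messages in a single, timestamped file.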