-
Notifications
You must be signed in to change notification settings - Fork 816
[CLI] Add Inference Endpoints Commands #3428
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: v1.0-release
Are you sure you want to change the base?
Conversation
|
||
|
||
@deploy_app.command(name="hub", help="Deploy an Inference Endpoint from a Hub repository.") | ||
def deploy_from_hub( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
splitted the deploy into two subcommands instead of using a flag to deploy from the Model Catalog because Typer doesn't easily allow conditional requirements (i.e., "these parameters are required unless that flag (e.g. --from-catalog
) is set"), which makes validation and type hints messy so using subcommands lets Typer enforce required options cleanly for each case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what do you think of having
# deploy from Hub
hf endpoints deploy
# list catalog
hf endpoints catalog ls
# deploy from catalog
hf endpoints catalog deploy
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
other if keeping the same I'd be more inclined for
hf endpoints deploy from-repo
hf endpoints deploy from-catalog
(since both are deployed from Hub + the from-
makes it more explicit)
(but still slight preference for the one above)
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this! Left some high level feedback mostly geared towards the CLI syntax. Let me know what you think :)
NamespaceOpt = Annotated[ | ||
Optional[str], | ||
typer.Option( | ||
help="The namespace where the Inference Endpoint will be created. Defaults to the current user's namespace.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
help="The namespace where the Inference Endpoint will be created. Defaults to the current user's namespace.", | |
help="The namespace associated with the Inference Endpoint. Defaults to the current user's namespace.", |
less specific too endpoint creation
@app.command(help="Lists all Inference Endpoints for the given namespace.") | ||
def list( | ||
namespace: NamespaceOpt = None, | ||
token: TokenOpt = None, | ||
) -> None: | ||
api = get_hf_api(token=token) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@app.command(help="Lists all Inference Endpoints for the given namespace.") | |
def list( | |
namespace: NamespaceOpt = None, | |
token: TokenOpt = None, | |
) -> None: | |
api = get_hf_api(token=token) | |
@app.command() | |
def list( | |
namespace: NamespaceOpt = None, | |
token: TokenOpt = None, | |
) -> None: | |
"""Lists all Inference Endpoints for the given namespace.""" | |
api = get_hf_api(token=token) |
(nit) slight preference for docstring help. It makes it easier if we want to extend it to multiline in the future
(same for other commands)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
potentially rename to ls
?
EDIT: naah, it's good like this. We are not listing a filesystem
) | ||
|
||
|
||
deploy_app = typer_factory(help="Deploy Inference Endpoints from the Hub or the Catalog.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
deploy_app = typer_factory(help="Deploy Inference Endpoints from the Hub or the Catalog.") | |
deploy_app = typer_factory(help="Deploy a new Inference Endpoint.") |
|
||
|
||
@deploy_app.command(name="hub", help="Deploy an Inference Endpoint from a Hub repository.") | ||
def deploy_from_hub( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what do you think of having
# deploy from Hub
hf endpoints deploy
# list catalog
hf endpoints catalog ls
# deploy from catalog
hf endpoints catalog deploy
?
|
||
|
||
@deploy_app.command(name="hub", help="Deploy an Inference Endpoint from a Hub repository.") | ||
def deploy_from_hub( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
other if keeping the same I'd be more inclined for
hf endpoints deploy from-repo
hf endpoints deploy from-catalog
(since both are deployed from Hub + the from-
makes it more explicit)
(but still slight preference for the one above)
typer.Option( | ||
help="Skip confirmation prompts.", | ||
), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typer.Option( | |
help="Skip confirmation prompts.", | |
), | |
typer.Option("--yes", help="Skip confirmation prompts."), |
explicit --yes
avoids to have a --no-yes
defined https://typer.tiangolo.com/tutorial/parameter-types/bool/
|
||
|
||
@app.command(help="List available Catalog models.") | ||
def list_catalog( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Slight preference for
hf endpoints catalog list
app.add_typer(repo_cli, name="repo") | ||
app.add_typer(repo_files_cli, name="repo-files") | ||
app.add_typer(jobs_cli, name="jobs") | ||
app.add_typer(inference_endpoints_cli, name="inference-endpoints") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
app.add_typer(inference_endpoints_cli, name="inference-endpoints") | |
app.add_typer(inference_endpoints_cli, name="endpoints") |
what do you think of renaming everything to hf endpoints
? It is slightly less explicitly but much simpler to type and copy-paste IMO
running_ok: Annotated[ | ||
bool, | ||
typer.Option( | ||
help="If `True`, the method will not raise an error if the Inference Endpoint is already running." | ||
), | ||
] = True, | ||
token: TokenOpt = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one feels weird in the CLI --help (--running-ok / --no-running-ok
). I would either remove it or rename it to --fail-if-already-running
(with default to False)
(in any case I genuinely don't see a case someone would want to fail^^)
|
||
@app.command(help="Update an existing endpoint.") | ||
def update( | ||
endpoint_name: NameArg, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
endpoint_name: NameArg, | |
name: NameArg, | |
namespace: NamespaceOpt = None, |
For consistency as all other commands (pause, resume, etc.) have both name
and namespace
defined
This PR implements a CLI to manage Inference Endpoints, this provides "one liners" to deploy/delete/update/etc. endpoints, which could be handy in many cases. The DX intentionally mirrors a bit the UI instead of the API, to quote @ErikKaum :
I explored a few layouts (e.g. a single deploy command with
--catalog
), but the cleanest UX ended up being two explicit paths:hf inference-endpoints deploy hub ...
– minimal set of hardware/task configs for Hub models.hf inference-endpoints deploy catalog ...
– one liner using optimized configs from the model catalog.happy to more iterate if there are more suggestions to make the DX better (and simpler?)