[CLI] Add Inference Endpoints Commands #3428

hanouticelina · 2025-10-09T15:05:16Z

This PR implements a CLI to manage Inference Endpoints. It provides one-liners to deploy, delete, update, etc. endpoints, which could be handy in many cases. The DX intentionally mirrors the UI a bit rather than the API; to quote @ErikKaum:

we're renaming things in the UI quite fast to adapt and make things make more sense. And in many cases in the UI things are configured with slightly different names/groupings than in the API. Just because it's faster than in the API.

I explored a few layouts (e.g. a single deploy command with --catalog), but the cleanest UX ended up being two explicit paths:

hf inference-endpoints deploy hub ... – minimal set of hardware/task configs for Hub models.
hf inference-endpoints deploy catalog ... – one liner using optimized configs from the model catalog.
delete and update endpoints currently live under the "Settings" group in the UI, but it feels more natural to keep them top-level in the CLI 🤷‍♀️

> hf inference-endpoints --help
Usage: hf inference-endpoints [OPTIONS] COMMAND [ARGS]...

  Manage Hugging Face Inference Endpoints.

Options:
  --help  Show this message and exit.

Commands:
  delete         Delete an Inference Endpoint permanently.
  deploy         Deploy Inference Endpoints from the Hub or the Catalog.
  describe       Get information about an Inference Endpoint.
  list           Lists all inference endpoints for the given namespace.
  list-catalog   List available Catalog models.
  pause          Pause an Inference Endpoint.
  resume         Resume an Inference Endpoint.
  scale-to-zero  Scale an Inference Endpoint to zero.
  update         Update an existing endpoint.

Happy to iterate more if there are suggestions to make the DX better (and simpler?).


src/huggingface_hub/cli/jobs.py

hanouticelina · 2025-10-09T15:12:30Z

src/huggingface_hub/cli/inference_endpoints.py

+
+
+@deploy_app.command(name="hub", help="Deploy an Inference Endpoint from a Hub repository.")
+def deploy_from_hub(


I split deploy into two subcommands instead of using a flag to deploy from the Model Catalog because Typer doesn't easily allow conditional requirements (i.e., "these parameters are required unless that flag, e.g. --from-catalog, is set"), which makes validation and type hints messy. Using subcommands lets Typer enforce required options cleanly for each case.
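To illustrate the point about required options, here is a minimal sketch of the two-subcommand layout. It is not the code from this PR: the option names (`--repo`, `--instance-type`) and the echo-only bodies are assumptions chosen for illustration. Each subcommand declares its own required parameters, so Typer rejects an incomplete call without any custom validation.

```python
from typing import Annotated, Optional

import typer

# Hypothetical stand-ins for the real apps; nothing here calls the Hub API.
app = typer.Typer(help="Manage endpoints (illustrative).")
deploy_app = typer.Typer(help="Deploy a new endpoint.")
app.add_typer(deploy_app, name="deploy")


@deploy_app.command(name="hub")
def deploy_from_hub(
    name: Annotated[str, typer.Argument(help="Endpoint name.")],
    # Options without a default value are required by Typer.
    repo: Annotated[str, typer.Option(help="Hub repository to deploy.")],
    instance_type: Annotated[str, typer.Option(help="Hardware instance type.")],
) -> None:
    """Deploy from a Hub repository: hardware options are required."""
    typer.echo(f"hub deploy: {name} <- {repo} on {instance_type}")


@deploy_app.command(name="catalog")
def deploy_from_catalog(
    repo: Annotated[str, typer.Option(help="Catalog model to deploy.")],
    name: Annotated[Optional[str], typer.Option(help="Optional endpoint name.")] = None,
) -> None:
    """Deploy from the catalog: only the model is required."""
    typer.echo(f"catalog deploy: {repo} as {name or '<auto>'}")
```

With a single command and a `--from-catalog` flag, `--instance-type` would have to be declared optional and checked by hand inside the function body.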


HuggingFaceDocBuilderDev · 2025-10-09T15:43:42Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Wauplin

Thanks for working on this! Left some high level feedback mostly geared towards the CLI syntax. Let me know what you think :)

Wauplin · 2025-10-10T13:28:05Z

src/huggingface_hub/cli/inference_endpoints.py

+NamespaceOpt = Annotated[
+    Optional[str],
+    typer.Option(
+        help="The namespace where the Inference Endpoint will be created. Defaults to the current user's namespace.",


Suggested change

-        help="The namespace where the Inference Endpoint will be created. Defaults to the current user's namespace.",
+        help="The namespace associated with the Inference Endpoint. Defaults to the current user's namespace.",

Less specific to endpoint creation.

Wauplin · 2025-10-10T13:29:33Z

src/huggingface_hub/cli/inference_endpoints.py

+@app.command(help="Lists all Inference Endpoints for the given namespace.")
+def list(
+    namespace: NamespaceOpt = None,
+    token: TokenOpt = None,
+) -> None:
+    api = get_hf_api(token=token)


Suggested change

-@app.command(help="Lists all Inference Endpoints for the given namespace.")
-def list(
-    namespace: NamespaceOpt = None,
-    token: TokenOpt = None,
-) -> None:
-    api = get_hf_api(token=token)
+@app.command()
+def list(
+    namespace: NamespaceOpt = None,
+    token: TokenOpt = None,
+) -> None:
+    """Lists all Inference Endpoints for the given namespace."""
+    api = get_hf_api(token=token)

(nit) slight preference for docstring help. It makes it easier if we want to extend it to multiline in the future

(same for other commands)

~~potentially rename to ls ?~~
EDIT: naah, it's good like this. We are not listing a filesystem
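As a small sketch of the docstring-versus-`help=` point above: Typer falls back to the function docstring when no `help=` is passed to the decorator, so both styles end up as the command's help text. The app, the `describe` command body, and the `--namespace` option below are illustrative assumptions, not the PR's code.

```python
from typing import Annotated, Optional

import typer

app = typer.Typer(help="Manage endpoints (illustrative).")


@app.command(name="list")
def list_endpoints(
    namespace: Annotated[Optional[str], typer.Option(help="Namespace to list.")] = None,
) -> None:
    """Lists all Inference Endpoints for the given namespace."""
    # Placeholder body: a real command would query the API here.
    typer.echo(f"listing endpoints in {namespace or '<current user>'}")


# Same result through the decorator argument; harder to extend to several lines.
@app.command(help="Get information about an Inference Endpoint.")
def describe(name: Annotated[str, typer.Argument(help="Endpoint name.")]) -> None:
    typer.echo(f"describing {name}")
```

The function is named `list_endpoints` with an explicit `name="list"` only to avoid shadowing the built-in `list` in this snippet.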

Wauplin · 2025-10-10T13:33:29Z

src/huggingface_hub/cli/inference_endpoints.py

+    )
+
+
+deploy_app = typer_factory(help="Deploy Inference Endpoints from the Hub or the Catalog.")


Suggested change

-deploy_app = typer_factory(help="Deploy Inference Endpoints from the Hub or the Catalog.")
+deploy_app = typer_factory(help="Deploy a new Inference Endpoint.")

Wauplin · 2025-10-10T13:35:40Z

src/huggingface_hub/cli/inference_endpoints.py

+
+
+@deploy_app.command(name="hub", help="Deploy an Inference Endpoint from a Hub repository.")
+def deploy_from_hub(


what do you think of having

# deploy from Hub
hf endpoints deploy
# list catalog
hf endpoints catalog ls
# deploy from catalog
hf endpoints catalog deploy

?

Wauplin · 2025-10-10T13:37:40Z

src/huggingface_hub/cli/inference_endpoints.py

+
+
+@deploy_app.command(name="hub", help="Deploy an Inference Endpoint from a Hub repository.")
+def deploy_from_hub(


Otherwise, if keeping the same layout, I'd be more inclined toward

hf endpoints deploy from-repo
hf endpoints deploy from-catalog

(since both are deployed from Hub + the from- makes it more explicit)

(but still slight preference for the one above)

Wauplin · 2025-10-10T13:44:35Z

src/huggingface_hub/cli/inference_endpoints.py

+        typer.Option(
+            help="Skip confirmation prompts.",
+        ),


Suggested change

-        typer.Option(
-            help="Skip confirmation prompts.",
-        ),
+        typer.Option("--yes", help="Skip confirmation prompts."),

An explicit --yes avoids having a --no-yes flag defined; see https://typer.tiangolo.com/tutorial/parameter-types/bool/
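A minimal sketch of that behaviour, assuming a hypothetical `delete` command that only prints instead of calling the API: when the option name is given explicitly as `"--yes"`, Typer registers just that flag and does not generate the `--no-yes` counterpart.

```python
from typing import Annotated

import typer

app = typer.Typer()


@app.command()
def delete(
    name: Annotated[str, typer.Argument(help="Endpoint name (illustrative).")],
    # Naming the flag explicitly disables the auto-generated --no-yes variant.
    yes: Annotated[bool, typer.Option("--yes", help="Skip confirmation prompts.")] = False,
) -> None:
    """Delete an endpoint after confirmation (no real API call is made)."""
    if not yes:
        # Aborts the command unless the user answers "y".
        typer.confirm(f"Delete '{name}'?", abort=True)
    typer.echo(f"deleted {name}")
```

Passing `--no-yes` to this command fails as an unknown option, which is the intended outcome here.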

Wauplin · 2025-10-10T13:46:57Z

src/huggingface_hub/cli/inference_endpoints.py

+
+
+@app.command(help="List available Catalog models.")
+def list_catalog(


Slight preference for

hf endpoints catalog list
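For reference, the nested layout proposed in this review could be wired with `add_typer` roughly as follows. This is a sketch under assumptions: the `hf` root app, the placeholder model names, and the `--repo` option are all hypothetical, and the command bodies only print.

```python
from typing import Annotated

import typer

hf = typer.Typer(help="Root CLI (illustrative).")
endpoints_app = typer.Typer(help="Manage Inference Endpoints.")
catalog_app = typer.Typer(help="Work with Catalog models.")

# hf endpoints ... and hf endpoints catalog ...
hf.add_typer(endpoints_app, name="endpoints")
endpoints_app.add_typer(catalog_app, name="catalog")


@endpoints_app.command()
def deploy(
    name: Annotated[str, typer.Argument(help="Endpoint name.")],
    repo: Annotated[str, typer.Option(help="Hub repository to deploy.")],
) -> None:
    """Deploy an endpoint from a Hub repository."""
    typer.echo(f"deploy {name} from {repo}")


@catalog_app.command(name="list")
def list_catalog() -> None:
    """List available Catalog models."""
    for model in ("org/model-a", "org/model-b"):  # placeholder data
        typer.echo(model)


@catalog_app.command(name="deploy")
def deploy_from_catalog(
    repo: Annotated[str, typer.Option(help="Catalog model to deploy.")],
) -> None:
    """Deploy an endpoint from the Catalog."""
    typer.echo(f"catalog deploy {repo}")
```

Grouping both catalog actions under one sub-app keeps the listing and deployment commands next to each other in `--help`.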

Wauplin · 2025-10-10T13:47:55Z

src/huggingface_hub/cli/hf.py

 app.add_typer(repo_cli, name="repo")
 app.add_typer(repo_files_cli, name="repo-files")
 app.add_typer(jobs_cli, name="jobs")
+app.add_typer(inference_endpoints_cli, name="inference-endpoints")


Suggested change

-app.add_typer(inference_endpoints_cli, name="inference-endpoints")
+app.add_typer(inference_endpoints_cli, name="endpoints")

What do you think of renaming everything to hf endpoints? It is slightly less explicit but much simpler to type and copy-paste IMO.

Wauplin · 2025-10-10T13:50:28Z

src/huggingface_hub/cli/inference_endpoints.py

+    running_ok: Annotated[
+        bool,
+        typer.Option(
+            help="If `True`, the method will not raise an error if the Inference Endpoint is already running."
+        ),
+    ] = True,
+    token: TokenOpt = None,


This one feels weird in the CLI --help output (--running-ok / --no-running-ok). I would either remove it or rename it to --fail-if-already-running (defaulting to False).

(In any case, I genuinely don't see a case where someone would want it to fail ^^)
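If the rename route were taken, one possible shape is sketched below. It assumes a hypothetical `resume` command that only prints; the inversion into a `running_ok` value mirrors the parameter name shown in the diff above, but how the real command forwards it is not shown in this thread.

```python
from typing import Annotated

import typer

app = typer.Typer()


@app.command()
def resume(
    name: Annotated[str, typer.Argument(help="Endpoint name (illustrative).")],
    fail_if_already_running: Annotated[
        bool,
        # Explicit single name: no --no-fail-if-already-running is generated.
        typer.Option(
            "--fail-if-already-running",
            help="Raise an error if the Inference Endpoint is already running.",
        ),
    ] = False,
) -> None:
    """Resume an endpoint (placeholder body, no API call)."""
    # Invert the CLI flag to get the value a running_ok-style parameter expects.
    running_ok = not fail_if_already_running
    typer.echo(f"resume {name} (running_ok={running_ok})")
```

The default stays permissive, and the opt-in flag reads as a plain switch in `--help`.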

Wauplin · 2025-10-10T13:52:33Z

src/huggingface_hub/cli/inference_endpoints.py

+
+@app.command(help="Update an existing endpoint.")
+def update(
+    endpoint_name: NameArg,


Suggested change

-    endpoint_name: NameArg,
+    name: NameArg,
+    namespace: NamespaceOpt = None,

For consistency, as all other commands (pause, resume, etc.) have both name and namespace defined.

hanouticelina added 8 commits October 9, 2025 16:17

add inference endpoints cli

4faf7e5

fix naming

30c13d6

update docs

e670188

Merge branch 'v1.0-release' of github.com:huggingface/huggingface_hub…

f387a11

… into inference-endpoints-cli

wording

b49a70a

remove logging

7b7b122

don't instantiate logger when not needed

0862c4a

refactor

d81a59c

hanouticelina requested review from ErikKaum and Wauplin October 9, 2025 15:05


hanouticelina added 2 commits October 9, 2025 17:13

remove unused import

6a50b0b

nit

c5b0638

nit

5b4111d
