Simplify flags
iojw committed Jul 7, 2024
1 parent 6437c71 commit d530c43
Showing 2 changed files with 7 additions and 7 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -99,7 +99,7 @@ python -m examples.router_chat --router mf --threshold 0.11593

In the above examples, GPT-4 and Mixtral 8x7B are used as the model pair, but you can modify this using the `strong-model` and `weak-model` arguments.

-We leverage [LiteLLM](https://github.com/BerriAI/litellm) to support chat completions from a wide-range of open-source and closed models. In general, you need a setup an API key and point to the provider with the appropriate model name. Alternatively, you can also use **any OpenAI-compatible endpoint** by prefixing the model name with `openai/` using the `--alt-base-url` and `--alt-api-key` flags to point to the server.
+We leverage [LiteLLM](https://github.com/BerriAI/litellm) to support chat completions from a wide-range of open-source and closed models. In general, you need a setup an API key and point to the provider with the appropriate model name. Alternatively, you can also use **any OpenAI-compatible endpoint** by prefixing the model name with `openai/` and setting the `--base-url` and `--api-key` flags.

Note that regardless of the model pair used, an `OPENAI_API_KEY` will be required to generate embeddings for both the `mf` and `sw_ranking` routers.

12 changes: 6 additions & 6 deletions routellm/openai_server.py
@@ -39,8 +39,8 @@ async def lifespan(app):
routers=args.routers,
config=yaml.safe_load(open(args.config, "r")) if args.config else None,
routed_pair=routed_pair,
-alt_base_url=args.alt_base_url,
-alt_api_key=args.alt_api_key,
+api_base=args.base_url,
+api_key=args.api_key,
progress_bar=True,
)
yield
@@ -159,14 +159,14 @@ async def create_chat_completion(request: ChatCompletionRequest):
choices=list(ROUTER_CLS.keys()),
)
parser.add_argument(
-"--alt-base-url",
-help="The base URL used for LLM requests",
+"--base-url",
+help="The base URL used for all LLM requests",
type=str,
default=None,
)
parser.add_argument(
-"--alt-api-key",
-help="The API key used for LLM requests",
+"--api-key",
+help="The API key used for all LLM requests",
type=str,
default=None,
)
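
To see how the renamed flags reach the keyword arguments changed in `lifespan`, here is a minimal standalone sketch. It reproduces only the two `add_argument` calls from this hunk; `build_parser` and `controller_kwargs` are hypothetical helper names introduced for illustration and are not part of `routellm/openai_server.py`. The URL and key values are placeholders.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical helper: defines only the two flags renamed in this commit."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--base-url",
        help="The base URL used for all LLM requests",
        type=str,
        default=None,
    )
    parser.add_argument(
        "--api-key",
        help="The API key used for all LLM requests",
        type=str,
        default=None,
    )
    return parser


def controller_kwargs(args: argparse.Namespace) -> dict:
    """Hypothetical helper mirroring the keyword mapping shown in the diff."""
    # argparse exposes "--base-url" as args.base_url and "--api-key" as args.api_key.
    return {"api_base": args.base_url, "api_key": args.api_key}


if __name__ == "__main__":
    # Placeholder values; substitute a real OpenAI-compatible server and key.
    args = build_parser().parse_args(
        ["--base-url", "http://localhost:8000/v1", "--api-key", "placeholder-key"]
    )
    print(controller_kwargs(args))
```

Both flags default to `None`, so omitting them leaves `api_base` and `api_key` unset, which matches the `default=None` in the diff.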