Rate limits outside of the CLI #952

Answered by palamangelus
palamangelus asked this question in Q&A

Sort of. For now, just to test, I added it to RATE_CONFIG in rate_limiter.py, and I reduced my docs to ~100 for testing (which, wow, still consumed a large number of embedding tokens). I have a number of other questions, but I will post them in separate threads.

In case anyone else needs this, here is what I added to rate_limiter.py. First, the host constants:

AZURE_HOST = "[yourprojectendpointbase].openai.azure.com/"
AZURE_BASE_URL = f"https://{AZURE_HOST}"

Then I added an entry for that host to RATE_CONFIG:

RATE_CONFIG: dict[tuple[str, str | MatchAllInputs], RateLimitItem] = {
    ("get", AZURE_HOST): RateLimitItemPerMinute(900, 1),
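
For anyone unsure what that entry expresses: the key pairs a request method with a host, and RateLimitItemPerMinute(900, 1) means 900 requests per one-minute window. Below is a rough, standard-library-only sketch of that idea. PerMinuteLimit, SKETCH_RATE_CONFIG, find_limit, and the host name are all hypothetical stand-ins, not paper-qa's actual code, and this sketch matches on the URL's netloc (so its key has no trailing slash); whether the real lookup behaves the same way is an assumption.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlparse


@dataclass(frozen=True)
class PerMinuteLimit:
    """Hypothetical stand-in for RateLimitItemPerMinute(amount, multiples)."""

    amount: int         # requests allowed per window
    multiples: int = 1  # window length in minutes

    @property
    def window_seconds(self) -> int:
        return 60 * self.multiples

    @property
    def average_spacing_seconds(self) -> float:
        # Average gap between requests if they are spread evenly over the window.
        return self.window_seconds / self.amount


# Placeholder host; substitute your own Azure OpenAI endpoint base.
AZURE_HOST = "example-resource.openai.azure.com"

# Same shape as the entry above: (method, host) -> limit.
SKETCH_RATE_CONFIG = {
    ("get", AZURE_HOST): PerMinuteLimit(900, 1),
}


def find_limit(method: str, url: str) -> Optional[PerMinuteLimit]:
    """Look up the limit for a request by lowercased method and URL host."""
    host = urlparse(url).netloc
    return SKETCH_RATE_CONFIG.get((method.lower(), host))


limit = find_limit("GET", f"https://{AZURE_HOST}/openai/deployments")
print(limit)
if limit is not None:
    print(round(limit.average_spacing_seconds, 4))
```

With 900 requests per minute, evenly spaced requests would be about 0.067 seconds apart, which is a quick way to sanity-check the number against your Azure quota.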

That gave me an answer, so on to figuring out how to parse AnswerResponse and reduce my token usage!

Thank you for you…

Replies: 3 comments 5 replies

Answer selected by jamesbraza