[Feature Request]: Can we add configuration items for customizing the API request rate and token quantity? #5786

Open
@kostya-sec

Description

Is there an existing issue for the same feature request?

  • I have checked the existing issues.

Is your feature request related to a problem?

Recently, while sending API requests to SiliconAPI, I hit an RPM (requests per minute) rate-limit error during document parsing, which caused the parsing to fail. As a workaround, I modified the Dockerfile to install the ratelimit and tiktoken packages during the build, and added a modified class to the llm directory so that requests to the chat model, embedding model, rerank model, etc. no longer trigger rate-limit errors.

Describe the feature you'd like

I would like configuration items for customizing the API request rate and token quantity, so that requests to the chat model, embedding model, rerank model, etc. stay within the provider's limits and document parsing does not fail with an RPM error. My current workaround is the one described above: installing the ratelimit and tiktoken packages in the Dockerfile and adding a modified class to the llm directory.
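
To make the request more concrete, here is a minimal sketch of what such configurable limits could look like. It is not the modified class mentioned above and not existing project code: `RateLimitConfig`, `RateLimiter`, `max_requests_per_minute`, and `max_tokens_per_minute` are hypothetical names, and the sketch uses only the standard library rather than the ratelimit package. The caller is assumed to supply the token count for each request (for example, one computed with tiktoken).

```python
import time
from collections import deque
from dataclasses import dataclass


@dataclass
class RateLimitConfig:
    """Hypothetical config items; the names are illustrative, not existing settings."""
    max_requests_per_minute: int = 60
    max_tokens_per_minute: int = 100_000


class RateLimiter:
    """Sliding-window limiter over the last 60 seconds of requests and tokens."""

    WINDOW_SECONDS = 60.0

    def __init__(self, config, clock=time.monotonic, sleep=time.sleep):
        self.config = config
        self._clock = clock      # injectable so the limiter can be tested without waiting
        self._sleep = sleep
        self._events = deque()   # one (timestamp, tokens) pair per admitted request

    def _evict(self, now):
        # Drop requests that have left the 60-second window.
        while self._events and now - self._events[0][0] >= self.WINDOW_SECONDS:
            self._events.popleft()

    def wait_time(self, tokens):
        """Return the seconds to wait before a request of `tokens` tokens fits."""
        now = self._clock()
        self._evict(now)
        requests = len(self._events)
        used = sum(t for _, t in self._events)
        wait = 0.0
        # Pretend the oldest requests expire one by one until both budgets fit.
        for ts, t in self._events:
            fits_rpm = requests < self.config.max_requests_per_minute
            fits_tpm = used + tokens <= self.config.max_tokens_per_minute
            if fits_rpm and fits_tpm:
                break
            wait = ts + self.WINDOW_SECONDS - now
            requests -= 1
            used -= t
        return max(wait, 0.0)

    def acquire(self, tokens):
        """Block until the request fits in the window, then record it."""
        if tokens > self.config.max_tokens_per_minute:
            raise ValueError("request exceeds the per-minute token budget")
        while True:
            delay = self.wait_time(tokens)
            if delay <= 0:
                break
            self._sleep(delay)
        self._events.append((self._clock(), tokens))
```

With something like this in place, the chat, embedding, and rerank wrappers could call `limiter.acquire(token_count)` before each request, and the two limits could be exposed as per-provider configuration values instead of being hard-coded in the image.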

Describe implementation you've considered

No response

Documentation, adoption, use case

Additional information

No response
