noiji

Pinned

  1. hiyouga/LLaMA-Factory Public

    Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

    Python 63.2k 7.6k

  2. vllm-project/vllm Public

    A high-throughput and memory-efficient inference and serving engine for LLMs

    Python 64.1k 11.6k

  3. lm-sys/FastChat Public

    An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

    Python 39.3k 4.8k

  4. BerriAI/litellm Public

    Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, load balancing and logging. [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthr…

    Python 31.7k 4.8k

  5. NVIDIA/TensorRT-LLM Public

    TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…

    C++ 12.2k 1.9k

  6. microsoft/LLMLingua Public

    [EMNLP'23, ACL'24] To speed up LLMs' inference and enhance LLMs' perception of key information, LLMLingua compresses the prompt and KV-Cache, achieving up to 20x compression with minimal performance loss.

    Python 5.6k 337