wns823
  • KRAFTON Inc.
  • Seoul, Republic of Korea

Janus-Series: Unified Multimodal Understanding and Generation Models

Python 2,661 209 Updated Jan 27, 2025

Optimizing inference proxy for LLMs

Python 1,954 155 Updated Jan 24, 2025

Collection of awesome LLM apps with AI Agents and RAG using OpenAI, Anthropic, Gemini and open-source models.

Python 13,623 1,463 Updated Jan 26, 2025

[NeurIPS 2023] We use large language models as a commonsense world model and heuristic policy within Monte-Carlo Tree Search, enabling better-reasoned decision-making for daily task planning problems.

Python 247 21 Updated Nov 16, 2024

An Open Large Reasoning Model for Real-World Solutions

Python 1,410 72 Updated Nov 28, 2024

How to create rational LLM-based agents? Using game-theoretic workflows!

Python 47 6 Updated Dec 1, 2024

Papers and resources related to the security and privacy of LLMs 🤖

Python 468 35 Updated Nov 27, 2024

[ICLR 2025] Automated Design of Agentic Systems

Python 1,148 176 Updated Jan 24, 2025

Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM

Python 884 74 Updated Jan 27, 2025
Python 15 2 Updated Nov 26, 2024

Code repo for the paper "LLM-QAT: Data-Free Quantization Aware Training for Large Language Models"

Python 266 25 Updated Sep 3, 2024

Run PyTorch LLMs locally on servers, desktop and mobile

Python 3,478 230 Updated Jan 27, 2025

Official implementation of Half-Quadratic Quantization (HQQ)

Python 736 73 Updated Jan 14, 2025

Code Repository of Evaluating Quantized Large Language Models

Python 114 6 Updated Sep 8, 2024

Composable building blocks to build Llama Apps

Python 6,753 816 Updated Jan 27, 2025

Agentic components of the Llama Stack APIs

4,095 626 Updated Jan 27, 2025

Finetuning Large Language Models on One Consumer GPU in 2 Bits

Python 714 76 Updated May 25, 2024

A list of papers, docs, codes about model quantization. This repo is aimed to provide the info for model quantization research, we are continuously improving the project. Welcome to PR the works (p…

1,971 212 Updated Nov 1, 2024

General purpose GPU compute framework built on Vulkan to support 1000s of cross vendor graphics cards (AMD, Qualcomm, NVIDIA & friends). Blazing fast, mobile-enabled, asynchronous and optimized for…

C++ 2,074 159 Updated Dec 10, 2024

OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.

Python 4,554 483 Updated Jan 24, 2025

Official repo for the paper "Scaling Synthetic Data Creation with 1,000,000,000 Personas"

Python 989 65 Updated Sep 25, 2024

ReFT: Representation Finetuning for Language Models

Python 1,387 121 Updated Jan 1, 2025

NAACL '24 (Best Demo Paper RunnerUp) / MlSys @ NeurIPS '23 - RedCoast: A Lightweight Tool to Automate Distributed Training and Inference

Python 63 8 Updated Dec 9, 2024

Perplexica is an AI-powered search engine. It is an open-source alternative to Perplexity AI.

TypeScript 19,139 1,859 Updated Jan 11, 2025
Python 920 95 Updated Jan 27, 2025
Python 172 12 Updated Sep 26, 2024
Python 21 3 Updated Jan 14, 2025

Finetune Llama 3.3, Mistral, Phi-4, Qwen 2.5 & Gemma LLMs 2-5x faster with 70% less memory

Python 21,376 1,503 Updated Jan 26, 2025

MambaFormer in-context learning experiments and implementation for https://arxiv.org/abs/2402.04248

Python 35 Updated Jun 18, 2024