jychen21

  • CUDA Library Samples

    Cuda Other Updated Nov 12, 2024
  • cutlass Public

    Forked from NVIDIA/cutlass

    CUDA Templates for Linear Algebra Subroutines

    C++ Other Updated Nov 8, 2024
  • 🎉 Modern CUDA Learn Notes with PyTorch: CUDA Cores, Tensor Cores, fp32/tf32, fp16/bf16, fp8/int8, flash_attn, rope, sgemm, hgemm, sgemv, warp/block reduce, elementwise, softmax, layernorm, rmsnorm.

    Cuda GNU General Public License v3.0 Updated Nov 8, 2024
  • TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…

    C++ Apache License 2.0 Updated Nov 6, 2024
  • 📖 A curated list of Awesome LLM Inference papers with code: TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention, etc.

    GNU General Public License v3.0 Updated Nov 1, 2024
  • PaddlePaddle custom device implementation. (Custom hardware integration for PaddlePaddle)

    Python Apache License 2.0 Updated Oct 15, 2024
  • AutoFP8 Public

    Forked from neuralmagic/AutoFP8
    Python Apache License 2.0 Updated Oct 1, 2024
  • Samples Public

    Forked from yanfeich/Samples
    C++ Updated Sep 25, 2024
  • Paddle Public

    Forked from PaddlePaddle/Paddle

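    PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (PaddlePaddle core framework: high-performance single-machine and distributed training and cross-platform deployment for deep learning and machine learning)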

    C++ Apache License 2.0 Updated Sep 25, 2024
  • xDiT Public

    Forked from xdit-project/xDiT

    xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) on multi-GPU Clusters

    Python Apache License 2.0 Updated Sep 24, 2024
  • sglang Public

    Forked from sgl-project/sglang

    SGLang is a fast serving framework for large language models and vision language models.

    Python Apache License 2.0 Updated Sep 4, 2024
  • Easy and lightning fast training of 🤗 Transformers on Habana Gaudi processor (HPU)

    Python Apache License 2.0 Updated Aug 30, 2024
  • vidur Public

    Forked from microsoft/vidur

    A large-scale simulation framework for LLM inference

    Python MIT License Updated Aug 24, 2024
  • Jupyter Notebook 1 Updated Aug 22, 2024
  • AISystem Public

    Forked from chenzomi12/AISystem

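    AISystem mainly refers to AI systems, covering full-stack low-level AI technologies such as AI chips, AI compilers, and AI inference and training frameworks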

    Jupyter Notebook Apache License 2.0 Updated Aug 18, 2024
  • Samples for CUDA Developers which demonstrate features in CUDA Toolkit

    C Other Updated Jul 25, 2024
  • Python 11 3 Updated Jul 24, 2024
  • vllm Public

    Forked from vllm-project/vllm

    A high-throughput and memory-efficient inference and serving engine for LLMs

    Python Apache License 2.0 Updated Jul 12, 2024
  • TensorFlow and PyTorch Reference models for Gaudi(R)

    Python 1 Updated Apr 28, 2024
  • Provides examples for writing and building Habana custom kernels using HabanaTools

    C++ Updated Mar 25, 2024
  • tgi-gaudi Public

    Forked from huggingface/tgi-gaudi

    Large Language Model Text Generation Inference on Habana Gaudi

    Python Other Updated Mar 1, 2024
  • DeepSpeed Public

    Forked from HabanaAI/DeepSpeed

    DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

    Python 1 Apache License 2.0 Updated Feb 28, 2024
  • ChatGLM-6B Public

    Forked from THUDM/ChatGLM-6B

    ChatGLM-6B: An Open Bilingual Dialogue Language Model

    Python Apache License 2.0 Updated Sep 14, 2023
  • ChatGLM2-6B Public

    Forked from THUDM/ChatGLM2-6B

    ChatGLM2-6B: An Open Bilingual Chat LLM

    Python Other Updated Sep 14, 2023