- Bytedance Seed/MLSys, prev AWS AI
- https://sites.google.com/view/haibinlin/
- @eric_haibin_lin
- verl Public
  Forked from volcengine/verl
  veRL: Volcano Engine Reinforcement Learning for LLM
- deepscaler Public
  Forked from agentica-project/rllm
  Democratizing Reinforcement Learning for LLMs
  Python MIT License Updated Feb 16, 2025
- Awesome-LLM Public
  Forked from Hannibal046/Awesome-LLM
  Awesome-LLM: a curated list of Large Language Model
  Creative Commons Zero v1.0 Universal Updated Jan 9, 2025
- flash-attention Public
  Forked from Dao-AILab/flash-attention
  Fast and memory-efficient exact attention
  Python BSD 3-Clause "New" or "Revised" License Updated Jan 6, 2025
- openr Public
  Forked from openreasoner/openr
  OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models
  Python MIT License Updated Dec 9, 2024
- Megatron-LM Public
  Forked from NVIDIA/Megatron-LM
  Ongoing research training transformer language models at scale, including: BERT & GPT-2
  Python Other Updated Dec 8, 2024
- TransformerEngine Public
  Forked from NVIDIA/TransformerEngine
  A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilizatio…
  Python Apache License 2.0 Updated Dec 6, 2024
- DeepSpeedExamples Public
  Forked from deepspeedai/DeepSpeedExamples
  Example models using DeepSpeed
  Python Apache License 2.0 Updated Dec 3, 2024
- slapo Public
  Forked from awslabs/slapo
  A schedule language for large model training
  Python Apache License 2.0 Updated Jun 14, 2024
- cutlass Public
  Forked from NVIDIA/cutlass
  CUDA Templates for Linear Algebra Subroutines
  C++ Other Updated Aug 15, 2023
- CLIP Public
  Forked from openai/CLIP
  CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image
  Jupyter Notebook MIT License Updated Apr 24, 2023
- evals Public
  Forked from openai/evals
  Evals is a framework for evaluating OpenAI models and an open-source registry of benchmarks.
  Python MIT License Updated Mar 14, 2023
- matxscript Public
  Forked from bytedance/matxscript
  The model pre- and post-processing framework
  C++ Apache License 2.0 Updated Dec 26, 2022
- DALI Public
  Forked from NVIDIA/DALI
  A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.
  C++ Apache License 2.0 Updated Sep 18, 2022
- veGiantModel Public
  Forked from volcengine/veGiantModel
  Python Apache License 2.0 Updated Mar 23, 2022
- pytorch Public
  Forked from pytorch/pytorch
  Tensors and Dynamic neural networks in Python with strong GPU acceleration
  Python Other Updated Dec 25, 2021
- elpa Public
  Forked from marekandreas/elpa
  A scalable eigensolver for dense, symmetric (hermitian) matrices (fork of https://gitlab.mpcdf.mpg.de/elpa/elpa.git)
  Fortran Other Updated Nov 30, 2021
- builder Public
  Forked from pytorch/builder
  Continuous builder and binary build scripts for pytorch
  Shell BSD 2-Clause "Simplified" License Updated Nov 18, 2021
- ps-lite Public
  Forked from dmlc/ps-lite
  A lightweight parameter server interface
- byteps Public
  Forked from bytedance/byteps
  A high performance and general PS framework for distributed training
  Python Other Updated Oct 5, 2021
- ucx Public
  Forked from openucx/ucx
  Unified Communication X (mailing list - https://elist.ornl.gov/mailman/listinfo/ucx-group)
  C Other Updated Mar 11, 2021
- ucx-py Public
  Forked from rapidsai/ucx-py
  Python bindings for UCX
  Python BSD 3-Clause "New" or "Revised" License Updated Oct 3, 2020
- DeepSpeed Public
  Forked from deepspeedai/DeepSpeed
  DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective.
  Python MIT License Updated Sep 26, 2020
- pytorch-OpCounter Public
  Forked from Lyken17/pytorch-OpCounter
  Count the MACs / FLOPs of your PyTorch model.
- gossip Public
  Forked from Funatiq/gossip
  gossip: Efficient Communication Primitives for Multi-GPU Systems
  C++ MIT License Updated Sep 3, 2020
- HugeCTR Public
  Forked from NVIDIA-Merlin/HugeCTR
  HugeCTR is a high efficiency GPU framework designed for Click-Through-Rate (CTR) estimating training
  C++ Apache License 2.0 Updated Aug 29, 2020