Star-Attention Public
Forked from NVIDIA/Star-Attention
Efficient LLM Inference over Long Sequences
Python Apache License 2.0 Updated Dec 28, 2024
vllm Public
Forked from vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Python Apache License 2.0 Updated Jun 17, 2024
transformers Public
Forked from huggingface/transformers
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
Python Apache License 2.0 Updated Apr 20, 2024
CME-213 Public
Introduction to parallel computing using MPI, OpenMP, and CUDA
C++ Updated Jan 16, 2024
MPMCQueue Public
Forked from rigtorp/MPMCQueue
A bounded multi-producer multi-consumer concurrent queue written in C++11
C++ MIT License Updated Sep 18, 2023
moderngpu Public
Forked from kygx-legend/moderngpu
Patterns and behaviors for GPU computing
C++ Other Updated Nov 10, 2021