Rebellions Inc.
Seoul, South Korea
https://jueonpark.notion.site/Jueon-Park-1fcdd44a43134fe987f140c8881ac5e7
in/jueonpark11
Stars
The TT-Forge FE is a graph compiler designed to optimize and transform computational graphs for deep learning models, enhancing their performance and efficiency.
The Triton TensorRT-LLM Backend
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently.
Repository of model demos using TT-Buda
IREE's PyTorch Frontend, based on Torch Dynamo.
Tensors and Dynamic neural networks in Python with strong GPU acceleration
🤘 TT-NN operator library and TT-Metalium low-level kernel programming model.
Flexible Intermediate Representation for RTL
Fast and memory-efficient exact attention
A Borrow Checker and Memory Ownership System for C++20 (heavily inspired from Rust)
A high-throughput and memory-efficient inference and serving engine for LLMs
FlatBuffers: Memory Efficient Serialization Library
Universal LLM Deployment Engine with ML Compilation
AOMP is an open source Clang/LLVM based compiler with added support for the OpenMP® API on Radeon™ GPUs. Use this repository for releases, issues, documentation, packaging, and examples.
An open-source efficient deep learning framework/compiler, written in python.
HIP: C++ Heterogeneous-Compute Interface for Portability
This repository contains the codebase for the Virtual FPGA Lab in Makerchip, developed as a Google Summer of Code 2021 project under the FOSSi Foundation.
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.