kadirnar

🔥

Working from home

Kadir Nar kadirnar

AI Research Engineer

971 followers · 973 following

Vyvo
Turkey
03:06 (UTC +03:00)
@kadirnardev
https://huggingface.co/kadirnar
in/kadir-nar

Achievements

x2 x3 x3 x3

Highlights

Developer Program Member

Organizations

Lists (10)

Starred repositories

chenyuntc / csm-training

Python 4 Updated Apr 1, 2025

cityzen95 / LLM_from_scratch

Building LLMs from scratch following the book from S. Raschka

Jupyter Notebook 30 2 Updated Mar 27, 2025

cuhealthybrains / MT-LLM

The implementation for "Large Language Model Can Transcribe Speech in Multi-Talker Scenarios with Versatile Instructions"

Python 38 1 Updated Apr 7, 2025

Deep-unlearning / SmolVoice

Python 1 Updated Apr 2, 2025

VolkanSimsir / LLM-FineTune-Course

This course presents LLM (Large Language Model) fine-tuning procedures as a Turkish-language guide

Jupyter Notebook 10 Updated Mar 29, 2025

YuvrajSingh-mist / SmolWhisper

Trained a ~30M Whisper model (whisper tiny.en architecture) that I coded from the ground up to build a small ASR model, going through the below-mentioned stages from scratch. Trained on the GigaSpeech dataset …

Python 3 Updated Mar 30, 2025

R100001 / Programming-Massively-Parallel-Processors

Cuda 149 33 Updated Aug 2, 2024

bytedance / MegaTTS3

Python 4,570 311 Updated Apr 12, 2025

jishengpeng / WavChat

A Survey of Spoken Dialogue Models (60 pages)

288 16 Updated Nov 28, 2024

luotianze666 / WaveFM

[NAACL 2025] WaveFM: A High-Fidelity and Efficient Vocoder Based on Flow Matching

Python 83 7 Updated Mar 27, 2025

DataoceanAI / Dolphin

Dolphin is a multilingual, multitask ASR model jointly trained by DataoceanAI and Tsinghua University.

Python 432 26 Updated Apr 9, 2025

imdatceleste / m-ailabs-dataset

This is the M-AILABS Speech Dataset

55 3 Updated Nov 28, 2024

usefulsensors / moonshine

Fast and accurate automatic speech recognition (ASR) for edge devices

Python 2,677 141 Updated Feb 26, 2025

auspicious3000 / autovc

AutoVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss

Python 1,050 210 Updated Oct 23, 2024

naklecha / llama3-from-scratch

llama3 implementation one matrix multiplication at a time

Jupyter Notebook 14,855 1,254 Updated May 23, 2024

ByungKwanLee / DeepSick-R1

Reproduction of DeepSeek-R1

Python 221 21 Updated Apr 14, 2025

Choddeok / EmoSpherepp

The official implementation of EmoSphere++

Python 81 8 Updated Apr 14, 2025

Audio-WestlakeU / CleanMel

PyTorch implementation of "CleanMel: Mel-Spectrogram Enhancement for Improving Both Speech Quality and ASR".

Python 47 2 Updated Apr 15, 2025

archinetai / audio-diffusion-pytorch

Audio generation using diffusion models, in PyTorch.

Python 2,034 173 Updated Jun 12, 2023

yzGuu830 / efficient-speech-codec

[EMNLP 2024] ESC: Efficient Speech Coding with Cross-Scale Residual Vector Quantized Transformers

Jupyter Notebook 111 4 Updated Mar 20, 2025

scb-10x / typhoon2-audio

The repository of Typhoon2-Audio, Thai audio-language model that supports speech-in and speech-out

Python 14 1 Updated Jan 27, 2025

ShoukanLabs / Vokan

The Vokan Architecture (Tsukasa speech based)

Jupyter Notebook 9 1 Updated Feb 10, 2025

ShoukanLabs / StyleTTS2-Dash

A lightweight StyleTTS2 and Vokan inference library

Python 4 1 Updated Mar 17, 2025

umbertocappellazzo / Llama-AVSR

[ICASSP 2025] Official PyTorch implementation of "Large Language Models are Strong Audio-Visual Speech Recognition Learners".

Python 16 1 Updated Mar 10, 2025

bunyaminergen / awesome-speech-dataset

Awesome Speech Dataset, including download links and a brief explanation for each resource. These datasets provide diverse and high-quality speech data covering various domains such as conversation…

8 Updated Mar 13, 2025

snakers4 / silero-vad

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

Python 5,573 541 Updated Mar 24, 2025

ShoukanLabs / VoPho

A collection of all our phonemizers for dataset construction and inference

Python 22 2 Updated Feb 21, 2025

Respaired / Tsukasa-Speech

a Frontier Japanese Speech Generation net

Jupyter Notebook 30 11 Updated Mar 11, 2025

MekkCyber / TritonAcademy

A repository to unravel the language of GPUs, making their kernel conversations easy to understand

Lists (10)

Agent

Diffusion

LLM

LLM-Paper

Next-Feature

Omni

Speech

Triton

TTS

VLM

Starred repositories

Deep learning