Starred repositories

Python 4 Updated Apr 1, 2025

Building LLMs from scratch following the book from S. Raschka

Jupyter Notebook 30 2 Updated Mar 27, 2025

The implementation for "Large Language Model Can Transcribe Speech in Multi-Talker Scenarios with Versatile Instructions"

Python 38 1 Updated Apr 7, 2025
Python 1 Updated Apr 2, 2025

This course presents LLM (Large Language Model) fine-tuning as a Turkish-language guide

Jupyter Notebook 10 Updated Mar 29, 2025

Trained a Whisper model, a ~30M (whisper tiny.en) architecture that I coded from the ground up, to build a small ASR model, going through the below-mentioned stages from scratch. Trained on the GigaSpeech dataset …

Python 3 Updated Mar 30, 2025
Python 4,570 311 Updated Apr 12, 2025

A Survey of Spoken Dialogue Models (60 pages)

288 16 Updated Nov 28, 2024

[NAACL 2025] WaveFM: A High-Fidelity and Efficient Vocoder Based on Flow Matching

Python 83 7 Updated Mar 27, 2025

Dolphin is a multilingual, multitask ASR model jointly trained by DataoceanAI and Tsinghua University.

Python 432 26 Updated Apr 9, 2025

This is the M-AILABS Speech Dataset

55 3 Updated Nov 28, 2024

Fast and accurate automatic speech recognition (ASR) for edge devices

Python 2,677 141 Updated Feb 26, 2025

AutoVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss

Python 1,050 210 Updated Oct 23, 2024

llama3 implementation one matrix multiplication at a time

Jupyter Notebook 14,855 1,254 Updated May 23, 2024

Reproduction of DeepSeek-R1

Python 221 21 Updated Apr 14, 2025

The official implementation of EmoSphere++

Python 81 8 Updated Apr 14, 2025

PyTorch implementation of "CleanMel: Mel-Spectrogram Enhancement for Improving Both Speech Quality and ASR".

Python 47 2 Updated Apr 15, 2025

Audio generation using diffusion models, in PyTorch.

Python 2,034 173 Updated Jun 12, 2023

[EMNLP 2024] ESC: Efficient Speech Coding with Cross-Scale Residual Vector Quantized Transformers

Jupyter Notebook 111 4 Updated Mar 20, 2025

The repository of Typhoon2-Audio, a Thai audio-language model that supports speech-in and speech-out

Python 14 1 Updated Jan 27, 2025

The Vokan Architecture (Tsukasa speech based)

Jupyter Notebook 9 1 Updated Feb 10, 2025

A lightweight StyleTTS2 and Vokan inference library

Python 4 1 Updated Mar 17, 2025

[ICASSP 2025] Official PyTorch implementation of "Large Language Models are Strong Audio-Visual Speech Recognition Learners".

Python 16 1 Updated Mar 10, 2025

Awesome Speech Dataset, including download links and a brief explanation for each resource. These datasets provide diverse and high-quality speech data covering various domains such as conversation…

8 Updated Mar 13, 2025

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

Python 5,573 541 Updated Mar 24, 2025

A collection of all our phonemizers for dataset construction and inference

Python 22 2 Updated Feb 21, 2025

A Frontier Japanese Speech Generation net

Jupyter Notebook 30 11 Updated Mar 11, 2025

A repository to unravel the language of GPUs, making their kernel conversations easy to understand

Python 175 7 Updated Apr 13, 2025
Python 52 4 Updated Mar 21, 2025