-
DST Public
Forked from ictnlp/DST
DST is a Decoder-only simultaneous machine translation model, which can conduct policy decision and translation concurrently
Python MIT License Updated Jan 26, 2025 -
speech-trident Public
Forked from ga642381/speech-trident
Awesome speech/audio LLMs, representation learning, and codec models
Updated Nov 27, 2024 -
itsp Public
Forked from Speech-Interaction-Technology-Aalto-U/itsp
Introduction to Speech Processing
Jupyter Notebook Creative Commons Attribution Share Alike 4.0 International Updated Sep 13, 2024 -
SummaryMixing Public
Forked from SamsungLabs/SummaryMixing
This repository implements SummaryMixing, a simpler, faster and much cheaper replacement to self-attention for automatic speech recognition (see: https://arxiv.org/abs/2307.07421). The code is read…
Python Other Updated Aug 30, 2024 -
NaturalVoices Public
Forked from 3loi/NaturalVoices
Jupyter Notebook MIT License Updated Aug 27, 2024 -
audiomentations Public
Forked from iver56/audiomentations
A Python library for audio data augmentation. Inspired by albumentations. Useful for machine learning.
Python MIT License Updated Jun 10, 2024 -
ppg2ppg Public
Zero-Shot Foreign Accent Conversion without a Native Reference
-
Sense_glucose Public
Forked from kathanvyas/Zephyr-BioHArness-Data_Preprocess
The repository contains code for the Sense Project. Contains code for reading the ECG-Summary file from the Zephyr folder and then processing it together with the glucose file. The code contains all necess…
Python Updated Jan 12, 2024 -
DNS-Challenge Public
Forked from microsoft/DNS-Challenge
This repo contains the scripts, models, and required files for the Deep Noise Suppression (DNS) Challenge.
Python Creative Commons Attribution 4.0 International Updated Dec 13, 2023 -
vq-ppg-vc Public
Vector-quantized PPG-based voice conversion
-
vq-bnf-translator Public
Pronunciation correction in vector quantized PPG representation space
-
primehilltop Public
Forked from creativetimofficial/paper-kit-react
Landing page for primehilltop
SCSS MIT License Updated May 5, 2023 -
vq-bnf Public
Vector Quantizing speech representations
-
mellotron Public
Forked from NVIDIA/mellotron
Mellotron: a multispeaker voice synthesis model based on Tacotron 2 GST that can make a voice emote and sing without emotive or singing training data
Jupyter Notebook BSD 3-Clause "New" or "Revised" License Updated Mar 24, 2023 -
ConvSANN Public
Implementation of a Self-Attentive Convolutional Neural Network for the text semantic similarity task.
Python MIT License Updated Mar 24, 2023 -
StyleFlow Public
Forked from RameenAbdal/StyleFlow
StyleFlow: Attribute-conditioned Exploration of StyleGAN-generated Images using Conditional Continuous Normalizing Flows (ACM TOG 2021)
Python Updated Mar 24, 2023 -
watson-stt-wer-python Public
Forked from IBM/watson-stt-wer-python
Utilities for transcribing a set of audio files with IBM Watson Speech to Text (STT), then analyzing the error rate of the STT transcription against a known-good transcription
Python Apache License 2.0 Updated Feb 28, 2023 -
FastSpeech2 Public
Forked from ming024/FastSpeech2
An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"
Python MIT License Updated Jan 10, 2023 -
silero-models Public
Forked from snakers4/silero-models
Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple
Jupyter Notebook Other Updated Dec 26, 2022 -
Conditional-Normalizing-Flow Public
Forked from 5yearsKim/Conditional-Normalizing-Flow
Conditional generative model (Normalizing Flow) and style transfer experiments using this model
Python Updated Nov 23, 2022 -
vc-spk-loss Public
Voice Conversion with additional speaker loss
Jupyter Notebook Apache License 2.0 Updated Nov 21, 2022 -
SRD-VC Public
Forked from YoungSeng/SRD-VC
Speech Representation Disentanglement with Adversarial Mutual Information Learning for One-shot Voice Conversion (Interspeech 2022)
Python Updated Nov 1, 2022