- Universitat Pompeu Fabra
- Barcelona, Spain
- https://www.linkedin.com/in/guillermo-cámbara-ruiz-43312a68/
- @guillecambara
Stars
Collection of audio-focused loss functions in PyTorch
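A typical use of such a loss, as a minimal sketch assuming auraloss's MultiResolutionSTFTLoss class (check the repo's README for the exact constructor arguments and expected input shapes):

```python
# Minimal sketch, assuming auraloss's MultiResolutionSTFTLoss class and a
# (batch, channels, samples) waveform layout.
import torch
import auraloss

mrstft = auraloss.freq.MultiResolutionSTFTLoss()

pred = torch.randn(4, 1, 16000, requires_grad=True)  # predicted waveforms
target = torch.randn(4, 1, 16000)                    # reference waveforms
loss = mrstft(pred, target)
loss.backward()
```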
speacher is a speech teacher framework that enables research in curriculum learning for speech recognition and quality assessment for speech synthesis models.
A multi-voice TTS system trained with an emphasis on quality
PyTorch reimplementation of "Keyword Transformer: A Self-Attention Model for Keyword Spotting"
Implementation of Perceiver, General Perception with Iterative Attention, in PyTorch
PyTorch implementation of Dynamic Convolution: Attention over Convolution Kernels (CVPR 2020)
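For context, a rough sketch of the dynamic-convolution idea: per-sample attention weights over K candidate kernels, aggregated into one kernel before the convolution. The module name and shapes below are my own illustration, not the repo's API.

```python
# Illustrative dynamic convolution: attention over K parallel kernels.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicConv2d(nn.Module):
    def __init__(self, in_ch, out_ch, kernel_size, num_kernels=4):
        super().__init__()
        # K candidate kernels, mixed per sample by attention weights
        self.weight = nn.Parameter(
            torch.randn(num_kernels, out_ch, in_ch, kernel_size, kernel_size))
        self.attn = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(in_ch, num_kernels))

    def forward(self, x):
        b, c, h, w = x.shape
        k = self.weight.shape[-1]
        out_ch = self.weight.shape[1]
        pi = F.softmax(self.attn(x), dim=-1)                    # (B, K)
        w_agg = torch.einsum('bk,koiuv->boiuv', pi, self.weight) # per-sample kernel
        # Grouped-conv trick: one conv call for the whole batch
        x = x.reshape(1, b * c, h, w)
        w_agg = w_agg.reshape(b * out_ch, c, k, k)
        y = F.conv2d(x, w_agg, padding=k // 2, groups=b)
        return y.reshape(b, out_ch, y.shape[-2], y.shape[-1])

layer = DynamicConv2d(8, 16, 3)
out = layer(torch.randn(2, 8, 32, 32))  # (2, 16, 32, 32)
```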
A library for efficient similarity search and clustering of dense vectors.
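A minimal faiss usage example (exact L2 nearest-neighbour search over random vectors):

```python
# Exact (brute-force) L2 search with faiss.
import numpy as np
import faiss

d = 128                                                # vector dimensionality
xb = np.random.random((10000, d)).astype('float32')   # database vectors
xq = np.random.random((5, d)).astype('float32')       # query vectors

index = faiss.IndexFlatL2(d)   # exact L2 index
index.add(xb)                  # index the database
D, I = index.search(xq, 4)     # distances and ids of the 4 nearest neighbours
```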
Vector (and Scalar) Quantization, in PyTorch
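A short sketch assuming the vector_quantize_pytorch package's VectorQuantize module and its (quantized, indices, commitment loss) return convention:

```python
# Sketch assuming vector_quantize_pytorch's VectorQuantize API.
import torch
from vector_quantize_pytorch import VectorQuantize

vq = VectorQuantize(dim=256, codebook_size=512)

x = torch.randn(1, 1024, 256)            # (batch, sequence, feature dim)
quantized, indices, commit_loss = vq(x)  # quantized has the same shape as x
```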
Continuous Augmented Positional Embeddings (CAPE) implementation for PyTorch
"Deep Generative Modeling": Introductory Examples
😎 A curated list of awesome GitHub profiles, updated in real time
Collecting research materials on EBM/EBL (Energy Based Models, Energy Based Learning)
AlexK-PL / tacotron2
Forked from NVIDIA/tacotron2. Tacotron 2 - PyTorch implementation with faster-than-realtime inference
AlexK-PL / GST-Tacotron
Forked from KinglittleQ/GST-Tacotron. A PyTorch implementation of Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis
AlexK-PL / Tacotron2-1
Forked from kaituoxu/Tacotron2. A PyTorch implementation of Tacotron2, an end-to-end text-to-speech (TTS) system described in "Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions".
A PyTorch implementation of a text-to-speech system based on NVIDIA's Tacotron2 text2mel model plus a neural vocoder
This is our work on learning speaking style in speech synthesis using only the pitch-frequency sub-band as the speaker reference. We trained a modified version of NVIDIA's Tacotron2 model but…
An adaptation of NVIDIA's PyTorch Tacotron2 with unsupervised Global Style Tokens. Instead of using the whole mel-scale spectrogram representation as the GST input, we extracted and used only the pitch…
An adaptation of NVIDIA's PyTorch Tacotron2 with unsupervised Global Style Tokens, trained on the English read-speech LJSpeech dataset.
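To make the GST idea in the two adaptations above concrete, here is an illustrative style-token layer in which a reference embedding (for example one derived from the pitch sub-band of a mel spectrogram) attends over a bank of learned tokens. Names and shapes are my own sketch, not the forks' actual code.

```python
# Illustrative Global-Style-Token layer: a reference embedding attends over
# a bank of learned style tokens to produce a conditioning vector.
import torch
import torch.nn as nn

class StyleTokenLayer(nn.Module):
    def __init__(self, ref_dim=128, num_tokens=10, token_dim=256, num_heads=4):
        super().__init__()
        self.tokens = nn.Parameter(torch.randn(num_tokens, token_dim))
        self.query_proj = nn.Linear(ref_dim, token_dim)
        self.attn = nn.MultiheadAttention(
            embed_dim=token_dim, num_heads=num_heads, batch_first=True)

    def forward(self, ref_embedding):                      # (B, ref_dim)
        q = self.query_proj(ref_embedding).unsqueeze(1)    # (B, 1, token_dim)
        kv = self.tokens.unsqueeze(0).expand(ref_embedding.size(0), -1, -1)
        style, _ = self.attn(q, kv, kv)                    # (B, 1, token_dim)
        return style.squeeze(1)                            # style conditioning

gst = StyleTokenLayer()
style_vec = gst(torch.randn(8, 128))  # (8, 256), to be fed to the decoder
```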
Self-Supervised Speech Pre-training and Representation Learning Toolkit
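A heavily hedged sketch assuming s3prl's torch.hub entry points and its list-of-waveforms calling convention; consult the repo's documentation for the exact interface and upstream names.

```python
# Assumed usage: load a pretrained upstream via torch.hub and extract
# layer-wise representations (check s3prl's docs for exact keys/names).
import torch

upstream = torch.hub.load('s3prl/s3prl', 'wav2vec2')  # pretrained speech encoder
upstream.eval()

wavs = [torch.randn(16000)]             # one second of 16 kHz audio
with torch.no_grad():
    outputs = upstream(wavs)            # dict of representations
hidden = outputs["hidden_states"]       # layer-wise features (per the docs)
```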
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
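A quick speech-flavoured example using the pipeline API; the checkpoint id and file path are just illustrative.

```python
# Speech recognition with the transformers pipeline API.
from transformers import pipeline

asr = pipeline("automatic-speech-recognition",
               model="facebook/wav2vec2-base-960h")  # example public checkpoint

result = asr("speech_sample.wav")   # path to a local audio file (assumed)
print(result["text"])
```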
Pretrain and finetune ANY AI model of ANY size on multiple GPUs and TPUs with zero code changes.
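A minimal LightningModule/Trainer sketch on toy data, using the standard pytorch_lightning API:

```python
# Toy regression with PyTorch Lightning: define the module, then fit.
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
import pytorch_lightning as pl

class ToyRegressor(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.net = nn.Linear(16, 1)

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = nn.functional.mse_loss(self.net(x), y)
        self.log("train_loss", loss)
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=1e-3)

dataset = TensorDataset(torch.randn(256, 16), torch.randn(256, 1))
trainer = pl.Trainer(max_epochs=1, accelerator="auto", devices=1)
trainer.fit(ToyRegressor(), DataLoader(dataset, batch_size=32))
```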