  • Taipei, Taiwan


A lightweight, low-dependency, unified API to use all common reranking and cross-encoder models.

Python 1,339 77 Updated Feb 21, 2025

This repository contains LLM (large language model) interview questions asked at top companies such as Google, Nvidia, Meta, Microsoft, and Fortune 500 companies.

1,144 263 Updated Feb 12, 2025

A robust, efficient, low-latency speech-to-text library with advanced voice activity detection, wake word activation and instant transcription.

Python 6,384 517 Updated Mar 10, 2025

Python tool for converting files and office documents to Markdown.

Python 40,320 1,893 Updated Mar 17, 2025

Max ticket-grabbing bot (maxbot) helps you quickly buy your tickets

Python 167 83 Updated Jan 12, 2023

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing, etc.

Python 8,963 917 Updated Mar 18, 2025

The Memory layer for AI Agents

Python 26,418 2,503 Updated Mar 18, 2025

Enforce the output format (JSON Schema, regex, etc.) of a language model

Python 1,737 76 Updated Feb 26, 2025

Convert any URL to an LLM-friendly input with a simple prefix https://r.jina.ai/

TypeScript 8,312 648 Updated Mar 17, 2025

Implementation for MatMul-free LM.

Python 2,969 187 Updated Nov 5, 2024

AI chat and search for text, news, images and videos using the DuckDuckGo.com search engine.

Python 1,474 154 Updated Mar 17, 2025

Instruct-tune LLaMA on consumer hardware

Jupyter Notebook 18,842 2,228 Updated Jul 29, 2024

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting yo…

TypeScript 83,669 12,326 Updated Mar 19, 2025

A developer reference project for creating Retrieval Augmented Generation (RAG) chatbots on Windows using TensorRT-LLM

TypeScript 2,930 389 Updated Aug 21, 2024

Tuning and Evaluation of RAG pipeline. (Automated optimization to be added soon)

Python 263 23 Updated Mar 19, 2024

Use ArXiv ChatGuru to talk to research papers. This app uses LangChain, OpenAI, Streamlit, and Redis as a vector database/semantic cache.

Python 540 68 Updated Feb 25, 2025

LlamaIndex is the leading framework for building LLM-powered agents over your data.

Python 40,149 5,719 Updated Mar 19, 2025

PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech). Reproduced demo: https://lifeiteng.github.io/valle/index.html

Python 2,105 322 Updated Nov 14, 2023

An open-source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io/vallex/

Python 7,830 776 Updated Feb 11, 2024

Inference code for CodeLlama models

Python 16,244 1,902 Updated Aug 12, 2024

Chat with your documents on your local device using GPT models. No data leaves your device and 100% private.

Python 20,383 2,267 Updated Mar 2, 2025

SoftVC VITS Singing Voice Conversion

Python 26,742 4,940 Updated Nov 11, 2023

Core Engine of Singing Voice Conversion & Singing Voice Clone

Python 2,741 921 Updated Apr 23, 2024

Best-practice TTS based on BERT and VITS with some Natural Speech features from Microsoft; supports ONNX streaming output.

Python 1,180 171 Updated Feb 5, 2024

PITS: Variational Pitch Inference for End-to-end Pitch-controllable TTS without External Pitch Predictor

Python 276 35 Updated Jul 16, 2023

Inference code for Llama models

Python 57,897 9,718 Updated Jan 26, 2025

XPhoneBERT: A Pre-trained Multilingual Model for Phoneme Representations for Text-to-Speech (INTERSPEECH 2023)

Python 317 39 Updated Jul 22, 2024

Official Code for DragGAN (SIGGRAPH 2023)

Python 35,894 3,448 Updated May 18, 2024

A VITS fine-tuning pipeline for fast speaker-adaptation TTS and many-to-many voice conversion

Python 4,856 723 Updated Jan 21, 2025

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Python 20,936 2,609 Updated Mar 4, 2025