A locally hosted, AI-generated podcast from an RSS feed.
Features:
- RSS feed parsing and article extraction
- Article summarization using Ollama
- Text-to-speech conversion using multiple engines:
  - Kokoro TTS (recommended)
  - MLX Audio TTS
  - Coqui TTS
- Podcast generation with customizable settings
- Web interface for configuration and control

Prerequisites:
- Go 1.21 or later
- Ollama (for article summarization)
- One of the following TTS engines:
  - Kokoro TTS (recommended)
  - MLX Audio TTS
  - Coqui TTS

Installation:
- Clone the repository:
  git clone https://github.com/intothevoid/rss2podcast.git
  cd rss2podcast
- Install dependencies:
  go mod download
- Configure the application by editing config.yaml or using the web interface.

The application can be configured through the web interface or by editing the config.yaml file. The following settings are available:

- url: The RSS feed URL to parse
- max_articles: Maximum number of articles to process
- filters: List of filters to apply to articles

- end_point: The Ollama API endpoint
- model: The Ollama model to use for summarization

- subject: The podcast subject
- podcaster: The podcaster name

- engine: The TTS engine to use ("kokoro", "mlx", or "coqui")
- kokoro: Kokoro TTS settings
  - url: The Kokoro TTS API endpoint
  - voice: The voice to use
  - speed: The speech speed (0.25 to 4.0)
  - format: The audio format (mp3, opus, flac, wav, pcm)
- mlx: MLX Audio TTS settings
  - url: The MLX Audio TTS API endpoint
  - voice: The voice to use
  - speed: The speech speed (0.5 to 2.0)
  - format: The audio format (mp3, wav)
- coqui: Coqui TTS settings
  - url: The Coqui TTS API endpoint
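
The TTS settings above come with documented ranges, so they can be checked before a run. The following is a minimal sketch, not the project's actual code: the type names, the `Validate` method, and the placeholder voice are assumptions made for illustration, while the setting names, speed ranges, and formats are taken from the list above.

```go
package main

import (
	"errors"
	"fmt"
)

// EngineSettings mirrors the per-engine settings listed above.
// The Go type and field names are hypothetical; the yaml tags follow the documented keys.
type EngineSettings struct {
	URL    string  `yaml:"url"`
	Voice  string  `yaml:"voice"`
	Speed  float64 `yaml:"speed"`
	Format string  `yaml:"format"`
}

// TTSConfig selects an engine and carries the settings for each one.
type TTSConfig struct {
	Engine string         `yaml:"engine"`
	Kokoro EngineSettings `yaml:"kokoro"`
	MLX    EngineSettings `yaml:"mlx"`
	Coqui  EngineSettings `yaml:"coqui"`
}

// limit holds the documented speed range and audio formats for one engine.
type limit struct {
	minSpeed, maxSpeed float64
	formats            []string
}

// limits covers only the engines for which ranges are documented.
var limits = map[string]limit{
	"kokoro": {0.25, 4.0, []string{"mp3", "opus", "flac", "wav", "pcm"}},
	"mlx":    {0.5, 2.0, []string{"mp3", "wav"}},
}

// Validate checks the settings of the selected engine against the documented limits.
func (c TTSConfig) Validate() error {
	var s EngineSettings
	switch c.Engine {
	case "kokoro":
		s = c.Kokoro
	case "mlx":
		s = c.MLX
	case "coqui":
		s = c.Coqui
	default:
		return fmt.Errorf("unknown TTS engine %q", c.Engine)
	}
	if s.URL == "" {
		return errors.New(c.Engine + ": url is required")
	}
	lim, ok := limits[c.Engine]
	if !ok {
		return nil // only a URL is documented for this engine (coqui)
	}
	if s.Speed < lim.minSpeed || s.Speed > lim.maxSpeed {
		return fmt.Errorf("%s: speed %.2f outside %.2f to %.2f", c.Engine, s.Speed, lim.minSpeed, lim.maxSpeed)
	}
	for _, f := range lim.formats {
		if f == s.Format {
			return nil
		}
	}
	return fmt.Errorf("%s: unsupported format %q", c.Engine, s.Format)
}

func main() {
	cfg := TTSConfig{
		Engine: "kokoro",
		// "example-voice" is a placeholder, not a real voice name.
		Kokoro: EngineSettings{URL: "http://localhost:8880", Voice: "example-voice", Speed: 1.0, Format: "mp3"},
	}
	if err := cfg.Validate(); err != nil {
		fmt.Println("invalid config:", err)
		return
	}
	fmt.Println("config OK for engine", cfg.Engine)
}
```

Loading config.yaml into these structs would need a YAML library, which is left out here to keep the sketch within the standard library.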

Usage:
- Start the application:
  go run cmd/rss2podcast/main.go
- Access the web interface at http://localhost:8080
- Configure the application using the web interface or edit config.yaml
- The application will:
  - Parse the RSS feed
  - Extract and summarize articles
  - Convert the summary to audio using the selected TTS engine
  - Generate a podcast file

Kokoro TTS offers OpenAI-compatible speech synthesis with support for multiple voices and formats. It provides excellent quality with low latency.
MLX Audio TTS is a powerful text-to-speech engine that provides high-quality speech synthesis with support for multiple voices and formats. It offers additional features like direct audio playback and output folder management.
Coqui TTS provides high-quality speech synthesis with support for multiple voices and formats.
This project is licensed under the MIT License - see the LICENSE file for details.
- Ollama for the LLM API
- Kokoro TTS for the TTS engine
- MLX Audio TTS for the TTS engine
- Coqui TTS for the TTS engine
The application reads an RSS feed, extracts the articles and summarises them.
RSS + Ollama + TTS = Podcast
The application reads an RSS feed and extracts the articles. Each article is then processed by scraping its content.
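
To illustrate the feed-reading step, here is a minimal sketch that parses an RSS 2.0 document with Go's standard `encoding/xml` package and applies an article limit like `max_articles`. The struct names, the `parseFeed` helper, and the sample feed are assumptions for illustration; the project may use a dedicated feed library instead.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// rssFeed and rssItem model only the parts of an RSS 2.0 document used here.
type rssFeed struct {
	Channel struct {
		Title string    `xml:"title"`
		Items []rssItem `xml:"item"`
	} `xml:"channel"`
}

type rssItem struct {
	Title string `xml:"title"`
	Link  string `xml:"link"`
}

// sampleFeed is made-up data standing in for a downloaded feed.
const sampleFeed = `<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Example Feed</title>
    <item><title>First</title><link>https://example.com/1</link></item>
    <item><title>Second</title><link>https://example.com/2</link></item>
    <item><title>Third</title><link>https://example.com/3</link></item>
  </channel>
</rss>`

// parseFeed decodes the feed and keeps at most maxArticles items.
// A maxArticles of zero or less means "no limit" in this sketch.
func parseFeed(data []byte, maxArticles int) ([]rssItem, error) {
	var feed rssFeed
	if err := xml.Unmarshal(data, &feed); err != nil {
		return nil, fmt.Errorf("parse feed: %w", err)
	}
	items := feed.Channel.Items
	if maxArticles > 0 && len(items) > maxArticles {
		items = items[:maxArticles]
	}
	return items, nil
}

func main() {
	items, err := parseFeed([]byte(sampleFeed), 2)
	if err != nil {
		fmt.Println(err)
		return
	}
	for _, it := range items {
		// Each link would next be fetched and scraped for the article body.
		fmt.Println(it.Title, "->", it.Link)
	}
}
```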
The application uses a locally hosted instance of Ollama. The Ollama API is used to summarise the article content. The default model is mistral:7b.
The summarised article content is then converted into an audio podcast using the selected TTS engine (Kokoro, MLX Audio, or Coqui).
This project requires the following dependencies to be installed on your system.
You can install the Ollama server by following the instructions on the official website.
Ollama needs to be running on your local machine for the application to work. The application is configured to use the default Ollama server URL, http://localhost:11434/api/generate. This can be changed in the config.yaml file.
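
As a rough sketch of what a summarization call to that endpoint can look like, the code below posts a prompt to Ollama's generate API and reads back the `response` field. The helper names and the prompt wording are assumptions for illustration, not the project's implementation; the endpoint and model are the defaults mentioned above.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"time"
)

// generateRequest and generateResponse cover only the fields this sketch uses.
type generateRequest struct {
	Model  string `json:"model"`
	Prompt string `json:"prompt"`
	Stream bool   `json:"stream"`
}

type generateResponse struct {
	Response string `json:"response"`
}

// newSummaryRequest builds the POST request. The prompt text is a made-up example.
func newSummaryRequest(endpoint, model, article string) (*http.Request, error) {
	body, err := json.Marshal(generateRequest{
		Model:  model,
		Prompt: "Summarize the following article for a podcast:\n\n" + article,
		Stream: false, // ask for one JSON object instead of a stream of chunks
	})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, endpoint, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

// summarize sends the request and returns the generated text.
func summarize(client *http.Client, endpoint, model, article string) (string, error) {
	req, err := newSummaryRequest(endpoint, model, article)
	if err != nil {
		return "", err
	}
	resp, err := client.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("ollama returned status %s", resp.Status)
	}
	var out generateResponse
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return "", err
	}
	return out.Response, nil
}

func main() {
	client := &http.Client{Timeout: 2 * time.Minute}
	summary, err := summarize(client, "http://localhost:11434/api/generate", "mistral:7b", "Example article text.")
	if err != nil {
		fmt.Println("summarization failed (is Ollama running?):", err)
		return
	}
	fmt.Println(summary)
}
```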
ffmpeg is a command-line tool for handling multimedia files. It is used to convert the generated audio files to the MP3 format.
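
A conversion like this can be driven from Go with `os/exec`. The sketch below only builds the command and checks that ffmpeg is on the PATH; the `mp3Command` helper and the file names are hypothetical, and the exact flags the project passes to ffmpeg are not documented here.

```go
package main

import (
	"fmt"
	"os/exec"
)

// mp3Command builds an ffmpeg invocation that converts inPath to outPath.
// -y overwrites an existing output file; the output format is inferred
// from the .mp3 extension of outPath.
func mp3Command(inPath, outPath string) *exec.Cmd {
	return exec.Command("ffmpeg", "-y", "-i", inPath, outPath)
}

func main() {
	if _, err := exec.LookPath("ffmpeg"); err != nil {
		fmt.Println("ffmpeg not found on PATH:", err)
		return
	}
	cmd := mp3Command("episode.wav", "episode.mp3")
	fmt.Println("would run:", cmd.Args)
	// To perform the conversion, call cmd.CombinedOutput() and check the error.
}
```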
You can use Homebrew to install ffmpeg on macOS:
brew install ffmpeg

On Windows:
- Download the ffmpeg build for Windows from the official website.
- Extract the downloaded ZIP file.
- Add the bin directory from the extracted folder to your system's PATH.

The installation command depends on your Linux distribution. On Debian or Ubuntu, for example:
sudo apt update
sudo apt install ffmpeg
Kokoro TTS is a text-to-speech synthesis system that uses deep learning to create human-like speech from text. You can install the Kokoro TTS server by following the instructions on the official website.
Create a docker-compose.yml file and add the following:
services:
  kokoro-fastapi-cpu:
    ports:
      - 8880:8880
    image: ghcr.io/remsky/kokoro-fastapi-cpu:latest # or v0.2.3 for last stable version
Start the server by running the following command:
docker compose up -d
This will start the Kokoro TTS server on port 8880. The server provides a REST API for text-to-speech conversion.
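
Because Kokoro TTS is described as OpenAI-compatible, a request to it can be sketched in the shape of the OpenAI speech API. The `/v1/audio/speech` route, the `kokoro` model name, the helper names, and the placeholder voice below are assumptions to verify against the server's own documentation; the port, speed range, and formats come from this README.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"strings"
)

// speechRequest follows the OpenAI-style speech payload (assumed, not confirmed by this README).
type speechRequest struct {
	Model          string  `json:"model"`
	Input          string  `json:"input"`
	Voice          string  `json:"voice"`
	ResponseFormat string  `json:"response_format"`
	Speed          float64 `json:"speed"`
}

// newSpeechRequest builds a POST to the assumed /v1/audio/speech route.
func newSpeechRequest(baseURL string, r speechRequest) (*http.Request, error) {
	body, err := json.Marshal(r)
	if err != nil {
		return nil, err
	}
	url := strings.TrimRight(baseURL, "/") + "/v1/audio/speech"
	req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

// synthesize sends the request and copies the returned audio bytes to w.
func synthesize(client *http.Client, baseURL string, r speechRequest, w io.Writer) error {
	req, err := newSpeechRequest(baseURL, r)
	if err != nil {
		return err
	}
	resp, err := client.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("TTS server returned status %s", resp.Status)
	}
	_, err = io.Copy(w, resp.Body)
	return err
}

func main() {
	req, err := newSpeechRequest("http://localhost:8880", speechRequest{
		Model:          "kokoro",        // assumed model name
		Input:          "Welcome to today's episode.",
		Voice:          "example-voice", // placeholder, not a real voice name
		ResponseFormat: "mp3",
		Speed:          1.0,
	})
	if err != nil {
		fmt.Println(err)
		return
	}
	// main only shows the request; synthesize would send it and stream the audio to a writer.
	fmt.Println(req.Method, req.URL)
}
```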
Coqui TTS is a text-to-speech synthesis system that uses deep learning to create human-like speech from text. You can install the Coqui TTS server by following the instructions on the official website.
Start the container by using the following command:
docker run -d -p 5002:5002 --platform linux/amd64 --entrypoint /usr/local/bin/tts-server ghcr.io/coqui-ai/tts-cpu --model_name tts_models/en/ljspeech/vits
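
Once the container is up, a quick way to try it is to request audio for a short sentence. The sketch below only builds the request URL. It assumes the Coqui demo server accepts a `text` query parameter on `/api/tts`, which should be checked against the Coqui documentation; the `coquiURL` helper is hypothetical.

```go
package main

import (
	"fmt"
	"net/url"
)

// coquiURL builds a request URL for the assumed /api/tts route,
// passing the text to speak as a query parameter.
func coquiURL(baseURL, text string) (string, error) {
	u, err := url.Parse(baseURL)
	if err != nil {
		return "", err
	}
	u.Path = "/api/tts"
	q := url.Values{}
	q.Set("text", text)
	u.RawQuery = q.Encode()
	return u.String(), nil
}

func main() {
	u, err := coquiURL("http://localhost:5002", "Welcome to today's episode.")
	if err != nil {
		fmt.Println(err)
		return
	}
	// A GET on this URL is expected to return audio data if the assumption holds.
	fmt.Println(u)
}
```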
MLX Audio TTS is a text-to-speech synthesis system that uses deep learning to create human-like speech from text. You can install the MLX Audio TTS server by following the instructions on the official website.
As of this writing, MLX Audio TTS needs to be run locally as Docker does not allow GPU access on Apple Silicon.
# Create a virtual environment
python -m venv venv
# Activate the virtual environment
source venv/bin/activate
# Install the package
pip install mlx-audio
# Install the dependencies
pip install -r requirements.txt
# Run the server
mlx_audio.server
Once the server is running, rss2podcast automatically sends it requests to generate the audio file.
To run the tests, use the following command:
go test ./...