
A locally hosted, AI generated podcast from an rss feed.

Powered by -

Features
- RSS feed parsing and article extraction
- Article summarization using Ollama
- Text-to-speech conversion using multiple engines:
- Kokoro TTS (Recommended)
- MLX Audio TTS
- Coqui TTS
- Podcast generation with customizable settings
- Web interface for configuration and control
Requirements
- Go 1.21 or later
- Ollama (for article summarization)
- One of the following TTS engines:
- Kokoro TTS (recommended)
- MLX Audio TTS
- Coqui TTS
Installation
- Clone the repository:
git clone https://github.com/intothevoid/rss2podcast.git
cd rss2podcast
- Install dependencies:
go mod download
- Configure the application by editing
config.yaml or using the web interface.
Configuration
The application can be configured using the web interface or by editing the config.yaml file. The following settings are available:
url: The RSS feed URL to parse
max_articles: Maximum number of articles to process
filters: List of filters to apply to articles
Ollama Settings
end_point: The Ollama API endpoint
model: The Ollama model to use for summarization
Podcast Settings
subject: The podcast subject
podcaster: The podcaster name
TTS Settings
engine: The TTS engine to use ("kokoro", "mlx", or "coqui")
kokoro: Kokoro TTS settings
url: The Kokoro TTS API endpoint
voice: The voice to use
speed: The speech speed (0.25 to 4.0)
format: The audio format (mp3, opus, flac, wav, pcm)
mlx: MLX Audio TTS settings
url: The MLX Audio TTS API endpoint
voice: The voice to use
speed: The speech speed (0.5 to 2.0)
format: The audio format (mp3, wav)
coqui: Coqui TTS settings
url: The Coqui TTS API endpoint
Usage
- Start the application:
go run cmd/rss2podcast/main.go
-
Access the web interface at http://localhost:8080
-
Configure the application using the web interface or edit config.yaml
-
The application will:
- Parse the RSS feed
- Extract and summarize articles
- Convert the summary to audio using the selected TTS engine
- Generate a podcast file
TTS Engines
Kokoro TTS (Recommended)
Kokoro TTS offers OpenAI-compatible speech synthesis with support for multiple voices and formats. It provides excellent quality with low latency.
MLX Audio TTS
MLX Audio TTS is a powerful text-to-speech engine that provides high-quality speech synthesis with support for multiple voices and formats. It offers additional features like direct audio playback and output folder management.
Coqui TTS
Coqui TTS provides high-quality speech synthesis with support for multiple voices and formats.
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
How it works
The application reads an rss feed, extracts the articles and summarises them.
RSS + Ollama + TTS = Podcast
The application reads an rss feed and extracts the articles. Each of these articles are then processed by scraping the article content.
Ollama
The application uses a locally hosted version of Ollama. The Ollama API is used to summarise the article content. Default model used is mistral:7b
TTS
The summarised article content is then converted into an audio podcast using the Coqui TTS API.
Dependencies
This project requires the following dependencies to be installed on your system.
Ollama
You can install the Ollama server by following the instructions on the official website.
Ollama needs to be running on your local machine for the application to work. The application is configured to use the default Ollama server URL http://localhost:11434/api/generate. This can be changed via the config.yaml file.
ffmpeg
ffmpeg is a command-line tool for handling multimedia files. It is used to convert the generated audio files to the MP3 format.
macOS
You can use Homebrew to install ffmpeg on macOS:
brew install ffmpeg
Windows
- Download the
ffmpeg build for Windows from the official website.
- Extract the downloaded ZIP file.
- Add the
bin directory from the extracted folder to your system's PATH.
Linux
The installation command depends on your Linux distribution.
Ubuntu/Debian
sudo apt update
sudo apt install ffmpeg
Kokoro TTS (Recommended)
Kokoro TTS is a text-to-speech synthesis system that uses deep learning to create human-like speech from text. You can install the Kokoro TTS server by following the instructions on the official website.
Docker:
Create a docker-compose.yml file and add the following:
services:
kokoro-fastapi-cpu:
ports:
- 8880:8880
image: ghcr.io/remsky/kokoro-fastapi-cpu:latest # or v0.2.3 for last stable version
Start the server by running the following command:
docker compose up -d
This will start the Kokoro TTS server on port 8880. The server provides a REST API for text-to-speech conversion.
Coqui TTS
Coqui TTS is a text-to-speech synthesis system that uses deep learning to create human-like speech from text. You can install the Coqui TTS server by following the instructions on the official website.
Docker
Start the container by using the following command:
docker run -d -p 5002:5002 --platform linux/amd64 --entrypoint /usr/local/bin/tts-server ghcr.io/coqui-ai/tts-cpu --model_name tts_models/en/ljspeech/vits
MLX Audio TTS
MLX Audio TTS is a text-to-speech synthesis system that uses deep learning to create human-like speech from text. You can install the MLX Audio TTS server by following the instructions on the official website.
Docker
As of this writing, MLX Audio TTS needs to be run locally as Docker does not allow GPU access on Apple Silicon.
# Install the package
pip install mlx-audio
# Create a virtual environment
python -m venv venv
# Activate the virtual environment
source venv/bin/activate
# Install the dependencies
pip install -r requirements.txt
# Run the server
mlx_audio.server
rss2podcast will automatically request the MLX Audio TTS server to generate the audio file.
Testing
To run the tests, use the following command:
go test ./...