This repo contains a ready-to-use speech-to-speech assistant designed to run on edge devices like the Jetson Nano. It uses ollama in combination with Vosk and Piper to achieve real-time conversational capabilities with minimal system requirements.
- Install ollama: https://ollama.com/download
- Install `jq` (can be skipped; it is only needed to run `./install.sh`). For Debian-based systems:

  ```
  sudo apt-get install jq
  ```
- Clone the repository:

  ```
  git clone https://github.com/M1chol/atom-assistant
  cd atom-assistant
  ```
- Manually install all the ollama models you want to use:

  ```
  ollama serve
  ollama run <MODEL_NAME>
  ```
- Update `config.json` with your values. You need to change:
  - `ollama_model_name` to match a model available in your ollama server.
  - The LLM system prompt (`ollama_system_prompt`) to match your language and requirements.
  - The microphone sample rate (`stt_config/microphone_samplerate`) to match your hardware.
  - The speech-to-text Vosk model (`stt_config/model`) to match your language.
  - The text-to-speech Piper model (`tts_config/model`) to match your language.
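As a rough illustration of how these keys fit together, a sketch of `config.json` might look like the following. The key names come from the list above; every value shown here is a placeholder (model names, sample rate, and prompt are assumptions), so check the repo's actual `config.json` for the authoritative schema and defaults:

```json
{
  "ollama_model_name": "llama3",
  "ollama_system_prompt": "You are a helpful voice assistant. Answer briefly.",
  "stt_config": {
    "microphone_samplerate": 16000,
    "model": "vosk-model-small-en-us-0.15"
  },
  "tts_config": {
    "model": "en_US-lessac-medium"
  }
}
```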
- Run the install script:

  ```
  ./install.sh
  ```
After running the install script, execute:

```
python main.py
```

If the script fails to activate the Python environment (this can happen in non-desktop environments), do it manually:

```
source .venv/bin/activate
python main.py
```

In `config.json`, you can change the reading speed. If your LLM is not fast enough and is blocking speech generation, you can increase `tts_config/length_scale` to slow down the reading speed.
`noise_scale` and `noise_w_scale` control audio variation and speaking variation, respectively. You can also change the volume of the output speech.
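For example, a `tts_config` tuned to read about 20% slower might look like the fragment below. Only the key names (`length_scale`, `noise_scale`, `noise_w_scale`) are taken from this README; the model name and numeric values are illustrative assumptions, and sensible defaults vary by Piper voice model:

```json
"tts_config": {
  "model": "en_US-lessac-medium",
  "length_scale": 1.2,
  "noise_scale": 0.667,
  "noise_w_scale": 0.8
}
```

Values above 1.0 for `length_scale` slow speech down, which gives a slow LLM more time to stream the next chunk of text.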
I have also included example files, for educational purposes, for the ollama, Piper, and Vosk libraries used. They all expand in some way on the simple examples provided by the library contributors.