# Using Large Language Models for the Voice-Activated Tracking of Everyday Interactions

Poster: VoCopilot: Enabling Voice-Activated Tracking for Everyday Interactions
Authors:
- Goh Sheen An
- Ambuj Varshney
Publication Details:
- Conference: MobiSys '23: Proceedings of the 21st Annual International Conference on Mobile Systems, Applications and Services
- DOI: https://doi.org/10.1145/3581791.3597375
This repository contains the code for both the embedded device and the backend, which together run the end-to-end VoCopilot system.
Embedded Device:
1. To get started with the frontend, train a TinyML model for Keyword Spotting (KWS) using Edge Impulse and deploy it onto the embedded device.
   - For an example of a trained Edge Impulse project, refer to [].
   - Remember to run the `.sh` script to deploy the TinyML model onto the Nicla Voice (a hedged example of this step is sketched after this list).
2. Ensure the following prerequisites are met before running step 3:
   - Follow this guide to install the required Arduino libraries.
   - Connect an SD card module and an SD card to the Nicla Voice, following this documentation.
3. After the firmware and model have been deployed onto the Nicla Voice, upload the sketch in `./embedded_device/nicla_voice/record_to_sd.ino` to the Nicla Voice using the Arduino IDE (a command-line alternative is sketched after this list).
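
The `.sh` deployment script in step 1 comes from the firmware export that Edge Impulse generates for the Nicla Voice. As a minimal sketch of that step, the commands below use placeholder names (`nicla-voice-firmware.zip`, `flash_linux.sh`); use the actual archive and script produced by your Edge Impulse project.

```sh
# Unpack the firmware exported from Edge Impulse
# (the archive name is a placeholder).
unzip nicla-voice-firmware.zip -d nicla-voice-firmware
cd nicla-voice-firmware

# With the Nicla Voice connected over USB, run the platform-specific
# flashing script. flash_linux.sh is a placeholder name; run whichever
# .sh script ships with your export.
chmod +x flash_linux.sh
./flash_linux.sh
```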
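
If you prefer the command line to the Arduino IDE for step 3, `arduino-cli` can compile and upload the same sketch. This is only a sketch under assumptions: the FQBN `arduino:mbed_nicla:nicla_voice`, the serial port `/dev/ttyACM0`, and the sketch-folder layout should all be checked against `arduino-cli board list` and your local checkout.

```sh
# Install the board package for the Nicla family
# (core name is an assumption; verify with `arduino-cli core search nicla`).
arduino-cli core install arduino:mbed_nicla

# arduino-cli expects a sketch folder named after its .ino file, so this
# assumes record_to_sd.ino has been placed in a record_to_sd/ folder.
arduino-cli compile --fqbn arduino:mbed_nicla:nicla_voice embedded_device/nicla_voice/record_to_sd
arduino-cli upload -p /dev/ttyACM0 --fqbn arduino:mbed_nicla:nicla_voice embedded_device/nicla_voice/record_to_sd
```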
Backend:
1. `cd` to the `backend` folder.
2. Create an `.env` file, with parameters similar to those in `.env.example` (a hedged example follows this list).
3. Start the pipenv shell with `pipenv shell` (make sure you have pipenv installed).
4. Install the dependencies with `pipenv install`.
5. Ensure `ffmpeg` is installed (e.g. with `brew install ffmpeg` on macOS). If there are errors with `whisper` or `ffmpeg`, try running `brew reinstall tesseract`.
6. Install `llama 2` via ollama.
7. Start the application via `python3 app/main.py`.
8. Drop a `.wav` or `.g722` file into `WATCH_FILES_PATH` and let the server pick up the file, transcribe it, and summarize it (steps 3 to 8 are collected into a single walkthrough after this list).
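
`.env.example` defines the authoritative keys; as a minimal sketch of step 2, an `.env` might look like the following. Only `WATCH_FILES_PATH` is named in this README; the other two entries are hypothetical placeholders.

```sh
# Directory the server watches for incoming .wav/.g722 recordings
# (the only variable this README names).
WATCH_FILES_PATH=./recordings

# Hypothetical placeholders; copy the real keys from .env.example.
WHISPER_MODEL=base
OLLAMA_MODEL=llama2
```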
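
Putting backend steps 3 to 8 together, a typical first run on macOS might look like the walkthrough below. Every command except `ollama pull llama2` and the placeholder `sample.wav` is taken directly from the steps above; `ollama pull llama2` is the standard ollama command for fetching Llama 2 and assumes ollama is already installed.

```sh
cd backend

# Steps 3-4: enter the virtual environment and install dependencies.
pipenv shell
pipenv install

# Step 5: make sure ffmpeg is available (Homebrew shown for macOS).
brew install ffmpeg

# Step 6: fetch Llama 2 through ollama.
ollama pull llama2

# Step 7: start the watch-folder server.
python3 app/main.py

# Step 8 (from another terminal): drop a recording into the watched folder.
# sample.wav is a placeholder; $WATCH_FILES_PATH must match your .env.
cp sample.wav "$WATCH_FILES_PATH/"
```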
Benchmark:
- To run the benchmark, run the command `python3 app/benchmark.py`.