Blueberry is designed to be a virtual assistant, capable of doing the usual virtual assistant tasks, and more. 🫐
Stuff to Note:
- The "creative mode" feature is still in development, and as a result, is highly unstable.
- Adding on to the above, a few dependencies may require additional build tools on Windows to function properly. If an installation fails, consult the error logs for details. An updated requirements.txt file will be released later on.
- Fortunately, all the build tools exist by default on most Linux distros.
- THE SRS DOCUMENTATION FOR THE PROJECT IS IN THE docs/ DIRECTORY.
Blueberry is designed around a client-server model. All the processing of the transcribed audio is done on a Python backend, while the results and the overall GUI are presented through a web-based interface.
This works through a series of stages:
- Stage 1: Audio data is processed in real time on the client side, using the Web Speech API.
- Stage 2: The transcribed text is then parsed for the wake-word and, on detection, is streamed to the Python server in real time over websockets for further processing.
- Stage 3: Our backend analyses the text and triggers an appropriate function accordingly. (This could be a web search, weather information lookup, etc.)
- Stage 4: The result of the function is emitted to the client in real time.
- Stage 5: An audio output is generated on the frontend using the Web Speech API.
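Stages 2 and 3 can be pictured with a minimal Python sketch. This is not the project's actual code: the wake-word value, the handler functions, and the keyword table are all hypothetical stand-ins, and in Blueberry itself the wake-word check runs in the browser before the text reaches the server. The sketch only shows the general idea of stripping the wake-word from a transcript and routing the remaining command to a function.

```python
import re
from typing import Callable, Optional

WAKE_WORD = "blueberry"  # assumption: the real wake-word may differ


def extract_command(transcript: str, wake_word: str = WAKE_WORD) -> Optional[str]:
    """Return the text spoken after the wake-word, or None if it is absent."""
    pattern = rf"\b{re.escape(wake_word)}\b[\s,.:!?]*(.*)"
    match = re.search(pattern, transcript, flags=re.IGNORECASE | re.DOTALL)
    if match is None:
        return None
    return match.group(1).strip()


# Hypothetical handlers standing in for the real backend functions.
def web_search(query: str) -> str:
    return f"Searching the web for: {query}"


def weather_lookup(query: str) -> str:
    return f"Looking up the weather for: {query}"


# Keyword -> handler table (hypothetical); the first matching keyword wins.
HANDLERS: list[tuple[str, Callable[[str], str]]] = [
    ("weather", weather_lookup),
    ("search", web_search),
]


def dispatch(command: str) -> str:
    """Pick a handler based on keywords found in the command."""
    lowered = command.lower()
    for keyword, handler in HANDLERS:
        if keyword in lowered:
            return handler(command)
    # Fall back to a web search when no keyword matches.
    return web_search(command)


def handle_transcript(transcript: str) -> Optional[str]:
    """Ignore text without the wake-word; otherwise dispatch the command."""
    command = extract_command(transcript)
    if command is None:
        return None
    return dispatch(command)
```

Calling `handle_transcript("hey Blueberry, what's the weather in Delhi")` routes the command to the weather handler, while a transcript without the wake-word returns `None` and would never be sent onward.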
Dependencies:
- python 3.11.*
- git
- A Chromium-based browser.
- The HUGGINGFACEHUB_API_TOKEN environment variable, set in your system environment.
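Before starting the server, it can help to confirm that the interpreter version and the API token are in place. The following is a small, optional sketch and not part of the repository; the function name `check_environment` is hypothetical, and it does not check for git or the browser.

```python
import os
import sys

REQUIRED_PYTHON = (3, 11)  # matches the "python 3.11.*" requirement above
TOKEN_VAR = "HUGGINGFACEHUB_API_TOKEN"


def check_environment(version_info=sys.version_info, environ=os.environ) -> list[str]:
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    major_minor = tuple(version_info[:2])
    if major_minor != REQUIRED_PYTHON:
        problems.append(
            f"Python {REQUIRED_PYTHON[0]}.{REQUIRED_PYTHON[1]}.* is required, "
            f"found {major_minor[0]}.{major_minor[1]}"
        )
    # An unset or empty token is treated as missing.
    if not environ.get(TOKEN_VAR):
        problems.append(f"The {TOKEN_VAR} environment variable is not set")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print(f"[!] {problem}")
```

Running the script directly prints one line per problem and prints nothing when both requirements are met.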
Clone the repository and cd into the appropriate directory:
Unix:
git clone https://github.com/aisoc-internal-hackathon/aisoc_T9.git
cd aisoc_T9
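# activate python virtual env and install dependencies
python -m venv env
source env/bin/activate
pip install -r requirements.txt  # assumes the requirements.txt mentioned above sits at the repository root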
# run server and follow the generated URL to get to the frontend.
python server.py
Windows:
NOTE: There have been multiple issues with installing the required dependencies natively on Windows, so it is recommended to set the project up in a WSL environment.
Start WSL:
Instructions on how to set up WSL can be found in Microsoft's official WSL documentation.
wsl
Then follow the rest of the steps as described above for Unix machines.
The above implementation is cross-platform, since it relies on a browser-based frontend. That said, the Web Speech API is currently only supported in Chromium-based browsers, so this project is heavily experimental. Expect minor glitches.
# Logs for nerds: [Development Environment]
$ python --version
Python 3.11.5
$ uname -a
Linux billy 6.6.10-arch1-1 #1 SMP PREEMPT_DYNAMIC Fri, 05 Jan 2024 16:20:41 +0000 x86_64 GNU/Linux
Blueberry - your personalized AI assistant.
Copyright (C) 2024 BillyDoesDev
Email: DarkKnight450@protonmail.com
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see https://www.gnu.org/licenses/.