🧠 Samanta – Your Personal AI Assistant

Samanta is a smart, voice-activated virtual assistant designed to feel personal, conversational, and deeply integrated with cutting-edge AI technology. Think of her as your open-source version of Alexa or Google Home — but powered by open models and built with love for tinkering and transparency.


✨ Features

Samanta is not just an AI assistant — she’s your multitasking digital companion.

🎤 Natural Conversations
Talk to Samanta as naturally as you would a person. She uses Google Gemini through google.generativeai.

🧠 Smart Voice Recognition
Uses faster-whisper to transcribe your voice with low latency and high accuracy.

🗣️ Natural TTS (Text-to-Speech)
Samanta speaks using the xtts_v2 model via Coqui TTS, running locally with CUDA acceleration for fast, lifelike speech.

🎵 Spotify Integration
Ask Samanta to play your favorite songs, playlists, or artists directly through Spotify.

📅 Google Calendar Integration
Schedule meetings, add reminders, or check your upcoming events — all with your voice.

🕵️‍♀️ Voice Activation
Samanta stays quietly listening in the background and wakes up with a configurable wake word via Porcupine by Picovoice.

🎙️ Voice Activity Detection (VAD)
Ensures she listens only when you're actually speaking — improving performance and privacy.

👩‍🎨 Animated Avatar
A 2D expressive avatar built with pygame brings a sense of presence and emotion to the conversation.
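
To make the voice activity detection idea above concrete, here is a minimal sketch that flags a frame as speech when its RMS energy passes a threshold. This is an assumed stand-in for illustration and not necessarily the VAD Samanta uses; the function names and the threshold value are hypothetical.

```python
import math
import struct


def frame_rms(frame: bytes) -> float:
    """Return the RMS amplitude of a frame of 16-bit little-endian mono PCM."""
    count = len(frame) // 2
    if count == 0:
        return 0.0
    samples = struct.unpack(f"<{count}h", frame[: count * 2])
    return math.sqrt(sum(s * s for s in samples) / count)


def is_speech(frame: bytes, threshold: float = 500.0) -> bool:
    """Treat a frame as speech when its energy exceeds the (assumed) threshold."""
    return frame_rms(frame) > threshold


# Two tiny synthetic frames: pure silence and a loud square wave.
silence = struct.pack("<4h", 0, 0, 0, 0)
loud = struct.pack("<4h", 8000, -8000, 8000, -8000)
print(is_speech(silence), is_speech(loud))
```

A real deployment would likely tune the threshold per microphone or swap in a model-based detector; the energy check only shows where the gating decision sits.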


🛠️ Tech Stack

| Component | Library/Tool |
| --- | --- |
| LLM Backend | google.generativeai (Gemini) |
| STT | faster-whisper |
| Wake Word | pvporcupine |
| VAD | Built-in or optional VAD modules |
| TTS | TTS.api with xtts_v2 (CUDA, PyTorch 1.12+) |
| Music | Spotify Web API integration |
| Calendar | Google Calendar API |
| Avatar Engine | pygame |
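
The sketch below shows one way the STT, LLM, and TTS components listed above could be called in turn. It is not taken from Samanta's source: the helper names, the `small` Whisper size, the `gemini-1.5-flash` model name, and the output path are all assumptions, and the third-party packages are imported lazily so the file loads even when they are missing.

```python
def join_segments(texts) -> str:
    """Join transcribed segment texts into one clean utterance."""
    return " ".join(t.strip() for t in texts if t.strip())


def transcribe(audio_path: str, model_size: str = "small") -> str:
    """Speech-to-text with faster-whisper (model size is an assumption)."""
    from faster_whisper import WhisperModel

    # Loading the model on every call is slow; a real app would cache it.
    model = WhisperModel(model_size, device="cuda", compute_type="float16")
    segments, _info = model.transcribe(audio_path)
    return join_segments(segment.text for segment in segments)


def ask_gemini(prompt: str, api_key: str, model_name: str = "gemini-1.5-flash") -> str:
    """Send the prompt to Gemini (model name is an assumption)."""
    import google.generativeai as genai

    genai.configure(api_key=api_key)
    return genai.GenerativeModel(model_name).generate_content(prompt).text


def speak_to_file(text: str, speaker_wav: str, out_path: str = "reply.wav") -> str:
    """Synthesize speech with Coqui XTTS v2, cloning the voice in speaker_wav."""
    from TTS.api import TTS

    tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2").to("cuda")
    tts.tts_to_file(text=text, speaker_wav=speaker_wav, language="en", file_path=out_path)
    return out_path
```

Chaining `transcribe`, `ask_gemini`, and `speak_to_file` gives one full voice turn; the wake word and VAD stages would sit in front of `transcribe`.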

🚀 Getting Started

Note: This project is CUDA-accelerated. You’ll need a compatible GPU and PyTorch installed with CUDA 12.8 or higher.

🧩 Prerequisites

  • Python 3.11+
  • PyTorch with CUDA support
  • ffmpeg installed and in system path
  • Microphone and speaker
  • Spotify Developer account (for API keys)
  • Google Cloud project with Calendar API enabled
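
A quick way to check some of the prerequisites above before installing is a small script like the following. It is a sketch, not part of the repository: the function name and the report format are hypothetical, and it only covers the checks that can be automated locally (Python version, ffmpeg on the path, CUDA visible to PyTorch).

```python
import shutil
import sys


def check_prerequisites() -> dict[str, bool]:
    """Return a pass/fail map for the locally checkable prerequisites."""
    results = {
        "python>=3.11": sys.version_info >= (3, 11),
        "ffmpeg on PATH": shutil.which("ffmpeg") is not None,
    }
    try:
        # PyTorch is optional here so the check still runs before installation.
        import torch

        results["torch CUDA available"] = bool(torch.cuda.is_available())
    except ImportError:
        results["torch CUDA available"] = False
    return results


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{'OK' if ok else 'MISSING':8}{name}")
```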

🔧 Installation

Download CUDA

https://developer.nvidia.com/cuda-12-8-0-download-archive

git clone https://github.com/Allakazan/samanta.git
cd samanta
cp .env.example .env
uv sync

🗝️ API Setup

Generate the API keys below and add them to the .env file.

Gemini:

Get an API key at: https://aistudio.google.com/apikey

GOOGLE_API_KEY=your_gemini_api_key

Spotify:

Create a developer app at https://developer.spotify.com and add its credentials to .env:

SPOTIFY_CLIENT_ID=your_client_id
SPOTIFY_CLIENT_SECRET=your_client_secret
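
To catch a missing key before startup, a small check like the one below can read the .env file and report which of the variables above are unset. This is a sketch using only the standard library; the project itself may load .env differently (for example with python-dotenv), and `parse_env_file` and `missing_keys` are hypothetical helper names.

```python
from pathlib import Path

# The three variables named in this README.
REQUIRED_KEYS = ("GOOGLE_API_KEY", "SPOTIFY_CLIENT_ID", "SPOTIFY_CLIENT_SECRET")


def parse_env_file(path: str) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_keys(values: dict[str, str]) -> list[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]
```

Calling `missing_keys(parse_env_file(".env"))` returns an empty list when all three keys are filled in.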

Google Calendar:

  • Step 1: Enable the Google Calendar API

    1. Go to the Google Cloud Console: https://console.cloud.google.com/
    2. Create a new project (or select an existing one).
    3. Go to "APIs & Services" → "Library".
    4. Search for "Google Calendar API" and enable it.
  • Step 2: Create OAuth Credentials

    1. Go to "APIs & Services" → "Credentials".
    2. Click "Create Credentials" → "OAuth client ID".
    3. Set the application type to "Desktop App" (or "Web App" if needed).
    4. Download the credentials.json file.
    5. Save it as /.gcp/credentials.json.
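
Once the credentials file is in place, the OAuth steps above could be exercised with a sketch like this one. It assumes the google-auth-oauthlib and google-api-python-client packages are installed and uses the full calendar scope; the function name is hypothetical and Samanta's own code may handle tokens differently (for example by caching them between runs).

```python
from pathlib import Path

# Full read/write calendar scope (an assumption; a narrower scope may suffice).
SCOPES = ["https://www.googleapis.com/auth/calendar"]


def build_calendar_service(credentials_path: str):
    """Run the desktop OAuth flow and return a Calendar API client."""
    path = Path(credentials_path)
    if not path.is_file():
        raise FileNotFoundError(f"OAuth client file not found: {path}")

    # Imported lazily so the path check works without the Google packages.
    from google_auth_oauthlib.flow import InstalledAppFlow
    from googleapiclient.discovery import build

    flow = InstalledAppFlow.from_client_secrets_file(str(path), SCOPES)
    creds = flow.run_local_server(port=0)  # opens a browser for user consent
    return build("calendar", "v3", credentials=creds)
```

Pass the location where you saved credentials.json; the first call opens a browser window for consent.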

🗣️ Usage

Start Samanta from the command line:

uv run ./main.py

She will:

  1. Listen for the wake word.

  2. Start the conversation.

  3. Handle your commands:

    • “Play Bohemian Rhapsody on Spotify”
    • “Schedule a meeting for tomorrow at 3 PM”
    • “What’s on my calendar next week?”
    • “Tell me a joke”
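
The three steps above can be sketched as a single conversation turn with each stage passed in as a function. This is an illustrative outline, not Samanta's actual main loop; `run_turn` and its parameter names are hypothetical, and the stand-in lambdas replace the real wake word, speech-to-text, Gemini, and TTS stages.

```python
from typing import Callable


def run_turn(
    wait_for_wake_word: Callable[[], None],
    listen: Callable[[], str],
    respond: Callable[[str], str],
    speak: Callable[[str], None],
) -> str:
    """Run one wake -> listen -> respond -> speak cycle and return the reply."""
    wait_for_wake_word()      # step 1: block until the wake word is heard
    request = listen()        # step 2: capture and transcribe the request
    reply = respond(request)  # step 3: let the assistant handle the command
    speak(reply)
    return reply


# Stand-in stages so the flow can be tried without audio hardware.
spoken = []
reply = run_turn(
    wait_for_wake_word=lambda: None,
    listen=lambda: "Tell me a joke",
    respond=lambda text: f"You said: {text}",
    speak=spoken.append,
)
print(reply)
```

Keeping the stages injectable makes it easy to test the flow or swap a component without touching the loop.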

📦 Roadmap

  • Wake word detection with Porcupine
  • Fast local STT with faster-whisper
  • Natural TTS with Coqui XTTSv2
  • Gemini LLM integration via google.generativeai
  • Spotify playback support
  • Google Calendar scheduling
  • Add smart home IoT plugin system
  • Voice memory with context awareness
  • On-device language translation
  • Visual dashboard for logs and commands
  • Optional emotion detection

🤝 Contributing

Pull requests, feature ideas, and bug reports are always welcome!


📄 License

This project is licensed under the MIT License – see the LICENSE file.


🙏 Acknowledgments
