Sentient v2#57

Merged
Kabeer2004 merged 71 commits into development from fix/optimization on Apr 9, 2025

@Kabeer2004
Contributor

🚀 Summary

This release makes major changes across the app.

  1. UI/UX revamp - every app page has received a major UI uplift.
  2. Advanced Voice mode - switch seamlessly between text and audio chats to talk to Sentient. It uses fast real-time transcription models and realistic-sounding audio generation models for low-latency voice conversations. It is fully integrated with the existing pipeline for chats, memories and agents, so your voice agent can also perform actions, learn new memories, and so on.
  3. Async Tasks - tasks are handled separately from the main chat thread, so users don't have to wait for a task to complete before sending the next message.
  4. Async Memory Queues - users don't have to wait for memory update operations to complete before sending the next message.
  5. Dual Memory - Sentient now also picks up short-term memories and saves them to a separate memory store.
  6. Intent - Sentient can now perform actions autonomously in response to stimuli it receives, such as new emails arriving in your inbox or new events being added to your Google Calendar. It also randomly searches the web for news based on your interests and sends it to you.
  7. Cloud migration (WIP) - reverted to a monolithic architecture, restructured the Python backend into modules, and made many smaller changes that will help us speed up development.
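
The async task and memory queues described in items 3 and 4 can be pictured with a small `asyncio` sketch. This is only an illustration of the general pattern, not the PR's actual code: `run_task`, `worker`, and `chat_session` are hypothetical names, and the sleep stands in for a real agent action.

```python
import asyncio

async def run_task(name: str, seconds: float) -> str:
    # Hypothetical stand-in for a long-running agent action or memory update.
    await asyncio.sleep(seconds)
    return f"{name} done"

async def worker(queue: asyncio.Queue, results: list) -> None:
    # Drain queued tasks in the background, independently of the chat loop.
    while True:
        name, seconds = await queue.get()
        try:
            results.append(await run_task(name, seconds))
        finally:
            queue.task_done()

async def chat_session(messages):
    queue = asyncio.Queue()
    results = []
    replies = []
    worker_handle = asyncio.create_task(worker(queue, results))
    for text in messages:
        # Enqueue and reply immediately; the task itself is not awaited here.
        queue.put_nowait((text, 0.05))
        replies.append(f"queued: {text}")
    # Only at shutdown do we wait for the outstanding work to finish.
    await queue.join()
    worker_handle.cancel()
    await asyncio.gather(worker_handle, return_exceptions=True)
    return replies, results
```

Every reply is produced before any task has finished, which is the behavior the list describes: the user can keep sending messages while work completes in the background.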

✅ Related Issues

Closes #49, #28, #25, #22, #20, #6

🔍 Changes Made

  • Major Version Release

📸 Screenshots

[6 screenshots]

🔄 Additional Context

The app is still very much a work in progress. Full cloud migration is underway, and the app currently does not support more than one user per server. Self-hosting now also requires more VRAM, since voice mode runs Whisper and Orpheus.

Kabeer2004 and others added 30 commits March 27, 2025 17:45
expanded implementation details
1. Insanely Fast Whisper for STT, with Google Speech Recognition as a fallback
2. WebRTCVAD for voice activity detection: the mic streams to the backend only when voice activity is detected
3. Sesame CSM for TTS: incorporates SOTA TTS for life-like speech
4. Integrated with the existing chat backend for seamless use of the existing agent and memory pipelines
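
The VAD gating in item 2 can be sketched as a filter over fixed-size audio frames. This is a minimal illustration under stated assumptions: the frame size, the threshold, and the names `energy_is_speech` and `gate_frames` are hypothetical, and a simple energy check is swapped in for WebRTCVAD so the example runs without extra packages.

```python
import struct
from typing import Callable, Iterable, Iterator

SAMPLE_RATE = 16000                                # assumed sample rate
FRAME_MS = 30                                      # assumed frame length
FRAME_BYTES = SAMPLE_RATE * FRAME_MS // 1000 * 2   # 16-bit mono PCM

def energy_is_speech(frame: bytes, threshold: float = 500.0) -> bool:
    # Stand-in detector (not WebRTCVAD): mean absolute amplitude of the samples.
    samples = struct.unpack(f"<{len(frame) // 2}h", frame)
    if not samples:
        return False
    return sum(abs(s) for s in samples) / len(samples) > threshold

def gate_frames(frames: Iterable[bytes],
                is_speech: Callable[[bytes], bool]) -> Iterator[bytes]:
    # Pass on only the frames classified as speech; silence is dropped.
    for frame in frames:
        if is_speech(frame):
            yield frame
```

With the `webrtcvad` package installed, the detector could instead be `lambda f: vad.is_speech(f, SAMPLE_RATE)` for a `vad = webrtcvad.Vad(2)` instance; the aggressiveness level `2` is an arbitrary choice here.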

Next steps: move the VAD logic to the Electron frontend, set up the Electron app with a proper frontend for voice chats, convert the sandboxed script into a module that can be imported into the main app server, and add a WebSocket to the main app server for voice streaming from Electron.
FINALLY COMPLETED

Completed voice mode integration with the existing backend: user messages are saved to the chatdb, actions are added to the task queue, and memory functions work as expected. Everything runs at low latency on an RTX A5000.

Transcription uses faster-whisper base on CPU for now. It can be moved to GPU with a larger model for better accuracy.
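
The CPU-versus-GPU choice for faster-whisper could be expressed roughly as follows. This is a sketch, not the PR's configuration: `whisper_settings` and `load_model` are hypothetical helpers, and the `int8`/`float16` compute types and the `small` model for the GPU case are assumptions, since the commit message names only the `base` model on CPU.

```python
def whisper_settings(use_gpu: bool) -> dict:
    # Hypothetical helper: keyword arguments for faster_whisper.WhisperModel.
    if use_gpu:
        # Assumed GPU setup: a larger model in half precision.
        return {"model_size_or_path": "small", "device": "cuda",
                "compute_type": "float16"}
    # CPU setup from the commit message: the base model (int8 is an assumption).
    return {"model_size_or_path": "base", "device": "cpu",
            "compute_type": "int8"}

def load_model(use_gpu: bool = False):
    # Imported lazily so the settings helper works without the package installed.
    from faster_whisper import WhisperModel
    return WhisperModel(**whisper_settings(use_gpu))
```

A model loaded this way would be used as `segments, info = model.transcribe("audio.wav")`, where the file name is a placeholder.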

TTS uses a 4-bit quant of Orpheus

RTC functionality is supported by FastRTC
…moved unnecessary fallback in index, updated requirements (freeze versions)
itsskofficial and others added 26 commits April 4, 2025 02:54
Co-authored-by: Abhijeet Suryawanshi <108229267+abhijeetsuryawanshi12@users.noreply.github.com>
@github-actions
github-actions bot commented Apr 9, 2025


Thank you for your submission, we really appreciate it. Like many open-source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution. You can sign the CLA by posting a pull request comment in the format below.


I have read the CLA Document and I hereby sign the CLA


You can retrigger this bot by commenting recheck in this Pull Request. Posted by the CLA Assistant Lite bot.

@Kabeer2004 Kabeer2004 merged commit 22d22a5 into development Apr 9, 2025
1 of 4 checks passed
@github-actions github-actions bot locked and limited conversation to collaborators Apr 9, 2025
@itsskofficial itsskofficial deleted the fix/optimization branch May 12, 2025 09:50
3 participants