VoxLens is a voice transcription and summarization tool that converts spoken audio into text and provides concise summaries. Perfect for meetings, lectures, or any scenario where you need quick summaries of voice recordings.
VoxLens was developed as a final project for the Deep Learning course in the 6th semester at CETYS Universidad for Expo Ingeniería. Our team wanted to create a practical application of speech recognition and natural language processing that could be useful in everyday scenarios.
- Voice Transcription: Accurate speech-to-text conversion
- Automatic Summarization: Get concise summaries of your recordings
- Real-time Processing: Quick results without long waits
- Simple Interface: Just hit record and let VoxLens do the rest
- Dark/Light Mode: Easy on the eyes, day or night
- Multi-language Support: The UI is currently available in English and Spanish
- Frontend: Vue.js with TypeScript
- Backend: FastAPI
- ML Models: Whisper for transcription, BART for summarization
- Deployment: Docker for containerization
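
The two ML stages listed above can be chained roughly as follows. This is a minimal sketch, not the project's actual backend code: the function names are hypothetical, and it assumes the `openai-whisper` package for transcription and a public BART checkpoint (`facebook/bart-large-cnn`) loaded through Hugging Face `transformers`. The model size and checkpoint VoxLens really uses may differ.

```python
from typing import Callable


def transcribe_and_summarize(
    audio_path: str,
    transcribe: Callable[[str], str],
    summarize: Callable[[str], str],
) -> dict:
    """Run speech-to-text, then summarize the resulting transcript."""
    transcript = transcribe(audio_path).strip()
    # Skip the summarizer entirely when nothing was transcribed.
    summary = summarize(transcript) if transcript else ""
    return {"transcript": transcript, "summary": summary}


def load_whisper_transcriber(model_size: str = "base") -> Callable[[str], str]:
    """Assumption: the openai-whisper package; the model size is a guess."""
    import whisper  # pip install openai-whisper (requires ffmpeg)

    model = whisper.load_model(model_size)
    return lambda path: model.transcribe(path)["text"]


def load_bart_summarizer(
    checkpoint: str = "facebook/bart-large-cnn",
) -> Callable[[str], str]:
    """Assumption: a public BART checkpoint loaded via transformers."""
    from transformers import BartForConditionalGeneration, BartTokenizer

    tokenizer = BartTokenizer.from_pretrained(checkpoint)
    model = BartForConditionalGeneration.from_pretrained(checkpoint)

    def summarize(text: str) -> str:
        # BART accepts at most 1024 tokens, so longer transcripts are truncated.
        inputs = tokenizer(
            text, max_length=1024, truncation=True, return_tensors="pt"
        )
        ids = model.generate(**inputs, num_beams=4, max_length=130)
        return tokenizer.decode(ids[0], skip_special_tokens=True)

    return summarize
```

Passing the two stages in as callables keeps the heavy model imports out of the pipeline function, so it can be exercised with stand-in functions. With the real models it would be called as `transcribe_and_summarize("meeting.wav", load_whisper_transcriber(), load_bart_summarizer())`, where `meeting.wav` is a placeholder file name.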
- Node.js (v14+)
- Python 3.8+
- Docker (optional, for containerized deployment)
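
The version requirements above can be checked before installing. The script below is a small sketch using only the Python standard library. The helper names are hypothetical, and it assumes `node` and `docker` are discoverable on `PATH`.

```python
import re
import shutil
import subprocess
import sys
from typing import Optional, Tuple


def parse_version(text: str) -> Optional[Tuple[int, ...]]:
    """Extract the first dotted version number, e.g. 'v18.17.1' -> (18, 17, 1)."""
    match = re.search(r"(\d+(?:\.\d+)*)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def tool_version(command: str) -> Optional[Tuple[int, ...]]:
    """Return the version printed by `<command> --version`, or None if unavailable."""
    path = shutil.which(command)
    if path is None:
        return None
    try:
        result = subprocess.run(
            [path, "--version"], capture_output=True, text=True, timeout=10
        )
    except (OSError, subprocess.SubprocessError):
        return None
    return parse_version(result.stdout or result.stderr)


def check_prerequisites() -> dict:
    """Compare installed tools against the versions listed in this README."""
    node = tool_version("node")
    return {
        "python>=3.8": sys.version_info >= (3, 8),
        "node>=14": node is not None and node >= (14,),
        "docker (optional)": shutil.which("docker") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'missing or too old'}")
```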
- Clone the repository

      git clone https://github.com/braulio-dev/VoxLens.git
      cd VoxLens

- Set up the API

      cd api
      pip install -r requirements.txt
      python main.py

- Set up the web app (from the repository root, in a second terminal if the API is still running)

      cd web-app
      npm install
      npm run dev

- Open your browser and navigate to http://localhost:5173
- /api: FastAPI backend for transcription and summarization
- /model: ML model training and evaluation code
- /web-app: Vue.js frontend application
This project is licensed under the MIT License - see the LICENSE file for details.
