A Streamlit web app for speaker diarization and identification in audio files. Upload or record audio, transcribe conversations, and identify speakers using reference samples. Powered by AssemblyAI and SpeechBrain.
- Speaker Diarization: Automatically segments audio by speaker.
- Speaker Identification: Matches diarized speakers to reference samples using voice embeddings.
- Audio Upload & Recording: Upload WAV files or record directly in the browser.
- Interactive UI: Built with Streamlit for easy use.
- Downloadable Results: Export diarized and identified transcripts as CSV.
Live demo: https://speaker-diarization-identification.streamlit.app/
- Upload or record the conversation audio (WAV format).
- (Optional) Upload reference audio samples for known speakers.
- Set the expected number of speakers and similarity threshold.
- Enter your AssemblyAI API key.
- Click Analyze to process the audio.
- View and download the diarized transcript.
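The transcription step behind this workflow can be sketched roughly as follows. This is a minimal sketch that assumes the official `assemblyai` Python SDK; whether `app.py` calls the SDK in exactly this way is not stated here, and the helper names `transcribe_with_speakers` and `utterances_to_rows` are hypothetical. The SDK reports utterance times in milliseconds, so the sketch converts them to seconds.

```python
from types import SimpleNamespace
from typing import Any, Dict, Iterable, List


def transcribe_with_speakers(audio_path: str, api_key: str, speakers_expected: int):
    """Request a diarized transcript from AssemblyAI (needs network access and a valid key)."""
    import assemblyai as aai  # imported lazily so the helper below works without the SDK

    aai.settings.api_key = api_key
    config = aai.TranscriptionConfig(
        speaker_labels=True,                  # turn on speaker diarization
        speakers_expected=speakers_expected,  # hint taken from the sidebar setting
    )
    return aai.Transcriber().transcribe(audio_path, config=config)


def utterances_to_rows(utterances: Iterable[Any]) -> List[Dict[str, Any]]:
    """Convert utterance objects (times in milliseconds) into rows with times in seconds."""
    rows = []
    for utt in utterances:
        rows.append({
            "Speaker": utt.speaker,
            "Start (s)": utt.start / 1000.0,
            "End (s)": utt.end / 1000.0,
            "Text": utt.text,
        })
    return rows


# Stand-in objects shaped like transcript.utterances, so the conversion can be tried offline.
fake_utterances = [
    SimpleNamespace(speaker="A", start=0, end=5200, text="Hello, how are you?"),
    SimpleNamespace(speaker="B", start=5200, end=8700, text="I'm good, thank you!"),
]
rows = utterances_to_rows(fake_utterances)
```

With a real key, `utterances_to_rows(transcribe_with_speakers(path, key, 2).utterances)` would produce the same row shape, with generic labels such as `A` and `B` until reference samples are matched.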
git clone https://github.com/Parva101/speaker_diarization_identification.git
cd speaker_diarization_identification
pip install -r requirements.txt
Run the Streamlit app:
streamlit run app.py
Open the provided local URL in your browser.
- AssemblyAI API Key: Required for transcription. Get your key from AssemblyAI.
- Expected Speakers: Set the number of speakers in the sidebar.
- Similarity Threshold: Controls how strict speaker matching is. A higher value requires a closer match to a reference sample before a speaker is labeled with that name.
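To illustrate how a similarity threshold can gate speaker matching, here is a minimal sketch. It assumes embeddings are plain numeric vectors (for example, as produced by a SpeechBrain speaker encoder) and that cosine similarity is the metric; the metric the app actually uses is not stated here, and `cosine_similarity` and `match_speaker` are hypothetical helper names.

```python
import math
from typing import Dict, Optional, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_speaker(
    embedding: Sequence[float],
    references: Dict[str, Sequence[float]],
    threshold: float,
) -> Tuple[Optional[str], float]:
    """Return the best-matching reference name, or None if no score reaches the threshold."""
    best_name, best_score = None, -1.0
    for name, ref in references.items():
        score = cosine_similarity(embedding, ref)
        if score > best_score:
            best_name, best_score = name, score
    if best_score >= threshold:
        return best_name, best_score
    return None, best_score


# Toy 2-D embeddings for illustration only; real speaker embeddings have many more dimensions.
references = {"John": [1.0, 0.0], "Jane": [0.0, 1.0]}
close_match = match_speaker([0.9, 0.1], references, threshold=0.8)
ambiguous = match_speaker([0.7, 0.7], references, threshold=0.8)
```

In this sketch the first query is labeled `John`, while the second sits equally far from both references and stays unlabeled because its best score falls below the threshold.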
- Streamlit: Web app framework
- AssemblyAI: Speech-to-text API
- SpeechBrain: Speaker embedding and recognition
- PyDub: Audio processing
- Pandas: Data handling
| Speaker | Start (s) | End (s) | Text |
|---|---|---|---|
| John | 0.0 | 5.2 | Hello, how are you? |
| Jane | 5.2 | 8.7 | I'm good, thank you! |
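A transcript shaped like the table above could be serialized to CSV along these lines. This is a sketch using Python's standard `csv` module rather than the app's own export code (which may well use Pandas instead); `rows_to_csv` is a hypothetical helper name.

```python
import csv
import io
from typing import Any, Dict, List

FIELDNAMES = ["Speaker", "Start (s)", "End (s)", "Text"]


def rows_to_csv(rows: List[Dict[str, Any]]) -> str:
    """Serialize transcript rows to CSV text, quoting fields that contain commas."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = [
    {"Speaker": "John", "Start (s)": 0.0, "End (s)": 5.2, "Text": "Hello, how are you?"},
    {"Speaker": "Jane", "Start (s)": 5.2, "End (s)": 8.7, "Text": "I'm good, thank you!"},
]
csv_text = rows_to_csv(rows)
```

The resulting string could then be passed to a download widget such as Streamlit's `st.download_button`.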
Pull requests are welcome! For major changes, please open an issue first to discuss what you would like to change.
This project is licensed under the MIT License.