audio-summary is a command-line tool that creates meeting minutes from an audio recording, using OpenAI Whisper for transcription and Google Gemini or OpenAI models for summarization. Only an online version is currently available; an offline version based on Ollama and HuggingFace is under development.

Ensure you have ffmpeg installed:
# Mac
brew install ffmpeg
# Windows
# Download and install from the official ffmpeg website: https://ffmpeg.org
# Git
git clone https://github.com/thisishugow/audio-summary.git
# pip
pip install ./audio_summary.whl
Create a .env file and set your OpenAI API key and Gemini API key:
OPENAI_API_KEY=your_OpenAI_API_key
GOOGLE_API_KEY=your_Google_API_key
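Before the first run, it can help to confirm that the prerequisites above are in place. The following is a minimal sketch, not part of audio-summary itself: `parse_env`, `missing_keys`, and `preflight` are hypothetical helper names, and the parser only understands plain KEY=VALUE lines. It checks that ffmpeg is on the PATH and that both keys are set in the .env file.

```python
import shutil
from pathlib import Path

# Key names taken from the .env example above.
REQUIRED_KEYS = ("OPENAI_API_KEY", "GOOGLE_API_KEY")


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines; blank lines and '#' comments are skipped."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding quotes, if any.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_keys(env: dict[str, str]) -> list[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not env.get(key)]


def preflight(env_path: str = ".env") -> list[str]:
    """Collect readable problems; an empty list means the setup looks fine."""
    problems = []
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")
    path = Path(env_path)
    if not path.is_file():
        problems.append(f"{env_path} not found")
    else:
        env = parse_env(path.read_text(encoding="utf-8"))
        for key in missing_keys(env):
            problems.append(f"{key} is missing or empty in {env_path}")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("WARNING:", problem)
```

Run it from the directory that contains the .env file; no output means nothing was flagged.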
Run the Streamlit UI:
python -m audio_summary.server
Use the command line:
python -m audio_summary -f meeting-recording.wav -s true
Options:
-h, --help: Show help message and exit.
-f FILE, --file FILE: Specify the path of the audio file.
-o OUTPUT, --output OUTPUT: Specify the path of the output transcription.
-s SUMMARIZE, --summarize SUMMARIZE: Specify whether to summarize the transcription (true/false). Default=true.
--summarize-by API: Specify the summarization API to use. Choices: openai, gemini. Default=openai.
--lang LANG: Specify the language of the AI response. Choices: original, en, zh-tw. Default=original.
You will then see the full transcription and the meeting minutes.
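To make the option semantics concrete, here is a hypothetical argparse parser that mirrors the options listed above. It is a sketch reconstructed from this README, not the tool's actual source; `str_to_bool` and `build_parser` are illustrative names, and treating -f as required is an assumption.

```python
import argparse


def str_to_bool(value: str) -> bool:
    """Interpret 'true'/'false' (case-insensitive) as a boolean."""
    lowered = value.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    raise argparse.ArgumentTypeError(f"expected true or false, got {value!r}")


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the documented flags and defaults."""
    parser = argparse.ArgumentParser(prog="audio_summary")
    # Assumption: the README does not say whether -f is mandatory.
    parser.add_argument("-f", "--file", required=True,
                        help="Path of the audio file.")
    parser.add_argument("-o", "--output",
                        help="Path of the output transcription.")
    parser.add_argument("-s", "--summarize", type=str_to_bool, default=True,
                        help="Whether to summarize (true/false).")
    parser.add_argument("--summarize-by", choices=["openai", "gemini"],
                        default="openai", help="Summarization API to use.")
    parser.add_argument("--lang", choices=["original", "en", "zh-tw"],
                        default="original", help="Language of the AI response.")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(
        ["-f", "meeting-recording.wav", "-s", "true", "--lang", "en"]
    )
    print(args)
```

Note that argparse exposes `--summarize-by` as the attribute `summarize_by`, and that invalid values for the choice-based flags are rejected before any transcription starts.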
The tool supports summarization using either Google Gemini or OpenAI's models. You can select the preferred provider using the --summarize-by argument in the command line or via the UI.
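When scripting several recordings, the provider can be chosen per call. The sketch below wraps the command line shown above; `build_command` and `summarize` are hypothetical helpers, and it assumes the tool writes its transcription and minutes to standard output, which this README does not confirm.

```python
import subprocess
import sys

# Choices as documented for --summarize-by and --lang.
PROVIDERS = ("openai", "gemini")
LANGS = ("original", "en", "zh-tw")


def build_command(audio_path, provider="openai", lang="original", output=None):
    """Return the argv list for one audio_summary run."""
    if provider not in PROVIDERS:
        raise ValueError(f"provider must be one of {PROVIDERS}")
    if lang not in LANGS:
        raise ValueError(f"lang must be one of {LANGS}")
    cmd = [
        sys.executable, "-m", "audio_summary",
        "-f", audio_path,
        "-s", "true",
        "--summarize-by", provider,
        "--lang", lang,
    ]
    if output is not None:
        cmd += ["-o", output]
    return cmd


def summarize(audio_path, **kwargs):
    """Run the tool and return whatever it printed (assumed: stdout)."""
    result = subprocess.run(
        build_command(audio_path, **kwargs),
        capture_output=True, text=True, check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Print the command instead of running it, so no API calls are made.
    print(" ".join(build_command("meeting-recording.wav", provider="gemini")))
```

Validating the provider and language before launching the subprocess keeps a typo from costing a full transcription run.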