It is suggested that Markitdown [audio-transcription] should be able to utilize the local Whisper model to transcribe MP3 and MP4 files

I cannot connect to the OpenAI Whisper API, so I cannot use it to transcribe MP3 and MP4 files. What I can do is install ffmpeg, torch, and openai-whisper with the commands below and run the transcription locally. However, MarkItDown does not support invoking a local Whisper model for transcription. I hope this feature will be supported in a future version.

- sudo apt install ffmpeg
- pip install 'markitdown[pptx,docx,xlsx,xls,pdf,audio-transcription]'
- pip install torch --index-url https://download.pytorch.org/whl/cpu
- pip install openai-whisper
- pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
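
For reference, here is a minimal sketch of the local transcription I have in mind, assuming the packages above are installed and ffmpeg is on the PATH. It only uses the `openai-whisper` package directly. The helper names (`is_supported_media`, `format_transcript`, `transcribe_locally`) and the Markdown layout are my own hypothetical choices, not part of MarkItDown's API.

```python
from pathlib import Path

# Extensions taken from this request (MP3 and MP4); the set is an assumption.
MEDIA_SUFFIXES = {".mp3", ".mp4"}


def is_supported_media(path: str) -> bool:
    """Return True when the file extension is one this sketch handles."""
    return Path(path).suffix.lower() in MEDIA_SUFFIXES


def format_transcript(path: str, text: str) -> str:
    """Wrap a transcript in a small Markdown section (hypothetical layout)."""
    return f"### Audio Transcript: {Path(path).name}\n\n{text.strip()}\n"


def transcribe_locally(path: str, model_name: str = "base") -> str:
    """Transcribe a file with a locally installed Whisper model."""
    if not is_supported_media(path):
        raise ValueError(f"unsupported file type: {path}")

    # Imported lazily so the helpers above work without the package installed.
    import whisper

    # Loads the model weights (downloaded on first use, then cached locally).
    model = whisper.load_model(model_name)
    # fp16=False because the torch build installed above is CPU-only.
    result = model.transcribe(path, fp16=False)
    return format_transcript(path, result["text"])
```

Calling `transcribe_locally("meeting.mp3")` (the file name is only a placeholder) would return the transcript as a Markdown string. Something equivalent inside the `audio-transcription` extra is what I am asking for.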

It is suggested that Markitdown [audio-transcription] should be able to utilize the local Whisper model to transcribe MP3 and MP4 files #1860
