Creates access derivatives from audio and video preservation files, using ffmpeg-python for transcoding. OpenAI's Whisper is used as the speech-to-text tool; transcripts are stored as WebVTT files.
The tool is meant to be used during the digitization/transfer process for legacy media.
- set up a virtual environment
- `pip install -r requirements.txt`
- install FFmpeg
- run `tools/SetupWhisperModels.py` to pre-download Whisper models (optional)
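As an illustration, a pre-download step can be done along these lines. This is a hedged sketch of what `tools/SetupWhisperModels.py` might do, not its actual contents; it assumes the `openai-whisper` package is installed when run for real.

```python
# Hedged sketch (not the repo's actual script) of pre-caching Whisper models.
MODELS = ("tiny", "base", "small", "medium", "large-v2")

def predownload(models=("tiny", "base")) -> list:
    """Download the named Whisper models into the local cache, if possible."""
    try:
        import whisper  # openai-whisper; if absent, nothing to download
    except ImportError:
        return []
    cached = []
    for name in models:
        whisper.load_model(name)  # downloads to ~/.cache/whisper on first use
        cached.append(name)
    return cached
```

Pre-downloading avoids a long pause on the first transcription run.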
Master files should be packaged together in a single directory for each 'object.' For example, a two-sided cassette should have 2 WAV files in a single folder. The tool will place all derivatives in a subdirectory for each object.
This tool is narrowly scoped: it processes only preservation files in the WAVE and MOV formats, and produces access copies only in the MP3, MP4 (H.264), and VTT formats.
Example Structure:
```
root_folder/
├── object_1/
│   ├── audio_file_1.wav
│   ├── audio_file_2.wav
│   └── derivatives/
│       ├── audio_file_1.mp3
│       ├── audio_file_1_caption_eng.vtt
│       ├── audio_file_2.mp3
│       └── audio_file_2_caption_eng.vtt
├── object_2/
│   ├── audio_file_1.wav
│   └── derivatives/
│       ├── audio_file_1.mp3
│       └── audio_file_1_caption_eng.vtt
└── object_3/
    ├── audio_file_1.wav
    └── derivatives/
        ├── audio_file_1.mp3
        └── audio_file_1_caption_eng.vtt
```
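The layout above can be sketched in code. The following is a hypothetical illustration of how masters could be mapped to derivative paths; the function name and internals are assumptions, not the tool's actual implementation.

```python
from pathlib import Path

# Hypothetical sketch: map each in-scope master file to an access-copy path
# inside its object's derivatives/ subdirectory.
TARGETS = {".wav": ".mp3", ".mov": ".mp4"}  # in-scope masters -> access formats

def plan_derivatives(root: str) -> dict:
    """Return {master_path: derivative_path} for every object under root."""
    plan = {}
    for obj in sorted(Path(root).iterdir()):
        if not obj.is_dir():
            continue
        for master in sorted(obj.iterdir()):
            ext = master.suffix.lower()
            if ext in TARGETS:
                plan[master] = obj / "derivatives" / (master.stem + TARGETS[ext])
    return plan
```

Files with out-of-scope extensions are simply skipped, matching the tool's narrow WAVE/MOV scope.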
Whisper and FFmpeg are computationally intensive. The recommended Whisper models, such as large-v2, perform best on GPUs. You can run this tool on a personal device, but select the base or tiny Whisper model and expect longer processing times.
This tool has only been tested using NVIDIA CUDA with PyTorch. If you want GPU acceleration for Whisper, you will need to set up CUDA or DirectML with PyTorch or TensorFlow. GPU acceleration for FFmpeg is possible, but the `convertAV.py` helper would need to be updated.
See also FFmpeg's guidance on hardware acceleration.
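A script can fall back to the CPU gracefully when PyTorch is missing or no GPU is present. This is a small illustrative sketch, assuming the standard `torch.cuda.is_available()` check; the function name is not from the tool itself.

```python
# Hedged sketch: choose a compute device for Whisper, degrading to CPU when
# PyTorch is not installed or CUDA is unavailable.
def pick_device() -> str:
    try:
        import torch  # only present if PyTorch is installed
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"
```

The returned string can be passed to `whisper.load_model(name, device=...)`.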
Specifications for the outputs produced by FFmpeg may be altered in the `convertAV.py` tool. See AMIA's FFmpeg cookbook for suggestions and implementation examples.
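For example, an MP3 access copy could be produced by a command along these lines. This is a hypothetical sketch of the kind of invocation `convertAV.py` might build; the actual codec settings live in that file and can be changed there.

```python
# Hedged sketch (not the tool's actual code) of assembling an ffmpeg command
# line for an MP3 access derivative.
def mp3_command(src: str, dst: str, bitrate: str = "320k") -> list:
    return [
        "ffmpeg",
        "-i", src,              # preservation master (e.g. a WAV file)
        "-c:a", "libmp3lame",   # encode audio with the LAME MP3 encoder
        "-b:a", bitrate,        # bitrate for the access copy
        dst,
    ]
```

The list form can be handed to `subprocess.run()` without shell quoting concerns.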