Change whisper-subtitles-generation.ipynb to genai pipeline (#2431)
sbalandi authored Oct 18, 2024
1 parent 4d03951 commit 395b629
Showing 8 changed files with 353 additions and 1,123 deletions.
7 changes: 0 additions & 7 deletions .ci/skipped_notebooks.yml
@@ -435,13 +435,6 @@
- ubuntu-20.04
- ubuntu-22.04
- windows-2019
- notebook: notebooks/whisper-subtitles-generation/whisper-subtitles-generation.ipynb
skips:
- os:
- macos-12
- ubuntu-20.04
- ubuntu-22.04
- windows-2019
- notebook: notebooks/hunyuan-dit-image-generation/hunyuan-dit-image-generation.ipynb
skips:
- os:
12 changes: 8 additions & 4 deletions .github/workflows/build_treon_reusable.yml
@@ -76,16 +76,20 @@ jobs:
uses: xom9ikk/dotenv@ac290ca23a42155a0cba1031d23afa46240116a9 # v2.3.0
with:
path: ./.github/workflows

# Packages that notebooks need to run on a plain OS
- name: Install required packages
if: ${{ !inputs.container }}
shell: bash
run: |
if [ "$RUNNER_OS" == "Linux" ]; then
sudo apt-get update -y
sudo apt-get install libsndfile1 -y
sudo apt-get install libsndfile1 ffmpeg -y
elif [ "$RUNNER_OS" == "macOS" ]; then
brew install ffmpeg
elif [ "$RUNNER_OS" == "Windows" ]; then
choco install ffmpeg-full
fi
# Packages that notebooks need to run in a Docker container
- name: Install required packages (container)
if: ${{ inputs.container }}
shell: bash
@@ -95,7 +99,7 @@
run: |
if [ "$RUNNER_OS" == "Linux" ]; then
apt-get update -y
apt-get install git curl wget libsndfile1 libssl-dev unzip libsqlite3-dev libedit-dev libgl1 libgl1-mesa-glx libglib2.0-0 tk -y
apt-get install git curl wget libsndfile1 libssl-dev unzip libsqlite3-dev libedit-dev libgl1 libgl1-mesa-glx libglib2.0-0 ffmpeg tk -y
wget https://raw.githubusercontent.com/openvinotoolkit/openvino/master/scripts/install_dependencies/install_openvino_dependencies.sh
chmod +x ./install_openvino_dependencies.sh
./install_openvino_dependencies.sh -c=core -c=dev -c=gpu -y
2 changes: 1 addition & 1 deletion notebooks/whisper-asr-genai/README.md
@@ -22,7 +22,7 @@ The tutorial consists of the following steps:

## Installation Instructions

This is a self-contained example that relies solely on its code.</br>
This example requires `ffmpeg` to be installed. All other required dependencies will be installed by the notebook itself.<br/>
We recommend running the notebook in a virtual environment. You only need a Jupyter server to start.
For details, please refer to [Installation Guide](../../README.md).
<img referrerpolicy="no-referrer-when-downgrade" src="https://static.scarf.sh/a.png?x-pxid=5b5a4db0-7875-4bfb-bdbd-01698b5b1a77&file=notebooks/whisper-asr-genai/README.md" />
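Both READMEs now call out `ffmpeg` as the only manual prerequisite. As an illustration (not part of this commit; the file names are placeholders), a notebook could verify the dependency and prepare audio in the 16 kHz mono format Whisper expects like this:

```python
import shutil
import subprocess

# Fail early with a clear message if ffmpeg is missing from PATH.
if shutil.which("ffmpeg") is None:
    raise RuntimeError("ffmpeg not found; install it via apt, brew, or choco as shown above")

# Whisper models consume 16 kHz mono audio; ffmpeg resamples any input to that format.
# "input.mp4" and "audio.wav" are placeholder names for illustration.
subprocess.run(
    ["ffmpeg", "-y", "-i", "input.mp4", "-ar", "16000", "-ac", "1", "audio.wav"],
    check=True,
)
```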
269 changes: 123 additions & 146 deletions notebooks/whisper-asr-genai/whisper-asr-genai.ipynb

Large diffs are not rendered by default.

12 changes: 6 additions & 6 deletions notebooks/whisper-subtitles-generation/README.md
@@ -1,20 +1,20 @@
# Video Subtitle Generation with OpenAI Whisper
# Video Subtitle Generation with OpenAI Whisper and OpenVINO Generate API
[![Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/whisper-subtitles-generation/whisper-subtitles-generation.ipynb)
[Whisper](https://openai.com/index/whisper/) is a general-purpose speech recognition model from [OpenAI](https://openai.com). The model is able to almost flawlessly transcribe speech across dozens of languages and even handle poor audio quality or excessive background noise.
This notebook will run the model with OpenVINO to generate transcription of a video.
This notebook will run the model with the OpenVINO Generate API to generate the transcription of a video.

## Notebook Contents

This notebook demonstrates how to generate video subtitles using the open-source Whisper model. Whisper is an automatic speech recognition (ASR) system trained on 680,000 hours of multilingual and multitask supervised data collected from the web. It is a multi-task model that can perform multilingual speech recognition as well as speech translation and language identification.
You can find more information about this model in the [research paper](https://cdn.openai.com/papers/whisper.pdf), [OpenAI blog](https://openai.com/index/whisper/), [model card](https://github.com/openai/whisper/blob/main/model-card.md) and GitHub [repository](https://github.com/openai/whisper).

This folder contains notebook that show how to convert and quantize model with OpenVINO. We will use [NNCF](https://github.com/openvinotoolkit/nncf) improving model performance by INT8 quantization.
This folder contains a notebook that shows how to convert and quantize the model with OpenVINO and run the pipeline with the [Generate API](https://github.com/openvinotoolkit/openvino.genai). We will use [NNCF](https://github.com/openvinotoolkit/nncf) to improve model performance through INT8 quantization.

The notebook contains the following steps:
1. Download the model.
2. Instantiate original PyTorch model pipeline.
3. Convert model to OpenVINO IR, using model conversion API.
4. Run the Whisper pipeline with OpenVINO.
4. Run the Whisper pipeline with the OpenVINO Generate API (see the sketch after this list).
5. Quantize the OpenVINO model with NNCF.
6. Check quantized model result for the demo video.
7. Compare model size, performance and accuracy of FP32 and quantized INT8 models.
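For reference, here is a minimal sketch of step 4 — driving the converted model through the Generate API. The model directory, audio file, and exact keyword arguments are assumptions based on the openvino.genai samples, not code from this commit:

```python
import librosa
import openvino_genai

# Assumes steps 1-3 produced an OpenVINO IR of Whisper in "whisper-base",
# e.g. via: optimum-cli export openvino --model openai/whisper-base whisper-base
pipe = openvino_genai.WhisperPipeline("whisper-base", "CPU")

# The pipeline expects 16 kHz mono float PCM samples ("audio.wav" is a placeholder).
raw_speech, _ = librosa.load("audio.wav", sr=16000)

result = pipe.generate(raw_speech, task="transcribe", return_timestamps=True)
print(result.texts[0])
for chunk in result.chunks:  # timestamped segments, handy for building .srt subtitles
    print(f"{chunk.start_ts:.1f}s -> {chunk.end_ts:.1f}s: {chunk.text}")
```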
@@ -37,8 +37,8 @@ The second notebook will guide you through the steps of:

## Installation Instructions

This is a self-contained example that relies solely on its own code.</br>
We recommend running the notebook in a virtual environment. You only need a Jupyter server to start.
This example requires `ffmpeg` to be installed. All other required dependencies will be installed by the notebook itself.<br/>
We recommend running the notebook in a virtual environment. You only need a Jupyter server to start.
For details, please refer to [Installation Guide](../../README.md).

<img referrerpolicy="no-referrer-when-downgrade" src="https://static.scarf.sh/a.png?x-pxid=5b5a4db0-7875-4bfb-bdbd-01698b5b1a77&file=notebooks/whisper-subtitles-generation/README.md" />
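Step 5 (INT8 quantization with NNCF) typically boils down to post-training quantization of each exported IR. A hedged sketch follows — the file names assume optimum-intel's export layout, and the random calibration data is for illustration only; real calibration should use features extracted from actual audio:

```python
import numpy as np
import nncf
import openvino as ov

core = ov.Core()
# optimum-intel exports Whisper as separate encoder/decoder IRs; quantize each one.
model = core.read_model("whisper-base/openvino_encoder_model.xml")

# Toy calibration set shaped like Whisper's log-mel encoder input [batch, n_mels, frames].
calibration_data = [np.random.rand(1, 80, 3000).astype(np.float32) for _ in range(10)]

quantized = nncf.quantize(
    model,
    nncf.Dataset(calibration_data),
    model_type=nncf.ModelType.TRANSFORMER,  # keeps accuracy-sensitive ops in higher precision
)
ov.save_model(quantized, "whisper-base-int8/openvino_encoder_model.xml")
```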
23 changes: 20 additions & 3 deletions notebooks/whisper-subtitles-generation/gradio_helper.py
@@ -1,12 +1,29 @@
from typing import Callable
from pathlib import Path
import gradio as gr


def make_demo(fn: Callable, quantized: bool):
def make_demo(fn: Callable, quantized: bool, sample_path: Path):
demo = gr.Interface(
description=f"""
<div style="text-align: center; max-width: 700px; margin: 0 auto;">
<div
style="
display: grid; align-items: center; gap: 0.8rem; font-size: 1.75rem;
"
>
<h1 style="font-weight: 900; margin-bottom: 7px; line-height: normal;">
OpenVINO Generate API Whisper demo {'with quantized model.' if quantized else ''}
</h1>
<div style="font-size: 12px;">
If you use a video longer than 30 seconds, please note that max_length will be increased. You can also update it using generation_config.
</div>
</div>
</div>
""",
fn=fn,
inputs=[
gr.Textbox(label="YouTube URL"),
gr.Video(label="Video"),
gr.Radio(["Transcribe", "Translate"], value="Transcribe"),
gr.Checkbox(
value=quantized,
@@ -15,7 +32,7 @@ def make_demo(fn: Callable, quantized: bool):
),
],
outputs="video",
examples=[["https://youtu.be/kgL5LBM-hFI", "Transcribe"]],
examples=[[sample_path, "Transcribe"]],
allow_flagging="never",
)
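
Put together, the updated helper might be wired up in the notebook along these lines (a sketch, not commit code; the stub callback and sample file name are hypothetical):

```python
from pathlib import Path

from gradio_helper import make_demo

def transcribe(video_path, task, use_quantized):
    # Hypothetical stand-in for the notebook's real callback: it would extract
    # the audio track, run the chosen WhisperPipeline, and attach subtitles.
    return video_path

demo = make_demo(fn=transcribe, quantized=False, sample_path=Path("downloaded_video.mp4"))
demo.launch()
```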

