YouTube Transcription Docker Project

This project allows you to download, process, and transcribe YouTube videos using AssemblyAI. The entire process is containerized using Docker, making it easy to set up and run on any machine.

Prerequisites

Before you can run this project, ensure you have the following:

Docker: Installed and running on your machine. You can download it from the official Docker website.
AssemblyAI Account: Sign up for a free account at AssemblyAI to obtain an API key.

Setting Up AssemblyAI

Sign Up for AssemblyAI:
- Go to AssemblyAI's website and sign up for a free account.
- After signing up, navigate to your dashboard where you'll find your API key. Copy this key, as you'll need it in the next step.
Create a .env File:
- In your project directory (the directory where you will run the Docker container), create a .env file. This file will store your AssemblyAI API key.
- Add the following line to the .env file:
```
ASSEMBLYAI_API_KEY=<YOUR_API_KEY>
```
- Replace <YOUR_API_KEY> with the API key you copied from the AssemblyAI dashboard.
Example .env file:
```
ASSEMBLYAI_API_KEY=1234567890abcdef1234567890abcdef
```
Save the .env File:
- Make sure the .env file is saved in the directory where you will be running the Docker commands.
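To confirm the file is in place before running anything, a small shell check along these lines can help. This is only a sketch and not part of the project: the function name check_env_file is hypothetical, and it only verifies that the file has a non-empty ASSEMBLYAI_API_KEY line that is not the placeholder. It does not contact AssemblyAI, so it cannot tell whether the key is accepted.

```shell
# Hypothetical helper (not shipped with the project):
# check that an env file defines a non-empty ASSEMBLYAI_API_KEY.
check_env_file() {
  env_file="${1:-.env}"   # default to .env in the current directory
  if [ ! -f "$env_file" ]; then
    echo "missing: $env_file" >&2
    return 1
  fi
  # Require at least one character after the '=' sign.
  if ! grep -Eq '^ASSEMBLYAI_API_KEY=.+' "$env_file"; then
    echo "ASSEMBLYAI_API_KEY is not set in $env_file" >&2
    return 1
  fi
  # Reject the placeholder copied verbatim from the instructions.
  if grep -q '^ASSEMBLYAI_API_KEY=<YOUR_API_KEY>' "$env_file"; then
    echo "ASSEMBLYAI_API_KEY still holds the placeholder" >&2
    return 1
  fi
  return 0
}
```

After defining the function in your shell, `check_env_file && echo "looks fine"` run from the project directory reports whether the default .env passes these checks.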

Pulling and Running the Docker Image

Follow these steps to pull the Docker image and run the transcription process:

Pull the Docker Image:
- Pull the pre-built Docker image from Docker Hub using the following command:
```
docker pull agentmaddy/yt-transcriptor
```
Run the Docker Container:
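- Use the following command to run the Docker container. It loads your API key from the .env file, mounts your current directory to the /data directory inside the container, and runs the transcription script.
- Replace the value after -o (here /data/output) with your desired output name, keeping the /data/ prefix so the result lands in the mounted directory, and replace the YouTube URL with the URL of the video you want to transcribe.
- Run the command from the directory that contains your .env file, so that --env-file .env can find it.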
```
docker run --env-file .env -v $(pwd):/data agentmaddy/yt-transcriptor python /app/app.py -o /data/output --urls https://www.youtube.com/watch\?v\=KUECJHlV1LE
```
- Explanation:
  - --env-file .env: Passes the variables defined in your .env file, including ASSEMBLYAI_API_KEY, into the container.
  - -v $(pwd):/data: Mounts the current directory on your local machine to the /data directory inside the Docker container. This allows the container to save output files directly to your local directory.
  - agentmaddy/yt-transcriptor: The Docker image you pulled from Docker Hub.
  - python /app/app.py -o /data/output --urls https://www.youtube.com/watch?v=KUECJHlV1LE: The command run inside the container, where app.py processes the YouTube video, extracts the audio, and generates a transcription. The backslashes before ? and = in the command above are shell escapes; wrapping the URL in quotes works as well.
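If you run the command often, it may be convenient to wrap it in a shell function. The sketch below is an illustration under a few assumptions: the function name transcribe and the DRY_RUN variable are hypothetical, and the --rm flag (a standard docker run option that removes the container after it exits) is an addition that the original command does not use. Everything else mirrors the command above.

```shell
# Hypothetical wrapper around the docker run command shown above.
# Set DRY_RUN=1 to print the command instead of running it.
transcribe() {
  url="$1"
  if [ -z "$url" ]; then
    echo "usage: transcribe <youtube-url>" >&2
    return 2
  fi
  # Rebuild the positional parameters as the full docker command.
  # --rm is an addition here; drop it to match the original command exactly.
  set -- docker run --rm --env-file .env -v "$(pwd):/data" \
    agentmaddy/yt-transcriptor \
    python /app/app.py -o /data/output --urls "$url"
  if [ -n "${DRY_RUN:-}" ]; then
    printf '%s\n' "$*"   # show what would be run
  else
    "$@"                 # run docker with the arguments built above
  fi
}
```

Called as `transcribe 'https://www.youtube.com/watch?v=KUECJHlV1LE'`, the quoted URL is passed through unchanged, which avoids the need for backslash escapes.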
Accessing the Output:
- The output files, including the .mp3 audio file and the transcription .txt file, will be saved in the directory where you ran the Docker command.
- For example, if you run the command from /home/user/projects, the output files will be saved in /home/user/projects.

Example Usage

To transcribe a YouTube video with the title "How to Dockerize Your Python Applications", you would run the following:

docker run --env-file .env -v $(pwd):/data agentmaddy/yt-transcriptor python /app/app.py -o /data/output --urls https://www.youtube.com/watch\?v\=KUECJHlV1LE

This command will download the video, extract the audio, and save the transcription in your current directory under the name given to -o (output in this example).

Troubleshooting

Docker Not Found: Make sure Docker is installed and running on your machine.
Invalid API Key: Ensure your .env file is correctly set up with a valid AssemblyAI API key.
Permission Issues: If you encounter permission issues when running the Docker container, try running the command with sudo.
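The first and third items above can be checked up front with a short script. This is a rough sketch, not something the project provides: the function names have_cmd and preflight are hypothetical, and it relies on `docker info` exiting with an error when the daemon is not running or cannot be reached by the current user.

```shell
# Hypothetical pre-flight checks matching the troubleshooting items above.
have_cmd() {
  # Succeeds if the named command can be found by the shell.
  command -v "$1" >/dev/null 2>&1
}

preflight() {
  if ! have_cmd docker; then
    echo "Docker CLI not found on PATH" >&2
    return 1
  fi
  # docker info fails when the daemon is stopped or access is denied.
  if ! docker info >/dev/null 2>&1; then
    echo "Docker daemon not reachable (not running, or permission denied)" >&2
    return 1
  fi
  if [ ! -f .env ]; then
    echo ".env not found in $(pwd)" >&2
    return 1
  fi
  echo "ok"
}
```

Running `preflight` from the project directory prints ok when all three checks pass; otherwise the message points to the item that needs attention.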

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

Summary of the Steps:

Set Up AssemblyAI: Sign up, obtain an API key, and create a .env file with the API key.
Pull Docker Image: Use docker pull agentmaddy/yt-transcriptor to get the image.
Run the Docker Container: Execute the Docker command with your desired output file name and YouTube URL.

