
Simple REST API endpoints for retrieving transcripts for YouTube videos.


zimzion/youtube-transcriber-api

 
 


youtube-transcriber-api

YouTube's official API does not currently support fetching a video's transcript. This project is a simple Flask server that provides API endpoints for retrieving plain-text transcripts of YouTube videos. It can also translate transcripts into other languages. The project is built on top of jdepoix's youtube-transcript-api library.


API Endpoints

Note: All language codes used should follow the ISO 639-1 standard (case-sensitive)

Transcripts

GET /v1/transcripts?id={{id}}

Retrieve transcripts for a specified YouTube video. (try: https://youtube-transcriber-api.vercel.app/v1/transcripts?id=k_GM1JA608Y&lang=en)

Query Parameters

| Parameter | Required | Note |
| --- | --- | --- |
| id | Yes | The ID of the YouTube video |
| lang | No | The language code of the desired transcript. If no language is specified, all available transcripts are returned |
| type | No | The desired output format. Accepts json, text, srt, and webvtt. Defaults to text |
| lb | No | Boolean (0 or 1) indicating whether the transcript should contain line breaks. Only applies when type is text. Defaults to 0 |
| sfx | No | Boolean (0 or 1) indicating whether the transcript should contain sound-effect annotations, e.g. [Cheering], [Applause], [Music]. Defaults to 0 |

Response

The request returns a JSON object containing the following fields:

| Field | Description |
| --- | --- |
| video_id | The ID of the YouTube video |
| transcripts | A list of transcripts. Each transcript has the following fields: |

  • language: The language of the transcript
  • languageCode: The ISO 639-1 code
  • isGenerated: Boolean indicating whether the transcript is machine-generated
  • isTranslatable: Boolean indicating whether the transcript can be translated
  • text: The transcript in the specified format
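
As a minimal sketch of how a client might call this endpoint, the snippet below builds the request URL from the query parameters listed above. The host is taken from the "try" link. The helper names `build_transcripts_url` and `fetch_transcripts` are hypothetical and not part of this project.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Host taken from the "try" link in this README; change it for a self-hosted server.
BASE_URL = "https://youtube-transcriber-api.vercel.app"


def build_transcripts_url(video_id, lang=None, output_type=None, lb=None, sfx=None):
    """Build a /v1/transcripts URL, leaving out optional parameters that are None."""
    params = {"id": video_id, "lang": lang, "type": output_type, "lb": lb, "sfx": sfx}
    query = {
        # Python booleans are converted to the 0/1 form the API expects.
        key: int(value) if isinstance(value, bool) else value
        for key, value in params.items()
        if value is not None
    }
    return f"{BASE_URL}/v1/transcripts?{urlencode(query)}"


def fetch_transcripts(video_id, **options):
    """Request the transcripts and decode the JSON body (performs a network call)."""
    with urlopen(build_transcripts_url(video_id, **options)) as response:
        return json.load(response)
```

Calling `build_transcripts_url("k_GM1JA608Y", lang="en")` reproduces the "try" link shown above. Keeping URL construction separate from the network call makes the parameter handling easy to check offline.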

Translation

GET /v1/translations?id={{id}}&lang={{lang}}

Retrieve a translated transcript for a specified YouTube video. (try: https://youtube-transcriber-api.vercel.app/v1/translations?id=k_GM1JA608Y&lang=es)

Query Parameters

| Parameter | Required | Note |
| --- | --- | --- |
| id | Yes | The ID of the YouTube video |
| lang | Yes | The language code for the target language |

Response

The request returns a JSON object containing the following fields:

| Field | Description |
| --- | --- |
| video_id | The ID of the YouTube video |
| sourceLanguage | The language code of the source transcript |
| targetLanguage | The language code of the target translation |
| transcripts | The transcript text |
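
The response fields above could be decoded on the client side roughly as follows. This is a sketch only: the `Translation` class, the `parse_translation` helper, and the sample body are hypothetical, and the sample values are invented for illustration rather than taken from a real response.

```python
import json
from dataclasses import dataclass

# Field names as documented in the response table above.
REQUIRED_FIELDS = ("video_id", "sourceLanguage", "targetLanguage", "transcripts")

# Hypothetical response body; the values are made up for illustration.
SAMPLE_BODY = """
{
  "video_id": "k_GM1JA608Y",
  "sourceLanguage": "en",
  "targetLanguage": "es",
  "transcripts": "Hola a todos"
}
"""


@dataclass
class Translation:
    video_id: str
    source_language: str
    target_language: str
    text: str


def parse_translation(body: str) -> Translation:
    """Decode a /v1/translations JSON body, failing loudly if a field is absent."""
    data = json.loads(body)
    missing = [name for name in REQUIRED_FIELDS if name not in data]
    if missing:
        raise ValueError(f"response is missing fields: {', '.join(missing)}")
    return Translation(
        video_id=data["video_id"],
        source_language=data["sourceLanguage"],
        target_language=data["targetLanguage"],
        text=data["transcripts"],
    )
```

Checking for missing fields up front gives a clear error if the server returns something other than the documented shape, such as an error payload.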

Metadata

GET /v1/metadata{{id}}

Retrieve transcript metadata for a specified YouTube video. (try: https://youtube-transcriber-api.vercel.app/v1/metadata?id=k_GM1JA608Y)

Query Parameters

| Parameter | Required | Note |
| --- | --- | --- |
| id | Yes | The ID of the YouTube video |

Response

The request returns a JSON object containing the following fields:

| Field | Description |
| --- | --- |
| video_id | The ID of the YouTube video |
| transcripts | A list of transcript metadata. Each item has the following fields: |

  • language: The language of the transcript
  • languageCode: The ISO 639-1 code
  • isGenerated: Boolean indicating whether the transcript is machine-generated
  • isTranslatable: Boolean indicating whether the transcript can be translated
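
One plausible use of the metadata is to check which transcripts can be translated before calling the translation endpoint. The sketch below assumes a decoded response with the fields listed above. The `translatable_language_codes` helper and the sample data are hypothetical.

```python
def translatable_language_codes(metadata, include_generated=True):
    """Return the language codes of transcripts flagged as translatable.

    Set include_generated=False to skip machine-generated transcripts.
    """
    codes = []
    for item in metadata.get("transcripts", []):
        if not item.get("isTranslatable"):
            continue
        if item.get("isGenerated") and not include_generated:
            continue
        codes.append(item["languageCode"])
    return codes


# Hypothetical decoded /v1/metadata response; the values are invented for illustration.
SAMPLE_METADATA = {
    "video_id": "k_GM1JA608Y",
    "transcripts": [
        {"language": "English", "languageCode": "en",
         "isGenerated": False, "isTranslatable": True},
        {"language": "English (auto-generated)", "languageCode": "en",
         "isGenerated": True, "isTranslatable": True},
        {"language": "French", "languageCode": "fr",
         "isGenerated": False, "isTranslatable": False},
    ],
}
```

With the sample data, passing `include_generated=False` keeps only the manually created English transcript.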

Future Plans

  • Add rate limiting
  • Migrate from Flask (WSGI) to FastAPI (ASGI)


License

See license
