Video captions #227

Open
swissspidy opened this issue Nov 27, 2023 · 7 comments

Comments

@swissspidy
Owner

swissspidy commented Nov 27, 2023

Ability to create video subtitles on demand.

Unfortunately, this can't be done automatically for new uploads: the upload queue doesn't know which exact block a video is for, so it can't update that block's attributes. And the onChange callback doesn't update the tracks either: https://github.com/WordPress/gutenberg/blob/4494a79e24bfbbcda97ce9af5db3dcb9e81b09f6/packages/block-library/src/video/edit.js#L133-L156

swissspidy added the enhancement label Nov 28, 2023
swissspidy added a commit that referenced this issue Nov 28, 2023
@swissspidy
Owner Author

And the onChange callback doesn't update the tracks either: WordPress/gutenberg@4494a79/packages/block-library/src/video/edit.js#L133-L156

Maybe Gutenberg could, in a backward-compatible way, pass the block's clientId to the upload function somehow.
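
To make the idea concrete, here is a rough sketch of what threading a client ID through an upload could look like. Everything in it is hypothetical: `UploadRequest`, `CaptionTrack`, `UpdateBlockAttributes` and `attachCaptions` are made-up names for illustration, not Gutenberg's actual API, and the track fields are an assumption about what the video block stores.

```typescript
// Hypothetical shapes for illustration; none of these are Gutenberg's actual types.
interface CaptionTrack {
  src: string;
  kind: "subtitles" | "captions";
  srcLang: string;
  label: string;
}

interface UploadRequest {
  fileName: string;
  // Optional, so callers that don't pass it keep working (backward compatible).
  clientId?: string;
}

// Stand-in for whatever updates a block's attributes by client ID.
type UpdateBlockAttributes = (
  clientId: string,
  attributes: { tracks: CaptionTrack[] }
) => void;

// Attach a generated captions file to the block that started the upload.
// Returns false when the originating block is unknown (today's situation).
// Note: this sketch replaces the tracks list instead of merging into it.
function attachCaptions(
  request: UploadRequest,
  vttUrl: string,
  update: UpdateBlockAttributes
): boolean {
  if (request.clientId === undefined) {
    return false;
  }
  update(request.clientId, {
    tracks: [{ src: vttUrl, kind: "subtitles", srcLang: "en", label: "English" }],
  });
  return true;
}
```

In the editor, something like `updateBlockAttributes` from the block editor store could presumably play the role of `update`, but that wiring is not shown here.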

@swissspidy
Owner Author

swissspidy commented Nov 30, 2023

There's also https://github.com/xenova/whisper-web, though it seems more useful for transcripts.

@swissspidy
Owner Author

The current solution seems to work fine, though it needs some fine-tuning to keep captions at a proper length (x words/characters per line). Not sure whether it outputs proper sentences to split on, but that could be an option.
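
For reference, a minimal sketch of both options: a greedy wrap to a maximum number of characters per line, and a naive sentence split. The function names and the limit passed in are placeholders, not what the current solution does.

```typescript
// Greedy word wrap: pack words into lines without exceeding maxChars.
// A single word longer than maxChars is kept whole on its own line.
function wrapCaption(text: string, maxChars: number): string[] {
  const words = text.trim().split(/\s+/).filter((w) => w.length > 0);
  const lines: string[] = [];
  let current = "";
  for (const word of words) {
    const candidate = current === "" ? word : `${current} ${word}`;
    if (current === "" || candidate.length <= maxChars) {
      current = candidate;
    } else {
      lines.push(current);
      current = word;
    }
  }
  if (current !== "") {
    lines.push(current);
  }
  return lines;
}

// Naive sentence split: break after ".", "!" or "?" when followed by whitespace.
// Abbreviations such as "e.g. this" will be split incorrectly.
function splitSentences(text: string): string[] {
  return text
    .split(/(?<=[.!?])\s+/)
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
}
```

Splitting into sentences first and then wrapping each sentence would keep line breaks from straddling sentence boundaries, at the cost of some short lines.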

@swissspidy
Owner Author

Would be good to compare the current solution with https://github.com/xenova/whisper-web. The upcoming release there might have better support for word-level timestamps, which would help for subtitles, but even without that it seems to have good output already.
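
To illustrate why word-level timestamps would help: with per-word start and end times, cues can be cut at any word boundary and still get exact timings. A rough sketch follows, assuming a `TimedWord` shape with times in seconds. That shape is an assumption for illustration, not the actual output format of whisper-web.

```typescript
// Assumed input shape: one entry per recognized word, times in seconds.
interface TimedWord {
  text: string;
  start: number;
  end: number;
}

// Format seconds as a WebVTT timestamp (hh:mm:ss.ttt).
function formatTimestamp(seconds: number): string {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  const pad = (n: number, width: number) => String(n).padStart(width, "0");
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)}.${pad(ms, 3)}`;
}

// Group words into cues of at most maxChars characters and emit a WebVTT file.
// Each cue starts at its first word's start and ends at its last word's end.
function wordsToVtt(words: TimedWord[], maxChars: number): string {
  const cues: string[] = [];
  let group: TimedWord[] = [];
  const flush = () => {
    if (group.length === 0) {
      return;
    }
    const start = formatTimestamp(group[0].start);
    const end = formatTimestamp(group[group.length - 1].end);
    cues.push(`${start} --> ${end}\n${group.map((w) => w.text).join(" ")}`);
    group = [];
  };
  for (const word of words) {
    // Length of the cue text if this word were appended (spaces included).
    const length =
      group.reduce((n, w) => n + w.text.length + 1, 0) + word.text.length;
    if (group.length > 0 && length > maxChars) {
      flush();
    }
    group.push(word);
  }
  flush();
  return ["WEBVTT", ...cues].join("\n\n") + "\n";
}
```

Without word-level timestamps, the same grouping would have to guess cue boundaries inside a longer segment, for example by interpolating over character counts.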

swissspidy added the feature label and removed the enhancement label Jan 3, 2024
swissspidy added the p3 label Jun 6, 2024
@swissspidy
Owner Author

vosk-browser is quite heavy (5.5 MB). There is also vosklet, which is a bit smaller (3.5 MB).

@swissspidy
Owner Author

Once captions are generated, an editor à la https://vtt-creator.com (https://github.com/roballsopp/vtt-creator, no open source license) would be nice. https://github.com/opusonline/webvtt-editor could be a starting point.

@swissspidy
Owner Author

Apparently the fastest solution at the moment: https://github.com/FL33TW00D/whisper-turbo
