
oTranscribe+

oTranscribe+ is a free web app designed to take the pain out of transcribing recorded interviews through the use of browser-based automated speech recognition. It is based on oTranscribe and vosk-browser.

Get the initial transcription of your audio content with speech recognition that runs in your browser; free and private
Pause (ESC), rewind (F1) and fast-forward (F2) without taking your hands off the keyboard
Adjust playback speed with a slider or using F3/F4
Your transcript is automatically saved to the browser's localStorage every second
Rich text support using contentEditable

... and more!

Despite advances in machine learning, and in Automatic Speech Recognition (ASR) in particular, language work in the audiovisual sector, such as transcription, translation and subtitling, still relies on manual labor by experts. Over the last decade, new technologies have mainly contributed to Computer Assisted Transcription, through startups that combine ASR and related technologies (speaker diarization, punctuation and capitalization recovery, etc.) with an online editor.

However, there is a barrier to entry for these tools, mostly due to their cost to the client: they are priced by the length of audio to be processed or transcribed. We aim to offer a low-cost alternative to these platforms, for both the end user and the provider, by taking advantage of recent developments in speech technology, namely edge (client-side) computing and open ASR models that are small and accurate.

With this tool, language workers can upload a file to the web application and receive a basic transcription. Because ASR decoding is done on the client side, server-side processing is limited, so the provider can serve multiple users without worrying about per-user computational costs.

Extended version

Requirements

This project currently works with Node version 19 (the current release at the time of writing). We recommend using the Node Version Manager (nvm) tool and the Yarn package manager. With them you can use Node version 19 locally and install the requirements:

nvm use 19
yarn install

Download a copy

Although a web version is available, you can install oTranscribe anywhere by following these steps:

Download the current ZIP archive.
Compile the CSS and JS with Webpack (see below for more detailed instructions).
Upload the files in the newly-generated dist folder to a server of your choice.

Please note that, in Chrome, local copies of oTranscribe may not run correctly due to the browser's privacy settings.

Compiling the CSS and JavaScript

The src folder in this repository only includes the "raw" JavaScript and CSS. To compile the production-ready files:

Install Node.js and NPM.
Run npm install to install dependencies.
Run make build_prod to compile the dist folder.
Run make build_prod BASEURL=test.com to compile the dist folder and also generate the sitemap.xml file. The BASEURL value sets the site root path; it can be a bare domain such as test.com or include a path, such as test.com/path.
Run make build_app to compile the dist folder with the desktop application version.

An example of building the app:

make build_app MODELSPREFIX=https://otranscribe.bsc.es

Usage and compilation (Extended version)

The code lives in the src folder, which contains the raw JavaScript and CSS files. Before you start extending them, you need to be using Node version 12 and have the requirements installed. Then run make build_dev to compile the code, generate a sourcemap, and watch for changes (the process keeps running during development and picks up changes in real time).

The dist folder will be filled with the resulting oTranscribe+ files and folders. To emulate access from a remote browser, run the following Python command in that folder: python3 -m http.server. You can then open local port 8000 in your browser, where oTranscribe+ should be served.

OTR file format

oTranscribe has its own file format (.otr), which is just a JSON file with the following parameters:

text: The raw HTML of the transcript
media: If available, the name of the last media used
media-source: If available, a link to the last media used
media-time: If available, the playtime of the last media used
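
As a rough illustration of the format described above, the following sketch builds and reads back an .otr document in JavaScript. The helper names buildOtr and parseOtr are hypothetical and not part of oTranscribe's code, and treating media-time as a number of seconds is an assumption; only the four key names come from the list above.

```javascript
// Hypothetical helpers for illustration only; oTranscribe's own
// import/export code may differ.

// Serialize a transcript into the .otr JSON layout described above.
function buildOtr(text, media, mediaSource, mediaTime) {
  return JSON.stringify({
    text: text,                  // raw HTML of the transcript
    media: media,                // name of the last media used
    "media-source": mediaSource, // link to the last media used
    "media-time": mediaTime,     // playtime (assumed here to be seconds)
  });
}

// Parse the contents of an .otr file, tolerating missing optional fields.
function parseOtr(contents) {
  const data = JSON.parse(contents);
  if (typeof data.text !== "string") {
    throw new Error("Not a valid .otr document: missing 'text'");
  }
  return {
    text: data.text,
    media: data.media ?? null,
    mediaSource: data["media-source"] ?? null,
    mediaTime: data["media-time"] ?? null,
  };
}

const otr = buildOtr("<p>Hello world</p>", "interview.mp3", "", 12.5);
const parsed = parseOtr(otr);
console.log(parsed.text, parsed.media, parsed.mediaTime);
```

Because the keys media-source and media-time contain a hyphen, they have to be read with bracket notation rather than dot notation.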

Running tests

oTranscribe is not fully tested. There are only a small number of tests, for data migration.

To set up, install CasperJS.

Then run a server at the root directory of this repository at http://localhost:8000, and on the command line run:

casperjs test tests/

Translations

Translations have been provided by the following talented and generous volunteers:

Catalan: Joan Montané and Jon Sindreu.
Chinese: baiqj, Cindy Ng, Andy Pan, Cp0204 and Robin Kwong
Danish: Christian Bruun.
Dutch: Patrick Mackaaij and Marjolein Quist.
Filipino: Patricia Albano.
French: Olivier Aubert, @goofy-bz and Dr J Rogel-Salazar.
German: Dr J Rogel-Salazar and Lisa Bernhardt.
Indonesian: Joy Tikoalu.
Italian: Dr J Rogel-Salazar, Edoardo Putti and Federico Lasta.
Japanese: harupong.
Norwegian: Hallvar Hauge Johnsen
Polish: Emil Maruszczak and Piotr Tarasewicz.
Portuguese: enVide neFelibata.
Brazilian Portuguese: Leonardo Barichello and Carlos Eduardo Pinheiro Rocha.
Romanian: Iain Apreotesei and Catalina Albeanu
Russian: Pavel Osminin
Spanish: Cristian Duque, Dr J Rogel-Salazar and Adrián Blanco.
Swedish: c3ons.
Turkish: Mehmet S. DERİNDERE.
Ukrainian: Myroslav Opyr
Vietnamese: Trần Ngọc Quân
Greek: Konstantinos Alexiou

More about translating oTranscribe here.

Authors and acknowledgment

Developed by Jamgo SCCL for the Text Mining Unit at the Barcelona Supercomputing Center.

License

Mozilla Public License 2.0

Funding

This work is funded by the Generalitat de Catalunya within the framework of Projecte AINA.
