I share this library to help others add voice-to-text to their workflows. It is meant mainly to inspire readers to create commands that are more relevant to their own work.
This is a programming tool, not an educational tool.
Voice In is a free plugin for Google Chrome that supports dictation in the text areas of web pages. 750words.com and the free alternative writehoney.com are distraction-free environments where I draft most of my writing these days before pasting it into a final document.
Voice In also provides a Notepad for dictation. It also works in Overleaf, Jupyter, and Colab. The word error rate is about 2-4 percent, better than most alternative dictation software. Unlike chatbots, it does not produce extended hallucinations, which I find annoying and a waste of my time.
Voice In Plus requires a modest subscription fee. It adds support for custom commands, which I find essential for reducing the word error rate and thereby reducing the downstream editing.
The full version of my command library includes
- expansion of spoken contractions in English
- expansions of acronyms
- expansions of names of colleagues
- commands to open favorite web pages
- LaTeX boilerplate
- commands to open specific websites in the browser (e.g., `open PubMed`, `open Google Scholar`, `open PDB`, `open LBSF`, `open OCSB`, `open weather forecast`)
I use contractions in my speech but do not want them in my writing. I mapped all the contractions that I could find to their expansions. Whenever I say a contraction, it is expanded automatically as the software transcribes the text.
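The mapping described above could be generated with a short script rather than typed by hand. The following is a minimal sketch, not the method used to build this library: the four contractions are an illustrative sample, and the function names `contraction_rows` and `to_csv_text` are hypothetical.

```python
import csv
import io

# Illustrative sample only; the real library maps many more contractions.
CONTRACTIONS = {
    "don't": "do not",
    "can't": "cannot",
    "isn't": "is not",
    "I'm": "I am",
}


def contraction_rows(mapping):
    """Return (command, expansion) pairs with the spoken command in lowercase."""
    return [(spoken.lower(), expansion) for spoken, expansion in sorted(mapping.items())]


def to_csv_text(rows):
    """Serialize the pairs as two-column CSV text, one command per line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


rows = contraction_rows(CONTRACTIONS)
csv_text = to_csv_text(rows)
print(csv_text)
```

Ambiguous contractions such as "it's" (it is, it has) need a manual decision, so they are left out of this sample.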
It is easy to expand an acronym from memory into the wrong phrase. Including acronyms in the library eliminates the need to look them up more than once. If I say the command `expand XXX`, the acronym XXX is expanded immediately after it is spoken.
I also include expansions of the names of colleagues from their first name to their full name. This ensures that the spelling of the last name is correct and that I do not have to look it up repeatedly. These types of commands were removed to protect the identity of my colleagues.
I can imagine mapping voice commands to citation keys for standard methods and key equations. This kind of information is domain-specific and not included here. Instead, I have set up several domain-specific libraries in separate repositories so that users can download only those collections of voice snippets relevant to their work. See the Voice Computing section of the MooersLab landing page for hyperlinks to these repositories.
By using a voice command to navigate to a favorite web page, you can keep your fingers on the keyboard. It is also faster to use a voice command to navigate to a desired page than to use the mouse cursor to navigate to that web page.
Most of my writing is done in LaTeX. The overleaf.com web service makes writing in LaTeX easy. One weakness of Overleaf is that it lacks support for code snippets. Voice In Plus offers the opportunity to overcome this limitation. I have mapped voice commands to several dozen LaTeX code snippets.
Another approach I have discussed to overcome the absence of code snippets in Overleaf and elsewhere is to send the text area from Overleaf to one of five popular text editors via the GhostText plugin for most web browsers. The code snippets available in these text editors can be inserted into the text.
Because the English contraction commands may have the most widespread appeal and utility, they have been placed in a separate file available here.
The file format is comma-separated values (CSV). The first column has the voice commands in lowercase. The second column has the text that is inserted and any action commands that are executed, which are written between angle brackets. Multi-line text fragments are possible by putting the entire chunk of text between one set of double quotes.
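The two-column layout can be checked with Python's standard `csv` module before loading a file into Voice In. The sketch below is only an illustration: the commands `expand pdb` and `insert figure` and their inserted text are hypothetical examples, not entries copied from the library, and `load_commands` is a name made up here.

```python
import csv
import io

# Hypothetical sample in the two-column format described above.
# The second row shows a multi-line fragment enclosed in one set of double quotes.
SAMPLE = r'''expand pdb,Protein Data Bank
insert figure,"\begin{figure}
\centering
\end{figure}"
'''


def load_commands(text):
    """Parse CSV text into a dict that maps each voice command to its inserted text."""
    commands = {}
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        command = row[0].strip().lower()
        commands[command] = row[1]
    return commands


commands = load_commands(SAMPLE)
for command, inserted in commands.items():
    print(f"{command!r} -> {inserted!r}")
```

The quoted multi-line fragment is read as a single field with embedded newlines, which is why one pair of double quotes around the whole chunk is enough.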
- Right-click on the plugin icon and select `Options`. This action opens the Voice In options page.
- Click on the `bulk add` button. A simple window with a plain text area will open.
- Open the CSV file in a text editor like VS Code. Select all, copy, and then paste the contents into the text area of the `bulk add` window.
- Click the `add commands` button below the text area. The new commands will be available for use immediately.
See the Voice Computing section of the MooersLab landing page.
The basic rule for developing a voice command is to pick a word combination that is very unlikely to be used in one's prose. This choice avoids the accidental insertion of an unintended set of words. For example, using the voice command "to do" to insert an org-mode TODO is a poor choice because this phrase is used frequently in my prose. Instead, I came up with the command "priority" followed by the associated alphanumeric code for the priority. It is pretty unlikely that I will say the command "priority A1" in my usual prose.
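One way to apply this rule is to count how often a candidate command already occurs in a sample of your own writing. The following is a rough sketch under that assumption; the helper `phrase_count` and the sample text are hypothetical, and a real check would use a larger body of prose.

```python
import re


def phrase_count(phrase, prose):
    """Count whole-word, case-insensitive occurrences of a phrase in a body of prose."""
    words = [re.escape(word) for word in phrase.lower().split()]
    pattern = r"\b" + r"\s+".join(words) + r"\b"
    return len(re.findall(pattern, prose.lower()))


# Hypothetical sample of everyday prose.
prose = (
    "I have a lot to do today. "
    "The first thing to do is set the priority of each task."
)

for candidate in ("to do", "priority a1"):
    print(candidate, phrase_count(candidate, prose))
```

A candidate that already appears in ordinary prose, such as "to do" here, is likely to fire by accident, whereas "priority a1" does not appear at all.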
If you pick a word combination that contains words already assigned to another command, the commands will collide, and you will not get the intended effect. It is better to pick a synonym for the new command than to reuse the words of the old one.
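Collisions of this kind can be screened for before the commands are loaded. The sketch below takes one interpretation of a collision, namely that the words of one command appear as a contiguous run inside another command; Voice In's actual matching rules are not documented here, so treat this as an assumption. The function names and the sample commands are hypothetical.

```python
from itertools import combinations


def contains_phrase(longer, shorter):
    """Return True if the word list `shorter` occurs as a contiguous run in `longer`."""
    n = len(shorter)
    return any(longer[i:i + n] == shorter for i in range(len(longer) - n + 1))


def find_collisions(commands):
    """Return (shorter, longer) pairs where one command's words sit inside another."""
    unique = sorted({command.lower().strip() for command in commands})
    collisions = []
    for first, second in combinations(unique, 2):
        shorter, longer = sorted((first, second), key=lambda c: len(c.split()))
        if contains_phrase(longer.split(), shorter.split()):
            collisions.append((shorter, longer))
    return collisions


# Hypothetical command list; "priority" is contained in "priority a1".
commands = ["priority a1", "priority", "insert figure", "open pubmed"]
collisions = find_collisions(commands)
print(collisions)
```

Each reported pair is a candidate for renaming one of the two commands with a synonym.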
I use the verb "insert" in front of the computer code that I want to insert. I use the verb "expand" to expand a person's first name into their full name and to expand acronyms into their full term.
As with other forms of computer code, test the Voice In commands to ensure you get the intended effect. The speed with which you vocalize a command has a significant impact. You may find that you have to speak the command quickly to avoid inserting just the first word of the command rather than the entire command.
Version | Changes | Date |
---|---|---|
Version 0.2 | Added badges and update table. Fixed all.csv so that it shows up as a table on GitHub. | 2024 April 13 |
Version 0.3 | Added funding. Edited README.md to improve clarity | 2024 April 20 |
- NIH: R01 CA242845, R01 AI088011
- NIH: P30 CA225520 (PI: R. Mannel); P20GM103640 and P30GM145423 (PI: A. West)