If you work in Localization or Globalization your in organization, you likely rely on glossaries to ensure brand/product message consistency, especially for technical specifications. This handy Python script will help you leverage content from existing translation memories to create an almost-perfect bilingual glossary leveraging Perplexity's Sonar model. You can read more about why I made this script by visiting Localization Times (blog).
- A list of terms in Excel format.
- A translation memory in Excel format. Check your TMS documentation for details: Phrase, MemoQ, Trados (via GC).
- Perplexity API key. You will need to use a valid payment method. Check their API pricing (Sonar) for more details.
- OpenAI
- Pandas; openpyxl; requests
- Edit the script and add your API key (line 40).
- Run the script.
- Drag and drop your Excel files.
- Specify the column header names for each Excel file.
- Enjoy your new glossary.
- Ability to use TMX or Excel files interchangeably via lxml and some conditionals).
- Reverse matching of extracted terms for validation purposes.
- Code optimizations for reduced friction during the wizard.