AzeLexicon is a structured, open-source repository of Azerbaijani words and terminology. Its main purpose is to serve as the primary source for the Azerbaijani word list, its hyphenated version, and academic/scientific terminology, supporting anyone producing academic or scientific work in Azerbaijani.
The repository includes:
- General word list
  A plain list of Azerbaijani words (txt format). It contains only words, without translations. Some cleanup and refinement are still needed.
- Hyphenated words
  A hyphenated version of the general word list. This list is not yet complete, as the hyphenation algorithm is under active development.
- Academic and scientific terminology
  Organized by subject (mathematics, physics, computer science, etc.). Each subject has a main file (terms.json) containing the translations of English terms into Azerbaijani. Subfields (e.g., linear algebra, probability) are used for initial collection of terms, which are then consolidated into the main terms.json.
This project aims to be the authoritative reference for Azerbaijani in scientific and academic contexts.
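As an illustration of the consolidation step described above, the following is a minimal sketch of how subfield term lists could be merged into one sorted, de-duplicated list. It is not the repository's actual script: the function names `merge_terms` and `consolidate_categories` are hypothetical, and the sketch assumes each category file holds one English term per line.

```python
from pathlib import Path
from typing import Iterable


def merge_terms(lines: Iterable[str]) -> list[str]:
    """Return the unique, non-empty terms from `lines`, sorted case-insensitively."""
    seen: dict[str, str] = {}
    for line in lines:
        term = line.strip()
        if term:
            # Compare case-insensitively, but keep the first spelling encountered.
            seen.setdefault(term.casefold(), term)
    return sorted(seen.values(), key=str.casefold)


def consolidate_categories(category_dir: Path) -> list[str]:
    """Read every *.txt file in `category_dir` (assumed: one term per line) and merge them."""
    lines: list[str] = []
    for path in sorted(category_dir.glob("*.txt")):
        lines.extend(path.read_text(encoding="utf-8").splitlines())
    return merge_terms(lines)
```

For example, `consolidate_categories(Path("data/subjects/math/categories"))` would return the merged terms from all math subfield files, which could then be written to a consolidated list.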
AzeLexicon/
├── data/
│   ├── general/
│   │   ├── words.txt                  # Plain list of Azerbaijani words (no translations)
│   │   └── words-hyphenated.txt       # Hyphenated version of the general word list
│   │
│   ├── scripts/
│   │   ├── generate_markdown.py       # Generates Markdown glossaries from terms.json
│   │   └── hyphenation.py             # Experimental hyphenation algorithm
│   │
│   └── subjects/
│       ├── math/
│       │   ├── terms.json             # Glossary of math terms (EN → AZ)
│       │   ├── math_terms.txt         # Consolidated list of all English math terms
│       │   ├── categories/            # Subfield-specific English terms
│       │   │   ├── linalg.txt         # Linear Algebra terms
│       │   │   ├── prob.txt           # Probability terms
│       │   │   └── ...                # More subfields can be added here
│       │   └── scripts/
│       │       └── process_subject.py # Script to process, validate, and sort terms
│       │
│       └── ...                        # Other subjects (physics, chemistry, biology, etc.)
│
├── glossary/
│   ├── math.md                        # Generated Markdown glossary from terms.json
│   └── ...                            # Glossaries for other subjects
│
├── .github/workflows/
│   └── sort-validate.yml              # Automated Term Standardization workflow
│
├── LICENSE
├── README.md
├── CONTRIBUTING.md
├── CODE_OF_CONDUCT.md
└── CONTRIBUTORS.md                    # List of contributors and maintainers
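The files under glossary/ are generated from each subject's terms.json. The sketch below shows one way such a Markdown glossary could be rendered; it is not the repository's generate_markdown.py. The function name `render_glossary` and the entry keys `en`, `az`, and `status` are assumptions made for illustration, since the actual terms.json schema is not described here.

```python
def render_glossary(title: str, entries: list[dict]) -> str:
    """Render glossary entries as a Markdown table.

    Assumed (hypothetical) entry shape: {"en": ..., "az": ..., "status": ...}.
    """
    lines = [
        f"# {title}",
        "",
        "| English | Azerbaijani | Status |",
        "| --- | --- | --- |",
    ]
    for entry in entries:
        # Missing keys become empty cells rather than raising an error.
        lines.append(
            f"| {entry['en']} | {entry.get('az', '')} | {entry.get('status', '')} |"
        )
    return "\n".join(lines) + "\n"
```

A real generator would also need to escape `|` characters inside terms, which this sketch leaves out for brevity.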
To maintain consistency and quality across the repository, all terms.json files are automatically sorted alphabetically and checked for duplicates by the Automated Term Standardization workflow (sort-validate.yml). Contributors are expected to set the status of each term they contribute in the JSON file. The workflow automatically flags missing translations as Missing, so there is no need to add this status manually.
Valid statuses are:
- Missing: automatically flagged by the workflow for missing translations.
- Revision: use this if the translation needs review or you are unsure.
- Complete: use this if the translation is verified and fully correct.
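The checks the workflow performs (alphabetical sorting, duplicate detection, and flagging missing translations) can be sketched as follows. This is an illustration under stated assumptions, not the code behind sort-validate.yml or process_subject.py: the function name `standardize`, the entry keys `en`, `az`, and `status`, and the plain string `"Missing"` as the stored status value are all hypothetical.

```python
from collections import Counter

MISSING = "Missing"  # assumed status value; the real file may store it differently


def standardize(entries: list[dict]) -> tuple[list[dict], list[str]]:
    """Return (entries sorted by English term with missing translations flagged,
    list of English terms that occur more than once, compared case-insensitively)."""
    counts = Counter(entry["en"].strip().casefold() for entry in entries)
    duplicates = sorted(term for term, n in counts.items() if n > 1)

    result = []
    for entry in entries:
        fixed = dict(entry)  # copy so the caller's data is not mutated
        if not (fixed.get("az") or "").strip():
            # An empty or absent translation is flagged automatically.
            fixed["status"] = MISSING
        result.append(fixed)

    result.sort(key=lambda e: e["en"].casefold())
    return result, duplicates
```

In this sketch duplicates are reported rather than removed, so a maintainer can decide which entry to keep.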
The workflow can be triggered manually in the GitHub Actions tab.
Caution
Before submitting a pull request (PR), ensure that the Automated Term Standardization workflow completes successfully. For guidance on when to run the workflow, refer to the wiki section describing the appropriate cases. Also make sure that the status of each contributed term is updated correctly.