Create Anki flashcard decks from text files, with automatic word counting, translation, and text-to-speech audio generation.
![screenshot](https://imgur.com/RwBlM8S)
- File Processing: Select one or more `.txt` files to extract words.
- Word Counting: Counts the frequency of each word.
- Customizable Word Limit: Specify the maximum number of most frequent words to process.
- Language Support:
  - Specify input language for accurate word extraction (especially for CJK languages).
  - Supported input languages include: English, Arabic, German, Spanish, French, Italian, Portuguese, Turkish, Dutch, Hebrew, Japanese, Korean, Russian, Chinese (Simplified), Swedish, Polish, Finnish, Greek, Hindi, Indonesian.
- Translation (Optional):
  - Translate extracted words to a target language using Google Translate.
  - Supported target languages: English, Arabic, German, Spanish, French, Italian, Portuguese.
  - Option to process words without translation.
- Text-to-Speech (TTS):
  - Generate audio pronunciation for words using Google Text-to-Speech.
  - Speak individual words from the list.
  - Speak all visible words in sequence.
- Anki Deck Export (.apkg):
  - Flexible export options for card fronts and backs:
    - Word (Front) / Translation (Back)
    - Translation (Front) / Word (Back)
    - Word (Front) / Speech (Back)
    - Translation (Front) / Speech + Word (Back)
    - Word (Front) / Speech + Translation (Back)
  - Audio is embedded in the Anki package.
  - Customizable deck name.
  - Option to select a temporary folder for audio file generation during export.
- User Interface:
  - Easy-to-use GUI built with Tkinter.
  - Progress bar for file processing and Anki export.
  - Results displayed in a sortable table (Word, Count, Translation).
  - Copy selected words to clipboard.
  - Responsive UI: Uses `asyncio` to prevent UI freezes during long operations like translation and TTS generation.
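
The word counting and word limit features can be pictured with a short sketch. This is not the application's actual code: `top_words` and its regular expression are assumptions made for illustration. Regex tokenization like this only suits languages that separate words with spaces or punctuation; CJK text needs language-specific segmentation, which is why the input language setting matters.

```python
import re
from collections import Counter
from typing import List, Tuple


def top_words(text: str, limit: int) -> List[Tuple[str, int]]:
    """Return up to `limit` (word, count) pairs, most frequent first.

    Hypothetical helper for illustration only.
    """
    # [^\W\d_]+ matches runs of Unicode letters (no digits, no underscores).
    words = re.findall(r"[^\W\d_]+", text.lower())
    # Counter.most_common sorts by count, highest first.
    return Counter(words).most_common(limit)


if __name__ == "__main__":
    sample = "The cat saw the dog. The dog ran."
    print(top_words(sample, 2))  # [('the', 3), ('dog', 2)]
```

Lowercasing before counting merges "The" and "the" into one entry, which is usually what a vocabulary deck wants.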
- Python 3.8+
- `tkinter` (usually included with the Python standard library)
- `pygame` (for audio playback)
- `googletrans==4.0.0-rc1` (for translation). Important: Use this specific version or a compatible one.
- `gTTS` (for text-to-speech)
- `genanki` (for creating Anki .apkg files)
- `pyperclip` (for clipboard operations)
- Clone the repository (or download the script):

      git clone <repository_url>
      cd <repository_directory>

- Install dependencies: It's highly recommended to use a virtual environment.

      python -m venv venv
      # On Windows
      venv\Scripts\activate
      # On macOS/Linux
      source venv/bin/activate

  Then install the required packages:

      pip install pygame googletrans==4.0.0-rc1 gTTS genanki pyperclip

  Note: If you encounter issues with `tkinter`, ensure it's installed with your Python distribution (it usually is, but on some Linux systems it is a separate package, e.g., `sudo apt-get install python3-tk`).
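
After installing, a quick way to confirm that the environment can see every dependency is a small check like the one below. It is a sketch and not part of the project: `missing_modules` and `REQUIRED_MODULES` are hypothetical names. It only tests whether each module can be located, not whether it imports cleanly, so a broken install could still pass.

```python
import importlib.util
from typing import Iterable, List

# Import names, which can differ from pip names: gTTS is imported as "gtts".
REQUIRED_MODULES = ("tkinter", "pygame", "googletrans", "gtts", "genanki", "pyperclip")


def missing_modules(names: Iterable[str] = REQUIRED_MODULES) -> List[str]:
    """Return the module names that cannot be located in this environment."""
    # find_spec returns None for a top-level module that is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

Run it inside the activated virtual environment so that it inspects the same interpreter the application will use.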
- Run the script:

      python anki_dictionary_creator.py  # Replace anki_dictionary_creator.py with the actual script name if different

- Select File(s): Click "Select File(s)" to choose one or more `.txt` files containing the text you want to process.
- Set Input Language: Choose the language of the text in your selected files from the "Input Lang" dropdown.
- Set Translate To: Choose the target language for translation. Select "None" if you don't want translation.
- Word Limit: Enter the maximum number of most frequent words you want to display and process.
- Deck Name (for export): Enter the desired name for your Anki deck.
- Process Files: Click "Process Files". The application will extract words, count them, translate (if a target language is selected), and display them in the table.
- Interact with Results:
  - Click the "🔊" icon next to a word to hear its pronunciation (uses the "Input Lang" setting for TTS).
  - Click "Speak All Visible" to hear all words in the current list.
  - Select rows and press `Ctrl+C` (or `Cmd+C` on macOS) to copy words to the clipboard.
- Export Anki Deck:
  - Choose an "Export As" format for your Anki cards.
  - Click "Export Anki Deck".
  - If your export format includes speech, you will be prompted to select a folder to temporarily store the generated audio files. These files will be packaged into the `.apkg` file.
  - Save the `.apkg` file.
- Import into Anki: Import the generated `.apkg` file into your Anki application.
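
The five "Export As" formats differ only in which values end up on the front and back of each card. The sketch below shows one way such a mapping could look. It is a guess at the idea, not the application's code: `card_sides` and the format keys are hypothetical. The `[sound:...]` tag is the syntax Anki uses in a field to play a media file bundled with the deck.

```python
from typing import Tuple


def card_sides(export_as: str, word: str, translation: str, audio_file: str) -> Tuple[str, str]:
    """Return the (front, back) field text for one card.

    The keys below are made-up labels for the five export formats.
    """
    sound = f"[sound:{audio_file}]"  # Anki field tag that plays a packaged media file
    layouts = {
        "word/translation": (word, translation),
        "translation/word": (translation, word),
        "word/speech": (word, sound),
        "translation/speech+word": (translation, f"{sound} {word}"),
        "word/speech+translation": (word, f"{sound} {translation}"),
    }
    return layouts[export_as]


if __name__ == "__main__":
    print(card_sides("word/speech+translation", "Hund", "dog", "hund.mp3"))
    # ('Hund', '[sound:hund.mp3] dog')
```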
- Audio for "Speak Word" / "Speak All": When you use the speak functions, temporary audio files are created in a `temp_audio_files` subdirectory of the directory where the script is run. The application currently does not delete this folder on exit, but it logs a message reminding you about it. You can delete this folder manually.
- Audio for Anki Export: You select a directory for these temporary files during the export process. These files are then packaged by `genanki`. It is generally safe to clean this user-selected directory after the `.apkg` file has been successfully created.
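
Since the `temp_audio_files` folder is not removed automatically, it can be cleaned up with a few lines of Python once the application is closed. This is a minimal sketch: `remove_temp_audio` is a hypothetical helper, and the default path assumes you run it from the same directory the script was run from.

```python
import shutil
from pathlib import Path


def remove_temp_audio(folder: str = "temp_audio_files") -> bool:
    """Delete the folder and everything in it; return True if it existed."""
    path = Path(folder)
    if not path.is_dir():
        return False  # nothing to clean up
    shutil.rmtree(path)  # removes the directory tree recursively
    return True


if __name__ == "__main__":
    if remove_temp_audio():
        print("Removed temp_audio_files")
    else:
        print("No temp_audio_files folder found")
```

The same function works for the export directory if you pass its path, but only do that after confirming the `.apkg` file was written.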
- Google Translate API Limits: The `googletrans` library uses an unofficial Google Translate API endpoint. Heavy usage (many words, frequent requests) can lead to temporary IP blocks (HTTP 429 errors). The application has some retry logic, but if you encounter persistent translation failures, try again later or process smaller batches of words.
- `googletrans` Version: The `4.0.0-rc1` version is specified due to its past stability with the API. Other versions might behave differently or require code adjustments.
- Audio Playback (Pygame): `pygame.mixer` initialization can sometimes fail on certain systems. If audio playback doesn't work, check the console for warnings.
- Large File Processing: While `asyncio` is used to keep the UI responsive, processing extremely large files or a very high word limit might still consume significant resources and time.
- Shutdown: Graceful shutdown of asyncio tasks and Tkinter is complex. If the application hangs on exit or doesn't close the console window immediately, there might be pending async operations or loop state issues.
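
The retry behaviour mentioned for rate limits is commonly implemented as retry with exponential backoff. The sketch below illustrates that general technique and makes no claim about how the application's own retry logic works. `translate_with_retry` is a hypothetical name, and `translate` stands in for whatever function performs the request.

```python
import time
from typing import Callable, Optional


def translate_with_retry(
    translate: Callable[[str], str],
    word: str,
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Call translate(word), waiting longer after each failure.

    Returns the translation, or None if every attempt failed.
    """
    for attempt in range(attempts):
        try:
            return translate(word)
        except Exception:
            # Broad catch keeps the sketch short; real code should catch
            # only the errors that signal a temporary failure.
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
    return None
```

Passing `sleep` as a parameter makes the function easy to test without real delays. Doubling the wait gives a blocked IP address progressively more time to recover before the next request.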
Contributions, bug reports, and feature requests are welcome! Please open an issue or submit a pull request.