Caution
This project is currently under development. Please see the issues for everything that still needs to be done before it is ready to use.
FluentAI is inspired by the method detailed in the paper "SmartPhone: Exploring Keyword Mnemonic with Auto-generated Verbal and Visual Cues" by Jaewook Lee and Andrew Lan. The aim is to recreate their approach using accessible, open-source models. The pipeline they propose, shown below, serves as the blueprint for our project. It illustrates how language learning can be automated by combining AI techniques with a proven language learning methodology. For an architectural overview, see our Figma board.
You can find the list of supported languages here.
The image below shows in more detail how the mnemonic word, the core of the project, is derived. The mnemonic word is a word that is easy to remember and that is associated with the word you want to learn. It is derived by using a pre-trained model to generate a sentence, which is then used to generate the mnemonic word. In the image above, this step is referred to as "TransPhoner", which is the source the image below is derived from.
The imageability of a word is a measure of how easily a word can be visualized. This is important for the mnemonic word, as it should be easy to visualize. To determine the imageability of a word, we train a model on this dataset. It includes the embeddings for each word and their imageability score. The embeddings are generated by the FastText model and these embeddings can be used to predict the imageability of words that are not in the dataset.
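As a rough illustration of predicting imageability from embeddings, the sketch below fits a small linear regressor with gradient descent. Everything in it is an assumption made for the example: the 3-dimensional vectors are made-up stand-ins for FastText embeddings, the scores are invented, and a linear model is used only because it fits in a few lines. The regressor the project actually trains is not specified here.

```python
# Toy stand-in for (embedding, imageability score) pairs. Real FastText
# embeddings have hundreds of dimensions; these 3-d vectors are made up.
TRAIN = [
    ([0.9, 0.1, 0.0], 0.95),
    ([0.8, 0.2, 0.1], 0.90),
    ([0.1, 0.9, 0.2], 0.20),
    ([0.0, 0.8, 0.3], 0.15),
]


def predict(weights, bias, embedding):
    """Linear prediction, clamped to the [0, 1] score range."""
    raw = bias + sum(w * x for w, x in zip(weights, embedding))
    return min(1.0, max(0.0, raw))


def mean_squared_error(weights, bias, data):
    """Average squared difference between predictions and known scores."""
    return sum((predict(weights, bias, x) - y) ** 2 for x, y in data) / len(data)


def fit_linear(data, epochs=2000, lr=0.1):
    """Fit a linear regressor with plain batch gradient descent."""
    dim = len(data[0][0])
    weights, bias = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in data:
            err = bias + sum(w * xi for w, xi in zip(weights, x)) - y
            grad_b += err
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
        bias -= lr * grad_b / len(data)
        weights = [w - lr * g / len(data) for w, g in zip(weights, grad_w)]
    return weights, bias


weights, bias = fit_linear(TRAIN)
# An embedding that was not part of the training data.
unseen = [0.85, 0.15, 0.05]
print(round(predict(weights, bias, unseen), 2))
```

Because the model only needs an embedding as input, any word that FastText can embed can receive a predicted score, even if it never appeared in the training data.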
The phonetic similarity of a word is a measure of how similar the pronunciation of two words is. This is important for the mnemonic word, as it should be easy to remember. Therefore we use this to determine which English words should be considered for the mnemonic word. We use the CLTS and PanPhon models to generate the feature vectors of the IPA representation of the words. These feature vectors are then used to calculate the phonetic similarity between the words. We use faiss to speed up the search for the most similar words.
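The search step can be sketched as a nearest-neighbour lookup over feature vectors. The vectors below are hand-made placeholders, not real CLTS or PanPhon output, and cosine similarity is an assumed choice of metric. The brute-force loop is only meant to show the idea; at vocabulary scale an index such as faiss would take over this part.

```python
import math

# Hypothetical, hand-made feature vectors standing in for the per-word
# vectors derived from IPA; real vectors would come from CLTS or PanPhon.
ENGLISH_WORDS = {
    "cat": [1.0, 0.0, 1.0, 0.0],
    "cut": [1.0, 0.2, 0.9, 0.0],
    "dog": [0.0, 1.0, 0.0, 1.0],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query, vocabulary, k=2):
    """Return the k vocabulary words whose vectors are closest to the query."""
    scored = [(word, cosine_similarity(query, vec)) for word, vec in vocabulary.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


query = [1.0, 0.1, 1.0, 0.0]  # made-up vector for a foreign word
for word, score in most_similar(query, ENGLISH_WORDS):
    print(f"{word}: {score:.3f}")
```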
The orthographic similarity of a word is a measure of how similar the spelling of two words is. This is a simple computation, and the user can select which comparison methods they would like to use.
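A minimal sketch of selectable spelling-similarity methods follows. The two methods shown (a normalized Levenshtein distance and the ratio from Python's standard `difflib.SequenceMatcher`) and the names `METHODS` and `orthographic_similarity` are illustrative assumptions, not necessarily the methods the project offers.

```python
from difflib import SequenceMatcher


def levenshtein_distance(a, b):
    """Minimum number of single-character edits turning a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def levenshtein_similarity(a, b):
    """Edit distance rescaled so that 1.0 means identical spelling."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein_distance(a, b) / longest


def sequence_ratio(a, b):
    """Similarity ratio based on matching subsequences."""
    return SequenceMatcher(None, a, b).ratio()


# Hypothetical registry of methods a user could choose from.
METHODS = {"levenshtein": levenshtein_similarity, "ratio": sequence_ratio}


def orthographic_similarity(a, b, methods=("levenshtein", "ratio")):
    """Average the selected methods into one score between 0 and 1."""
    scores = [METHODS[name](a.lower(), b.lower()) for name in methods]
    return sum(scores) / len(scores)


print(orthographic_similarity("Katze", "cat"))
```

Averaging the selected methods is one simple way to get a single score; any other aggregation would work as long as the result stays between 0 and 1.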
The semantic similarity of a word is a measure of how similar the meaning of two words is. The FastText model is used to generate the embeddings of the words and these embeddings are used to calculate the semantic similarity between the words.
To determine the best mnemonic word, we use the methods described above. The results of each method are given as a score (between 0 and 1) and these scores are combined to determine the best mnemonic word. The user can select the weights of each method to determine how important each method is.
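One way to combine the scores is a weighted average, sketched below. The exact combination rule is an assumption, as are the function names, the candidate words, and all the numbers; the sketch only shows how user-chosen weights can shift which candidate wins.

```python
def combine_scores(scores, weights):
    """Weighted average of per-method scores, each between 0 and 1."""
    total_weight = sum(weights[name] for name in scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(scores[name] * weights[name] for name in scores) / total_weight


def best_mnemonic(candidates, weights):
    """Pick the candidate word with the highest combined score."""
    return max(candidates, key=lambda word: combine_scores(candidates[word], weights))


# Made-up weights: this user cares most about phonetic similarity.
weights = {"imageability": 1.0, "phonetic": 2.0, "orthographic": 0.5, "semantic": 0.5}

# Made-up per-method scores for two candidate mnemonic words.
candidates = {
    "cat": {"imageability": 0.9, "phonetic": 0.8, "orthographic": 0.4, "semantic": 0.2},
    "cattle": {"imageability": 0.7, "phonetic": 0.6, "orthographic": 0.5, "semantic": 0.3},
}

print(best_mnemonic(candidates, weights))
```

Dividing by the total weight keeps the combined score between 0 and 1 regardless of how the user scales the weights.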
TODO
Before starting, make sure you have the following requirements:
- Anki installed on your device.
- Anki-Connect: this add-on allows you to add cards to Anki from the command line.
- Add the deck in /deck/FluentAI.apkg to your Anki application. You can do this by dragging and dropping the file into the Anki application.
The packages required to run this code are listed in the requirements.txt file. To install them, run the following command after cloning the repository:
pip install -r requirements.txt
Alternatively, install the package directly from GitHub:
pip install git+https://github.com/StephanAkkerman/FluentAI.git
If you would like to use a GPU to run the code, you can install the torch package with CUDA support. You can find the installation instructions here.
TODO
If you use this project in your research, please cite as follows:
@misc{FluentAI,
author = {Stephan Akkerman and Winston Lam},
title = {FluentAI},
year = {2024},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/StephanAkkerman/FluentAI}}
}
Contributions are welcome! If you have a feature request, bug report, or proposal for code refactoring, please feel free to open an issue on GitHub. We appreciate your help in improving this project.
If you would like to make code contributions yourself, please read CONTRIBUTING.MD.
This project is licensed under the MIT License. See the LICENSE file for details.