- Python 3.7
- PyTorch 1.8.0
- transformers 4.19.0
- numpy 1.21.4
- scipy 1.7.1
- pandas 1.3.4
- umap-learn 0.5.2
- scikit-learn 0.24.2
- sentence-transformers 2.2.0
Code is not tested with other versions.
You can add your own data by refering to the data class in data.py
. The data should be in the format of a list of documents, where each document is a string.
ContrastiveTM-main.ipynb
is the main notebook for training and evaluating the model.
- Modularize the code
- Make it into an installable package
- Test on different Python/PyTorch versions
Some parts of the implementation is based on the code from contextualized-topic-models.