DNA sequence prediction using Transformer-based models. This is the second version of Enigma. It has two models: a BERT-based model for classification and analysis, and a transformer-based, AlphaFold-like model.
More technical details about the models are available in the documentation: Models.md
Enigma uses a custom-built pipeline to fetch datasets from the trusted NCBI database through the EnigmaDataset library, which anyone can download and use with the proper NCBI-specified parameters. If you prefer the pre-fetched database, you can download it from here: huggingface/EnigmaDatasaet
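To give a rough idea of what fetching sequences from NCBI involves, here is a minimal sketch that talks to the public NCBI E-utilities `efetch` endpoint directly. It is not the EnigmaDataset API: the helper names (`build_efetch_url`, `parse_fasta`, `fetch_sequences`) and the default parameters are assumptions made for illustration only, so check the EnigmaDataset documentation for the actual interface.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Public NCBI E-utilities endpoint for downloading records.
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(ids, db="nuccore", rettype="fasta", retmode="text"):
    """Build an efetch URL for a list of accession IDs (hypothetical helper)."""
    params = {
        "db": db,              # NCBI database to query, e.g. nucleotide records
        "id": ",".join(ids),   # comma-separated accession IDs
        "rettype": rettype,    # record format
        "retmode": retmode,    # plain text response
    }
    return f"{EFETCH_URL}?{urlencode(params)}"


def parse_fasta(text):
    """Parse FASTA text into a {header: sequence} dict (hypothetical helper)."""
    records = {}
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records[header] = "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line.upper())
    if header is not None:
        records[header] = "".join(chunks)
    return records


def fetch_sequences(ids):
    """Download and parse sequences; requires network access."""
    with urlopen(build_efetch_url(ids), timeout=30) as response:
        return parse_fasta(response.read().decode("utf-8"))
```

Calling `fetch_sequences(["NM_000546"])` would request that accession from the `nuccore` database and return its sequence keyed by the FASTA header line. NCBI rate-limits unauthenticated requests, so larger downloads are better served by the pre-fetched dataset.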
- Fork the repository.
- Create a feature branch:
git checkout -b feature-name
- Commit your changes:
git commit -m "Add feature"
- Push to the branch:
git push origin feature-name
- Create a pull request.
This project is licensed under the Apache License 2.0. See the LICENSE file for details.