This repository contains the main code for data collection and analysis of CRISPR self-targeting spacers.
self-target-proteins.tsv
: Main data table used in the study (most of the columns were generated by the Python scripts in this repository).

gene-product.py
: Code used to obtain annotations for protospacer sequences.

protein-info.py
: Code used to obtain additional information about the proteins targeted by self-targeting spacers.

download-hypothetical.py
: Code used to download the sequences of hypothetical proteins targeted by self-targeting spacers.

cctyper.py
: Prediction of CRISPR types.

hypothetical.fasta
: Amino acid sequences of hypothetical proteins.

taxonomy.py
: Code used to obtain taxonomic information for each record in the data table.

protospacer-location.py
: Analysis of protospacer location within coding sequences.

scanprosite-result.tsv
: Results of the ScanProsite analysis of hypothetical proteins.
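
The data files listed above can be inspected without running any of the analysis scripts. The following is a minimal sketch, not part of the repository's code, that uses only the Python standard library. It assumes that `self-target-proteins.tsv` is tab-separated with a header row and that `hypothetical.fasta` is a plain FASTA file; the helper names `read_tsv` and `read_fasta` are hypothetical, and no column names are assumed.

```python
import csv
from pathlib import Path


def read_tsv(path):
    """Return (column_names, rows) for a tab-separated file.

    Assumption: the first line of the file is a header row.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        rows = list(reader)
        return reader.fieldnames or [], rows


def read_fasta(path):
    """Return a dict mapping each FASTA header (without '>') to its sequence."""
    sequences = {}
    header = None
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                header = line[1:]
                sequences[header] = []
            elif header is not None:
                # Sequences may be wrapped over several lines.
                sequences[header].append(line)
    return {name: "".join(parts) for name, parts in sequences.items()}


if __name__ == "__main__":
    table = Path("self-target-proteins.tsv")
    fasta = Path("hypothetical.fasta")
    if table.exists():
        columns, rows = read_tsv(table)
        print(f"{len(rows)} rows, columns: {columns}")
    if fasta.exists():
        proteins = read_fasta(fasta)
        print(f"{len(proteins)} hypothetical protein sequences")
```

Run from the repository root, this prints the row count and column names of the main table and the number of sequences in the FASTA file, which is a quick way to check which columns are available before working with the scripts.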