This repository is no longer maintained. Please refer to npanuhin/BIOCAD for a continuation of this project.
This repository is not intended to showcase finished work; it exists to store and transfer data.
Legend: ✅ works as intended; ⚠ there are problems, but a solution is possible; ❌ there are problems that make the solution wrong.
- ✅ large01
- ✅ large02
- ✅ large03
- ✅ large04
- ✅ large05
- ✅ large06
- ✅ large07
- ⚠ large08
- ✅ large09
- ⚠ large10
- ✅ large11
- ❌ large12
- ✅ small (BWA⚠)
This repository also includes C++ implementations of various algorithms,
such as the Burrows–Wheeler transform, the Knuth–Morris–Pratt algorithm and k-mer compression.
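As a rough illustration of the first algorithm named above, here is a minimal Python sketch of the Burrows–Wheeler transform and its inverse. It is not taken from this repository (whose implementations are in C++); the function names `bwt` and `inverse_bwt` and the `$` end marker are assumptions made for this example, and the naive sorted-rotations approach is chosen for readability, not speed.

```python
def bwt(text: str, end_marker: str = "$") -> str:
    """Naive Burrows-Wheeler transform: sort all rotations, keep the last column."""
    if end_marker in text:
        raise ValueError("text must not contain the end marker")
    s = text + end_marker
    # Every cyclic rotation of s, in lexicographic order.
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(transformed: str, end_marker: str = "$") -> str:
    """Rebuild the original text by repeatedly prepending the last column and sorting."""
    table = [""] * len(transformed)
    for _ in range(len(transformed)):
        table = sorted(ch + row for ch, row in zip(transformed, table))
    # The row that ends with the marker is the original text plus the marker.
    original = next(row for row in table if row.endswith(end_marker))
    return original[:-1]


if __name__ == "__main__":
    encoded = bwt("banana")
    print(encoded)               # annb$aa
    print(inverse_bwt(encoded))  # banana
```

Both functions are quadratic or worse in the input length, so they only suit short strings; genome-scale tools such as BWA rely on suffix-array-based constructions instead.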
- BWA indexes two `fasta` sequences
- BWA aligns these two sequences
- samtools converts `sam` file to `bam` file (currently disabled)
- samtools sorts `bam` file (currently disabled)
- sam2pairwise converts `sam` file to pairwise (`txt` file) (currently disabled)
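The steps above could be driven from a script along the following lines. This is a sketch under stated assumptions, not the repository's actual code: the output file names (`bwa_output.sam` and so on), the `Step` class, and the choice of `bwa mem` as the alignment subcommand are all hypothetical. It also assumes that `sam2pairwise` reads SAM from standard input and writes to standard output.

```python
import subprocess
from dataclasses import dataclass
from pathlib import Path
from typing import List, Optional


@dataclass
class Step:
    argv: List[str]                # command and arguments
    stdin: Optional[Path] = None   # file fed to standard input, if any
    stdout: Optional[Path] = None  # file receiving standard output, if any
    enabled: bool = True           # disabled steps are listed but skipped


def build_steps(genome1: Path, genome2: Path, out_dir: Path) -> List[Step]:
    # Hypothetical output names, chosen only for this example.
    sam = out_dir / "bwa_output.sam"
    bam = out_dir / "bwa_output.bam"
    sorted_bam = out_dir / "bwa_output_sorted.bam"
    pairwise = out_dir / "bwa_output.txt"
    return [
        Step(["bwa", "index", str(genome1)]),
        Step(["bwa", "index", str(genome2)]),
        # Assumption: alignment via `bwa mem`, which prints SAM to stdout.
        Step(["bwa", "mem", str(genome1), str(genome2)], stdout=sam),
        Step(["samtools", "view", "-b", "-o", str(bam), str(sam)], enabled=False),
        Step(["samtools", "sort", "-o", str(sorted_bam), str(bam)], enabled=False),
        Step(["sam2pairwise"], stdin=sam, stdout=pairwise, enabled=False),
    ]


def run_steps(steps: List[Step]) -> None:
    """Run enabled steps in order; out_dir must already exist."""
    for step in steps:
        if not step.enabled:
            continue
        stdin = open(step.stdin, "rb") if step.stdin else None
        stdout = open(step.stdout, "wb") if step.stdout else None
        try:
            subprocess.run(step.argv, stdin=stdin, stdout=stdout, check=True)
        finally:
            for handle in (stdin, stdout):
                if handle is not None:
                    handle.close()


if __name__ == "__main__":
    folder = Path("large01")
    steps = build_steps(folder / "large_genome1.fasta",
                        folder / "large_genome2.fasta", folder)
    for step in steps:
        print("run " if step.enabled else "skip", " ".join(step.argv))
```

Separating `build_steps` from `run_steps` makes it possible to inspect or print the planned commands without BWA or Samtools installed, and to re-enable a disabled step by flipping its `enabled` flag.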
For `SAM` and `pairwise` files, word wrap should be disabled.
- BWA: http://bio-bwa.sourceforge.net
- Samtools: https://www.htslib.org
- sam2pairwise: https://github.com/mlafave/sam2pairwise
Or run `sudo apt install bwa samtools`
- `large01/large_genome1.fasta`: Rickettsia rickettsii str. Brazil, complete genome
- `large01/large_genome2.fasta`: Rickettsia rickettsii str. Iowa, complete genome
- `large02/large_genome1.fasta`: Brucella abortus 104M chromosome 1, complete sequence
- `large02/large_genome2.fasta`: Brucella suis bv. 2 strain Bs143CITA chromosome I, complete sequence
- `large03/large_genome1.fasta`: Brucella abortus 104M chromosome 2, complete sequence
- `large03/large_genome2.fasta`: Brucella suis bv. 2 strain Bs143CITA chromosome II, complete sequence
- `large04/large_genome1.fasta`: Brucella pinnipedialis B2/94 chromosome 2, complete sequence
- `large04/large_genome2.fasta`: Brucella melitensis biovar Abortus 2308 chromosome II, complete sequence, strain 2308
- `large05/large_genome1.fasta`: Rickettsia rickettsii str. Iowa, complete sequence
- `large05/large_genome2.fasta`: Rickettsia prowazekii str. Madrid E, complete genome
- `large06/large_genome1.fasta`: Methanococcus maripaludis C5, complete genome
- `large06/large_genome2.fasta`: Methanococcus maripaludis X1, complete genome
- `large07/large_genome1.fasta`: Mycobacterium tuberculosis variant africanum GM041182, complete genome
- `large07/large_genome2.fasta`: Mycobacterium intracellulare ATCC 13950, complete sequence
- `large08/large_genome1.fasta`: Desulfurococcus kamchatkensis 1221n, complete genome
- `large08/large_genome2.fasta`: Desulfurococcus fermentans DSM 16532, complete genome
- `large09/large_genome1.fasta`: Sulfolobus islandicus M.16.27, complete genome
- `large09/large_genome2.fasta`: Sulfolobus islandicus REY15A, complete genome
- `large10/large_genome1.fasta`: Rickettsia canadensis str. CA410, complete genome
- `large10/large_genome2.fasta`: Rickettsia conorii str. Malish 7, complete sequence
- `large11/large_genome1.fasta`: Rickettsia canadensis str. CA410, complete genome
- `large11/large_genome2.fasta`: Rickettsia sibirica 246 chromosome, whole genome shotgun sequence