Added db and updated documentation
kalilamali committed Jun 14, 2024
1 parent 285771a commit 4a46378
Showing 21 changed files with 1,403 additions and 166 deletions.
162 changes: 0 additions & 162 deletions .gitignore

This file was deleted.

66 changes: 65 additions & 1 deletion README.md
@@ -1,2 +1,66 @@
# cdiff_fbi
cdiff pipeline
Typing of *Clostridioides difficile* isolates from NGS data (reads and contigs), based on tandem repeat loci (TR6, TR10) and toxin genes (cdtA, cdtB, tcdA, tcdB, tcdC).

## Quick start
```bash
# Type
bash cdifftyping.sh -h
# Process
bash postcdifftyping.sh -h
# Summarize
python3 qc_cdiff_summary.py -h
```

## Installation
### Source
```bash
# Clone this repo
git clone https://github.com/ssi-dk/cdiff_fbi.git
# Create an environment with the required tools with conda
conda create --name cdiff_pipeline picard gatk4 biopython ruamel.yaml kraken bwa samtools
# Activate the environment
conda activate cdiff_pipeline
# Install a custom tool
git clone https://github.com/ssi-dk/serum_readfilter
cd serum_readfilter
pip install .
```

## Usage
### Example
```bash
# Download data into the test folder
mkdir -p test
wget -nc ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR142/ERR142064/ERR142064_2.fastq.gz -P test
wget -nc ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR142/ERR142064/ERR142064_1.fastq.gz -P test
touch test/ERR142064.fasta # Create an empty file as a fake assembly for testing purposes
```

### Pipeline
```bash
# Build the db
bash cdifftyping.sh -db db -update yes # WARNING: Not yet implemented for serumdb or trstdb
# Type
bash cdifftyping.sh -i ERR142064 -R1 test/ERR142064_1.fastq.gz -R2 test/ERR142064_2.fastq.gz -c test/ERR142064.fasta -qc pass -o test -db db -update no
# Process
bash postcdifftyping.sh -i ERR142064 -d test -stbit "STNA;NA:NA"
# Summarize
python3 qc_cdiff_summary.py -i test -o test
```

## Output
### .csv
```
Name;cdtA/B;tcdA;tcdB;tcdClength;117del;A117T;TRST;TR6;TR10;ST;STalleles;WGS;tcdA:tcdB:tcdC:cdtA:cdtB
ERR142064;+/+;+;+;0;+;-;Unknown;Unknown;Unknown;STNA;NA:NA;test;8119/8133:6914/7101:700/700:1389/1389:2628/2628
```
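
The summary is semicolon-delimited, and its last column packs the per-gene coverage values into one colon-separated field. A minimal sketch of reading it back in Python, assuming the layout shown above; `read_summary` and the `cov_info` key are hypothetical names used for illustration and are not part of the pipeline:

```python
import csv
import io

# Example rows copied from the output shown above.
EXAMPLE = (
    "Name;cdtA/B;tcdA;tcdB;tcdClength;117del;A117T;TRST;TR6;TR10;ST;STalleles;WGS;"
    "tcdA:tcdB:tcdC:cdtA:cdtB\n"
    "ERR142064;+/+;+;+;0;+;-;Unknown;Unknown;Unknown;STNA;NA:NA;test;"
    "8119/8133:6914/7101:700/700:1389/1389:2628/2628\n"
)

# Header of the packed coverage column; its name doubles as the list of genes.
COV_COLUMN = "tcdA:tcdB:tcdC:cdtA:cdtB"


def read_summary(handle):
    """Return one dict per isolate, with the packed coverage column split per gene."""
    rows = []
    for row in csv.DictReader(handle, delimiter=";"):
        genes = COV_COLUMN.split(":")
        values = row.pop(COV_COLUMN).split(":")
        row["cov_info"] = dict(zip(genes, values))  # hypothetical key name
        rows.append(row)
    return rows


rows = read_summary(io.StringIO(EXAMPLE))
print(rows[0]["Name"], rows[0]["cov_info"]["tcdC"])
```

For a real run, pass an open file handle for the summary `.csv` instead of the `io.StringIO` example.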
### .json
```
{"Name": "ERR142064", "cdtA": "+", "cdtB": "+", "tcdA": "+", "tcdB": "+", "tcdClength": "0", "117del": "+", "A117T": "-", "TRST": "Unknown", "TR6": "Unknown", "TR10": "Unknown", "ST": "STNA;NA:NA", "WGS": "test", "cov_info": {"tcdA": "8119/8133", "tcdB": "6914/7101", "tcdC": "700/700", "cdtA": "1389/1389", "cdtB": "2628/2628"}}
```
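
The `cov_info` values can be turned into fractions for quick filtering. The sketch below assumes each value reads as covered positions over reference length; the README does not state this, so treat that interpretation as an assumption. `coverage_fractions` is a hypothetical helper, not part of the pipeline:

```python
import json

# Record copied from the output shown above.
RECORD = (
    '{"Name": "ERR142064", "cdtA": "+", "cdtB": "+", "tcdA": "+", "tcdB": "+", '
    '"tcdClength": "0", "117del": "+", "A117T": "-", "TRST": "Unknown", '
    '"TR6": "Unknown", "TR10": "Unknown", "ST": "STNA;NA:NA", "WGS": "test", '
    '"cov_info": {"tcdA": "8119/8133", "tcdB": "6914/7101", "tcdC": "700/700", '
    '"cdtA": "1389/1389", "cdtB": "2628/2628"}}'
)


def coverage_fractions(record):
    """Convert each 'a/b' coverage string into the fraction a / b."""
    fractions = {}
    for gene, value in record["cov_info"].items():
        # Assumption: the string is "<covered>/<total>".
        covered, total = (int(part) for part in value.split("/"))
        fractions[gene] = covered / total if total else 0.0
    return fractions


record = json.loads(RECORD)
fractions = coverage_fractions(record)
for gene, fraction in fractions.items():
    print(f"{gene}: {fraction:.3f}")
```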

## Updating the db
```bash
# Build the db
bash cdifftyping.sh -db db -update yes # WARNING: Not yet implemented for serumdb or trstdb
```
1 change: 1 addition & 0 deletions TRSTfinder3.py
@@ -72,6 +72,7 @@ def revtrans(seq):


## Main
# Tandem repeat loci TR6 and TR10
TR6fragments = parse_repeat_sequences(args.trstdb, "TR6_repeat_sequences.ashx")
TR10fragments = parse_repeat_sequences(args.trstdb, "TR10_repeat_sequences.ashx")
TR6 = parse_types(args.trstdb, "TR6_types.ashx", TR6fragments)
4 changes: 1 addition & 3 deletions cdifftyping.sh
@@ -90,9 +90,7 @@ echo "TRSTDB: $trstdb"
 if [[ "$update" == "yes" ]] ; then
 echo -e "\n# Updating dbs..."
 ## TODO: Find which command was used to make the serumdb, code below is a draft!
-echo -e "\n# WARNING!: Code not implemented yet!"
-echo -e "\n# Exiting.."
-exit 1
+echo -e "\n# WARNING!: Not yet implemented for serumdb or trstdb!"
 # # cdiff_serum_readfilter
 # echo -e "\n# Updating cdiff_serum_readfilter..."
 # rm -r $serumdb/library $serumdb/taxonomy || true # Remove old folders
144 changes: 144 additions & 0 deletions db/cdiff_TRST/TR10_repeat_sequences.ashx
@@ -0,0 +1,144 @@
>N001
AAATTAATTATTATATTTCTTT
>N002
AAATTAATTTTCTATATTTCTT
>N003
AAATTAATGTATTGTATTTCTTT
>N004
AAATTAGTTTATTATACTTCTTT
>N005
AGATTAATTTTCTATACTTCCT
>N006
AGATTAGCTTTCTATACTTCCT
>N007
AGATTAGCTTTCTATATTTCTT
>N008
AAATTAATTTTCTATACTTCCT
>N009
AAATTAGTTTATTACACTTCTTT
>N010
AGATTAATTTTCTATACTTTCT
>N011
AAATTAATTTTCTATACTTCTT
>N012
AAATTAGTTTACTATACTTCTTT
>N013
AGATTAGCTTTCTATACTTCTTT
>N014
AGATTAATTTTCTACACTTCCT
>N015
AAATTAATTTATTATATTTTTT
>N016
AAATTAATTTATTGTATTTCTTT
>N017
AAATTAATTTTCTATGTTTCTT
>N018
AAATTAGCTTATTATACTTTTT
>N019
GAATTAGTTTATTATACTTCTTT
>N020
AGATTAATTTTCTATATTTCTT
>N021
AAATTAGTTTATTATACTTCCT
>N022
AAATTAGTTCATTATACTTCTTT
>N023
AAATTAATTTTCTACACTTCCT
>N024
AAATTAGTTTATTATATTTCTT
>N025
AAATTAATATATTGTATTTCTTT
>N026
AGATTAATTTTCTATACTTCTTT
>N027
AAGTTAATTTATTGTATTTCTTT
>N028
AGATTAATTTTCTATATTTCCT
>N029
AAATTAATTTATTATATTTCTTT
>N030
AAGTTAGCTCATTATACTTCTTT
>N031
AAATTAGCTTATTATACTTCTT
>N032
AGATTAACTTTCTATACTTTCT
>N033
AGATTAGTTTTCTATACTTCCT
>N034
AAATTAGTTTATTATGCTTCTTT
>N035
AAATTAATTTTTTATACTTCCT
>N036
AGATTAATTTTCCATACTTCCT
>N037
AAATTAGCTCATTATACTTCTTT
>N038
AAATTAATTTTCCATATTTCTT
>N039
AGATTAATTTTCTATCCTTCCT
>N040
AAACTAATTTTCTATACTTCCT
>N041
AAATTAGTTTATTATACTTTTT
>N042
AAATTAATTTTTTATACTTCTTT
>N043
AAATTAGCTTATTATATTTCTT
>N044
AGATTAGCTTTCTATACTTTCT
>N045
AAATTAGTTTATTATACTTATTT
>N046
AAATTAATTTTCTATACTTTCT
>N047
AAATTAATTATTGTGTTTCTTT
>N048
GAATTAGTTTATTATACTCCTTT
>N049
AAATTAATTTTCTATCCTTCCT
>N050
AAATTAATGTATTGTGTTTCTTT
>N051
AGATTAATTTCCTATACTTCCT
>N052
AAATTAGCTTTCTATACTTCCT
>N053
AAATCAATTTTCTATATTTCTT
>N054
AAATTAATTTTCCATACTTCTT
>N055
AGATTAGCTTTTTATACTTCTTT
>N056
AGATTAATTTTATATACTTCCT
>N057
AAATTAGTTTATTATACTTCTT
>N058
AGATTAGCTTTTTATACTTCCT
>N059
AAATTAGTTTATTATATTTTTT
>N060
AAATTAGTTTATTATATTTCTTT
>N061
AAATTAATTTATTATACTTCTTT
>N062
AAATTAGTTTATTACACTTCCTT
>N063
AAATTAATCTTCTATACTTCCT
>N064
AAATTAGTTTATTATACTTCTCT
>N065
AAATTAGTTTGTTACACTTCTTT
>N066
AAATTAGTTTTCTATATTTCTT
>N067
AAATTAAATTTCTATATTTCTT
>N068
AGATTAGCTTTCTATACTTCCTT
>N069
AAATTAGTTTATTGTACTTCTTT
>N070
AAATTAGCTTATTATACTTCTTT
>N071
AAATTAATTTTCTATACTTCTTT
>N072
AGATTAGCTTTCTATACTTCCTA
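
The repeat file above is plain FASTA: a `>` header line with the repeat ID, followed by its sequence. A minimal sketch of loading such a file into a dictionary is shown here; `parse_fasta` is a hypothetical helper and is not the `parse_repeat_sequences` function used in `TRSTfinder3.py`, whose implementation is not shown in this diff:

```python
def parse_fasta(lines):
    """Map each '>' header (without the '>') to its concatenated, uppercased sequence."""
    records = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            name = line[1:]
            records[name] = ""
        elif name is not None:
            records[name] += line.upper()
    return records


# First two records copied from the file above.
example = [
    ">N001",
    "AAATTAATTATTATATTTCTTT",
    ">N002",
    "AAATTAATTTTCTATATTTCTT",
]
repeats = parse_fasta(example)
print(sorted(repeats))
```

To read the real file, pass an open handle, for example `parse_fasta(open("db/cdiff_TRST/TR10_repeat_sequences.ashx"))`.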