SRR_database_extraction

Scripts to extract data from the SRR database for training a deep learning algorithm.

Structure

Folder: SRRIDS

Contains two files, bam_ids.txt and fastq_ids.txt, which hold the IDs extracted from the SRR table. The extraction command is run once for each of these IDs.
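As an illustration of running one command per ID, here is a minimal sketch. The helper name `for_each_id` is hypothetical and not part of the repository, and skipping blank lines and `#` comment lines is an assumption about the ID file format.

```shell
# Hypothetical helper: call a command once per ID listed in a file.
# Blank lines and lines starting with '#' are skipped (assumed file format).
for_each_id() {
    ids_file=$1
    shift
    # "|| [ -n "$id" ]" keeps the last line even if it lacks a trailing newline.
    while IFS= read -r id || [ -n "$id" ]; do
        case $id in
            ''|'#'*) continue ;;
        esac
        # </dev/null stops the command from consuming the rest of the ID file.
        "$@" "$id" < /dev/null
    done < "$ids_file"
}

# Example: print each ID instead of downloading it.
# for_each_id SRRIDS/fastq_ids.txt echo "would fetch"
```

Passing `echo` as the command, as in the commented example, is a cheap way to check the ID list before starting real downloads.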

convert.sh

Not used at the moment, but its skeleton can be reused if you need to convert files to BAM or FASTQ.

fastqfetch.sh

The main script. It fetches the files online and downloads each one into the given workspace in FASTQ format, split into two reads.
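The actual contents of fastqfetch.sh are not shown here, so the following is only a sketch of how such a download could look with SRA Toolkit's `prefetch` and `fasterq-dump`. The variables `NGC_FILE`, `WORKSPACE` and `RUN`, the function name `fetch_fastq`, and the one-folder-per-ID layout are all assumptions made for illustration.

```shell
# Sketch only: every name below is a placeholder, not the repository's own.
NGC_FILE=${NGC_FILE-your_key.ngc}   # path to the .ngc access key
WORKSPACE=${WORKSPACE-./workspace}  # where the FASTQ files should land
RUN=${RUN-echo}                     # default "echo" only prints the commands;
                                    # set RUN= (empty) to really execute them

fetch_fastq() {
    id=$1
    out_dir=$WORKSPACE/$id
    $RUN mkdir -p "$out_dir"
    # prefetch downloads the run into $WORKSPACE/$id; --ngc passes the key.
    $RUN prefetch --ngc "$NGC_FILE" --output-directory "$WORKSPACE" "$id"
    # fasterq-dump converts the run to FASTQ; --split-files writes one file
    # per read, which gives two files for paired-end data.
    $RUN fasterq-dump --ngc "$NGC_FILE" --split-files --outdir "$out_dir" "$out_dir"
}

# Example (dry run): fetch_fastq SRR000001
```

Keeping the dry-run default means nothing is downloaded until `RUN` is explicitly emptied, which is convenient when checking paths and the key file location.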

filedone.txt

Contains every ID that has already been downloaded.

names.sh

RUN IT BEFORE ANY fastqfetch.sh USAGE. This script writes to filedone.txt, which is then used to avoid downloading the same files twice.
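One way such bookkeeping could work is sketched below. It assumes that each downloaded ID has its own sub-folder in the workspace, which may not match what names.sh really does; `list_done_ids` and `pending_ids` are hypothetical names.

```shell
# Hypothetical: list already-downloaded IDs, assuming one sub-folder per ID.
list_done_ids() {
    workspace=$1
    for dir in "$workspace"/*/; do
        [ -d "$dir" ] || continue
        basename "$dir"
    done
}

# Print the IDs from an ID file that are not yet listed in the done file.
# grep flags: -v invert, -x whole-line match, -F fixed strings, -f patterns from file.
pending_ids() {
    ids_file=$1
    done_file=$2
    if [ -s "$done_file" ]; then
        grep -vxFf "$done_file" "$ids_file"
    else
        # Nothing downloaded yet: every ID is still pending.
        cat "$ids_file"
    fi
}

# Example:
# list_done_ids /path/to/workspace > filedone.txt
# pending_ids SRRIDS/fastq_ids.txt filedone.txt
```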

ngc file

The key required to download SRR data. It must be in the folder.

sorting.sh

Script that extracts the IDs from the initial SRR table.
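The format of the SRR table is not described here. As a rough sketch, assuming a comma-separated table with a header row and a column named `Run`, the IDs could be pulled out as follows. The function name `extract_run_ids` is hypothetical, the comma splitting is naive (it does not handle quoted fields that contain commas), and how the IDs are divided between bam_ids.txt and fastq_ids.txt is not covered.

```shell
# Hypothetical: print the values of one column of a simple CSV table.
# Assumes a header row and no quoted fields containing commas.
extract_run_ids() {
    table=$1
    column=${2:-Run}
    awk -F',' -v col="$column" '
        # Header row: remember which column carries the requested name.
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == col) idx = i; next }
        # Data rows: print the ID if the column was found and is not empty.
        idx && $idx != "" { print $idx }
    ' "$table"
}

# Example: extract_run_ids table.csv > ids.txt
```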

srabatch.sh

Script that launches fastqfetch.sh. It runs for up to 7 days and downloads as many files as it can in the allowed time. Running multiple instances may help, but make sure they do not download the same files at the same time, for example by splitting the ID files.
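The time-limited loop can be sketched as below. This is not the content of srabatch.sh: `run_until_deadline` is a hypothetical name, and the way fastqfetch.sh receives an ID (as its last argument) is an assumption.

```shell
# Hypothetical: run a command for each ID until a time budget (seconds) is used up.
run_until_deadline() {
    ids_file=$1
    budget=$2
    shift 2
    deadline=$(( $(date +%s) + budget ))
    while IFS= read -r id; do
        [ -n "$id" ] || continue
        # Check the clock before each download; a running download is not interrupted.
        if [ "$(date +%s)" -ge "$deadline" ]; then
            echo "time budget reached, stopping before $id" >&2
            break
        fi
        "$@" "$id" < /dev/null
    done < "$ids_file"
}

# 7 days = 7 * 24 * 3600 = 604800 seconds
# run_until_deadline SRRIDS/fastq_ids.txt 604800 ./fastqfetch.sh

# To run several instances without overlap, the ID file can first be cut into
# chunks of 500 lines each, one chunk per instance:
# split -l 500 SRRIDS/fastq_ids.txt fastq_ids_part_
```

Giving each instance its own chunk avoids the need for any locking between instances, at the cost of uneven finishing times if some runs are much larger than others.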

suppr.py

Deletes empty folders in the workspace, if needed.
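suppr.py itself is a Python script and is not reproduced here. For reference, a shell equivalent of the same cleanup could look like this sketch, where `remove_empty_dirs` is a hypothetical name and `find` is assumed to support `-empty` and `-delete` (GNU and BSD versions do).

```shell
# Hypothetical: delete every empty folder below the workspace, keeping the workspace itself.
remove_empty_dirs() {
    workspace=$1
    # -mindepth 1 protects the workspace root.
    # -delete processes children before parents, so a folder that only
    # contained empty folders is removed as well.
    find "$workspace" -mindepth 1 -type d -empty -delete
}

# Example: remove_empty_dirs /path/to/workspace
```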

