William Van Hevelingen edited this page Aug 26, 2014 · 2 revisions

Command Line Usage

This page describes how to use the codonpdx command line tool.

Mirroring

Make sure your config file, config/mirror.cfg, is set up before running the command. You will also need to make sure the lftp package is installed on your machine.
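Both prerequisites can be checked before starting a long mirror run. The sketch below is a hypothetical helper, not part of codonpdx: the function name check_mirror_prereqs is made up for illustration, and it only checks that the config file exists and that lftp is on the PATH, not that the config contents are valid.

```shell
# Hypothetical preflight helper (not part of codonpdx).
# Usage: check_mirror_prereqs [path/to/mirror.cfg]
check_mirror_prereqs() {
    cfg="${1:-config/mirror.cfg}"
    status=0
    # The mirror subcommand reads its target directory from this file.
    if [ ! -f "$cfg" ]; then
        echo "missing config file: $cfg" >&2
        status=1
    fi
    # Mirroring depends on the lftp package being installed.
    if ! command -v lftp >/dev/null 2>&1; then
        echo "lftp not found on PATH" >&2
        status=1
    fi
    return "$status"
}
```

For example, `check_mirror_prereqs && ./codonpdx.py mirror -d refseq` would start the mirror only when both checks pass.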

RefSeq

Passing -d refseq to the mirror subcommand mirrors RefSeq to the directory specified in the mirror config file.

./codonpdx.py mirror -d refseq

GenBank

Passing -d genbank to the mirror subcommand mirrors GenBank to the directory specified in the mirror config file.

./codonpdx.py mirror -d genbank

Upgrading the codon count metadata to the next release

Make sure your celery workers are running before trying this.

The queueJobs subcommand starts a job for each file in RefSeq or GenBank that needs to be processed. The celery workers process the jobs in parallel across the cores available to them. The resulting codon counts are stored in the database as each job finishes.

Start a celery worker

celery -A codonpdx worker -l info
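By default a Celery worker starts one worker process per CPU core, which is why the jobs spread across the available cores. If you want to leave some cores free, Celery's --concurrency option sets the number of worker processes explicitly. The sketch below wraps the command above; the function name start_worker and the CELERY variable are illustrative and not part of codonpdx.

```shell
# CELERY can be overridden, e.g. to point at a virtualenv's celery binary.
CELERY="${CELERY:-celery}"

# Start a worker with an explicit number of worker processes.
# Usage: start_worker [processes]
start_worker() {
    # Fall back to the number of online cores reported by getconf.
    procs="${1:-$(getconf _NPROCESSORS_ONLN)}"
    "$CELERY" -A codonpdx worker -l info --concurrency="$procs"
}
```

For example, `start_worker 2` would run the same worker limited to two processes.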

RefSeq

./codonpdx.py queueJobs -d refseq -f genbank

GenBank

./codonpdx.py queueJobs -d genbank -f genbank
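Moving to a new release usually means mirroring a database and then queueing its files. A minimal sketch that chains the two documented commands for one database follows. It assumes you want to mirror first and that your celery workers are already running; the function name upgrade_release and the CODONPDX variable are illustrative, not part of the tool. Both commands above pass -f genbank, so the sketch does the same.

```shell
# Path to the CLI; override CODONPDX if you run it from another directory.
CODONPDX="${CODONPDX:-./codonpdx.py}"

# Mirror one database, then queue its files for codon counting.
# Usage: upgrade_release refseq|genbank
upgrade_release() {
    db="$1"
    case "$db" in
        refseq|genbank) ;;
        *) echo "unknown database: $db" >&2; return 2 ;;
    esac
    # Only queue jobs if the mirror step succeeded.
    "$CODONPDX" mirror -d "$db" &&
    "$CODONPDX" queueJobs -d "$db" -f genbank
}
```

For example, `upgrade_release refseq` followed by `upgrade_release genbank` would refresh both databases in turn.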

Clearing old jobs

Clear the results table and the results directory of old data. By default, data older than a week is removed. Use the -d flag to set the cutoff in days. For example, -d 3 removes files and entries older than 3 days.

./codonpdx.py clearResults -d 3
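Before clearing, it can help to see which files fall past the cutoff. The sketch below is a hypothetical dry-run helper, not part of codonpdx. It assumes you know where your results directory is (this page does not name it), it only looks at files, not at the results table, and find's whole-day rounding may not match the exact cutoff clearResults applies.

```shell
# Hypothetical dry-run helper (not part of codonpdx).
# Usage: preview_old_results RESULTS_DIR [DAYS]
preview_old_results() {
    dir="$1"
    days="${2:-7}"
    # -mtime +N matches files whose age, counted in whole 24-hour
    # periods, exceeds N. Nothing is deleted; paths are only printed.
    find "$dir" -type f -mtime +"$days" -print
}
```

For example, `preview_old_results /path/to/results 3` lists candidate files without removing anything.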

Calculating the results for a sequence file

Step 1: Count codons of the input file

For a GenBank-format file named input_file.gb:

./codonpdx.py count -i input_file.gb -j ID -f genbank -o counts.json -s

This reads input_file.gb and writes the codon counts to counts.json. The job ID can be any value when running jobs manually from the command line, but reusing an ID overwrites the previous result set with that ID. The -s option also generates counts for a shuffled version of the sequence; omit it to skip this. For other options, pass -h.
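To make "counting codons" concrete, here is a toy sketch that tallies non-overlapping triplets in a raw nucleotide string. It is only an illustration of the idea: the real count subcommand parses GenBank records, and this page does not show its reading frame rules or the layout of counts.json. The function name count_codons is made up.

```shell
# Illustrative only: count non-overlapping triplets in a raw sequence.
# Usage: count_codons SEQUENCE
count_codons() {
    printf '%s\n' "$1" |
        tr '[:lower:]' '[:upper:]' |   # normalise case
        fold -w 3 |                    # split into triplets from position 1
        awk 'length($0) == 3 { n[$0]++ }
             END { for (c in n) print c, n[c] }' |  # drop a trailing partial codon
        sort
}
```

For example, `count_codons ATGGCCATGTA` reports ATG twice and GCC once, and ignores the trailing TA.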

Step 2: Insert count results into database

Continuing on from the previous example:

./codonpdx.py insertInput -j ID -i counts.json

Step 3: Calculate results of the comparison

./codonpdx.py calc -j ID -d refseq -w input -v input -o

The -o option outputs the results to the command line. If you omit it, the results will go into the database and can be accessed like a normal job via the web application.
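The three steps can be chained so that a failure in one step stops the rest. A minimal sketch, using only the commands shown above; the function name run_comparison and the CODONPDX variable are illustrative, and the options are copied from the examples rather than from the tool's full help output.

```shell
# Path to the CLI; override CODONPDX if you run it from another directory.
CODONPDX="${CODONPDX:-./codonpdx.py}"

# Count, insert, and compare one GenBank-format input file.
# Usage: run_comparison INPUT_FILE JOB_ID [COUNTS_FILE]
run_comparison() {
    input="$1"
    job="$2"
    counts="${3:-counts.json}"
    # Step 1: count codons (with -s, also for a shuffled sequence).
    "$CODONPDX" count -i "$input" -j "$job" -f genbank -o "$counts" -s &&
    # Step 2: insert the counts into the database under the same job ID.
    "$CODONPDX" insertInput -j "$job" -i "$counts" &&
    # Step 3: compare against RefSeq and print the results (-o).
    "$CODONPDX" calc -j "$job" -d refseq -w input -v input -o
}
```

For example, `run_comparison input_file.gb ID` reproduces the walkthrough above. Remember that reusing a job ID overwrites the earlier result set.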