Updates formatting of README
samhorsfield96 committed Aug 10, 2020
1 parent 971e7f7 commit b16dd64
Showing 1 changed file with 23 additions and 23 deletions.
46 changes: 23 additions & 23 deletions README.md
@@ -18,10 +18,10 @@ Analyse the proportion of genic read bases aligned by GraphAligner.
```python analyse_gaf_genic.py gaffile blastfile reads.fa outfile.txt```

Input/Output:
- gaffile: graphical alignment file produced by [Graphaligner](https://github.com/maickrau/GraphAligner)
- blastfile: [BLAST](https://www.sciencedirect.com/science/article/abs/pii/S0022283605803602?via%3Dihub) output file in tabular format generated from exact alignment of gene sequences from [Lees et al.](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5930550/) (aligned files are '*_dna.fa') to simulated genomes (```blastn -outfmt 6 -perc_identity 100 -qcov_hsp_perc 100```)
- reads.fa: read sequences used in alignment in FASTA format, generated by [Nanosim-H](https://github.com/karel-brinda/NanoSim-H)
- outfile.txt: output summary file
- ```gaffile```: graphical alignment file produced by [Graphaligner](https://github.com/maickrau/GraphAligner)
- ```blastfile```: [BLAST](https://www.sciencedirect.com/science/article/abs/pii/S0022283605803602?via%3Dihub) output file in tabular format generated from exact alignment of gene sequences from [Lees et al.](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5930550/) (aligned files are '*_dna.fa') to simulated genomes (```blastn -outfmt 6 -perc_identity 100 -qcov_hsp_perc 100```)
- ```reads.fa```: read sequences used in alignment in FASTA format, generated by [Nanosim-H](https://github.com/karel-brinda/NanoSim-H)
- ```outfile.txt```: output summary file

### analyse_gaf_total.py

@@ -30,8 +30,8 @@ Analyse the proportion of total read bases aligned by GraphAligner.
```python analyse_gaf_total.py gaffile outfile.txt```

Input/Output
- gaffile: graphical alignment file produced by [Graphaligner](https://github.com/maickrau/GraphAligner)
- outfile.txt: output summary file
- ```gaffile```: graphical alignment file produced by [Graphaligner](https://github.com/maickrau/GraphAligner)
- ```outfile.txt```: output summary file

### analyse_nodes.py

@@ -40,8 +40,8 @@ Analyse node length and degree from a GFA file.
```python analyse_nodes.py gfafile outpref```

Input/Output
- gfafile: GFA file produced from graph construction
- outpref: output prefix for distribution and summary files
- ```gfafile```: GFA file produced from graph construction
- ```outpref```: output prefix for distribution and summary files
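
As a rough illustration of what "node length and degree" could mean for a GFA file, the sketch below reads GFA1 segment (`S`) and link (`L`) lines. It is not the code in `analyse_nodes.py`; the function name `node_lengths_and_degrees` is hypothetical, and it assumes sequences are stored inline on the `S` lines and that degree is the number of link ends touching a node.

```python
from collections import defaultdict


def node_lengths_and_degrees(gfa_lines):
    """Return ({node: sequence length}, {node: degree}) from GFA1 lines.

    Assumption: degree counts link ends, so a self-loop adds 2 to its node.
    """
    lengths = {}
    degrees = defaultdict(int)
    for line in gfa_lines:
        fields = line.rstrip("\n").split("\t")
        if fields[0] == "S" and len(fields) >= 3:
            # S <name> <sequence> [optional tags]
            lengths[fields[1]] = len(fields[2])
        elif fields[0] == "L" and len(fields) >= 5:
            # L <from> <from_orient> <to> <to_orient> [overlap]
            degrees[fields[1]] += 1
            degrees[fields[3]] += 1
    # Report every segment, including those with no links (degree 0).
    return lengths, {node: degrees[node] for node in lengths}
```

The function takes any iterable of lines, so an open file handle can be passed directly.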

### analyse_unitig.py

@@ -50,9 +50,9 @@ Analyse unitig frequencies from a Bifrost graph.
```python analyse_unitig.py gfa_dist.txt colours.tsv output```

Input/Output
- gfa_dist.txt: Distribution file generated from analyse_nodes.py
- colours.tsv: TSV file produced from [Bifrost](https://github.com/pmelsted/bifrost) query by querying constituent unitigs against GFA itself
- output: output file of results
- ```gfa_dist.txt```: Distribution file generated from analyse_nodes.py
- ```colours.tsv```: TSV file produced from [Bifrost](https://github.com/pmelsted/bifrost) query by querying constituent unitigs against GFA itself
- ```output```: output file of results

### bifrost_unitig_freq.R

@@ -70,10 +70,10 @@ Checks presence of a called ORF in forward and reverse complements of a set of r
```check_ORF_in_ref(ref_fasta_for, ref_fasta_rev, query_fasta, outfasta)```

Input/Output
- ref_fasta_for: Multi-FASTA of reference source sequences (forward strand)
- ref_fasta_rev: Multi-FASTA of reference source sequences (reverse strand)
- query_fasta: ORF calls in FASTA format
- outfasta: output FASTA containing ORFs not present in forward or reverse sequences.
- ```ref_fasta_for```: Multi-FASTA of reference source sequences (forward strand)
- ```ref_fasta_rev```: Multi-FASTA of reference source sequences (reverse strand)
- ```query_fasta```: ORF calls in FASTA format
- ```outfasta```: output FASTA containing ORFs not present in forward or reverse sequences.

#### check_ref_in_query()

@@ -83,9 +83,9 @@ Checks presence of a known gene in longer called ORFs.
```check_ref_in_query(ref_fasta, query_fasta, outfile)```

Input/Output
- ref_fasta: Multi-FASTA of known genes
- query_fasta: ORF calls in FASTA format
- outfile: output FASTA containing known genes not found in any called ORFs
- ```ref_fasta```: Multi-FASTA of known genes
- ```query_fasta```: ORF calls in FASTA format
- ```outfile```: output FASTA containing known genes not found in any called ORFs
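
The check described for `check_ref_in_query()` can be pictured as a substring search of each known gene against the called ORFs. The sketch below is an assumption about the logic, not the repository's implementation: `genes_missing_from_orfs` is a hypothetical name, it works on in-memory dictionaries rather than FASTA files, and it only tests the strand as given.

```python
def genes_missing_from_orfs(ref_genes, orf_calls):
    """Return the known genes that are not a substring of any called ORF.

    Both arguments map a sequence ID to its nucleotide sequence.
    """
    # Upper-case once so the comparison ignores soft-masking.
    orf_seqs = [seq.upper() for seq in orf_calls.values()]
    return {
        gene_id: gene_seq
        for gene_id, gene_seq in ref_genes.items()
        if not any(gene_seq.upper() in orf for orf in orf_seqs)
    }
```

The returned dictionary corresponds to the records that would be written to `outfile`.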

### compare_gene_calls.py

@@ -94,10 +94,10 @@ Compares known genes against called ORFs by Prodigal/ggCaller in S. pneumoniae c
```python compare_gene_calls.py reference_genes gene_calls caller_type group```

Input/Output
- reference_genes: known genes in FASTA format
- gene_calls: ORF calls by [Prodigal](https://github.com/hyattpd/Prodigal) or [ggCaller](https://github.com/samhorsfield96/ggCaller) in FASTA format
- caller_type: specify which caller used (ggCaller = ggc, Prodigal = prod)
- group: CBL group used in comparison.
- ```reference_genes```: known genes in FASTA format
- ```gene_calls```: ORF calls by [Prodigal](https://github.com/hyattpd/Prodigal) or [ggCaller](https://github.com/samhorsfield96/ggCaller) in FASTA format
- ```caller_type```: specify which caller used (ggCaller = ggc, Prodigal = prod)
- ```group```: CBL group used in comparison.

### gfa_to_fasta.py

@@ -107,7 +107,7 @@ Creates FASTA of unitigs from GFA files
```gfa_to_fasta(gfafile)```

Input/Output
- gfafile: GFA file generated from graph construction
- ```gfafile```: GFA file generated from graph construction
- output is FASTA with same file prefix as gfafile
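
A minimal sketch of this kind of conversion is shown below, assuming GFA1 input with sequences stored inline on `S` lines. The name `gfa_to_fasta_sketch` and the `.fasta` output extension are assumptions made for illustration; the actual `gfa_to_fasta()` may differ.

```python
import os


def gfa_to_fasta_sketch(gfafile):
    """Write each GFA segment as a FASTA record; return the output path."""
    # Reuse the input prefix; the ".fasta" extension is an assumption.
    prefix, _ = os.path.splitext(gfafile)
    outfile = prefix + ".fasta"
    with open(gfafile) as gfa, open(outfile, "w") as fasta:
        for line in gfa:
            fields = line.rstrip("\n").split("\t")
            # Only segment lines (S <name> <sequence>) carry unitig sequences.
            if fields[0] == "S" and len(fields) >= 3:
                fasta.write(f">{fields[1]}\n{fields[2]}\n")
    return outfile
```

Header, link, and other non-segment lines are skipped, so the FASTA contains one record per unitig.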

### panaroo_gene_freq.R