
Input Parameters

bj8th edited this page Jul 22, 2021 · 3 revisions

Input Files

| Parameter | Description |
| --- | --- |
| sample_ccs | PacBio CCS reads (not required if SQANTI input is provided) |
| gencode_gtf | GENCODE comprehensive gene annotation |
| gencode_transcript_fasta | GENCODE protein-coding transcript sequences |
| gencode_translation_fasta | GENCODE protein-coding transcript translation sequences |
| genome_fasta | Genome sequence, primary assembly (GRCh38) |
| fastq_read_1 | (optional) RNA-Seq FASTQ file, read 1. Used in STAR alignment to obtain junctions for the sample |
| fastq_read_2 | (optional) RNA-Seq FASTQ file, read 2. Used in STAR alignment to obtain junctions for the sample |
| star_genome_dir | (optional) STAR genome index. The index is generated if FASTQ files are provided and star_genome_dir is not |
| primers_fasta | Primers FASTA file. Used in IsoSeq (not required if SQANTI input is provided) |
| hexamer | Hexamer model provided by CPAT |
| logit_model | Logit model provided by CPAT |
| sqanti_classification | (optional*) SQANTI classification file |
| sqanti_fasta | (optional*) SQANTI corrected FASTA file |
| sqanti_gtf | (optional*) SQANTI corrected GTF file |
| sample_kallisto_tpm | Sample Kallisto file |
| normalized_ribo_kallisto | Normalized Kallisto file |
| uniprot_protein_fasta | UniProt protein isoforms FASTA file |
| mass_spec | (optional**) Directory containing mass spec data. Data can be in `*.raw` or `*.mzML` format |
| metamorpheus_toml | (optional**) MetaMorpheus .toml file to use for mass spec. Not required to run the proteomic analysis |
| rescue_resolve_toml | (optional**) Rescue and Resolve MetaMorpheus .toml file. Required to run the proteomic analysis |

*The pipeline does not run IsoSeq or SQANTI if all SQANTI files are provided.

** If mass spec data is not provided, the proteomic analysis is skipped.
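
The conditional rules in the table and footnotes above can be summarized in a short sketch. This is illustrative Python, not code from the pipeline: the `Inputs` class and the `plan_run` function are hypothetical names, and treating a single FASTQ file as "FASTQ files provided" is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Inputs:
    """Hypothetical container for the inputs that affect which steps run."""
    sample_ccs: Optional[str] = None
    primers_fasta: Optional[str] = None
    fastq_read_1: Optional[str] = None
    fastq_read_2: Optional[str] = None
    star_genome_dir: Optional[str] = None
    sqanti_classification: Optional[str] = None
    sqanti_fasta: Optional[str] = None
    sqanti_gtf: Optional[str] = None
    mass_spec: Optional[str] = None
    rescue_resolve_toml: Optional[str] = None


def plan_run(inputs: Inputs) -> dict:
    """Return which stages would run and which required inputs are missing."""
    # IsoSeq and SQANTI are skipped only when all three SQANTI files are given.
    sqanti_provided = all(
        [inputs.sqanti_classification, inputs.sqanti_fasta, inputs.sqanti_gtf]
    )
    # Assumption: either FASTQ file counts as "FASTQ files provided".
    has_fastq = bool(inputs.fastq_read_1 or inputs.fastq_read_2)

    plan = {
        "run_isoseq_and_sqanti": not sqanti_provided,
        "build_star_index": has_fastq and not inputs.star_genome_dir,
        "run_proteomics": bool(inputs.mass_spec),
    }

    missing = []
    if not sqanti_provided:
        # sample_ccs and primers_fasta are only optional with SQANTI input.
        for name in ("sample_ccs", "primers_fasta"):
            if not getattr(inputs, name):
                missing.append(name)
    if inputs.mass_spec and not inputs.rescue_resolve_toml:
        # rescue_resolve_toml is required to run the proteomic analysis.
        missing.append("rescue_resolve_toml")
    plan["missing"] = missing
    return plan
```

For example, calling `plan_run` with only the three SQANTI files set reports that IsoSeq, SQANTI, and the proteomic analysis are all skipped and that no required input is missing.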

Input Parameters

| Parameter | Description | Default Value |
| --- | --- | --- |
| name | Name of the pipeline run | false |
| outdir | Directory where results are stored | ./results |
| max_cpus | Maximum number of CPUs a process can use | 8 |
| coding_score_cutoff | CPAT coding score cutoff. ORFs with a coding score below the cutoff are removed | 0.0 |
| min_junctions_after_stop_codon | Minimum number of junctions an ORF can have after the stop codon to not be filtered. Only applies if the protein classification is not pFSM or pNIC | 2 |
| lower_cpm | Lower CPM bound for filtering the high-confidence space | 3 |
| lower_kb | Lower bound on gene nucleotide length (kb) for filtering the high-confidence space | 1 |
| upper_kb | Upper bound on gene nucleotide length (kb) for filtering the high-confidence space | 4 |
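
To make the threshold parameters concrete, here is a minimal Python sketch of how `coding_score_cutoff`, `lower_cpm`, `lower_kb`, and `upper_kb` might be applied with their default values. It is not the pipeline's implementation: the function names are hypothetical, and the inclusive comparisons and the conversion of gene length from nucleotides to kilobases are assumptions.

```python
# Default values taken from the table above.
DEFAULTS = {
    "coding_score_cutoff": 0.0,
    "lower_cpm": 3,
    "lower_kb": 1,
    "upper_kb": 4,
}


def passes_coding_score(coding_score, params=DEFAULTS):
    """Keep an ORF unless its CPAT coding score is below the cutoff."""
    return coding_score >= params["coding_score_cutoff"]


def in_high_confidence_space(cpm, gene_length_nt, params=DEFAULTS):
    """Check a gene against the CPM and length bounds.

    Assumptions: bounds are inclusive, and gene length is supplied in
    nucleotides and compared in kilobases.
    """
    length_kb = gene_length_nt / 1000
    return (
        cpm >= params["lower_cpm"]
        and params["lower_kb"] <= length_kb <= params["upper_kb"]
    )
```

With the defaults, a 2.5 kb gene at 5 CPM falls inside the high-confidence space, while the same gene at 2 CPM, or a 4.5 kb gene at 5 CPM, falls outside it.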