Example command lines
- Download pre-formatted UniprotSprot from Pre‐built Database Indexes and unpack.
- Select your query file or take this example.
- Run
bin/lambda3 -q /path/to/CAMI_plant_associated_sample0_10Mb.fasta -d /path/to/uniprot_sprot_20230713.lba.gz
You will see something like this:
LAMBDA - the Local Aligner for Massive Biological DatA
======================================================
Version 3.0.0
Reading index properties... done.
Detecting query alphabet... dna5 detected.
Checking memory requirements... met.
Loading Database Index... done.
Loading Query Sequences... done.
Searching and extending hits on-line...progress:
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
|....:....:....:....:....:....:....:....:....:....|
Number of valid hits: 120549
Number of Queries with at least one valid hit: 16106
Since you did not specify it, the default output file name and format were used: output.m8.
Browse the output file with your editor of choice, or with `less` on the command line:
% less output.m8
S0R0/2 sp|G3XD97|PTXS_PSEAE 74.47 47 12 0 141 1 264 310 7e-14 75.9
S0R0/2 sp|Q88HH7|PTXS_PSEPK 78.57 42 9 0 126 1 269 310 5e-12 69.7
S0R0/2 sp|A0A167V873|PTXS_PSEDL 70.73 41 12 0 123 1 270 310 6e-11 66.2
S0R12844/1 sp|I6YDK7|ACCD1_MYCTU 79.07 43 9 0 21 149 78 120 3e-13 73.9
S0R1/1 sp|A1VMB2|KHSE_POLNA 57.14 49 14 1 3 149 190 231 2e-10 64.3
S0R1/1 sp|Q82UL3|KHSE_NITEU 51.02 49 17 2 3 149 186 227 1e-05 48.9
S0R1/1 sp|Q2YBJ8|KHSE_NITMU 46.94 49 19 1 3 149 186 227 2e-05 48.1
S0R1/1 sp|Q0AHY7|KHSE_NITEC 46.00 50 18 2 3 149 186 227 6e-05 46.2
S0R1/1 sp|Q9RAM6|KHSE_METFK 42.86 49 21 2 3 149 186 227 4e-04 43.5
S0R1/1 sp|O32378|KHSE_METGL 40.82 49 22 2 3 149 186 227 4e-04 43.5
S0R1/1 sp|Q9JWE5|KHSE_NEIMA 34.69 49 25 1 3 149 186 227 0.002 41.2
S0R1/1 sp|Q4W557|KHSE_NEIMB 34.69 49 25 1 3 149 186 227 0.002 41.2
[...]
NOTE: Because Lambda uses multiple threads by default, the output is not guaranteed to be in the same order as shown above. However, the matches of one query sequence always appear en bloc and sorted by E-value.
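The rows above have twelve columns, which match the usual BLAST tabular layout (query ID, subject ID, percent identity, alignment length, mismatches, gap openings, query start and end, subject start and end, E-value, bit score). The following is a minimal Python sketch for reading such a file, assuming the fields are tab-separated and appear in that order. The names `M8_COLUMNS`, `parse_m8_line` and `summarize` are illustrative only and are not part of Lambda.

```python
from collections import Counter

# Column names of the 12-field BLAST tabular layout (assumed for output.m8).
M8_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]
INT_COLUMNS = {"length", "mismatch", "gapopen", "qstart", "qend", "sstart", "send"}
FLOAT_COLUMNS = {"pident", "evalue", "bitscore"}


def parse_m8_line(line):
    """Turn one tab-separated line into a dict keyed by column name."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(M8_COLUMNS):
        raise ValueError(f"expected {len(M8_COLUMNS)} columns, got {len(fields)}")
    record = dict(zip(M8_COLUMNS, fields))
    for name in INT_COLUMNS:
        record[name] = int(record[name])
    for name in FLOAT_COLUMNS:
        record[name] = float(record[name])
    return record


def summarize(lines):
    """Return (number of hits, number of queries with at least one hit)."""
    hits_per_query = Counter(
        parse_m8_line(line)["qseqid"] for line in lines if line.strip()
    )
    return sum(hits_per_query.values()), len(hits_per_query)


# Three records copied from the listing above, re-joined with tabs.
sample = [
    "\t".join(row.split())
    for row in (
        "S0R0/2 sp|G3XD97|PTXS_PSEAE 74.47 47 12 0 141 1 264 310 7e-14 75.9",
        "S0R0/2 sp|Q88HH7|PTXS_PSEPK 78.57 42 9 0 126 1 269 310 5e-12 69.7",
        "S0R1/1 sp|A1VMB2|KHSE_POLNA 57.14 49 14 1 3 149 190 231 2e-10 64.3",
    )
]

n_hits, n_queries = summarize(sample)
print(f"{n_hits} hits across {n_queries} queries")
```

Passing an open file handle for output.m8 to `summarize` instead of `sample` should give totals comparable to the two counts Lambda prints at the end of a run.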
Follow the instructions above, but choose `.sam` format as output. Also use an E-value cutoff of 1e-4.
How would the command line look?
`bin/lambda3 -q /path/to/CAMI_plant_associated_sample0_10Mb.fasta -d /path/to/uniprot_sprot_20230713.lba.gz -o output.sam -e 1e-4`
The program will now print:
LAMBDA - the Local Aligner for Massive Biological DatA
======================================================
Version 3.0.0
Reading index properties... done.
Detecting query alphabet... dna5 detected.
Checking memory requirements... met.
Loading Database Index... done.
Loading Query Sequences... done.
Searching and extending hits on-line...progress:
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
|....:....:....:....:....:....:....:....:....:....|
Number of valid hits: 107983
Number of Queries with at least one valid hit: 14349
As you can see, the more stringent cutoff has reduced the number of valid hits from 120549 to 107983, and the number of queries with at least one valid hit from 16106 to 14349.
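To get a feeling for which hits such a cutoff removes, one can post-filter the tabular output of the first run. This is only a sketch under assumptions: it expects tab-separated lines with the E-value in the eleventh column, it treats the cutoff as inclusive, and `filter_by_evalue` is a hypothetical helper, not a Lambda feature. Filtering after the fact is not guaranteed to reproduce exactly the counts of a rerun with `-e 1e-4`.

```python
def filter_by_evalue(lines, max_evalue=1e-4):
    """Keep tabular lines whose E-value (11th column) is <= max_evalue."""
    kept = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # skip blank or malformed lines
        if float(fields[10]) <= max_evalue:
            kept.append(line)
    return kept


# Four S0R1/1 records copied from the first run's output, re-joined with tabs.
sample = [
    "\t".join(row.split())
    for row in (
        "S0R1/1 sp|Q82UL3|KHSE_NITEU 51.02 49 17 2 3 149 186 227 1e-05 48.9",
        "S0R1/1 sp|Q0AHY7|KHSE_NITEC 46.00 50 18 2 3 149 186 227 6e-05 46.2",
        "S0R1/1 sp|Q9RAM6|KHSE_METFK 42.86 49 21 2 3 149 186 227 4e-04 43.5",
        "S0R1/1 sp|Q9JWE5|KHSE_NEIMA 34.69 49 25 1 3 149 186 227 0.002 41.2",
    )
]

for line in filter_by_evalue(sample):
    print(line.split("\t")[1])
```

With these four records only the KHSE_NITEU and KHSE_NITEC hits pass the filter.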
View the output again to verify that it is beautiful SAM:
@HD VN:1.4 GO:query
@PG ID:lambda PN:lambda VN:3.0.0 CL:searchp -i uniprot_sprot_20230713.lba.gz -q CAMI_plant_associated_sample0_10Mb.fasta -o output.sam -e 1e-4
@CO Lambda is a high performance BLAST compatible local aligner, please see http://seqan.de/lambda for more information.
@CO SAM/BAM dialect documentation is available here: https://github.com/seqan/lambda/wiki/Output-Formats
@CO If you use any results found by Lambda, please cite Hauswedell et al. (2014) doi: 10.1093/bioinformatics/btu439
@CO Optional tags as follow AS:bit score NM:edit distance (in protein space unless BLASTN) ae:expect value ai:% identity (in protein space unless BLASTN) qf:query frame
S0R0/2 16 sp|G3XD97|PTXS_PSEAE 264 255 141M9H * 0 0 CGTGAACCCAAGTGCAATCTGTTCGATGACGTGGGCTTGATCGCCCTCGATGACCTGGATTGGTACCCGTTGGTGGGCAGCGGCATTACCGCTCTCGCGCAGCCGACCACCGAGATGGGCGCCAGTGCATTTGAGTGTCTG * ae:f:7.45803e-14 AS:i:75 ai:i:74 qf:i:-1 NM:i:12
S0R0/2 272 sp|Q88HH7|PTXS_PSEPK 269 255 126M24H * 0 0 AATCTGTTCGATGACGTGGGCTTGATCGCCCTCGATGACCTGGATTGGTACCCGTTGGTGGGCAGCGGCATTACCGCTCTCGCGCAGCCGACCACCGAGATGGGCGCCAGTGCATTTGAGTGTCTG * ae:f:5.34478e-12 AS:i:69 ai:i:78 qf:i:-1 NM:i:9
S0R0/2 272 sp|A0A167V873|PTXS_PSEDL 270 255 123M27H * 0 0 CTGTTCGATGACGTGGGCTTGATCGCCCTCGATGACCTGGATTGGTACCCGTTGGTGGGCAGCGGCATTACCGCTCTCGCGCAGCCGACCACCGAGATGGGCGCCAGTGCATTTGAGTGTCTG * ae:f:5.90935e-11 AS:i:66 ai:i:70 qf:i:-1 NM:i:12
S0R1/1 0 sp|A1VMB2|KHSE_POLNA 190 255 2H48M21I78M1H * 0 0 GTGCATGCCGACATGTTCCGCGACAACGTGATGTTCGCCACCGGTGAAGACGCCGGCGCAGCGCCGCGCCTCACCGGCGTTTTCGACTTCTATTTCGCGGGCACCGACACATGGCTGTTCGACCTGGCTGTGTGCCTGTACCACTGG * ae:f:2.24555e-10 AS:i:64 ai:i:57 qf:i:3 NM:i:21
S0R1/1 256 sp|Q82UL3|KHSE_NITEU 186 255 2H36M3I18M18I72M1H * 0 0 * * ae:f:9.7631e-06 AS:i:48 ai:i:51 qf:i:3 NM:i:24
S0R1/1 256 sp|Q2YBJ8|KHSE_NITMU 186 255 2H45M21I81M1H * 0 0 * * ae:f:1.66533e-05 AS:i:48 ai:i:46 qf:i:3 NM:i:26
S0R1/1 256 sp|Q0AHY7|KHSE_NITEC 186 255 2H36M3D9M24I78M1H * 0 0 * * ae:f:6.32826e-05 AS:i:46 ai:i:46 qf:i:3 NM:i:27
S0R3/1 16 sp|A0QX20|ACNA_MYCS2 181 255 1H144M5H * 0 0 GGCATCGTACACCAGGTCAACCTGGAATACCTGGCGCGCGGCGTGCACCGGAAGGACGGCGTCTACTACCCTGACCCGCTGGTCGGCACCGAATCGCACACCACCATGATCAACGGCATCGGCGTGGTCGGCTGGGGCGTCGGC * ae:f:1.35917e-15 AS:i:81 ai:i:75 qf:i:-3 NM:i:12
S0R3/1 272 sp|O53166|ACNA_MYCTU 176 255 1H144M5H * 0 0 * * ae:f:5.16485e-15 AS:i:79 ai:i:72 qf:i:-3 NM:i:13
S0R3/1 272 sp|Q92G90|ACNA_RICCN 175 255 1H144M5H * 0 0 * * ae:f:5.16485e-15 AS:i:79 ai:i:72 qf:i:-3 NM:i:13
S0R3/1 272 sp|Q4UK20|ACNA_RICFE 175 255 1H144M5H * 0 0 * * ae:f:5.16485e-15 AS:i:79 ai:i:72 qf:i:-3 NM:i:13
S0R3/1 272 sp|Q9RTN7|ACNA_DEIRA 179 255 1H84M3D6M9D54M5H * 0 0 * * ae:f:6.7455e-15 AS:i:79 ai:i:75 qf:i:-3 NM:i:13
[...]
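The `@CO` header line above lists the optional tags Lambda writes (`AS`, `NM`, `ae`, `ai`, `qf`). As a rough sketch of how one record could be read in Python, assuming tab-separated SAM fields and the standard SAM flag bits (0x10 for reverse strand, 0x100 for secondary alignment), the helper below extracts the tags into a dictionary. The name `parse_sam_record` and the keys of the returned dictionary are illustrative choices, not part of Lambda.

```python
SAM_FLAG_REVERSE = 0x10     # 16: sequence is reverse complemented
SAM_FLAG_SECONDARY = 0x100  # 256: secondary alignment


def parse_sam_record(line):
    """Parse one SAM alignment line into a dict with typed optional tags."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        raise ValueError("a SAM record needs at least 11 mandatory fields")
    flag = int(fields[1])
    tags = {}
    for item in fields[11:]:
        tag, value_type, value = item.split(":", 2)
        if value_type == "i":
            value = int(value)
        elif value_type == "f":
            value = float(value)
        tags[tag] = value
    return {
        "query": fields[0],
        "subject": fields[2],
        "subject_start": int(fields[3]),
        "cigar": fields[5],
        "reverse": bool(flag & SAM_FLAG_REVERSE),
        "secondary": bool(flag & SAM_FLAG_SECONDARY),
        "tags": tags,
    }


# One record copied from the listing above, re-joined with tabs.
sample_line = "\t".join(
    "S0R3/1 272 sp|O53166|ACNA_MYCTU 176 255 1H144M5H * 0 0 * * "
    "ae:f:5.16485e-15 AS:i:79 ai:i:72 qf:i:-3 NM:i:13".split()
)

record = parse_sam_record(sample_line)
print(record["subject"], record["reverse"], record["secondary"], record["tags"]["qf"])
```

In the sample records, flag 16 appears together with a negative query frame (`qf`), and flag 256 marks every hit of a query after the first one.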
For more information on the selection of output formats and more fine-grained options, see the article.
If anything is unclear, don't hesitate to contact me.