OLego -- short or long RNA-seq read mapping to discover exon junctions

Jie Wu (wuj@cshl.edu), Chaolin Zhang (cz2294@columbia.edu)

Please find the most recent documentation at http://zhanglab.c2b2.columbia.edu/index.php/OLego_Documentation

What is OLego?
======================

OLego is a program specifically designed for de novo spliced mapping of mRNA-seq reads. It adopts a seed-and-extend scheme and does not rely on a separate external mapper. It achieves high sensitivity in junction detection through strategic searches with very small seeds (12-14 nt), which are mapped efficiently using the Burrows-Wheeler transform (BWT) and FM-index. This also makes it particularly sensitive for discovering small exons. It is implemented in C++ with full multithreading support, allowing fast processing of large-scale data.

OLego is an open-source project released under GPLv3. The implementation of OLego relies heavily on BWA (version 0.5.9rc1, http://bio-bwa.sourceforge.net/).
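
To make the seed lookup step more concrete, here is a minimal Python sketch of exact seed matching with a BWT-based FM-index (backward search). It is a toy illustration of the general technique only, not OLego's or BWA's actual C/C++ code: the function names `build_fm_index` and `find_seed` are hypothetical, the suffix array is built by naive sorting, and the occurrence table is stored uncompressed, which would not scale to a real genome.

```python
from collections import Counter


def build_fm_index(text):
    """Build a toy FM-index for `text` (naive suffix sorting, small inputs only)."""
    text += "$"  # unique terminator, sorts before A/C/G/T
    # Suffix array: start positions of all suffixes in lexicographic order.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT: the character preceding each sorted suffix (wraps around to "$").
    bwt = [text[i - 1] for i in sa]
    # C[c]: number of characters in text that are strictly smaller than c.
    counts = Counter(text)
    C, total = {}, 0
    for c in sorted(counts):
        C[c] = total
        total += counts[c]
    # occ[c][i]: number of occurrences of c in bwt[:i].
    occ = {c: [0] * (len(bwt) + 1) for c in counts}
    for i, ch in enumerate(bwt):
        for c in occ:
            occ[c][i + 1] = occ[c][i] + (ch == c)
    return sa, C, occ


def find_seed(seed, sa, C, occ):
    """Return sorted 0-based positions where `seed` occurs exactly."""
    lo, hi = 0, len(sa)  # half-open suffix-array interval [lo, hi)
    for ch in reversed(seed):  # backward search: last character first
        if ch not in C:
            return []
        lo = C[ch] + occ[ch][lo]
        hi = C[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[lo:hi])


genome = "GATTACAGATTACA"  # made-up reference for illustration
sa, C, occ = build_fm_index(genome)
print(find_seed("ATTA", sa, C, occ))  # [1, 8]
print(find_seed("CCC", sa, C, occ))   # []
```

Each character of the seed narrows the suffix-array interval with two table lookups, so the cost of finding all exact hits of a seed grows with the seed length rather than with the size of the reference. In a seed-and-extend mapper, hits like these would then be extended and connected into full alignments.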

Citation
======================
Wu,J., Anczukow,O., Krainer,A.R., Zhang,M.Q., Zhang,C., 2013. OLego: Fast and sensitive mapping of spliced mRNA-Seq reads using small seeds. Nucleic Acids Res., in press.

Versions
======================
v1.1.5 ( 7-14-2014 )
---------------------
* Bugs fixed
* Optimization for speed

v1.1.2 ( 7-1-2013 )
---------------------
* Improved sensitivity for small exons and single-anchor search by allowing mismatches.
* Allows overlapping seeds to improve speed and seeding flexibility.
* Increased the default seed size to 15 nt (with at most 1 nt of overlap) to balance sensitivity and speed.
* Fixed a bug that caused crashes when using -M 0.
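
As an illustration of why allowing a small overlap adds seeding flexibility, the following Python sketch tiles a read with fixed-size seeds that may overlap by a bounded amount. This is an assumption-laden sketch, not OLego's actual seeding logic: the function name `seed_starts` and the greedy placement strategy are hypothetical, and only the seed size (15) and maximum overlap (1 nt) are taken from the notes above.

```python
def seed_starts(read_len, seed_len=15, max_overlap=1):
    """Return start positions of seeds tiling a read from its 5' end.

    Hypothetical layout: fit as many seeds as possible, letting adjacent
    seeds overlap by at most `max_overlap` nt only where needed.
    """
    if not 0 <= max_overlap < seed_len:
        raise ValueError("max_overlap must satisfy 0 <= max_overlap < seed_len")
    if read_len < seed_len:
        return []
    # Largest n with n * seed_len - (n - 1) * max_overlap <= read_len.
    n = (read_len - max_overlap) // (seed_len - max_overlap)
    # Total overlap that must be spent to squeeze n seeds into the read.
    spare = max(0, n * seed_len - read_len)
    starts, pos = [], 0
    for _ in range(n):
        starts.append(pos)
        overlap = min(max_overlap, spare)  # spend overlap only while required
        spare -= overlap
        pos += seed_len - overlap
    return starts


print(seed_starts(100, max_overlap=0))  # [0, 15, 30, 45, 60, 75]
print(seed_starts(100, max_overlap=1))  # [0, 14, 28, 42, 56, 70, 85]
```

Under these assumptions, a 100 nt read holds six non-overlapping 15 nt seeds with 10 nt left uncovered, whereas permitting 1 nt of overlap fits a seventh seed and covers the read end to end.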

v1.1.1 ( 4-14-2013 )
---------------------
* Improved speed by filtering simple repetitive anchors.
* Default options optimized. 

v1.1.0 ( 3-31-2013 )
---------------------
* Bug fixed for duplicate entries for some reads, sensitivity improved.
* Optimized on option -W.
* Bug fixed in sam2bed.pl.
* Bug fixed for option -e.
* Bug fixed for regression_model_gen.
* Added support for gzipped input files in sam2bed.pl.

v1.0.8 ( 11-20-2012 )
---------------------
* Improvement in hit clustering.
* Fixed an overcounting problem in mismatch counting.
* Fixed bug in merging step. 
* Fixed bug in XS tag for extra exon body reads.
* Allows pipe input/output with "-" for some of the scripts. 

v1.0.6 ( 08-09-2012 )
---------------------

* Added option --max-multi (default:1000) to avoid huge data in a single line.
* Added option --num-reads-batch.
* Fixed a bug in the junction connecting step.

v1.0.5 ( 07-16-2012 )
---------------------

* Minor bug fixed (the old code crashes in a very rare case).

v1.0.4 ( 06-12-2012 )
---------------------

* Option changes (single-anchor search is now performed by default).

v1.0.3 ( 06-10-2012 )
---------------------

* Now supports strand-specific libraries.
* Fixed bugs related to the XS tag.

v1.0.0 ( 05-15-2012 )
----------------------

* Initial public release.