I no longer maintain this repo. If you want to fork this project, use the Alm Lab's version.
dbotu3 is a new implementation of Sarah Preheim's dbOTU algorithm. The scope is narrower, the numerical comparisons are faster, and the interface is more user-friendly.
Read the documentation for:
- a guide to getting started,
- an explanation of the algorithm, and
- the API reference.
You can also read our new paper for more technical details about the algorithm. The Alm Lab website also has a short page with information.
dbotu3 is on PyPI and can be installed with pip:
pip install dbotu
dbotu3 is also on conda and can be installed as follows:
conda install -c cduvallet -c conda-forge dbotu
For QIIME 2 users, dbotu3 is also available as a plugin.
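After installing by either route, a quick way to confirm that the package is visible to your Python interpreter can be sketched as follows. This assumes the importable module is named `dbotu`, matching the package name in the install commands; the helper `is_installed` is hypothetical and is not part of dbotu3.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module with this name can be found."""
    # find_spec returns None when no such top-level module exists.
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Assumption: the package installs a module named "dbotu".
    print("dbotu importable:", is_installed("dbotu"))
```

If this reports `False`, the usual cause is that pip or conda installed into a different environment than the one running the script.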
dbotu3 requires the following Python packages:
- NumPy, SciPy, Biopython, pandas
- Levenshtein
Version history:
- 1.1: Fixed a bug in which sequence IDs that could be read as integers were not found in the table
- 1.2: Python 2 compatibility, tox test framework, warnings for improperly formatted sequence count tables
- 1.2.1: Added setup requirements
- 1.3.0: Improved OTU file header. Split the log file into a debug and progress log.
- 1.4.0: Made an improvement to the Levenshtein-based genetic dissimilarity metric.
- 1.4.1: Account for pandas API change to MultiIndex
- 1.5.0: Added the restart and rep seq scripts
- 1.5.1: New function for QIIME 2 compatibility
To do:
- Testing for the restart scripts
- Better coverage for unit tests
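The 1.1 entry in the version history above concerns sequence IDs that can be read as integers. The sketch below shows one way such a mismatch can arise when a count table is loaded with pandas. It illustrates the general pitfall and is not dbotu3's actual code; the table contents, the `seq_id` column name, and the `load_table` helper are all made up for this example.

```python
import io

import pandas as pd

# Hypothetical count table whose sequence IDs look like integers.
TABLE = "seq_id\tsample1\tsample2\n101\t5\t0\n102\t3\t7\n"


def load_table(text, ids_as_strings):
    """Read a tab-separated count table and index it by sequence ID."""
    # Without an explicit dtype, pandas infers integers for IDs like "101".
    dtype = {"seq_id": str} if ids_as_strings else None
    table = pd.read_csv(io.StringIO(text), sep="\t", dtype=dtype)
    return table.set_index("seq_id")


inferred = load_table(TABLE, ids_as_strings=False)
as_text = load_table(TABLE, ids_as_strings=True)

# IDs taken from a FASTA file are strings, so they miss an integer index.
print("101" in inferred.index)
print("101" in as_text.index)
```

Reading the ID column as text keeps the table's index comparable with string IDs taken from a sequence file.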
If you use dbOTU3 in a scientific paper, we ask that you cite the original dbOTU publication (Preheim et al.) or the dbOTU3 publication:
Preheim et al. Distribution-Based Clustering: Using Ecology To Refine the Operational Taxonomic Unit. Appl Environ Microbiol (2013) doi:10.1128/AEM.00342-13.
Olesen SW, Duvallet C, and Alm EJ. dbOTU3: A new implementation of distribution-based OTU calling. PLoS ONE (2017) doi:10.1371/journal.pone.0176335.
If you find a bug or have a request for a new feature, open an issue.
Scott Olesen / swo at alum.mit.edu