2. Introduction to DBC
-Distribution-based clustering (DBC) is a way of organizing sequence data that maximizes the useful information in the data and reduces redundancy.
-It can be applied to any dataset with multiple samples, and is most useful for analyzing data across samples in which the abundance of organisms changes fairly dramatically.
-It can be used to conservatively identify the true sequences in a sample or to conservatively estimate the distinct populations present.
-The original implementation of DBC was slow because it relied on an inelegant interface to R, which performs the statistical test.
-The new Python implementation calls R through rpy2, a stable and elegant interface between Python and R.
-This has increased the speed of the algorithm without any loss of accuracy.
-This is the most current version on GitHub.
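
To make the idea of comparing sequences by their distribution across samples more concrete, here is a minimal sketch in plain Python. It is not the DBC code itself. This page does not name the statistical test, so the Pearson chi-squared statistic used here is an assumption, and the function name `chi_squared_statistic` and the example counts are hypothetical.

```python
def chi_squared_statistic(counts_a, counts_b):
    """Pearson chi-squared statistic for a 2 x N table of per-sample counts.

    counts_a and counts_b hold the abundance of two sequences in the same
    N samples. The choice of test is an assumption made for illustration.
    """
    if len(counts_a) != len(counts_b):
        raise ValueError("both sequences need one count per sample")
    total_a = sum(counts_a)
    total_b = sum(counts_b)
    if total_a == 0 or total_b == 0:
        raise ValueError("each sequence must be observed at least once")
    grand_total = total_a + total_b

    statistic = 0.0
    for a, b in zip(counts_a, counts_b):
        sample_total = a + b
        if sample_total == 0:
            continue  # neither sequence occurs in this sample
        # Counts expected if both sequences shared one distribution.
        expected_a = total_a * sample_total / grand_total
        expected_b = total_b * sample_total / grand_total
        statistic += (a - expected_a) ** 2 / expected_a
        statistic += (b - expected_b) ** 2 / expected_b
    return statistic


# Hypothetical counts of three sequences across four samples.
seq_x = [100, 50, 10, 0]
seq_y = [10, 5, 1, 0]    # same distribution as seq_x, lower abundance
seq_z = [0, 5, 50, 100]  # opposite distribution to seq_x

same_shape = chi_squared_statistic(seq_x, seq_y)
different_shape = chi_squared_statistic(seq_x, seq_z)
```

In this sketch `seq_y` follows the same distribution as `seq_x` at a tenth of the abundance, so its statistic is zero, while `seq_z` peaks in different samples and yields a large statistic. Turning the statistic into a decision would still require a p-value from a statistics library or from R.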