- Meeting abstract
- Open access
Subgroup and outlier detection analysis
BMC Bioinformatics volume 14, Article number: A2 (2013)
Background
High-dimensional biological data presents the opportunity to discover novel forms of biological heterogeneity, such as overexpression or suppression of a particular gene in a subset of a cohort. Such heterogeneity appears in the data as outliers or distinct subgroups. Here, we describe and evaluate three procedures for subgroup and outlier detection analysis (SODA): a leave-one-out (LOO) procedure that is widely used for outlier detection in the bioinformatics literature, and the least median of squares (LMS) procedure and the dip test (DT), both from the statistics literature. We also propose and evaluate the max spacing test (MST) as a novel SODA method.
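The abstract does not define the MST statistic or the exact LOO procedure, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes that a max spacing statistic is the largest gap between consecutive sorted values divided by the range, that significance is judged by Monte Carlo simulation under a normal null, and that LOO scoring compares each sample with the mean and standard deviation of the remaining samples. The function names `max_spacing`, `max_spacing_pvalue`, and `loo_z_scores` are hypothetical and are not taken from the authors' work.

```python
import random
import statistics


def max_spacing(values):
    """Largest gap between consecutive sorted values, scaled by the range.

    Assumed form of the statistic; the abstract does not define it.
    Returns (statistic, midpoint of the largest gap).
    """
    xs = sorted(values)
    span = xs[-1] - xs[0]
    if span == 0:
        return 0.0, None
    # Pair each gap with its midpoint, which is a candidate split point.
    gaps = [(b - a, (a + b) / 2) for a, b in zip(xs, xs[1:])]
    gap, midpoint = max(gaps)
    return gap / span, midpoint


def max_spacing_pvalue(values, n_sim=2000, seed=0):
    """Monte Carlo p-value under a normal null (an assumption of this sketch).

    The statistic is location- and scale-invariant, so standard normal
    draws are enough to simulate the null distribution.
    """
    rng = random.Random(seed)
    observed, _ = max_spacing(values)
    n = len(values)
    hits = 0
    for _ in range(n_sim):
        sim = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if max_spacing(sim)[0] >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_sim + 1)


def loo_z_scores(values):
    """Score each value against the mean and SD of all the other values.

    Needs at least three values, and the remaining values must not all
    be identical (otherwise the standard deviation is zero).
    """
    values = list(values)
    scores = []
    for i, x in enumerate(values):
        rest = values[:i] + values[i + 1:]
        scores.append((x - statistics.mean(rest)) / statistics.stdev(rest))
    return scores
```

For one gene, `values` would hold that gene's expression across the cohort. A large statistic with a small simulated p-value suggests a gap separating an outlier or a subgroup from the rest, and the returned midpoint indicates where the split falls.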
Results
In simulation studies, we found that LMS, DT, and MST are each the best method in specific settings. In an example analysis, we found that LMS and MST effectively identified confirmed fusion genes as outliers, and that DT and MST effectively identified genes that distinguish between two confirmed subtypes of pediatric acute megakaryoblastic leukemia. We conclude that LMS, DT, and MST are robust and complementary methods for SODA.
Acknowledgements
We gratefully acknowledge funding from ALSAC, which raises funds for St. Jude.
Rights and permissions
Open Access This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (https://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
About this article
Cite this article
Wu, G., Pawlikowska, I., Gruber, T. et al. Subgroup and outlier detection analysis. BMC Bioinformatics 14 (Suppl 17), A2 (2013). https://doi.org/10.1186/1471-2105-14-S17-A2
DOI: https://doi.org/10.1186/1471-2105-14-S17-A2