American Statistical Association
Recent advances in low-cost metagenomic and amplicon sequencing techniques enable routine sampling of environmental and host-associated microbial communities across different habitats. The data produced by these large-scale surveys typically comprise relative abundances (or compositions) of microbial taxa at different taxonomic levels. To investigate the dependency of additional covariate measurements, such as metabolites or host phenotypes, on the microbial compositions, we introduce a general robust regression framework for compositional data. We propose a novel log-contrast regression model with mean shift parameters that allows the identification of sample outliers and maintains sub-compositional coherence with respect to the associated phylogenetic tree. The model is estimated using a sparse penalized regression approach that simultaneously enforces sparsity in the mean shift and covariate parameters. We demonstrate the superiority of our approach using a wide range of synthetic simulation scenarios and infer novel associations between body mass index measurements and human gut microbes on a large public collection of human gut microbiome data.
Date: | Thursday, January 24, 2019 |
---|---|
Time: | 11:45 A.M. - 12:45 P.M. |
Location: | Mailman School of Public Health, Department of Biostatistics, 722 West 168th Street, AR Building, 8th Floor Auditorium, New York, New York |