American Statistical Association
"Mixed data" comprising a large number of heterogeneous variables (e.g., count, binary, continuous, and skewed continuous) is prevalent in areas as varied as national security, multi-sensor recordings, Internet advertising, and our particular motivation: high-throughput integrative genomics. There have been limited efforts to model such mixed data jointly. In this talk, we address this gap by introducing several new classes of Markov Random Fields (MRFs), or graphical models, that yield joint densities which directly parameterize dependencies over mixed variables. To begin, we present a novel class of MRFs that arises when all node-conditional distributions follow univariate exponential family distributions; this class yields, for instance, novel Poisson graphical models. Next, we present several new classes of mixed MRF distributions built by assuming that each node-conditional distribution follows a potentially different exponential family distribution. Fitting these models and using them to select the mixed graph in high-dimensional settings can be achieved via penalized conditional likelihood estimation, which comes with strong statistical guarantees under certain assumptions. These assumptions are often violated, however, in our motivating application of integrative genomics; to address this, we develop a new adaptive estimation strategy that improves performance on mixed data with complex dependencies. We use our methods to find epigenetic markers that affect gene regulation in ovarian cancer.
Date: Wednesday, June 27, 2018
Time: 4:00 - 5:00 P.M.
Memorial Sloan Kettering Cancer Center
Department of Epidemiology and Biostatistics
485 Lexington Avenue
(Between 46th & 47th Streets)
2nd Floor, Conference Room B
New York, New York
Outside visitors: please email email@example.com for building access.
You must be on the security list to enter the floor.