American Statistical Association
New York City
Metropolitan Area Chapter

Mailman School of Public Health
Columbia University
Department of Biostatistics Colloquium



STATISTICAL CHALLENGES WITH NEXT-GENERATION SEQUENCE DATA

by

Prof. Terry Speed
Department of Statistics
University of California-Berkeley


Abstract

Since their first appearance just over a decade ago, microarrays have become the assays of choice for high-throughput genome-wide studies of gene expression. At the same time, the use of microarrays has broadened to include studies of DNA polymorphism, DNA copy number, DNA-binding proteins, DNA (re)sequencing, and more. In the course of these developments, we have learned a lot about the many non-biological aspects of microarray data and have devised methods that attempt to deal with them. Also, many novel statistical methods have been developed to address the challenges posed by the availability of large amounts of microarray data for answering biological questions.

Recent improvements in the efficiency, quality, and cost of genome-wide sequencing are prompting biologists to abandon microarrays in favor of next-generation sequencers such as Applied Biosystems' SOLiD, Helicos BioSciences' HeliScope, Illumina's Solexa, and Roche's 454 Life Sciences sequencing systems. These high-throughput sequencing technologies have already been applied to studying genome-wide transcription levels (mRNA-Seq), transcription factor binding sites (ChIP-Seq), chromatin structure, DNA copy number, and DNA methylation status.

While we might hope that these new sequencing-based studies have overcome many of the limitations of microarray-based studies, realistically we should expect these new technologies to raise problems of their own, similar to the ones we met with microarrays. If so, statisticians and others will need to understand and deal with non-biological features of the data, and to modify existing statistical methods or develop novel ones to get the best out of these data when helping biologists address the questions of interest to them.

This talk, which draws heavily on recent, unpublished work of Sandrine Dudoit and her students, reports on early findings, work in progress, and promising directions.


Date: Thursday, March 26, 2009
Time: 4:00 - 5:00 P.M.
Location: Mailman School of Public Health
Department of Biostatistics
722 West 168th Street
Biostatistics Computer Lab
6th Floor - Room 656
New York, New York

RESERVATIONS ARE NOT REQUIRED

Refreshments will be served at 3:30 P.M. in the
Biostatistics Conference Room (R627).



Copyright © 2009 by New York City Metropolitan Area Chapter of the ASA
Designed and maintained by Cynthia Scherer
Send questions or comments to nycasa@mindspring.com