## Somnath Datta, PhD

**Background**

Ph. D. (1988), Statistics and Probability, Michigan State University, East Lansing.

M. Stat. (1985), Mathematical Statistics and Probability, Indian Statistical Institute, Calcutta.

B. Stat. (1983), Statistics, Indian Statistical Institute, Calcutta.

Dr. Datta is a professor at the University of Florida.

**Honors and Awards (selected)**

• 2011: **CDC ATSDR 2011 Statistical Science Award**: Best Theoretical Paper, "Inverse Probability of Censoring Weighted U-statistics for Right-Censored Data with an Application to Testing Hypotheses", Datta, Somnath, Bandyopadhyay, Dipankar and Satten, Glen A., Scandinavian Journal of Statistics, 37, 680-700 (2010).

• 2010: **Elected Fellow, Institute of Mathematical Statistics.**

• 2009: **Elected member, International Statistical Institute.**

• 2007: **First Place, American Spinal Injury Association, Best Poster Award**, 33rd Annual Scientific Meeting, for the poster "A Multivariate Examination of Temporal Change in BERG Balance Scale Variables for Patients with ASIA C AND D Spinal Cord Injuries" by S. Datta, D. Lorenz, M. Schmidt-Read, E. Ardolino, S. Morrison, and S.J. Harkema.

• 2006: **Elected Fellow, American Statistical Association.**

• 2005: **CDC ATSDR 2005 Statistical Science Award**: Best Application Paper, “Standardization and denoising algorithms for mass spectra to classify whole-organism bacterial specimens”, Satten, G. A., Datta, S., Moura, H., Woolfitt, A., Carvalho, G., De, B. K., Pavlopoulos, A., Carlone, G. M., and Barr, J., Bioinformatics, 20, 3128-3136 (2004).

• 2004: **CDC ATSDR 2004 Statistical Science Award**: Best Theoretical Paper, “Marginal analyses of clustered data when cluster size is informative”, Williamson, J. M., Datta, S. and Satten, G. A., Biometrics, 59, 36-42 (2003).

• 2003: **Snedecor Award nomination** for the paper “Estimation of integrated transition hazards and stage occupation probabilities for non-Markov systems under stage dependent censoring” by Datta, S. and Satten, G. A., Biometrics, 58, 792-802 (2002).

• 2001: **CDC ATSDR 2001 Statistical Science Award**: Best Theoretical Paper, “A simulate-update algorithm for missing data problems”, Satten, G. A. and Datta, S., Computational Statistics, 15, 243-277 (2000).

• 1999: **CDC ATSDR 1999 Statistical Science Award**: Best Theoretical Paper, “A semiparametric approach to the proportional hazards model for interval censored data”, Satten, G. A., Datta, S. and Williamson, J. M., Journal of the American Statistical Association, 93, 318-327 (1998).

**External Research Funding (since 1995)**

**PI Level**

11. National Institutes of Health, NIDCR, R03 grant, Grant Number: 1R03DE022538-01, “Novel statistical models for dental caries”, July 2012-June 2014, Role: Principal Investigator, 3 cal months. Salary, student, travel and other support.

10. National Institutes of Health, NIDCR, R03 grant, Grant Number: 1R03DE020839-01A1, “Rank tests for clustered data with potentially informative cluster size: Novel statistical methods for analyzing dental data”, September 2011-August 2013, Role: Principal Investigator, 2.4 cal months. Salary, student, travel and other support.

9. National Science Foundation, DMS, “SOLAR: New Materials Search for Solar Energy Conversion to Fuels”, Recommended for award, Role: Co-Principal Investigator; Awarded jointly with M. Sunkara (Louisville), M. Menon (Kentucky) and K. Rajan (Iowa State). 1 cal month. Salary, post-doc, student, travel and other support.

8. National Security Agency, Mathematical Science Grant, Grant Number: H98230-11-1-0168, "Nonparametric Regression of State Occupation Probabilities, State Entry, Exit and Waiting Time Distributions in a Multistate Model", March 2011-February 2013, Role: Principal Investigator, 1 cal month. Salary, student and travel support.

7. National Science Foundation, Statistics Program (DMS), Grant Number: DMS-0706965, “Theory and Applications of U-statistics for Multistate Models under Censoring”, July 2007-June 2011, Role: Principal Investigator, 1 cal month. Salary, student and travel support.

6. National Security Agency, Mathematical Science Grant, Grant Numbers: H98230-05-1-0054 (Georgia), H98230-06-1-0062 (Louisville), “Nonparametric inference in censored data problems”, Jan 2005-Dec 2006, Role: Principal Investigator, 1 cal month. Salary, student, travel and computing support.

5. Centers for Disease Control and Prevention, Division of Molecular Biology, IPA Award, “Problems in Genetic Epidemiology”, June 2001-May 2005, Role: Principal Investigator, 3 cal months. Salary support.

4. National Security Agency, “Large Sample Theory of Inverse Probability of Censoring Weighted Estimation in Multistage and Mixed Linear Models”, Mathematical Science Grant, Dec. 2002-Nov 2004, Role: Principal Investigator, 1 cal month. Salary, travel and computing support.

3. Centers for Disease Control and Prevention, Division of HIV/AIDS Prevention: Surveillance and Epidemiology, IPA Award, “Analysis of Complex Survival Data”, Sept 1996-August 2000, Role: Principal Investigator, 3 cal months. Salary support.

2. National Security Agency, Mathematical Science Grant, Grant number: MDA904-96-1-0049, Dec 1996-Dec 1999, Role: Principal Investigator, 1 cal month. Salary, travel and computing support.

1. National Science Foundation, Statistics Program (DMS), “Mathematical Sciences Computing Research Environments”, August 1995-July 1996, Role: Co-Principal Investigator (awarded jointly with L. Billard and T. N. Sriram). Computing support.

**Non-PI Level**

5. National Institutes of Health, NICHD, R01 Grant, Grant Number: 1R01HD065279-01, "Gross morphological correlates to the minicolumnopathy of autism", PI: M. Casanova, September 2009-August 2011, Role: co-Investigator, 1.2 cal months. Salary support.

4. National Institutes of Health, NINDS, R01 Grant, Grant Number: 1 R01 NS049209-01 A1, “Plasticity of Human Spinal Neural Networks After Injury”, PI: S. Harkema, January 2007-March 2009, Role: co-Investigator, 1.2 cal months. Salary support.

3. Christopher Reeve Foundation, "Development of Neural Recovery Rehabilitation and Research Centers", PI: S. Harkema, August 2006-November 2011, Role: Senior Statistician, 1.2 cal months - 4.8 cal months. Salary support.

2. National Institutes of Health, NIMH, R34 Grant, "Outcomes of Teacher Training on Autism", PI: L. Ruble, 2005-2008, Role: co-Investigator, 0.6 cal months. Salary support.

1. National Institutes of Health, NCI, R15 Grant, “Efficient Estimation Methods for Censored Survival Data”, PI: S. Subramanian, April 2004-March 2007, Role: Consultant. Flat Fee.

**Service to Profession**

• **Editor-in-Chief** (co with H. Koul), Statistics & Probability Letters, 2007-2011.

• **Guest Editor** (co with H. van Houwelingen), Special Issue on Statistics in Biological and Medical Sciences, Statistics & Probability Letters, 2010-2011.

• **Associate Editor**, The American Statistician, 2006-current.

• **Associate Editor**, BMC Bioinformatics, 2010-current.

• **Associate Editor**, Communications in Statistics, 2002-current.

• **Co-Editor**, Sankhya, 2001-2007.

• **Vice-president**: Forum for Interdisciplinary Mathematics, 2011-2013.

**Selected Publications**

Chakraborty, S., Datta, S. and Datta, S. Surrogate variable analysis using partial least squares (SVA-PLS) in gene expression studies. Bioinformatics, 28, 799-806 (2012).

Datta, S., Lorenz, D. J., Harkema, S. J. A dynamic longitudinal evaluation of the utility of the Berg Balance Scale in patients with motor incomplete spinal cord injury. Archives of Physical Medicine and Rehabilitation, 93, 1565-1573 (2012).

Fan, J. and Datta, S. Fitting accelerated failure time models to clustered survival data with potentially informative cluster size. Computational Statistics & Data Analysis, 55, 3295-3303 (2011).

Datta, S., **Datta, S.**, Kim, S., Chakraborty, S. and Gill, R. S. Statistical Analyses of Next Generation Sequence Data: A Partial Overview. Journal of Proteomics & Bioinformatics, 3, 183-190 (2010).

Lan, L. and Datta, S. Comparison of state occupation, entry, exit and waiting times in two or more groups based on current status data in a multistate model. Statistics in Medicine, 29, 906-914 (2010).

Datta, S., Bandyopadhyay, D. and Satten, G. A. Inverse probability of censoring weighted U-statistics for right censored data with applications. Scandinavian Journal of Statistics, 37, 680–700 (2010).

Datta, S. and **Datta, S.** Computational biology touches all bases. Genome Biology, 10, 303 (2009).

Datta, S., Lorenz, D. J., Morrison, S., Ardolino, E., Harkema, S. J. A multivariate examination of temporal changes in Berg variables for patients with AIS C and D spinal cord injuries. Archives of Physical Medicine and Rehabilitation, 90, 1208-1217 (2009).

Pihur, V., **Datta, S.** and Datta, S. Finding cancer genes through meta-analysis of microarray experiments: Rank aggregation via the cross entropy algorithm. Genomics, 92, 400-403 (2008).

Pihur, V., **Datta, S.** and Datta, S. Reconstruction of genetic association networks from microarray data: A partial least squares approach. Bioinformatics, 24, 561-568 (2008).

Datta, S. and Satten, G. A. A signed-rank test for clustered data. Biometrics, 64, 501-507 (2008).

Pihur, V., Datta, S. and **Datta, S.** Weighted rank aggregation of cluster validation measures: A Monte Carlo cross-entropy approach. Bioinformatics, 23, 1607-1615 (2007).

Datta, S. and Sundaram, R. Nonparametric marginal estimation in a multistage model using current status data. Biometrics, 62, 829–837 (2006).

Datta, S. and Satten, G. A. Rank-sum tests for clustered data. Journal of the American Statistical Association, 100, 908-915 (2005).

Chakraborty, S. and Datta, S. How will plant pathogens adapt to host plant resistance at elevated CO2 under a changing climate? New Phytologist, 159, 733-742 (2003).

Datta, S. and **Datta, S.** Comparisons and validation of statistical clustering techniques for microarray gene expression data. Bioinformatics, 19, 459-466 (2003).

Williamson, J., Datta, S., and Satten, G. A. Marginal analyses of clustered data when cluster size is informative. Biometrics, 59, 36-42 (2003).

Datta, S. and Satten, G. A. Estimation of integrated transition hazards and stage occupation probabilities for non-Markov systems under stage dependent censoring. Biometrics, 58, 792-802 (2002).

Datta, S. and Satten, G. A. Validity of the Aalen-Johansen estimators of stage occupation probabilities and integrated transition hazards for non-Markov models. Statistics and Probability Letters, 55, 403-411 (2001).

Satten, G. A., Datta, S. and Robins, J. M. An estimator for the survival function when data are subject to dependent censoring. Statistics and Probability Letters, 54, 397-403 (2001).

Satten, G. A. and Datta, S. The Kaplan-Meier Estimator as an inverse-probability-of-censoring weighted average. American Statistician, 55, 207-210 (2001).

Satten, G. A. and Datta, S. A simulate-update algorithm for missing data problems. Computational Statistics, 15, 243-277 (2000).

**Datta, S.**, Satten, G. A. and Datta, S. Nonparametric estimation for the three-stage irreversible illness-death model. Biometrics, 56, 841-847 (2000).

Satten, G. A., Datta, S. and Williamson, J. M. A semiparametric approach to the proportional hazards model for interval censored data. Journal of the American Statistical Association, 93, 318-327 (1998).

Datta, S. and Hannan, J. F. A uniform L1 law of large numbers for functions on a totally bounded metric space. Sankhya A, 59, 167-174 (1997).

Datta, S. On asymptotic properties of bootstrap for AR(1) processes. Journal of Statistical Planning and Inference, 53, 361-374 (1996).

Datta, S. and McCormick, W. P. Bootstrap inference for a first order autoregression with positive innovations. Journal of the American Statistical Association, 90, 1289-1300 (1995).

Datta, S. Limit theory and bootstrap for explosive and partially explosive autoregression. Stochastic Processes and Their Applications, 57, 285-304 (1995).

Datta, S. and McCormick, W. P. Some continuous Edgeworth expansions for Markov chains with applications to bootstrap. Journal of Multivariate Analysis, 52, 83-106 (1995).

Datta, S. Some non asymptotic bounds for L1 density estimation using kernels. Annals of Statistics, 20, 1658-1667 (1992).

Datta, S. Asymptotic optimality of Bayes compound estimators in compact exponential families. Annals of Statistics, 19, 354-365 (1991).

Datta, S. On the consistency of posterior mixtures and its application. Annals of Statistics, 19, 338-353 (1991).

**Outside Interests**

Photography, Digital Editing, Sound Recording, Stamp Collecting.

**Other**