Karthik Devarajan, PhD

Assistant Professor


Karthik.Devarajan@fccc.edu


Recent advances in high-throughput technologies have given rise to large-scale biological data in the form of expression profiles of tens of thousands of genes and proteins, often with only a handful of tissue samples. My research focuses on developing novel statistical methodology for the analysis of large-scale data from high-throughput studies such as microarrays, comparative genomic hybridization, siRNA screening and microscopy. This includes methods for pattern recognition as well as for correlating a phenotype (such as tissue type, patient response to a given treatment, or survival time) with large numbers of covariates (genes).

We are currently investigating two methods from statistical learning theory: nonnegative matrix factorization and partial least squares. Specifically, we are developing unsupervised clustering methods for molecular pattern discovery as well as for text mining applications in biomedical informatics. In this setting, there is no prior knowledge of the expected gene expression patterns for a given set of genes or for any phenotype. Our methods use nonnegative matrix factorization to discriminate between competing models and to elucidate clusters and hidden variables within such large-scale data. Another problem of interest is associating large-scale molecular data and clinical data with patient survival time in the presence of censoring. This is an important issue in translational medicine, yet very little research has been done in this area. We addressed this problem by developing methods that combine partial least squares with the accelerated failure time model for censored survival data. We are currently extending this approach to other learning-theoretic methods and models for censored survival data.
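As a rough illustration of how nonnegative matrix factorization can be used for unsupervised clustering of expression data, here is a minimal sketch. It is not our methodology: it assumes the standard multiplicative updates for the squared-error (Frobenius) objective, a simulated genes-by-samples matrix, and a fixed number of factors k = 2. The function names `nmf` and `cluster_samples` and all data values are hypothetical, chosen only for this example.

```python
import numpy as np

def nmf(V, k, n_iter=500, seed=0, eps=1e-9):
    """Factor a nonnegative matrix V (genes x samples) as V ~ W @ H.

    Uses the standard multiplicative updates for the Frobenius-norm
    objective (an assumption of this sketch, not a specific published variant).
    """
    rng = np.random.default_rng(seed)
    n_genes, n_samples = V.shape
    W = rng.random((n_genes, k)) + eps   # gene loadings for each factor
    H = rng.random((k, n_samples)) + eps # factor weights for each sample
    for _ in range(n_iter):
        # Each update keeps the factors nonnegative because it only
        # multiplies by ratios of nonnegative quantities.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def cluster_samples(W, H):
    """Assign each sample to the factor with the largest weight.

    Columns of W are rescaled to unit sum first, so the weights in H
    are comparable across factors.
    """
    scale = W.sum(axis=0)
    H_scaled = H * scale[:, None]
    return H_scaled.argmax(axis=0)

# Simulated toy data: 20 genes, 6 samples, two sample groups.
rng = np.random.default_rng(1)
V = 0.1 + 0.1 * rng.random((20, 6))  # low nonnegative background
V[:10, :3] += 5.0                    # genes 0-9 are high in samples 0-2
V[10:, 3:] += 5.0                    # genes 10-19 are high in samples 3-5

W, H = nmf(V, k=2)
labels = cluster_samples(W, H)
print("cluster labels per sample:", labels)
print("relative reconstruction error:",
      np.linalg.norm(V - W @ H) / np.linalg.norm(V))
```

In this sketch the samples are grouped by their dominant factor in H, while the columns of W indicate which genes drive each group. Choosing k and comparing competing models are separate questions that the sketch does not address.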