Faculty Summaries
Karthik Devarajan, PhD
Associate Member & Assistant Professor
Office Phone: 215-728-2794
Fax: 215-728-2553
Office: R383
Statistical Methods in Bioinformatics

Recent advances in high-throughput technologies have given rise to large-scale biological data in the form of expression profiles of tens of thousands of genes and proteins, often with only a handful of patient samples. My research focuses on developing novel statistical methodology for the analysis of large-scale data from high-throughput studies such as next-generation sequencing, microarrays, allele-specific expression, SNP arrays, comparative genomic hybridization, siRNA screening and microscopy. This includes methods for dimension reduction and pattern recognition, as well as methods for correlating a phenotype (such as tissue type, patient response to a treatment, or survival time) with large numbers of covariates (genes, SNPs or sequence tags).

We are currently investigating two methods from statistical learning theory: nonnegative matrix factorization and continuum regression. Specifically, we are developing unsupervised learning methods for text mining applications in biomedical informatics as well as for model-based clustering of next-generation sequencing data. In this setting, there is no prior knowledge of the expected gene expression patterns for a given set of genes or for any phenotype. Our methods are based on a unified theoretical framework for nonnegative matrix factorization that supports discrimination between competing models and the elucidation of clusters and hidden variables within such large-scale data. Another problem of interest is associating large-scale molecular and clinical data with patient survival time in the presence of censoring. This is an important issue in translational medicine, but little research has been done in this area. We address this problem by developing methods that use continuum regression, a framework for supervised dimension reduction, in conjunction with well-known models for censored survival data.
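As a rough illustration of the kind of factorization discussed above, the following is a minimal sketch of nonnegative matrix factorization using the classical multiplicative updates for the squared-error (Frobenius) objective. It is not the unified framework or the model-based clustering method described here, whose details are not given on this page. The function names, the toy matrix, and the choice of rank are all assumptions made for illustration only.

```python
import numpy as np

def nmf_multiplicative(V, rank, n_iter=200, eps=1e-10, seed=0):
    """Approximate a nonnegative matrix V (genes x samples) as W @ H.

    Uses the classical multiplicative updates for the Frobenius
    objective; this is an illustrative choice, not the method
    described in the text.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    # Random strictly positive starting values keep every update well defined.
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        # Elementwise updates preserve nonnegativity of both factors.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def frobenius_error(V, W, H):
    """Reconstruction error ||V - WH|| in the Frobenius norm."""
    return float(np.linalg.norm(V - W @ H))

if __name__ == "__main__":
    # Hypothetical toy data: 6 "genes" x 4 "samples" with two sample groups.
    V = np.array([
        [5.0, 5.0, 0.0, 0.0],
        [5.0, 5.0, 0.0, 0.0],
        [5.0, 5.0, 0.0, 0.0],
        [0.0, 0.0, 4.0, 4.0],
        [0.0, 0.0, 4.0, 4.0],
        [0.0, 0.0, 4.0, 4.0],
    ])
    W, H = nmf_multiplicative(V, rank=2)
    # Assign each sample to the hidden factor with the largest weight in H.
    labels = H.argmax(axis=0)
    print("sample cluster labels:", labels)
    print("reconstruction error:", frobenius_error(V, W, H))
```

In this sketch the columns of W play the role of hidden variables (for example, groups of co-expressed genes), and each sample is assigned to a cluster by the largest entry in its column of H. In practice the rank and the error model would need to be chosen to suit the data; count data from sequencing, for instance, is often better matched by a likelihood other than squared error.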

Selected Publications
  1. Devarajan K, Cheung VCK. (2013). On non-negative matrix factorization algorithms for signal-dependent noise with application to electromyography data, Neural Computation, In press. doi: 10.1162/NECO_a_00576.
  2. Devarajan K, Ebrahimi N. (2013). On penalized likelihood estimation for a non-proportional hazards regression model. Statistics and Probability Letters, 83, 1703-1710. NIHMS 462966. COBRA Preprint Series, Working Paper 92.
  3. Devarajan K, Ebrahimi N. A semi-parametric generalization of the Cox proportional hazards regression model: Inference and Applications. Computational statistics & data analysis, 55(1):667-76, 2011. PMC2976538
  4. Devarajan K, Wang G, Ebrahimi N. (2011). A unified approach to non-negative matrix factorization and probabilistic latent semantic indexing, COBRA pre-print series, Working Paper 80. http://biostats.bepress.com/cobra/art80.
  5. Devarajan K, Zhou Y, Chachra N, Ebrahimi N. A supervised approach for predicting patient survival with gene expression data. Proceedings of the IEEE International Symposium on Bioinformatics and Bioengineering (BIBE), 2010(5521718):26-31, 2010. PMC2941901
  6. Devarajan K. Nonnegative matrix factorization: an analytical and interpretive tool in computational biology. PLoS computational biology, 4(7):e1000029, 2008. PMC2447881
  7. Wang MJ, Mehta A, Block TM, Marrero J, Di Bisceglie AM, Devarajan K. A comparison of statistical methods for the detection of hepatocellular carcinoma based on serum biomarkers and clinical variables. BMC Medical Genomics, 6(Suppl 3):S9, 2013.
  8. Anastassiadis T, Deacon SW, Devarajan K, Ma H, Peterson JR. Comprehensive assay of kinase catalytic activity reveals features of kinase inhibitor selectivity. Nature biotechnology, 29(11):1039-45, 2011. PMC3230241
  9. Cortellino S, Xu J, Sannai M, Moore R, Caretti E, Cigliano A, Le Coz M, Devarajan K, Wessels A, Soprano D, Abramowitz LK, Bartolomei MS, Rambow F, Bassi MR, Bruno T, Fanciulli M, Renner C, Klein-Szanto AJ, Matsumoto Y, Kobi D, Davidson I, Alberti C, Larue L, Bellacosa A. Thymine DNA glycosylase is essential for active DNA demethylation by linked deamination-base excision repair. Cell, 146(1):67-79, 2011. PMC3230223
  10. Astsaturov I, Ratushny V, Sukhanova A, Einarson MB, Bagnyukova T, Zhou Y, Devarajan K, Silverman JS, Tikhmyanova N, Skobeleva N, Pecherskaya A, Nasto RE, Sharma C, Jablonski SA, Serebriiskii IG, Weiner LM, Golemis EA. Synthetic lethal screen of an EGFR-centered network to improve targeted therapies. Science signaling, 3(140):ra67, 2010. PMC2950064
  11. Altomare DA, Vaslet CA, Skele KL, De Rienzo A, Devarajan K, Jhanwar SC, McClatchey AI, Kane AB, Testa JR. A mouse model recapitulating molecular features of human mesothelioma. Cancer research, 65(18):8090-5, 2005.