Karthik Devarajan, PhD
Associate Member & Assistant Professor
Office Phone: 215-728-2794
Recent advances in high-throughput technologies have given rise to large-scale biological data in the form of expression profiles of tens of thousands of genes and proteins, often with only a handful of patient samples. The focus of my research is the development of novel statistical methodology for the analysis of large-scale data stemming from high-throughput studies such as next-generation sequencing, microarrays, allele-specific expression, SNP arrays, comparative genomic hybridization, siRNA screening and microscopy. It includes methods for dimension reduction and pattern recognition as well as for correlating a certain phenotype (such as tissue type, patient response to a certain treatment or survival time) with large numbers of covariates (genes, SNPs or sequence tags).
We are currently investigating two methods from statistical learning theory: nonnegative matrix factorization and continuum regression. Specifically, we are developing unsupervised learning methods for text mining applications in biomedical informatics as well as for model-based clustering of next-generation sequencing data. In this setting, there is no prior knowledge of the expected gene expression patterns for a given set of genes or for any phenotype. Our methods are based on a unified theoretical framework for nonnegative matrix factorization for the discrimination of competing models and elucidation of clusters and hidden variables within such large-scale data. Another problem of interest is associating large-scale molecular data and clinical data with patient survival time in the presence of censoring. This is an important issue in translational medicine; however, little research has been done in this area. We address this problem by developing methods that utilize continuum regression, a framework for supervised dimension reduction, in conjunction with well-known models for censored survival data.
- Devarajan, K., Wang, G., Ebrahimi, N. (2014). A unified statistical approach to non-negative matrix factorization and probabilistic latent semantic indexing, Machine Learning, 1-27. doi: 10.1007/s10994-014-5470-z. COBRA pre-print series, Article 80 (July 2011).
- Devarajan, K., Cheung, V.C.K. (2014). On non-negative matrix factorization algorithms for signal-dependent noise with application to electromyography data, Neural Computation, Jun;26(6):1128-68. Epub 2014 Mar 31. doi: 10.1162/NECO_a_00576.
- Devarajan, K., Ebrahimi, N. (2013). On penalized likelihood estimation for a non-proportional hazards regression model. Statistics and Probability Letters, 83, 1703-1710. NIHMS 462966.
- Devarajan, K., Ebrahimi, N. (2011). A semi-parametric generalization of the Cox proportional hazards regression model: Inference and Applications, Computational Statistics and Data Analysis, 55(1):667-76, doi:10.1016/j.csda.2010.06.010. PMCID: PMC2976538.
- Devarajan, K., Zhou, Y., Chachra, N., Ebrahimi, N. (2010). A supervised approach for predicting patient survival with gene expression data, Proceedings of the IEEE Tenth International Conference on Bioinformatics and Bioengineering, 2010(5521718):26-31. PMCID: PMC2941901.
- Devarajan, K. (2008). Non-negative matrix factorization: An analytical and interpretive tool in computational biology, PLoS Computational Biology, 4(7): e1000029. doi:10.1371/journal.pcbi.1000029. PMCID: PMC2447881.
- Wang, M., Mehta, A., Block, T.M., Marrero, J., Di Bisceglie, A.M., and Devarajan, K. (2013). A comparison of statistical methods for the detection of hepatocellular carcinoma based on serum biomarkers and clinical variables. BMC Medical Genomics 6(Suppl 3):S9 (11 November 2013).
- Anastassiadis, T., Deacon, S., Devarajan, K., Ma, H., Peterson, J. Comprehensive assay of kinase catalytic activity reveals features of kinase inhibitor selectivity. Nat. Biotechnol. 29(11):1039-1045, 2011. PMCID: PMC3230241.
- Cortellino, S., Xu, J., Sannai, M., Moore, R., Caretti, E., Cigliano, A., Le Coz, M., Devarajan, K., Wessels, A., Soprano, D., Abramowitz, L.K., Bartolomei, M.S., Rambow, F., Bassi, M.R., Bruno, T., Fanciulli, M., Renner, C., Klein-Szanto, A.J., Matsumoto, Y., Kobi, D., Davidson, I., Alberti, C., Larue, L., Bellacosa, A. Thymine DNA Glycosylase Is Essential for Active DNA Demethylation by Linked Deamination-Base Excision Repair (ch. 108), Cell. 2011 Jul 8;146(1):67-79. Epub 2011 Jun 30. PMCID: PMC3230223.
- Astsaturov, I., Ratushny, V., Sukhanova, A., Einarson, M.B., Bagnukova, T., Zhou, Y., Devarajan, K., Silverman, J.S., Tikhmyanova, N., Skobeleva, N., Pecherskaya, A., Sharma, C., Nasto, R., Jablonski, S., Serebriiskii, I., Weiner, L., Golemis, E. (2010). Synthetic lethal screen of an EGFR-centered network to improve targeted therapies, Science Signaling, 2010 Sep 21;3(140):ra67. PMCID: PMC2950064.
- Altomare, D.A., Vaslet, C.A., Skele, K.L., De Rienzo, A., Devarajan, K., McClatchey, A.I., Kane, A.B., Jhanwar, S.C., Testa, J.R. (2005). A Mouse Model Recapitulating Molecular Features of Human Mesothelioma, Priority Report, Cancer Research, 65(18): 8090-8095.