
Surajit Ray


University Gardens
University of Glasgow GLASGOW G12 8QQ

Phone:   0141 330 6238
Fax:       0141 330 4814
E-mail: Surajit.Ray at glasgow.ac.uk
Professional Experience


2012-present Senior Lecturer, School of Mathematics and Statistics,
University of Glasgow.
2006-2012 Assistant Professor, Dept. of Mathematics and Statistics,
Boston University.
2009-present Affiliated Faculty, Bioinformatics Program, Boston University.
2008-present Associate Member, Cancer Vaccine Center, Dana Farber Cancer Research Institute, Harvard Medical School.
2005-2006 Visiting Assistant Professor, Dept. of Biostatistics,
University of North Carolina at Chapel Hill.
2004-2005 Postdoctoral Fellow, Statistics and Applied Mathematical Sciences Institute, Research Triangle Park, Durham.
2003-2004 Visiting Assistant Professor, Dept. of Biostatistics,
University of North Carolina at Chapel Hill.
2000-2003 Research Assistant, Dept. of Statistics,
Pennsylvania State University, University Park.


Education


1999-2003 Ph.D., Dept. of Statistics, Pennsylvania State University.
Dissertation: "Distance-based Model-Selection with application to Analysis of Gene Expression Data"
Advisor: Bruce G. Lindsay.
1997-1999 Master of Statistics, Indian Statistical Institute, Calcutta, India.
1994-1997 Bachelor of Science (Honors) in Statistics, Presidency College, Calcutta, India.


Honors and Awards


2009 Institute of Mathematical Statistics Young Researcher Travel-Award to present at the First IMS Asia Pacific Rim Meetings, Seoul National Univ, Seoul, South Korea
2009 NSF Travel-Award to present at the DSC 2009 Meetings at Copenhagen, Denmark on Future of Statistical Computing
2007 Honored by the Class of 2007 through the Class of 2007 Gift Program at Boston University.
2004 Laha Travel Award from the Institute of Mathematical Statistics
2003 "Most Outstanding Student Presentation" in Theoretical Statistics at the International Conference on Statistics, Combinatorics and Related Areas. (See Presentations below)
2001-2003 Several graduate student travel awards from the Eberly College of Science, Penn State.
2002 Davey Graduate fellowship award from the Eberly College of Science, Penn State.
2002 August and Ruth Homeyer Graduate fellowship award from the Eberly College of Science, Penn State.
2000 Vollmer-Kleckner Scholarship award in Science from the Eberly College of Science, Penn State, for the most outstanding performance in PhD Qualifiers.


Research and Education Funding


Current Federal Grants (Currently PI/CoPI on active awards totaling $3.23M)
  • NSF Award No: #0934739: Functional Data Modeling of Climate-Ecosystem Dynamics 09/01/09- 08/31/13 Total Award $350,001 Role: PI
  • NSF Award No: #0947950: NSF GK-12 Graduate STEM Fellows in K-12 Education GLACIER-Global Change Initiative-Education & Research 03/15/2010-03/14/2014 Total Award $2,879,294 Role: Co-PI
  • NSF REU Award No: #0934739: Functional Data Modeling of Climate-Ecosystem Dynamics 09/01/10- 08/31/13 Total Award $6,000 Role: PI
  • NSF REU Award No: #0934739: Functional Data Modeling of Climate-Ecosystem Dynamics 09/01/11- 08/31/13 Total Award $7,200 Role: PI
    Current Boston University Grants
  • BU RULES: Transforming the BU Introductory Statistics Experience. 05/01/2011- 04/30/2013 Total Award $50,000 Role: PI
    Past Grants
  • NASA Carbon Cycle & Ecosystem Grant: MODIS Algorithm Refinement and Earth Science Data Record Development for Global Land Cover and Land Cover Dynamics. NNX08AE61A (P.I. Mark Friedl) 06/01/2008- 07/01/2008 Role: Statistician.
  • NIH Program Project Grant: Medical Image Presentation, (P.I: Pizer, S.M.) 09/15/2005-06/31/2007 Role: Statistician.
  • BU UROP Award: Functional Data Modeling of Climate-Ecosystem Dynamics 09/01/10- 12/31/10 Total Award $1,500 Role: PI
  • BU UROP Award: Functional Data Modeling of Climate-Ecosystem Dynamics 01/02/11- 05/15/11 Total Award $750 Role: PI

    Research Interests


  • Theory and applications of finite mixture models, detection of modes in high-dimensional data, and modal clustering.
  • Assessment of model fit in high-dimensional data and nonlinear manifolds.
  • Statistical methodology for social sciences, focusing on structural equation models.
  • Medical imaging: segmentation and characterization of anatomical objects in high dimensions and nonlinear manifolds.
  • Immuno-informatics: classification of epitopes and model-based clustering of microarray gene-expression data, with applications to "epitope-based" vaccine development.
  • Functional data analysis with application to the analysis of remote sensing data.
  • Analysis and visualization of Flow Cytometry Data.


    Publications


  • Ray, S., Pyne, S. (2012) A Computational Framework to Emulate the Human Perspective in Flow Cytometric Data Analysis. PLoS ONE 7(5), e35693.
  • Liu, C., Ray, S., Hooker, G., Friedl, M.F. (2012) Functional Factor Analysis For Periodic Remote Sensing Data. Annals of Applied Statistics, 6:2, 601-624.
  • Ray, S., Ren, D. (2012) On the upper bound of the number of modes of a multivariate normal mixture. Journal of Multivariate Analysis 108, 41-52.
  • Bollen, K.A., Ray, S., Zavisca, J. (2012) A Comparison of Bayes Factor Approximation Methods Including Two New Methods. Sociological Methods and Research 41, 294-324.
  • Shi, P., Kon, M., Ray, S., Zhu, Q. (2011) Top Scoring Pairs for Feature Selection in Machine Learning with Applications to Cancer Outcome Prediction. BMC Bioinformatics 12:375
  • Ray, S. (2011) Discussion on "Projection Pursuit Via White Noise Matrices" by Hui G., and Lindsay, B. Sankhya, Series B. 72 147-150.
  • DeLuca, D., Marina, O., Ray, S., Zhang, G.L., Wu, C.J., and Brusic, V. (2011) Data Processing and Analysis for Protein Microarrays. In Protein Microarrays for Disease Analysis, Ed: Catherine Wu, Methods in Molecular Biology, Humana Press 723, 337-347.
  • Lin, H. H., Ray, S., Tongchusak, S., Reinherz, E. L., and Brusic, V. (2008) Evaluation of MHC class I peptide binding prediction servers: applications for vaccine research. BMC Immunology, 9:8
  • Lindsay, B.G., Markatou M., Ray, S., Yang, K., Chen, S.C. (2008) Quadratic distances on probabilities: the foundations. The Annals of Statistics Vol. 36, No. 2, page 983-1006
  • Ray, S., Lindsay, B. (2008) Model selection in High-Dimensions: A Quadratic-risk Based Approach. Journal of the Royal Statistical Society - Series B Volume 70 Issue 1 (Feb), 95-118.
  • Ray, S., Kepler, T. (2007) Amino acid biophysical properties in the statistical prediction of peptide-MHC class I binding. Immunome Research Oct 29;3(1):9
  • Li, J., Ray, S., Lindsay, B.G. (2007) A Nonparametric Statistical Approach to Clustering via Mode Identification. Journal of Machine Learning Research 8(Aug):1687-1723.
  • Levy, J.H., Broadhurst, R.R., Ray, S., Chaney, E.L., Pizer, S.M. (2007) Signaling local non-credibility in an automatic segmentation pipeline. Proceedings of the International Society for Optical Engineering meetings on Medical Imaging, Volume 6512
  • Gupta, M., Ray, S. (2006) Sequence pattern discovery with applications to understanding gene regulation and vaccine design. Handbook of Statistics Ed. Chakraborty, R. and Rao, C.R. Elsevier Press [in press].
  • Jeong, J., Pizer, S.M., Ray, S. (2006) Statistics on Anatomic Objects Reflecting Inter-Object Relations. Proceedings of International Workshop on Mathematical Foundations of Computational Anatomy.
  • Ray, S., Lindsay, B. (2005) The Topography of Multivariate Normal Mixtures. The Annals of Statistics 33, 5, 2042-2065.
  • Ray, S., Lindsay, B. (2005) Selecting the Number of Components in a Finite Mixture: A Risk-Based Approach. Proceedings of the 37th Symposium on the Interface, Computing Science and Statistics 37.
  • Basu, A., Ray, S., Park, C., Basu, S. (2002) Improved Power in Multinomial Goodness-of-fit Tests, Journal of the Royal Statistical Society Series D, 51, 3, 381-393.
  • Ray, S. (2003) Distance-based Model-Selection with application to the Analysis of Gene Expression Data. Electronic Thesis.  http://etda.libraries.psu.edu/theses/approved/WorldWideIndex/ETD-375/


    Books


  • Ray, S. Clustering, Cluster Inference and Applications in Clustering: Applications to the Analysis of Gene Expression Data. LAP LAMBERT Academic Publishing, ISBN 978-3845423623, 2011 [Reproduction of Dissertation]


    Manuscripts under Review


  • Lindsay, B.G., Markatou M., Ray, S. Degrees of Freedom in Quadratic Goodness of Fit. [Submitted to Journal of the American Statistical Association, October 2011]

    Working Papers


  • Ray, S. and Brusic, V. Statistical Analysis of ProtoArray: Evaluation of reproducibility.
  • Berger, J.O., Ray, S., Visser, I, Bayarri, M.J., Jang, W. Generalization of BIC.

    Published software


  • MODALCLUST: R-package for finding the number of modes of a multivariate normal and providing graphical and analytical representation of high-dimensional manifolds. (Available via CRAN)
  • ProtMAT: Web-based tool whose core function is the implementation of the CDA (Concentration Dependent Analysis) algorithm, a method that removes bias caused by the varying concentrations of spotted protein.
  • QUADRISK: C++ binary for calculating quadratic risk of a mixture fit and providing graphical aid to high-dimensional model selection problems.
  • MHCPROP: R-package for MHC binder prediction based on biophysicochemical properties of amino acids.


    Invited Presentations


  • Functional Factor Analysis For Periodic Remote Sensing Data. High dimensional and dependent functional data Research workshop: Bristol, UK September 10-12, 2012.
  • Functional Factor Analysis For Periodic Remote Sensing Data. The 31st Leeds Annual Statistical Research Workshop: New Statistics and Modern Natural Sciences July 3-5, 2012.
  • Functional Factor Analysis For Periodic Remote Sensing Data. 2012 Conference on Contemporary Issues and Applications of Statistics (CIAS2012) , Indian Statistical Institute, Kolkata, India January 2-4, 2012.
  • How Many Modes Can a Gaussian Mixture Have? Dept. of Statistics, Cornell University April 11, 2012
  • Understanding Climate change through Functional Data Analysis. Workshop on Understanding Climate Change through Data. This workshop is a part of the annual meeting of the NSF Expeditions in Computing, Minneapolis, MN, USA. August 15-16, 2011
  • Functional Factor Analysis For Periodic Remote Sensing Data. International Conference on Probability, Statistics, and Data Analysis, Raleigh, NC, USA April 21-24, 2011
  • Functional Factor Analysis For Periodic Remote Sensing Data. Workshop on Applied Statistical Methods, Indian Statistical Institute, Kolkata, India Mar 22, 2011.
  • Challenges in Functional Modeling of Climate Eco-System Dynamics. Functional Data Analysis: Future Directions, Banff, Canada May 2 - May 7, 2010.
  • How many modes can a mixture of two components have? Mixture estimation and applications, Edinburgh, UK Mar 3, 2010 - Mar 5, 2010.
  • Statistical Analysis of Climate Ecosystem Dynamics. Brown University Pattern Theory Seminars February 10, 2010.
  • Clustering of Functional Data. DSC 2009: A Meeting on the Future of Statistical Computing, Copenhagen, Denmark July 13, 2009.
  • Clustering and classification of functional data. IMS Asia Pacific Rim Meetings, Seoul National Univ, Seoul, South Korea June 30, 2009.
  • Clustering and classification of functional data. 75th anniversary of the Statistical Laboratory Department of Statistics, Iowa State University June 4, 2009.
  • Clustering and classification of functional data. Department of Computer Science, Boston University February 19, 2009.
  • Clustering and classification of functional data. Department of Biostatistics Colloquium, Columbia University November 20, 2008.
  • Clustering and classification of functional data. Colloque de statistique de Montreal, University of Montreal, Montreal Canada October 31, 2008.
  • Data Mining and Knowledge Discovery of Land Cover and Terrestrial Ecosystem Processes from Global Remote Sensing Data. NASA conference on Intelligent Data Understanding: Presented by Mark Friedl September 9-10, 2008
  • Modal Inference and Its Application to High-Dimensional Clustering. Session on Mixture Models: A Tool for Multilayered Clustering and Dimension Reduction at the Joint Statistical Meetings August 3-7, 2008.
  • A tool for multi-layered clustering and dimension reduction. International Conference on Statistical Paradigms - Recent Advances AND Reconciliations (ICSPRAR-2008), Indian Statistical Institute, Kolkata January 1-4, 2008.
  • Modal Inference and Its Application to High-Dimensional Clustering. Department of Biostatistics, University of Minnesota, October 31, 2007.
  • An Extended BIC for Model Selection. Joint Statistical Meetings, Salt Lake City August 1, 2007.
  • Modal Inference: Building the bridge between nonparametric clustering and mixture analyses. WNAR and IMS Meetings, Irvine June 26, 2007.
  • Modal inference and its application to high-dimensional clustering, Department of Statistics, Harvard University, Cambridge April 30, 2007.
  • Modal Inference: Building the bridge between nonparametric clustering and mixture analyses. Department of Electrical and Computer Engineering, Boston University March 21, 2007.
  • Modal inference and its application to high-dimensional clustering, Department of Biostatistics, Harvard University, Boston Feb 21, 2007.
  • Hierarchical Modal Clustering based on the Topography of Multivariate Mixtures, International Conference on Multivariate Statistical Methods, Kolkata, India Dec 29, 2006.
  • Modal inference and its application to high-dimensional clustering, Department of Statistics, University of Connecticut, Storrs Nov 9, 2006.
  • Modal EM for Mixtures and its Application in Clustering, Department of Mathematics and Statistics, Boston University, Boston Sep 28, 2006.
  • Hierarchical Modal Clustering based on the Topography of Multivariate Mixtures, Department of Mathematics and Statistics, Boston University, Boston Mar 22, 2006.
  • Hierarchical Modal Clustering based on the Topography of Multivariate Mixtures, Department of Mathematics and Statistics, McGill University, Montreal, Canada Feb 28, 2006.
  • Hierarchical Modal Clustering based on the Topography of Multivariate Mixtures, Department of Biostatistics, University of North Carolina, Chapel Hill Feb 24, 2006.
  • Hierarchical Modal Clustering based on the Topography of Multivariate Mixtures, Department of Statistics, Yale University, New Haven Feb 13, 2006.
  • Model Selection in High-Dimensions: A Quadratic risk-based approach, Department of Probability and Statistics, National University of Singapore, Singapore Feb 3, 2006
  • The topography of multivariate mixtures and Modal clusters, Department of Mathematics and Statistics, University of Bristol, UK, Jan 11, 2006
  • Quadratic Distance: The basis for building High-dimensional model selection tool, Department of Statistics, University of Glasgow, UK Dec 12, 2005
  • Effective sample size and the Bayes factor, Transition Workshop: Latent Variable Models in the Social Sciences, SAMSI Nov 11, 2005
  • Model Selection in High-Dimensions: A Quadratic risk-based approach, Department of Statistics, University of California, Davis Oct 6, 2005
  • Using Quadratic Risk to select models in High dimensions, Department of Statistics, London School of Economics, London, UK Sep 27, 2005
  • Classification of MHC-I binding epitopes, WNAR/IMS Annual Meeting, Fairbanks, Alaska June 21-24, 2005
  • On using Quadratic Risk to Select High dimensional Mixture Model, Annual Meeting of the Statistical Society of Canada June 12-15, 2005
  • Selecting the Number of Components in a Finite Mixture: A Risk-based Approach, Joint Annual Meeting of the Interface and the Classification Society of North America: Theme: Clustering and Classification, Washington University School of Medicine, St. Louis, Missouri June 8-12, 2005
  • Bayes Factors in Structural Equation Models: Schwarz's BIC and Other Approximations, American Sociological Association Section on Methodology: 2005 Annual Meeting, Chapel Hill Apr 22, 2005
  • Selecting the Number of Components in a Finite Mixture: A Risk-based Approach, International Conference on the future of statistical theory, practice and education, Hyderabad, India Dec 29-Jan 1, 2004
  • The Topography of Multivariate Normal Mixtures, Seventh North American New Research Conference Toronto, Canada Aug 4-7, 2004
  • Distance-based Model-selection in Mixture Distributions, International Conference on: Statistics in Health Sciences, Nantes, France June 23-25, 2004.


    Teaching Experience


    Courses at University of Glasgow
  • Developing new courses in Computational Inference and Design of Statistical Investigations.
    Courses Taught at Boston University
  • Hypothesis Testing, Multivariate Statistical Analysis, Generalized Linear Models, Basic Statistics & Probability, Topics in High Dimensional Data Analysis, Design of Experiments.
    Courses Restructured at Boston University
  • Currently working on revamping the introductory sequence of Statistics courses (MA 213 and MA 214) under the RULES grant awarded by Boston University.
    Courses Taught at University of North Carolina
  • Principles of Statistical Inference, Principles of Statistical Inference (Distance Education)
    Courses Taught at Pennsylvania State University
  • Experimental Methods

    Student Mentoring


    Current Graduate Students
  • Kirsten Fairlie (M. Sc School of Mathematics and Statistics, University of Glasgow) Co-Advisor.
  • Chong Liu (Ph. D., Expected Aug 2013, Dept. of Mathematics & Statistics, Boston University) Thesis: Functional Data Modeling of Climate-Ecosystem Dynamics Role: First Reader.
  • Yansong Cheng (Ph. D., Expected Aug 2013, Dept. of Biostatistics, Boston University and School of Mathematics and Statistics, University of Glasgow) Role: First Reader.
    Past Undergraduate Students
  • Elizabeth Hunter, Supported by Boston University UROP fund and NSF REU Award No: #0934739.
    Past Students
  • Shu Yang (Ph. D., 2011, Dept. of Mathematics & Statistics) Thesis: Analysis of Network Type Data Using Statistical Methods Role: Second Reader.
  • Burton Shank (Ph. D., 2009, Dept. of Biology) Thesis: Spatial Variation in Coral Reef community
    Role: Third Reader.
  • Ja-Yeon Jeong (Ph.D., 2009, Dept. of Computer Science, University of North Carolina at Chapel Hill) Thesis: Estimation of probability distribution on multiple anatomical object complex
    Role: Primary Statistics Advisor. Role: Fourth Reader.


    Professional Activities


    • Organizer: 2012 New England Statistics Symposium (NESS) to be held at Boston University.
    • Program Committee member: Workshop on Informatics Applications in Therapeutics at the 2011 IEEE International Conference on Bioinformatics and Biomedicine November 12-15 2011, Atlanta.
    • Organizer and chairperson of sessions in scientific meetings.
      • Joint Statistical Meetings, Vancouver, 2010
      • New England Statistics Symposium, 2009
      • Joint Statistical Meetings, Denver, 2008
      • WNAR/IMS Annual Meeting, Irvine, 2007
      • International Conference on Multivariate Statistical Methods, Kolkata, India, 2006
      • WNAR/IMS Annual Meeting, Fairbanks, 2005
      • Joint Annual Meeting of the Interface and the Classification Society of North America, 2005
    • Organizer of NSF sponsored Undergraduate workshop in statistics, North Carolina State University, 2005
    • Reviewer for several peer-reviewed journals:
      • The Annals of Statistics
      • Journal of the American Statistical Association
      • Journal of the Royal Statistical Society (Series B)
      • Multivariate Analysis
      • Statistical Methodology
      • Australian and New Zealand Journal of Statistics
      • Sankhya
      • Computational Statistics and Data Analysis
      • The Annals of Applied Statistics


    Professional Memberships


    2001-present American Statistical Association (ASA)
    2001-present Institute of Mathematical Statistics (IMS)
    1999-2007 Mathematical Association of America (MAA)