Giles Hooker
Associate Professor:

Department of Statistical Science
Department of Biological Statistics and Computational Biology
Cornell University

Department of Public Health
Weill Cornell Medical College

Director of Graduate Studies:
Graduate Field of Statistics

Curriculum Vitae
1186 Comstock Hall
Cornell University
Ithaca, NY 14853

Phone: (1 607) 255 1638
Fax: (1 607) 255 4698

giles [dot] hooker [at] cornell [dot] edu


My blog, Of Models and Meanings, is a collection of musings on the philosophy of mathematical modeling and what makes a model "interpretable".


My book, Functional Data Analysis in R and Matlab, with Jim Ramsay and Spencer Graves, is now out.

I gave a short course based on this material at the International Workshop on Statistical Modeling in June, 2010. Here is a handout and computer lab from it.


For a bit of amusement, some creativity from long ago: "The Stanford Statistics Songbook: A Musical Tribute". Technical Report, Department of Statistics, Stanford University.

Research Interests:

  • Data analysis for dynamical systems and differential equations
  • Functional data analysis
  • Machine learning and data mining
  • Disparity-based inference
    My research focusses on a number of issues within these fields. I am particularly interested in developing and extending the methods of functional data analysis for examining the evolution of systems in terms of nonlinear differential equations. This involves estimating parameters for such equations, diagnosing when and why equations do not fit data well, and developing statistical theory to account for smooth perturbations of such systems.

    Please see my profiling webpages for Matlab code, manuals and webpage demonstrations. CollocInfer, a somewhat more general R package, has now been released; a user manual provides a guide to it.

    In machine learning, I focus on the problem of diagnostics and understanding the prediction functions that machine learning produces. Recent work in this area includes estimates for conditional density and quantile functions. I am also interested in analyzing the results of experiments in machine learning in terms of determining when and why particular methods were successful.

    Lastly, I am interested in robust inference via disparity-based methods such as Hellinger distance. These techniques have been in existence for 30 years or more for iid data; part of my research focusses on adapting these to regression, mixed-effects and Bayesian frameworks.

    I also have a number of articles on the phenomenon of "paradoxical results" in multidimensional educational testing.

    Teaching:

    BTRY 6940: Inference in Nonlinear Dynamical Systems, Fall 2012.

    BTRY 3520: Statistical Computing, Spring 2012/2013.

    BTRY 7180: Generalized Linear Models, Fall 2011.

    BTRY 6020: Statistical Methods II, Spring 2009/2010/2011.

    BTRY 6150: Applied Functional Data Analysis, Fall 2008.

    CSCU Workshop: Introduction to Functional Data Analysis, October 13/14, 2011 and March 27/28, 2008.

    BTRY 694: Theory of Multivariate Statistics, Spring 2008/Fall 2009

    BTRY 694: Statistical Learning Theory, Fall 2007

    BTRY 694: Functional Data Analysis, Spring 2007

    Publications:

    Teppo Hiltunen, Nelson G. Hairston, Giles Hooker, Laura E. Jones and Stephen P. Ellner, "A newly discovered role of evolution in previously published consumer-resource dynamics", Ecology Letters, in press.

    Matthew W. McLean, Giles Hooker and David Ruppert, 2013, "Restricted Likelihood Ratio Tests for Linearity in Scalar-on-Function Regression", Statistics and Computing, in press.

    Giles Hooker and Anand Vidyashankar, 2014, "Bayesian Model Robustness via Disparities", TEST, in press. See also a .tar file of R code to reproduce all our results and an older ArXiv version.

    Matthew McLean, Giles Hooker, Ana-Maria Staicu, Fabian Scheipl and David Ruppert, 2014, "Functional Generalized Additive Models", Journal of Computational and Graphical Statistics, 23(1):249-269.

    Maria Asencio, Giles Hooker and H. Oliver Gao, 2013, "Functional Convolution Models", Statistical Modeling, 14(4):1-21.

    Giles Hooker and Stephen P. Ellner, 2013, "Goodness of Fit in Nonlinear Dynamics: Mis-specified Rates or Mis-specified States?", under review.

    Cecilia Earls and Giles Hooker, 2013, "Bayesian Covariance Estimation and Inference in latent Gaussian Process Models", Statistical Methodology, in press.

    Giles Hooker, 2013, "On the Identifiability of the Functional Convolution Model", Technical Report BU-1681-M, Department of Biological Statistics and Computational Biology, Cornell University.

    Yuefeng Wu and Giles Hooker, 2013, "Hellinger Distance and Bayesian Non-parametrics: Hierarchical Models for Robust and Efficient Bayesian Inference", under review.

    Giles Hooker, 2013, A review of "Boosting: Foundations and Algorithms" by Schapire and Freund, Journal of the American Statistical Association, 108(502):750-754.

    Giles Hooker, 2013, "Consistency, Efficiency and Robustness of Conditional Disparity Methods", under review.

    Matthew W. McLean, Fabian Scheipl, Giles Hooker, Sonja Greven and David Ruppert, 2013, "Bayesian Functional Generalized Additive Models with Sparsely Observed Covariates", under review.

    Yin Lou, Rich Caruana, Johannes Gehrke and Giles Hooker, 2013, "Accurate Intelligible Models with Pairwise Interactions", KDD'13.

    S.A. Jesty, S.W. Jung, J.M. Cordeiro, T.M. Gunn, J.M. Di Diego, S. Hemsley, B.G. Kornreich, G. Hooker, C. Antzelevitch, N.S. Moise, 2013, "Cardiomyocyte calcium cycling in a naturally occurring German shepherd dog model of inherited ventricular arrhythmia and sudden cardiac death", Journal of Veterinary Cardiology, 15(1):5-14.

    Robert D. Gibbons, Giles Hooker, Matthew D. Finkelman, David J. Weiss, Paul A. Pilkonis, Ellen Frank, Tara Moore and David J. Kupfer, 2013, "Computerized Adaptive Diagnosis of Depression Using the CAD-MDD", Journal of Clinical Psychiatry, 74(7):669-674.

    Kevin K. Lin, Giles Hooker and Bruce Rogers, 2012, "Control Theory and Experimental Design in Diffusion Processes", under review.

    Leifur Thorbergsson and Giles Hooker, 2012, "Experimental Design for Partially Observed Markov Decision Processes", under review.

    Giles Hooker and James O. Ramsay, 2012. "Learned-Loss Boosting." Computational Statistics and Data Analysis, 56:3935-3944. Matlab software is also available.

    David Campbell, Giles Hooker and Kim McAuley, 2012, "Parameter Estimation in Differential Equation Models with Constrained States", Journal of Chemometrics, 56:322-332.

    Giles Hooker and Saharon Rosset, 2012, "Prediction-Focussed Regularization Using Data-Augmented Regression", Statistics and Computing, 1:237-349. Simulation code is available.

    Chong Liu, Surajit Ray, Giles Hooker and Mark Friedl, 2012, "Functional Factor Analysis for Periodic Remote Sensing Data", Annals of Applied Statistics, 6:601-624.

    Giles Hooker, Stephen P. Ellner, Laura de Vargas Roditi and David J. D. Earn, 2011, "Parameterizing State-space Models for Infectious Disease Dynamics by Generalized Profiling: Measles in Ontario", Journal of the Royal Society Interface, 8:961-975. Matlab code to conduct the analysis is available.

    Marija Zeremski, Giles Hooker, Marla A. Shu, Emily Winkelstein, Queenie Brown, Don C. Des Jarlais, Leslie H. Tobler, Barbara Rehermann, Michael P. Busch, Brian R. Edlin, and Andrew H. Talal, 2011, "Induction of CXCR3- and CCR5-associated Chemokines during Acute Hepatitis C Virus Infection", Journal of Hepatology, 55:545-553.

    Giles Hooker and Stephen P. Ellner, 2010, "On Forwards Prediction Error", Technical Report BU-1679-M, Department of Biological Statistics and Computational Biology, Cornell University.

    J. O. Ramsay, G. Hooker and J. Nielsen, 2010, "Estimating the Quantile Function", under review.

    Matthew Finkelman, Giles Hooker and Jane Wang, 2010, "Prevalence and Magnitude of Paradoxical Results in Multidimensional Item Response Theory". Journal of Educational and Behavioral Statistics, 35:744-761.

    Giles Hooker, 2010. "On Separable Tests, Correlated Priors and Paradoxical Results in Multidimensional Item Response Theory". Psychometrika, 75:694-707. Manuscript available upon request.

    Daniel Fink, Wesley M. Hochachka, Benjamin Zuckerberg, David W. Winkler, Ben Shaby, M. Arthur Munson, Giles Hooker, Mirek Riedewald, Daniel Sheldon and Steve Kelling, 2010, "Spatiotemporal Exploratory Models for Broad-scale Survey Data", Ecological Applications, 20:2131-2147.

    Giles Hooker, 2010, "Comments on: Dynamic Relations for Sparsely Sampled Gaussian Processes", TEST, 19:50-53.

    Ercan Atam and Giles Hooker, 2010, "An Identification-based State Estimation Method for a Class of Nonlinear Systems". J. Systems and Control Engineering, 224:349-359.

    Giles Hooker and Matthew Finkelman, 2009. "Paradoxical Results and Item Bundles". Psychometrika, 75:249-271. Manuscript available upon request.

    Giles Hooker, Matthew Finkelman and Armin Schwartzman, 2009, "Paradoxical Results in Multidimensional Item Response Theory". Psychometrika, 74:419-442. Slides from a recent talk. Manuscript available upon request.

    Matthew Finkelman, Giles Hooker and Jane Wang, 2009, "Unidentifiability and Lack of Monotonicity in the Multidimensional Three-Parameter Logistic Model". Technical Report BU-1678-M, Department of Biological Statistics and Computational Biology, Cornell University.

    S. Kelling, W. Hochachka, D. Fink, M. Riedewald, R. Caruana, M. Ballard and G. Hooker, 2009, "Data Intensive Science: A New Paradigm for Diversity Studies". Bioscience, 59:613-620.

    Giles Hooker, 2009. "Forcing Function Diagnostics for Nonlinear Dynamics". Biometrics, 65:928-936

    Gelzer, A., M. L. Koller, N. F. Otani, J. J. Fox, M. W. Enyeart, G. Hooker, M. L. Riccio, C. R. Bartoli and R. F. Gilmour, 2008, "Dynamic Mechanisms for Initiation of Ventricular Fibrillation in vivo", Circulation, 118:1123-1129.

    Giles Hooker and Larry Biegler, 2007. "IPOPT and Neural Dynamics: Tips, Tricks and Diagnostics", Technical Report BU-1676-M, Department of Biological Statistics and Computational Biology, Cornell University. A demonstration bundle provides data and AMPL code from this estimation.

    James Ramsay, Giles Hooker, David Campbell and Jiguo Cao, 2007. "Parameter Estimation for Differential Equations: A Generalized Smoothing Approach". Journal of the Royal Statistical Society, 69:741-796 (with discussion).

    Giles Hooker, 2007, "Theorems and Calculations for Smoothing-based Profiled Estimation of Differential Equations", Technical Report BU-1671-M, Department of Biological Statistics and Computational Biology, Cornell University.

    Giles Hooker, 2007. "Generalized Functional ANOVA Diagnostics for High Dimensional Functions of Dependent Variables". Journal of Computational and Graphical Statistics, 16:709-732.

    Robert Norris, Jessica Ngo, Karen Nolan and Giles Hooker, 2005. "Volunteers are Unable to Properly Apply Pressure Immobilization in a Simulated Snakebite Scenario". Journal of Wilderness and Environmental Medicine, 16:16-21.

    Armin Schwartzman, Matthew Finkelman and Giles Hooker, 2004. "The Stanford Statistics Songbook: A Musical Tribute". Technical Report, Department of Statistics, Stanford University.

    Giles Hooker, 2004. "Diagnostics and Extrapolation in Machine Learning". PhD Thesis, Department of Statistics, Stanford University.

    Giles Hooker and Matthew Finkelman, 2004. "Sequential Analysis for Learning Modes of Browsing". WEBKDD 2004: Proceedings of the Sixth International Workshop on Knowledge Discovery from the Web.

    Giles Hooker, 2004. "Diagnosing Extrapolation: Tree-Based Density Estimation". Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.

    Giles Hooker, 2004. "Discovering ANOVA Structure in Black Box Functions". Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.

    Giles Hooker and Fuliang Weng, 2003. "Subset Selection in Large, Sparse Systems: An application of the Forward Stagewise approach to Natural Language Processing". Technical Report, Robert Bosch Corp.

    Michael Shirts, Eric Bair, Giles Hooker and Vijay Pande, 2003. "Equilibrium Free Energies from Nonequilibrium Estimates Using Maximum Likelihood Methods". Physical Review Letters, 91(14):140601.

    Giles Hooker, 1999. "Developing a Spline Smoothed Density". Honours Thesis, Department of Mathematics, Australian National University.

    Markus Hegland, Giles Hooker and Stephen Roberts, 1999. "Finite Element Thin Plate Splines in Density Estimation". Computational Techniques and Applications: Proceedings of the Ninth Biennial Conference: CTAC99. Journal of the Australian Mathematical Society, Series B (special issue).

    Prospective Publications:

    These are a range of paper ideas, some of them more likely to turn into papers than others, some of them larger projects than others, that I think worthwhile. Anybody who is interested in them, has relevant data, or knows of authors that have beaten me to it is highly encouraged to contact me. Many of these are also potential graduate student projects.

    "Boosting for Conditional Density and Quantile Estimates": describes a boosting scheme to estimate the conditional density of a response given features at each point in feature space; extensions to directly estimating quantile functions are possible.

    "Disparity Estimation in Nonlinear and Nonparametric Regression". Considers ways to perform Hellinger-distance and other disparity-based estimation for nonlinear regression and extensions to non-parametric regression. With Anand Vidyashankar.

    "Semi-Parametric Boosting": generalizes the ideas in "Learned-Loss Boosting" to a semi-parametric context in which there is an infinite dimensional non-parametric component. Not clear how well this would work, but worth a try.

    "Experiments in Extrapolation and Truncation": Considers a number of post-hoc truncation methods to deal with extrapolation. I will set up some careful experiments and use real-world data to determine what aspects of extrapolation are most salient.

    "A Tale of Two ANOVAs": suppose we have some set of prediction functions for the same task and we want to evaluate where those functions differ. These differences can be measured point-wise using the functional ANOVA in the sense of Ramsay and Silverman. These point-wise differences then define a high dimensional function that may be investigated through a functional ANOVA in the sense of Gu and Wahba. Together, they may provide some insights into what distinguishes the output of different machine learning algorithms.