Highly Irregular Functional Generalized Linear Regression with Electronic Health Records

Abstract

This work presents a new approach, called Multiple Imputation of Sparsely-sampled Functions at Irregular Times (MISFIT), for fitting generalized functional linear regression models with sparsely and irregularly sampled data. Current methods do not allow for consistent estimation unless one assumes that the number of observed points per curve grows sufficiently quickly with the sample size. In contrast, MISFIT is based on a multiple imputation framework, which, as we demonstrate empirically, has the potential to produce consistent estimates without such an assumption. Just as importantly, it propagates the uncertainty that comes from not having completely observed curves, allowing for a more accurate assessment of the uncertainty of parameter estimates, something that most current methods cannot accomplish. This work is motivated by a longitudinal study on macrocephaly, or atypically large head size, in which electronic medical records allow for the collection of a great deal of data. However, the sampling is highly variable from child to child. Using MISFIT, we are able to clearly demonstrate that the development of pathologic conditions related to macrocephaly is associated with both the overall head circumference of the children and the velocity of their head growth.

Publication
In Journal of the Royal Statistical Society: Series C (Applied Statistics)
Justin Petrovich
Associate Professor of Statistics and Business Data Analytics

My research interests include functional data analysis, longitudinal data analysis, and applied statistics.