Supervised Learning with Missing Values
Seminar in Mathematics
Abstract: Missing data is a pervasive issue in statistical practice, appearing for a range of reasons such as device failure, participants declining to answer sensitive questions in polls, and ever-increasing data volume. In this presentation, I will briefly review the literature addressing missing data in an inferential framework, and then discuss the challenges posed by missing values in supervised learning. This includes an analysis of the consistency of the impute-then-regress procedure when used with powerful non-parametric methods such as random forests, as well as a description of the NeuMiss neural network architecture, which allows for joint learning of imputation and regression. Finally, I will illustrate the impact of methods developed in the causal inference field for estimating treatment effects from incomplete clinical data.
Practical information
- Informed public
- Free
- This event is internal
Organizer
- Institute of Mathematics
Contact
- Prof. Maryna Viazovska, Director