Consortium on Individual Development


Limiting data loss in infant EEG

Written by Bauke van der Velde (PhD candidate Utrecht University)

“Studying the features of data loss can help prevent biases in gathered data and increase our understanding of how to study infants more effectively in the future”


Studying the infant brain with electroencephalography (EEG) can be difficult. It is impossible to tell an infant to sit still or be attentive for the entirety of an experiment. This can lead to problems with the data (like noise or limited amounts of data), which makes studying infants costly. We wanted to understand which factors influence this loss of data: some related to the child (e.g. gender, age, or head shape) and some related to the testing environment (e.g. time of testing, season of testing, or the research assistant present).

Small-scale studies often try to keep these factors similar across infants to limit their effects on data loss. For example, they only test in the morning or have just one research assistant do all the testing. However, because the YOUth cohort study is so large, it is impossible to keep many of these factors similar across infants. This makes the data of the YOUth project more susceptible to differences in data loss between infants, but also a perfect guinea pig on which to test the relationship between variation in these factors and data loss.


We wanted to relate data loss to factors related either to the child or to the testing environment in order to:

  • Increase insight into the variety of factors influencing data loss
  • Help future studies decide which factors to keep similar in their design


We tested 1278 5-month-old infants, 1048 10-month-old infants, and 104 3-year-old toddlers, and calculated the loss of data for every child after analysis. We related this loss of data to the following factors: gender, age, head shape, general well-being (was the infant sick or tired?), time of the EEG experiment (first experiment of the day?), season of testing, and which research assistant was present.
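As a rough illustration of this kind of analysis, the sketch below computes a data-loss score per child and averages it for each level of one factor. It is not the procedure used in the study: the definition of data loss (the fraction of presented trials that did not survive analysis), the field names, and all values are assumptions made up for the example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical session records; every value here is invented for illustration.
sessions = [
    {"infant": "A", "time_of_day": "morning", "trials_presented": 100, "trials_kept": 80},
    {"infant": "B", "time_of_day": "morning", "trials_presented": 100, "trials_kept": 70},
    {"infant": "C", "time_of_day": "afternoon", "trials_presented": 100, "trials_kept": 50},
    {"infant": "D", "time_of_day": "afternoon", "trials_presented": 80, "trials_kept": 20},
]


def data_loss(session):
    """Fraction of presented trials lost (an assumed definition of data loss)."""
    return 1 - session["trials_kept"] / session["trials_presented"]


def mean_loss_by(records, factor):
    """Average per-child data loss for each level of the given factor."""
    groups = defaultdict(list)
    for record in records:
        groups[record[factor]].append(data_loss(record))
    return {level: mean(values) for level, values in groups.items()}


loss_by_time = mean_loss_by(sessions, "time_of_day")
for level, loss in loss_by_time.items():
    print(f"{level}: mean data loss {loss:.1%}")
```

The same helper could be called with any other recorded factor, such as season or research assistant, as long as it is stored as a field on each record.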


We found that a very wide array of factors influenced data loss. Some of these were not very surprising:

  • Research assistants showed great differences in data loss, but nearly all research assistants got better over time. The more experiments a research assistant had performed, the more likely it was that the infant had little data loss.
  • Older participants had lower data loss than younger ones
  • Testing early in the morning led to lower data loss
  • Longer experiments led to increased data loss towards the end

Others were less expected:

  • Boys had less data loss than girls
  • Testing during summer and spring led to lower amounts of data loss
  • Infants with unique head shapes had severely affected data loss


What this study mainly shows is that a very wide range of factors influence data loss in infant EEG studies. Some of these factors cannot be controlled. For example, excluding a certain gender or certain head shapes is often not an option. But it is good to realise that these factors can bias your data. What if, for example, you study premature birth and find that prematurely born infants have a more unusual range of head shapes than their controls? Or what if you study a group of infants twice, once in the winter and once in the summer, to see how the brain has developed over time? In these cases, biases could creep into your results because data loss differs between groups or sessions. Therefore, it is very important to always keep data loss in mind when designing and analysing studies.


Because of this, we gave the following advice in the study:

  • Pay close attention to possible biases in data loss between your experimental groups. Control for these biases during the analysis.
  • Limit data loss, where possible, by
    1. testing early in the day;
    2. testing during the summer and spring months;
    3. keeping the experiment short;
    4. limiting the number of research assistants.
  • Always report measures for EEG data loss and pay extra attention to reporting them separately for your different experimental groups
  • Replicate these findings in other labs to better understand how well these findings generalize to other studies and populations
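To make the reporting advice concrete, here is a minimal sketch of how data loss could be summarised separately per experimental group. The group names, the per-infant values, and the choice of a standardised mean difference (Cohen's d with a pooled standard deviation) are assumptions for illustration only, not the measures used in the paper.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical per-infant data loss (fraction of data lost); values are invented.
loss_by_group = {
    "control": [0.10, 0.20, 0.30],
    "preterm": [0.30, 0.40, 0.50],
}


def summarise(values):
    """Sample size, mean, and sample standard deviation for one group."""
    return {"n": len(values), "mean": mean(values), "sd": stdev(values)}


def standardised_difference(a, b):
    """Cohen's d for b minus a, using the pooled sample standard deviation."""
    pooled_var = (
        (len(a) - 1) * stdev(a) ** 2 + (len(b) - 1) * stdev(b) ** 2
    ) / (len(a) + len(b) - 2)
    return (mean(b) - mean(a)) / sqrt(pooled_var)


report = {group: summarise(values) for group, values in loss_by_group.items()}
d = standardised_difference(loss_by_group["control"], loss_by_group["preterm"])

for group, stats in report.items():
    print(f"{group}: n={stats['n']}, mean={stats['mean']:.2f}, sd={stats['sd']:.2f}")
print(f"standardised difference in data loss: {d:.2f}")
```

A large difference between groups would be a signal to control for data loss in the main analysis, for example by including it as a covariate.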

More information

Bauke van der Velde & Caroline Junge (2020). Limiting data loss in infant EEG: putting hunches to the test. Developmental Cognitive Neuroscience. DOI: 10.1016/j.dcn.2020.100809

This paper is part of a special issue of Developmental Cognitive Neuroscience about the Consortium on Individual Development. For an overview of all papers, see the special issue.

YOUth is part of the Utrecht University research theme Dynamics of Youth and part of UMC Utrecht Brain Center