Even before infants begin to speak, hearing language promotes object categorization. Hearing the same label, “That’s a dog!”, applied to a diverse set of objects -- a collie, a terrier, a pug -- promotes infants’ acquisition of object categories (e.g., the category “dog”). But in infants’ daily lives, most objects go unlabeled. Infants are constantly seeing new things, and even the most determined caregivers cannot label each one.
How can we reconcile the power of labels with their relative scarcity? New research from Northwestern University reveals that infants can use even a few labeled examples to spark the acquisition of object categories. Those labeled examples lead infants to initiate the process of categorization, after which they can integrate all subsequent objects, labeled or unlabeled, into their evolving category representation.
This strategy, known as “semi-supervised learning” (SSL), has been documented extensively in machine learning. Labeled examples provide an initial outline of a category, and subsequent, unlabeled examples flesh out that outline, making sure it represents a broad range of category members.
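To make the machine-learning idea concrete, here is a minimal Python sketch of one simple way it can work. It is an illustration only, not the model used in the study and not a claim about how infants actually learn. The `CategoryLearner` class, the two-number object features, and the `radius` similarity threshold are all hypothetical choices made for this example. Labeled objects create and refine a category "prototype" (a running average); unlabeled objects are folded in only if they look similar enough to that prototype.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class CategoryLearner:
    """Toy learner that tracks one category as a running-average prototype."""
    radius: float = 1.5                      # assumed similarity threshold
    prototype: Optional[List[float]] = None  # no category outline yet
    count: int = 0                           # objects absorbed so far

    def _absorb(self, x: Sequence[float]) -> None:
        # Incremental mean update: the prototype drifts toward each new object.
        if self.prototype is None:
            self.prototype = list(x)
            self.count = 1
        else:
            self.count += 1
            self.prototype = [
                p + (xi - p) / self.count for p, xi in zip(self.prototype, x)
            ]

    def observe(self, x: Sequence[float], labeled: bool) -> bool:
        """Return True if the object was added to the category."""
        if labeled:
            # A label always anchors the object to the category.
            self._absorb(x)
            return True
        if self.prototype is None:
            # No outline yet, so an unlabeled object has nothing to attach to.
            return False
        dist = sum((p - xi) ** 2 for p, xi in zip(self.prototype, x)) ** 0.5
        if dist <= self.radius:
            # Similar enough: the unlabeled object fleshes out the outline.
            self._absorb(x)
            return True
        return False


def learn(objects: Sequence[Sequence[float]], n_labeled: int) -> CategoryLearner:
    """Show objects in order; only the first n_labeled carry a label."""
    learner = CategoryLearner()
    for i, obj in enumerate(objects):
        learner.observe(obj, labeled=(i < n_labeled))
    return learner


# Six made-up objects from one category, described by two made-up features.
OBJECTS = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.4), (1.4, 1.2), (0.6, 0.6), (1.0, 1.6)]

if __name__ == "__main__":
    semi = learn(OBJECTS, n_labeled=2)
    print(semi.count, semi.prototype)
```

With only the first two objects labeled, this toy learner ends up with a prototype built from all six objects, the same one it would reach if every object were labeled. With no labels at all, it never forms a prototype.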
Northwestern researchers asked whether this efficient strategy also applies to 2-year-olds. To find out, they showed infants six objects from the same novel category, one infants had never seen before. They then varied whether and how these objects were labeled (“Look at the dax!”). Infants for whom all six objects were labeled successfully learned the category, but those who heard no labels failed. Critically, infants in the semi-supervised condition -- for whom only the first two objects were labeled -- succeeded, learning the new category just as successfully as if all the objects were labeled.
“These results suggest that semi-supervised learning can be quite powerful. Seeing just two labeled examples jump-starts infants’ category learning. Once they’ve heard a few objects receive the same label, infants can learn the rest on their own, with or without labels,” said Alexander LaTourrette, the lead author of the study and a doctoral candidate in cognitive psychology in the Weinberg College of Arts and Sciences at Northwestern.
Moreover, the timing of the labeling mattered. If the two labeling episodes came at the end of the learning phase, after infants had already seen the unlabeled objects, they failed to learn the category. This tells us that infants can use semi-supervised learning. They use the power of labeling to learn more from subsequent, unlabeled objects.
“This insight from machine learning sheds light on a paradox in infant development. How can labels be helpful to infants if they’re so rare? In semi-supervised learning, labels exert a powerful influence even if they are rare,” said Sandra Waxman, senior author of the study, director of the Infant and Child Development Center, faculty fellow in Northwestern’s Institute for Policy Research and the Louis W. Menk Chair in Psychology at Northwestern. “Naming objects certainly does promote early language and cognitive development. This new work shows how efficiently infants link objects and the words we use to describe them. Like our most powerful computers, infants do not need us to name every single object they see.”
“A little labeling goes a long way: Semi-supervised learning in infancy” was published this week in Developmental Science.