In a world of conflicting views, let’s draw attention to something we can all agree on: if I show you my pen and then hide it behind my back, my pen still exists – even if you can’t see it anymore. We can all agree that it still exists and is probably the same shape and color as it was before it went behind my back. This is just common sense.
These common sense laws of the physical world are universally understood by humans. Even two-month-old babies grasp this concept. But scientists are still puzzled by some aspects of how we achieve this fundamental understanding. And we have yet to build a computer that can compete with the common sense of a normally developing child.
New research by Luis Piloto and colleagues at Princeton University — which I reviewed for an article in Nature Human Behaviour — takes a step toward filling this gap. The researchers created a deep learning artificial intelligence (AI) system that gained insight into some of the common sense laws of the physical world.
The findings will help build better computer models that simulate the human mind, by approaching a task with the same assumptions as a baby.
Usually, AI models start with a clean slate and are trained on data with many different examples, from which the model constructs knowledge. But research on infants suggests this isn’t what babies do. Instead of building knowledge from scratch, babies start with some principled expectations about objects.
For example, they expect that if they pay attention to an object that is then hidden behind another object, the first object will persist. This is a core assumption that points them in the right direction. Their knowledge then becomes more refined with time and experience.
The exciting finding from Piloto and colleagues is that a deep learning AI system modeled after what babies do outperforms one that starts with a clean slate and tries to learn based on experience alone.
Sliding cubes and balls bouncing off walls
The researchers compared both approaches. In the blank-slate version, the AI model was shown several visual animations of objects. In some examples, a cube would slide down a slope. In others, a ball bounced off a wall.
The model detected patterns from the various animations and was then tested for its ability to predict outcomes in new visual animations of objects. This performance was compared to a model that had principled expectations built in before it saw any visual animations.
These principles were based on the expectations babies have about how objects behave and interact with each other. For example, babies expect that two objects should not mix.
If you show a baby a magic trick where you violate this expectation, they can detect the magic. They reveal this knowledge by looking significantly longer at events with unexpected or “magic” results, compared to events whose results are expected.
Babies also expect an object not to simply blink in and out of existence, and they can detect when this expectation is violated too.
Piloto and colleagues found that the deep-learning model that started with a clean slate did a good job, but the object-centered encoding model, inspired by infant cognition, fared significantly better.
The latter model could more accurately predict how an object would move, was more successful in applying expectations to new animations, and learned from a smaller set of samples (e.g. it succeeded after the equivalent of 28 hours of video).
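The way such models are scored parallels the looking-time measure used with infants: a model’s “surprise” at an animation can be read off from its prediction error, just as a baby’s surprise shows up as longer looking. Here is a minimal sketch of that idea using toy one-dimensional positions; every name and number below is illustrative, not taken from the paper.

```python
# Hypothetical sketch (not the authors' code): treat a model's "surprise"
# at an animation as its prediction error, the computational analogue of a
# baby looking longer at a "magic" event.

def surprise(predicted, observed):
    """Mean absolute error between predicted and observed object positions,
    averaged over the frames of an animation."""
    assert len(predicted) == len(observed)
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(predicted)

# Toy example: a model that expects constant velocity predicts positions
# 0, 1, 2, 3 for a ball moving one unit per frame.
predicted = [0, 1, 2, 3]

physical = [0, 1, 2, 3]  # the ball behaves as expected
magic = [0, 1, 2, 0]     # the ball teleports backwards: a "magic" event

print(surprise(predicted, physical))  # low surprise for the expected event
print(surprise(predicted, magic))     # higher surprise for the impossible one
```

A model with good built-in expectations should show a large gap between these two scores, which is exactly how a violation-of-expectation test separates a physics-savvy learner from a blank slate.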
An innate understanding?
Obviously, learning through time and experience is important, but it’s not the whole story. This research by Piloto and colleagues contributes to understanding the age-old question of what can be innate in humans and what can be learned.
In addition, it defines new frontiers for the role that perceptual data can play when it comes to the acquisition of knowledge through artificial systems. And it also shows how studies on babies can help build better AI systems that simulate the human mind.