Similar to how infants hold expectations for how objects should move and interact with each other, the model, called ADEPT, registers "surprise" when objects in simulations move in unexpected ways, such as rolling behind a wall and not reappearing on the other side. The model, say the researchers, could be used to help build smarter artificial intelligence (AI) and, in turn, provide information to help scientists understand infant cognition.
"By the time infants are 3 months old, they have some notion that objects don’t wink in and out of existence, and can't move through each other or teleport," says Kevin A. Smith, a research scientist in the Department of Brain and Cognitive Sciences (BCS) and a member of the Center for Brains, Minds, and Machines (CBMM). "We wanted to capture and formalize that knowledge to build infant cognition into artificial-intelligence agents. We’re now getting near human-like in the way models can pick apart basic implausible or plausible scenes."
The model observes objects moving around a scene and makes predictions about how the objects should behave, based on their underlying physics. While tracking the objects, the model outputs a signal at each video frame that correlates to a level of "surprise." The bigger the signal, the greater the surprise. If an object dramatically mismatches the model's predictions, for example by vanishing or teleporting across a scene, the model's surprise level spikes.
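This per-frame signal can be sketched as a mismatch score between the predicted and observed object states. The function below is an illustrative assumption, a simple Gaussian-style mismatch, not the researchers' actual formulation:

```python
def surprise(predicted_pos, observed_pos, sigma=1.0):
    """Surprise as the mismatch between predicted and observed position,
    scored like a Gaussian negative log-likelihood (illustrative only,
    not ADEPT's exact measure)."""
    if observed_pos is None:
        # Object vanished entirely (e.g., never reappeared from behind
        # a wall): assign a large fixed surprise.
        return 100.0
    dist_sq = sum((p - o) ** 2 for p, o in zip(predicted_pos, observed_pos))
    return dist_sq / (2 * sigma ** 2)

# A ball rolling smoothly: observation lands near the prediction,
# so surprise stays small.
smooth = surprise((1.0, 0.0), (1.1, 0.0))

# The same ball "teleporting" across the scene: surprise spikes.
teleport = surprise((1.0, 0.0), (9.0, 0.0))
```

Plotted over time, a signal like this stays flat for plausible scenes and jumps at the frame where an implausible event occurs.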
ADEPT relies on two modules: an "inverse graphics" module that captures object representations (such as shape, pose, and velocity) from raw images, and a "physics engine" that predicts the objects' future representations from a distribution of possibilities. ADEPT requires only some approximate geometry of each shape to function, which helps the model generalize predictions to new objects, not just those it's trained on.
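The two-stage pipeline can be sketched as follows. Both functions are toy stand-ins (the names and the constant-velocity dynamics are assumptions for illustration, not ADEPT's learned modules):

```python
def inverse_graphics(frame):
    """Extract approximate object representations from a raw frame.
    A stub: here frames are already dicts of object states, whereas
    ADEPT's module works from raw images."""
    return frame["objects"]

def physics_engine(objects, dt=1.0):
    """Roll each object's state forward one step under simple
    constant-velocity dynamics (a deterministic toy stand-in for
    ADEPT's stochastic physics engine, which predicts a
    distribution over future states)."""
    return [
        {"pos": tuple(p + v * dt for p, v in zip(obj["pos"], obj["vel"])),
         "vel": obj["vel"]}
        for obj in objects
    ]

# One object moving right at 1 unit per frame.
frame = {"objects": [{"pos": (0.0, 0.0), "vel": (1.0, 0.0)}]}
predicted = physics_engine(inverse_graphics(frame))
```

Comparing `predicted` against what the next frame actually shows is what produces the surprise signal: a close match means a plausible scene, a large mismatch means something physically unexpected happened.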
"It doesn't matter if an object is a rectangle or circle, or if it's a truck or a duck," says Smith. "ADEPT just