Deep learning software makes sense of data like images or audio by looking for statistical patterns it has extracted from past data. Apple’s Photos app can automatically create an album of your pets because it has deep learning algorithms trained on thousands or millions of labeled images of cats and dogs. One way to make a robot grasp objects is to program it to try different approaches and use deep learning on its successes and failures to determine a good claw hold.
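The grasp-by-trial-and-error idea can be sketched in a few lines. This toy simulation is purely illustrative: the "physics" rule, the parameter names, and the averaging step are all invented here, and the averaging stands in for the deep network that a real system would train on its successes and failures.

```python
# Toy sketch of learning a grasp from trial and error.
# The success rule below is a hypothetical stand-in for real physics.
import random

random.seed(0)

def simulated_grasp(angle, width):
    # Hypothetical rule: grasps succeed when the claw is roughly
    # perpendicular to the object and not opened too wide.
    return 60 < angle < 120 and width < 5.0

# Trial phase: try random grasps and record each outcome.
trials = []
for _ in range(500):
    angle = random.uniform(0, 180)
    width = random.uniform(0, 10)
    trials.append((angle, width, simulated_grasp(angle, width)))

# "Learning" phase: the simplest possible model -- average the
# parameters of the successful grasps and aim near that point.
# (A real system would train a neural network on this data instead.)
successes = [(a, w) for a, w, ok in trials if ok]
best_angle = sum(a for a, _ in successes) / len(successes)
best_width = sum(w for _, w in successes) / len(successes)

print(round(best_angle), round(best_width))
```

The learned grasp lands near 90 degrees with a modest claw width, which the toy success rule accepts; the point is only that the robot never needed the rule itself, just the record of which attempts worked.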
This kind of statistical pattern matching has found many profitable uses. But George points out that it doesn’t let computers reason about the world, intuit the cause of events, or handle situations outside their past experience. “Just scaling up deep learning is not going to solve those fundamental limitations,” George says. “We’ve made a conscious decision to find and tackle those problems.” Vinod Khosla, the billionaire investor whose firm Khosla Ventures has invested $25 million in Vicarious, says he had trouble finding AI experts to help vet the company as a potential investment. “Everyone knows deep learning, but not this other stuff,” Khosla says.
A provocative paper Vicarious presented in 2017 at a leading deep learning conference illustrates its approach to AI. The company designed experiments that exposed the inflexibility of deep learning software from Alphabet’s DeepMind research group that learned to play vintage Atari games such as Breakout better than top gamers. Vicarious showed how these superhuman AI players crumbled if a game was trivially altered, such as by increasing the brightness of colors or subtly changing the size of objects.
The startup’s own software could handle such changes because they did not affect its understanding of the mechanisms at work in the game. Though the software also learned from past data, it was primed to pick up the causal relationships between objects and events in the game and could use that knowledge to adapt to small changes it hadn’t previously experienced.
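The contrast can be made concrete with a toy example. This sketch is not Vicarious’ or DeepMind’s code; the one-dimensional "frame" and both policies are invented here to show why matching raw pixel values breaks under a brightness change, while a policy that first extracts *where* the object is does not.

```python
# Toy sketch: pixel-level pattern matching vs. an object-level description.
# Both "policies" and the tiny frames are hypothetical, for illustration.

# A five-pixel "frame": 0 = background, a positive value = the ball.
frame = [0, 0, 7, 0, 0]
brighter_frame = [0, 0, 9, 0, 0]  # the same scene, with a brighter ball

def pixel_policy(f, template=(0, 0, 7, 0, 0)):
    # Acts only when the raw pixels exactly match a memorized template.
    return "move" if tuple(f) == template else "no idea"

def object_policy(f):
    # First extracts the object's position, then decides based on that.
    position = next(i for i, v in enumerate(f) if v > 0)
    return "move" if position == 2 else "no idea"

print(pixel_policy(frame))            # -> "move"
print(pixel_policy(brighter_frame))   # -> "no idea": the template breaks
print(object_policy(frame))           # -> "move"
print(object_policy(brighter_frame))  # -> "move": invariant to brightness
```

The brightness change leaves the object's position, and therefore the causal structure of the scene, untouched; only the policy that reasons at that level survives the tweak.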
Brenden Lake, an assistant professor at NYU, says the paper demonstrated something the field of AI needs to figure out, as talk grows of deep learning hitting its limits. “A key part of human intelligence is building flexible models of the world that can be used in a variety of situations,” Lake says. “I think people are realizing you can’t get there with large-scale pattern recognition systems trained on large data sets for one specific task.”
A Flick, and a Miss
The robotic arm, taller than a person, stacking boxes in one corner of Vicarious’ cavernous factory looks like it’s playing a particularly boring videogame. Whirring and hissing, it picks up cubic boxes and stacks them into a neat grid on a wooden pallet, a common warehouse and factory operation called palletizing. Nearby, a line of robot arms sorts cosmetics into boxes with flair, using firm flicks of its suction fingers to toss items like tubes of lotion into boxes just beyond its reach.
Vicarious is not the only startup using AI to teach industrial robots new tricks. Many, including some featured in WIRED, rely heavily on deep learning. Alphabet recently unveiled a fleet of robots that rove around two of its offices collecting waste and sorting it into trash, recycling, and compostable items.
Phoenix says his robots are distinguished by their flexibility—born of algorithms like those that allowed his Atari bots to adapt to tweaks to a game. Robotic arms that stack pallets are usually paired with expensive feeders that position every incoming box or bin identically. Vicarious’ software is flexible enough to pick up boxes that aren’t perfectly positioned, Phoenix says, and can grab them from an ordinary table. It takes a reporter only about a minute using a touchscreen interface to reprogram the arm to palletize its boxes into a squiffy, blocky take on the WIRED logo.