
Lead Image © BublikHaus / Adobe Stock

Treats-for-tricks works for training dogs — and apparently AI robots, too.

That's the takeaway from a new study out of Johns Hopkins University, where researchers have developed a training system that allowed a robot to quickly learn multi-step tasks in the real world by mimicking the way dogs learn new tricks.

Reinforcement Learning

One day, AI robots could clean our homes, care for our elderly, and do all of the dull, dirty, and dangerous jobs we don't want to do.

But the real world is complicated. Developers will need to train robots to learn on the job — it'd be impossible to program a dish-cleaning robot to recognize every possible dirty dish, for example, but it still needs to know what to do when an unfamiliar one turns up in the sink.

One way developers train AIs is by letting them explore a virtual world and "rewarding" them when they do something right. This technique is called reinforcement learning, and it's not unlike how we train dogs — they do a trick, they get a treat.

While it can be effective, reinforcement learning can also be time-consuming — the AI might try a lot of things before landing on the reward-worthy trick.

To overcome this limitation, the JHU team developed a new reinforcement learning framework they call Schedule for Positive Task (SPOT).

"The question here was how do we get the robot to learn a skill?" lead author Andrew Hundt said in a press release. "I've had dogs so I know rewards work and that was the inspiration for how I designed the learning algorithm."

See SPOT Stack

In the SPOT framework, the robot's "reward" isn't a tasty treat but numerical points. The "trick," meanwhile, is stacking multiple blocks on top of one another.

One way to speed up training, the researchers discovered, was to reward their AI for completing "sub tasks." This is the equivalent of training a dog to sit and giving it a treat when it starts to lower its rear: the dog didn't do exactly what you wanted, but it's on the right path.
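
For the curious, here is a minimal Python sketch of what partial credit could look like for a block-stacking task. The function names and point values are illustrative assumptions, not the JHU team's actual code. A "sparse" reward pays out only when the whole stack is finished, while a sub-task reward pays a fraction for each block already in place.

```python
def sparse_reward(stack_height: int, goal_height: int) -> float:
    """All-or-nothing: points only when the full stack is complete."""
    return 1.0 if stack_height >= goal_height else 0.0


def subtask_reward(stack_height: int, goal_height: int) -> float:
    """Partial credit: the fraction of the goal stack built so far.

    Hypothetical scoring rule chosen for illustration only.
    """
    if goal_height <= 0:
        raise ValueError("goal_height must be positive")
    # Cap at the goal so extra blocks do not earn extra points.
    return min(stack_height, goal_height) / goal_height


if __name__ == "__main__":
    # Two blocks stacked out of a four-block goal.
    print(sparse_reward(2, 4))   # no feedback yet
    print(subtask_reward(2, 4))  # halfway there
```

With the sparse version, the robot gets no signal until it happens to finish the entire stack. With the sub-task version, every correctly placed block tells it that it is on the right path.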

It also helped if the AI lost points for doing something that negated its previous progress, like knocking over the blocks after stacking them — this is called "progress reversal."
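
A rough Python sketch of that idea, under stated assumptions: the robot earns one point per newly stacked block and loses a fixed penalty whenever the stack gets shorter. The names and the size of the penalty are hypothetical, and the study's actual scoring may differ.

```python
def step_reward(prev_height: int, new_height: int, penalty: float = 1.0) -> float:
    """Score one action by how it changed the height of the stack.

    Hypothetical rule: +1 per block added, -penalty if progress was undone.
    """
    if new_height < prev_height:
        # Progress reversal: the stack is shorter than before the action.
        return -penalty
    return float(new_height - prev_height)


def episode_score(heights: list[int]) -> float:
    """Sum the step rewards over a sequence of observed stack heights."""
    total = 0.0
    for prev, new in zip(heights, heights[1:]):
        total += step_reward(prev, new)
    return total


if __name__ == "__main__":
    # The robot builds up to three blocks, knocks two off, then recovers one.
    print(episode_score([1, 2, 3, 1, 2]))
```

In this toy run the robot gains a point for each of its three successful placements and loses one for the knock-over, so the mistake shows up directly in its score.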

They also coded some common sense into the AI, pre-programming it with intuitions to avoid wasting time on dead ends and recognize what it was supposed to do more quickly.

"(G)rasping at thin air isn't worth a robot's time, but (since) robots learn through trial and error, they would not typically have this intuition, until now," Hundt told Freethink. "We have developed a practical way for the robot to incorporate this common sense knowledge into a safety check, which skips the actions which are definitely not worth trying."

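One simple way to picture such a safety check is as a filter that runs before the robot picks its next move. The Python sketch below is an assumption-laden illustration, not the researchers' implementation: the workspace is a grid of cells, `occupied` is a hypothetical set of cells where a block has been detected, and any grasp aimed at an empty cell is dropped before the robot chooses.

```python
import random


def allowed_actions(candidates, occupied):
    """Keep only the grasp targets where something is actually there."""
    return [cell for cell in candidates if cell in occupied]


def choose_grasp(candidates, occupied, rng):
    """Pick a grasp at random, but only from targets that pass the check.

    Returns None when every candidate would be a grasp at thin air.
    """
    allowed = allowed_actions(candidates, occupied)
    if not allowed:
        return None
    return rng.choice(allowed)


if __name__ == "__main__":
    occupied = {(0, 1), (2, 2)}                    # cells holding a block
    candidates = [(0, 0), (0, 1), (1, 1), (2, 2)]  # cells the robot might try
    print(allowed_actions(candidates, occupied))
    print(choose_grasp(candidates, occupied, random.Random(0)))
```

The robot still learns by trial and error, but it no longer spends trials on moves that are certain to fail.
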
The Future of the SPOT Framework

Taken together, these changes allowed the team to train an actual robot, not just an AI in a virtual world, to accurately complete multi-step tasks much faster than another common reinforcement learning method.

"(The robot) quickly learns the right behavior to get the best reward," Hundt said in the press release. "In fact, it used to take a month of practice for the robot to achieve 100% accuracy. We were able to do it in two days."

His hope is that the SPOT framework might one day help AI developers train robots to do things far more complicated than stacking blocks.

"We believe that with further development, this technology has the potential to change a variety of industries for the better, from home care and surgery to warehousing and even self-driving cars," he told Freethink.

