Automating the search for entirely new “curiosity” algorithms


Tyler

28 April 2020, 2:00 pm

Driven by an innate curiosity, children pick up new skills as they explore the world and learn from their experience. Computers, by contrast, often get stuck when thrown into new environments.

To get around this, engineers have tried encoding simple forms of curiosity into their algorithms with the hope that an agent pushed to explore will learn about its environment more effectively. An agent with a child’s curiosity might go from learning to pick up, manipulate, and throw objects to understanding the pull of gravity, a realization that could dramatically accelerate its ability to learn many other things.

Engineers have discovered many ways of encoding curious exploration into machine learning algorithms. A research team at MIT wondered if a computer could do better, based on a long history of enlisting computers in the search for new algorithms.

In recent years, the design of deep neural networks, algorithms that search for solutions by adjusting numeric parameters, has been automated with software like Google’s AutoML and auto-sklearn in Python. That’s made it easier for non-experts to develop AI applications. But while deep nets excel at specific tasks, they have trouble generalizing to new situations. Algorithms expressed in code, in a high-level programming language, by contrast, have the capacity to transfer knowledge across different tasks and environments.

“Algorithms designed by humans are very general,” says study co-author Ferran Alet, a graduate student in MIT’s Department of Electrical Engineering and Computer Science and Computer Science and Artificial Intelligence Laboratory (CSAIL). “We were inspired to use AI to find algorithms with curiosity strategies that can adapt to a range of environments.”

The researchers created a “meta-learning” algorithm that generated 52,000 exploration algorithms. They found that the top two were entirely new — seemingly too obvious or counterintuitive for a human to have proposed. Both algorithms generated exploration behavior that substantially improved learning in a range of simulated tasks, from navigating a two-dimensional grid based on images to making a robotic ant walk. Because the meta-learning process generates high-level computer code as output, both algorithms can be dissected to peer inside their decision-making processes.

The paper’s senior authors are Leslie Kaelbling and Tomás Lozano-Pérez, both professors of computer science and electrical engineering at MIT. The work will be presented at the virtual International Conference on Learning Representations later this month.

The paper received praise from researchers not involved in the work. “The use of program search to discover a better intrinsic reward is very creative,” says Quoc Le, a principal scientist at Google who has helped pioneer computer-aided design of deep learning models. “I like this idea a lot, especially since the programs are interpretable.”

The researchers compare their automated algorithm design process to writing sentences with a limited number of words. They started by choosing a set of basic building blocks to define their exploration algorithms. After studying other curiosity algorithms for inspiration, they picked nearly three dozen high-level operations, including basic programs and deep learning models, to guide the agent to do things like remember previous inputs, compare current and past inputs, and use learning methods to change its own modules. The computer then combined up to seven operations at a time to create computation graphs describing 52,000 algorithms.
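The "limited vocabulary" idea can be illustrated with a very small sketch. It assumes three made-up numeric primitives and simple linear chains; the researchers' actual building blocks and computation graphs are far richer, and none of the names below come from their work.

```python
from itertools import product

# Hypothetical toy primitives (not the operations used in the study):
# each one maps a float to a float.
PRIMITIVES = {
    "double": lambda x: 2.0 * x,
    "negate": lambda x: -x,
    "add_one": lambda x: x + 1.0,
}


def enumerate_programs(max_length):
    """Yield every chain of 1..max_length primitive names."""
    for length in range(1, max_length + 1):
        for names in product(PRIMITIVES, repeat=length):
            yield names


def run_program(names, x):
    """Apply the named primitives to x, left to right."""
    for name in names:
        x = PRIMITIVES[name](x)
    return x


# With 3 primitives and chains of up to 3 steps there are 3 + 9 + 27 programs.
programs = list(enumerate_programs(3))
```

Even this toy space grows exponentially with chain length, which is why a larger vocabulary and up to seven operations per program quickly produce tens of thousands of candidates.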

Even with a fast computer, testing them all would have taken decades. So, instead, the researchers limited their search by first ruling out algorithms predicted to perform poorly, based on their code structure alone. Then, they tested their most promising candidates on a basic grid-navigation task requiring substantial exploration but minimal computation. If the candidate did well, its performance became the new benchmark, eliminating even more candidates.
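The two-stage pruning described above might look roughly like the following sketch. It is an illustration under stated assumptions, not the researchers' procedure: `looks_promising` and `episode_score` are hypothetical callables supplied by the caller, and each episode score is assumed to lie between 0 and 1 so that an optimistic bound can be computed.

```python
def pruned_search(candidates, looks_promising, episode_score, num_episodes=4):
    """Return (best_candidate, best_total, episodes_run).

    Assumes episode_score(candidate, i) returns a value in [0, 1].
    """
    best, best_total, episodes_run = None, float("-inf"), 0
    for cand in candidates:
        # Stage 1: cheap filter that never runs the candidate.
        if not looks_promising(cand):
            continue
        # Stage 2: evaluate on a cheap task, one episode at a time.
        total = 0.0
        for i in range(num_episodes):
            total += episode_score(cand, i)
            episodes_run += 1
            remaining = num_episodes - (i + 1)
            # Even with perfect remaining episodes, the candidate cannot
            # beat the current benchmark, so stop early.
            if total + remaining <= best_total:
                break
        else:
            # Loop finished without a break: this is the new benchmark.
            best, best_total = cand, total
    return best, best_total, episodes_run


# Toy usage: each candidate is just its own (constant) per-episode score.
best, best_total, episodes_run = pruned_search(
    candidates=[0.9, 0.2, 0.95, -1.0],
    looks_promising=lambda c: c >= 0.0,
    episode_score=lambda c, i: c,
)
```

In the toy run, the weak candidate is dropped after a single episode and the invalid one is never run at all, so fewer episodes are spent than a full evaluation of every candidate would need.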

Four machines searched over 10 hours to find the best algorithms. More than 99 percent were junk, but about a hundred were sensible, high-performing algorithms. Remarkably, the top 16 were both novel and useful, performing as well as, or better than, human-designed algorithms at a range of other virtual tasks, from landing a moon rover to raising a robotic arm and moving an ant-like robot in a physical simulation.

All 16 algorithms shared two basic exploration functions.

In the first, the agent is rewarded for visiting new places where it has a greater chance of making a new kind of move. In the second, the agent is also rewarded for visiting new places, but in a more nuanced way: one neural network learns to predict the future state while a second recalls the past, and then tries to predict the present by predicting the past from the future. If this prediction is erroneous, the agent rewards itself, since the error is a sign that it has discovered something it did not know before. The second algorithm was so counterintuitive that it took the researchers time to figure out.
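For readers unfamiliar with curiosity rewards, here is a minimal sketch of two simple, generic forms of the idea: a bonus for rarely visited states and a bonus for surprising outcomes. These are not the two discovered algorithms, which use neural networks and a more intricate structure; the class names and the tabular model are assumptions made for illustration, and states are assumed to be hashable and never `None`.

```python
import math
from collections import defaultdict


class NoveltyBonus:
    """Reward that shrinks as a state is visited more often."""

    def __init__(self):
        self.visits = defaultdict(int)

    def reward(self, state):
        self.visits[state] += 1
        return 1.0 / math.sqrt(self.visits[state])


class PredictionErrorBonus:
    """Reward of 1.0 whenever the observed next state was not predicted.

    The "model" is a table that remembers the last next state seen for
    each (state, action) pair, standing in for a learned predictor.
    """

    def __init__(self):
        self.predicted_next = {}

    def reward(self, state, action, next_state):
        key = (state, action)
        surprised = self.predicted_next.get(key) != next_state
        self.predicted_next[key] = next_state  # update the model
        return 1.0 if surprised else 0.0


novelty = NoveltyBonus()
first_visit = novelty.reward("room_a")    # unseen state: full bonus
second_visit = novelty.reward("room_a")   # repeat visit: smaller bonus

surprise = PredictionErrorBonus()
r1 = surprise.reward(0, "right", 1)  # nothing predicted yet: surprising
r2 = surprise.reward(0, "right", 1)  # matches the stored prediction
```

In both cases the bonus is added to whatever reward the task itself provides, nudging the agent toward parts of the environment it has not yet understood.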

“Our biases often prevent us from trying very novel ideas,” says Alet. “But computers don’t care. They try, and see what works, and sometimes we get great unexpected results.”

More researchers are turning to machine learning to design better machine learning algorithms, a field known as AutoML. At Google, Le and his colleagues recently unveiled a new algorithm-discovery tool called AutoML-Zero. (Its name is a play on Google’s AutoML software for customizing deep net architectures for a given application, and Google DeepMind’s AlphaZero, the program that can learn to play different board games by playing millions of games against itself.)

Their method searches through a space of algorithms made up of simpler primitive operations. But rather than look for an exploration strategy, their goal is to discover algorithms for classifying images. Both studies show the potential for humans to use machine-learning methods themselves to create novel, high-performing machine-learning algorithms.

“The algorithms we generated could be read and interpreted by humans, but to actually understand the code we had to reason through each variable and operation and how they evolve with time,” says study co-author Martin Schneider, a graduate student at MIT. “It’s an interesting open challenge to design algorithms and workflows that leverage the computer’s ability to evaluate lots of algorithms and our human ability to explain and improve on those ideas.”

The research received support from the U.S. National Science Foundation, Air Force Office of Scientific Research, Office of Naval Research, Honda Research Institute, SUTD Temasek Laboratories, and MIT Quest for Intelligence.

Source: MIT News - CSAIL - Robotics - Computer Science and Artificial Intelligence Laboratory (CSAIL) - Robots - Artificial intelligence

Reprinted with permission of MIT News.


