Curriculum Learning in Deep Reinforcement Learning


Marco

« on: January 10, 2018, 03:58:20 pm »
Hi,

I'd like to discuss some thoughts about curriculum learning in deep reinforcement learning. Curriculum learning presents an agent with increasingly challenging problems, so training starts out with an easy lesson and ends with a complex one.

I'm currently training a ball labyrinth, where the agent (the game board) has to rotate so that the ball rolls into the final hole (red).


The only positive reward is given for reaching the final hole. Negative rewards are given when the ball falls off the board or through the wrong hole. Also, there is a small negative reward while the ball is located inside a corner.
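A minimal sketch of this reward scheme could look like the following. The event names and the reward magnitudes are assumptions made for illustration only; the post does not state the actual values.

```python
from enum import Enum, auto


class Event(Enum):
    """Hypothetical per-step outcomes of the labyrinth environment."""
    NONE = auto()
    REACHED_FINAL_HOLE = auto()
    FELL_OFF_BOARD = auto()
    WRONG_HOLE = auto()


# Assumed magnitudes, not taken from the original setup.
GOAL_REWARD = 1.0
FAILURE_PENALTY = -1.0
CORNER_PENALTY = -0.01


def step_reward(event, in_corner):
    """Return the reward for one step given the outcome and corner test."""
    if event is Event.REACHED_FINAL_HOLE:
        return GOAL_REWARD
    if event in (Event.FELL_OFF_BOARD, Event.WRONG_HOLE):
        return FAILURE_PENALTY
    # No terminal event: only the small corner penalty can apply.
    return CORNER_PENALTY if in_corner else 0.0
```

With a scheme like this, almost every step returns zero until the ball reaches the final hole, which is why the positive signal arrives so late.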

As the ball starts very far away from the final hole, the only positive reward is triggered really late. This is where curriculum learning comes in to ease the challenge. Each lesson puts the ball's starting position farther away from the final hole, as shown in this picture.
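One way to sketch such a lesson schedule is shown below. The promotion rule (advance once the success rate over a sliding window of episodes reaches a threshold) and the names `Curriculum`, `window` and `threshold` are assumptions; the post does not say how lessons are advanced.

```python
from collections import deque


class Curriculum:
    """Tracks the current lesson; lesson i uses start position i."""

    def __init__(self, num_lessons, window=100, threshold=0.8):
        self.num_lessons = num_lessons
        self.lesson = 0                      # index of the current lesson
        self.threshold = threshold           # assumed success rate to advance
        self.results = deque(maxlen=window)  # recent episode outcomes

    def record(self, success):
        """Store one episode outcome and advance the lesson if warranted."""
        self.results.append(bool(success))
        window_full = len(self.results) == self.results.maxlen
        if window_full and sum(self.results) / len(self.results) >= self.threshold:
            if self.lesson < self.num_lessons - 1:
                self.lesson += 1
                self.results.clear()  # start measuring the new lesson afresh
        return self.lesson
```

A training loop would call `record(...)` after every episode and place the ball at the start position that belongs to the returned lesson index.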


With more than 40 inputs (velocity of the ball, rotation of the board, testing wall distances, corner test, hole distances, direction to final hole, distance to final hole), the agent currently makes it to position 6 within a small number of training steps, but then gets stuck in this lesson. The ball always ends up rolling into the rectangular area to the bottom right of position 6.
This is actually not the point I'd like to discuss. More generally, I've started to have concerns about the agent forgetting the previous lessons. As the neural network faces new data while lessons are completed, doesn't it lose what it learned in the first lessons, because the model simply gets overwritten by the new data? After quite some time at position 6, the agent cannot solve the previous positions anymore.


What do you guys think about this concern with curriculum learning?



I'm thinking about letting the agent redo some lessons occasionally, or whenever it has been stuck in a lesson for too long.
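As a rough sketch of that idea, the start lesson for each episode could be sampled so that earlier lessons are revisited with some probability, and more often while the agent is stuck. The function name `sample_lesson` and the probabilities are hypothetical and would need tuning.

```python
import random


def sample_lesson(current_lesson, stuck=False,
                  rehearsal_prob=0.1, stuck_rehearsal_prob=0.3,
                  rng=random):
    """Pick the lesson for the next episode.

    Usually returns the current lesson, but sometimes returns a uniformly
    chosen earlier lesson so that old start positions keep appearing in
    the training data. The probabilities are assumed values.
    """
    prob = stuck_rehearsal_prob if stuck else rehearsal_prob
    if current_lesson > 0 and rng.random() < prob:
        return rng.randrange(current_lesson)  # one of lessons 0..current-1
    return current_lesson
```

Whether this is enough to stop the earlier positions from being forgotten is an open question; it only makes sure the network keeps seeing data from them.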


 

