Curriculum Learning in Deep Reinforcement Learning


Marco

« on: January 10, 2018, 03:58:20 pm »
Hi,

I'd like to discuss some thoughts about curriculum learning in deep reinforcement learning. Curriculum learning presents an agent with increasingly challenging problems, so training starts out with an easy lesson and ends with a complex one.

I'm currently training an agent on a ball labyrinth, where the agent (the game board) has to rotate to make the ball roll into the final hole (red).


The only positive reward is given for reaching the final hole. Negative rewards are given for the ball falling off the board or through the wrong hole. There is also a small negative reward while the ball is located inside a corner.
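A minimal sketch of this reward scheme, assuming made-up magnitudes (the actual values are not stated here) and assuming that reaching a hole or falling off the board ends the episode. The names `Event` and `step_reward` are hypothetical.

```python
from enum import Enum, auto


class Event(Enum):
    """What happened to the ball during the last step (hypothetical names)."""
    NONE = auto()
    FINAL_HOLE = auto()
    WRONG_HOLE = auto()
    OFF_BOARD = auto()


# Assumed magnitudes for illustration only.
GOAL_REWARD = 1.0
FAIL_PENALTY = -1.0
CORNER_PENALTY = -0.01


def step_reward(event, in_corner):
    """Return (reward, episode_done) for a single step."""
    if event is Event.FINAL_HOLE:
        return GOAL_REWARD, True
    if event in (Event.WRONG_HOLE, Event.OFF_BOARD):
        return FAIL_PENALTY, True
    # No terminal event: only the small corner penalty can apply.
    return (CORNER_PENALTY if in_corner else 0.0), False
```

Since the only positive signal sits at the very end of a successful episode, the reward is sparse, which is what makes a distant starting position hard to learn from.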

As the ball starts very far away from the final hole, the only positive reward is triggered really late. This is where curriculum learning comes in to ease the challenge. Each lesson puts the ball's starting position farther away from the final hole, as shown in this picture.
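One way such a lesson schedule could be driven is sketched below. The post does not say how lessons are advanced, so the rule used here (move to the next starting position once the success rate over a sliding window of episodes reaches a threshold) is an assumption, as are the class name `Curriculum` and its default parameters.

```python
from collections import deque


class Curriculum:
    """Tracks the current lesson (index of the ball's starting position)."""

    def __init__(self, num_lessons, window=100, threshold=0.8):
        self.num_lessons = num_lessons
        self.lesson = 0                      # 0 = start closest to the final hole
        self.threshold = threshold           # assumed success rate needed to advance
        self.results = deque(maxlen=window)  # outcomes of the most recent episodes

    def record(self, success):
        """Record one episode outcome; return True if the lesson advanced."""
        self.results.append(bool(success))
        window_full = len(self.results) == self.results.maxlen
        rate = sum(self.results) / len(self.results)
        is_last = self.lesson >= self.num_lessons - 1
        if window_full and rate >= self.threshold and not is_last:
            self.lesson += 1
            self.results.clear()  # start measuring the new lesson from scratch
            return True
        return False
```

The environment would then place the ball at the starting position that belongs to `curriculum.lesson` at every reset.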


With more than 40 inputs (velocity of the ball, rotation of the board, wall distance tests, corner test, hole distances, direction to the final hole, distance to the final hole), the agent currently makes it to position 6 in a small number of training steps, but then gets stuck in this lesson. The ball always ends up rolling into the rectangular area to the bottom right of position 6.
This is actually not the point I'd like to discuss. More generally, I've started to have concerns about the agent losing its memory of the previous lessons. As the neural network faces new data while lessons are completed, doesn't it lose what it learned in the first lessons, because the model simply gets overwritten by the new data? After quite some time at position 6, the agent cannot solve the previous positions anymore.
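To measure this kind of forgetting rather than just notice it, the earlier lessons could be re-evaluated from time to time. The sketch below assumes a hypothetical callable `run_episode(lesson)` that plays one evaluation episode from the given starting position (without training) and returns `True` on success. The episode count and the success threshold are illustrative assumptions.

```python
def forgotten_lessons(run_episode, current_lesson, episodes=20, min_success=0.8):
    """Return {lesson: success_rate} for earlier lessons that fall below min_success.

    run_episode is a hypothetical callable: lesson index -> bool (episode solved).
    """
    forgotten = {}
    for lesson in range(current_lesson):
        wins = sum(1 for _ in range(episodes) if run_episode(lesson))
        rate = wins / episodes
        if rate < min_success:
            forgotten[lesson] = rate
    return forgotten
```

Logging this dictionary at regular intervals would show whether the success rate on the first positions really collapses while the agent is stuck on a later one.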


What do you guys think about this concern with curriculum learning?



I'm thinking about letting the agent redo some lessons occasionally, or whenever it has been stuck in a lesson for too long.
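A rough sketch of that idea, assuming the lesson is chosen anew for every episode: with some probability an earlier lesson is replayed, and that probability is raised once the agent has spent too many episodes in the current lesson. The function name `pick_lesson` and all probabilities and limits are hypothetical and would need tuning.

```python
import random


def pick_lesson(current, episodes_in_lesson, rng=random,
                replay_prob=0.2, stuck_after=5000, stuck_replay_prob=0.5):
    """Choose the lesson index for the next episode.

    current            -- index of the newest unlocked lesson
    episodes_in_lesson -- episodes spent in the current lesson so far
    rng                -- the random module or a random.Random instance
    """
    if current == 0:
        return 0  # nothing earlier to replay
    # Replay more often once the agent counts as stuck (assumed rule).
    stuck = episodes_in_lesson >= stuck_after
    p = stuck_replay_prob if stuck else replay_prob
    if rng.random() < p:
        return rng.randrange(current)  # uniform over earlier lessons 0..current-1
    return current
```

Mixing old starting positions back into training keeps their data in front of the network, which is a simple form of rehearsal. Whether it is enough to stop the forgetting in this setup would have to be tested.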

*

korrelan

It thunk... therefore it is!