Curriculum Learning in Deep Reinforcement Learning


Marco

« on: January 10, 2018, 03:58:20 pm »
Hi,

I'd like to discuss some thoughts about curriculum learning in deep reinforcement learning. Curriculum learning presents an agent with increasingly challenging problems: training starts with an easy lesson and ends with a complex one.

I'm currently training an agent on a ball labyrinth, where the agent (the game board) has to rotate so that the ball rolls into the final hole (red).


The only positive reward is given for reaching the final hole. Negative rewards are given when the ball falls off the board or through the wrong hole. There is also a small negative reward while the ball is inside a corner.
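As a rough sketch, the reward scheme described above could look like this in Python. The event names and all reward magnitudes are illustrative assumptions, not the values used in the actual training setup.

```python
from enum import Enum, auto
from typing import Tuple


class Event(Enum):
    """Hypothetical outcomes of a single simulation step."""
    NONE = auto()
    FINAL_HOLE = auto()
    WRONG_HOLE = auto()
    FELL_OFF_BOARD = auto()


def step_reward(event: Event, in_corner: bool) -> Tuple[float, bool]:
    """Return (reward, episode_done) for one step.

    All magnitudes below are assumed placeholders.
    """
    if event is Event.FINAL_HOLE:
        return 1.0, True    # the only positive reward
    if event in (Event.WRONG_HOLE, Event.FELL_OFF_BOARD):
        return -1.0, True   # failure ends the episode
    # Small penalty while the ball sits inside a corner, otherwise nothing.
    return (-0.01 if in_corner else 0.0), False
```

With a scheme like this, an episode that never reaches the final hole produces no positive signal at all, which is why the starting distance matters so much.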

As the ball starts very far away from the final hole, the only positive reward is triggered very late. This is where curriculum learning comes in to ease the challenge. Each lesson puts the ball's starting position farther away from the final hole, as seen in this picture.
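One possible way to drive such a curriculum is to promote the agent to the next starting position once its recent success rate is high enough. The sketch below assumes this criterion; the class name, window size, and threshold are hypothetical and not taken from the actual project.

```python
from collections import deque


class Curriculum:
    """Tracks the current lesson and advances on a rolling success rate."""

    def __init__(self, num_lessons: int, window: int = 100, threshold: float = 0.8):
        self.num_lessons = num_lessons
        self.threshold = threshold            # assumed promotion criterion
        self.lesson = 0                       # 0 = start closest to the final hole
        self.results = deque(maxlen=window)   # outcomes of the most recent episodes

    def report(self, success: bool) -> None:
        """Record one episode outcome and promote once the window is full and good enough."""
        self.results.append(success)
        window_full = len(self.results) == self.results.maxlen
        if window_full and sum(self.results) / len(self.results) >= self.threshold:
            if self.lesson < self.num_lessons - 1:
                self.lesson += 1
                self.results.clear()          # start measuring the new lesson from scratch
```

After each episode, `report(True)` or `report(False)` would be called, and `lesson` would select the ball's starting position for the next episode.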


With more than 40 inputs (velocity of the ball, rotation of the board, wall distance tests, corner test, hole distances, direction to final hole, distance to final hole), the agent currently reaches position 6 within a small number of training steps, but then gets stuck in this lesson. The ball always manages to roll into the rectangular area to the bottom right of position 6.
This is actually not the point I'd like to discuss. More generally, I have started to worry about the agent forgetting previous lessons. As the neural network faces new data once lessons are completed, doesn't it lose its memory of the first lessons because the model simply gets overwritten by the new data? After quite some time at position 6, the agent can no longer solve the previous positions.


What do you guys think about this concern with curriculum learning?



I'm thinking about letting the agent redo some lessons occasionally, or whenever it has been stuck in a lesson for too long.
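The rehearsal idea could be sketched as a small lesson sampler: most episodes use the current lesson, some replay an earlier one, and the replay rate goes up once the agent has been stuck for a while. The function name, probabilities, and the "stuck" threshold are assumptions for illustration only, and whether this actually prevents the forgetting is untested here.

```python
import random
from typing import Optional


def pick_lesson(current: int, episodes_on_current: int,
                rehearsal_prob: float = 0.2, stuck_after: int = 5000,
                stuck_prob: float = 0.5,
                rng: Optional[random.Random] = None) -> int:
    """Choose the lesson index for the next episode.

    Mostly returns the current lesson, but sometimes an earlier one so that
    old starting positions keep appearing in the training data.
    """
    rng = rng or random.Random()
    if current == 0:
        return 0  # nothing earlier to rehearse
    # Rehearse more often once the agent has spent too long on this lesson.
    p = stuck_prob if episodes_on_current >= stuck_after else rehearsal_prob
    if rng.random() < p:
        return rng.randrange(current)  # uniform over lessons 0..current-1
    return current
```

For example, `pick_lesson(6, episodes_on_current=12000)` would return one of the positions before 6 about half the time under these assumed defaults.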


Korrelan


 

