Ideas/opinions for troubleshooting exploding output values (DQN)


Art

Re: Ideas/opinions for troubleshooting exploding output values (DQN)
« Reply #30 on: August 25, 2017, 09:41:32 pm »
What if they are started individually then stopped all at once? One at a time?

How would those patterns be affected?

I still think randomness plays a big role here, as the player has no control other than perhaps starting or stopping the wheels (preferably without seeing the displays on the wheels).

Thoughts?
In the world of AI, it's the thought that counts!

*

Marco

Re: Ideas/opinions for troubleshooting exploding output values (DQN)
« Reply #31 on: August 25, 2017, 10:01:02 pm »
The simulation does not progress until the agent decides to stop or wait. The agent gets the currently visible items as inputs. So if the agent sees a seven in the first reel's slot, it could decide to stop immediately.
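The decision loop described above could be sketched roughly as follows. This is only an illustration, not the actual implementation from this thread: the class name `ReelEnv`, the symbol set, the rule that spinning reels advance by one symbol per decision, and the reward are all assumptions made for the example.

```python
import random

SYMBOLS = ["cherry", "bell", "bar", "seven"]  # hypothetical symbol set
WAIT, STOP = 0, 1                              # the two actions named in the post


class ReelEnv:
    """Toy slot machine: reels are stopped one at a time, left to right."""

    def __init__(self, num_reels=3, seed=None):
        self.num_reels = num_reels
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        # Index into SYMBOLS of the item currently visible on each reel.
        self.visible = [self.rng.randrange(len(SYMBOLS))
                        for _ in range(self.num_reels)]
        self.active = 0  # reel the agent is currently deciding on
        return self.observe()

    def observe(self):
        # The agent's input: the visible items plus the active reel index.
        return tuple(self.visible) + (self.active,)

    def step(self, action):
        """Advance the simulation by exactly one agent decision."""
        if action == STOP:
            self.active += 1  # freeze the active reel, move to the next one
        # Assumption: every reel still spinning advances by one symbol.
        for i in range(self.active, self.num_reels):
            self.visible[i] = (self.visible[i] + 1) % len(SYMBOLS)
        done = self.active >= self.num_reels
        # Assumed reward: 1 only if every stopped reel shows a seven.
        reward = 0.0
        if done:
            seven = SYMBOLS.index("seven")
            reward = 1.0 if all(v == seven for v in self.visible) else 0.0
        return self.observe(), reward, done
```

Because nothing changes between calls to `step`, the environment only moves when the agent acts. With deterministic reel order like this, a policy that waits until the active reel shows a seven always wins, which is one way to check that the observations carry enough information for the agent.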

*

keghn

Re: Ideas/opinions for troubleshooting exploding output values (DQN)
« Reply #32 on: August 26, 2017, 02:29:52 am »
The goal is to land heads up with a DQN.
 An original coin-like plane with two wings, ailerons, a rudder, and a tail wing has a 99 percent chance of finding a way to land after 12 hours of training.
 A coin with undersized controls, 25 percent of full size, will have a 70 percent chance of landing heads up after 1000 hours of training.
 The plane with all control surfaces at 10 percent of the original size will land heads up 55 percent of the time and will take 15 years to train.
 A coin with no wings will have a 50 percent chance of landing heads up and will take forever to train.

*

Art

Re: Ideas/opinions for troubleshooting exploding output values (DQN)
« Reply #33 on: August 26, 2017, 12:59:12 pm »
What about the coin (a nickel in this case) landing on its edge?
http://adsabs.harvard.edu/abs/1993PhRvE..48.2547M

Life always has its exceptions and oddities.
In the world of AI, it's the thought that counts!

*

Marco

Re: Ideas/opinions for troubleshooting exploding output values (DQN)
« Reply #34 on: August 26, 2017, 01:07:30 pm »
Here is an update concerning the implementation:

- A bug in computing the loss in the regression layer was fixed
- The CUDA exception is related to the CUDA context initialization, which might have to be done in the same thread
- The training data pairs were composed incorrectly

As of now, I cannot tell whether this will lead to a breakthrough, because I do not have much time right now to run further tests and check the CUDA performance.
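For readers following along, here is a minimal sketch of how the regression target and loss for a single DQN transition are commonly computed. The update above does not say what the bug actually was, so this is not the fix itself; the function names are made up for the example, and the Huber loss is just one common choice for keeping large errors from producing large gradients.

```python
def td_target(reward, next_q_values, done, gamma=0.99):
    """Bellman target: r for terminal steps, else r + gamma * max Q(s', a')."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def huber(error, delta=1.0):
    """Quadratic near zero, linear beyond delta, so gradients stay bounded."""
    a = abs(error)
    if a <= delta:
        return 0.5 * error * error
    return delta * (a - 0.5 * delta)


def dqn_loss(q_values, action, reward, next_q_values, done, gamma=0.99):
    """Loss for one transition; only the taken action's output is regressed."""
    target = td_target(reward, next_q_values, done, gamma)
    return huber(target - q_values[action])
```

Two things are easy to get wrong in this step: bootstrapping from the next state on terminal transitions, and applying the error to every output instead of only the output for the action that was taken. Either mistake can make the Q-values drift upward over time.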

*

Marco

Re: Ideas/opinions for troubleshooting exploding output values (DQN)
« Reply #35 on: September 07, 2017, 10:15:26 pm »
This is most likely the last update. The GPU issues got fixed, but it turns out that the GPU version runs much slower than the CPU one. Improving performance is not feasible for me due to the missing documentation and comments.

So my plan now is to focus completely on Python. At the end of July, Unity said that an API for using TensorFlow and similar tools would be available within a few weeks. So maybe the whole matter will become obsolete.

In the meantime, I started a repository on GitHub to provide some experimental environments for deep reinforcement learning.
https://github.com/MarcoMeter/AI-Learning-Environments

*

Marco

Re: Ideas/opinions for troubleshooting exploding output values (DQN)
« Reply #36 on: October 25, 2017, 05:25:22 pm »
One month ago, Unity released ML-Agents.
This is what I have started working with.
« Last Edit: October 25, 2017, 11:13:03 pm by Marco »
