two minute paper: learning complex tasks by playing

  • 21 Replies
  • 3734 Views
*

infurl

  • Administrator
  • ***********
  • Eve
  • *
  • 1365
  • Humans will disappoint you.
    • Home Page
two minute paper: learning complex tasks by playing
« on: March 27, 2018, 10:34:38 pm »
https://www.youtube.com/watch?v=veWkBsK0nwU

Here is the latest two minute paper presentation, this time about learning to solve complex tasks with sparse rewards by just playing around like a baby. Amazing piece of work.

*

ivan.moony

  • Trusty Member
  • ************
  • Bishop
  • *
  • 1723
    • mind-child
Re: two minute paper: learning complex tasks by playing
« Reply #1 on: March 27, 2018, 11:11:34 pm »
If you think about it, the whole of evolution is a kind of experiment plus punishment/reward system. Each generation has variations in the form of individuals, which correspond to experiments. Unsuccessful individuals fail to mate and die (punishment), while successful pairs pass their genes on to the next generations (reward).

I wonder whether, over infinitely many generation cycles, all the successors come to look alike. In other words, are there multiple perfect forms, or is there exactly one? If the latter is the case, we could expect advanced aliens from anywhere in this Universe to look exactly like advanced humans.

*

Freddy

  • Administrator
  • **********************
  • Colossus
  • *
  • 6855
  • Mostly Harmless
Re: two minute paper: learning complex tasks by playing
« Reply #2 on: March 27, 2018, 11:18:47 pm »
Quote
I wonder whether, over infinitely many generation cycles, all the successors come to look alike. In other words, are there multiple perfect forms, or is there exactly one? If the latter is the case, we could expect advanced aliens from anywhere in this Universe to look exactly like advanced humans.

I wonder this too. Have you heard of convergent evolution? You get animals and other life that evolve into similar forms despite having had no contact, e.g. on different continents.

This makes me think that Star Trek could have got it right. Again.

*

infurl

  • Administrator
  • ***********
  • Eve
  • *
  • 1365
  • Humans will disappoint you.
    • Home Page
Re: two minute paper: learning complex tasks by playing
« Reply #3 on: March 27, 2018, 11:29:20 pm »
If there are intelligent aliens they'll either be similar to us, or similar to octopuses. I don't know of any other species that are as intelligent and versatile as those two forms. Evolution isn't about survival of the fittest, it's about survival of the most adaptable, and there are many ways to solve that problem. At one extreme you have intelligence and at the other extreme you have vast numbers of bacteria. Both are viable ways for genes to survive. Ironically we still can't survive without all the bacteria that occupy our bodies, but there are many bacteria that are doing just fine without us.

Quote
The perfect is the enemy of the good. -- Voltaire

*

Freddy

  • Administrator
  • **********************
  • Colossus
  • *
  • 6855
  • Mostly Harmless
Re: two minute paper: learning complex tasks by playing
« Reply #4 on: March 27, 2018, 11:34:25 pm »
I was reading this the other day, but couldn't decide how to bring it up. Now seems perfect, thank you  :lightbulb:

https://nypost.com/2018/02/24/heres-what-aliens-probably-look-like/

*

ivan.moony

  • Trusty Member
  • ************
  • Bishop
  • *
  • 1723
    • mind-child
Re: two minute paper: learning complex tasks by playing
« Reply #5 on: March 28, 2018, 12:00:57 am »
Fittest or most adaptable, if evolution is a punishment/reward system, and we know that that kind of system results in intelligent behavior, then we can label the whole of evolution as an intelligent process. We talked about this a while ago, but I'm bringing some new conclusions.

*

Freddy

  • Administrator
  • **********************
  • Colossus
  • *
  • 6855
  • Mostly Harmless
Re: two minute paper: learning complex tasks by playing
« Reply #6 on: March 28, 2018, 12:17:22 am »
I don't really get the punishment factor of the end of an evolutionary chain. It's just something that didn't work out, isn't it?

*

ivan.moony

  • Trusty Member
  • ************
  • Bishop
  • *
  • 1723
    • mind-child
Re: two minute paper: learning complex tasks by playing
« Reply #7 on: March 28, 2018, 12:21:02 am »
I imagine having no kids as a punishment. No genes are passed to the next generations to be mixed up with each other.

*

Freddy

  • Administrator
  • **********************
  • Colossus
  • *
  • 6855
  • Mostly Harmless
Re: two minute paper: learning complex tasks by playing
« Reply #8 on: March 28, 2018, 12:24:27 am »
It doesn't seem like a punishment. I'd need to know the final conclusion to know for sure though...

*

ivan.moony

  • Trusty Member
  • ************
  • Bishop
  • *
  • 1723
    • mind-child
Re: two minute paper: learning complex tasks by playing
« Reply #9 on: March 28, 2018, 12:42:13 am »
I have in mind this kind of analogy:

1. With a thought process, an idea gets born:
  • if it does what is expected -> as a reward, the idea gets memorized
  • if it doesn't fit reality -> as a punishment, the idea gets forgotten
All the successful ideas form an intelligent system that improves over time.

2. With evolution, a kid gets born:
  • if it succeeds in reproducing -> as a reward, its genes continue to pass to the next generations
  • if it fails to reproduce -> as a punishment, its genes are cut out of the next generations
All the newborn generations form an intelligent system that improves over cycles.

So the whole of evolution looks like a thought process to me.
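The analogy above can be sketched as a toy selection loop. This is only an illustration under loud assumptions, not the method from the video: the function name `evolve`, the idea of representing an individual as a single number, and "fitness" as closeness to a made-up `target` are all hypothetical choices. "Reward" means the better half gets to leave mutated offspring, "punishment" means the rest leave none.

```python
import random


def evolve(target, pop_size=30, generations=60, mutation=0.5, seed=0):
    """Toy selection loop (hypothetical example).

    Individuals are plain numbers; fitness is how close they are to `target`.
    Returns the best individual and the best distance seen in each generation.
    """
    rng = random.Random(seed)
    population = [rng.uniform(-10.0, 10.0) for _ in range(pop_size)]
    best_per_generation = []

    for _ in range(generations):
        # Rank by fitness: a smaller distance to the target is better.
        population.sort(key=lambda x: abs(x - target))
        best_per_generation.append(abs(population[0] - target))

        # "Reward": the better half survives and reproduces.
        # "Punishment": the other half leaves no offspring.
        survivors = population[: pop_size // 2]
        children = [parent + rng.gauss(0.0, mutation) for parent in survivors]
        population = survivors + children

    population.sort(key=lambda x: abs(x - target))
    return population[0], best_per_generation


if __name__ == "__main__":
    best, history = evolve(target=3.0)
    print("best individual:", best)
    print("distance in first and last generation:", history[0], history[-1])
```

Because the survivors are carried over unchanged, the best distance can never get worse from one generation to the next, which is the "improves over cycles" part of the analogy. Nothing in the loop knows or wants anything; it only sorts and copies.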

*

infurl

  • Administrator
  • ***********
  • Eve
  • *
  • 1365
  • Humans will disappoint you.
    • Home Page
Re: two minute paper: learning complex tasks by playing
« Reply #10 on: March 28, 2018, 01:03:28 am »
Success and failure, reward and punishment, these don't seem like terms that are applicable to a completely inanimate and unconscious process. Life came about because molecules that self-replicate are more likely to stick around than molecules that don't. Molecules that self-replicate even better are even more likely to stick around. Molecules that think they will burn in hell if they don't do their utmost to stick around are most likely of all to stick around.

*

Freddy

  • Administrator
  • **********************
  • Colossus
  • *
  • 6855
  • Mostly Harmless
Re: two minute paper: learning complex tasks by playing
« Reply #11 on: March 28, 2018, 01:36:04 am »
I think it's a fuzzy area  :D

I think it's just the way I perceive the word punishment. I can see where you are coming from Ivan :)

*

Freddy

  • Administrator
  • **********************
  • Colossus
  • *
  • 6855
  • Mostly Harmless
Re: two minute paper: learning complex tasks by playing
« Reply #12 on: March 28, 2018, 02:09:27 am »
Sorry about taking this a little off topic, it was a well presented video and easy to understand - I found it interesting that the learning transferred from a software simulation to hardware. No one tell Lock ;)

Hmm, I was thinking, evolution can almost come to a full stop - because the fossil record shows us that some animals have remained the same for millions of years.

So it seems that once a solution is good enough to ensure survival then evolution stops or pauses. This fits with Ivan's analogy - the reward is continued life, the 'punishment' is no more life from that ancestor.

This robot seeks its rewards and becomes something useful. How to say it is rewarded, I'm not sure. Perhaps my human perspective is getting in the way.

 :detective:

*

infurl

  • Administrator
  • ***********
  • Eve
  • *
  • 1365
  • Humans will disappoint you.
    • Home Page
Re: two minute paper: learning complex tasks by playing
« Reply #13 on: March 28, 2018, 02:41:09 am »
Quote
No one tell Lock ;)

When I posted this I figured he'd be first to reply and all over the thread like a rash. Let's just hope he's absent because he's outside getting some fresh air and exercise.

Quote
Hmm, I was thinking, evolution can almost come to a full stop - because the fossil record shows us that some animals have remained the same for millions of years. So it seems that once a solution is good enough to ensure survival then evolution stops or pauses. This fits with Ivan's analogy - the reward is continued life, the 'punishment' is no more life from that ancestor.

That got me thinking. Crocodiles are one of the best examples of this. They've been around unchanged for about 200 million years and they seem to be in little danger of going away any time soon.

https://www.thenakedscientists.com/articles/questions/why-havent-crocodiles-changed

Those two answers a) because they're so successful and b) because nothing else wants their niche, don't seem very satisfactory to me. I would phrase it this way: the niche they occupy has been available unchanged for 200 million years. In that time there have been several extinction level events, Pangaea has broken up to form the continents, and the very composition of the atmosphere has changed, but crocodiles are still there in the same form.

That's because there have always been muddy riverbanks and all the other animals, no matter what form they take, have to come down to drink twice a day. Most other habitats have disappeared and come back again several times over.

*

Freddy

  • Administrator
  • **********************
  • Colossus
  • *
  • 6855
  • Mostly Harmless
Re: two minute paper: learning complex tasks by playing
« Reply #14 on: March 28, 2018, 03:08:20 am »
Quote
the niche they occupy has been available unchanged for 200 million years

This. Yes, some life forms are going to have a hard time evolving fast enough to cope with a quickly changing environment. Like with those mass extinctions you mention, I'm immediately thinking of ammonites and the like, which were plentiful and had a massive environment they could thrive in. But when things went awry they failed to cope.

I guess the crocs got lucky - right time, right place.

 

