No, not the film lol. The page I linked.
#1 Ok, so the agents can move around and pick which objects they are LIKELY to go to.
#2 They start off moving randomly. True to my name and my baby experiment.
#3 Aha, they use Transformer ideas, yes yes!!! Now all they need is Transformer-based simulated motors.
#4 Objects in an agent's sight are paid attention to (see the attention sketch after this list).
#5 Self-play is key in larger, more complex 'environments'.
#6 Long-term reward evaluation (credit assignment) in RL is hard.
#7 Many different tasks can be learned with RL supervision, and more transferable representations are needed.
#8 Complex cooperation and competition emerge on their own from self-play in robotics/language RL (THE COMPETITIVE SELF-PLAY IS ITSELF A SELF-ATTENTION).
#9 As they say in the video, the multi-agent competition/cooperation survival game runs thousands of rounds of self-play, in parallel, against other agents and past selves, much like what occurred on Earth, only here it's Hide & Seek (uhm, ya, sounds friendlier put that way), and the champions (as shown at the end of their video) become the updates.
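Since #3 and #4 matter for everything below, here is a minimal sketch of what "GPT-2 on objects instead of words" could look like: self-attention over the entities an agent can currently see. The module name, sizes, and the pooling at the end are my own guesses, not OpenAI's actual architecture, which as I understand it uses masked residual self-attention over entity embeddings.

```python
import torch
import torch.nn as nn

class EntityAttention(nn.Module):
    """Sketch: self-attention over observed objects instead of words.
    Hypothetical layer names/sizes, not the real hide-and-seek policy."""

    def __init__(self, obj_dim=16, embed_dim=64, num_heads=4):
        super().__init__()
        self.embed = nn.Linear(obj_dim, embed_dim)  # per-object embedding
        self.attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(embed_dim)

    def forward(self, objects, out_of_sight):
        # objects: (batch, num_objects, obj_dim) - position, size, velocity, ...
        # out_of_sight: (batch, num_objects) bool - True where the object is NOT visible
        x = self.embed(objects)
        attended, _ = self.attn(x, x, x, key_padding_mask=out_of_sight)
        x = self.norm(x + attended)  # residual connection
        # pool over objects (invisible ones zeroed) into a fixed-size feature for the policy
        return x.masked_fill(out_of_sight.unsqueeze(-1), 0.0).mean(dim=1)

# toy usage: 2 agents (batch), 5 candidate objects, some out of sight
obs = torch.randn(2, 5, 16)
mask = torch.tensor([[False, False, False, True, True],
                     [False, True, False, False, True]])
summary = EntityAttention()(obs, mask)  # (2, 64) feature fed to the policy head
```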
Wait, this is about motors: they simply skipped the low-level stuff like hands and running (the "atoms") and just simulate/learn at a higher level (the "bubbles")! What happens is the winners duplicate faster and find "food" faster, the losers don't, and the winners face off against past selves, current opponents, and creatures from other domains. The self-play rounds let them learn higher-level features, and this works on its own.

But to get through even one step of this game theory, an agent must learn: they start out moving randomly, pay attention to objects and movements, and get reward for hiding/seeking. Long-term reward is still needed, and yes, they have a huge trial space but they still learn; I bet that at larger space complexity they will need language.

Lastly, the actual hints the agents use in this lower-dimensional space to figure out these behaviors: they start off moving randomly and pay attention to objects/motions, and instead of GPT-2 over words it's attention over objects/actions. However, they do look like they track straight to the right object and know where to bring it, as if they had tried millions of runs (which they actually did). It's as if they are reusing behaviors they learned already. Can someone explain their amazing behaviors? Why do they run to the correct object etc.? It's too perfect, as if they had run all possible outcomes, except that wasn't the point and probably not what happened.
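To make the "face off against past, current" selves loop concrete, here is a hedged sketch of the training structure as I picture it. The env.rollout and optimizer_step calls are placeholders I made up, not OpenAI's code; the reward comment follows the hide-and-seek setup where hiders score +1 per step when all of them are hidden and -1 otherwise, with seekers getting the opposite.

```python
import copy
import random

def self_play_training(env, policy, optimizer_step,
                       episodes=100_000, snapshot_every=1_000):
    """Sketch of self-play against past selves (assumed loop, not OpenAI's code).
    Hiders and seekers share a zero-sum reward: hiders +1 per step if all are
    hidden, -1 otherwise; seekers receive the negative of that."""
    past_selves = [copy.deepcopy(policy)]            # pool of frozen snapshots
    for ep in range(episodes):
        opponent = random.choice(past_selves)        # sometimes recent, sometimes an old self
        trajectory = env.rollout(policy, opponent)   # hypothetical environment API
        optimizer_step(policy, trajectory)           # e.g. a PPO update on the rollout
        if ep % snapshot_every == 0:
            past_selves.append(copy.deepcopy(policy))  # current winner joins the pool
```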
It seems like all they do is turn, move forward/backward, and grab objects. So all they must learn is roughly where to go and which object to choose when they see it... maybe that's too simple? That would explain such wonderful behavior, because these are such simple actions to learn (there are many correct ways to act).
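To spell out the action set I think I'm seeing, here it is as a toy enum; as far as I can tell the real environment uses continuous movement forces plus grab/lock actions, so treat this as a simplification of mine, not the actual interface.

```python
from enum import IntEnum

class Action(IntEnum):
    """Assumed simplified action set, based on what the agents appear to do
    in the video; not the environment's real (continuous) action space."""
    TURN_LEFT = 0
    TURN_RIGHT = 1
    MOVE_FORWARD = 2
    MOVE_BACKWARD = 3
    GRAB = 4      # hold/pull the object the agent is facing
    RELEASE = 5
```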
So the takeaway here (the part not already mentioned, I mean) is that looking far enough backward/forward in time is important; skip the lower layers and focus on high-level task learning; use a self-attentive, GAN-like R&D feedback loop of competitive self-play with many agents in parallel; and use Transformers FOR each step of THAT self-play.
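And for the "looking back/forth far enough" part, a tiny sketch of discounted returns shows why a discount close to 1 matters: a reward earned late in a long episode still credits actions taken much earlier, like building the fort before the seekers are released (the gamma value here is just illustrative).

```python
def discounted_returns(rewards, gamma=0.998):
    """Compute the discounted return at every step of an episode.
    With gamma near 1, late rewards still reach early actions."""
    returns, g = [], 0.0
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    return list(reversed(returns))

# e.g. a 240-step episode where the hider only starts scoring after step 80
rewards = [0.0] * 80 + [1.0] * 160
print(discounted_returns(rewards)[0])  # early actions still receive substantial credit
```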