Ai Dreams Forum
Artificial Intelligence => General AI Discussion => Topic started by: elpidiovaldez5 on October 15, 2017, 02:54:08 pm
-
I just read DeepMind's paper on their new Reinforcement Learning system, which uses 'imagination' during problem solving. It's pretty cool. The features I liked are:
- Plans ahead using an Environment Model (EM), which predicts what will happen when it takes an action in a given state.
- Can build the environment model as it learns, or use a supplied model.
- Deliberately robust to errors in the EM. If the EM does not help, the system learns to ignore or down-weight it, thus falling back on standard Reinforcement Learning.
The planning uses various simple look-ahead schemes, e.g.:
- Considering all alternative actions for next step.
- Recursively chaining actions to predict (imagine) the result of the next N steps.
- Learning to combine the above two methods to perform plan-tree expansion (though I think they never actually used this idea).
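The look-ahead schemes above can be sketched in a few lines. This is my own toy illustration, not DeepMind's code: `em` is a trivial hand-written environment model (integer position, goal at 5), and `plan` does exhaustive N-step imagined rollouts through it.

```python
from itertools import product

# Toy, hand-written environment model (a stand-in, not DeepMind's learned EM):
# the state is an integer position on a line, actions move left or right,
# and the reward is the negative distance to a goal at position 5.
def em(state, action):
    next_state = state + (1 if action == "right" else -1)
    reward = -abs(5 - next_state)
    return next_state, reward

def rollout_value(state, actions):
    """Imagine a sequence of actions by recursively chaining the model's predictions."""
    total = 0.0
    for a in actions:
        state, r = em(state, a)
        total += r
    return total

def plan(state, depth=3):
    """Exhaustive look-ahead: score every action sequence of length `depth`."""
    best = max(product(["left", "right"], repeat=depth),
               key=lambda seq: rollout_value(state, seq))
    return best[0]  # first action of the best imagined plan

print(plan(0))   # from the left of the goal, the best imagined plan starts with "right"
print(plan(10))  # from the right of the goal, it starts with "left"
```

The real system learns when to expand the tree rather than enumerating every sequence, but the chaining of imagined steps is the same idea.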
Clearly the Environment Model is the interesting part of the system. The details are a bit sketchy. The idea is to train a neural net that takes the current state and a possible action as input, and outputs a probabilistic, imagined next state. Since the input state is a pixel image giving a view of a game, the output state gives pixel probabilities.
The EM clearly works playing Sokoban, but it occurs to me that actions are quite deterministic in this game: if you push a box forward into an empty space, the player and the box both move forward one step. The EM should generate near-100% probability for the new positions. The situation would be quite different if the action were, e.g., letting a pen balanced on its end fall. Here the resulting pen position is quite non-deterministic, although there is a well-defined locus of positions where it might end up. A probabilistic model would 'average' together the possible positions, giving a small probability to all positions on a circle. That is not what really happens.
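To make the 'averaging' point concrete, here is a toy numeric sketch (my own, not from the paper). Suppose the pen can only land left (-1) or right (+1): the single prediction that minimises squared error is the mean of the outcomes, a position the pen never actually occupies.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy bimodal outcome: the pen tips left (-1) or right (+1) with equal chance.
outcomes = rng.choice([-1.0, 1.0], size=10_000)

# A deterministic model trained with squared error converges to the mean of
# the outcomes -- roughly 0.0, i.e. an upright pen, which never happens.
best_point_prediction = outcomes.mean()
print(abs(best_point_prediction) < 0.05)  # True: near zero, far from both real outcomes
```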
Hence my reason for writing this message. Could a more sophisticated EM not be implemented using Generative Adversarial Networks? These are well known for 'imagination' applications. The benefit is that they generate realistic, specific outcomes. They do not blur together multiple possibilities. Of course one could run the GAN multiple times to search within the variation in outcomes. If it is run enough times, the probability distribution emerges.
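As a sketch of what I mean (a hypothetical generator standing in for a trained GAN, not a real one): each call with fresh latent noise yields one sharp outcome, never the blurred average, and the empirical frequencies over many samples recover the underlying distribution.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical trained generator: maps latent noise z (conditioning on the
# current state and action is omitted here) to one concrete imagined outcome,
# pen-down-left (-1.0) or pen-down-right (+1.0). A stand-in, not a real GAN.
def generator(z):
    return -1.0 if z < 0 else 1.0  # every sample is a sharp, specific outcome

samples = np.array([generator(z) for z in rng.standard_normal(1000)])

# No sample is ever the blurry 'average' position 0, and the sample
# frequencies approximate the true 50/50 distribution.
print(sorted(set(samples.tolist())))  # [-1.0, 1.0]
print((samples > 0).mean())
```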
Of course the system just needs information from the EM to choose the next step. It is possible that a blurry probability distribution provides this information better than actual possible future events. Thoughts?
-
Imagination-Augmented Agents for Deep Reinforcement Learning:
https://www.youtube.com/watch?v=agXIYMCICcc
-
old Hassabis needs some better ideas imo. :)
-
I'm sure he will steal them from someone else.
-
haha!
I guess what this would involve would be:
* how far in the future it looks (this is processor hungry),
* what semantics he's using (these are hard to detect, so I guess it would be fairly raw),
* how it intertwines with the robot's decision (I guess it's some weighting it develops).
-
A GAN is two NNs working together. The front end is the detector; the other NN is the generator, which recreates the image the detector was trained on.
Once the detector is trained to detect a cat, the detector is turned away from the real cat and pointed at a canvas. The generator will start painting and drawing, like in the GIMP image editor. When the detector NN sees a cat in this painting, it will activate. It is done: the total GAN system can now detect a cat and recreate it to show to others.
BUT! The generator NN does not start out with a clean canvas. The outputs of the detector NN do not go directly into the generator NN. The output of the detector NN is just a bell that goes off when the painting is finished. So the generator neural network MUST take a value as input and then has to convert this into a cat on the output.
At first, an input value of one, "1", into the generator NN generates an output of garbage. A GIMP-like translation program needs to adjust the weights in the generator NN to force an image out.
This can be done to make a NN play video: input the value "one" for the first frame of video, then train the output to match the first frame. Then move to the next frame, input the value "two", and repeat. Maybe later convert the value 2 to a dog, and so on.
So a generative NN can take an input from 0 to ten million. This way the brain can replay video of what it has seen, or jump to something of interest.
When the front end of the detector NN sees a cat, one output pixel goes high to a value of one on the output of the first NN. The output side of the first detector is a matrix of 100 x 100 pixels, so there are a total of 10,000 potential animals that can be detected. The cat output is at x 1000 and y 50.
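The index-to-frame training loop described above can be sketched with a tiny linear "generator" in plain numpy (my own toy stand-in for the generator NN, not any real GAN code): feed in a frame index, nudge the weights toward the matching frame, and repeat until replaying the index reproduces the frame.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "video": 4 frames of 8 pixels each.
frames = rng.random((4, 8))

# Generator: a weight matrix mapping a one-hot frame index to pixels
# (a linear stand-in for the generator NN described above).
W = np.zeros((4, 8))

def one_hot(i, n=4):
    v = np.zeros(n)
    v[i] = 1.0
    return v

# Train: for each frame index, nudge the output toward the matching frame.
for _ in range(500):
    for i in range(4):
        pred = one_hot(i) @ W
        W += 0.1 * np.outer(one_hot(i), frames[i] - pred)

# Replay: input the value "two" (index 2) and the generator emits frame 2.
print(np.allclose(one_hot(2) @ W, frames[2], atol=1e-3))  # True
```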
-
My problem with this is that it won't generate anything newer than what the inference NN has inside of it, so you are just copying the data across, doing nothing interesting at all.
-
@rouncer81 I remember that autoencoder of the race car. I believe I have videos of that on my hard drive.
NNs transform input data into output data. Yes, and they can also transform the original back to the original.
I find it very interesting that all neurons are learning, so if something is left out it will be recreated, like a hologram.
It's not working like a bundle of fiber optics connecting input pixels directly to output pixels. But NNs are so flexible that they can be trained to be transmission lines, like the optic nerve :)
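The "hologram" point can be illustrated with a toy linear reconstruction (my own sketch, much simpler than a real autoencoder): when pixels are correlated, a map fit with one pixel masked out learns to recreate the missing pixel from the rest.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 3-pixel inputs that all lie along one direction, so every pixel
# carries information about the others (the correlation a hologram exploits).
v = np.array([1.0, 2.0, 3.0])
X = rng.random((200, 1)) * v  # 200 samples

# Denoising setup: the middle pixel is dropped from the input, and a linear
# map is fit to recreate the full pattern from what remains.
X_masked = X.copy()
X_masked[:, 1] = 0.0
W, *_ = np.linalg.lstsq(X_masked, X, rcond=None)

# A new pattern with its middle pixel missing gets filled back in.
x = 0.5 * v
x_masked = x.copy()
x_masked[1] = 0.0
print(np.allclose(x_masked @ W, x, atol=1e-6))  # True: the missing pixel is recreated
```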
-
Progressive Growing of GANs for Improved Quality, Stability, and Variation:
https://www.youtube.com/watch?v=XOxxPcy5Gr4&feature=youtu.be&t=38
-
AI learns to Create ̵K̵Z̵F̵ ̵V̵i̵d̵e̵o̵s̵ Cat Pictures: Papers in Two Minutes #1:
https://www.youtube.com/watch?v=MUVbqQ3STFA
-
Video Game Graphics To Reality And Back | Two Minute Papers #203:
https://www.youtube.com/watch?v=dqxqbvyOnMY
-
$$$ DeepMarketing $$$$ Breaking News$$$
I don't want to read more than the title.
Being paid millions to confuse "imagination" and "deliberation".
-
Generative Adversarial Networks — A Deep Learning Architecture:
https://hackernoon.com/generative-adversarial-networks-a-deep-learning-architecture-4253b6d12347