The idea is really simple. (But there are actually a few important constraints to it.)
Say I have an ordinary physics system, and a computer vision system feeds it the visible scene.
This works to correctly bring reality back in; the problem is that looking ahead costs 2^(MOTORS * FRAMES AHEAD) permutations.
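Just to make the blow-up concrete, here's a tiny back-of-the-envelope check. The motor count and frame count are made-up numbers for illustration, not the actual system's:

```python
# Hypothetical numbers: 6 binary motor channels, 30 frames (~1 s) of lookahead.
MOTORS = 6
FRAMES_AHEAD = 30

permutations = 2 ** (MOTORS * FRAMES_AHEAD)
print(f"{permutations:.3e} motor-command sequences to brute-force")
# -> about 1.5e+54, which is why flat lookahead stalls almost immediately
```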
There is a way, using a genetic algorithm, to see about a second ahead at most, without getting wrecked too much by local optima.
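Here's a rough sketch of what that one-second GA lookahead could look like. The population and generation sizes, the binary motor channels, and especially the fitness function are all stand-ins; the real fitness would roll the plan through the physics sim and score the predicted scene:

```python
import random

FRAMES = 30        # ~1 second of lookahead at 30 fps (assumption)
MOTORS = 6         # binary on/off motor channels (assumption)
POP, GENS = 64, 40

def fitness(plan):
    # Stand-in: the real version would simulate the plan and score the result.
    return sum(sum(frame) for frame in plan)

def random_plan():
    return [[random.randint(0, 1) for _ in range(MOTORS)] for _ in range(FRAMES)]

def mutate(plan, rate=0.05):
    return [[bit ^ (random.random() < rate) for bit in frame] for frame in plan]

def crossover(a, b):
    cut = random.randrange(1, FRAMES)
    return a[:cut] + b[cut:]

pop = [random_plan() for _ in range(POP)]
for _ in range(GENS):
    pop.sort(key=fitness, reverse=True)
    elite = pop[: POP // 4]                    # keep the best quarter
    pop = elite + [mutate(crossover(random.choice(elite), random.choice(elite)))
                   for _ in range(POP - len(elite))]

print("best plan score:", fitness(max(pop, key=fitness)))
```

Keeping an elite quarter and mutating crossed-over parents is one of the simpler ways to hold off the local-optima problem at a horizon this short.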
I think I know a way to make it see further ahead!!
So what I think you need to do is convert the key to a symbolic form. Then I custom-design a set of generic tasks that manipulate the scene one object at a time. (This is so it makes its keys more plastic, but I have to make sure the tasks are fully fleshed out, so that only that one object changes and the rest stays the same.) That way, I can ternary off the rest of the scene, and it's guaranteed not to be affected no matter what it is.
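One way that symbolic key and a one-object-at-a-time task might look (the object fields and the pick-up task are purely illustrative):

```python
from dataclasses import dataclass, replace

# Hypothetical symbolic key: one coarse entry per object, raw pixels dropped.
@dataclass(frozen=True)
class ObjState:
    name: str
    pos: tuple          # coarse grid cell, not raw coordinates
    held: bool = False

def task_pick_up(scene: dict, target: str) -> dict:
    """Generic task: rewrite exactly one object and leave everything else alone,
    so the untouched entries can be treated as don't-cares (ternaried off)."""
    new_scene = dict(scene)                      # other entries copied unchanged
    new_scene[target] = replace(scene[target], held=True)
    return new_scene

scene = {
    "cup":   ObjState("cup",   pos=(3, 1)),
    "book":  ObjState("book",  pos=(0, 2)),
    "table": ObjState("table", pos=(2, 2)),
}
after = task_pick_up(scene, "cup")
print("changed:",   [k for k in scene if scene[k] != after[k]])
print("untouched:", [k for k in scene if scene[k] == after[k]])
```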
So now I've made those first two very important tiers in the hierarchy. I can now add maybe even 3-4 automatic layers that just take the collection of modifications to the environment and clump them over more time.
So if the first physics layer looks 1 second ahead, the layer above it looks 10 seconds ahead. (ONLY 1 OBJECT MANIPULATED AT A TIME!)
I can then make some automatic layers by reducing the detail of the key and clumping the number of objects modified/replaced, for 100 seconds ahead, 1000 seconds ahead, and then 10000 seconds ahead, which is about 2 hours 47 minutes ahead! They are all only looking 10 frames ahead of their own layer, but it compounds on the previous layer's search! And it's all nice and symbolic, the key is as simple as possible, and every part of the scene that wasn't manipulated can be ternaried out! So they respond very often, which is the measure against the tiny ratio of useful patterns to pure permutations... which is around .000000000000001%, and you can knock a zero or two off if you compress it as well.
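The compounding arithmetic in a few lines (the 10x-per-layer factor and the layer count come from the description above; the 1-second base is the raw physics layer):

```python
base_horizon_s = 1.0       # raw physics layer: ~1 second ahead
layers = 5                 # physics layer plus 4 clumped layers above it

for n in range(layers):
    horizon = base_horizon_s * 10 ** n
    print(f"layer {n}: sees {horizon:>7.0f} s ahead "
          f"({horizon / 3600:.2f} h), still only a 10-step search")
# top layer -> 10000 s, roughly 2 hours 47 minutes
```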
If it can see about 2 hours 47 minutes ahead, then all I have to do is tell it the environmental key it's supposed to be at when it's finished the job, and it may find that automatically without being given any more information.
All the levels actually have to do a 10-frame brute force to get the instrumental goal for the level below, but it's only linear cost, not exponential, to run all the layers.
And you run the last layer once, the second-to-last layer 10 times, the one before that 100 times, et cetera, and you have to run the direct raw physics the most.
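Counting the runs makes the linear-cost claim easier to see. Assuming the 10x structure above, each run is a bounded 10-step search, so the total work scales with the horizon instead of exponentially:

```python
layer_names = ["10000 s", "1000 s", "100 s", "10 s", "1 s (raw physics)"]
for depth, name in enumerate(layer_names):
    print(f"{name:>18} layer: run {10 ** depth:>6} times per episode")
print("total bounded searches:", sum(10 ** d for d in range(len(layer_names))))
# 1 + 10 + 100 + 1000 + 10000 = 11111 small searches, not 2^(huge)
```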
So basically each layer is searching for the clump of changes dictated to it by the longer-horizon layer above it.
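And here's a toy version of what one layer's bounded search might look like: it takes the target key handed down from the layer above, tries short sequences of single-object ops, and the winning sequence becomes the set of subgoals for the layer below. The key format, the ops, and the 3-step horizon (shortened from 10 to keep the toy small) are all assumptions:

```python
import itertools

def distance(key, target):
    return sum(key[k] != target[k] for k in target)

def make_op(obj, state):
    def op(key):
        new = dict(key)
        new[obj] = state          # only this one object changes
        return new
    return op

def refine(current_key, target_key, ops, steps=3):
    """One layer's bounded brute force: find the short sequence of
    single-object ops that lands closest to the target key from the layer
    above. Each op in the winning plan is a subgoal for the layer below."""
    best_plan, best_dist = (), float("inf")
    for plan in itertools.product(ops, repeat=steps):
        key = current_key
        for op in plan:
            key = op(key)
        d = distance(key, target_key)
        if d < best_dist:
            best_plan, best_dist = plan, d
    return best_plan

start = {"door": "closed", "lamp": "off", "cup": "on_table"}
goal  = {"door": "open",   "lamp": "on",  "cup": "in_hand"}
ops = [make_op("door", "open"), make_op("lamp", "on"), make_op("cup", "in_hand")]
print("subgoals for the layer below:", len(refine(start, goal, ops)), "ops")
```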