So I've gone through like 10 books this week, done lots of searching and thinking, and I'm sure about something now, so I can ask it in a more compact way:
Mine:
Why don't we use a natural alternative to Backpropagation? Backprop is very complex and unnatural; even Hinton and others say so. Doesn't the brain learn functions by thinking about known ideas? If I don't know the answers to x^2= or 2^x= (put 1, 2, 3, or 4 where the x is) and only have some of the answers, I can work out the algorithm that generates the other answers by thinking "hmm, let me try x+x: 2+2 =4, matches there, but 3+3 =6, nope, not matching observations, maybe it times itself? 3*3 =9, yes. Repeat." (I'm searching likely answers and may get it right soon.) My alternative (if it works) can not only explain how it discovered the function/algorithm behind the observed data but also why it generates the answers, including unseen ones.
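Here's a toy version of what I mean, just to be concrete (the candidate list is something I made up for illustration, not a real algorithm):

```python
# Guess-and-check search over candidate functions, the way I described it.
observations = {1: 1, 2: 4, 3: 9}  # partial answers to x^2 for x = 1, 2, 3

candidates = {
    "x + x":  lambda x: x + x,
    "x * x":  lambda x: x * x,
    "2 ** x": lambda x: 2 ** x,
}

for name, f in candidates.items():
    if all(f(x) == y for x, y in observations.items()):
        print(f"found it: {name}; predicts f(4) = {f(4)}")  # -> x * x, 16
        break
    else:
        print(f"{name}: nope, doesn't match observations")
```

The point is the search is over *ideas* (candidate functions), so when it finds one it can say which function it found and use it on unseen inputs.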
Backprop:
From what I've understood, Backprop learns by taking data and then tweaking the last output layer's weights, then the earlier layers' weights, backwards, so that the net outputs more accurate answers when prompted. It isn't just blindly tweaking the net until it finds a good brain; it uses (and needs) the data so that it can find patterns (functions) that generate the observed data. Backprop requires the network to be narrow in the middle and hierarchical in form so it can learn compressed representations (patterns). The world and Google have built on this approach; Backprop is used in something like 98% of AI. To make "Backprop" work better (lol...) they gave the net RNNs, then residuals, then LSTM gates for the vanishing gradient problem, eventually falling back to feedforward in Transformer architectures and using positional encoding instead. And they gave Backprop semantic embeddings, etc., because Backprop is not running the whole show, you see; we have to purposely make the network narrow. Really? And once you train it, you can't make it bigger unless you start over fresh; you can only lower the learning rate until it converges. You can't understand anything in the network. The world has built on a super complex, math-filled idea, adding band-aids to make it work better, all because it "worked better". Adding to this design is hard because it is not a good base to build on... The AI field was at Markov Chains and Hidden Markov Models, but then went to RNNs etc.
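To be concrete about my understanding of the mechanism, here's a toy version of that forward/backward weight tweaking on the same x^2 data (the layer sizes, learning rate, and step count are arbitrary choices of mine, not anyone's real setup):

```python
import numpy as np

# A tiny 1 -> 4 -> 1 net, to make "tweak the last layer, then the
# earlier layer, backwards" concrete.
rng = np.random.default_rng(0)
X = np.array([[1.0], [2.0], [3.0]])   # inputs x = 1, 2, 3
Y = X ** 2                            # observed answers 1, 4, 9

W1, b1 = rng.normal(size=(1, 4)), np.zeros(4)   # hidden layer weights
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)   # output layer weights
lr = 0.01

for step in range(5000):
    # forward pass: compute the net's current answers
    h = np.tanh(X @ W1 + b1)          # hidden activations
    pred = h @ W2 + b2                # outputs
    err = pred - Y                    # how wrong each answer is

    # backward pass: error flows from the output layer backwards
    dW2 = h.T @ err                   # gradient for the LAST layer first
    db2 = err.sum(axis=0)
    dh = (err @ W2.T) * (1 - h ** 2)  # chain rule pushes error backwards
    dW1 = X.T @ dh                    # ...then gradient for the EARLIER layer
    db1 = dh.sum(axis=0)

    # tweak every weight a little in the direction that lowers the error
    W2 -= lr * dW2; b2 -= lr * db2
    W1 -= lr * dW1; b1 -= lr * db1

print(pred.ravel())  # close to [1, 4, 9] after training
```

Notice that nothing in there ever names the function x^2; the "pattern" ends up smeared across W1 and W2, which is exactly the can't-understand-anything part I'm complaining about.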
How does Backprop work?:
As for how Backprop finds functions during its network tweaking, I still don't know how; explain it fully in clear English so there's no way of misunderstanding, if you can, with no math and no 5 pages of text. To me it seems to use the data and the rules it's given (semantics, the narrowed hierarchy, data) to create new rules that get more out of the data (as said, x^2= for x = 1, 2, 3, or 4). I'm not sure how it works or how well it works, but I'm confident there's no such thing as a free non-brute-force approach; it is not doing enough to find functions, it only has 1 rule, which is to lower cost. And if it IS using semantics etc. and the data to find the way to tweak the net, then it is doing something more human-like, as I suggested. I can't see how you could find the function that made the observed data without looking at the observed data and at related ideas that generate it.
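And here's what I mean by "only 1 rule": repeatedly nudging some knobs to lower the error on the observed data can recover x^2, but only because I hand-picked a model form that already contains it (the a*x^2 + b*x + c form and all the numbers below are my own assumptions, purely for illustration):

```python
# Fitting y = a*x^2 + b*x + c by nothing but "lower the cost".
xs = [1.0, 2.0, 3.0, 4.0]
ys = [x ** 2 for x in xs]           # the observed answers

a, b, c = 0.0, 0.0, 0.0             # start knowing nothing
lr = 0.001
for _ in range(200_000):
    for x, y in zip(xs, ys):
        err = (a * x * x + b * x + c) - y   # how wrong we are on this point
        # the only rule: nudge each knob in the direction that lowers cost
        a -= lr * err * x * x
        b -= lr * err * x
        c -= lr * err

print(round(a, 3), round(b, 3), round(c, 3))  # -> close to 1.0, 0.0, 0.0
```

So is that all Backprop is doing, just with millions of knobs instead of three? If so, where does the "finding the function" actually happen?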