Hi guys, I'm new here (literally created this account 5 minutes ago) and I want to ask a technical question about AI. I'm not sure whether this forum is suitable for technical questions, but here it is:
Context: I have been doing AI for a long time now, purely out of curiosity, and have no intention (for now) of applying it in any field. Lately I have been testing different kinds of neural networks to see which work and which don't. I program all of these in Processing, which is based on Java, and I don't use any libraries like TensorFlow at all. Because of that, it would be hard to post the full code, since you'd need a lot of time to comprehend it, so I'll just give the setup and the results I found here:
Network setup: Goal: recognize the MNIST handwritten digits (really popular dataset in AI).
Architecture: plain feed forward network
Dimensions: 28*28+1 neurons (input layer; the last input is always 1 in every sample, to play the role of the bias), 50 neurons (hidden layer), 10 neurons (output layer).
Correct answer format: if a 0 is presented, the first neuron in the output layer should be 1 and all the others 0, and likewise for every other digit.
Activation function: sigmoid
Learning constant: 0.04 (plain backpropagation).
Learning rule: Delta learning rule
Training samples: 1000
Batch size: 150 (the last batch has size 100)
Testing samples: 100
Evaluation functions used (a short code sketch of these is included after this list):
Error: sum over all samples( sum over all output neurons( abs(prediction - answer) ) )
Confidence: sum over all samples( sum over all output neurons( min(prediction, 1 - prediction) ) ) / number of samples. The more confident the network is in its outputs, the lower this value is.
Accuracy: sum over all samples( Kronecker delta( argmax(output neurons), argmax(correct answer neurons) ) ) / number of samples
Extension evaluation functions:
Confidence: confidence on the first batch of the training samples
Real confidence: confidence on the testing samples
Accuracy: accuracy on the first batch of the training samples
Real accuracy: accuracy on the testing samples
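Since posting the full program would take too long to read, here is a simplified Java/Processing-style sketch of the forward pass and the three metrics above. All the names (feedForward, errorOf, confidenceOf, accuracyOf, w1, w2) are made up for illustration; this isn't my actual code, just how the setup above translates into code:

```java
// Simplified sketch of the setup (illustrative names, not my actual code).
// Weights would be initialized randomly elsewhere; this only shows shapes and math.
float[][] w1 = new float[50][785];   // input (28*28 pixels + 1 bias input) -> hidden
float[][] w2 = new float[10][50];    // hidden -> output

float sigmoid(float x) {
  return 1.0f / (1.0f + (float) Math.exp(-x));
}

// Forward pass; the input's last element is always 1 (the bias trick described above).
float[] feedForward(float[] input) {
  float[] hidden = new float[50];
  for (int j = 0; j < 50; j++) {
    float sum = 0;
    for (int i = 0; i < 785; i++) sum += w1[j][i] * input[i];
    hidden[j] = sigmoid(sum);
  }
  float[] output = new float[10];
  for (int k = 0; k < 10; k++) {
    float sum = 0;
    for (int j = 0; j < 50; j++) sum += w2[k][j] * hidden[j];
    output[k] = sigmoid(sum);
  }
  return output;
}

// Error: sum over all samples and all output neurons of abs(prediction - answer).
float errorOf(float[][] predictions, float[][] answers) {
  float total = 0;
  for (int s = 0; s < predictions.length; s++)
    for (int k = 0; k < 10; k++)
      total += Math.abs(predictions[s][k] - answers[s][k]);
  return total;
}

// Confidence: average distance of each output from the nearest of 0 and 1.
// The more saturated (confident) the outputs are, the lower this value gets.
float confidenceOf(float[][] predictions) {
  float total = 0;
  for (int s = 0; s < predictions.length; s++)
    for (int k = 0; k < 10; k++)
      total += Math.min(predictions[s][k], 1 - predictions[s][k]);
  return total / predictions.length;
}

// Accuracy: fraction of samples where the most active output neuron is the correct digit.
float accuracyOf(float[][] predictions, float[][] answers) {
  int correct = 0;
  for (int s = 0; s < predictions.length; s++)
    if (argmax(predictions[s]) == argmax(answers[s])) correct++;
  return (float) correct / predictions.length;
}

int argmax(float[] v) {
  int best = 0;
  for (int i = 1; i < v.length; i++) if (v[i] > v[best]) best = i;
  return best;
}
```

Note that confidence is computed on the raw sigmoid outputs, so it can never go below 0; a perfectly saturated network would score exactly 0.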
These are the graphs I collected from running this:
This is a closer look at the beginning:
Observations:
- Accuracy and real accuracy graphs closely match each other in shape
- Confidence and real confidence graphs closely match each other in shape
- Accuracy keeps improving over time, but most of the time it stays flat and unchanging and only increases at specific points (let's call them spikes)
- When a spike occurs, confidence increases dramatically, accuracy increases by a bit and error nudges a bit and slopes downward faster, eventually reaching a new, lower equilibrium.
Further experiments:
- The spikes seem to occur at random and you can't really tell when one is coming up
- Spikes stop happening when you reach an accuracy of around 0.96 (96% correct)
- Sometimes the error drops sharply to 0 and rises back up to its nominal value just before a spike happens.
- When I run it with a learning constant of 0.02, the spikes don't appear at all and the network races towards 0.96 accuracy right away. A summary graph of it can be found here: http://157239n.com/frame%203.png
- At first, I suspected the spikes were due to the network experiencing learning slowdown, owing to the nature of the sigmoid activation function combined with the quadratic cost function. That slowdown can be dealt with by using the cross-entropy cost function, but then why don't the spikes appear with a learning constant of 0.02, when the network should still be stuck with the same slowdown? I'm working on implementing the cross-entropy cost function to see what happens (a rough sketch of the change is below), but in the meantime, that's all the information I've got.
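To make the cross-entropy point concrete, the only piece that should change is the delta at the output layer during backpropagation. A rough sketch (names are mine, assuming sigmoid output neurons):

```java
// Output-layer delta with the quadratic cost: the sigmoid-prime factor
// output * (1 - output) goes to 0 when the neuron saturates near 0 or 1,
// which is the "learning slowdown" I suspect is behind the flat stretches.
float quadraticDelta(float output, float target) {
  return (output - target) * output * (1 - output);
}

// Output-layer delta with the cross-entropy cost: the sigmoid-prime factor
// cancels out, so a saturated-but-wrong neuron still gets a large gradient.
float crossEntropyDelta(float output, float target) {
  return output - target;
}
```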
My question: how can those spikes be explained? What are they, exactly? What causes them? And can I somehow trigger a spike to happen so that the network can continue to learn?
Thanks in advance if anyone knows anything about this. Let me know if you have any questions.