Do any neural network training schemes vary the nonlinear activation function?


elpidiovaldez5

I have just read about ‘looks linear’ initialisation (https://arxiv.org/pdf/1702.08591.pdf). It offers the possibility of training deep networks simply by initialising their weights properly, i.e. so that the non-linearities cancel out and the network initially 'looks linear'. It made me wonder whether anybody has tried training a deep network with an activation function that starts out linear and gradually becomes nonlinear as training proceeds. Without non-linearity a deep network is equivalent to a single linear layer, no matter how many layers it contains, so at the outset of training the network is essentially one layer deep and very easy to train (gradients neither decay nor explode). As training proceeds the non-linearity could be increased gradually; I like to imagine the network slowly 'expanding' to a greater number of layers. Of course the weights would need to be adjusted to track the changing network. It would be nice if the process could be controlled so that the network expands to the number of layers necessary to solve the problem and then stops.
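
For concreteness, here is a minimal sketch of the kind of schedule I have in mind, assuming the simplest possible interpolation between identity and ReLU (the module name GraduallyNonlinear, the blend parameter alpha, and the set_alpha helper are purely illustrative, not anything from the paper):

import torch
import torch.nn as nn

class GraduallyNonlinear(nn.Module):
    """Blend between identity and ReLU, controlled by a schedule value alpha.

    alpha = 0.0: f(x) = x, so the whole stack collapses to one linear map
                 and gradients neither decay nor explode.
    alpha = 1.0: f(x) = relu(x), the usual nonlinearity.
    """
    def __init__(self, alpha: float = 0.0):
        super().__init__()
        self.alpha = alpha  # set externally by the training loop, not learned

    def forward(self, x):
        return (1.0 - self.alpha) * x + self.alpha * torch.relu(x)


def set_alpha(model: nn.Module, step: int, total_steps: int) -> None:
    """Ramp alpha from 0 to 1 over the first half of training."""
    alpha = min(1.0, 2.0 * step / total_steps)
    for m in model.modules():
        if isinstance(m, GraduallyNonlinear):
            m.alpha = alpha


# Usage sketch: a deep MLP that starts out exactly linear and "expands"
# into a nonlinear 20-layer network as alpha is ramped up during training.
net = nn.Sequential(*[layer
                      for _ in range(20)
                      for layer in (nn.Linear(128, 128), GraduallyNonlinear())])

Whether the weights can actually track the changing activation is exactly the open question; presumably the ramp would have to be slow relative to the learning rate.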

Has anyone heard of anything like this? Opinions?


keghn

A visual proof that neural nets can compute any function:

http://neuralnetworksanddeeplearning.com/chap4.html



Which Activation Function Should I Use?



keghn

When neural networks are used as transmission lines to filters, then yes. The brain does not really use multiplexing the way computers do, but it can have many parallel paths switched on at the same time, each leading to a filter, which is itself a little neural network.

Multiplexing: Doing more with less // Technology