Do any neural network training schemes vary the nonlinear activation function?



  • Roomba
I have just read about 'looks linear' initialisation. This offers the possibility of training deep networks simply by initialising their weights properly (i.e. so that the non-linearities cancel out and the network initially 'looks linear'). It made me wonder whether anybody has tried training a deep network with an activation function that starts out linear and gradually becomes nonlinear as training proceeds.

Without non-linearity, a deep network is equivalent to a single linear layer, no matter how many layers it contains. Thus, at the outset of training, the deep network would effectively be one layer deep and very easy to train (the activation function itself would not cause gradients to decay or explode). As training proceeds, the non-linearity could be increased gradually. I like to imagine this as the network slowly 'expanding' to a greater number of effective layers. Of course, the weights would need to be adjusted to track the changing network. It would be nice if the process could be controlled so that the network expands to the number of layers necessary to solve the problem and then stops.
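To make the idea concrete, here is a minimal NumPy sketch of what I have in mind. It is only an illustration under my own assumptions, not an established method: the names `blended_relu`, `forward` and `alpha_schedule` are hypothetical, the blend between identity and ReLU is just one possible choice, and the linear ramp for `alpha` is arbitrary. With `alpha = 0` the five-layer network gives the same output as a single matrix (the product of its weights); with `alpha = 1` it is an ordinary ReLU network.

```python
import numpy as np

def blended_relu(x, alpha):
    """Interpolate between the identity (alpha=0) and ReLU (alpha=1)."""
    return (1.0 - alpha) * x + alpha * np.maximum(x, 0.0)

def forward(x, weights, alpha):
    """Apply each weight matrix followed by the blended activation."""
    h = x
    for W in weights:
        h = blended_relu(h @ W, alpha)
    return h

def alpha_schedule(step, total_steps):
    """Hypothetical schedule: ramp alpha linearly from 0 to 1, then hold."""
    return min(1.0, step / total_steps)

rng = np.random.default_rng(0)
weights = [rng.normal(size=(4, 4)) / 2.0 for _ in range(5)]
x = rng.normal(size=(3, 4))

# alpha = 0: the five layers collapse to one matrix, W1 @ W2 @ ... @ W5.
collapsed = np.linalg.multi_dot(weights)
linear_out = forward(x, weights, alpha=0.0)
is_linear = np.allclose(linear_out, x @ collapsed)

# alpha = 1: an ordinary ReLU network, so every output is non-negative.
relu_out = forward(x, weights, alpha=1.0)
```

Note that this particular blend is the same thing as a leaky ReLU whose negative-side slope is `1 - alpha`, so the schedule amounts to lowering that slope from 1 to 0 during training. In a real training loop `alpha_schedule(step, total_steps)` would be evaluated at every step and passed to `forward`.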

Has anyone heard of anything like this? Opinions?



  • Trusty Member
  • Starship Trooper
A visual proof that neural nets can compute any function:

Which Activation Function Should I Use? : 

