Abstract
An adaptive back-propagation algorithm is studied and compared with gradient descent (standard back-propagation) for on-line learning in two-layer neural networks with an arbitrary number of hidden units. Within a statistical mechanics framework, both numerical studies and a rigorous analysis show that adaptive back-propagation trains faster than gradient descent: it breaks the symmetry between hidden units more efficiently and converges more quickly to optimal generalization.
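To make the on-line learning setting concrete, below is a minimal sketch of the kind of two-layer model typically analysed in this statistical mechanics literature: a soft committee machine student (K hidden units, hidden-to-output weights fixed to one, erf activation) trained one example at a time on data generated by a teacher of the same form. The scaling parameter `beta` inside the back-propagated hidden-unit error is a hypothetical stand-in for the adaptive modification; `beta = 1` recovers standard gradient descent. The architecture, activation function, and the precise form of the adaptive term are assumptions for illustration, not taken verbatim from the paper.

```python
import numpy as np
from scipy.special import erf

# Sketch of on-line learning in a two-layer soft committee machine:
# K hidden units, hidden-to-output weights fixed to 1, erf-style activation.
# The "adaptive" variant is represented by a hypothetical parameter beta that
# rescales the argument of the derivative in the hidden-unit error term;
# beta = 1 gives plain on-line gradient descent (standard back-propagation).

def g(x):
    # sigmoidal activation commonly used in this line of analysis
    return erf(x / np.sqrt(2))

def g_prime(x):
    # derivative of erf(x / sqrt(2))
    return np.sqrt(2.0 / np.pi) * np.exp(-x**2 / 2.0)

def online_update(J, xi, zeta, eta, beta=1.0):
    """One on-line step for student weights J (K x N) on example (xi, zeta)."""
    N = xi.shape[0]
    x = J @ xi / np.sqrt(N)            # hidden-unit pre-activations
    sigma = g(x).sum()                 # student output (sum over hidden units)
    error = zeta - sigma
    delta = error * g_prime(beta * x)  # back-propagated hidden errors (scaled)
    return J + (eta / N) * np.outer(delta, xi)

# Toy usage: teacher with M hidden units, student started almost symmetric.
rng = np.random.default_rng(0)
N, K, M = 100, 3, 3
B = rng.standard_normal((M, N))          # teacher weights
J = 1e-3 * rng.standard_normal((K, N))   # nearly symmetric student initialisation

for step in range(10000):
    xi = rng.standard_normal(N)
    zeta = g(B @ xi / np.sqrt(N)).sum()  # teacher-generated label
    J = online_update(J, xi, zeta, eta=1.0, beta=1.0)
```

Starting the student nearly symmetric (all rows of J close to zero and to each other) mimics the symmetric plateau regime in which the benefit of a modified back-propagation term is claimed to appear; in this sketch that claim is not reproduced, only the training loop it refers to.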
Original language | English |
---|---|
Title of host publication | Proceedings of the Neural Information Processing Systems |
Editors | David S. Touretzky, Michael C. Mozer, Michael E. Hasselmo |
Place of Publication | Boston |
Publisher | MIT Press |
Volume | 8 |
ISBN (Print) | 0262201070 |
Publication status | Published - 1996 |
Event | Neural Information Processing Systems 95 - Duration: 1 Jan 1996 → 1 Jan 1996 |
Conference
Conference | Neural Information Processing Systems 95 |
---|---|
Period | 1/01/96 → 1/01/96 |
Bibliographical note
Copyright of the Massachusetts Institute of Technology Press (MIT Press)

Keywords
- adaptive back-propagation
- algorithm
- gradient descent
- neural networks
- statistical mechanics