MULTILAYERED NEURAL NETWORK WITH AN AMSGrad OPTIMIZATION LEARNING METHOD
Abstract
In this research, we tested the AMSGrad optimization learning method for a multilayered neural network using the logistic function. This function describes the doubling of the number of local minima and the Fourier spectra of the error function. When the neural network is retrained, the learning error function of each neuron is characterized by a set of wave vectors of different periodicities. The average learning error over all neurons can therefore be regarded as an average over all existing periodicities. The wave vector of the total oscillation can take both commensurate and incommensurate values. The appearance of local minima is shown to be caused by non-homogeneous learning of the neural network, which is related to the retraining of individual neurons. As the learning rate increases, the number of local minima grows, and so does the number of such neurons.
The AMSGrad optimization method reduces the number of retrained neurons by controlling the exponential decay rates of the averaged gradient and of the averaged squared gradient of the error (objective) function. In other words, the learning rate of each neuron is corrected individually, which removes the degeneracy of the system by preventing the retraining of individual neurons.
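The per-parameter correction described above can be illustrated with a minimal sketch of the standard AMSGrad update rule. This is not the authors' implementation: the function names `init_state` and `amsgrad_step`, the hyperparameter values (common defaults), the small constant `eps` added for numerical safety, and the toy quadratic objective are all assumptions made for illustration. Bias correction is omitted.

```python
import math


def init_state(n):
    """Create zero-initialised optimizer state for n parameters (hypothetical helper)."""
    return {"m": [0.0] * n, "v": [0.0] * n, "v_hat": [0.0] * n}


def amsgrad_step(params, grads, state, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one AMSGrad update in place and return the parameter list.

    beta1 and beta2 are the exponential decay rates of the running averages
    of the gradient and of the squared gradient. The values used here are
    common defaults, not values taken from the article.
    """
    for i, g in enumerate(grads):
        # Exponentially decaying average of the gradient.
        state["m"][i] = beta1 * state["m"][i] + (1.0 - beta1) * g
        # Exponentially decaying average of the squared gradient.
        state["v"][i] = beta2 * state["v"][i] + (1.0 - beta2) * g * g
        # AMSGrad keeps the running maximum of v, so the effective
        # per-parameter step size can never increase.
        state["v_hat"][i] = max(state["v_hat"][i], state["v"][i])
        # eps is an assumption added to avoid division by zero.
        params[i] -= lr * state["m"][i] / (math.sqrt(state["v_hat"][i]) + eps)
    return params


# Toy objective (assumption): f(w) = sum(w_i ** 2), whose gradient is 2 * w_i.
w = [1.0, -2.0]
state = init_state(len(w))
for _ in range(2000):
    grads = [2.0 * wi for wi in w]
    amsgrad_step(w, grads, state)
```

Each parameter gets its own effective step size `lr / sqrt(v_hat)`, which is the sense in which the learning rate is corrected individually. Because `v_hat` is a running maximum, a parameter that has seen large gradients keeps a small step even after its gradients shrink.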
Keywords: multilayered neural network, AMSGrad method, local minima, block structure.
DOI: http://dx.doi.org/10.30970/eli.25.4