Comparison of the efficiency of zero and first order minimization methods in neural networks
https://doi.org/10.26425/1816-4277-2022-11-48-55
Abstract
To minimize the objective function in neural networks, first-order methods are usually used, which involve the repeated calculation of the gradient. The number of variables in modern neural networks can reach many thousands or even millions. Numerous experiments show that the time needed to compute the gradient of a function of N variables analytically is approximately N/5 times longer than the time needed to compute the function itself. The article considers the possibility of using zero-order methods to minimize the function. In particular, a new zero-order minimization method, descent over two-dimensional subspaces, is proposed. The convergence rates of three different methods are compared: standard gradient descent with automatic step selection, coordinate descent with step selection for each coordinate, and descent over two-dimensional subspaces. It is shown that, in the considered neural network training problems, the efficiency of properly organized zero-order methods is not lower than that of gradient methods.
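To illustrate how a zero-order method with an individual step for each coordinate can be organized, the sketch below minimizes the classic Rosenbrock test function by coordinate descent in Python. This is a minimal illustration only: the accept/reject rule and the grow/shrink factors are assumptions made for this example and do not reproduce the exact step-selection procedure studied in the article.

import numpy as np

def rosenbrock(x):
    # Classic multidimensional test function with minimum at x = (1, ..., 1).
    return np.sum(100.0 * (x[1:] - x[:-1] ** 2) ** 2 + (1.0 - x[:-1]) ** 2)

def coordinate_descent(f, x0, step=0.1, shrink=0.5, grow=1.2, n_epochs=200):
    # Zero-order coordinate descent with an individual step for each coordinate:
    # try a move along +/- e_i; if f decreases, accept it and enlarge that step,
    # otherwise shrink it. Only function values are used, no gradients.
    x = np.asarray(x0, dtype=float).copy()
    steps = np.full(x.size, step)
    fx = f(x)
    for _ in range(n_epochs):
        for i in range(x.size):
            for direction in (+1.0, -1.0):
                trial = x.copy()
                trial[i] += direction * steps[i]
                f_trial = f(trial)
                if f_trial < fx:
                    x, fx = trial, f_trial
                    steps[i] *= grow
                    break
            else:
                # Neither direction improved f: reduce the step for this coordinate.
                steps[i] *= shrink
    return x, fx

x_opt, f_opt = coordinate_descent(rosenbrock, np.zeros(5))
print(x_opt, f_opt)

Descent over two-dimensional subspaces, the method proposed in the article, generalizes this idea by minimizing over pairs of directions at each stage rather than along a single coordinate at a time.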
About the Authors
E. A. Gubareva
Russian Federation
Elena A. Gubareva – Cand. Sci. (Phys. and Math.), Assoc. Prof. at the Department of Mathematics and Computer Science
Moscow
S. I. Khashin
Russian Federation
Sergey I. Khashin – Cand. Sci. (Phys. and Math.), Assoc. Prof. at the Information Technologies and Applied Mathematics Department
Ivanovo
E. S. Shemyakova
United States
Ekaterina S. Shemyakova – Ph.D., Assoc. Prof. at the Department of Mathematics and Statistics
Toledo
For citations:
Gubareva E.A., Khashin S.I., Shemyakova E.S. Comparison of the efficiency of zero and first order minimization methods in neural networks. Vestnik Universiteta. 2022;1(11):48-55. (In Russ.) https://doi.org/10.26425/1816-4277-2022-11-48-55