Optimization algorithms are essential for minimizing loss (or objective) functions in machine learning and deep learning. One of the challenges they face is choosing an appropriate learning rate: a low learning rate generally leads to slow convergence, whereas a large learning rate causes the loss function to fluctuate around the minimum. Because the learning rate is a hyperparameter, it must be fixed before training begins, and tuning it is time-consuming. This paper proposes a modified stochastic gradient descent (mSGD) algorithm that uses a random learning rate. At every iteration, random numbers are generated as candidate learning rates, and the candidate that yields the smallest value of the loss function is chosen. The proposed mSGD algorithm can therefore reduce the time required to determine the learning rate. In fact, the k-point mSGD algorithm can be regarded as a kind of steepest descent algorithm. In an experiment on the MNIST dataset of handwritten digits, the mSGD algorithm converges much better than the SGD algorithm and slightly better than the AdaGrad and Adam algorithms.
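The random-learning-rate idea can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that "k-point" means k candidate learning rates are drawn per iteration, that the candidates are sampled uniformly from [0, lr_max), and that they are compared on the same mini-batch used for the gradient; the abstract specifies none of these details. The names `msgd_step`, `run_demo`, `k`, and `lr_max`, and the toy least-squares problem, are hypothetical.

```python
import numpy as np


def msgd_step(w, X_batch, y_batch, rng, k=5, lr_max=1.0):
    """One illustrative k-point step on a mean-squared-error loss."""

    def loss(v):
        residual = X_batch @ v - y_batch
        return 0.5 * np.mean(residual ** 2)

    # Mini-batch gradient of the loss at the current parameters.
    grad = X_batch.T @ (X_batch @ w - y_batch) / len(y_batch)

    # Assumption: candidate learning rates are uniform on [0, lr_max).
    lrs = rng.uniform(0.0, lr_max, size=k)

    # Try every candidate along the negative gradient and keep the one
    # with the smallest mini-batch loss.
    candidates = [w - lr * grad for lr in lrs]
    losses = [loss(c) for c in candidates]
    best = int(np.argmin(losses))
    return candidates[best], lrs[best]


def run_demo(steps=200, batch_size=32, seed=0):
    """Fit a noise-free linear model; return (initial, final) full-data loss."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(200, 3))
    w_true = np.array([1.5, -2.0, 0.5])  # hypothetical target weights
    y = X @ w_true

    def full_loss(v):
        return 0.5 * np.mean((X @ v - y) ** 2)

    w = np.zeros(3)
    initial = full_loss(w)
    for _ in range(steps):
        # Sample a mini-batch without replacement.
        idx = rng.choice(len(y), size=batch_size, replace=False)
        w, _ = msgd_step(w, X[idx], y[idx], rng)
    return initial, full_loss(w)


if __name__ == "__main__":
    initial, final = run_demo()
    print(f"full-data loss: {initial:.4f} -> {final:.3e}")
```

Under these assumptions, each iteration costs k extra loss evaluations in exchange for skipping a separate learning-rate search. As k grows, picking the best of k sampled step sizes along the negative gradient increasingly resembles a line search, which is one plausible reading of the steepest descent connection stated above.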