In this paper, we analyze the convergence properties of gradient methods for logistic regression with L1 regularization. The goal is to minimize an L1-regularized logistic loss, that is, to find the coefficients of a logistic model that accurately predict a binary label from input features.
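For concreteness, the objective in question can be written as below; the symbols (w for the coefficient vector, λ for the regularization weight, n data points (x_i, y_i)) are our notation rather than the paper's.

```latex
\min_{w \in \mathbb{R}^{d}} \; F(w)
  \;=\; \underbrace{\frac{1}{n}\sum_{i=1}^{n}
        \log\!\bigl(1 + \exp\!\bigl(-y_i\, x_i^{\top} w\bigr)\bigr)}_{\text{smooth logistic loss}}
  \;+\; \underbrace{\lambda\,\lVert w \rVert_{1}}_{\text{nonsmooth L1 penalty}},
  \qquad y_i \in \{-1, +1\}.
```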
To understand the convergence behavior of gradient methods in this context, we first introduce the notion of a limit cycle and the difficulty of escaping it. Because of the nonsmooth L1 term, a gradient method with a fixed step size can settle into a repeating cycle, oscillating around a point instead of converging to it, which stalls progress or can even cause divergence. By analyzing the worst-case behavior of gradient methods with respect to this limit cycle, we determine the choice of step size that best limits the risk of getting trapped in such cycles.
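As a minimal sketch of the phenomenon (not the paper's analysis), the following fixed-step subgradient iteration on the objective above shows how the nonsmooth L1 term can make iterates oscillate; the data, step size, and regularization weight are illustrative assumptions, and `subgradient_step` is a name we introduce here.

```python
import numpy as np

def subgradient_step(w, X, y, lam, step):
    """One fixed-step subgradient iteration for L1-regularized logistic regression.

    Uses sign(w) as the subgradient of the L1 term (taken to be 0 at w_j == 0);
    this is what makes the iterates overshoot the kink at zero and oscillate.
    """
    margins = y * (X @ w)
    # Gradient of the average logistic loss.
    grad_loss = -(X.T @ (y / (1.0 + np.exp(margins)))) / len(y)
    return w - step * (grad_loss + lam * np.sign(w))

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
y = np.sign(X @ np.array([1.0, -2.0, 0.0, 0.0, 0.5]) + 0.1 * rng.standard_normal(200))

w = np.zeros(5)
for _ in range(1000):
    w = subgradient_step(w, X, y, lam=0.1, step=0.1)
# With a constant step, coordinates that should be exactly zero typically keep
# flipping sign instead of settling, i.e. the iterates enter a limit cycle.
print(w)
```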
Next, we explore three-term splitting as a way to avoid the limit cycle problem without any lifting, that is, without introducing auxiliary variables. Rather than applying a single gradient step to the whole objective, splitting handles each term with its own update within one iteration, typically a gradient step for the smooth loss and a proximal step for the nonsmooth L1 penalty. We show that, under certain assumptions, this approach converges faster than the standard gradient method with a fixed step size.
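The exact splitting used in the paper is not spelled out in this summary; as a hedged illustration, here is the standard Davis-Yin three-operator iteration for an objective f + g + h with h smooth, instantiated with f = λ‖·‖₁, g ≡ 0, and h the logistic loss. The function names and this particular decomposition are our assumptions for the sketch.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (componentwise soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def logistic_grad(w, X, y):
    """Gradient of the average logistic loss, i.e. the smooth term h."""
    margins = y * (X @ w)
    return -(X.T @ (y / (1.0 + np.exp(margins)))) / len(y)

def three_operator_splitting(X, y, lam, step, iters=500):
    """Davis-Yin iteration for f + g + h with, here,
    f = lam * ||.||_1 (prox = soft_threshold), g = 0 (prox = identity),
    and h = logistic loss (handled through its gradient)."""
    z = np.zeros(X.shape[1])
    x_f = z
    for _ in range(iters):
        x_g = z                                     # prox of g = 0 is the identity
        x_f = soft_threshold(2.0 * x_g - z - step * logistic_grad(x_g, X, y),
                             step * lam)
        z = z + x_f - x_g                           # update of the governing sequence
    return x_f
```

With g ≡ 0 this particular instance collapses to the familiar proximal-gradient (soft-thresholding) update; the point of the sketch is only to show how each term of the objective receives its own gradient or proximal step within a single iteration, with no auxiliary variables introduced.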
Finally, we discuss an adaptive variant of three-term splitting based on the Barzilai-Borwein (BB) step size, which can lead to even faster convergence. Instead of a fixed step size, the step is adjusted at every iteration using recent iterate and gradient information. We compare the performance of these approaches in simulations and show that they all converge to the same optimal solution, but at different speeds and levels of accuracy.
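The specific BB variant used in the paper is not detailed in this summary; as a sketch of how such a rule could be plugged into the splitting iteration, the following proximal-gradient loop recomputes the step from differences of successive iterates and smooth-term gradients (the BB1 formula). The helper names and the fallback constant are illustrative assumptions.

```python
import numpy as np

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def logistic_grad(w, X, y):
    margins = y * (X @ w)
    return -(X.T @ (y / (1.0 + np.exp(margins)))) / len(y)

def prox_grad_bb(X, y, lam, step0=1.0, iters=500):
    """Proximal-gradient iteration with a Barzilai-Borwein (BB1) step size:
    step_k = <s, s> / <s, g_diff>, computed from the differences of successive
    iterates and successive gradients of the smooth term."""
    w = np.zeros(X.shape[1])
    g = logistic_grad(w, X, y)
    step = step0
    for _ in range(iters):
        w_new = soft_threshold(w - step * g, step * lam)
        g_new = logistic_grad(w_new, X, y)
        s, g_diff = w_new - w, g_new - g
        curvature = s @ g_diff
        # Fall back to the initial step when the curvature estimate is not positive.
        step = (s @ s) / curvature if curvature > 1e-12 else step0
        w, g = w_new, g_new
    return w
```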
In summary, this paper provides insight into the convergence properties of gradient methods for logistic regression with L1 regularization. By analyzing the limit cycle problem and comparing the optimization techniques above, we clarify how to choose the step size for faster convergence and improved accuracy.
Mathematics, Optimization and Control