In this article, we delve into empirical risk minimization (ERM) for generalized linear models (GLMs). ERM is a fundamental problem in learning theory and statistics with far-reaching applications. In particular, we focus on optimizing GLMs, which include linear regression, logistic regression, and ℓ_p regression [BCLL18, AKPS19b].
The ERM Problem
Consider a GLM with loss functions f_1, …, f_m : ℝ → ℝ; this family covers linear regression, logistic regression, and ℓ_p regression [BCLL18, AKPS19b]. Given data vectors a_1, …, a_m ∈ ℝ^n, our goal is to find the parameters x ∈ ℝ^n that minimize the empirical risk, i.e., the total loss F : ℝ^n → ℝ given by F(x) = Σ_{i=1}^m f_i(⟨a_i, x⟩), such that x* = arg min_x F(x).
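To make the setup concrete, here is a minimal sketch of evaluating and minimizing F for one common instance, logistic regression, where each f_i bakes its label b_i into the loss so it remains a function of the single scalar ⟨a_i, x⟩. The synthetic data, the variable names (A, b), and the choice of L-BFGS are illustrative assumptions, not part of the setup above.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical data: m observations in n dimensions, binary labels for logistic loss.
rng = np.random.default_rng(0)
m, n = 1000, 20
A = rng.normal(size=(m, n))          # rows are the data vectors a_i
b = rng.integers(0, 2, size=m)       # labels in {0, 1}

def F(x):
    """Empirical risk F(x) = sum_i f_i(<a_i, x>) with logistic loss
    f_i(t) = log(1 + exp(t)) - b_i * t (numerically stable via logaddexp)."""
    t = A @ x
    return np.sum(np.logaddexp(0.0, t) - b * t)

def grad_F(x):
    """Gradient: sum_i f_i'(<a_i, x>) a_i = A^T (sigmoid(t) - b)."""
    t = A @ x
    return A.T @ (1.0 / (1.0 + np.exp(-t)) - b)

res = minimize(F, np.zeros(n), jac=grad_F, method="L-BFGS-B")
x_star = res.x                        # approximate arg min_x F(x)
print("optimal empirical risk:", res.fun)
```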
Sparsity Considerations
One of the challenges in ERM is dealing with large datasets. When the number of observations is large, the risk function becomes computationally expensive to optimize. To overcome this hurdle, we develop a multiscale notion of "importance scores" for down-sampling F into a sparse representation. This allows us to approximate the objective value with lower computational complexity while maintaining good multiplicative accuracy.
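The multiscale importance scores developed here are more involved than anything shown below, but the down-sampling pattern they plug into can be sketched simply: keep each loss term independently with probability p_i proportional to its score, and reweight kept terms by 1/p_i. In this sketch, classical leverage scores of the data matrix serve purely as a hypothetical stand-in score, and the function names (leverage_scores, sparsify) are illustrative.

```python
import numpy as np

def leverage_scores(A):
    """Classical leverage scores tau_i = a_i^T (A^T A)^+ a_i of the rows of A,
    used here as a stand-in for the multiscale importance scores."""
    U, _, _ = np.linalg.svd(A, full_matrices=False)
    return np.sum(U**2, axis=1)

def sparsify(A, b, num_samples, rng):
    """Down-sample the m loss terms to roughly num_samples reweighted terms.

    Term i is kept independently with probability p_i proportional to its score
    and reweighted by 1/p_i, so the sparse objective
    F_tilde(x) = sum over kept i of (1/p_i) * f_i(<a_i, x>)
    is an unbiased estimator of F(x)."""
    scores = leverage_scores(A)
    p = np.minimum(1.0, num_samples * scores / scores.sum())
    keep = rng.random(A.shape[0]) < p
    return A[keep], b[keep], 1.0 / p[keep]

# Example usage on hypothetical data: keep roughly 100 of 10,000 terms.
rng = np.random.default_rng(0)
A = rng.normal(size=(10_000, 20))
b = rng.integers(0, 2, size=10_000)
A_s, b_s, w = sparsify(A, b, num_samples=100, rng=rng)
print(f"kept {len(w)} of {A.shape[0]} loss terms")
```

The reweighting makes the sparse objective unbiased for any fixed x; the hard part, which the multiscale importance scores address, is choosing the sampling probabilities so that the down-sampled objective also tracks F to good multiplicative accuracy for every x simultaneously.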
Conclusion
In conclusion, this article examines ERM for GLMs, addressing the challenge of large datasets by developing multiscale importance scores for sparse approximation. Together, these ideas clarify the computational core of ERM and its role in learning theory and statistics.