论文标题
随机马鞍点问题的概括范围
Generalization Bounds for Stochastic Saddle Point Problems
论文作者
论文摘要
本文研究了随机鞍点(SSP)问题的经验鞍点(ESP)解决方案的概括界限。对于具有Lipschitz连续且强烈凸出的凹面目标函数的SSP,我们建立了一个$ \ Mathcal {O}(1/n)$通过使用统一稳定性参数限制的概括。我们还提供了多种假设下的概括界限,包括没有强凸度和没有界面域的情况。我们在两个示例中说明了我们的结果:马尔可夫决策过程中的批处理政策学习,以及对随机游戏的NASH均衡估计的混合策略。在每个示例中,我们都表明,正则化的ESP解决方案具有近乎最佳的样本复杂性。据我们所知,这是ESP概括理论的第一组结果。
This paper studies the generalization bounds for the empirical saddle point (ESP) solution to stochastic saddle point (SSP) problems. For SSP with Lipschitz continuous and strongly convex-strongly concave objective functions, we establish an $\mathcal{O}(1/n)$ generalization bound by using a uniform stability argument. We also provide generalization bounds under a variety of assumptions, including the cases without strong convexity and without bounded domains. We illustrate our results in two examples: batch policy learning in Markov decision process, and mixed strategy Nash equilibrium estimation for stochastic games. In each of these examples, we show that a regularized ESP solution enjoys a near-optimal sample complexity. To the best of our knowledge, this is the first set of results on the generalization theory of ESP.
