Created: April 12, 2025
Optimization of Variational Inequalities
As mentioned in Section 3, both saddle point optimization and game optimization can be cast as
variational inequality problems. In this section, we present methods to solve (VIP).
Standard batch methods. The two standard methods studied in the VIP literature are the gradient
method (Bruck, 1977) and the extragradient method (Korpelevich, 1976). The iterates of the gradient
method are known to converge linearly when the operator is strongly monotone (Chen and Rockafellar, 1997), but they oscillate for a bilinear operator as shown in Fig. 1. On the other hand, the uniform average of the iterates of the gradient method converges for any monotone operator at an O(1/√t) rate (Nedić and Ozdaglar, 2009). Finally, the extragradient method does not require any averaging to
converge for monotone operators, and can even converge at the faster O(1/t) rate (Nesterov, 2007).
The idea of this method is to compute a lookahead step (see extrapolation step in Alg. 2) in order to
compute a more stable direction to follow. As we have seen in Fig. 1, oscillations for the gradient
method may result from the game between the generator and the discriminator. The extragradient
method takes advantage of the regularity of F to prevent such oscillations.
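To make the contrast concrete, here is a minimal numerical sketch (assuming the simple bilinear operator F(x, y) = (y, −x), which is monotone but not strongly monotone): the plain gradient iterates spiral away from the solution, while the extragradient lookahead step pulls them back in. The operator, step size, and function names below are illustrative choices, not a transcription of Alg. 2.

```python
# Minimal sketch: gradient vs. extragradient on the bilinear game min_x max_y x*y,
# whose VIP operator is F(x, y) = (y, -x). Step size and operator are assumptions.
import numpy as np

def F(z):
    # Monotone (but not strongly monotone) bilinear operator.
    x, y = z
    return np.array([y, -x])

def gradient_method(z0, eta=0.1, steps=100):
    z = z0.copy()
    for _ in range(steps):
        z = z - eta * F(z)          # simultaneous gradient step: iterates oscillate/diverge
    return z

def extragradient_method(z0, eta=0.1, steps=100):
    z = z0.copy()
    for _ in range(steps):
        z_half = z - eta * F(z)     # extrapolation (lookahead) step
        z = z - eta * F(z_half)     # update using the operator at the lookahead point
    return z

z0 = np.array([1.0, 1.0])
print(np.linalg.norm(gradient_method(z0)))       # norm grows: iterates spiral outward
print(np.linalg.norm(extragradient_method(z0)))  # norm shrinks toward the solution (0, 0)
```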
Stochastic extensions. We present extensions of these standard methods in the context of a stochastic operator, i.e. F(x) := Eξ∼P[F(x, ξ)]. We assume that this expectation cannot be computed and
that we can only sample ξ ∼ P and compute F(x, ξ). This is motivated from the GAN formulation
where we only have access to a finite sample estimate of the expectation, and the gradients are usually
computed on mini-batches. For GANs, ξ is thus a mini-batch of points coming from the true data
distribution p and the generator distribution qθ. In the following, we focus on stochastic algorithms
designed to solve (VIP). These algorithms use stochastic approximations of the gradients (6) of
the respective cost functions of the discriminator and the generator. To handle constraints such as
parameter clipping (Arjovsky et al., 2017), we present a projected version of the algorithms, where
PX(y) denotes the projection of y onto X (see App. A). Note that when X = R^d, the projection is the identity mapping (unconstrained setting). For our analysis, we require one of the two following assumptions on the stochastic operator F(x, ξ):
Assumption 1. Bounded variance by σ²: Eξ[‖F(x) − F(x, ξ)‖²] ≤ σ², ∀x ∈ X.
Assumption 2. Bounded expected squared norm by M²: Eξ[‖F(x, ξ)‖²] ≤ M², ∀x ∈ X.
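As a rough illustration of the projected stochastic step described above, the following sketch assumes a generic sampler for ξ ∼ P, an unbiased estimator F(x, ξ), and, purely for concreteness, takes X to be a Euclidean ball so that PX has a closed form. The names (`project_onto_ball`, `stochastic_gradient_vip`, `sample_xi`) are placeholders, not the paper's notation.

```python
# Sketch of a projected stochastic gradient step for (VIP) with averaged iterates,
# under the assumption that we can only sample xi ~ P and evaluate F(x, xi).
import numpy as np

def project_onto_ball(y, radius):
    # Projection P_X onto X = {x : ||x|| <= radius}; the identity when ||y|| <= radius.
    norm = np.linalg.norm(y)
    return y if norm <= radius else y * (radius / norm)

def stochastic_gradient_vip(x0, stochastic_F, sample_xi, steps, radius, eta=0.01):
    x = np.asarray(x0, dtype=float).copy()
    avg = np.zeros_like(x)
    for t in range(steps):
        xi = sample_xi()                                  # draw a sample / mini-batch
        x = project_onto_ball(x - eta * stochastic_F(x, xi), radius)
        avg += (x - avg) / (t + 1)                        # uniform (running) average of iterates
    return avg                                            # the averaged iterate is what converges
```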
Assump. 1 is standard in stochastic variational analysis, while Assump. 2 is a stronger assumption sometimes made in stochastic convex optimization. To illustrate how strong Assump. 2 is, note that it does not hold for an unconstrained bilinear objective. It is thus mainly reasonable for bounded constraint sets. Note that in practice we have σ ≪ M. In the following, we present three algorithms: the first one is the stochastic extension of the gradient method for solving (VIP); the second one is a stochastic extension of the extragradient method; and the third one is OneSEM, a novel variant of SEM. Throughout this section,
• x* is a solution of (VIP) and x0 is the initialization of our sequence.
• R is set large enough such that R > ‖x0 − x*‖ and F is a monotone operator.
While we assume the monotonicity of F for our theoretical convergence results, we note that the
merit function presented in Sec. 3 is still defined for non-convex objectives and we can derive local
convergence results (see App. B.3).
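For completeness, here is a hedged sketch of how a stochastic extragradient update with averaged iterates might look, assuming fresh independent samples for the extrapolation and update steps. The function names are illustrative placeholders; this is not a transcription of Alg. 2 or of OneSEM.

```python
# Sketch of a stochastic extragradient update for (VIP): a projected lookahead step
# followed by a projected update evaluated at the lookahead point, with iterate averaging.
import numpy as np

def stochastic_extragradient_vip(x0, stochastic_F, sample_xi, project, steps, eta=0.01):
    x = np.asarray(x0, dtype=float).copy()
    avg = np.zeros_like(x)
    for t in range(steps):
        xi = sample_xi()
        x_lookahead = project(x - eta * stochastic_F(x, xi))           # extrapolation step
        xi_prime = sample_xi()                                         # fresh sample for the update
        x = project(x - eta * stochastic_F(x_lookahead, xi_prime))     # update step
        avg += (x - avg) / (t + 1)                                     # averaged iterate
    return avg
```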
Explain in detail what is written here
This text is about the optimization of variational inequalities (VIP) using various methods that are applied to solve optimization problems tied to variational inequalities, such as game optimization and saddle point problems. Let us go through the main points of the text in detail.
Variational inequalities are problems in which one needs to find a point satisfying certain inequalities that express the conditions for solving various optimization problems, for example in training generative models such as GANs (Generative Adversarial Networks).
Gradient Method:
Extragradient Method:
Stochastic operator:
Assumptions on stochastic operators:
Projection methods:
This text describes the methods used to solve variational inequality problems, including stochastic extensions of the classical gradient and extragradient methods. In the context of GANs and other generative models, working with data samples plays an important role, which calls for stochastic algorithms that take constraints and the variability of the data into account.