Given the positive examples: $\mathbf{x}_1 = (1,2)^T, \quad \mathbf{x}_2 ...

Question

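Given the positive examples: $\mathbf{x}_1 = (1,2)^T, \quad \mathbf{x}_2 = (2,3)^T, \quad \mathbf{x}_3 = (3,3)^T$

and the negative examples: $\mathbf{x}_4 = (2,1)^T, \quad \mathbf{x}_5 = (3,2)^T$

Find the maximum-margin separating hyperplane and identify all support vectors.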

Thought

Finding hyperplane for SVM

I’m given a set of points where positive examples are (1,2), (2,3), (3,3), and negatives are (2,1), (3,2). I need to find the maximum margin hyperplane that separates these points using the formula w^T x + b = 0. Positive labels are y=+1, negatives are y=-1. The goal is to solve the hard margin SVM problem by maximizing the margin while minimizing ||w||^2. I'll check for which points are support vectors as I begin exploring the separating hyperplane. Let's see what works!

Finding separating hyperplane

I’m trying to sketch a separating hyperplane for the points: positives (1,2), (2,3), (3,3) and negatives (2,1), (3,2). A possible candidate line is of the form x2 - x1 - 1 = 0, but it doesn't work since it places (3,3) on the wrong side.

I’ll try another line, x2 - x1 - 0.5 = 0, but that doesn't work either. Given the coordinates, the line should be roughly diagonal.

The support vectors seem to be (1,2), (3,3), (2,1), (3,2). I’ll need to refine my margin conditions using them.

Finding optimal hyperplane

I’m trying to figure out the maximum margin hyperplane with w = (w1, w2)^T and b. The constraints are that for positive points, w1x1 + w2x2 + b >= 1, and for negatives, w1x1 + w2x2 + b <= -1.

I hypothesize that the support vectors are closest to the decision boundary, and I’m testing various combinations of points to find the optimal separating hyperplane, adjusting w and b until the margin is satisfied. I’ll need to carefully check the scaling factor and solve the system to finalize the solution.

Refining support vector analysis

I’m testing potential support vectors and refining my hyperplane calculation. Initially, I guessed (2,3), (3,3), and (3,2) might be support vectors. After calculations, I adjusted w and b values but encountered issues with the margin.

After more exploration using different candidates, I found that the expected optimal hyperplane could have a normal vector proportional to (-1, 1), namely the line x2 - x1 + 0.5 = 0. After scaling to canonical form, I arrived at w = (-2, 2) and b = 1, which now properly separates the points and satisfies every margin constraint. So, this seems to be a strong candidate!

Reassessing support vectors and margin

I'm continuing to explore the correct support vectors. After testing combinations like (2,3), (3,3), and others, I noticed some misclassifications, like (3,3) not being on the correct side. I completed some margin calculations but realized certain vectors don't satisfy the margin condition required for SVM. My latest conclusion is that (3,3) might be the problem, needing to become a support vector. I need to explore this further and check which other negative support vector best matches. It’s quite a puzzle, but I think I’m getting closer!

Recalculating margin optimization

While working through candidate support vectors like (3,3) and (2,1), I adjusted w and b to ensure constraints were met. I found that (3,3), (2,1), and (3,2) aligned correctly while others didn't satisfy the margin. After recalculating, I concluded that the margin for w = (-2, 2) with b = 1 was 0.3536, but upon considering alternate pairings, I achieved a larger margin of 0.447 and satisfied the constraints for all support vectors. This candidate seems to be a good solution.

Evaluating optimal margin

I’ve been experimenting with different pairings of support vectors like (2,3), (3,2), and (1,2) to maximize the margin. After testing various combinations, I've found that the candidate w = (-1,2), b = -2 produces a margin of approximately 0.447 and satisfies all constraints for positive and negative points. While other pairings were less effective, this one seems to work well as it satisfies the KKT conditions and provides a reasonable margin with the active constraints.

Reasoned for 1m 53s

Answer

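We want a hyperplane that separates the positive and negative examples,

$w^T x + b = 0,$

such that every positive example (label $+1$) satisfies

$w^T x_i + b \ge 1,$

every negative example (label $-1$) satisfies

$w^T x_i + b \le -1,$

and the margin (the distance from the hyperplane to either of the two parallel planes $w^T x+b= \pm1$) is as large as possible, that is,

$\gamma=\frac{1}{\|w\|}$

is maximized, which is equivalent to minimizing $\|w\|^2$.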

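The given data are

Positive examples: $x_1=(1,2)^T,\quad x_2=(2,3)^T,\quad x_3=(3,3)^T;$
Negative examples: $x_4=(2,1)^T,\quad x_5=(3,2)^T.$

One way to solve the problem is outlined below (with so few points, the optimal solution can be found geometrically by guessing and adjusting).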

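Geometric observation

Plotting the points in the plane shows that the positive examples lie toward the upper left ( $(1,2)$ ) and the upper right ( $(2,3), (3,3)$ ), while the negative examples lie toward the lower middle and lower right ( $(2,1), (3,2)$ ).
Projecting the points onto candidate directions helps identify which points lie closest to the separating boundary. After some trials, one finds that if the hyperplane is chosen so that
$w^T x + b = \begin{cases} 1, & \text{for} \; x_1=(1,2)\ \text{and}\ x_3=(3,3),\\[1mm] -1, & \text{for} \; x_5=(3,2), \end{cases}$
then all remaining points satisfy the inequalities (note that $x_2=(2,3)$ and $x_4=(2,1)$ lie farther from the boundary).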
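The guess-and-adjust search described above can be sketched as a small brute-force enumeration. This is an illustrative sketch, not the method used in the answer: it assumes that in the plane the optimum is pinned down by two or three active constraints, builds the candidate $(w, b)$ for each such subset, discards candidates that violate any constraint, and keeps the one with the smallest $\|w\|^2$. The helper names (`from_pair`, `from_triple`, `is_feasible`) are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

# (point, label) pairs from the problem statement.
DATA = [((1, 2), 1), ((2, 3), 1), ((3, 3), 1), ((2, 1), -1), ((3, 2), -1)]

def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def from_pair(p, q):
    """Minimum-norm (w, b) with f(p) = +1 and f(q) = -1 (w is parallel to p - q)."""
    d = (p[0] - q[0], p[1] - q[1])
    scale = Fraction(2, d[0] ** 2 + d[1] ** 2)
    w = (scale * d[0], scale * d[1])
    b = 1 - (w[0] * p[0] + w[1] * p[1])
    return w, b

def from_triple(triple):
    """Solve w.x + b = y at three points with Cramer's rule; None if collinear."""
    rows = [[x[0], x[1], 1] for x, _ in triple]
    labels = [y for _, y in triple]
    d = det3(rows)
    if d == 0:
        return None
    sol = []
    for k in range(3):
        # Replace column k with the label vector.
        mk = [row[:k] + [lab] + row[k + 1:] for row, lab in zip(rows, labels)]
        sol.append(Fraction(det3(mk), d))
    return (sol[0], sol[1]), sol[2]

def is_feasible(w, b):
    """Check the hard-margin constraints y * (w.x + b) >= 1 on every point."""
    return all(y * (w[0] * x[0] + w[1] * x[1] + b) >= 1 for x, y in DATA)

positives = [x for x, y in DATA if y == 1]
negatives = [x for x, y in DATA if y == -1]

candidates = [from_pair(p, q) for p in positives for q in negatives]
for triple in combinations(DATA, 3):
    if len({y for _, y in triple}) == 2:  # both classes must be present
        solution = from_triple(triple)
        if solution is not None:
            candidates.append(solution)

feasible = [(w, b) for w, b in candidates if is_feasible(w, b)]
best_w, best_b = min(feasible, key=lambda c: c[0][0] ** 2 + c[0][1] ** 2)
print(f"w = ({best_w[0]}, {best_w[1]}), b = {best_b}")
```

Exact rational arithmetic (`Fraction`) is used so that the feasibility test `>= 1` is not affected by floating-point rounding on points that sit exactly on the margin.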

Solving for the unknowns

Let

$w=(w_1,\,w_2)^T,\quad b\in\mathbb{R}.$

We require the constraints to hold with equality at the following support vectors:

$\begin{cases} w^T(1,2)^T+b = w_1+2w_2+b = 1,\\[1mm] w^T(3,3)^T+b = 3w_1+3w_2+b = 1,\\[1mm] w^T(3,2)^T+b = 3w_1+2w_2+b = -1. \end{cases}$

Subtracting the first equation from the second gives

$(3w_1+3w_2) - (w_1+2w_2)=2w_1+w_2 = 0,\quad\text{that is,}\quad w_2=-2w_1.$

Substituting $w_2=-2w_1$ into the first equation:

$w_1+2(-2w_1)+b = w_1-4w_1+b = -3w_1+b = 1,$

so

$b=1+3w_1.$

Substituting both into the third equation:

$3w_1+2(-2w_1)+ (1+3w_1)= 3w_1-4w_1+1+3w_1 = 2w_1+1 = -1.$

Solving gives

$2w_1 = -2,\quad w_1 = -1.$

Therefore

$w_2=-2(-1)=2,\quad b=1+3(-1)= -2.$
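The elimination above can be cross-checked numerically. The following is a minimal sketch that solves the same three equations in exact arithmetic with a small Gauss-Jordan routine; the function name `solve_linear_system` is an illustrative choice, not part of any library.

```python
from fractions import Fraction

def solve_linear_system(a, rhs):
    """Solve a square system a @ x = rhs by Gauss-Jordan elimination (exact)."""
    n = len(a)
    # Augmented matrix [a | rhs] with rational entries.
    m = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(a, rhs)]
    for col in range(n):
        # Pick a row at or below the diagonal with a nonzero pivot.
        # (Raises StopIteration if the matrix is singular.)
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        # Normalize the pivot row, then clear this column in every other row.
        pivot_value = m[col][col]
        m[col] = [v / pivot_value for v in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [row[n] for row in m]

# Each row is (x1, x2, 1) for an assumed support vector; unknowns are (w1, w2, b).
A = [[1, 2, 1],   # x_1 = (1, 2), positive
     [3, 3, 1],   # x_3 = (3, 3), positive
     [3, 2, 1]]   # x_5 = (3, 2), negative
labels = [1, 1, -1]

w1, w2, b = solve_linear_system(A, labels)
print(f"w1 = {w1}, w2 = {w2}, b = {b}")
```

The system has a unique solution because the three assumed support vectors are not collinear, so the coefficient matrix is invertible.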

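The resulting optimal hyperplane

We obtain

$w=(-1,\,2)^T,\quad b=-2.$

The decision function is therefore

$f(x)= -x_1+2x_2-2.$

The hyperplane equation is

$-x_1+2x_2-2=0\quad \text{or equivalently} \quad 2x_2-x_1-2=0.$

Since $\|w\|=\sqrt{(-1)^2+2^2}=\sqrt{1+4}=\sqrt{5}$ , the margin is

$\gamma=\frac{1}{\sqrt{5}}.$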

Checking each point

We verify that every point satisfies its constraint (positive examples need $f(x)\ge1$ , negative examples need $f(x)\le -1$ ):

$x_1=(1,2)$ (positive): $f(x_1)= -1+4-2=1 \quad (\text{exactly } 1,\ \text{so a support vector})$
$x_2=(2,3)$ (positive): $f(x_2)= -2+6-2=2 \quad (\ge1)$
$x_3=(3,3)$ (positive): $f(x_3)= -3+6-2=1 \quad (\text{exactly } 1,\ \text{support vector})$
$x_4=(2,1)$ (negative): $f(x_4)= -2+2-2=-2 \quad (\le -1)$
$x_5=(3,2)$ (negative): $f(x_5)= -3+4-2=-1 \quad (\text{exactly } -1,\ \text{support vector})$
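The same check can be written as a few lines of code. This is a minimal sketch that evaluates the functional margin $y_i f(x_i)$ for each point and flags those where it equals exactly 1; the dictionary keys `x1` to `x5` are just labels chosen to match the notation above.

```python
# Candidate solution from the derivation.
w = (-1, 2)
b = -2

# name -> (point, label), as given in the problem statement.
points = {
    "x1": ((1, 2), 1),
    "x2": ((2, 3), 1),
    "x3": ((3, 3), 1),
    "x4": ((2, 1), -1),
    "x5": ((3, 2), -1),
}

def decision(x):
    """Decision function f(x) = w . x + b."""
    return w[0] * x[0] + w[1] * x[1] + b

# Functional margin y * f(x); feasibility requires every value to be >= 1.
margins = {name: y * decision(x) for name, (x, y) in points.items()}

# Support vectors are the points whose constraint holds with equality.
support_vectors = [name for name, m in margins.items() if m == 1]

for name, m in margins.items():
    print(name, "y*f(x) =", m)
print("support vectors:", support_vectors)
```

Integer arithmetic keeps the equality test `m == 1` exact here; with floating-point data a small tolerance would be needed instead.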

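Conclusion

The maximum-margin separating hyperplane is
$-x_1+2x_2-2=0\quad \text{or}\quad 2x_2-x_1-2=0,$
and its margin is $1/\sqrt{5}$ .
The support vectors are the points at which the constraints hold with equality, namely
$x_1=(1,2)^T,\quad x_3=(3,3)^T,\quad x_5=(3,2)^T.$

This method and its result are consistent with what the problem asks for.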
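Because the support vectors were found by guessing, feasibility alone does not prove optimality. One way to close that gap, sketched here as an additional check that is not part of the original answer, is to verify the KKT conditions: there must exist multipliers $\alpha_i \ge 0$, nonzero only on the support vectors, with $w=\sum_i \alpha_i y_i x_i$ and $\sum_i \alpha_i y_i = 0$. The symbols $\alpha_1, \alpha_3, \alpha_5$ are introduced here for this purpose.

```latex
\begin{aligned}
&\alpha_1 (1,2)^T + \alpha_3 (3,3)^T - \alpha_5 (3,2)^T = (-1,2)^T,
\qquad \alpha_1 + \alpha_3 - \alpha_5 = 0, \\[1mm]
&\text{first component: } \alpha_1 + 3\alpha_3 - 3(\alpha_1+\alpha_3) = -2\alpha_1 = -1
\;\Rightarrow\; \alpha_1 = \tfrac{1}{2}, \\[1mm]
&\text{second component: } 2\alpha_1 + 3\alpha_3 - 2(\alpha_1+\alpha_3) = \alpha_3 = 2, \\[1mm]
&\alpha_5 = \alpha_1 + \alpha_3 = \tfrac{5}{2}.
\end{aligned}
```

All three multipliers are nonnegative, so the KKT conditions hold and the hyperplane found above is the optimum of the convex problem. As a consistency check, $\alpha_1+\alpha_3+\alpha_5 = 5 = \|w\|^2$, which is the expected relation at the optimum of a hard-margin SVM.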