IMPROVING ADVERSARIAL ROBUSTNESS REQUIRES REVISITING MISCLASSIFIED EXAMPLES


Wang Y, Zou D, Yi J, et al. Improving Adversarial Robustness Requires Revisiting Misclassified Examples[C]. International Conference on Learning Representations, 2020.

@inproceedings{wang2020improving,
  title={Improving Adversarial Robustness Requires Revisiting Misclassified Examples},
  author={Wang, Yisen and Zou, Difan and Yi, Jinfeng and Bailey, James and Ma, Xingjun and Gu, Quanquan},
  booktitle={International Conference on Learning Representations},
  year={2020}
}

The authors argue that misclassified examples matter greatly for improving network robustness, and propose a new loss function motivated by this observation.

Main Content

Notation

\(h_{\theta}\): the neural network with parameters \(\theta\);
\((x, y) \in \mathbb{R}^d \times \{1, \ldots, K\}\): a sample and its label;

\[\tag{2} h_{\boldsymbol{\theta}}\left(\mathbf{x}_{i}\right)=\underset{k=1, \ldots, K}{\arg \max }\, \mathbf{p}_{k}\left(\mathbf{x}_{i}, \boldsymbol{\theta}\right), \quad \mathbf{p}_{k}\left(\mathbf{x}_{i}, \boldsymbol{\theta}\right)=\exp \left(\mathbf{z}_{k}\left(\mathbf{x}_{i}, \boldsymbol{\theta}\right)\right) / \sum_{k^{\prime}=1}^{K} \exp \left(\mathbf{z}_{k^{\prime}}\left(\mathbf{x}_{i}, \boldsymbol{\theta}\right)\right) \]

The correctly classified and misclassified examples are defined as

\[\mathcal{S}_{h_{\theta}}^+ = \{i : i \in [n],\ h_{\theta}(x_i) = y_i\} \quad \text{and} \quad \mathcal{S}_{h_{\theta}}^- = \{i : i \in [n],\ h_{\theta}(x_i) \not= y_i\}. \]
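To make the notation concrete, here is a minimal PyTorch sketch (not from the paper's official code) of the prediction rule in Eq. (2) together with the \(\mathcal{S}^+ / \mathcal{S}^-\) split; `logits` holds the \(z_k(x_i, \theta)\) values and `y` the labels.

```python
import torch
import torch.nn.functional as F

def split_by_correctness(logits, y):
    """Eq. (2) plus the S+/S- index split. logits: (n, K), y: (n,)."""
    p = F.softmax(logits, dim=1)                       # p_k(x_i, theta)
    preds = p.argmax(dim=1)                            # h_theta(x_i)
    s_plus = (preds == y).nonzero(as_tuple=True)[0]    # correctly classified indices
    s_minus = (preds != y).nonzero(as_tuple=True)[0]   # misclassified indices
    return p, preds, s_plus, s_minus
```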

MART

The robust classification error over all samples is

\[\tag{3} \mathcal{R}(h_{\theta}) = \frac{1}{n} \sum_{i=1}^n \max_{x_i' \in \mathcal{B}_{\epsilon}(x_i)} \mathbb{1}(h_{\theta}(x_i') \not= y_i), \]

and the robust classification error on misclassified examples is defined as

\[\tag{4} \mathcal{R}^-(h_{\theta}, x_i) := \mathbb{1}(h_{\theta}(\hat{x}_i') \not= y_i) + \mathbb{1}(h_{\theta}(x_i) \not= h_{\theta}(\hat{x}_i')) \]

where

\[\tag{5} \hat{x}_i' = \arg\max_{x_i' \in \mathcal{B}_{\epsilon}(x_i)} \mathbb{1}(h_{\theta}(x_i') \not= y_i). \]

as well as the robust classification error on correctly classified examples:

\[\tag{6} \mathcal{R}^+(h_{\theta}, x_i) := \mathbb{1}(h_{\theta}(\hat{x}_i') \not= y_i). \]
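Note that the inner maximization in Eq. (5) is over a 0-1 indicator and cannot be solved by gradient ascent directly; in practice \(\hat{x}_i'\) is approximated with PGD on a differentiable surrogate such as the cross-entropy. The sketch below makes that assumption; the function name and hyperparameters (an \(\ell_\infty\) ball with illustrative step sizes) are mine, not the paper's.

```python
def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """Approximate Eq. (5): search B_eps(x) for a point the model misclassifies."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)        # surrogate for the 0-1 indicator
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()   # ascent step
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)  # project back
    return x_adv.detach()
```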

Finally, the quantity to minimize is the combination of the two errors:

\[\tag{7} \begin{aligned} \min_{\boldsymbol{\theta}} \mathcal{R}_{\text{misc}}\left(h_{\boldsymbol{\theta}}\right) :&= \frac{1}{n}\left(\sum_{i \in \mathcal{S}_{h_{\boldsymbol{\theta}}}^{+}} \mathcal{R}^{+}\left(h_{\boldsymbol{\theta}}, \mathbf{x}_{i}\right)+\sum_{i \in \mathcal{S}_{h_{\boldsymbol{\theta}}}^{-}} \mathcal{R}^{-}\left(h_{\boldsymbol{\theta}}, \mathbf{x}_{i}\right)\right) \\ &= \frac{1}{n} \sum_{i=1}^{n}\left\{\mathbb{1}\left(h_{\boldsymbol{\theta}}\left(\hat{\mathbf{x}}_{i}^{\prime}\right) \neq y_{i}\right)+\mathbb{1}\left(h_{\boldsymbol{\theta}}\left(\mathbf{x}_{i}\right) \neq h_{\boldsymbol{\theta}}\left(\hat{\mathbf{x}}_{i}^{\prime}\right)\right) \cdot \mathbb{1}\left(h_{\boldsymbol{\theta}}\left(\mathbf{x}_{i}\right) \neq y_{i}\right)\right\}. \end{aligned} \]

Since indicator functions block gradient flow, the loss above has to be "softened" with surrogate functions. For \(\mathbb{1}(h_{\theta}(\hat{x}_i') \not= y_i)\), a BCE loss is used as the surrogate:

\[\tag{8} \mathrm{BCE}(p(\hat{x}_i', \theta), y_i) = -\log(p_{y_i}(\hat{x}_i', \theta)) - \log(1 - \max_{k \not= y_i} p_k(\hat{x}_i', \theta)), \]

The first term is the ordinary cross-entropy loss; the second term pushes for a larger margin around the decision boundary.
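In code, Eq. (8) might look as follows (a sketch; `p_adv` is the softmax output on \(\hat{x}_i'\), and the `eps` inside the logs is a numerical-stability guard of my own, not part of the paper):

```python
def bce_margin(p_adv, y, eps=1e-12):
    """Eq. (8): cross-entropy plus a margin term on the largest wrong-class probability."""
    p_true = p_adv.gather(1, y.unsqueeze(1)).squeeze(1)   # p_{y_i}(x'_i, theta)
    p_rest = p_adv.clone()
    p_rest.scatter_(1, y.unsqueeze(1), 0.0)               # zero out the true class
    p_max_wrong = p_rest.max(dim=1).values                # max_{k != y_i} p_k
    return -torch.log(p_true + eps) - torch.log(1 - p_max_wrong + eps)
```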

For the second term \(\mathbb{1}(h_{\theta}(x_i) \not= h_{\theta}(\hat{x}_i'))\), the KL divergence serves as the surrogate:

\[\tag{9} \mathrm{KL}(p(x_i, \theta) \,\|\, p(\hat{x}_i', \theta)) = \sum_{k=1}^K p_k(x_i, \theta) \log \frac{p_k(x_i, \theta)}{p_k(\hat{x}_i', \theta)}. \]
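Per sample, Eq. (9) can be written directly (same stability guard as above):

```python
def kl_term(p_nat, p_adv, eps=1e-12):
    """Eq. (9): KL(p(x_i, theta) || p(x'_i, theta)) for each sample."""
    return (p_nat * (torch.log(p_nat + eps) - torch.log(p_adv + eps))).sum(dim=1)
```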

The last term \(\mathbb{1}(h_{\theta}(x_i) \not= y_i)\) is replaced by \(1 - p_{y_i}(x_i, \theta)\).

The final loss function is then

\[\tag{11} \mathcal{L}^{\mathrm{MART}}(\theta) = \frac{1}{n} \sum_{i=1}^n \ell(x_i, y_i, \theta), \]

where

\[\ell(x_i, y_i, \theta) := \mathrm{BCE}(p(\hat{x}_i', \theta), y_i) + \lambda \cdot \mathrm{KL}(p(x_i, \theta) \,\|\, p(\hat{x}_i', \theta)) \cdot (1 - p_{y_i}(x_i, \theta)). \]
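Putting the pieces together, here is a sketch of the full MART objective, reusing the hypothetical helpers above (`pgd_attack`, `bce_margin`, `kl_term`); \(\lambda\) is a tunable trade-off weight, and the default value below is only a placeholder.

```python
def mart_loss(model, x, y, lam=5.0):
    """Eq. (11): BCE on adversarial examples plus the misclassification-weighted KL."""
    x_adv = pgd_attack(model, x, y)                      # approximates Eq. (5)
    p_nat = F.softmax(model(x), dim=1)
    p_adv = F.softmax(model(x_adv), dim=1)
    p_true = p_nat.gather(1, y.unsqueeze(1)).squeeze(1)  # p_{y_i}(x_i, theta)
    per_sample = bce_margin(p_adv, y) + lam * kl_term(p_nat, p_adv) * (1 - p_true)
    return per_sample.mean()
```

The \(1 - p_{y_i}(x_i, \theta)\) factor makes the KL regularization strongest on examples the network already misclassifies, which is exactly the emphasis on misclassified examples that motivates the paper.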




