Controllable Unsupervised Text Attribute Transfer via Editing Entangled Latent Representation
Posted by Facico
- Since separating style and content may damage the integrity of the text, this work keeps style and content entangled in a single latent representation.
The model is split into three parts:
- 1. Encoder $E_{\theta_e}$: encodes the text $x$ into the latent representation $z$.
- 2. Decoder $D_{\theta_d}$: decodes $z$ back into the text $\hat{x}$.
- 3. Attribute classifier $C_{\theta_c}$: predicts the attribute of $z$.
$$\hat{x}' = D_{\theta_d}(z') \quad \text{where} \quad z' = \mathop{\arg\min}_{z^*} \lVert z^* - E_{\theta_e}(x) \rVert \quad \text{s.t.} \quad C_{\theta_c}(z^*) = y'$$
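As a concrete picture, below is a minimal PyTorch-style sketch of the attribute classifier $C_{\theta_c}$ operating on the latent $z$; the layer choices, the sizes, and the name `AttrClassifier` are my own illustrative assumptions, not the paper's released code (the encoder is detailed in the next section).

```python
# Minimal sketch of C_{theta_c}: a small MLP over the entangled latent z.
# E_{theta_e} and D_{theta_d} are ordinary encoder/decoder modules.
import torch.nn as nn

D_MODEL, N_ATTR = 256, 2            # assumed latent dimension and number of attributes

class AttrClassifier(nn.Module):    # C_{theta_c}: latent z -> attribute logits
    def __init__(self, d_model=D_MODEL, n_attr=N_ATTR):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(d_model, d_model), nn.ReLU(),
                                 nn.Linear(d_model, n_attr))

    def forward(self, z):           # z: (batch, d_model)
        return self.mlp(z)

# Inference idea from the formula above: edit z toward the target attribute y'
# while staying close to E(x), then decode:  x_hat' = D(z')  with  C(z') = y'.
```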
Transformer-based Autoencoder
- Build an autoencoder with a low reconstruction loss (its latent pooling is sketched after the list below):
$$z = E_{\theta_e}(x) = \mathrm{Sum}(\mathrm{Sigmoid}(\mathrm{GRU}(U + H))) \quad \text{where} \quad U = E_{\mathrm{Transformer}}(x)$$
- $U$: the intermediate representation produced by the Transformer encoder
- $H$: the positional embedding
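A minimal sketch of this pooling step, assuming PyTorch; the Transformer encoder producing $U$ is taken as given, and the module name `LatentPooler`, `max_len`, and the dimensions are illustrative assumptions.

```python
# z = Sum(Sigmoid(GRU(U + H))), with U from the Transformer encoder and H the
# positional embedding. Sizes below are assumed, not the paper's configuration.
import torch
import torch.nn as nn

class LatentPooler(nn.Module):
    def __init__(self, d_model=256, max_len=128):
        super().__init__()
        self.pos_emb = nn.Embedding(max_len, d_model)         # H: positional embedding
        self.gru = nn.GRU(d_model, d_model, batch_first=True)

    def forward(self, U):
        # U: (batch, seq_len, d_model), output of the Transformer encoder
        T = U.size(1)
        H = self.pos_emb(torch.arange(T, device=U.device))    # (seq_len, d_model)
        h, _ = self.gru(U + H)                                 # GRU over U + H
        return torch.sigmoid(h).sum(dim=1)                     # z: (batch, d_model)
```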
Label smoothing regularization (LSR) is added, so the final reconstruction loss is (a code sketch follows the symbol list below):
$$\mathcal{L}_{ae}(D_{\theta_d}(E_{\theta_e}(x)), x) = \mathcal{L}_{ae}(D_{\theta_d}(z), x) = -\sum^{|x|}\Big((1-\epsilon)\sum_{i=1}^{v}\overline{p}_i\log(p_i) + \frac{\epsilon}{v}\sum_{i=1}^{v}\log(p_i)\Big)$$
- $v$: the vocabulary size
- $\epsilon$: the smoothing parameter
- $p_i$ and $\overline{p}_i$: the predicted probability distribution and the ground-truth probability distribution, respectively
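A minimal PyTorch-style sketch of this label-smoothed reconstruction loss; the function name, the default $\epsilon$, and the final averaging over the batch are my own assumptions.

```python
import torch.nn.functional as F

def lsr_reconstruction_loss(logits, targets, epsilon=0.1):
    # logits: (batch, seq_len, vocab) decoder outputs; targets: (batch, seq_len) gold token ids
    log_p = F.log_softmax(logits, dim=-1)
    nll = -log_p.gather(-1, targets.unsqueeze(-1)).squeeze(-1)  # -(sum_i pbar_i log p_i), pbar one-hot
    smooth = -log_p.mean(dim=-1)                                # -(1/v) * sum_i log p_i
    per_token = (1 - epsilon) * nll + epsilon * smooth
    return per_token.sum(dim=-1).mean()                         # sum over |x|, averaged over the batch
```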
Attribute Classifier for Latent Representation
$$\mathcal{L}_c(C_{\theta_c}(z), y) = -\sum_{i=1}^{|q|}\overline{q}_i \log q_i$$
- $q$ and $\overline{q}$: the predicted attribute distribution and the ground-truth attribute distribution, respectively
The authors find that optimizing these two losses separately works better than optimizing them jointly.
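One way to realize this separate optimization is sketched below (reusing the encoder, decoder, `AttrClassifier`, and `lsr_reconstruction_loss` from the sketches above, plus an assumed `loader` yielding token ids and attribute labels; the optimizers and learning rate are also assumptions).

```python
import torch
import torch.nn.functional as F

opt_ae  = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-4)
opt_cls = torch.optim.Adam(classifier.parameters(), lr=1e-4)

for x, y in loader:                                  # token ids, attribute labels
    z = encoder(x)

    # 1) autoencoder step: reconstruction loss only
    loss_ae = lsr_reconstruction_loss(decoder(z), x)
    opt_ae.zero_grad(); loss_ae.backward(); opt_ae.step()

    # 2) classifier step: cross-entropy on the latent; z is detached here so the
    #    classifier loss does not update the encoder (one way to keep the two apart)
    loss_c = F.cross_entropy(classifier(z.detach()), y)
    opt_cls.zero_grad(); loss_c.backward(); opt_cls.step()
```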
Fast Gradient Iterative Modification Algorithm
This algorithm is used to edit the latent representation:
$$z^* = z - w_i \nabla_z \mathcal{L}_c(C_{\theta_c}(z), y')$$
- $w_i$: the weight that scales the gradient step at the $i$-th modification
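A minimal sketch of this iterative editing, assuming PyTorch and a single example; the set of candidate weights, the confidence threshold `t`, and the step cap are illustrative assumptions, and only the update rule $z^* = z - w_i\nabla_z\mathcal{L}_c$ comes from the formula above.

```python
import torch
import torch.nn.functional as F

def fgim_edit(z, classifier, y_target, weights=(1.0, 2.0, 4.0), steps=30, t=0.9):
    # z: (1, d_model) latent of one sentence; y_target: desired attribute id
    for w in weights:                              # try increasingly large step weights
        z_star = z.clone().detach().requires_grad_(True)
        for _ in range(steps):
            logits = classifier(z_star)
            if F.softmax(logits, dim=-1)[0, y_target] > t:
                return z_star.detach()             # classifier confident enough: stop early
            loss = F.cross_entropy(logits, torch.tensor([y_target], device=z.device))
            grad, = torch.autograd.grad(loss, z_star)
            z_star = (z_star - w * grad).detach().requires_grad_(True)
    return z_star.detach()

# Usage: z = encoder(x); x_hat_prime = decoder(fgim_edit(z, classifier, y_target=1))
```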