Generative vs. Discriminative Models: A Comparison

Posted 2021-01-22 yaoyaohust


Reference: On Discriminative vs. Generative classifiers: A comparison of logistic regression and naive Bayes

Generative models: model the joint distribution p(x,y) = p(x|y) * p(y), then predict with Bayes' rule: p(y|x) = p(x,y) / p(x). Representative model: Naive Bayes

Discriminative models: model p(y|x) directly. Representative model: Logistic Regression
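To make the generative route concrete, here is a minimal sketch of prediction through Bayes' rule. The probability tables are made-up numbers for one binary feature and a binary label, and the names `prior`, `likelihood`, `joint` and `posterior` are illustrative only.

```python
# Toy generative model: one binary feature x, one binary label y.
# All probabilities below are invented for illustration.
prior = {0: 0.6, 1: 0.4}            # p(y)
likelihood = {
    0: {0: 0.9, 1: 0.1},            # p(x | y=0)
    1: {0: 0.3, 1: 0.7},            # p(x | y=1)
}


def joint(x, y):
    """p(x, y) = p(x | y) * p(y)."""
    return likelihood[y][x] * prior[y]


def posterior(x):
    """p(y | x) = p(x, y) / p(x), where p(x) = sum over y of p(x, y)."""
    evidence = sum(joint(x, y) for y in prior)
    return {y: joint(x, y) / evidence for y in prior}


post = posterior(1)
predicted = max(post, key=post.get)  # label with the largest posterior
```

A discriminative model would skip `prior` and `likelihood` entirely and parameterize `posterior` directly.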

Conclusions from the reference:

Discriminative models have a lower theoretical asymptotic error [the generative model does indeed have a higher asymptotic error - as the number of training examples becomes large - than the discriminative model].

Generative models, in theory, approach their asymptotic error faster, provided the samples satisfy conditional independence and follow the assumed distribution (e.g. Gaussian) [but the generative model may also approach its asymptotic error much faster than the discriminative model - possibly with a number of training examples that is only logarithmic, rather than linear, in the number of parameters].
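The two quoted claims can be summarized symbolically. The notation here is an assumption added for this note: m is the number of training examples needed to get close to the asymptotic error, n is the number of parameters, and the subscripts Gen and Dis mark the generative and discriminative classifiers.

```latex
% Asymptotic error (infinite data): discriminative is no worse than generative.
\varepsilon(h_{\mathrm{Dis},\infty}) \le \varepsilon(h_{\mathrm{Gen},\infty})

% Examples needed to approach each model's own asymptotic error:
m_{\mathrm{Gen}} = O(\log n), \qquad m_{\mathrm{Dis}} = O(n)
```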

In practice, samples rarely satisfy these conditions strictly, so discriminative models often perform better.

Views from other sources:

- Easy to fit?

G: easy, simple counting and averaging (NB, LDA)

D: much slower, solving a convex optimization problem (LogR)

- Fit classes separately?

G: no need to retrain when adding more classes

D: must be retrained (all parameters interact)

- Handle missing features easily?

G: simple, marginalize them out (NB)

D: no principled solution, model assumes that x is given

- Can handle feature preprocessing?

G: hard to define model on preprocessed data

D: allows preprocessing the input, e.g. replacing x with kernel(x)

- Can handle unlabeled training data (like semi-supervised learning)?

G: easy

D: much harder
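Three of the generative-side points above (fitting by counting, fitting classes separately, and marginalizing out missing features) can be seen in one small sketch. This is a minimal Bernoulli Naive Bayes in plain Python, written for this note under simplifying assumptions: binary features, Laplace smoothing with a hypothetical `alpha` parameter, and toy data with invented class names. `None` marks a missing feature.

```python
import math


def fit_class(rows, alpha=1.0):
    """Fit one class by counting; other classes are never consulted.

    rows: binary feature vectors that all belong to the same class.
    Returns the class count and smoothed estimates of p(x_j = 1 | y).
    """
    n = len(rows)
    d = len(rows[0])
    theta = [(sum(r[j] for r in rows) + alpha) / (n + 2 * alpha)
             for j in range(d)]
    return {"count": n, "theta": theta}


def log_joint(model, x, total):
    """log p(x_observed, y); features set to None are marginalized out."""
    lp = math.log(model["count"] / total)      # log p(y) from stored counts
    for xj, t in zip(x, model["theta"]):
        if xj is None:
            continue  # summing p(x_j | y) over both values of x_j gives 1
        lp += math.log(t if xj == 1 else 1.0 - t)
    return lp


def predict(models, x):
    total = sum(m["count"] for m in models.values())
    return max(models, key=lambda y: log_joint(models[y], x, total))


models = {
    "spam": fit_class([[1, 1, 0], [1, 0, 0], [1, 1, 0]]),
    "ham": fit_class([[0, 0, 1], [0, 1, 1], [0, 0, 1]]),
}
# Adding a class needs only that class's data; existing thetas are untouched.
models["promo"] = fit_class([[1, 0, 1], [1, 0, 1], [0, 0, 1]])

full = predict(models, [1, 1, 0])        # all features observed
partial = predict(models, [1, None, 1])  # second feature missing
```

Class priors are recomputed from the stored counts at prediction time, which is why no existing parameter has to be refit when `promo` is added. A logistic regression has no equivalent shortcut: its weights are coupled through the shared normalization, and it has no model of x to marginalize over.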
