When to Use L1 and L2 Regularization

Posted by simple_wxl

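Editor's note: this article was compiled by the editors of 小常识网 (cha138.com). It covers when to use L1 and L2 regularization, and we hope it serves as a useful reference.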

How to decide which regularization (L1 or L2) to use?

Is there collinearity among some features? L2 regularization can improve prediction quality in this case, as implied by its alternative name, "ridge regression." That said, either form of regularization will generally improve out-of-sample prediction, whether or not there is multicollinearity and whether or not there are irrelevant features, simply because of the shrinkage properties of the regularized estimators. L1 regularization can't help with multicollinearity; it will just pick the feature with the largest correlation to the outcome. Ridge regression can obtain coefficient estimates even when you have more features than examples... but the probability that any will be estimated precisely at 0 is 0.
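To make the contrast concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the synthetic data set (two highly correlated features plus one nearly irrelevant one), the helper names `standardize`, `soft_threshold` and `fit_penalized`, the penalty strengths of 0.1, and the choice of coordinate descent as the solver. It is not taken from the original answer or from any particular library.

```python
import math

def standardize(col):
    """Center a column and scale it to unit variance."""
    n = len(col)
    mean = sum(col) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in col) / n)
    return [(v - mean) / std for v in col]

def soft_threshold(rho, lam):
    """Soft-thresholding operator; this is where exact zeros come from."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0

def fit_penalized(cols, y, l1=0.0, l2=0.0, sweeps=1000, tol=1e-12):
    """Coordinate descent for (hypothetical helper, illustrative only)
        (1/2n)*||y - Xw||^2 + l1*||w||_1 + (l2/2)*||w||_2^2
    where X is passed as a list of feature columns."""
    n = len(y)
    w = [0.0] * len(cols)
    z = [sum(v * v for v in col) / n for col in cols]  # per-feature mean square
    resid = list(y)  # residual y - Xw; equals y while w is all zeros
    for _ in range(sweeps):
        max_change = 0.0
        for j, col in enumerate(cols):
            # Covariance of feature j with the residual, with feature j's
            # own current contribution added back in.
            rho = sum(v * r for v, r in zip(col, resid)) / n + z[j] * w[j]
            new_w = soft_threshold(rho, l1) / (z[j] + l2)
            delta = new_w - w[j]
            if delta != 0.0:
                resid = [r - delta * v for r, v in zip(resid, col)]
                w[j] = new_w
            max_change = max(max_change, abs(delta))
        if max_change < tol:
            break
    return w

# Synthetic data (assumed for illustration): x2 is a perturbed copy of x1,
# x3 is only very weakly related to the outcome.
n = 40
t = [2 * math.pi * i / n for i in range(n)]
base = [math.sin(v) for v in t]
x1 = standardize(base)
x2 = standardize([b + 0.2 * math.sin(2 * v) for b, v in zip(base, t)])
x3 = standardize([math.cos(3 * v) for v in t])
y = [2 * b + 0.3 * math.cos(5 * v) + 0.03 * math.cos(3 * v)
     for b, v in zip(base, t)]
y_mean = sum(y) / n
y = [v - y_mean for v in y]

ridge = fit_penalized([x1, x2, x3], y, l2=0.1)  # L2 penalty only
lasso = fit_penalized([x1, x2, x3], y, l1=0.1)  # L1 penalty only

print("ridge:", [round(c, 4) for c in ridge])
print("lasso:", [round(c, 4) for c in lasso])
```

On this data the ridge fit spreads the weight across both correlated features and leaves every coefficient nonzero, while the lasso fit keeps only x1, the feature more strongly correlated with the outcome, and sets the other two coefficients to exactly 0. The features are standardized first so that the penalty treats them on a common scale.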

What are the pros & cons of each of L1 / L2 regularization?

L1 regularization can't help with multicollinearity. L2 regularization can't help with feature selection. Elastic net regression can solve both problems. L1 and L2 regularization are taught for pedagogical reasons, but I'm not aware of any situation where you would want to use regularized regression but not try an elastic net as a more general solution, since it includes both as special cases.
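The "special cases" remark can be read directly off the objective. The form below is one common way to write the elastic net; the symbols (n examples, design matrix X, response y, coefficients β, penalty weights λ1 and λ2) are notation assumed here, and libraries parametrize the two penalties differently.

```latex
\hat{\beta} = \arg\min_{\beta}\;
  \frac{1}{2n}\,\lVert y - X\beta \rVert_2^2
  + \lambda_1 \lVert \beta \rVert_1
  + \frac{\lambda_2}{2}\,\lVert \beta \rVert_2^2,
  \qquad \lambda_1 \ge 0,\ \lambda_2 \ge 0
```

Setting λ2 = 0 leaves only the L1 penalty (lasso), and setting λ1 = 0 leaves only the L2 penalty (ridge). With both positive, the L1 term can still drive coefficients to exactly 0 while the L2 term stabilizes the estimates for correlated features.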
