yellowbrick rocks: good news for machine learning “alchemists” and “hyperparameter tuners”

Posted by pythonic生物人

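yellowbrick is an extension of the machine learning library Scikit-Learn. With a few lines of code it visualizes features, models, model evaluation results and more, which makes it easier for “hyperparameter tuners” to choose a model and tune its hyperparameters. It depends on Matplotlib and Scikit-Learn.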

Contents

Installing yellowbrick

yellowbrick's core “weapon” - Visualizers

Quick start with yellowbrick examples

Commonly used yellowbrick Visualizers

Feature Visualization

Classification Visualization

Regression Visualization

Clustering Visualization

Model Selection Visualization

Target Visualization

Text Visualization

Customizing yellowbrick figures


Installing yellowbrick

# Install from the Tsinghua mirror for faster downloads
pip install yellowbrick -i https://pypi.tuna.tsinghua.edu.cn/simple

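yellowbrick's core “weapon” - Visualizers

A Visualizer can be thought of as a scikit-learn estimator object with visualization capabilities attached. Using one is much like using a scikit-learn model:

import the specific visualizer;

instantiate the visualizer;

fit the visualizer;

display the visualization.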
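The idea of "an estimator with a drawing step attached" can be sketched in plain Python. This is only an illustration of the wrapper pattern, not yellowbrick's implementation: `MajorityClassifier` and `ToyVisualizer` are hypothetical names invented here, and `show()` returns a text bar where a real visualizer would render a Matplotlib figure.

```python
class MajorityClassifier:
    """Hypothetical stand-in for a scikit-learn style estimator."""

    def fit(self, X, y):
        # Remember the most frequent label seen during training.
        labels = list(y)
        self.majority_ = max(set(labels), key=labels.count)
        return self

    def predict(self, X):
        return [self.majority_ for _ in X]

    def score(self, X, y):
        # Accuracy: fraction of predictions that match the true labels.
        predictions = self.predict(X)
        return sum(p == t for p, t in zip(predictions, y)) / len(y)


class ToyVisualizer:
    """Hypothetical wrapper: the estimator interface plus a show() step."""

    def __init__(self, estimator):
        self.estimator = estimator
        self.score_ = None

    def fit(self, X, y):
        self.estimator.fit(X, y)  # delegate training to the wrapped model
        return self

    def score(self, X, y):
        # Keep the result so that show() can draw it later.
        self.score_ = self.estimator.score(X, y)
        return self.score_

    def show(self):
        # A real visualizer would render a figure here; this one builds text.
        bar = "#" * round(self.score_ * 20)
        return f"accuracy {self.score_:.2f} |{bar:<20}|"


viz = ToyVisualizer(MajorityClassifier())            # instantiate
viz.fit([[0], [1], [2], [3]], ["a", "a", "a", "b"])  # fit
viz.score([[4], [5]], ["a", "b"])                    # score on held-out data
print(viz.show())                                    # display
```

The call sequence at the bottom mirrors the instantiate, fit and display steps listed above, with an extra `score` call for visualizers that evaluate a model.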

Quick start with yellowbrick examples

Plotting ROC curves to evaluate how well a model performs

import matplotlib.pyplot as plt
from sklearn.linear_model import RidgeClassifier
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OrdinalEncoder, LabelEncoder

from yellowbrick.classifier import ROCAUC
from yellowbrick.datasets import load_game

plt.figure(dpi=120)

# Load the data
X, y = load_game()

# Encode the categorical features and the target as integers
X = OrdinalEncoder().fit_transform(X)
label_encoder = LabelEncoder()
y = label_encoder.fit_transform(y)

# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Instantiate the classifier and the visualizer.
# LabelEncoder numbers the classes in sorted order, so the legend labels are
# taken from label_encoder.classes_ to stay aligned with the encoded targets.
model = RidgeClassifier()
visualizer = ROCAUC(model, classes=list(label_encoder.classes_))

visualizer.fit(X_train, y_train)  # fit the visualizer on the training set
visualizer.score(X_test, y_test)  # evaluate the model on the test set
visualizer.show()
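To make the number behind a ROC curve concrete, here is a minimal pure-Python sketch of the area under the curve for a binary problem. It uses the rank interpretation (the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one) rather than integrating the curve, and the helper name `roc_auc` and the toy data are assumptions made for illustration. For a multiclass target like the one above, each class would be treated as "this class versus the rest".

```python
def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    positives = [s for s, t in zip(scores, y_true) if t == 1]
    negatives = [s for s, t in zip(scores, y_true) if t == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]  # toy decision scores, higher means "more positive"
auc = roc_auc(y_true, scores)
print(auc)  # 0.75
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect separation of the two classes.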

In feature engineering, visualizing the result of PCA dimensionality reduction

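import matplotlib.pyplot as plt

from yellowbrick.datasets import load_credit
from yellowbrick.features import PCA

plt.figure(dpi=120)

# Load the data
X, y = load_credit()
classes = ['account in default', 'current with bills']

# Standardize the features and project them onto three principal components
visualizer = PCA(scale=True, projection=3, classes=classes)
visualizer.fit_transform(X, y)
visualizer.show()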

For regression models, plotting the residuals between predicted and actual values, with a Q-Q plot to assess the fit

from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

from yellowbrick.datasets import load_concrete
from yellowbrick.regressor import ResidualsPlot

# Load the data
X, y = load_concrete()

# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Instantiate the model and the visualizer (Q-Q plot instead of a histogram)
model = Ridge()
visualizer = ResidualsPlot(model, hist=False, qqplot=True)
visualizer.fit(X_train, y_train)
visualizer.score(X_test, y_test)
visualizer.show()

Residuals Plot on the Concrete dataset with a Q-Q plot
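What a Q-Q plot of residuals compares can be sketched with the standard library alone. The snippet below is an illustration under stated assumptions, not the plotting code yellowbrick uses: residuals are taken as actual minus predicted (tools differ on the sign), the plotting positions (i + 0.5) / n are one common convention, and the helper name `qq_points` and the toy numbers are invented here.

```python
from statistics import NormalDist, mean, pstdev


def qq_points(residuals):
    """Pair each sorted, standardized residual with a standard-normal quantile."""
    n = len(residuals)
    mu, sigma = mean(residuals), pstdev(residuals)
    ordered = sorted((r - mu) / sigma for r in residuals)
    # Assumed plotting positions: (i + 0.5) / n for i = 0 .. n-1.
    theoretical = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, ordered))


y_true = [3.0, 5.0, 7.5, 9.0, 11.0]
y_pred = [2.5, 5.5, 7.0, 9.5, 10.5]
residuals = [t - p for t, p in zip(y_true, y_pred)]  # actual minus predicted

for theoretical_q, sample_q in qq_points(residuals):
    print(f"{theoretical_q:+.3f}  {sample_q:+.3f}")
```

If the residuals are close to normally distributed, the printed pairs fall near the line y = x, which is what the straight reference line in a Q-Q plot represents.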

Visualizing how well a Lasso regression model performs

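import matplotlib.pyplot as plt
from sklearn.linear_model import Lasso

from yellowbrick.datasets import load_bikeshare
from yellowbrick.regressor import prediction_error

plt.figure(dpi=120)

# Load the data
X, y = load_bikeshare()

# A single line fits the model and draws the plot. No separate test set is
# passed here, so the model is evaluated on the same data it was fit on.
visualizer = prediction_error(Lasso(), X, y)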
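A prediction error plot puts actual values against predicted values, and the fit is commonly summarized by the coefficient of determination R². As a rough sketch of that number, assuming the usual definition R² = 1 - SS_res / SS_tot (the helper name `r2_score` and the toy data are chosen here for illustration):

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # unexplained variation
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)             # total variation
    return 1.0 - ss_res / ss_tot


y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 1.5, 3.5, 3.5]
r2 = r2_score(y_true, y_pred)
print(round(r2, 3))  # 0.8
```

A value of 1.0 means every point lies on the identity line of the plot, while a value near 0 means the model does no better than always predicting the mean.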

More examples are in the next section.


Commonly used yellowbrick Visualizers

Feature Visualization

Classification Visualization

Regression Visualization

Clustering Visualization

Model Selection Visualization

Target Visualization

  • Balanced Binning Reference: generate a histogram with vertical lines showing the recommended value point to bin the data into evenly distributed bins

  • Class Balance: see how the distribution of classes affects the model

  • Feature Correlation: display the correlation between features and dependent variables
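The first two ideas in the list above can be illustrated with the standard library. This is a sketch under assumptions, not what yellowbrick computes internally: class balance is shown as per-class shares, and quantile cut points are used as one way to obtain evenly populated bins. The helper names `class_balance` and `balanced_bin_edges` and the toy data are hypothetical.

```python
from collections import Counter
from statistics import quantiles


def class_balance(labels):
    """Share of samples per class, the quantity a class balance bar chart conveys."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: count / total for cls, count in counts.items()}


def balanced_bin_edges(values, bins=4):
    """Interior cut points that split the values into roughly equal-sized bins."""
    return quantiles(values, n=bins, method="inclusive")


labels = ["win", "win", "loss", "draw", "win", "loss", "win", "win"]
balance = class_balance(labels)
print(balance)  # {'win': 0.625, 'loss': 0.25, 'draw': 0.125}

values = [1, 2, 3, 4, 5, 6, 7, 8, 9]
edges = balanced_bin_edges(values, bins=4)
print(edges)  # [3.0, 5.0, 7.0]
```

A strongly skewed `balance` result is a hint that accuracy alone may be a misleading metric for the model, and the cut points show where a continuous target could be split if it needs to be turned into classes.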

Text Visualization


Customizing yellowbrick figures

https://github.com/DistrictDataLabs/yellowbrick/blob/master/docs/index.rst
