[Title]: Error while doing Gaussian Naive Bayes in Jupyter Notebook [Posted]: 2021-11-14 05:23:01 [Question]:

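I am currently taking Udacity's free course "Intro to Machine Learning", which includes a quiz on Gaussian Naive Bayes. When run in the Udacity environment, the code produces the expected output (see the screenshot "Code output in udacity environment").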

But when I run it in a Jupyter notebook it fails: for the module class_vis.py it reports the error 'NoneType' object has no attribute 'predict' (see the screenshot "Error in jupyter notebook").


""" Naive Bayes classifier to classify the terrain data.
    The objective of this exercise is to recreate the decision
    boundary found in the lesson video, and make a plot that
    visually shows the decision boundary """

from prep_terrain_data import makeTerrainData
from class_vis import prettyPicture, output_image
from ClassifyNB import classify

import numpy as np
import pylab as pl

features_train, labels_train, features_test, labels_test = makeTerrainData()

### the training data (features_train, labels_train) have both "fast" and "slow" points mixed
### in together--separate them so we can give them different colors in the scatterplot,
### and visually identify them
grade_fast = [features_train[ii][0] for ii in range(0, len(features_train)) if labels_train[ii]==0]
bumpy_fast = [features_train[ii][1] for ii in range(0, len(features_train)) if labels_train[ii]==0]
grade_slow = [features_train[ii][0] for ii in range(0, len(features_train)) if labels_train[ii]==1]
bumpy_slow = [features_train[ii][1] for ii in range(0, len(features_train)) if labels_train[ii]==1]

# You will need to complete this function imported from the ClassifyNB script.
# Be sure to change to that code tab to complete this quiz.
clf = classify(features_train, labels_train)
### draw the decision boundary with the test points overlaid
prettyPicture(clf, features_test, labels_test)
output_image("test.png", "png", open("test.png", "rb").read()) 

# class_vis.py
#from udacityplots import *
import warnings

import matplotlib 

import matplotlib.pyplot as plt
import pylab as pl
import numpy as np

#import numpy as np
#import matplotlib.pyplot as plt

def prettyPicture(clf, X_test, y_test):
    x_min = 0.0; x_max = 1.0
    y_min = 0.0; y_max = 1.0

    # Plot the decision boundary. For that, we will assign a color to each
    # point in the mesh [x_min, x_max]x[y_min, y_max].
    h = .01  # step size in the mesh
    xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))
    Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])

    # Put the result into a color plot
    Z = Z.reshape(xx.shape)
    plt.xlim(xx.min(), xx.max())
    plt.ylim(yy.min(), yy.max())

    plt.pcolormesh(xx, yy, Z, cmap=pl.cm.seismic)

    # Plot also the test points
    grade_sig = [X_test[ii][0] for ii in range(0, len(X_test)) if y_test[ii]==0]
    bumpy_sig = [X_test[ii][1] for ii in range(0, len(X_test)) if y_test[ii]==0]
    grade_bkg = [X_test[ii][0] for ii in range(0, len(X_test)) if y_test[ii]==1]
    bumpy_bkg = [X_test[ii][1] for ii in range(0, len(X_test)) if y_test[ii]==1]

    plt.scatter(grade_sig, bumpy_sig, color = "b", label="fast")
    plt.scatter(grade_bkg, bumpy_bkg, color = "r", label="slow")

import base64
import json
import subprocess

def output_image(name, format, bytes):
    image_start = "BEGIN_IMAGE_f9825uweof8jw9fj4r8"
    image_end = "END_IMAGE_0238jfw08fjsiufhw8frs"
    data = {}
    data['name'] = name
    data['format'] = format
    data['bytes'] = base64.encodestring(bytes)
    print (image_start+json.dumps(data)+image_end) 

# prep_terrain_data.py
import random

def makeTerrainData(n_points=1000):
### make the toy dataset
    grade = [random.random() for ii in range(0,n_points)]
    bumpy = [random.random() for ii in range(0,n_points)]
    error = [random.random() for ii in range(0,n_points)]
    y = [round(grade[ii]*bumpy[ii]+0.3+0.1*error[ii]) for ii in range(0,n_points)]
    for ii in range(0, len(y)):
        if grade[ii]>0.8 or bumpy[ii]>0.8:
            y[ii] = 1.0

### split into train/test sets
    X = [[gg, ss] for gg, ss in zip(grade, bumpy)]
    split = int(0.75*n_points)
    X_train = X[0:split]
    X_test  = X[split:]
    y_train = y[0:split]
    y_test  = y[split:]

    grade_sig = [X_train[ii][0] for ii in range(0, len(X_train)) if y_train[ii]==0]
    bumpy_sig = [X_train[ii][1] for ii in range(0, len(X_train)) if y_train[ii]==0]
    grade_bkg = [X_train[ii][0] for ii in range(0, len(X_train)) if y_train[ii]==1]
    bumpy_bkg = [X_train[ii][1] for ii in range(0, len(X_train)) if y_train[ii]==1]

#    training_data = {"fast":{"grade":grade_sig, "bumpiness":bumpy_sig}
#            , "slow":{"grade":grade_bkg, "bumpiness":bumpy_bkg}}

    grade_sig = [X_test[ii][0] for ii in range(0, len(X_test)) if y_test[ii]==0]
    bumpy_sig = [X_test[ii][1] for ii in range(0, len(X_test)) if y_test[ii]==0]
    grade_bkg = [X_test[ii][0] for ii in range(0, len(X_test)) if y_test[ii]==1]
    bumpy_bkg = [X_test[ii][1] for ii in range(0, len(X_test)) if y_test[ii]==1]

    test_data = {"fast":{"grade":grade_sig, "bumpiness":bumpy_sig}
            , "slow":{"grade":grade_bkg, "bumpiness":bumpy_bkg}}

    return X_train, y_train, X_test, y_test
#    return training_data, test_data

# ClassifyNB.py
def classify(features_train, labels_train):
    ### import the sklearn module for GaussianNB
    ### create classifier
    ### fit the classifier on the training features and labels
    ### return the fit classifier
    ### your code goes here!
    from sklearn.naive_bayes import GaussianNB
    clf = GaussianNB()




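As far as I can tell, your classify function does not return anything, yet you assign its return value to a variable. In Python, a function without a return statement returns None, so that variable ends up as None. To fix this, add a return statement to the classify function: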

def classify(features_train, labels_train):   
    ### import the sklearn module for GaussianNB
    ### create classifier
    ### fit the classifier on the training features and labels
    ### return the fit classifier
    ### your code goes here!
    from sklearn.naive_bayes import GaussianNB
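    clf = GaussianNB()
    # fit on the training data so that predict() can be called later
    clf.fit(features_train, labels_train)
    return clf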
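
To make the mechanism concrete, here is a minimal sketch of why a missing return produces the reported error. The toy data and the names classify_without_return and classify_with_return are invented for illustration and are not part of the course files. The sketch also calls fit before returning, since the exercise comments ask for a fitted classifier.

```python
from sklearn.naive_bayes import GaussianNB

# Toy data, invented for this illustration: two well-separated classes.
X = [[0.1, 0.1], [0.2, 0.2], [0.8, 0.9], [0.9, 0.8]]
y = [0, 0, 1, 1]


def classify_without_return(features_train, labels_train):
    """Fits a classifier but never returns it, so callers receive None."""
    clf = GaussianNB()
    clf.fit(features_train, labels_train)


def classify_with_return(features_train, labels_train):
    """Fits the classifier and hands it back to the caller."""
    clf = GaussianNB()
    clf.fit(features_train, labels_train)
    return clf


broken = classify_without_return(X, y)
print(broken is None)  # True

try:
    broken.predict([[0.15, 0.15]])
except AttributeError as exc:
    print(exc)  # 'NoneType' object has no attribute 'predict'

fixed = classify_with_return(X, y)
print(fixed.predict([[0.15, 0.15], [0.85, 0.85]]))
```

The traceback points at class_vis.py only because that is where predict is first called on the None value; the cause sits in the function that produced it.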


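Even after adding the return statement to the function, it still shows the same error. Judging by the error message (i.stack.imgur.com/J8k2p.png), I think the error is in the class_vis.py module.

I checked again, and apart from the output_image function it works for me. If you want to see the result in the notebook, don't forget to put %matplotlib inline at the start of the Jupyter notebook.

You should also remove the shebang from prep_terrain_data.py.

Thank you very much, I got the output. Some statements in class_vis.py were written as comments; after changing them to code and removing the shebang, I got the output. However, I also got some errors along with the output, as shown in the image (i.stack.imgur.com/Q7epK.png). Can you tell me why I am getting these errors?

I'm not sure what you are doing in outputImage. The json.dumps() part seems to raise the error. You can delete this function entirely and the code will work as expected.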
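
The last comment points at json.dumps() inside output_image. One plausible cause, assuming the notebook runs Python 3 (the error screenshot is not reproduced here, so this is a guess), is that base64 encoders return bytes, which json.dumps cannot serialize. In addition, base64.encodestring was removed in Python 3.9. Below is a minimal sketch of a variant that decodes the base64 output to str first. The name build_image_payload is hypothetical, and the marker strings are copied from the question's code.

```python
import base64
import json

IMAGE_START = "BEGIN_IMAGE_f9825uweof8jw9fj4r8"
IMAGE_END = "END_IMAGE_0238jfw08fjsiufhw8frs"


def build_image_payload(name, fmt, raw_bytes):
    """Hypothetical stand-in for output_image: returns the marker-wrapped JSON string."""
    data = {
        "name": name,
        "format": fmt,
        # b64encode returns bytes; decode to str so json.dumps can serialize it.
        "bytes": base64.b64encode(raw_bytes).decode("ascii"),
    }
    return IMAGE_START + json.dumps(data) + IMAGE_END


# Stand-in bytes instead of a real PNG file, so the sketch runs anywhere.
payload = build_image_payload("test.png", "png", b"\x89PNG fake image bytes")
print(payload)
```

Outside the Udacity environment nothing appears to read these markers, so simply deleting the call, as the comment suggests, is an equally valid fix.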

