Keras ImageDataGenerator: why are the outputs of my CNN reversed?
【Posted】: 2019-06-20 00:04:52
【Question】: I'm trying to write a CNN that tells cats and dogs apart. I set my labels to dog: 0 and cat: 1, so I expect the CNN to output 0 if the image is a dog and 1 if it is a cat. However, it is doing the opposite (it gives 0 when the image is a cat and 1 when it is a dog). Please look through my code and see where I went wrong. Thanks.
I'm currently on Python 3.6.8, using a Jupyter notebook (all of the code below is copied and pasted from different cells of the notebook).
import os
import cv2
from random import shuffle
import numpy as np
from keras.preprocessing.image import ImageDataGenerator, load_img, img_to_array
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from keras.models import Sequential
from keras.layers import Dense, Activation, Conv2D, MaxPooling2D, Flatten, Dropout, BatchNormalization
from keras.callbacks import EarlyStopping, ReduceLROnPlateau
%matplotlib inline
train_dir = r'C:\Users\tohho\Desktop\Python pypipapp\Machine Learning\data\PetImages\train'
test_dir = r'C:\Users\tohho\Desktop\Python pypipapp\Machine Learning\data\PetImages\test1'
IMG_WIDTH = 100
IMG_HEIGHT = 100
batch_size = 32
######## THIS IS WHERE I LABELLED 0 FOR DOG AND 1 FOR CAT ##########
filenames = os.listdir(train_dir)
categories = []
for filename in filenames:
    category = filename.split('.')[0]
    if category == 'cat':
        categories.append(1)
    elif category == 'dog':
        categories.append(0)
df = pd.DataFrame({'filename': filenames, 'class': categories}) # making the dataframe
#### I SPLIT THE DATA INTO TRAIN AND VALIDATION DATASETS ####
df_train, df_validate = train_test_split(df, test_size=0.25) # splitting data for train/test
# need to reset index for both dataframs so imagedatagenerator works properly
df_train = df_train.reset_index(drop=True)
df_validate = df_validate.reset_index(drop=True)
print(df_train['class'].value_counts())
print(df_validate['class'].value_counts())
len_training = df_train.shape[0]
len_validate = df_validate.shape[0]
print('{} training eg, {} test eg'.format(len_training, len_validate))
#### CREATE IMAGE DATA GENERATORS ####
train_datagen = ImageDataGenerator(rescale=1./255,
shear_range = 0.2,
zoom_range = 0.2,
horizontal_flip = True)
# our train_datagen generator will use the following transformations on the images
validation_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_dataframe(df_train,
train_dir,
target_size=(IMG_WIDTH, IMG_HEIGHT),
batch_size=batch_size,
x_col='filename',
y_col='class',
class_mode = 'binary')
# generator = ImageDataGenerator(*args).flow_from_dataframe(dataframe, directory, target_size,
# batch_size, x_col, y_col, class_mode)
# your dataframe should be in the format such that x_col = features, y_col = class/label
# binary class mode since output is either 0(dog) or 1(cat)
validation_generator = validation_datagen.flow_from_dataframe(df_validate,
train_dir,
target_size=(IMG_WIDTH, IMG_HEIGHT),
x_col='filename',
y_col='class',
class_mode='binary',
batch_size=batch_size)
########## BUILDING MODEL ############
model = Sequential()
model.add(Conv2D(32, (3,3), input_shape=(IMG_WIDTH, IMG_HEIGHT, 3)))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Conv2D(64, (3,3), input_shape=(IMG_WIDTH, IMG_HEIGHT, 3)))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Conv2D(128, (3,3), input_shape=(IMG_WIDTH, IMG_HEIGHT, 3)))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Flatten()) # remember to flatten conv2d to dense layer
model.add(Dense(256))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Dropout(0.4))
model.add(Dense(1))
model.add(Activation('sigmoid'))
# since we have only 1 output with range [0,1], we use sigmoid
# if there were n categories, use softmax
# binary_crossentropy since output is either 0,1
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()
earlystop = EarlyStopping(monitor='val_loss', patience=3) # stops training if val_loss doesn't improve
learning_rate_reduction = ReduceLROnPlateau(monitor='val_acc',
patience=2,
verbose=1,
factor=0.5,
min_lr=0.000001)
# reduces learning rate if val_acc doesn't improve
callbacks = [earlystop, learning_rate_reduction]
##### FIT THE MODEL #####
epochs = 50
model.fit_generator(train_generator,
steps_per_epoch=len_training//batch_size,
verbose=1,
epochs=epochs,
validation_data=validation_generator,
validation_steps=len_validate//batch_size,
callbacks=callbacks) # fitting model
######### PREDICTING #############
output_generator = validation_datagen.flow_from_dataframe(df_output,
outputdir,
x_col='filename',
y_col=None,
class_mode=None,
target_size=(IMG_WIDTH, IMG_HEIGHT),
shuffle=False,
batch_size=batch_size)
predictions = model.predict_generator(output_generator,
steps=np.ceil(len_output/batch_size))
df_output['probability'] = predictions
df_output['label'] = np.where(df_output['probability'] > 0.5, 'cat','dog')
df_output.head()
The CNN gives the opposite of the correct answer, and when I reverse the outputs I get the expected results (correct identification and accuracy).
I know that simply changing the line df_output['label'] = np.where(df_output['probability'] > 0.5, 'cat','dog')
to df_output['label'] = np.where(df_output['probability'] < 0.5, 'cat','dog')
solves the problem, but that doesn't help me figure out why the CNN's outputs are reversed.
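As a quick check (not part of the original post), the mapping the generators actually built can be printed directly, assuming the train_generator and validation_generator defined above:
# Diagnostic: flow_from_dataframe builds its own label -> integer mapping.
# If it does not match the intended dog=0 / cat=1 scheme, the predictions
# will look "reversed".
print(train_generator.class_indices)
print(validation_generator.class_indices)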
【Comments】:
Could you include the code where you define the dataframe df?
Edited the code above to include it. It's just df = pd.DataFrame({'filename': filenames, 'class': categories}) @sdcbr
【Answer 1】:
The cause of the problem is subtle. I'll illustrate what is happening with a toy example. Suppose we instantiate a data generator with the following code:
# List of image paths, doesn't matter here
image_paths = ['./img_{}.png'.format(i) for i in range(5)]
labels = ... # List of labels
df = pd.DataFrame()
df['filename'] = image_paths
df['class'] = labels
generator = ImageDataGenerator().flow_from_dataframe(dataframe=df,
directory='./',
x_col='filename',
y_col='class')
ImageDataGenerator expects the class column of the dataframe to contain the string labels associated with the images. Internally, it maps these labels to class integers. You can inspect this mapping through the class_indices attribute. After instantiating our generator with the following list of labels:
labels = ['cat', 'cat', 'cat', 'dog', 'dog']
the class_indices mapping looks like this:
generator.class_indices
> {'cat': 0, 'dog': 1}
Let's instantiate the generator again, but change the label of the first image:
labels = ['dog', 'cat', 'cat', 'dog', 'dog']
# After re-instantiating the generator
generator.class_indices
> {'dog': 0, 'cat': 1}
The integer encodings of our classes have been swapped, which shows that the internal mapping from labels to class integers depends on the order in which the different classes are encountered.
You mapped cat to 1 and dog to 0, but ImageDataGenerator treats those values as label strings and maps them to integers internally.
What happens if the first image in your directory is a cat?
labels = [1, 0, 1, 0, 0] # ['cat', 'dog', 'cat', 'dog', 'dog']
# After re-instantiating the generator
generator.class_indices
> {1: 0, 0: 1} # !
That is the source of your confusion. :) To avoid it, either:
- use 'cat' and 'dog' in the label column of your dataframe and let ImageDataGenerator handle the mapping for you, or
- pass a list of classes to the classes argument of flow_from_dataframe to specify the mapping explicitly (a sketch of this option follows the list).
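A minimal sketch of the second option, adapted from the question's code; it assumes the dataframe's class column holds the strings 'dog' and 'cat' (rather than the integers 0 and 1), and that the variables from the question are in scope:
# Sketch: pin the label mapping explicitly (assumes string labels in 'class').
categories = ['dog' if f.split('.')[0] == 'dog' else 'cat' for f in filenames]
df = pd.DataFrame({'filename': filenames, 'class': categories})
df_train, df_validate = train_test_split(df, test_size=0.25)
df_train = df_train.reset_index(drop=True)

train_generator = train_datagen.flow_from_dataframe(
    df_train,
    train_dir,
    x_col='filename',
    y_col='class',
    target_size=(IMG_WIDTH, IMG_HEIGHT),
    batch_size=batch_size,
    class_mode='binary',
    classes=['dog', 'cat'])  # pins 'dog' -> 0, 'cat' -> 1 regardless of encounter order

print(train_generator.class_indices)  # expected: {'dog': 0, 'cat': 1}
With the mapping pinned this way, a prediction above 0.5 means 'cat', so the np.where(... > 0.5, 'cat', 'dog') line in the question keeps its intended meaning.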
【Discussion】:
If I pass the training set through the generator and say the mapped classes are {'dog': 0, 'cat': 1}, and I then pass a new dataset in which a cat is the first image, will cat be mapped to 0 or 1?
Good question. I wonder whether that is the intended behavior. I would have to look at the source code, but I really don't have time right now. Explicitly specifying the mapping yourself is probably safest.
Several bugfix releases of keras_preprocessing have come out in the past two weeks, and this behavior has changed. See github.com/keras-team/keras-preprocessing/releases
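A small, hedged way to check which keras_preprocessing release is installed (pkg_resources ships with setuptools; this check is an addition, not from the thread):
# Print the installed keras_preprocessing version, since the label-inference
# behaviour of flow_from_dataframe has changed across recent releases.
import pkg_resources
print(pkg_resources.get_distribution('keras_preprocessing').version)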