Input 0 of layer conv2d is incompatible with layer: expected axis -1 of input shape to have value 1 but received input with shape [None, 64, 64, 3]


Posted: 2021-08-05 18:05:02

Question:

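I am running a model on EMNIST (128x128 grayscale images), but I am having trouble understanding how to load the data into TensorFlow correctly for modeling.

I have been following the flowers example provided by TensorFlow (https://www.tensorflow.org/hub/tutorials/image_feature_vector), except for the CNN structure, until model.fit() suddenly failed with the error Input 0 of layer conv2d_120 is incompatible with the layer: expected axis -1 of input shape to have value 1 but received input with shape [None, 64, 64, 3]

Loading the dataset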

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential

batch_size = 32
image_w = 64
image_h = 64
seed = 123
data_dir = r'B:\Datasets\EMNIST Digital Number & Digits\OriginalDigits'  # raw string so the backslashes are not read as escape sequences

train_df = tf.keras.preprocessing.image_dataset_from_directory(
  data_dir,
  validation_split=0.2,
  subset="training",
  seed=seed,
  image_size=(image_w,image_h),
  batch_size=batch_size)

val_df = tf.keras.preprocessing.image_dataset_from_directory(
  data_dir,
  validation_split=0.2,
  subset="validation", #Same exact code block ... this is the only line of difference
  seed=seed,
  image_size=(image_w,image_h),
  batch_size=batch_size)

Found 10160 files belonging to 10 classes.
Using 8128 files for training.
Found 10160 files belonging to 10 classes.
Using 2032 files for validation.
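
The counts in this output follow from validation_split=0.2. A quick arithmetic check, assuming the validation count is the file count times the split fraction truncated to an integer, with the remainder used for training (the variable names here are illustrative only):

```python
total_files = 10160          # from "Found 10160 files"
validation_split = 0.2

# Assumed rule: validation count is the truncated product; training gets the rest.
num_val = int(total_files * validation_split)
num_train = total_files - num_val

print(num_train, num_val)
```

Both numbers match the log, so the loader is splitting the files as intended; the problem lies elsewhere.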

Confirming the data loaded correctly

import matplotlib.pyplot as plt
plt.figure(figsize=(10, 10))
for images, labels in train_df.take(1): # take(1) yields at most one element of the dataset, i.e. one batch of images
    for i in range(9):
        ax = plt.subplot(3, 3, i + 1)
        plt.imshow(images[i].numpy().astype("uint8"))
        plt.title(labels[i].numpy().astype("str"))
        plt.axis("off")

Processing the dataset into tf.data.Dataset objects

class_labels = train_df.class_names
num_classes = len(class_labels)
print(class_labels,num_classes)

['0', '1', '2', '3', '4', '5', '6', '7', '8', '9'] 10

AUTOTUNE = tf.data.experimental.AUTOTUNE

train_df_modeling = train_df.cache().shuffle(len(train_df)) # Cache training data in memory, then shuffle; len(train_df) counts batches, so whole batches are shuffled
val_df_modeling = val_df.cache().shuffle(len(train_df)) # Cache validation data in memory (shuffling is not needed for validation, but is harmless)
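
Note that len() of a batched dataset counts batches, not images, so the shuffle buffer above is sized in batches. A small sketch of the arithmetic, assuming the final partial batch is kept rather than dropped:

```python
import math

batch_size = 32
train_images, val_images = 8128, 2032   # counts reported by the loader

# Number of batches, rounding up so a final partial batch is counted.
train_batches = math.ceil(train_images / batch_size)
val_batches = math.ceil(val_images / batch_size)

print(train_batches, val_batches)
```

Under that assumption the training set has 254 full batches and the validation set has 64 batches, the last one only half full.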

Defining the model

#Model from https://www.kaggle.com/henseljahja/simple-tensorflow-cnn-98-8
model = keras.models.Sequential([

    layers.experimental.preprocessing.Rescaling(1./255, input_shape=(image_h, image_w, 1)), #(64,64,1)
    layers.Conv2D(64, 7, padding='same', activation='relu'),    
    layers.GaussianNoise(0.2),
    layers.MaxPooling2D(pool_size=2),
    layers.Conv2D(filters=128, kernel_size=3, activation='relu', padding="SAME"),
    layers.Conv2D(filters=128, kernel_size=3, activation='relu', padding="SAME"),
    layers.MaxPooling2D(pool_size=2),
    layers.Conv2D(filters=128, kernel_size=3, activation='relu', padding="SAME"),
    layers.Conv2D(filters=128, kernel_size=3, activation='relu', padding="SAME"),
    layers.MaxPooling2D(pool_size=2),
    layers.Flatten(),
    layers.Dense(units=256, activation='relu'),
    layers.Dropout(0.5),
    layers.Dense(units=128, activation='relu'),
    layers.Dropout(0.5),
    layers.Dense(units=64, activation='relu'),
    layers.Dropout(0.5),    
    keras.layers.Dense(num_classes, activation='softmax'), #10 outputs [0,1,2,3,4,5,6,7,8,9]
])

model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
rescaling (Rescaling)        (None, 64, 64, 1)         0
_________________________________________________________________
conv2d (Conv2D)              (None, 64, 64, 64)        640
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 32, 32, 64)        0
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 32, 32, 128)       73856
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 32, 32, 128)       147584
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 16, 16, 128)       0
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 16, 16, 128)       147584
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 16, 16, 128)       147584
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 8, 8, 128)         0
_________________________________________________________________
flatten (Flatten)            (None, 8192)              0
_________________________________________________________________
dense (Dense)                (None, 256)               2097408
_________________________________________________________________
dropout (Dropout)            (None, 256)               0
_________________________________________________________________
dense_1 (Dense)              (None, 128)               32896
_________________________________________________________________
dropout_1 (Dropout)          (None, 128)               0
_________________________________________________________________
dense_2 (Dense)              (None, 64)                8256
_________________________________________________________________
dropout_2 (Dropout)          (None, 64)                0
_________________________________________________________________
dense_3 (Dense)              (None, 10)                650
=================================================================
Total params: 2,656,458
Trainable params: 2,656,458
Non-trainable params: 0
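
The parameter counts in this summary can be checked by hand, which also shows how a convolution's weights depend on the number of input channels. The two helper functions below are written for this illustration and are not part of Keras. Note that the 640 reported for the first conv layer matches a 3x3 kernel rather than the 7x7 kernel in the code above, and there is no row for GaussianNoise, so the summary may have been printed from a slightly different version of the model.

```python
def conv2d_params(kernel_size, in_channels, filters):
    """Kernel weights (k x k x in_channels x filters) plus one bias per filter."""
    return (kernel_size * kernel_size * in_channels + 1) * filters

def dense_params(in_units, out_units):
    """One weight per input/output pair plus one bias per output unit."""
    return (in_units + 1) * out_units

first_conv_3x3 = conv2d_params(3, 1, 64)       # value shown in the summary
first_conv_7x7 = conv2d_params(7, 1, 64)       # kernel size written in the model code
conv_64_to_128 = conv2d_params(3, 64, 128)     # conv2d_1
conv_128_to_128 = conv2d_params(3, 128, 128)   # conv2d_2 to conv2d_4
first_dense = dense_params(8 * 8 * 128, 256)   # flatten output feeding dense

print(first_conv_3x3, first_conv_7x7, conv_64_to_128, conv_128_to_128, first_dense)
```

Because in_channels is baked into the kernel shape, a layer built for 1 channel has no weights it could apply to a 3-channel input.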


Training the model

model.compile(
    loss="sparse_categorical_crossentropy",
    optimizer = 'nadam',
    metrics=['accuracy']
)

result = model.fit(train_df_modeling,
                   validation_data=val_df_modeling,
                   epochs=20,
                   verbose=1)

ValueError:

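I know my problem is shape-related and that [None, 64, 64, 3] is [batch_size, height, width, channels], but I have the following questions:

    1. Why does it expect the input shape to "have value 1"? Shouldn't a Conv2D layer expect an image?
    2. Why does my input have 3 channels? I told it the input has only 1 channel.

    Note: removing the Rescaling layer and simply using Conv2D as the initial layer still gives the same error message, i.e. it expects value 1 but gets 64x64x3.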


Answer 1:

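Well... while typing up the last part of my question, the solution to question #2 occurred to me.

My data (although grayscale) was being read by TensorFlow as RGB, because I never specified otherwise.

Solution

Read the data in as grayscale

Documentation: https://www.tensorflow.org/api_docs/python/tf/keras/preprocessing/image_dataset_from_directory

Parameter of interest: color_mode='grayscale'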
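
To see why this one argument matters: per the linked documentation, color_mode accepts "grayscale", "rgb" (the default), or "rgba", and images are converted to 1, 3, or 4 channels respectively. The helper below is a hypothetical illustration, not part of TensorFlow; it only computes the batch shape the loader would yield for a given setting.

```python
# Channel counts per color_mode, as described in the linked documentation.
CHANNELS = {"grayscale": 1, "rgb": 3, "rgba": 4}

def batch_shape(image_size, color_mode="rgb"):
    """Hypothetical helper: shape of one batch, with None for the batch axis."""
    height, width = image_size
    return (None, height, width, CHANNELS[color_mode])

default_shape = batch_shape((64, 64))                        # color_mode left unset
gray_shape = batch_shape((64, 64), color_mode="grayscale")   # the fix

print(default_shape)
print(gray_shape)
```

With the default, the last axis is 3 even when the files on disk are grayscale, which is exactly the shape reported in the error.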

Modifying my code to make it work:

Only one block of code needed to change (2 variables)

data_dir = r'B:\Datasets\EMNIST Digital Number & Digits\OriginalDigits'  # raw string so the backslashes are not read as escape sequences

train_df = tf.keras.preprocessing.image_dataset_from_directory(
  data_dir,
  validation_split=0.2,
  subset="training",
  seed=seed,
  image_size=(image_w,image_h),
  batch_size=batch_size,
  color_mode='grayscale') #<---- This was the missing link

val_df = tf.keras.preprocessing.image_dataset_from_directory(
  data_dir,
  validation_split=0.2,
  subset="validation",
  seed=seed,
  image_size=(image_w,image_h),
  batch_size=batch_size,
  color_mode='grayscale') #<---- This was the missing link

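Although this solution fixes the model and allows the code to execute... can anyone answer question #1? I am still curious why it thinks the input needs to "have value 1" when I believe the input should be an image.

Discussion:

The error just says that the last axis (-1) of your input shape (image_h, image_w, 1) should be 1, which is correct; the last axis of your image shape was 3, not 1.

That makes more sense, thanks for pointing that out.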
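
To make the comment above concrete: "axis -1" is simply the last entry of the shape, i.e. the channel count. The function below is a simplified stand-in for the compatibility check, written for illustration; it is not Keras's actual implementation, and its name is made up.

```python
def check_last_axis(expected_channels, input_shape):
    """Raise ValueError if axis -1 of input_shape differs from expected_channels."""
    if input_shape[-1] != expected_channels:
        raise ValueError(
            f"expected axis -1 of input shape to have value {expected_channels} "
            f"but received input with shape {list(input_shape)}"
        )

check_last_axis(1, (None, 64, 64, 1))       # grayscale batch: passes silently

try:
    check_last_axis(1, (None, 64, 64, 3))   # RGB batch: rejected
    message = None
except ValueError as exc:
    message = str(exc)

print(message)
```

So the "value 1" in the message refers to the expected number of channels declared in input_shape, not to the pixel values of the image.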
