TensorFlow: reading images with labels

Posted 2023-02-16

【Title】Tensorflow read images with labels 【Posted】2016-03-24 06:28:22 【Question】:

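I am building a standard image classification model with TensorFlow. My input consists of images, each assigned a label (a number in {0, 1}). The data can therefore be stored in a list with the following format: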

/path/to/image_0 label_0
/path/to/image_1 label_1
/path/to/image_2 label_2
...

I want to use TensorFlow's queuing system to read my data and feed it to my model. Ignoring the labels, this is easy to achieve with string_input_producer and WholeFileReader. Here is the code:

import tensorflow as tf

def read_my_file_format(filename_queue):
  reader = tf.WholeFileReader()
  key, value = reader.read(filename_queue)
  example = tf.image.decode_png(value)
  return example

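# Remove the label to obtain a list of /path/to/image_x entries.
# rsplit drops everything after the last space, so a trailing newline or a
# multi-digit label does not leave stray characters in the path.
image_list = [line.rsplit(' ', 1)[0] for line in image_label_list]

input_queue = tf.train.string_input_producer(image_list)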
input_images = read_my_file_format(input_queue)

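However, the labels are lost in this process, because the image data is deliberately shuffled as part of the input pipeline. What is the simplest way to push the labels through the input queue together with the image data?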

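【Comments】:

I have a question: how do I assign labels to the images? I have 3 folders of images and I want to assign the correct label to each image. How can I do that?

Well, that is task-specific and depends on what you want to classify. Say you have images of cats and dogs. You can define cats := 0 and dogs := 1. You then assign 0 to every image showing a cat and 1 to every image showing a dog. You are free to assign labels however you like, as long as there is a clear semantic criterion, so that the network can generalize well.

【Answer 1】: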

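Solving this problem involves three main steps:

Populate tf.train.string_input_producer() with a list of strings, each containing the raw, space-delimited filename and label.

Use tf.read_file(filename) rather than tf.WholeFileReader() to read your image files. tf.read_file() is a stateless op that consumes a single filename and produces a single string containing the contents of the file. It has the advantage of being a pure function, so it is easy to associate data with its input and output. For example, your read_my_file_format function would become: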

def read_my_file_format(filename_and_label_tensor):
  """Consumes a single filename and label as a ' '-delimited string.

  Args:
    filename_and_label_tensor: A scalar string tensor.

  Returns:
    Two tensors: the decoded image, and the string label.
  """
  filename, label = tf.decode_csv(filename_and_label_tensor, [[""], [""]], " ")
  file_contents = tf.read_file(filename)
  example = tf.image.decode_png(file_contents)
  return example, label

Invoke the new version of read_my_file_format by passing a single dequeued element from input_queue:

image, label = read_my_file_format(input_queue.dequeue())

You can then use the image and label tensors in the rest of your model.
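To make the string format from the first step concrete, here is a minimal plain-Python sketch (no TensorFlow required) that builds the space-delimited records and splits one back into its two fields, which is roughly the per-record effect of the tf.decode_csv call above. The helper names make_records and split_record and the sample paths are hypothetical, and the sketch assumes that paths contain no spaces.

```python
def make_records(filenames, labels):
    """Joins each filename with its label into one 'filename label' string."""
    return ["%s %s" % (f, l) for f, l in zip(filenames, labels)]


def split_record(record):
    """Splits a 'filename label' string into (filename, label) strings."""
    # Split on the last space only, so the label is always the final field.
    filename, label = record.rsplit(" ", 1)
    return filename, label


# Hypothetical sample data in the format used in the question.
records = make_records(["/path/to/image_0", "/path/to/image_1"], [0, 1])
print(records)
print(split_record(records[0]))
```

As in the answer's function, the label comes back as a string here; it would still need converting to an integer before being used as a class index.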

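【Comments】:

Hello, if the files are in .mat format, they first need to be loaded with h5py and then passed to convert_to_tensor. How can filename_and_label_tensor be parsed into a NumPy string so that h5py.Open() can be used? TensorFlow currently does not seem to have decode_mat support. I don't understand why TensorFlow needs the string to be a Tensor.

【Answer 2】: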

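Using slice_input_producer gives a much cleaner solution. The slice input producer lets us create an input queue containing arbitrarily many separable values. The snippet for this problem looks like this: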

import tensorflow as tf

def read_labeled_image_list(image_list_file):
    """Reads a .txt file containing paths and labels.
    Args:
       image_list_file: a .txt file with one '/path/to/image label' pair per line
    Returns:
       Two lists: all filenames in image_list_file, and their integer labels
    """
    filenames = []
    labels = []
    # The with-block closes the file even if a line fails to parse.
    with open(image_list_file, 'r') as f:
        for line in f:
            # rstrip drops the newline without eating the last character
            # of a final line that has no trailing newline.
            filename, label = line.rstrip('\n').split(' ')
            filenames.append(filename)
            labels.append(int(label))
    return filenames, labels

def read_images_from_disk(input_queue):
    """Consumes a single (filename, label) slice from the input queue.
    Args:
      input_queue: a list of two tensors, the filename and the label,
        as returned by tf.train.slice_input_producer.
    Returns:
      Two tensors: the decoded image, and the integer label.
    """
    label = input_queue[1]
    file_contents = tf.read_file(input_queue[0])
    example = tf.image.decode_png(file_contents, channels=3)
    return example, label

# Reads paths of images together with their labels.
# `filename` is the path to the list file described above.
image_list, label_list = read_labeled_image_list(filename)

images = tf.convert_to_tensor(image_list, dtype=tf.string)
labels = tf.convert_to_tensor(label_list, dtype=tf.int32)

# Makes an input queue
input_queue = tf.train.slice_input_producer([images, labels],
                                            num_epochs=num_epochs,
                                            shuffle=True)

image, label = read_images_from_disk(input_queue)

# Optional Preprocessing or Data Augmentation
# tf.image implements most of the standard image augmentation
image = preprocess_image(image)
label = preprocess_label(label)

# Optional Image and Label Batching
image_batch, label_batch = tf.train.batch([image, label],
                                          batch_size=batch_size)
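To see why slicing keeps images and labels aligned, it may help to picture the producer in plain Python. The sketch below is only a conceptual, single-epoch model of the idea (one shuffled index order applied to every list at once), not how TensorFlow implements its queue; the name slice_producer and its seed parameter are hypothetical.

```python
import random


def slice_producer(columns, shuffle=True, seed=None):
    """Yields one tuple per index, taking the same index from every column."""
    length = len(columns[0])
    if any(len(c) != length for c in columns):
        raise ValueError("all columns must have the same length")
    order = list(range(length))
    if shuffle:
        # Shuffle the indices once, so every column is reordered identically.
        random.Random(seed).shuffle(order)
    for i in order:
        yield tuple(c[i] for c in columns)


# Hypothetical sample data in the format used in the question.
images = ["/path/to/image_0", "/path/to/image_1", "/path/to/image_2"]
labels = [0, 1, 0]
for image, label in slice_producer([images, labels], seed=0):
    print(image, label)
```

Whatever order the shuffle produces, each path is emitted with the label stored at the same position, which is the property the question asks for.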

See also generic_input_producer in the TensorVision examples for a complete input pipeline.

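【Comments】:

You seem to pass num_labels to read_images_from_disk, but it is not a parameter of that function. Where should I actually pass this information?

Sorry, that was a mistake I made while producing a minimal example from my larger code. I have now removed num_labels. It is not needed when reading from a file. If you know num_labels in advance, you can use it for checks (assertions) and to generate one-hot labels. The latter is no longer necessary in many cases, because tf.nn.sparse_softmax_cross_entropy_with_logits allows integer labels to be used directly.

My question is: what is the difference between these approaches (WholeFileReader vs tf.read_file) in terms of performance and the queues created for buffered loading?

There is none. I also posted to the ML.

I got it working by setting the shape of the image in the read_images_from_disk function: example.set_shape([IMAGE_HEIGHT, IMAGE_WIDTH, NUM_CHANNELS])

【Answer 3】: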

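In addition to the answers provided, there are a couple of other things you can do:

Encode your labels into the filenames. If you have N different classes, you can rename your files to something like 0_file001, 5_file002, N_file003. Then, when you read the data from a reader with key, value = reader.read(filename_queue), your key/value are:

The output of Read will be a filename (key) and the contents of that file (value)

Then parse your filename, extract the label, and convert it to an int. This requires a little preprocessing of the data.

Use TFRecords, which let you store the data and labels in the same file.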
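For the first option, the filename parsing might look like the plain-Python sketch below. It assumes the naming scheme from the example above (an integer label, then an underscore, then the rest of the name); the function name label_from_filename and the sample paths are hypothetical. Inside a TensorFlow graph the key is a string tensor, so the same logic would have to be expressed with TensorFlow string ops or applied before the filenames are queued.

```python
import os


def label_from_filename(path):
    """Returns the integer label encoded before the first '_' in the basename."""
    basename = os.path.basename(path)
    # partition splits at the first underscore only, so underscores later in
    # the name do not affect the label.
    prefix, _, _ = basename.partition("_")
    return int(prefix)


# Hypothetical paths following the 0_file001 / 5_file002 scheme.
print(label_from_filename("/path/to/0_file001"))
print(label_from_filename("/path/to/5_file002"))
```

A name without a numeric prefix raises ValueError, which makes misnamed files visible early instead of silently mislabeling them.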
