TensorFlow Object Detection in Practice: Preprocessing the Training Data

Posted 2022-11-19 海里的鱼2022

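I. Background: tasks to complete while preparing the dataset:

1. Walk the directory and its subdirectories (currently two levels deep) to collect the jpg/jpeg files and the heic images taken with an iPhone.

2. If Live Photos is enabled on the iPhone, every picture is stored as a compressed image plus a mov animation clip. The clips only take up space and can be deleted directly while scanning the files.

3. Resize every image to a width of 640 pixels to shorten the later training time.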
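The width-640 target in item 3 comes down to one piece of arithmetic: the height is scaled by the same factor as the width. Below is a minimal sketch of that calculation. The helper name `scaled_size` and its `target_width` parameter are made up for illustration and do not appear in the original script.

```python
def scaled_size(width, height, target_width=640):
    """Return (new_width, new_height) with the aspect ratio preserved.

    Hypothetical helper for illustration; the tuple is ordered (width, height).
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    # Scale the height by the same factor that maps width to target_width.
    new_height = round(height * target_width / width)
    # Never let the rounded height collapse to zero on extremely wide images.
    return target_width, max(1, new_height)


# A 4032 x 3024 landscape photo shrinks to 640 x 480.
print(scaled_size(4032, 3024))
```

`cv2.resize` takes its `dsize` argument in (width, height) order, so a tuple built this way can be passed to it unchanged.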

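Processing steps still to be added:

1. Automatic white balance adjustment.

2. Automatic exposure adjustment: photos taken in low light or with strong glare (overexposure) need correction to improve image quality.

3. Images whose quality is too poor (out of focus or blurry) should be skipped or deleted outright.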
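None of these three steps exists in the script yet, so the following is only a sketch of how they might start, not the author's method. It uses NumPy alone and swaps in three common heuristics: gray-world white balance, the fraction of near-black and near-white pixels as an exposure check, and the variance of a Laplacian as a sharpness score. All function names and the `low`/`high` thresholds are assumptions chosen for illustration and would need tuning on real photos.

```python
import numpy as np


def gray_world_balance(bgr):
    """Scale each channel so all channel means match (gray-world assumption)."""
    img = bgr.astype(np.float64)
    channel_means = img.reshape(-1, 3).mean(axis=0)
    gray_mean = channel_means.mean()
    # Guard against division by zero when a channel is entirely black.
    gains = gray_mean / np.maximum(channel_means, 1e-6)
    return np.clip(np.rint(img * gains), 0, 255).astype(np.uint8)


def clipped_fractions(gray, low=5, high=250):
    """Return (dark_fraction, bright_fraction); thresholds are illustrative."""
    total = gray.size
    dark = np.count_nonzero(gray <= low) / total
    bright = np.count_nonzero(gray >= high) / total
    return dark, bright


def sharpness_score(gray):
    """Variance of a 4-neighbour Laplacian; low values suggest a blurry image."""
    g = gray.astype(np.float64)
    lap = (g[:-2, 1:-1] + g[2:, 1:-1] + g[1:-1, :-2] + g[1:-1, 2:]
           - 4.0 * g[1:-1, 1:-1])
    return float(lap.var())


# Tiny synthetic inputs: a colour-tinted patch and a high-contrast checkerboard.
tinted = np.full((4, 4, 3), (50, 100, 150), dtype=np.uint8)
checker = (np.indices((8, 8)).sum(axis=0) % 2 * 255).astype(np.uint8)
print(gray_world_balance(tinted)[0, 0])
print(clipped_fractions(checker))
print(sharpness_score(checker) > 0.0)
```

A photo could then be skipped when its sharpness score falls below a cutoff, or flagged when a large share of its pixels is clipped. Those cutoffs depend on the camera and the scene, so they are deliberately left out.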

II. Processing flow:

Scan the directories to build a list of image files, then process the images in that list one by one.
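The scan step can be sketched with the standard library alone. This version uses `os.walk`, which descends through every subdirectory level rather than stopping at the second one, and it matches extensions case-insensitively because iPhone exports are often named like `IMG_0001.HEIC`. The function name `collect_images` and the `delete_mov` flag are assumptions made for this example.

```python
import os
import tempfile

IMAGE_EXTS = (".jpg", ".jpeg", ".heic")


def collect_images(root, delete_mov=False):
    """Return sorted image paths under root; optionally delete .mov clips."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            lower = name.lower()
            full = os.path.join(dirpath, name)
            if lower.endswith(IMAGE_EXTS):
                found.append(full)
            elif delete_mov and lower.endswith(".mov"):
                # Live Photo clip: not needed for training.
                os.remove(full)
    return sorted(found)


# Demo on a throwaway directory so nothing real is deleted.
with tempfile.TemporaryDirectory() as demo_root:
    os.makedirs(os.path.join(demo_root, "phone"))
    for parts in (("a.jpg",), ("phone", "IMG_1.HEIC"), ("phone", "IMG_1.MOV")):
        open(os.path.join(demo_root, *parts), "w").close()
    for path in collect_images(demo_root, delete_mov=True):
        print(os.path.relpath(path, demo_root))
```

Keeping deletion behind an explicit flag makes it possible to run a dry scan first and check the list before anything is removed.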

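III. Key technical points:

1. Traversing folders and directories, plus deleting and saving files.

2. Reading heic files, including converting the data read through Pillow into a format that OpenCV can process.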
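The conversion in item 2 is mostly a matter of channel order: Pillow hands back pixels as RGB, while OpenCV functions such as `cv2.imwrite` assume BGR. The sketch below does the swap by hand with NumPy to show what happens to the array. The helper name `rgb_to_bgr` is hypothetical, and the input is assumed to be the H x W x 3 array obtained from `np.asarray` on an RGB Pillow image.

```python
import numpy as np


def rgb_to_bgr(rgb):
    """Reverse the channel axis of an H x W x 3 RGB array to get BGR order."""
    if rgb.ndim != 3 or rgb.shape[2] != 3:
        raise ValueError("expected an H x W x 3 array")
    # The slice is a view with a negative stride; copy it into a contiguous
    # buffer so that downstream code receives an ordinary array.
    return np.ascontiguousarray(rgb[:, :, ::-1])


# A single pure-red RGB pixel becomes (0, 0, 255) in BGR order.
red_rgb = np.array([[[255, 0, 0]]], dtype=np.uint8)
print(rgb_to_bgr(red_rgb)[0, 0].tolist())
```

Skipping this swap does not raise an error. The saved images simply come out with red and blue exchanged, which is easy to miss until the training data is inspected.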

The code:

###resize the jpg to width=640

import cv2
import os
import pyheif
from PIL import Image
import shutil
import numpy as np




jpg_path="dataset_test/"
dsc_path="dataset_test/resized/"

jpg_list=[]
IMAGE_EXTS=(".jpg",".jpeg",".heic")
for file in os.listdir(jpg_path):
    print("check filelist",file)
    full_path=os.path.join(jpg_path,file)
    # compare in lower case so that names like IMG_0001.HEIC are matched too
    if file.lower().endswith(IMAGE_EXTS):
        jpg_list.append(full_path)
    elif os.path.isdir(full_path):
        # descend one level only; other plain files are ignored instead of crashing listdir
        print("process the next layer. ",file)
        for item in os.listdir(full_path):
            if item.lower().endswith(IMAGE_EXTS):
                jpg_list.append(os.path.join(full_path,item))
            elif item.lower().endswith(".mov"):
                # Live Photo clip: not needed for training, delete it
                os.remove(os.path.join(full_path,item))
print(jpg_list)

count=0
for file in jpg_list:
    path,file_name= os.path.split(file)
    image_id,_=os.path.splitext(file_name)
    ext=os.path.splitext(file_name)[1].lower()
    if ext in (".jpg",".jpeg"):
        image=cv2.imread(file,1)
    elif ext==".heic":
        # decode with pyheif, wrap the raw bytes in a Pillow image,
        # then convert RGB to the BGR array layout that OpenCV expects
        img = pyheif.read_heif(file)
        image = Image.frombytes(img.mode, img.size, img.data, "raw", img.mode, img.stride)
        image = cv2.cvtColor(np.asarray(image.convert("RGB")),cv2.COLOR_RGB2BGR)

    else: 
        continue
    print(file,image)
    if image is None:
        print("read file error. ",file)
        continue
    else:
    
        print("processing: ",file,image_id)
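        # image.shape is (rows, columns, channels), so height comes first
        height,width,channels=image.shape
        print("image size: ",image.shape)
    #cv2.imshow("check",image)
        # waitKey is only meaningful while the imshow window above is enabled
        key=cv2.waitKey(0)
        # cv2.resize expects (width, height); scale the height to keep the aspect ratio
        newsize=(640,int(height/(width/640)))
        print("new size: ",newsize)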
        if not os.path.exists(dsc_path):
            os.makedirs(dsc_path) 
        image_resize=cv2.resize(image,newsize)
        resized_file=dsc_path+str(image_id)+".jpg"
        print("stored path: ",resized_file)
        cv2.imwrite(resized_file,image_resize)
    
             
    
    count+=1
    if key==27:
        
        break
print("check over.total: ",count)    
cv2.destroyAllWindows()
