How do I preprocess the image before giving it to CoreML Model?

Posted 2023-03-27


【Title】: How do I preprocess the image before giving it to CoreML Model? 【Posted】: 2019-03-31 14:05:37 【Question】:

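I created an image similarity model and tested it with the reference data images. I tested the turicreate model and got zero distance for the reference data images, and I got the same result when using this code with the coreml model: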

import PIL.Image
import turicreate as tc

image = tc.image_analysis.resize(reference_data[0]['image'], *reversed(model.input_image_shape))
image = PIL.Image.fromarray(image.pixel_data)
mlmodel.predict({'image': image})

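However, when using the model in iOS as a VNCoreMLModel, none of the reference image tests return zero distance, and most of them are not even the shortest distance, i.e. reference image 0 has its shortest distance to reference id 78. Since the coreml model works in Python, I figured this was a preprocessing issue, so I preprocessed the image myself before passing it to the CoreMLModel. Doing this gave me consistent output, with the shortest distance going to the reference id that matches the reference image (yay). The distance is still not zero, so I tried everything I could think of to alter the image and get some difference, but I cannot get it any closer to zero. Preprocessing code: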

+ (CVPixelBufferRef)pixelBufferForImage:(UIImage *)image sideLength:(CGFloat)sideLength {
    // Resize the image to sideLength x sideLength points.
    UIGraphicsBeginImageContextWithOptions(CGSizeMake(sideLength, sideLength), YES, image.scale);
    [image drawInRect:CGRectMake(0, 0, sideLength, sideLength)];
    UIImage *resizedImage = UIGraphicsGetImageFromCurrentImageContext();
    UIGraphicsEndImageContext();

    CFStringRef keys[2] = {kCVPixelBufferCGImageCompatibilityKey, kCVPixelBufferCGBitmapContextCompatibilityKey};
    CFBooleanRef values[2] = {kCFBooleanTrue, kCFBooleanTrue};
    CFDictionaryRef attrs = CFDictionaryCreate(kCFAllocatorDefault, (const void **)keys, (const void **)values, 2, &kCFTypeDictionaryKeyCallBacks, &kCFTypeDictionaryValueCallBacks);
    CVPixelBufferRef buffer = NULL;
    CVReturn status = CVPixelBufferCreate(kCFAllocatorDefault, (size_t)sideLength, (size_t)sideLength, kCVPixelFormatType_32ARGB, attrs, &buffer);
    CFRelease(attrs);
    if (status != kCVReturnSuccess) {
        return NULL;
    }

    // Lock for writing (flags 0): the bitmap context below draws into the buffer.
    CVPixelBufferLockBaseAddress(buffer, 0);
    void *data = CVPixelBufferGetBaseAddress(buffer);
    CGColorSpaceRef colorSpace = CGColorSpaceCreateWithName(kCGColorSpaceSRGB);
    CGContextRef context = CGBitmapContextCreate(data, sideLength, sideLength, 8, CVPixelBufferGetBytesPerRow(buffer), colorSpace, kCGImageAlphaNoneSkipFirst);

    // Flip the coordinate system so the UIKit drawing is not upside down.
    CGContextTranslateCTM(context, 0, sideLength);
    CGContextScaleCTM(context, 1.0, -1.0);

    UIGraphicsPushContext(context);
    [resizedImage drawInRect:CGRectMake(0, 0, sideLength, sideLength)];
    UIGraphicsPopContext();

    // Release the Core Graphics objects; the caller owns the returned buffer.
    CGContextRelease(context);
    CGColorSpaceRelease(colorSpace);
    CVPixelBufferUnlockBaseAddress(buffer, 0);
    return buffer;
}

The mlmodel takes an RGB image of size (224, 224).
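One way to narrow down where the remaining distance comes from is to compare the pixels that each side actually feeds to the model. The Python sketch below is only an illustration under stated assumptions: it assumes the iOS pixel buffer (created as kCVPixelFormatType_32ARGB above) has been dumped to raw bytes with no row padding, and that the Python-side image is available as packed RGB bytes. The helper names `argb_to_rgb` and `mean_abs_diff` are hypothetical, not part of any library.

```python
def argb_to_rgb(argb: bytes) -> bytes:
    """Drop the leading alpha/padding byte from each 4-byte ARGB pixel."""
    if len(argb) % 4 != 0:
        raise ValueError("ARGB data length must be a multiple of 4")
    rgb = bytearray()
    for i in range(0, len(argb), 4):
        rgb += argb[i + 1:i + 4]  # keep R, G, B; skip byte 0 (A or unused)
    return bytes(rgb)


def mean_abs_diff(a: bytes, b: bytes) -> float:
    """Mean absolute per-channel difference between two equal-length buffers."""
    if len(a) != len(b):
        raise ValueError("buffers must have the same length")
    if not a:
        return 0.0
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# Hypothetical 2-pixel example: iOS-side ARGB bytes vs. Python-side RGB bytes.
ios_argb = bytes([255, 10, 20, 30, 255, 40, 50, 60])
python_rgb = bytes([10, 20, 30, 40, 50, 63])
ios_rgb = argb_to_rgb(ios_argb)
print(mean_abs_diff(ios_rgb, python_rgb))  # prints 0.5
```

A result of exactly 0.0 would suggest the two pipelines produce identical input pixels; any larger value points at the resize, color space, or channel layout rather than at the model itself.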

What else can I do to the image to improve my results?

【Comments】:

【Answer 1】:

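I'm in the same boat as you. Since image preprocessing involves steps such as blurring and converting from RGB to grayscale, it is easier to use an Objective-C++ wrapper. The link below gives a good explanation of how to link it in using a header class.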

https://www.timpoulsen.com/2019/using-opencv-in-an-ios-app.html

Hope this helps!

Image credit: https://medium.com/@borisohayon/ios-opencv-and-swift-1ee3e3a5735b

【Comments】:
