Object Tracking with the Vision Framework in iOS 11


Posted: 2018-05-21 06:31:30

Question:

I want to detect an object and then track it using the Vision framework. Detection works, and I have basic tracking going, but the tracking accuracy is poor.

I would like the tracking to stay accurate from frame to frame; at the moment it frequently loses the object as the frames change.

Here is the code I use to detect and track the object:


import UIKit
import AVFoundation
import Vision

class ViewController: UIViewController {

    private lazy var captureSession: AVCaptureSession = {
        let session = AVCaptureSession()
        session.sessionPreset = AVCaptureSession.Preset.photo
        guard let backCamera = AVCaptureDevice.default(for: .video),
              let input = try? AVCaptureDeviceInput(device: backCamera) else {
            return session
        }
        session.addInput(input)
        return session
    }()

    private lazy var cameraLayer: AVCaptureVideoPreviewLayer =
        AVCaptureVideoPreviewLayer(session: self.captureSession)

    private let handler = VNSequenceRequestHandler()
    fileprivate var lastObservation: VNDetectedObjectObservation?

    lazy var highlightView: UIView = {
        let view = UIView()
        view.layer.borderColor = UIColor.red.cgColor
        view.layer.borderWidth = 4
        view.backgroundColor = .clear
        return view
    }()

    override func viewDidLoad() {
        super.viewDidLoad()

        view.layer.addSublayer(cameraLayer)
        view.addSubview(highlightView)

        let output = AVCaptureVideoDataOutput()
        output.setSampleBufferDelegate(self, queue: DispatchQueue(label: "queue"))
        captureSession.addOutput(output)

        captureSession.startRunning()

        let tapGestureRecognizer = UITapGestureRecognizer(target: self, action: #selector(tapAction))
        view.addGestureRecognizer(tapGestureRecognizer)
    }

    override func viewDidLayoutSubviews() {
        super.viewDidLayoutSubviews()
        cameraLayer.frame = view.bounds
    }

    // MARK: - Actions

    @objc private func tapAction(recognizer: UITapGestureRecognizer) {
        highlightView.frame.size = CGSize(width: 120, height: 120)
        highlightView.center = recognizer.location(in: view)

        // Convert the tapped rect from layer coordinates into Vision's
        // normalized coordinate space (origin at the bottom left).
        let originalRect = highlightView.frame
        var convertedRect = cameraLayer.metadataOutputRectConverted(fromLayerRect: originalRect)
        convertedRect.origin.y = 1 - convertedRect.origin.y

        // Seed the tracker with the tapped region.
        lastObservation = VNDetectedObjectObservation(boundingBox: convertedRect)
    }

    fileprivate func handle(_ request: VNRequest, error: Error?) {
        DispatchQueue.main.async {
            guard let newObservation = request.results?.first as? VNDetectedObjectObservation else {
                return
            }
            self.lastObservation = newObservation

            // Convert the normalized bounding box back to layer coordinates
            // and move the highlight over the tracked object.
            var transformedRect = newObservation.boundingBox
            transformedRect.origin.y = 1 - transformedRect.origin.y
            let convertedRect = self.cameraLayer.layerRectConverted(fromMetadataOutputRect: transformedRect)
            self.highlightView.frame = convertedRect
        }
    }
}

extension ViewController: AVCaptureVideoDataOutputSampleBufferDelegate {

    func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
        guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer),
              let observation = lastObservation else {
            return
        }
        // Build a tracking request from the most recent observation
        // and run it on the new frame.
        let request = VNTrackObjectRequest(detectedObjectObservation: observation) { [unowned self] request, error in
            self.handle(request, error: error)
        }
        request.trackingLevel = .accurate
        do {
            try handler.perform([request], on: pixelBuffer)
        } catch {
            print(error)
        }
    }
}
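For reference, the two coordinate conversions above are mirror images of each other: the tap handler maps a layer rect into Vision's normalized, bottom-left-origin space, and the completion handler maps the result back. Here is a minimal sketch that factors them into helpers; the names visionRect/layerRect are just illustrative, not part of the code above:

// Sketch only: helper names are illustrative additions, not from the original code.
private func visionRect(fromLayerRect rect: CGRect) -> CGRect {
    var converted = cameraLayer.metadataOutputRectConverted(fromLayerRect: rect)
    converted.origin.y = 1 - converted.origin.y   // flip to Vision's bottom-left origin
    return converted
}

private func layerRect(fromVisionRect rect: CGRect) -> CGRect {
    var flipped = rect
    flipped.origin.y = 1 - flipped.origin.y       // flip back before converting
    return cameraLayer.layerRectConverted(fromMetadataOutputRect: flipped)
}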

Any help would be appreciated! Thanks.


Answer 1:

I'm not that strong on Vision and Core ML, but on the surface your code looks fine. One thing you can do is check when Vision loses the track in a buffer: if the tracking request's confidence value drops to 0, you have to mark its isLastFrame property as true.

if !trackingRequest.isLastFrame {
    if observation.confidence > 0.7 {
        trackingRequest.inputObservation = observation
    } else {
        trackingRequest.isLastFrame = true
    }
    newTrackingRequests.append(trackingRequest)
}

This makes it easy to tell whether the Vision tracking request has lost the object it was tracking or is simply tracking the wrong one.
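One way to fold that check into the question's captureOutput is to keep a single VNTrackObjectRequest alive across frames, instead of building a new one per buffer, and only feed it observations that pass the confidence gate. The following is a minimal sketch of that idea, not the original code; the stored trackingRequest property and the 0.7 threshold are assumptions carried over from the snippet above:

// Sketch: a persistent tracking request reused across frames.
// `trackingRequest` is an assumed stored property on the view controller.
private var trackingRequest: VNTrackObjectRequest?

func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
    guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }

    // Seed a request once, from the observation created by the tap gesture.
    if trackingRequest == nil, let seed = lastObservation {
        let request = VNTrackObjectRequest(detectedObjectObservation: seed) { [unowned self] req, err in
            self.handle(req, error: err)
        }
        request.trackingLevel = .accurate
        trackingRequest = request
    }
    guard let request = trackingRequest, !request.isLastFrame else { return }

    do {
        try handler.perform([request], on: pixelBuffer)
    } catch {
        print(error)
    }

    // Gate the next frame's input on confidence, as suggested above.
    if let observation = request.results?.first as? VNDetectedObjectObservation {
        if observation.confidence > 0.7 {
            request.inputObservation = observation
        } else {
            request.isLastFrame = true   // tell Vision this track is finished
        }
    }
}

If the user taps a new object, the stored request would also need to be cleared (set back to nil) so that the next frame re-seeds it from the new observation.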

