How to read one column of text with the Google Cloud Vision API

Posted 2023-04-17

【Title】: How to read one column texts with Google Cloud Vision API

【Posted】: 2019-05-25 17:38:39

【Question】:

I have the following document image.

When I try to convert the image to text, the result is the following:

Top text

Ref: Rad: Dte: Ddo:

Ejecutivo 76520400300 Banco de Bogotá Luz Adriana

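Bottom text

The problem is that the Google API recognizes the text as two columns. How can I configure the Google API to return it as a single column?

My goal is to get:

Top text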

Ref:Ejecutivo Rad: 76520400300 Dte: Banco de Bogotá Ddo:Luz Adriana

Bottom text

【Comments】:

【Answer 1】:

According to an update on the issue, a Google team member replied that Document AI works better than Cloud Vision for this case.

【Comments】:

This does not provide an answer to the question. Once you have sufficient reputation you will be able to comment on any post; instead, provide answers that don't require clarification from the asker. - From Review

Thanks for the feedback.

【Answer 2】:

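The Cloud Vision API has no request attribute for specifying the layout in which the file's text should be read or ordered. Instead, I think the available workaround is to use the BoundingPoly and Vertex response attributes, which give the coordinates of each word found in the image, and to process that vertex data in your own code to decide how the text should be grouped into columns and rows. You can check this link, which contains some example responses that include these attributes.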

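If this does not meet your current needs, you can use the Send Feedback button located at the bottom left and top right of the service public documentation, and you can also look at the Issue Tracker tool to raise a Vision API feature request and notify Google of the functionality you need.

【Comments】: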

Archived page for the link in the previous answer about the bounding polygon and vertex response examples.
