Google OCR working on specific area
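【Title】: Google OCR working on specific area 【Posted】: 2017-12-08 16:16:43 【Question】: I am currently using a SurfaceView together with CameraSource from com.google.android.gms.vision to capture the text detected in the image. Because it captures everything inside the SurfaceView area, I need to discard part of what is recognized.
The goal is to make the SurfaceView work as in the picture: ignore any text detected in the areas marked with red crosses and give me only what is inside the blue square.
Is this possible?
Here is the layout (nothing special):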
<?xml version="1.0" encoding="utf-8"?>
<android.support.constraint.ConstraintLayout
xmlns:android="http://schemas.android.com/apk/res/android"
xmlns:app="http://schemas.android.com/apk/res-auto"
android:layout_
android:layout_>
<SurfaceView
android:id="@+id/fragment_surface"
android:layout_
android:layout_
app:layout_constraintBottom_toBottomOf="parent"
app:layout_constraintLeft_toLeftOf="parent"
app:layout_constraintRight_toRightOf="parent"
app:layout_constraintTop_toTopOf="parent" />
</android.support.constraint.ConstraintLayout>
Here is the OCR-related code of the activity class:
import android.Manifest;
import android.content.pm.PackageManager;
import android.os.Bundle;
import android.support.annotation.NonNull;
import android.support.v4.app.ActivityCompat;
import android.support.v7.app.AppCompatActivity;
import android.util.SparseArray;
import android.view.SurfaceHolder;
import android.view.SurfaceView;
import com.google.android.gms.vision.CameraSource;
import com.google.android.gms.vision.Detector;
import com.google.android.gms.vision.text.TextBlock;
import com.google.android.gms.vision.text.TextRecognizer;
import java.io.IOException;

public class CameraActivity extends AppCompatActivity {

    private SurfaceView surfaceView;
    private CameraSource cameraSource;
    private StringBuilder builder;

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_camera);
        surfaceView = (SurfaceView) findViewById(R.id.fragment_surface);
        TextRecognizer recognizer = new TextRecognizer.Builder(getApplicationContext()).build();
        if (recognizer.isOperational()) {
            cameraSource = new CameraSource.Builder(getApplicationContext(), recognizer)
                    .setFacing(CameraSource.CAMERA_FACING_BACK)
                    .setRequestedPreviewSize(1280, 1024)
                    .setRequestedFps(15.0f)
                    .setAutoFocusEnabled(true)
                    .build();
            surfaceView.getHolder().addCallback(new SurfaceHolder.Callback() {
                @Override
                public void surfaceCreated(SurfaceHolder holder) {
                    // Ask for the camera permission first; the preview starts in onRequestPermissionsResult
                    if (ActivityCompat.checkSelfPermission(getApplicationContext(), Manifest.permission.CAMERA) != PackageManager.PERMISSION_GRANTED) {
                        ActivityCompat.requestPermissions(CameraActivity.this, new String[]{Manifest.permission.CAMERA}, 100);
                        return;
                    }
                    try {
                        cameraSource.start(surfaceView.getHolder());
                    } catch (IOException e) {
                        e.printStackTrace();
                    }
                }

                @Override
                public void surfaceChanged(SurfaceHolder holder, int format, int width, int height) {
                    //
                }

                @Override
                public void surfaceDestroyed(SurfaceHolder holder) {
                    cameraSource.stop();
                }
            });
            recognizer.setProcessor(new Detector.Processor<TextBlock>() {
                @Override
                public void release() {
                    //
                }

                @Override
                public void receiveDetections(Detector.Detections<TextBlock> detections) {
                    final SparseArray<TextBlock> items = detections.getDetectedItems();
                    if (items.size() != 0) {
                        // Concatenate the text of every detected block, wherever it is on screen
                        builder = new StringBuilder();
                        for (int i = 0; i < items.size(); i++) {
                            TextBlock it = items.valueAt(i);
                            builder.append(it.getValue());
                        }
                        String read = builder.toString().trim().replace(" ", "").replace("\n", "");
                        //It continues doing other things here
                    }
                }
            });
        }
    }

    @Override
    public void onRequestPermissionsResult(int requestCode, @NonNull String[] permissions, @NonNull int[] grantResults) {
        switch (requestCode) {
            case 100:
                // grantResults is empty when the request is cancelled, so check its length first
                if (grantResults.length > 0 && grantResults[0] == PackageManager.PERMISSION_GRANTED) {
                    try {
                        if (ActivityCompat.checkSelfPermission(getApplicationContext(), Manifest.permission.CAMERA) != PackageManager.PERMISSION_GRANTED) {
                            return;
                        }
                        cameraSource.start(surfaceView.getHolder());
                    } catch (IOException e) {
                        e.printStackTrace();
                    }
                }
                break;
        }
    }
}
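【Comments】:
Did you figure this out? I am facing the same problem.
Sorry, not with this approach. We ended up combining it with OpenCV. With that tool you can select an exact region and run OCR on it.
Ah, too bad.
【Answer 1】: When the device is in portrait mode (as in your picture), the non-red area should be a cropped part of the camera preview (which fills the whole screen), so:
If you want to display the whole camera preview and run OCR only on the cropped area: you have to take a "screenshot" of the SurfaceView (the full area) and then crop it to get only the pixels you need.
If instead you only want to display the cropped area (because the red areas are filled with other UI such as buttons, TextViews, etc.): you have to make the SurfaceView render only the desired part of the camera preview at that specific position, using Matrix parameters.
I suggest you "upgrade" to a TextureView. It is a bit harder to manage and use, but it lets you crop, zoom and scale the preview as needed by applying a matrix to its internal texture.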
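For the first option, the region drawn on screen has to be translated into the pixel coordinates of the image that is actually cropped. The following is only a minimal sketch of that scaling step. It assumes the image is stretched to fill the view exactly, with no letterboxing and no rotation between the two, which may well not hold on a real device in portrait mode. RegionMapper, viewToFrame and all sizes are hypothetical names and values chosen for illustration, and java.awt.Rectangle stands in for android.graphics.Rect so that the code runs on a plain JVM.

```java
import java.awt.Rectangle;

public class RegionMapper {

    /**
     * Scales a rectangle given in view pixels into frame pixels.
     * Assumption: the frame is stretched to fill the view (no letterboxing, no rotation).
     */
    static Rectangle viewToFrame(Rectangle region, int viewW, int viewH, int frameW, int frameH) {
        double sx = (double) frameW / viewW;
        double sy = (double) frameH / viewH;
        // Scale the edges rather than the size so that rounding cannot shift the right/bottom edge
        int left = (int) Math.round(region.x * sx);
        int top = (int) Math.round(region.y * sy);
        int right = (int) Math.round((region.x + region.width) * sx);
        int bottom = (int) Math.round((region.y + region.height) * sy);
        Rectangle mapped = new Rectangle(left, top, right - left, bottom - top);
        // Clamp to the frame so the result is always a valid crop area
        return mapped.intersection(new Rectangle(0, 0, frameW, frameH));
    }

    public static void main(String[] args) {
        // Hypothetical "blue square" on a 1000x2000 view, mapped onto a 500x1000 frame
        Rectangle blueSquare = new Rectangle(100, 200, 400, 600);
        Rectangle crop = viewToFrame(blueSquare, 1000, 2000, 500, 1000);
        System.out.println(crop.x + "," + crop.y + " " + crop.width + "x" + crop.height);
    }
}
```

The resulting rectangle is what you would pass to whatever cropping call you use on the captured image. If the preview is rotated or center-cropped relative to the view, an extra rotation or offset step is needed before the scaling.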
【Comments】:
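【Answer 2】: Introduction
I am trying to make the smallest possible change to your existing code: keep scanning the whole image as you do now, and filter out the resulting words (or blocks) that fall outside your region.
Code
Take the bounding box of each word and use rect.intersect to check whether it falls within your rectangle (yourRect in the code):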
Rect yourRect = new Rect(10, 20, 30, 40);
rect.intersect(yourRect);//also see Drawable d = d.getBounds();
Try adding this code to your @Override public void receiveDetections() method:
//Loop through each `Block`
for (TextBlock textBlock : blocks) {
    List<? extends Text> textLines = textBlock.getComponents();
    //Loop through each `Line`
    for (Text currentLine : textLines) {
        List<? extends Text> words = currentLine.getComponents();
        //Loop through each `Word`
        for (Text currentWord : words) {
            //Get the Rectangle/BoundingBox of the word
            RectF rect = new RectF(currentWord.getBoundingBox());
            // Check if the word boundingBox is inside the area required
            // using: rect.intersect(yourRect);
            //...
        }
    }
}
So it would look like this:
// Needs: java.util.ArrayList, java.util.List, android.graphics.RectF,
// com.google.android.gms.vision.text.Text
recognizer.setProcessor(new Detector.Processor<TextBlock>() {
    @Override
    public void release() {
        //
    }

    @Override
    public void receiveDetections(Detector.Detections<TextBlock> detections) {
        final SparseArray<TextBlock> items = detections.getDetectedItems();
        if (items.size() != 0) {
            builder = new StringBuilder();
            for (int i = 0; i < items.size(); i++) {
                TextBlock it = items.valueAt(i);
                builder.append(it.getValue());
            }
            String read = builder.toString().trim().replace(" ", "").replace("\n", "");
            //Add all TextBlocks to the `blocks` list
            List<TextBlock> blocks = new ArrayList<>();
            for (int i = 0; i < items.size(); ++i) {
                blocks.add(items.valueAt(i));
            }
            //Loop through each `Block`
            for (TextBlock textBlock : blocks) {
                //Loop through each `Line`
                for (Text currentLine : textBlock.getComponents()) {
                    //Loop through each `Word`
                    for (Text currentWord : currentLine.getComponents()) {
                        //Get the Rectangle/BoundingBox of the word
                        RectF rect = new RectF(currentWord.getBoundingBox());
                        // Check if the word boundingBox is inside the area required
                        // using: rect.intersect(yourRect);
                        //put the word in a filtered list...
                    }
                }
            }
            //It continues doing other things here
        }
    }
});
Only six lines of code!
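One detail worth spelling out is the difference between "overlaps the region" and "lies completely inside the region". On Android, Rect.intersect(Rect) returns true for any overlap and also changes the rectangle it is called on, whereas Rect.contains(Rect) only returns true when the other rectangle is fully inside. The sketch below illustrates the filtering step with the stricter containment test. It is not the Mobile Vision API: Word and wordsInside are hypothetical names, the coordinates are made up, and java.awt.Rectangle stands in for android.graphics.Rect so that the example runs on a plain JVM.

```java
import java.awt.Rectangle;
import java.util.ArrayList;
import java.util.List;

public class RegionFilter {

    /** Hypothetical stand-in for a recognized word and its bounding box. */
    static final class Word {
        final String value;
        final Rectangle box;

        Word(String value, Rectangle box) {
            this.value = value;
            this.box = box;
        }
    }

    /** Keeps only the words whose bounding box lies entirely inside the region. */
    static List<String> wordsInside(List<Word> words, Rectangle region) {
        List<String> kept = new ArrayList<>();
        for (Word w : words) {
            if (region.contains(w.box)) {
                kept.add(w.value);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Region of interest: x from 100 to 300, y from 100 to 200
        Rectangle region = new Rectangle(100, 100, 200, 100);
        List<Word> words = new ArrayList<>();
        words.add(new Word("inside", new Rectangle(120, 120, 50, 20)));
        words.add(new Word("partial", new Rectangle(280, 120, 50, 20))); // overlaps the right edge
        words.add(new Word("outside", new Rectangle(10, 10, 50, 20)));
        System.out.println(wordsInside(words, region)); // prints [inside]
    }
}
```

In the processor above, the same test would presumably be yourRect.contains(currentWord.getBoundingBox()). Keep in mind that the bounding boxes are most likely expressed in the coordinates of the camera frame rather than of the SurfaceView, so yourRect may have to be converted to frame coordinates first.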
【Comments】: