Google Vision: limiting the detection area for text recognition

11

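I have been searching for a solution all day and have checked several threads related to my problem.

None of them helped much. Basically, the camera preview should be full screen, but text should be recognized only in the center of the screen, where a rectangle is drawn.

Technologies I am using:

Google Mobile Vision API for optical character recognition (OCR)
Dependency: play-services-vision

My current state: I created a BoxDetector class: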

public class BoxDetector extends Detector {
    private Detector mDelegate;
    private int mBoxWidth, mBoxHeight;

    public BoxDetector(Detector delegate, int boxWidth, int boxHeight) {
        mDelegate = delegate;
        mBoxWidth = boxWidth;
        mBoxHeight = boxHeight;
    }

    public SparseArray detect(Frame frame) {
        int width = frame.getMetadata().getWidth();
        int height = frame.getMetadata().getHeight();
        int right = (width / 2) + (mBoxHeight / 2);
        int left = (width / 2) - (mBoxHeight / 2);
        int bottom = (height / 2) + (mBoxWidth / 2);
        int top = (height / 2) - (mBoxWidth / 2);

        YuvImage yuvImage = new YuvImage(frame.getGrayscaleImageData().array(), ImageFormat.NV21, width, height, null);
        ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
        yuvImage.compressToJpeg(new Rect(left, top, right, bottom), 100, byteArrayOutputStream);
        byte[] jpegArray = byteArrayOutputStream.toByteArray();
        Bitmap bitmap = BitmapFactory.decodeByteArray(jpegArray, 0, jpegArray.length);

        Frame croppedFrame =
                new Frame.Builder()
                        .setBitmap(bitmap)
                        .setRotation(frame.getMetadata().getRotation())
                        .build();

        return mDelegate.detect(croppedFrame);
    }

    public boolean isOperational() {
        return mDelegate.isOperational();
    }

    public boolean setFocus(int id) {
        return mDelegate.setFocus(id);
    }

    @Override
    public void receiveFrame(Frame frame) {
        mDelegate.receiveFrame(frame);
    }
}

And I use an instance of this class here:

    final TextRecognizer textRecognizer = new TextRecognizer.Builder(App.getContext()).build();

    // Instantiate the created box detector in order to limit the Text Detector scan area
    BoxDetector boxDetector = new BoxDetector(textRecognizer, width, height);

    //Set the TextRecognizer's Processor but using the box collider

    boxDetector.setProcessor(new Detector.Processor<TextBlock>() {
        @Override
        public void release() {
        }

        /*
            Detect all the text from camera using TextBlock
            and the values into a stringBuilder which will then be set to the textView.
        */
        @Override
        public void receiveDetections(Detector.Detections<TextBlock> detections) {
            final SparseArray<TextBlock> items = detections.getDetectedItems();
            if (items.size() != 0) {

                mTextView.post(new Runnable() {
                    @Override
                    public void run() {
                        StringBuilder stringBuilder = new StringBuilder();
                        for (int i = 0; i < items.size(); i++) {
                            TextBlock item = items.valueAt(i);
                            stringBuilder.append(item.getValue());
                            stringBuilder.append("\n");
                        }
                        mTextView.setText(stringBuilder.toString());
                    }
                });
            }
        }
    });


    mCameraSource = new CameraSource.Builder(App.getContext(), boxDetector)
            .setFacing(CameraSource.CAMERA_FACING_BACK)
            .setRequestedPreviewSize(height, width)
            .setAutoFocusEnabled(true)
            .setRequestedFps(15.0f)
            .build();

At runtime, this exception is thrown:

Exception thrown from receiver.
java.lang.IllegalStateException: Detector processor must first be set with setProcessor in order to receive detection results.
    at com.google.android.gms.vision.Detector.receiveFrame(com.google.android.gms:play-services-vision-common@@19.0.0:17)
    at com.spectures.shopendings.Helpers.BoxDetector.receiveFrame(BoxDetector.java:62)
    at com.google.android.gms.vision.CameraSource$zzb.run(com.google.android.gms:play-services-vision-common@@19.0.0:47)
    at java.lang.Thread.run(Thread.java:919)

If anyone has a clue about what I am doing wrong, or knows an alternative, I would really appreciate it. Thank you!

This is what I want to achieve, a text area scanner:

— Alan

0

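Google Vision detection takes a frame as its input. A frame is image data together with its metadata, including width and height. You can process this frame (crop it to a smaller, centered frame) before passing it to the detector. This processing must be fast and must run as part of the camera's frame processing. Check my GitHub below (search for FrameProcessingRunnable) to see where the input frame is handled. You can do the processing yourself there.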

CameraSource
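To make the cropping step described above more concrete, here is a minimal sketch in plain Java (no Android classes, so it can be run on its own). It assumes the frame is an NV21 byte array with even width and height; `CenterCrop`, `centeredRect`, and `cropNv21` are hypothetical names used for illustration, not part of the Mobile Vision API.

```java
/** Hypothetical helper: crops a centered box out of an NV21 frame. */
final class CenterCrop {
    private CenterCrop() {}

    /**
     * Returns {left, top, right, bottom} of a box centered in the frame.
     * The box is clamped to the frame and snapped to even coordinates,
     * because NV21 stores one V/U pair per 2x2 pixel block.
     */
    static int[] centeredRect(int frameW, int frameH, int boxW, int boxH) {
        int w = Math.min(boxW, frameW) & ~1;
        int h = Math.min(boxH, frameH) & ~1;
        int left = ((frameW - w) / 2) & ~1;
        int top = ((frameH - h) / 2) & ~1;
        return new int[] {left, top, left + w, top + h};
    }

    /**
     * Copies the rectangle (as produced by centeredRect) out of an NV21 buffer.
     * Assumption: frameW and frameH are even, as is usual for preview sizes.
     */
    static byte[] cropNv21(byte[] src, int frameW, int frameH, int[] rect) {
        int left = rect[0];
        int top = rect[1];
        int w = rect[2] - left;
        int h = rect[3] - top;
        byte[] out = new byte[w * h * 3 / 2];

        // Y plane: one byte per pixel, frameW bytes per row.
        for (int row = 0; row < h; row++) {
            System.arraycopy(src, (top + row) * frameW + left, out, row * w, w);
        }

        // Interleaved VU plane: starts after the Y plane; each chroma row is
        // frameW bytes long and covers two image rows.
        int srcChroma = frameW * frameH;
        int dstChroma = w * h;
        for (int row = 0; row < h / 2; row++) {
            System.arraycopy(src, srcChroma + (top / 2 + row) * frameW + left,
                    out, dstChroma + row * w, w);
        }
        return out;
    }
}
```

On Android, the cropped array could presumably be wrapped with `Frame.Builder.setImageData(ByteBuffer.wrap(cropped), w, h, ImageFormat.NV21)`, given the original frame's rotation, and handed to the text recognizer's `detect`. Note that the box is expressed in the unrotated camera frame, which is likely why the question's code swaps box width and height. In the question's `BoxDetector`, the stack trace shows `receiveFrame` forwarding to the delegate, which never had a processor set, so it may be enough to drop that override and let the base class call `detect`; this is untested.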

— Thành Hà Văn

Hello, thanks for your answer! I looked at your code and have a question: what do I need to change in my code? Is the frame-processing part the only thing I need to add (the two private classes)?

— Alan

Yes, you have to modify the frame before passing it to the detector as the last step: mDetector.receiveFrame(outputFrame);

— Thành Hà Văn

Could you edit your answer to include the code I need to add, so that I can try it out and award the bounty?

— Alan

0

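In google-vision you can get the coordinates of detected text, as described in "How to get position of text in an image using Mobile Vision API?".

You get the TextBlocks from the TextRecognizer and can then filter each TextBlock by its coordinates, which can be determined with the getBoundingBox() or getCornerPoints() method of the TextBlock class: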

TextRecognizer

Recognition results are returned by detect(Frame). The OCR algorithm tries to infer the text layout and organizes each paragraph into TextBlock instances. If any text is detected, at least one TextBlock instance will be returned.

[..]

Public Methods

public SparseArray<TextBlock> detect (Frame frame) Detects and recognizes text in an image. Only supports bitmap and NV21 for now. Returns a mapping of int to TextBlock, where the int domain represents an opaque ID for the text block.

Source: https://developers.google.com/android/reference/com/google/android/gms/vision/text/TextRecognizer

TextBlock

public class TextBlock extends Object implements Text

A block of text (think of it as a paragraph) as deemed by the OCR engine.

Public Method Summary

Rect getBoundingBox() Returns the TextBlock's axis-aligned bounding box.

List<? extends Text> getComponents() Smaller components that comprise this entity, if any.

Point[] getCornerPoints() 4 corner points in clockwise direction starting with top-left.

String getLanguage() Prevailing language in the TextBlock.

String getValue() Retrieve the recognized text as a string.

Source: https://developers.google.com/android/reference/com/google/android/gms/vision/text/TextBlock

So you basically proceed as in "How to get position of text in an image using Mobile Vision API?"; however, you do not split each block into lines and then each line into words like this:

//Loop through each `Block`
foreach (TextBlock textBlock in blocks)
{
    IList<IText> textLines = textBlock.Components;

    //loop Through each `Line`
    foreach (IText currentLine in textLines)
    {
        IList<IText> words = currentLine.Components;

        //Loop through each `Word`
        foreach (IText currentword in words)
        {
            //Get the Rectangle/boundingBox of the word
            RectF rect = new RectF(currentword.BoundingBox);
            rectPaint.Color = Color.Black;

            //Finally Draw Rectangle/boundingBox around word
            canvas.DrawRect(rect, rectPaint);

            //Set image to the `View`
            imgView.SetImageDrawable(new BitmapDrawable(Resources, tempBitmap));
        }
    }
}

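Instead, you get the bounding box of every text block and then select the bounding box whose coordinates are closest to the center of the screen/frame or to the rectangle that you specify (see "How can I get center x, y of my view in Android?"). For this you use the getBoundingBox() or getCornerPoints() method of TextBlock.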
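As a rough illustration of the filtering idea, here is a minimal sketch in plain Java. It keeps only the blocks whose bounding box lies entirely inside a given region, which is one simple variant of "closest to the rectangle you specify". `RegionFilter` and `Block` are hypothetical stand-ins: on Android the box would come from `TextBlock.getBoundingBox()` and the text from `getValue()`.

```java
/** Hypothetical helper: keeps only recognized text that lies inside a region. */
final class RegionFilter {
    private RegionFilter() {}

    /** Minimal stand-in for a recognized block: its text and axis-aligned box. */
    static final class Block {
        final String text;
        final int left;
        final int top;
        final int right;
        final int bottom;

        Block(String text, int left, int top, int right, int bottom) {
            this.text = text;
            this.left = left;
            this.top = top;
            this.right = right;
            this.bottom = bottom;
        }
    }

    /** True when the block's box lies entirely inside the given region. */
    static boolean isInside(Block b, int left, int top, int right, int bottom) {
        return b.left >= left && b.top >= top
                && b.right <= right && b.bottom <= bottom;
    }

    /** Joins the text of all blocks inside the region, one block per line. */
    static String textInside(Block[] blocks, int left, int top, int right, int bottom) {
        StringBuilder sb = new StringBuilder();
        for (Block b : blocks) {
            if (isInside(b, left, top, right, bottom)) {
                sb.append(b.text).append('\n');
            }
        }
        return sb.toString();
    }
}
```

With real `Rect` objects, `android.graphics.Rect.contains(Rect)` performs the same containment test. Keep in mind that the bounding boxes are in frame coordinates, so a rectangle drawn on the preview view would probably need to be scaled into the frame's coordinate space before comparing.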

— ralf htp

I will try it tomorrow, thanks.

— Alan

I tried it, but I did not know how to implement it correctly.

— Alan