Screen-OCR-floating-button does not recognize a character 儿 ...?

#1
Hello

I use Screen-OCR-floating-button satisfactorily.

But today I noticed that it cannot recognize a certain character: 儿 (in 这儿, zhèr).
See the screenshot.
Why?

When I enter the Chinese sentence directly into Pleco .. Pleco recognizes 儿 (zhèr) without a problem.

Thank you.

B: 哦,那不是我的车,我的车在这儿。
B: Ò, nà bú shì wǒ de chē, wǒ de chē zài zhèr.
B: Oh, that isn't my bike. This is my bike here.
 


#3
Hello

No.
Resizing doesn't help.

BUT .. when I use another app to display the text file ... the error can disappear!
There are "thousands" of apps to display a text file.
It seems that certain fonts ... are the "culprit" ... right?
Thank you.
 
#4
Hello

With Alreader as a display app .. I can "understand" the Chinese sentence.
See attachment.

BTW .. with Screen-OCR-floating-button .. Pleco helps me more than Wenlin!
Screenshot_20180329-072029.png
 

mikelove

皇帝
Staff member
#5
That makes sense, but if it's a text file then why not look for a reader that works well with our Screen Reader function? (some do better than others) That way you'd get characters recognized with perfect fidelity since we'd be going by the original text.
 
#6
Hello

On my laptop (Windows 10) I collect, edit, and manage Chinese books/texts, mostly as PDF files.
In this special case I downloaded the text from Oxford-Chinese-Websites.
I have two files .. a PDF and a TXT file.
I transferred them via WLAN to my Nexus 7.
To display the PDF file I mostly used Moon+ Reader Pro, Bookari or Librera.
For the TXT file I (first) mostly used WPS Office and ES-Notiz-Editor.
Both files have the same problem .. that is .. Screen-OCR-floating-button cannot recognize a character.
Then I found that AlReader can recognize that character.

My question is: what does "look for a reader that WORKS WELL WITH our Screen Reader function" mean?
Which criteria should I use to decide whether the Screen Reader function can recognize ALL characters with a certain reader?
In Android-world if an app doesn't "work" ... put it aside .. use the next app ... etc.
Throw-away apps .... except Pleco ;-)

Thank you.
 
#7
Hello

Please see the picture.

A:欸,是新生吧,我来帮你拿行李。
A: Éi, shì xīnshēng ba, wǒ lái bāng nǐ ná xíngli.
A: Hey, you're a freshman, aren't you? Let me help with your luggage.

欸 is not recognized correctly by the Screen-OCR.
Instead of ei2 (an interjection) ... it gives kuan3.
Why?

The screen-OCR (floating-button) is not 100% error-free ....?!
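
A quick way to see that this is a look-alike confusion and not a dictionary problem is to compare the two characters at the code point level. This is only a minimal sketch: it assumes the character behind the "kuan3" reading is 款, which the thread does not confirm, and the helper name `describe` is made up for illustration.

```python
import unicodedata

def describe(ch: str) -> str:
    """Return the character, its code point and its Unicode name."""
    return f"{ch} U+{ord(ch):04X} {unicodedata.name(ch, 'UNNAMED')}"

expected = "欸"   # the character in the text (ei2, interjection)
ocr_guess = "款"  # ASSUMPTION: the character behind the "kuan3" reading

for ch in (expected, ocr_guess):
    print(describe(ch))

# Two different code points: the OCR picked a similar-looking character.
print("same character:", expected == ocr_guess)
```

If the two code points differ, the wrong reading comes from the image recognition step, because a lookup of the original text would use the original code point.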

What should/could I do?
Thank you.

BTW ..
I'm still confused about the names of Pleco's utilities! Which is which ..
I have used the floating button shown below since last year.

Screenshot_20180330-113703.png
 
#8
Hello

For the TXT-file ... better to use Pleco's File Reader.
For a PDF-file ... better to use Screen-OCR-floating-button (and pray) than Pleco's File Reader.
 
#9
Hello

I made an experiment to compare 2 ways of finding out what a character means:

WPS Office app displays a Chinese text "A:欸,是新生吧,我来帮你拿行李。"

1. With my forefinger I touched (a few seconds) the character 欸.
Pleco reacted .. and displayed the right answer.

2. With my forefinger I touched the below-floating-button.
The green box appeared .. and the WRONG answer was displayed. See "old" screenshots.

What produced the wrong answer?
Screen-OCR?

BTW .. On my laptop I have an ABBYY OCR engine.
I took the following steps:
unsearchable PDF -> searchable PDF -> Copy&Paste -> Wenlin.
Wenlin recognized the right character and gave me .. more than enough information about the right character.
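
The comparison in this experiment can be made repeatable if both versions of the sentence are available as text: the one copied from the reader app and the one the OCR produced. The sketch below uses Python's standard `difflib` to list the places where they differ. The OCR string is a hypothetical example and `ocr_mismatches` is an illustrative name, not part of any app mentioned here.

```python
from difflib import SequenceMatcher

def ocr_mismatches(reference: str, ocr_text: str):
    """List (reference_part, ocr_part) pairs where the two strings differ."""
    matcher = SequenceMatcher(None, reference, ocr_text, autojunk=False)
    return [
        (reference[i1:i2], ocr_text[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"  # keep only replaced, inserted or deleted spans
    ]

reference = "欸,是新生吧,我来帮你拿行李。"  # text copied from the reader app
ocr_text = "款,是新生吧,我来帮你拿行李。"   # HYPOTHETICAL OCR result
print(ocr_mismatches(reference, ocr_text))
```

An empty list means the OCR text matches the original character for character.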
 
#10
Hi dxcarnadi,

one way of getting good results may be the following:

Converting an unsearchable PDF to a searchable PDF using the ABBYY OCR engine, then copying and pasting the text into a text file and opening that in Pleco instead of Wenlin (more dictionaries and functionality).

Cheers,

Shun


 

mikelove

皇帝
Staff member
#11
Hmm... what about if you open that OCR'ed PDF in our File Reader? You mention that not being workable, but why?

As far as what works well with Screen Reader - to be honest that's so much a function of Android version + the precise version of the reader you're using that it's hard for us to offer definitive advice; about all I can suggest is that you try a bunch of free ones and see what works. This is for Screen *Reader*, not OCR - the one with the two-diamond icon, not the camera icon - so it's an all-or-nothing thing: if it works then you'll see all of the text from the reader overlaid on the screen in correctly positioned boxes, if not then you won't.
 
#12
Hello

File Reader can (successfully) read a 500-page OCRed PDF file .. but .. I can navigate it only with a left arrow and a right arrow....!?
A little bit arduous ...

Another thing ...

I use a TXT-file (no PDF-file) in FileReader to test Pleco.
It has 2 lines:

热们 = popular
好赚 = easy to find

FileReader CANNOT find 热们 although direct input in Pleco is successful.
FileReader can find the English definition of 好赚, but Screen-OCR cannot; Screen-OCR can only find the Chinese definition of 好赚 through GF.
I'm confused!
Chinese language is unfortunately not redundant .. unlike English or German.

Installed CE dictionaries in my Pleco:
2 x Tuttle C-E
ABC
GF
KEY
ADS
PLC
CC

Should I buy other C-E dictionaries? .. Oxford etc.?
I use the old Nexus 7 (2013). It doesn't have an SD card.
Thank you.
 

mikelove

皇帝
Staff member
#13
You can long-press one of those arrows to jump to a specific page number. (a little better at least)

热们 = did you mean 热门?

好赚 = are you sure Screen OCR got the two characters correctly?
 
#14
Hello

1. Yes .. a jump is a lot better than walking ..
I have a PDF ebook with 1000 pages.

2. Yes .. "men" was 门 .. I thought it was an error.
I changed it to 们 .. with a text editor in Windows10.
Do you mean it was not an error ..?

3. No .. I made a mistake .. Sorry.
See attachment number 1.
But only GF finds the definition.
The ebook is from China.
BTW .. what can I do in this case .. to get an English explanation from Pleco .. when the definition is only in Chinese ..?

4. See picture number 2.
In FileReader .. the character means an abbreviation for "Verb" .. only GF uses it .. and can recognize it.
Only CC knows it.
The same question .. as in point 3.

5. FileReader with an OCR-ed PDF file.
See picture number 3
The definition window uses 2 characters .. but it marks 4,5 characters .. always 4,5 characters.
In this case .. I cannot experiment with the arrows .. I lost the control ..

Thank you..
I'm in a "Brownian motion" through Pleco .... Wenlin on steroids
1. Screenshot_20180402-193417.png
2. Screenshot_20180403-140851.png
3. Screenshot_20180403-181750.png
 

mikelove

皇帝
Staff member
#15
2) I believe 门 is the correct option here, yes.

3) Tap the top-right icon to bring this up in the regular dictionary and you can then tap on Chinese words in this Chinese definition to look them up.

4) Sorry, so you can't toggle to another dictionary?

5) That one may just be wacky character placement in your PDF OCR system - not necessarily something we can do much about if that's how the characters are placed in the original file.
 
#16
Hello Mike

You wrote "5) .. wacky character placement ..".
Can you please describe the "error" in more technical terms ..

I scanned that book with a relatively expensive high-tech scanner, a CZUR .. made in Shenzhen.
It uses the ABBYY OCR engine.

CZUR's support asked me .. what kind of error is it?
She wants to test it herself and contact ABBYY.

Thank you.
 

mikelove

皇帝
Staff member
#17
The error in this case would be that the boxes for the characters on the PDF don't actually correspond to the locations of the characters on the page. (it could also be an issue in the PDF decoder we use on our end - does character selection give you the correct locations in another PDF reader?)
 
#18
Hello

To use File Reader .. one needs either:
1. A text file in e.g. UTF-16 BE format
2. A PDF file, which must be searchable in Chinese BUT not OCR-ed!!

Other types of PDF files must be used with a normal PDF reader and Pleco Screen-OCR (floating button).

I have tested all 3 possibilities.
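
For point 1, a text file that is currently in another encoding can be re-saved as UTF-16 BE with a few lines of Python. This is a rough sketch under stated assumptions: the source file is assumed to be UTF-8, the function name and file names are made up for the demo, and writing a byte-order mark at the start is a common convention, not something Pleco is known to require.

```python
import tempfile
from pathlib import Path

def convert_to_utf16_be(src: Path, dst: Path, src_encoding: str = "utf-8") -> None:
    """Re-encode a text file as UTF-16 big-endian with a leading BOM."""
    text = src.read_text(encoding=src_encoding)
    # The "utf-16-be" codec writes no byte-order mark, so the BOM bytes
    # FE FF are added by hand to let readers detect the encoding.
    dst.write_bytes(b"\xfe\xff" + text.encode("utf-16-be"))

# Demo in a temporary directory (file names are placeholders).
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "sample_utf8.txt"
    dst = Path(tmp) / "sample_utf16be.txt"
    src.write_text("我的车在这儿。\n", encoding="utf-8")
    convert_to_utf16_be(src, dst)
    # Reading back with the generic "utf-16" codec relies on the BOM.
    print(dst.read_text(encoding="utf-16"), end="")
```

If the source file is in a different encoding (for example GB18030), pass it as `src_encoding`.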
 