pinyin for phrase

herve

举人
Hello everyone (*)

When you read a Chinese text, you may not remember every character. Sometimes knowing how a passage is pronounced helps you understand it, because writing plus pronunciation gives you more to go on than either one alone.
Here is my proposal:
I would like to write a sequence of characters using the handwriting recognition facility and have Oxford Dict return the corresponding sequence of pinyin syllables. Of course, when a character has several possible readings (e.g. le, liao3, liao4), Oxford Dict should give the first reading found and indicate in some way that it is only a guess.
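To make the idea concrete, here is a minimal sketch in Python. The table `READINGS`, the function `to_pinyin` and the `?` guess marker are all my own illustration, not anything that exists in Oxford Dict; a real table would come from the dictionary's character data.

```python
# Hypothetical character-to-readings table (illustration only).
# The first reading listed is treated as the default guess.
READINGS = {
    "你": ["ni3"],
    "们": ["men"],
    "好": ["hao3", "hao4"],
    "了": ["le", "liao3", "liao4"],
}

def to_pinyin(text, guess_mark="?"):
    """Return one pinyin syllable per character, flagging guessed readings."""
    out = []
    for ch in text:
        readings = READINGS.get(ch)
        if not readings:
            out.append(ch)  # unknown character: pass it through unchanged
        elif len(readings) == 1:
            out.append(readings[0])  # only one reading: no guess needed
        else:
            out.append(readings[0] + guess_mark)  # several readings: take the first, mark it
    return " ".join(out)

print(to_pinyin("你们好了"))  # prints: ni3 men hao3? le?
```

The trailing `?` is just one possible way to show that a reading is a guess; a different colour or font in the display would do the same job.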
The translation will be difficult to show: it requires more than just a dictionary... The user should simply be able to "copy to input field" a part of the text and see what the dictionary gives back.

Is that a good proposal?

Herve

(*) ni3men hao3
 

mikelove

皇帝
Staff member
I see what you're getting at... so basically we would just be providing a way to rapidly recognize/output pinyin for a string of characters, perhaps even with some sort of simple tokenizer that uses the built-in dictionary database to try to add spaces between words. It might be prohibitively difficult to implement, but we'll look into it at least. Thanks!
 

herve

举人
I agree with you: it might be difficult to implement. But I would be happy with a *low* level of functionality (at least for the time being... :wink: ).
For instance:
* level 1:
a-at the beginning of the Chinese character string, for as long as a translation is found, the dictionary acts as it currently does (pinyin and dictionary entry are displayed);
b-starting at the first character in the string for which no complete translation is found, Oxford Dict shows only pinyin, possibly just the first matching pinyin found in the character database.
I guess that this is a simple extension of the current program. It is a new possibility, to some extent an alternative to the "delete non-matching characters" option.
* level 2:
a-start like level 1 (i.e. like current version)
b-restart as in step a, but do not delete the first characters and their pinyin output. Here we would forget about the translation, and possibly leave the translation found in step a displayed.
The difficulty here will be mainly in the management of the GUI. Search algorithms are unchanged.
* levels 3 and more:
-for each "substring" found in the dictionary, show the pinyin and the translation. I understand that this might be difficult and useless with the Oxford Dictionary (or its successor), due to the length of the entries in the dictionary. However, for people like me who are building very simple personal dictionaries (one French or English word for each Chinese word, no long examples), it would make sense. This would give a very basic phrase translation.
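As a rough illustration of level 2 (match the longest entry, restart at the next character, and fall back to plain pinyin where nothing matches), here is a small Python sketch. The tables `WORDS` and `CHAR_PINYIN` and the function names are hypothetical; I do not know how the real search is implemented.

```python
# Hypothetical word dictionary: headword -> (pinyin, short gloss).
WORDS = {
    "你们": ("ni3men", "you (plural)"),
    "你": ("ni3", "you"),
    "好": ("hao3", "good"),
}
# Hypothetical per-character fallback table (first reading only).
CHAR_PINYIN = {"你": "ni3", "们": "men", "好": "hao3", "吗": "ma"}

def longest_match(text, start):
    """Return the longest dictionary headword beginning at `start`, or None."""
    for end in range(len(text), start, -1):
        if text[start:end] in WORDS:
            return text[start:end]
    return None

def segment(text):
    """Greedy left-to-right pass; returns (chunk, pinyin, gloss, is_guess) tuples."""
    result = []
    i = 0
    while i < len(text):
        word = longest_match(text, i)
        if word is not None:
            pinyin, gloss = WORDS[word]
            result.append((word, pinyin, gloss, False))
            i += len(word)  # restart the search after the matched word
        else:
            ch = text[i]
            # No entry starts here: show pinyin only and flag it as a guess.
            result.append((ch, CHAR_PINYIN.get(ch, ch), None, True))
            i += 1
    return result

for chunk in segment("你们好吗"):
    print(chunk)
```

With these toy tables, 你们 and 好 come back with a gloss, while 吗 comes back as pinyin only with the guess flag set.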
Does this help your investigation?
Herve
 

mikelove

皇帝
Staff member
That does clarify it for me, yes. Levels 1 and 2 in your list should certainly be possible; it's trivial to come up with a character-to-Pinyin translation table (there are a number of free data sources, or we can just use the dictionary itself for this), and beyond that you're correct that it's really just a user interface problem. And level 3 is quite doable as well, it's really just a matter of breaking the string into every possible substring and looking for matches; we could even wrap it up in some sort of nifty hierarchical view that would display the English definitions for every available breakdown of the input string.
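Just to sketch what "every possible substring" could look like, here is a few lines of Python. This is only an illustration under assumed names (`all_matches`, a plain dict of headwords), not the actual dictionary code:

```python
def all_matches(text, words):
    """Return (start, end, headword, entry) for every substring of `text` found in `words`."""
    matches = []
    for start in range(len(text)):
        for end in range(start + 1, len(text) + 1):
            piece = text[start:end]
            if piece in words:
                matches.append((start, end, piece, words[piece]))
    return matches

# Hypothetical toy dictionary: headword -> (pinyin, gloss).
words = {
    "你们": ("ni3men", "you (plural)"),
    "你": ("ni3", "you"),
    "好": ("hao3", "good"),
}

for start, end, piece, entry in all_matches("你们好", words):
    print(start, end, piece, entry)
```

The number of substrings grows with the square of the input length, so a real version would probably cap the substring length at the longest headword in the dictionary. The start/end offsets are what a hierarchical view would need in order to group overlapping matches.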

So the big question is whether we'll have enough time to do this and whether we'll be able to come up with a decent interface for it, but there's certainly a good chance this will make it into the ABC and "Oxford 3.0".
 