Unihan fields

mikelove

皇帝
Staff member
We're packaging up our version of the Unihan database for PlecoDict 1.0, and I was wondering if anybody had any particular Unihan fields they considered important to include.

The Unihan website is at http://www.unicode.org/charts/unihan.html - basically it's a repository of data on individual Chinese characters: pronunciations, encodings, dictionary cross-references, definitions, etc. The "UniChin" dictionary database for Oxford E&C was based on it, and we're planning to produce a similar conversion for PlecoDict.

The current list of fields we're planning to use:

The character itself
Pinyin pronunciation(s)
Definitions
Simplified/traditional variants
Cantonese pronunciation(s)
Radical
Total strokes
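
As a rough illustration of how the fields above might be pulled out of the Unihan data, here is a minimal Python sketch. It assumes the tab-separated "U+XXXX<TAB>kField<TAB>value" text format and the Unihan keys kMandarin, kDefinition, kSimplifiedVariant, kTraditionalVariant, kCantonese, kRSUnicode and kTotalStrokes. The function name parse_unihan, the output labels, and the sample lines are made up for illustration; this is not how the PlecoDict conversion is actually done.

```python
from collections import defaultdict

# Assumed mapping from Unihan keys to the fields in the list above.
# The right-hand labels are hypothetical names chosen for this sketch.
WANTED_FIELDS = {
    "kMandarin": "pinyin",
    "kDefinition": "definition",
    "kSimplifiedVariant": "simplified_variant",
    "kTraditionalVariant": "traditional_variant",
    "kCantonese": "cantonese",
    "kRSUnicode": "radical_stroke",  # radical number plus residual strokes
    "kTotalStrokes": "total_strokes",
}


def parse_unihan(lines):
    """Collect the wanted fields per character from Unihan-style lines."""
    entries = defaultdict(dict)
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        parts = line.split("\t", 2)
        if len(parts) != 3 or not parts[0].startswith("U+"):
            continue  # not a "U+XXXX<TAB>field<TAB>value" line
        codepoint, field, value = parts
        if field not in WANTED_FIELDS:
            continue  # e.g. Japanese or Korean readings are ignored here
        char = chr(int(codepoint[2:], 16))
        entries[char][WANTED_FIELDS[field]] = value
    return dict(entries)


# Illustrative sample lines only, not an authoritative extract of the database.
SAMPLE = [
    "# sample data",
    "U+4E00\tkCantonese\tjat1",
    "U+4E00\tkDefinition\tone; a, an; alone",
    "U+4E00\tkJapaneseOn\tICHI ITSU",
    "U+4E00\tkMandarin\tyī",
    "U+4E00\tkRSUnicode\t1.0",
    "U+4E00\tkTotalStrokes\t1",
    "U+9F8D\tkSimplifiedVariant\tU+9F99",
]

if __name__ == "__main__":
    for char, fields in parse_unihan(SAMPLE).items():
        print(char, fields)
```

Adding another field (Japanese readings, say) would only mean adding one more key to WANTED_FIELDS, which is part of why the choice of fields is easy to revisit later.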

Is there anything else people would like us to add? It seems a little silly to add Korean/Japanese pronunciation in a Chinese dictionary, but we certainly could... We might also consider adding Hanyu Da Zidian or Cihai references, though since those are tied to particular editions of those books I'm not sure how useful they'll be for most people.

Of course, since we're also planning to release MakeDict for PlecoDict in a few weeks, if you don't like our choices you'll always be welcome to make your own version.
 

koreth

榜眼
At first I thought, "Japanese pronunciations would have some novelty value, but not much more." Then it occurred to me that it'd be helpful for decoding proper nouns such as Japanese people's names.

I doubt that makes it worth including in the pre-packaged version of the dictionary, but I have space to burn on my memory card so I'd probably include it in my own version once you release the software to do it.
 

johnh113

榜眼
Dear Mike,

Japanese pronunciations would also be useful for me, as I'm also studying Japanese.

John
 


mikelove

皇帝
Staff member
Well, I suppose that adding Japanese pronunciation wouldn't grow the file size too much, but the handwriting recognizer doesn't even support some of the standard Kanji (since it's only designed for Chinese), so I don't know how useful this would really be. Hmm...
 

Anonymous

Guest
Mathews and Karlgren, please

I'd like to see Mathews and Karlgren included, please.

Regards,
Chris Gait
 

curwenx

秀才
Cangjie codes would be nice.

Wubi codes would also be nice if you have any way to include them, since they're not included in the Unihan database.
 

mikelove

皇帝
Staff member
Hmm... actually, I suppose it might make sense to just include everything; we could put the more widely used items on top (the ones I mentioned before) and try to put everything else in later. We'll see where we end up on Monday.

Wubi codes would be nice, but we don't have time to add them in for next week's release. Hopefully we'll get around to it at some point, or perhaps someone else will take this on and make their own Wubi database...
 