How can you SORT Chinese characters...single and multiple

alex_hk90

状元
In English, we sort anything under the sun, but how do you sort Chinese characters?

From what I understand, traditionally it was generally by number of strokes and/or radicals, and more recently it has been by Pinyin (romanisation of pronunciation).
 

朱真明

进士
1- Stroke order
2- Radical
3- By category (usually used in specialist dictionaries such as a medical dictionary, e.g. circulatory system; bones, muscles and ligaments; diagnostics; pharmacology, etc.)
4- Zhuyin (usually used in Taiwan)
5- Alphabetically, using any romanisation system (usually used for bilingual dictionaries)
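To make option 1 concrete, here is a minimal Python sketch of a stroke-count sort. The `STROKES` table is a tiny hand-made sample (an assumption for illustration, not a real dictionary), and breaking ties by code point is just one possible choice.

```python
# Hand-made sample table: character -> total stroke count.
# A real list would need a full stroke-count source; this is only a sketch.
STROKES = {"丁": 2, "王": 4, "白": 5, "李": 7, "林": 8, "周": 8}


def stroke_key(ch):
    """Sort by stroke count first, then by code point to break ties."""
    return (STROKES[ch], ord(ch))


surnames = ["林", "李", "周", "丁", "白", "王"]
by_strokes = sorted(surnames, key=stroke_key)
print(by_strokes)
```

The tie-breaker matters: 周 and 林 both have 8 strokes, so without a second key their relative position would depend on the input order.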
 

Sy

进士
Thanks, you both.
笔画 (stroke count), 部首 (radical) and 拼音 (pinyin) are the 3 ways used to sort, as you both said.
I am also aware of them, including 注音 (zhuyin).
These systems offer no fixed positions and are awkward to use, too.
The first 2 are ancient.
Pinyin is new. Since many characters share the same sound, they create
problems. ARE YOU SATISFIED WITH THE 3 systems?
Can you do something to remedy the arising problem?
 

朱真明

进士
I wouldn't discount the first two just because they are ancient; that is referred to as the "Appeal to Novelty" fallacy. In fact, I would prefer the first two over Zhuyin or romanisation because they focus on the structure of the character rather than the pronunciation. Chinese characters inherently represent meaning, not sound, and have been used quite consistently over the last 2000 years in terms of structure and meaning, whereas the pronunciation has frequently changed and is always variable because of dialects and multiple pronunciations of a single character.

Regardless of which system you use to categorize Chinese characters, whether it be structure, meaning, pronunciation, grammar, etymology, stroke order, etc., there will always be overlaps that create categorical confusion. This is the same in English, where words may have multiple pronunciations or multiple ways of writing the same word. Of course, Chinese characters are more complex because of their far longer history; many variations popped up along the way for various reasons.

Personally, I have no answer to the question of remedying the current problems. If I were to create a new system to categorize Chinese characters, it would have to be quite precise and cover the whole history of the characters, from their beginnings through their development, which would take a tremendous amount of study and may not even be possible. Categories are inherently flawed: they only allow you to focus on certain aspects, which is practical but still delusional.

See this YouTube video if you would like to see more on categorical thinking.
 

alex_hk90

状元
To be honest I don't find the need to sort Chinese characters often.
When I'm using Pleco to look up a word I search for particular Hanzi, Pinyin or meaning which is good enough for me.
 

Sy

进士
真明 is very theoretical. All I want is a practical system to index/sort characters,
so I can index a list of names without relying on pinyin, in which homonyms cause problems.
I often see in newspapers that a list of names is sorted by stroke; it is not practical to sort by bushou/radicals because multiple reference tables are required.
Alex has no need to sort.
As for me, I would like to sort the names in a phone directory of 50,000 items or more
for easy retrieval,
or to compile a glossary of terms for study.
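One way to picture the homonym problem, and a possible way around it, is a compound sort key. The following Python sketch assumes a hand-made `READINGS` table with a single reading per character (real characters can have several readings, so that is a simplification). Ties on pinyin are broken by tone, then stroke count, then code point.

```python
# Hand-made sample data: character -> (pinyin syllable, tone, stroke count).
# One reading per character is an assumption made for this sketch.
READINGS = {
    "李": ("li", 3, 7),
    "力": ("li", 4, 2),
    "立": ("li", 4, 5),
    "利": ("li", 4, 7),
    "王": ("wang", 2, 4),
}


def pinyin_key(ch):
    """Pinyin first; tone, stroke count and code point break the ties."""
    syllable, tone, strokes = READINGS[ch]
    return (syllable, tone, strokes, ord(ch))


names = ["利", "王", "立", "李", "力"]
ordered = sorted(names, key=pinyin_key)
print(ordered)
```

With a key like this, every character lands in exactly one position even when several share the syllable "li", at the cost of needing the extra columns in the lookup table.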
 

Abun

榜眼
I often see in newspapers that a list of names is sorted by stroke; it is not practical to sort by bushou/radicals because multiple reference tables are required.
I think that is actually manageable, because you would learn the order of the radicals very quickly. If you're looking something up in a contact list of 50,000 entries, you can be pretty sure that there are already entries with the same radical, so you just have to quickly skim through and look for that. For example, if you're looking for somebody with the surname 許, you would look for entries with the radical 言 (radical 149, so it's rather towards the end, especially seeing as there are not many surnames with high-stroke-count radicals such as 龍), and when you've found the correct area, you know that 午 only has 4 strokes, so you'll want to look pretty much at the very beginning of the 言 characters.
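As a rough illustration of that lookup, here is a Python sketch that sorts by Kangxi radical number and then by the strokes left over after the radical. The `RADICAL_STROKE` table is a small hand-entered sample (values as I understand them, worth checking against a proper radical-stroke index), and the code-point tie-breaker is my own assumption.

```python
# Hand-entered sample: character -> (Kangxi radical number, residual strokes).
# For example, 許 is radical 149 (言) plus the 4 strokes of 午.
RADICAL_STROKE = {
    "李": (75, 3),
    "林": (75, 4),
    "王": (96, 0),
    "許": (149, 4),
    "謝": (149, 10),
    "龍": (212, 0),
}


def radical_key(ch):
    """Radical number, then residual strokes, then code point for ties."""
    radical, residual = RADICAL_STROKE[ch]
    return (radical, residual, ord(ch))


contacts = ["謝", "龍", "林", "許", "王", "李"]
by_radical = sorted(contacts, key=radical_key)
print(by_radical)
```

In this sample, 許 comes before 謝 inside the 言 section because it has fewer residual strokes, which is the "look near the beginning of the 言 characters" step described above.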

As for other systems, I guess sorting by Cangjie would be possible as well; that's only 24 symbols + 1 placeholder symbol 難 (which is used in a few rather messy looking characters, but never as a first symbol), compared to 214 Kangxi radicals. I've never seen a wordlist sorted by Cangjie, though. Windows by default sorts by radical.
 

Sy

进士
a phrase of encouragement
image.jpeg
 

feng

榜眼
Sy, what is it that you are looking for? I've read your and the others' posts on this thread, and I am not clear what you want. A phone directory will have a lot of certain characters. A vocabulary list to be sorted for what reason?

When you go to buy a car, it matters if you want to go fast or get good gas mileage or have four kids or like to drive places with bad roads. There is no one best car. It depends what you want to do with your car.

Could you please explain what you don't like about the existing systems? Are you familiar with the stroke count and stroke type of the first two strokes, as used by 漢語大字典? I forget, but it has something like 60,000 characters (second edition). It eliminates the need to worry about which radical certain irksome characters are listed under. There are far fewer characters under each section than for any phonetic system (zhuyin or romanization).

Hanyu Pinyin is new, but the first attempt to use the Roman alphabet for pronunciation (pinyin), and presumably ordering, predates the first ordering of entries by stroke count in a Chinese dictionary (though I cannot say for certain that stroke count was not used outside of dictionaries before Mr. Mei).
 

Sy

进士
Mr Feng, thank you for coming back to tackle this problem. I will explain in detail in a later post.
If I can compile a phone book in sorted order, I can find a name/phone number fast. Likewise, if I can compile a dictionary in sorted order, I can look up the meaning/sound of a new character. Pleco is good but serves in a different way.
I may handwrite the future post rather than typing in English.
 

feng

榜眼
"meaning/sound of a new character. Pleco is good but serves in a different way." Are you specifically looking to avoid a system based on phonetic look-up? Or do you just want alternatives?

Your first post said, "In English, we sort anything under the sun, but how you sort Chinese characters?" What do you sort in English (or, I guess we are asking, how)? English is typically sorted by spelling (its own "pinyin", so to speak, i.e. not Hanyu Pinyin), unless we wish to sort by category (e.g. nouns, three-syllable words, words starting or ending with a certain letter, etc.), or am I misunderstanding your question? A phone book is simply "Smith, John". Of course, in Chinese you may not know the pronunciation of a character, and hence not know its pinyin, but in English we sometimes do not know the spelling of a word (names especially) and have no other recourse (well, Google may help), whereas Chinese has options such as 部首 or stroke count and type. The ABC Comprehensive Dictionary is not good as a dictionary, but it does line up 196,000+ entries by pinyin, and the number of duplicate spellings under polysyllabic entries is never very large (for each case).

Take your time. No hurry to reply.
 

Sy

进士
Hi Sobr....
I use an iPad as my tool.
I go to the Baidu app and finger-write 汉字改革的不可能性 ("The Impossibility of Chinese Character Reform") as my search key.
That article pops up on the first line.
In fact, if you key in 汉字改革 ("Chinese character reform"), you get many sites on Baidu.

One character is usually just a character; a bigram is a word, as you know.
I aim to sort both,
e.g. 中, 中心, 中间, 中国, etc.
The sort can be word by word, with each term on its own line,
or by word, with sub-headings for the bigrams.
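A word-by-word sort of this kind can be sketched in Python by turning each term into a list of per-character keys. The `STROKES` table below is a hand-made sample covering only these four terms, and the choice of stroke count plus code point as the per-character key is an assumption; any per-character key would slot in the same way.

```python
from collections import defaultdict

# Hand-made sample: total stroke counts for the characters used below.
STROKES = {"中": 4, "心": 4, "间": 7, "国": 8}


def char_key(ch):
    """Per-character key: stroke count, then code point."""
    return (STROKES[ch], ord(ch))


def word_key(word):
    """Compare terms character by character; a shorter prefix sorts first."""
    return [char_key(ch) for ch in word]


terms = ["中国", "中间", "中", "中心"]
sorted_terms = sorted(terms, key=word_key)

# Group the sorted terms under their first character as a sub-heading.
index = defaultdict(list)
for term in sorted_terms:
    index[term[0]].append(term)

print(sorted_terms)
print(dict(index))
```

Because lists compare element by element, the single character 中 sorts ahead of every bigram that starts with it, and the grouping step gives the "heading plus sub-entries" layout.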
 

feng

榜眼
Sobr: http://www.126doc.com/p-16508763.html

Sy: Li Wang died in 1986, so an article talking about 50 years since the founding of new China is unlikely to be written by him, or at the very least it would seem that some person edited an article he wrote earlier. I will respond to the article and your posts separately.
 

feng

榜眼
Looking at the main points of the article:

All from page one.
Line 2: This is because of simplified characters. They are harder to learn because they are less distinct from each other. They create more similar looking characters (e.g. 壓/压、莊/庄、etc.) because they use fewer strokes, fewer 偏旁.

Lines 3 and 4: As I have asked rhetorically elsewhere on this forum, how is it that Taiwan and Hong Kong have such superlative literacy rates? This can be looked at in various cultures through history: the government (or some sort of leadership institution) needs to foster education, then it happens (Edo Japan, colonial America, etc.).

Line 7: This isn't so much the fault of Chinese characters. In 1945, certain German-speaking areas were ceded to Italy and France, and people there still speak German (in Italy this is partly due to the fact that schools in those areas use German). There are tiny pockets in southern Italy where some people (maybe just older people) still speak Greek (2,000+ years later!). This sort of situation can be found in a number of places in Europe (e.g. ancient Greek spoken in a village in 21st century Turkey). What does this have to do with Chinese? Cantonese, to use one example, is a separate language from Mandarin and people quite like their language.

Middle of page 1:
Computers make writing Chinese easier.
Computer languages may use English words and basic syntax, but to the uninitiated they indeed are languages, separate from English and unintelligible to the untrained English speaker.
As for English holding back Chinese from having good computer skills, all the news about Chinese hackers in recent years would seem to counter this argument :D

Bottom of page 1: I have seen Chinese and Taiwanese people type lightning fast using a variety of different methods, even Microsoft's sabotaged zhuyin fuhao input.

As for page two, yeah, characters suit the Chinese language. Chinese would be harder if books were written in pinyin.
 

feng

榜眼
As for your English comments (creative way to post!), I think there is not such a stark contrast between Chinese and English dictionaries. In 新華字典 the characters are in fact in a fixed position, ordered by pronunciation just as in an English dictionary. If one does not know at least one pronunciation of that character (very unlikely), one can simply use the 部首 index. Having to use such an index is a slight inconvenience, but I think one generally knows the pronunciation of the character one wishes to look up.

When you talk about the Hong Kong phonebook only ordering the first two characters, you mean the surname and one of two characters in the given name (if it is a two character given name)? Any list for Chinese names is a problem outside of, though made easier by, characters. Chinese surnames are few in number, and the number of common surnames fewer still.

OK, 20,000 magazine names. What's wrong with a phonetic ordering of Chinese characters? Pleco uses Hanyu Pinyin as its romanization system. Even with 200,000+ different words, it does not seem that one can get a terribly long list of possibilities when typing in pinyin. Typing "yiyi" or "shishi" is perhaps among the worst cases, and even those are not too bad. Most combinations of syllables seem to offer few choices.

I can't see how organizing a dictionary or list by 形碼 would give you the fixed position you seek. It would be a kind of index, just like bushou.

Sorry I have not been able to help.
 
1- Stroke order (...) 5- Alphabetically ...
Yes! And the sort order is a problem for the Unicode Consortium too!

For example, if we sort a set of characters, like https://en.wikipedia.org/wiki/List_of_CJK_Unified_Ideographs,_part_1_of_4, what happens when a rare character or a new character is encoded or created?
Some problems are explained in https://en.wikipedia.org/wiki/CJK_Unified_Ideographs; note how the same "word" appears in different forms in Simplified Chinese, Traditional Chinese, Japanese...!
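A small Python sketch of the issue: a plain `sorted()` on strings compares code points, which inside the main CJK Unified Ideographs block happens to follow radical order, but characters encoded in later extension blocks fall outside that sequence. The sample characters are my own choice for illustration.

```python
# Plain sorting compares code points. Within the main CJK block these four
# surnames come out in radical order (木, 木, 玉, 言).
names = ["許", "王", "林", "李"]
by_code_point = sorted(names)
print([f"{ch} U+{ord(ch):04X}" for ch in by_code_point])

# Characters added in extension blocks sit outside that sequence:
# U+3400 starts Extension A (below U+4E00), U+20000 starts Extension B.
mixed = ["一", "\u3400", "\U00020000"]
mixed_sorted = sorted(mixed)
print([f"U+{ord(ch):04X}" for ch in mixed_sorted])
```

So a code-point sort gives every character a fixed position, but that position says nothing reliable about radical or stroke count once extension characters are involved; a collation table on top of the code points is needed for that.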
 

Attachments

  • CJKV_variant_glyphs.jpg