3000 Single words in order of frequency

ptou

Member
I adapted these flashcards for Pleco:
http://www.zein.se/patrick/3000char.html
Credit goes to Patrick Hassel Zein for his great work.

If you import the file as a custom dictionary, it works as a new dictionary in its own right.
:idea: I find it good for learning, because homographs are shown together.
For example, here is the entry for 的:


[de] <grammatical particle marking genitive as well as simple and composed adjectives>; 我的 wǒde my; 高的 gāode high, tall; 是的 shìde that's it, that's right; 是...的 shì...de one who...; 他是说汉语的. Tā shì shuō Hànyǔde. He is one who speaks Chinese.
[dì] 目的 mùdì goal
[dí] true, real; 的确 díquè certainly


I divided the words into categories of 100 words each, for easy learning. So far I have learned 700, so there is a long way to go until 3000! :D (2800 in reality, since some words are missing.)
Here is how much Chinese you can read at each stage:

100 characters → 42% understanding
200 characters → 55% understanding
300 characters → 64% understanding
400 characters → 70% understanding
500 characters → 75% understanding
600 characters → 79% understanding
700 characters → 82% understanding
800 characters → 85% understanding
900 characters → 87% understanding
1000 characters → 89% understanding
1100 characters → 90% understanding
1200 characters → 91% understanding
1300 characters → 92% understanding
1400 characters → 93% understanding
1500 characters → 94% understanding
1600 characters → 95.0% understanding
1700 characters → 95.5% understanding
1800 characters → 96.0% understanding
1900 characters → 96.5% understanding
2000 characters → 97.0% understanding
2100 characters → 97.4% understanding
2200 characters → 97.7% understanding
2300 characters → 98.0% understanding
2400 characters → 98.3% understanding
2500 characters → 98.5% understanding
2600 characters → 98.7% understanding
2700 characters → 98.9% understanding
2800 characters → 99.0% understanding
2900 characters → 99.1% understanding
3000 characters → 99.2% understanding

http://www.zein.se/patrick/3000en.html
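
For anyone curious how percentages like these are usually computed: a coverage figure is the share of all character occurrences in a corpus that is accounted for by the N most frequent characters. The following is only a minimal sketch of that idea. The `sample` string is made-up toy text, not the corpus behind the table, and the helper names `is_cjk` and `coverage` are hypothetical.

```python
from collections import Counter


def is_cjk(ch):
    """True for characters in the main CJK Unified Ideographs block."""
    return "\u4e00" <= ch <= "\u9fff"


def coverage(text, top_n):
    """Share of CJK character occurrences covered by the top_n most frequent characters."""
    counts = Counter(ch for ch in text if is_cjk(ch))
    total = sum(counts.values())
    if total == 0:
        return 0.0
    covered = sum(n for _, n in counts.most_common(top_n))
    return covered / total


# Toy text (hypothetical): 12 characters, with 的 and 书 appearing 3 times each.
sample = "我的书是他的书，不是你的书。"
print(coverage(sample, 2))  # 0.5
```

Note that this measures how much of the running text consists of known characters, which is not the same thing as understanding that share of the meaning.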

I did not include Traditional Chinese characters in the flashcards; maybe in a future release.

I hope this is useful to everybody. Happy learning! :wink:
Pietro
 


It's important to consider how any frequency number is derived: from spoken or written corpora (collected language databases) such as books (fiction or non-fiction), newspapers, websites, etc. As for the above list, I went to Jun Da's website, which states:
"This website provides character frequency lists generated from a large corpus of Chinese texts collected from online sources. It also provides bigram frequency lists as well as individual mutual information scores generated from two sub-corpora."
http://lingua.mtsu.edu/chinese-computing/

I've been trying to develop a corpus from spoken language. By downloading or locating a number of Chinese subtitle files (.sub/.srt) for foreign movies, Chinese TV shows, etc., I intend eventually to run them through Oxford University's 'WordSmith Tools' (corpus software). I just haven't located enough files yet for my satisfaction. Until then, I'm considering using this list. Thanks for the links!
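
The subtitle idea above can also be tried without dedicated corpus software. Below is a rough sketch, not a replacement for a real corpus tool. It assumes plain .srt files (the .sub format is not handled), assumes they are saved as UTF-8 (many Chinese subtitle files use GBK or Big5 instead, so the `encoding` argument may need changing), and counts single characters rather than segmented words. All function names are hypothetical.

```python
import re
from collections import Counter
from pathlib import Path

TAG = re.compile(r"<[^>]+>")  # simple inline tags such as <i>...</i>


def srt_dialogue(srt_text):
    """Yield dialogue lines from SRT text, skipping cue numbers and timestamps."""
    for line in srt_text.splitlines():
        line = line.strip()
        # Caveat: a dialogue line made only of digits is skipped as well.
        if not line or line.isdigit() or "-->" in line:
            continue
        yield TAG.sub("", line)


def char_frequencies(srt_text):
    """Count CJK characters in the dialogue of one subtitle file."""
    counts = Counter()
    for line in srt_dialogue(srt_text):
        counts.update(ch for ch in line if "\u4e00" <= ch <= "\u9fff")
    return counts


def corpus_frequencies(folder, encoding="utf-8"):
    """Sum character counts over every .srt file in a folder."""
    total = Counter()
    for path in Path(folder).glob("*.srt"):
        total += char_frequencies(path.read_text(encoding=encoding, errors="replace"))
    return total


# Made-up two-cue example in SRT layout.
demo = (
    "1\n00:00:01,000 --> 00:00:02,500\n你好，<i>你</i>是谁？\n\n"
    "2\n00:00:03,000 --> 00:00:04,000\n我是他的老师。\n"
)
freqs = char_frequencies(demo)
print(freqs.most_common(3))
```

Counting words instead of characters would need a word segmenter on top of this, since Chinese text has no spaces between words.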
 

insighter

举人
@LongShiKou: That sounds like a really cool idea. I've become really impressed by how hardworking and smart the Chinese learning community is... maybe that's because we need to be :)

@OP: I was trying to do the same thing today but couldn't convert my xls file to a txt file while retaining the Chinese characters. Maybe it had something to do with the website's encoding, but I don't know. Thankfully I found this post, because I'm obviously no good with Excel.
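
It is not clear from the post what went wrong, but one common cause of lost Chinese characters is saving the text file in a legacy encoding that cannot represent them. The following is a minimal sketch under several assumptions: the sheet is first saved from the spreadsheet program as CSV, that CSV is in GB18030 (a guess, adjust `source_encoding` as needed), and the target is tab-separated UTF-8 text. The function names and the sample rows are hypothetical.

```python
import csv
import io


def csv_to_tsv(src, dst):
    """Copy rows from a CSV stream to a tab-separated stream."""
    writer = csv.writer(dst, delimiter="\t", lineterminator="\n")
    for row in csv.reader(src):
        writer.writerow(row)


def convert_file(csv_path, txt_path, source_encoding="gb18030"):
    """Re-encode a CSV export as UTF-8 tab-separated text (source_encoding is a guess)."""
    with open(csv_path, encoding=source_encoding, newline="") as src, \
         open(txt_path, "w", encoding="utf-8", newline="") as dst:
        csv_to_tsv(src, dst)


# In-memory demonstration with made-up rows.
demo = io.StringIO("的,de,grammatical particle\n一,yī,one\n")
out = io.StringIO()
csv_to_tsv(demo, out)
print(out.getvalue())
```

Writing the output explicitly as UTF-8 is the important part; the rest is ordinary row copying.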
 
Hi,

After importing the 3000 single words file, the definitions included in the file do not appear, even though I chose the "file only" or "prefer file" option and the file definitions appear in the preview confirmation screen.

Can someone tell me how to import the file so that I see the file definitions instead of the dictionary definitions?
 

mikelove

皇帝
Staff member
singaturka said:
Hi,

After importing the 3000 single words file, the definitions included in the file do not appear, even though I chose the "file only" or "prefer file" option and the file definitions appear in the preview confirmation screen.

Can someone tell me how to import the file so that I see the file definitions instead of the dictionary definitions?
Have you already imported another list of flashcards that map to dictionary entries? If so, you'll want to set the importer to "allow" duplicate cards - otherwise, it'll link the cards in the vocabulary list to your existing flashcards for the same characters, rather than creating new flashcards with the definitions from the file.
 
Thanks for the answer Mike,

I too realized that it was due to the default import setting of "skip" instead of "allow". Since the characters in the file were found in existing flashcards, I was still able to import them, but it "skipped" their definitions, and when I clicked on them the file definitions were missing. When I set it to "allow", the problem was solved. I feel that setting it to "allow" by default might help beginner Pleco users like me.
 

mikelove

皇帝
Staff member
singaturka said:
I too realized that it was due to the default import setting of "skip" instead of "allow". Since the characters in the file were found in existing flashcards, I was still able to import them, but it "skipped" their definitions, and when I clicked on them the file definitions were missing. When I set it to "allow", the problem was solved. I feel that setting it to "allow" by default might help beginner Pleco users like me.
Great!

A lot of Pleco's design is optimized around the idea that people will only have one copy of each card - many aspects of the flashcard system work better when that's the case - so that's why we don't make this the default. We may, however, add an option in the future that will update the existing card to use the new definition but preserve the rest of it, so you can update the text of your cards without ending up with more than one.
 

etm001

状元
Hi,

For anyone interested, I've updated the file originally posted in this thread as follows:
  • Converted all simplified characters to traditional characters, excluding "什" (as I rarely see "甚" used on a daily basis).
  • Organized cards into the following structure:
    • Top 3000 Single Character Words
      • 100
      • 200
      • 300
      • etc
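
The category structure listed above could be generated from a plain frequency-ordered list along these lines. This is only a sketch, and it rests on an assumption about the import file format: that a line starting with `//` names a category (with `/` separating sub-categories) and that each card line holds headword, pinyin, and definition separated by tabs. Please check this against Pleco's own documentation before relying on it. The function name and the three sample cards are hypothetical.

```python
def categorized_lines(cards, top="Top 3000 Single Character Words", size=100):
    """Build import lines from (headword, pinyin, definition) tuples in frequency order.

    Assumed format (not verified here): '//Category/Subcategory' header lines
    followed by tab-separated card lines.
    """
    lines = []
    for start in range(0, len(cards), size):
        # Sub-categories are named 100, 200, 300, ... after the last rank they hold.
        lines.append(f"//{top}/{start + size}")
        for headword, pinyin, definition in cards[start:start + size]:
            lines.append(f"{headword}\t{pinyin}\t{definition}")
    return lines


# Placeholder cards; size=2 keeps the demonstration short.
demo_cards = [
    ("的", "de", "grammatical particle"),
    ("一", "yi1", "one"),
    ("是", "shi4", "to be"),
]
demo_lines = categorized_lines(demo_cards, size=2)
print("\n".join(demo_lines))
```

With the default `size=100`, the first hundred cards land under "100", the next hundred under "200", and so on.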
Note: I believe Pleco's flashcard functionality supports the re-mapping of simplified/traditional characters via the "Display > Force Character Set" option. I tried multiple times to use this option, but was unsuccessful in re-mapping the simplified characters in the "freq3000.txt" file to traditional characters. (I'm sure I was doing something wrong, or perhaps misunderstanding the functionality, but eventually I gave up in frustration).
 


mikelove

皇帝
Staff member
Actually no, all that does is force flashcards to use a character set different from the one configured in the rest of the app - it won't actually do any converting. (In general we're kind of gun-shy about automated simplified/traditional conversion, since even for one-to-one mappings there's often a lot of controversy / regional variation.)
 

sangormam

举人
Is this a good place to start?
Since October 2013 I have learned the first 300 characters, but these are only single characters, not words, and I still can't construct any words.
For example, I know what 現 means and I know what 在 means, but I would never have known that 現在 means "now".
Shouldn't I learn words, not characters?
 

alex_hk90

状元
When I started learning my teachers taught in words rather than characters. I'd recommend you find words to learn and then come back to these individual characters later.
 

denmitch

探花
Learning characters in a frequency list like this is a good supplement but I agree you'll need to learn words to communicate. Which is kinda the whole point of language.
 

abdrifter

举人
Unless you are learning classical Chinese, the emphasis should definitely be on words. I'd like to recommend the Tuttle Learner's Dictionary add-on. It gives you the most commonly used words, breaks them into individual characters, and explains each character's meaning in the word. It also gives some nice example sentences. I think it's a very good tool.
 

sangormam

举人
Is Tuttle better than the PLC dictionary?
 

abdrifter

举人
It's pretty much apples and oranges. PLC is definitely more comprehensive, wider variety of examples given, etc. Tuttle has just a few thousand entries, but they are all "must learn", and as I said, it breaks down words into components for easier understanding, plus provides some useful examples. Actually, I keep kicking myself for not getting Tuttle in my earlier stages of learning. You'll eventually grow out of Tuttle (I haven't quite yet), whereas PLC and ABC are life-long references. BTW, ABC also tells you what part of speech you are looking at, nicely hyper-linked in the new version of Pleco.
 