Hello,
I would like to share with you something I've been working on for quite some time: a flashcard list and a user dictionary based on all five volumes of the new edition of Practical Audio-Visual Chinese (新版實用視聽華語). This is the most popular textbook series for learning Chinese in Taiwan, particularly at the National Taiwan Normal University Mandarin Training Center (國立臺灣師範大學國語教學中心, popularly known as Shida), although it is also used in many other places. More details about the books can be found here.
The files were last updated in May 2015. For the list of changes and download links, see the bottom of this post.
Scope. The list includes every word appearing anywhere within the first four volumes, in either the textbook or the copybook, except on technical pages. For the fifth volume, it includes all words and proverbs from the textbook word lists for all chapters. Also included, in separate categories (see below), are the vocabulary lists for all four levels of the TOP exam (Test of Proficiency - Huayu: Beginner, Learner, Superior and Master), excluding any words that already appear in the lesson categories. In total, the dictionary comprises nearly 7,000 entries.
Categorization. The words are categorized by book and chapter in the format AV Chinese/Book n/Lesson [m]m, where n is the volume number and [m]m is the lesson number (one or two digits). For example, the words from lesson 6 of book 3 can be found under the category AV Chinese/Book 3/Lesson 6. An additional subcategory for each book, AV Chinese/Book n/Extra, holds words that appear outside the main word lists (i.e. as footnotes, in the grammar section, in the supplementary exercises at the end of each chapter, or in the copybook) and were never included in the main word list of any chapter, up to the end of the last book. These words can be considered non-essential and somewhat random, but many are actually very useful to know in the long term. Finally, all the TOP vocabulary that did not appear in the books is included under the categories AV Chinese/TOP/{Beginner,Learner,Superior,Master}, separately for each exam level.
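As a rough illustration of the naming scheme above, here is a minimal Python sketch that builds category paths. The function names (category_path, top_category) are my own and purely hypothetical; nothing in Pleco or in the attached files uses them.

```python
from typing import Optional

# The four TOP exam levels named in the post.
TOP_LEVELS = ("Beginner", "Learner", "Superior", "Master")


def category_path(book: int, lesson: Optional[int] = None) -> str:
    """Build a lesson category path; lesson=None selects the book's Extra subcategory."""
    leaf = "Extra" if lesson is None else f"Lesson {lesson}"
    return f"AV Chinese/Book {book}/{leaf}"


def top_category(level: str) -> str:
    """Build the category path for TOP vocabulary of the given exam level."""
    if level not in TOP_LEVELS:
        raise ValueError(f"unknown TOP level: {level!r}")
    return f"AV Chinese/TOP/{level}"


print(category_path(3, 6))
print(category_path(4))
print(top_category("Master"))
```

Note that the lesson number is not zero-padded in category names, unlike in the definition tags.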
Definition syntax. Each entry is marked for the part of speech it represents. The abbreviations used are as follows: A: adverb, AT: attributive, AV: auxiliary verb, CONJ: conjunction, CV: co-verb, DEM: demonstrative pronoun, I: interjection, IE: idiomatic expression, M: measure word (classifier), N: noun, P: particle, PRON: pronoun, PV: proverb (note this is used differently in the books), RC: resultative compound, SV: stative verb, V: verb, VO: verb-object compound. If a word can appear as more than one part of speech, the abbreviations are listed separated by commas, and the corresponding definitions are separated by semicolons. If the entry is a compound that can be split into several parts, the abbreviations for the parts are listed separated by hyphens. To make parts of speech easier to tell apart, verb definitions are always preceded by "to," and those of stative verbs by "to be." Nouns, however, are not preceded by an article.
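For anyone who wants to post-process the text files, the abbreviation rules above could be expanded along these lines. This is only a sketch: the function name expand_pos is hypothetical, and it takes the bare marker string (for example "N, V") as input, since where exactly the marker sits inside each definition is not specified here.

```python
from typing import List

# Abbreviations exactly as listed in the post.
POS = {
    "A": "adverb", "AT": "attributive", "AV": "auxiliary verb",
    "CONJ": "conjunction", "CV": "co-verb", "DEM": "demonstrative pronoun",
    "I": "interjection", "IE": "idiomatic expression", "M": "measure word",
    "N": "noun", "P": "particle", "PRON": "pronoun", "PV": "proverb",
    "RC": "resultative compound", "SV": "stative verb", "V": "verb",
    "VO": "verb-object compound",
}


def expand_pos(marker: str) -> List[List[str]]:
    """Expand a part-of-speech marker.

    Commas separate alternative parts of speech; hyphens separate the
    parts of a splittable compound. Unknown abbreviations raise KeyError.
    """
    return [
        [POS[part.strip()] for part in alternative.split("-")]
        for alternative in marker.split(",")
    ]


print(expand_pos("N, V"))
print(expand_pos("V-N"))
```

Each inner list describes one reading of the entry; a reading with more than one element is a compound whose parts are tagged individually.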
Definition tags. Each definition is followed by the book and chapter number in square brackets, in the format [PAVC-nmm], where n is the book number and mm is the lesson number, zero-padded to two digits for lessons below 10. For example, a word with its definition tagged as [PAVC-306] can be found in lesson 6 of book 3 (possibly within the Extra vocabulary). The TOP words are tagged as [TOP-c], where c is the first letter of the English name of the corresponding exam level: B for Beginner, L for Learner, S for Superior, M for Master.
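If you want to filter or sort entries by where they come from, the tags can be picked out with a regular expression. The following Python sketch assumes a single-digit book number and a two-digit lesson number, as in the example tag; the function name parse_tags and the sample definition strings are made up for illustration.

```python
import re

# [PAVC-nmm]: one digit for the book, two for the lesson; [TOP-c]: one level letter.
TAG_RE = re.compile(r"\[(?:PAVC-(\d)(\d{2})|TOP-([BLSM]))\]")

TOP_LEVELS = {"B": "Beginner", "L": "Learner", "S": "Superior", "M": "Master"}


def parse_tags(definition: str) -> list:
    """Return every source tag found in a definition string.

    Book tags become ("PAVC", book, lesson); TOP tags become ("TOP", level name).
    """
    found = []
    for match in TAG_RE.finditer(definition):
        book, lesson, level = match.groups()
        if level is not None:
            found.append(("TOP", TOP_LEVELS[level]))
        else:
            found.append(("PAVC", int(book), int(lesson)))
    return found


print(parse_tags("V: to study [PAVC-306]"))
print(parse_tags("N: example word [TOP-M]"))
```

A definition without any tag simply yields an empty list.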
Additional notes:
- To make the most of Pleco features, entries were trimmed so as not to exceed four characters. Wherever the entry is actually part of a longer phrase, this is mentioned in parentheses in the entry definition.
- Wherever there is more than one word with a given meaning, and both words are likely to be studied at the same time (i.e. they are in the same chapter or in the Extra subcategory for the same book), the definition provides a cue by revealing one (usually the first) letter of its pronunciation. This might be useful if you configure a test to only show the definition (without revealing pronunciation).
- The pronunciation follows the book, which follows Taiwanese conventions. This sometimes means different tones, and sometimes even entirely different sounds (as in the case of 血 or 垃圾). The definitions are based on those provided in the book, but were significantly edited for clarity, or rewritten where incomprehensible. The list focuses on Taiwanese usage, and does not provide information on the Mainland Chinese pronunciation.
- Simplified character variants are included for completeness. These do not appear anywhere within the books or TOP word lists, and were automatically generated with the OpenCC tool.
(Feel free to adjust these steps to suit your usage scenario.)
- Install the dictionary:
  - The easy way: point Pleco to the file AV Chinese.pqb, which is a ready-to-use dictionary file for the current version of Pleco (3.2 as of the time of writing), and can be added as an existing user dictionary to your Pleco installation.
  - The other way: create a new user dictionary on your own, and import the entries from the file AV Chinese Dictionary.txt.
- Install the flashcards:
  - The easy way, assuming you don't need to keep your current flashcards: use the provided Pleco Flashcards.pqb file to restore your flashcard backup, or - if you're on Android - just copy it over your current flashcard file. (Warning: all your current flashcards will be irreversibly lost.)
  - The other way: import the flashcards from the file AV Chinese Flashcards.txt. The recommended settings are:
    - Text encoding: UTF-8
    - Definition source: Prefer dicts
    - Dictionaries: 1 dicts, choose the newly-installed AV Chinese
    - Store in user dict: Off
    - Ambiguous Entries: Prompt
    - Duplicate Entries: Prompt
- Install the HSK flashcards addendum: this additional flashcard list, contained in the file AV Chinese Flashcards HSK Addendum.txt, provides the vocabulary from all levels of the old and new HSK exam that did not appear in any of the previous categories. There are no definitions, so change the import settings accordingly to map them to the dictionary of your choice (for convenience, set "Duplicate Entries" to "Skip" and "Ambiguous Entries" to "Use First" for this import). More details about the HSK list can be found in another thread.
- July 2010: First public release, 3982 unique entries.
- November 2010: Updated to include all the extra words from volumes 3 and 4, as well as a complete rewrite of the lists for the second part of book 4 and for book 5, which were previously based on another source of insufficient quality. Also, many definitions from the earlier chapters were edited and occasional errors were fixed. 4674 unique entries.
- February 2012: Extended to include all the vocabulary from all four levels of the TOP - Huayu (TOCFL) exam that did not appear in any of the books.
- May 2015: Added simplified headwords. Brought back per-chapter categories for the first two books. Cosmetic changes to category names and definition tags. Usage instructions rewritten. Removed duplicate entry for 協助. Fixed duplicated quotation marks. 6679 unique entries.
Attachments