How can I create a custom user dictionary via a text editor and computer?

NickF

秀才
I'd like to create my own dictionary, but I'd rather type on a real keyboard than on my phone. Switching screens constantly is also tedious. Also, I'd like this dictionary's entries to be selectable from Flashcards.

It seems like such a thing is possible. I started a custom dictionary on my phone and opened the .pdb file in a text editor. However, even after switching to SQL and UTF-8, the format didn't make a lot of sense.

Could somebody walk me through the process of creating a custom user dictionary, from format to where to put it on the device to how to import it, or point me in the right direction?

谢谢!

PS ... Still loving the heck out of Pleco! After four quarters of college Chinese, I sort-of knew 600-ish words. I'm now somewhere past 2000 words. However, words expressing my gratitude escape me in two languages now!
 

mikelove

皇帝
Staff member
Thanks!

You can import user dictionary entries from a text file - format is:

simplified[traditional]<tab>pinyin (with tone numbers)<tab>definition

one entry per line. Once you've got your file, go into Manage Dictionaries, select your dictionary, then tap on the "Import" button near the bottom of the screen.
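For anyone who would rather generate the file than type it, here is a minimal Python sketch of the layout described above. The sample entries, the helper name `format_entry`, and the file name `my_dictionary.txt` are made up for illustration, and it assumes the square brackets around the traditional form are literal characters in the file.

```python
def format_entry(simplified, pinyin, definition, traditional=None):
    """Return one import line: simplified[traditional]<tab>pinyin<tab>definition."""
    headword = simplified
    # The traditional form is optional; when given, it goes in square brackets.
    if traditional and traditional != simplified:
        headword += "[" + traditional + "]"
    return "\t".join([headword, pinyin, definition])


# Hypothetical sample entries: (simplified, traditional, pinyin with tone numbers, definition)
entries = [
    ("电脑", "電腦", "dian4nao3", "computer"),
    ("键盘", "鍵盤", "jian4pan2", "keyboard"),
    ("你好", None, "ni3hao3", "hello"),
]

lines = [format_entry(s, p, d, traditional=t) for s, t, p, d in entries]

# One entry per line, saved as UTF-8 so the Chinese characters survive.
with open("my_dictionary.txt", "w", encoding="utf-8", newline="\n") as f:
    f.write("\n".join(lines) + "\n")
```

The resulting text file is what you would then copy to the device and pick in the Import step.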

The SQL format is undocumented and in any event is poised for dramatic changes soon, so it's not something I'd recommend trying to reverse-engineer.
 

mikelove

皇帝
Staff member
Awkwardly - right now the only way to put a line break inside a definition is to use the private use Unicode character U+EAB1 in place of an actual newline.
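If the file is built by a script, the substitution can be done there so the character never has to be typed. This is only a sketch: the helper name `encode_definition` and the sample entry are invented for the example, and `"\uEAB1"` is simply Python's escape for the code point U+EAB1.

```python
PLECO_NEWLINE = "\uEAB1"  # private-use character that stands in for a line break


def encode_definition(text):
    """Replace real line breaks in a definition with U+EAB1."""
    # Normalise Windows and old Mac line endings first so each break becomes one marker.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    return text.replace("\n", PLECO_NEWLINE)


# Hypothetical two-line definition.
definition = "computer\nliterally: electric brain"
line = "\t".join(["电脑[電腦]", "dian4nao3", encode_definition(definition)])
```

The entry stays on a single physical line in the file, which the one-entry-per-line format needs.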
 

NickF

秀才
Something isn't working.

I've got a new txt file and I've put it on my SD card.

When I try to import it via Settings > Dictionary Management/Manage Dictionaries > Add User (there's no "Import" button here) > Existing, I get a toast saying "Not a Dictionary Backup" and "Sorry, but this file does not appear to be a valid user dictionary database ..."

Any idea as to what I'm screwing up here?
 

mikelove

皇帝
Staff member
1) Don't literally type "U+EAB1" there; instead, use your text editor's 'insert custom character' feature to insert the character with that Unicode code point. Or just copy and paste it from between these brackets: <>.

2) Choose "Create New" instead of "Load Existing," then go into your newly created dictionary in Manage Dictionaries and tap on Import on its detail screen.
 

NickF

秀才
Thank you again. I'll try my best to make sure this is the last time I bug ya! :D

I'm still having difficulty with the custom character that's between the brackets. I made sure that my browser and that my text editor (Notepad++) are using UTF-8; however, both are rendering the character as a blank rectangle which becomes a rectangle with an X inside in Pleco. The same thing seems to be happening using the MS Character Map.

Any advice?
 

mikelove

皇帝
Staff member
It might show up that way in the Pleco text viewer, but if you actually import the file into the user dictionary it should show up as a newline instead.
 

NickF

秀才
I've got it now! I think (not entirely sure) my problem was the encoding on the editor. Thank you again!
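One rough way to rule out an encoding problem like this before importing is to check the raw bytes of the file. The sketch below assumes the file is meant to be UTF-8; the function name `check_import_bytes` and the file name in the note afterwards are hypothetical.

```python
PLECO_NEWLINE = "\uEAB1"


def check_import_bytes(data):
    """Return (is_valid_utf8, number_of_U+EAB1_markers) for raw file contents."""
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        return False, 0
    return True, text.count(PLECO_NEWLINE)


# In UTF-8, U+EAB1 is stored as the three bytes EE AA B1.
sample = "电脑\tdian4nao3\tcomputer\uEAB1electric brain\n".encode("utf-8")
print(check_import_bytes(sample))  # (True, 1)

# Text saved in a legacy code page such as GBK fails to decode as UTF-8 here.
print(check_import_bytes("电脑".encode("gbk")))  # (False, 0)
```

For a real file, pass in its bytes, e.g. `check_import_bytes(open("my_dictionary.txt", "rb").read())`.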
 

NickF

秀才
"Great! Sorry for the hassle."

Totally not your fault.

One last thing ... I can't speak for anyone else (though I bet I'm not alone in this sentiment), but I would purchase a t-shirt and/or a hooded sweatshirt (hopefully not white) emblazoned with any of the Pleco logos if they were available. :)
 

mikelove

皇帝
Staff member
Heh, we've thought about that, but the logistics of shipping physical products around the world are a nightmare - have to spend a whole lot of time figuring out customs forms and we'd probably nonetheless end up with quite a few customers being told they had to pay double what the shirt itself cost in fees and taxes in order to retrieve it from their local post office.
 

pdwalker

状元
regarding tshirts: cafepress? http://www.cafepress.com/make/custom-t-shirts

regarding custom user dictionaries - based on what you wrote above, the format is "proprietary" and a bit of a pain to work with, although still possible for someone determined enough.

Would you consider releasing some documentation on it, like you did above? I'd love for my custom user dictionaries to have lists, example sentences, additional highlighting (bold) - basically whatever your current dictionaries support now. I understand that I would be completely on my own and that any death and destruction would be entirely my fault should I choose to use that information.
 

mikelove

皇帝
Staff member
In 4.0 we've moved to (a severely restricted subset of) HTML5 for user dictionary / flashcard formatting, so it'll be much easier to do nifty custom entries then. (also for our own dictionaries, so in theory you'll be able to export an entry and combine it with the right CSS file and get something that looks almost exactly like it does in Pleco)
 
Hey folks, I'm quite thrilled with the idea of being able to create a custom dictionary within Pleco!

However, I am rather confused as to how to implement the technical details. Basically what I want to do is import a list of Chinese characters and have a simple definition beside each one. (It would be great if the basic character recognition would automatically generate the pinyin and input that into the definition as well.)

Basically what I am trying to create is a customized dictionary for a proofreading/translation job with a lot of technical terms. I don't have the expertise to program anything. Is there any way to make a simple spreadsheet or document which could then be imported into Pleco, without having to enter lines of code like the ones in the above comments?

Or maybe I'm better off entering the custom definitions directly into Pleco? But I have several thousand terms! Any thoughts, ideas, or suggestions?

Thanks,
Marc
 

mikelove

皇帝
Staff member
Yes - basically you want it tab formatted like this:

characters<tab>pinyin<tab>definition

One item per line. You can export it as tab-delimited text from Excel, though you might need to do a find-and-replace to remove extra quotation marks afterwards.
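As an alternative to a manual find-and-replace, here is a small Python sketch that strips those extra quotation marks. It assumes the export follows the usual spreadsheet convention of wrapping some fields in double quotes and doubling any quotes inside them; the function name `clean_excel_export` and the sample rows are made up.

```python
import csv
import io


def clean_excel_export(exported_text):
    """Parse tab-delimited spreadsheet output and return plain tab-separated lines."""
    reader = csv.reader(io.StringIO(exported_text, newline=""), delimiter="\t")
    cleaned = []
    for row in reader:
        if not any(cell.strip() for cell in row):
            continue  # skip blank rows
        cleaned.append("\t".join(cell.strip() for cell in row))
    return cleaned


# Hypothetical export: the spreadsheet quoted the fields containing a comma or quotes.
exported = '电脑\tdian4nao3\t"computer, PC"\r\n引号\tyin3hao4\t"""quotation"" mark"\r\n'
for line in clean_excel_export(exported):
    print(line)
```

The wrapping quotes are removed and doubled quotes inside a field collapse back to single ones. A cell that itself contains a line break would still need separate handling.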

If you don't want to fill in Pinyin, there's a workaround, though it's a bit awkward - leave the Pinyin blank, and add a line at the top of the file:

//My User Dictionary Entries

Then, import the list into flashcards instead of into a user dictionary, and use the flashcard Edit / Batch command to fill in the missing Pinyin, then the separate Edit / Batch command to dump all of these flashcards to the user dictionary.
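For a list of several thousand terms, the file for this workaround could be generated rather than typed. A minimal sketch, assuming a list of (characters, definition) pairs; the function name `build_flashcard_import` and the sample terms are hypothetical, while the header line is the one given above.

```python
HEADER = "//My User Dictionary Entries"  # header line from the instructions above


def build_flashcard_import(terms):
    """Build import text with an empty pinyin column for each (characters, definition) pair."""
    lines = [HEADER]
    for characters, definition in terms:
        # Two tabs in a row leave the middle (pinyin) field empty.
        lines.append(characters + "\t\t" + definition)
    return "\n".join(lines) + "\n"


# Hypothetical technical terms.
terms = [("轴承", "bearing"), ("齿轮", "gear")]
text = build_flashcard_import(terms)
```

Saving `text` as a UTF-8 file gives something to import into flashcards, after which the Edit / Batch steps fill in the Pinyin and move the cards to the user dictionary.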

(this will all get a lot less awkward in 4.0 with flashcards + user dictionaries now getting merged into the same thing)
 
Ok, thank you for the quick response!

So I'm thinking to go ahead and try implementing as per your above instructions: leaving the pinyin blank, with the (//My User Dictionary Entries) line at the top of the file, then the import into flashcards followed by dumping into the user dictionary.

Just a few questions to clarify:
Is the above method only for Excel, or could I also use a text editor? Could the same method be set up within Google Sheets? If I go with a text editor, is the document set up differently? For the aforementioned Excel method, I'm not sure what is meant by tab-delimited text. Is this a setting within Excel?

Thanks again for the support, much appreciated!
 

mikelove

皇帝
Staff member
Should be pretty much the same regardless of spreadsheet app; tab-delimited text is just a formatting option when exporting spreadsheets to text files.
 