User dict: importing nonstandard pinyin

I'm trying to create a user dict from a .txt file that contains some nonstandard pinyin.

Some seem to import without issue: jiai, xiai, etc.

But I'm struggling to find any words with an ng- initial. I'm not sure if they get flagged as Cantonese or something.

Are there special settings I should be using to generate the dict?
 

mikelove

皇帝
Staff member
No, actually anything you create or import in 4.0 is handled with zero syllable-specific processing - whatever the dictionary treats as a syllable (by putting whitespace or punctuation or a tone number after it) Pleco will duly index as a syllable.
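To make the indexing rule concrete, here is a minimal Python sketch of that kind of splitting. It is not Pleco's code: the function name, the regular expression, and the sample readings are all hypothetical, and it assumes tone numbers run from 1 to 5.

```python
import re

# Assumption: a syllable is a run of letters optionally followed by a tone
# number 1-5; whitespace and punctuation simply fall between matches.
SYLLABLE = re.compile(r"[^\W\d_]+[1-5]?")

def index_syllables(pinyin_field):
    """Split a dictionary entry's pinyin field into indexable syllables."""
    return SYLLABLE.findall(pinyin_field.lower())

# Hypothetical readings, not taken from the screenshots in this thread.
print(index_syllables("ngao4zou1hei1"))
print(index_syllables("ngao zou hei"))
print(index_syllables("ngaozouhei"))
```

In this sketch, a reading written with no tone numbers, spaces, or punctuation comes out as a single long "syllable", since nothing marks where one ends.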

The search engine does have a list of syllables that it uses to break down a long string of letters into potential syllables, but it's normally supposed to also try the whole string of letters just in case someone has entered a non-standard syllable or some letters (BP极) or whatever, so it seems like that might not be working correctly here. Could you give me an example of an entry's Pinyin that is failing to match, and the exact search query that's not matching it?
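As a rough illustration of that search-side behavior, the sketch below splits a query into known syllables and then adds the unsplit string as a fallback candidate. The syllable list and both function names are hypothetical; this is a toy model of the idea, not the actual search engine.

```python
# Hypothetical syllable list; the real one is not given in this thread.
KNOWN_SYLLABLES = {"ao", "zou", "hei", "ng"}

def segmentations(letters, known=KNOWN_SYLLABLES):
    """Return every way to split `letters` entirely into known syllables."""
    if not letters:
        return [[]]
    results = []
    for end in range(1, len(letters) + 1):
        head = letters[:end]
        if head in known:
            for rest in segmentations(letters[end:], known):
                results.append([head] + rest)
    return results

def query_candidates(letters):
    """Known-syllable splits, plus the whole string as a fallback."""
    candidates = segmentations(letters)
    if [letters] not in candidates:
        candidates.append([letters])
    return candidates

print(query_candidates("aozou"))  # a split into known syllables, plus the whole string
print(query_candidates("juo"))    # no known split, so only the whole string remains
```

The fallback is what should let a non-standard syllable such as "juo" still be tried as typed.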
 
Here's an example of 澳洲黑 in the dict:
IMG_2413.jpg


Searching with characters is fine.

Pinyin "aozou..." pulls up the first entry no problem.

"ngaoz" should bring up the second entry:
IMG_2414.jpg


Looks like the /ng-/ initial has issues with pinyin search.

===================================

Ed.:

Here's another one, the character at U+2C465 (could we get visualizations in the search bar as well?):

Screenshot 2024-03-31 at 11.44.02 AM 2.jpeg


Pinyin:

Screenshot 2024-03-31 at 11.44.41 AM.jpeg


Just as a side note, the ext. characters seem to mess up the auto Jyutping:

Screenshot 2024-03-31 at 11.52.07 AM.jpeg


===================================

Ed. 2:

Screenshot 2024-03-31 at 12.01.13 PM.jpeg


&&

Screenshot 2024-03-31 at 12.00.48 PM.jpeg


===================================

Screenshot 2024-03-31 at 12.01.49 PM.jpeg


Again, "juo" doesn't work.

===================================

jiai, xiai, and yiai also don't work, contrary to what I said in my question.
 

mikelove

皇帝
Staff member
So it turns out that in some cases - like "ng" - we do insert syllable breaks without tone numbers, and there would be some significant negative consequences if we stopped doing that.
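A toy Python model of why such a break would hide the entry: if the query "ngaoz" is split after "ng", the first query syllable no longer equals the stored syllable "ngao". The matcher and the stored reading below are hypothetical and only illustrate the effect; the real matching logic is not described in this thread.

```python
def matches(query_syllables, entry_syllables):
    """Every query syllable must equal the entry syllable at the same
    position, except the last one, which only needs to be a prefix."""
    if not query_syllables or len(query_syllables) > len(entry_syllables):
        return False
    *full, last = query_syllables
    for q, e in zip(full, entry_syllables):
        if q != e:
            return False
    return entry_syllables[len(full)].startswith(last)

# Hypothetical stored reading for the second entry.
entry = ["ngao", "zou", "hei"]

print(matches(["ngao", "z"], entry))  # split that respects the entry's syllables
print(matches(["ng", "aoz"], entry))  # split with a break forced after "ng"
```

Under these assumptions only the first split matches, which is consistent with "ngaoz" failing while a character search still works.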

It looks like this is Sichuan Pinyin, right? I think for that what you'd really want is a separate field with its own search type + text processing settings. Creating those things at the moment is kind of awkward + lacks documentation, but we're working on streamlining it and making it package-able into a user dictionary file.
 