User dict: importing nonstandard pinyin

I'm trying to create a user dict from a txt file with some nonstandard pinyin.

Some seem to import without issue: jiai, xiai, etc.

But I'm struggling to find any ng- initial words - I'm not sure if they get flagged as Cantonese or something.

Are there special settings I should be using to generate the dict?
 

mikelove

皇帝
Staff member
No, actually anything you create or import in 4.0 is handled with zero syllable-specific processing - whatever the dictionary treats as a syllable (by putting whitespace or punctuation or a tone number after it) Pleco will duly index as a syllable.
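To illustrate the rule described above (a syllable is whatever is followed by whitespace, punctuation, or a tone number), here is a minimal Python sketch. It is not Pleco's code; the function name `index_syllables` and the regular expression are assumptions made purely for illustration.

```python
import re

# Assumption: a syllable is a run of letters, optionally followed by one tone
# digit (1-5). Whitespace and punctuation simply separate syllables.
SYLLABLE = re.compile(r"[^\W\d_]+[1-5]?")


def index_syllables(reading: str) -> list[str]:
    """Split a reading field into syllables with no syllable-specific checks."""
    return [match.group(0).lower() for match in SYLLABLE.finditer(reading)]


if __name__ == "__main__":
    print(index_syllables("juo2"))      # a nonstandard syllable is kept whole
    print(index_syllables("ngao zou"))  # whitespace ends each syllable
    print(index_syllables("ju5o2"))     # a tone number ends a syllable too
```

Under this rule nothing is checked against a list of valid syllables, so "juo2" would be indexed as a single syllable just as it was typed.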

The search engine does have a list of syllables that it uses to break down a long string of letters into potential syllables, but it's normally supposed to also try the whole string of letters just in case someone has entered a non-standard syllable or some letters (BP极) or whatever, so it seems like that might not be working correctly here. Could you give me an example of an entry's Pinyin that is failing to match, and the exact search query that's not matching it?
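As a rough sketch of the intended search behavior described above (break the query into known syllables, but also try the whole string as one syllable), something like the following Python would do it. The syllable set `known` and both function names are hypothetical, and the real syllable list is of course much larger.

```python
def segmentations(query: str, known: set[str]) -> list[list[str]]:
    """Return every way to split `query` into syllables found in `known`."""
    if not query:
        return [[]]
    results = []
    for end in range(1, len(query) + 1):
        head = query[:end]
        if head in known:
            for rest in segmentations(query[end:], known):
                results.append([head] + rest)
    return results


def candidate_readings(query: str, known: set[str]) -> list[list[str]]:
    """Known-syllable splits plus the whole query as a single syllable."""
    candidates = segmentations(query, known)
    whole = [query]
    if whole not in candidates:
        # Fallback for non-standard syllables or stray letters.
        candidates.append(whole)
    return candidates


if __name__ == "__main__":
    known = {"ju", "o", "ao", "zou"}  # tiny illustrative subset
    print(candidate_readings("juo", known))
    print(candidate_readings("aozou", known))
```

If the whole-string fallback were skipped, "juo" could only ever match entries indexed as "ju" + "o", which is the kind of failure being asked about here.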
 
Here's an example of 澳洲黑 in the dict:
IMG_2413.jpg


Searching with characters is fine.

Pinyin "aozou..." pulls up the first entry no problem.

"ngaoz" should bring up the second entry:
IMG_2414.jpg


Looks like the /ng-/ initial has issues with pinyin search.

===================================

Ed.:

Here's another one “” (U+2C465): (could we get visualizations in the search bar as well?)

Screenshot 2024-03-31 at 11.44.02 AM 2.jpeg


Pinyin:

Screenshot 2024-03-31 at 11.44.41 AM.jpeg


Just as a side note, the ext. characters seem to mess up the auto Jyutping:

Screenshot 2024-03-31 at 11.52.07 AM.jpeg


===================================

Ed. 2:

Screenshot 2024-03-31 at 12.01.13 PM.jpeg


&&

Screenshot 2024-03-31 at 12.00.48 PM.jpeg


===================================

Screenshot 2024-03-31 at 12.01.49 PM.jpeg


Again, "juo" doesn't work.

===================================

jiai, xiai, yiai also don't work, contrary to what I said in my question.
 

mikelove

皇帝
Staff member
So it turns out that in some cases - like "ng" - we do insert syllable breaks without tone numbers, and there would be some significant negative consequences if we stopped doing that.

It looks like this is Sichuan Pinyin, right? I think for that what you'd really want is a separate field with its own search type + text processing settings. Creating those things at the moment is kind of awkward + lacks documentation, but we're working on streamlining it and making it package-able into a user dictionary file.
 
Is it getting split on the backend like before with neutral tones, i.e.: 脚 juo2 ---> ju5o2?

Search type + text processing settings, sounds very involved. Is it as complicated as it sounds?
 

mikelove

皇帝
Staff member
"Is it getting split on the backend like before with neutral tones, i.e.: 脚 juo2 ---> ju5o2?"
Potentially, yes.
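For anyone wondering how a split like juo2 ---> ju5o2 could come about, here is a hedged Python sketch: a greedy splitter that breaks an unknown syllable into known ones and gives every syllable before an inserted break a neutral tone 5. This is only a guess at the general mechanism, not Pleco's actual logic; `KNOWN` and `split_with_neutral_tones` are made up for the example.

```python
import re

# Tiny illustrative subset of standard syllables (assumption).
KNOWN = {"ju", "o", "ao", "zou"}


def split_with_neutral_tones(reading: str, known: set[str] = KNOWN) -> str:
    """Greedily split an unknown syllable; inserted breaks get tone 5."""
    match = re.fullmatch(r"([a-z]+)([1-5]?)", reading)
    if not match:
        return reading
    letters, tone = match.groups()
    if letters in known:
        return reading  # already a known syllable, nothing to split
    parts = []
    i = 0
    while i < len(letters):
        # Try the longest known syllable starting at position i.
        for end in range(len(letters), i, -1):
            if letters[i:end] in known:
                parts.append(letters[i:end])
                i = end
                break
        else:
            return reading  # cannot be split cleanly, leave untouched
    # The original tone stays on the last syllable.
    return "".join(part + "5" for part in parts[:-1]) + parts[-1] + tone


if __name__ == "__main__":
    print(split_with_neutral_tones("juo2"))
    print(split_with_neutral_tones("zou1"))
```

A reading that cannot be fully covered by known syllables is returned unchanged in this sketch, which is one way such a splitter could avoid mangling arbitrary text.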

"Search type + text processing settings, sounds very involved. Is it as complicated as it sounds?"
Yeah. The biggest remaining Major Improvement I'd Like To Make For The First 4.0 Release is retooling settings, especially Expert settings. At least until we figure out a better way to bridge the gap between Expert and non-Expert, I think we're going to hide most of the Expert ones altogether and instead offer simpler ways to do the few things that people are actually interested in doing with them, like creating fields for non-standard romanizations or customizing the order of buttons in the reader / sections in the definition screen.

Frankly, a lot of this is so arcane and technical that if somebody wants to add, say, POJ romanization support for Minnan, it'll take less time for us to simply add a "make a custom field for POJ readings" option than it will to walk all of the people who might like to do that through the process of doing it + of enabling support for it in other people's copies of Pleco.

And in retrospect I probably should have realized this was going to be a problem a year or two ago - I was more focused on making sure it was possible to add Feature X with the primitives we'd created than I was on actually figuring out how end users would go about doing so. Though a good deal of that complexity only emerged in the half-year or so before we launched the first beta, when we were busily getting every obscure-but-important little 3.0 behavior to work correctly in 4.0.
 