We've decided we need to ship some sort of official support for several Chinese topolects in Pleco 4.0 - rather than expecting users to go through dozens of pages of reference documentation to figure out how to set them up themselves - and I'm looking for some advice on which romanization systems to default to and how to use them.
Hokkien: it seems like Tâi-lô is the predominant system, and for the most part we can approach it like we do Mandarin and (Yale) Cantonese - search with suffix tone numbers and display with diacritics. Does that make sense?
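To make the Hokkien idea concrete, here is a rough Python sketch of "search with suffix tone numbers, display with diacritics". It is not Pleco's implementation: the function names are made up for illustration, the tone-mark placement rule is a simplified reading of Tâi-lô (a, then e, then o, the second letter of iu/ui, then i or u, then a syllabic ng or m), and tones outside 1-5, 7 and 8 are simply left unmarked. All of that would need checking against the MOE specification.

```python
import re
import unicodedata

# Combining marks for Tâi-lô tones; tones 1 and 4 are written without a mark.
# Assumption: tones 6 and 9 are not handled and come out unmarked.
TONE_MARKS = {"2": "\u0301", "3": "\u0300", "5": "\u0302", "7": "\u0304", "8": "\u030d"}


def strip_tone(syllable):
    """Search key: the syllable with any single trailing tone digit removed."""
    return re.sub(r"\d$", "", syllable.lower())


def mark_position(base):
    """Index of the letter that carries the tone mark (simplified rule)."""
    for vowel in ("a", "e", "o"):
        if vowel in base:
            return base.index(vowel)
    # For iu and ui the mark goes on the second letter.
    for pair in ("iu", "ui"):
        if pair in base:
            return base.index(pair) + 1
    # Single i or u, then syllabic nasals (the mark sits on n of ng, or on m).
    for letter in ("i", "u", "ng", "m"):
        if letter in base:
            return base.index(letter)
    return None


def to_diacritic(syllable):
    """Convert one numbered syllable (e.g. 'tai5') to its diacritic form."""
    match = re.fullmatch(r"([a-z]+)(\d?)", syllable.lower())
    if not match:
        return syllable  # not something this sketch understands
    base, tone = match.groups()
    mark = TONE_MARKS.get(tone)
    pos = mark_position(base)
    if mark is None or pos is None:
        return base
    marked = base[: pos + 1] + mark + base[pos + 1 :]
    # NFC composes letter + mark into a single code point where one exists.
    return unicodedata.normalize("NFC", marked)
```

With this, `to_diacritic("tai5")` gives `tâi` and `strip_tone("tai5")` gives `tai`, so the index could store the numbered form and derive both the search key and the display string from it.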
Hakka: there seems to be more active competition here, but since, as with Hokkien, the Taiwan MOE seems to publish a lot of material for this and it generally seems like most of the interest in working with Hakka languages is coming from users in Taiwan, we should probably follow their 臺灣客家語拼音方案 scheme, correct? The Wikipedia article on that shows floating diacritic marks after syllables, but then the actual MOE Hakka dictionary uses superscript numbers, 2 per syllable, which to me look tidier - would the latter system likely be acceptable to most users? And for searching, would we want to let you enter zero or two digits and ignore suffixes of just one digit?
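The search rule I have in mind for Hakka could look roughly like the Python sketch below: a syllable with exactly two trailing digits is treated as carrying a tone, and anything else (no digits, or a lone digit) is matched on the letters only. The function names and the sample input are hypothetical, and the sample syllables are only illustrative strings, not vetted dictionary entries.

```python
import re

# Map ASCII digits to Unicode superscript digits for display.
SUPERSCRIPT = str.maketrans("0123456789", "⁰¹²³⁴⁵⁶⁷⁸⁹")

# A syllable here is a run of letters followed by optional trailing digits.
SYLLABLE = re.compile(r"([a-z]+)(\d*)")


def parse_query(query):
    """Split a query into (letters, tone) pairs.

    The tone is kept only when exactly two digits were typed; zero digits
    or a single digit yields None, meaning "match any tone".
    """
    result = []
    for letters, digits in SYLLABLE.findall(query.lower()):
        tone = digits if len(digits) == 2 else None
        result.append((letters, tone))
    return result


def display(letters, tone):
    """Render a syllable with its tone digits as superscripts."""
    return letters + (tone.translate(SUPERSCRIPT) if tone else "")
```

For example, `parse_query("ngin ga24 hag5")` keeps the tone only for `ga24`, and `display("ga", "24")` renders it as `ga²⁴`.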
Wu: again more competition but I get the impression that the most favored system at the moment is Wugniu? And that between sandhi chains and other complications online dictionaries generally don't bother to make the tones searchable? Is it worth the trouble to implement a sandhi notation system like Wiktionary's? (it seems like they've got about the only open-source dictionary data for it)
Sichuanese: it seems like the system to use here is Sichuanese Pinyin, which again we can treat like regular Pinyin but with more syllables and superscripts instead of diacritics?