Hainanese Resources for Pleco

furisas

Member
I know of an ongoing effort to create a Hainanese dictionary of some 5,500 words. It will have four searchable indexes:
(1) Chinese characters, (2) Mandarin pinyin, (3) Hainanese (Wenchang) phonetic pronunciation, (4) English meaning.

For example:
早上,zao3 shang4, da4 jio2, morning
笨蛋,ben3 dan4, pong1 gang3, stupid
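
To make the four-index idea concrete, here is a minimal sketch of how rows like these could be loaded and looked up by any of the four fields. It assumes the comma-separated layout shown above (characters, pinyin, Hainanese phonetics, English); the names `Entry`, `parse_rows` and `build_indexes` are made up for illustration and have nothing to do with Pleco itself.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    hanzi: str
    pinyin: str
    hainanese: str
    english: str


SAMPLE = """早上,zao3 shang4, da4 jio2, morning
笨蛋,ben3 dan4, pong1 gang3, stupid"""


def parse_rows(text):
    """Turn comma-separated lines into Entry objects."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split on the first three commas only, so the English gloss may contain commas.
        parts = [p.strip() for p in line.split(",", 3)]
        if len(parts) != 4:
            raise ValueError(f"expected 4 fields: {line!r}")
        entries.append(Entry(*parts))
    return entries


def build_indexes(entries):
    """Build one exact-match lookup table per field."""
    indexes = {name: defaultdict(list)
               for name in ("hanzi", "pinyin", "hainanese", "english")}
    for entry in entries:
        for name in indexes:
            indexes[name][getattr(entry, name).lower()].append(entry)
    return indexes


entries = parse_rows(SAMPLE)
indexes = build_indexes(entries)
print(indexes["hainanese"]["da4 jio2"][0].english)  # prints "morning"
```

This only does exact matches; a real dictionary would also want prefix or tone-insensitive search.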

I would like to know how this could be imported into, or licensed to, Pleco.

Would Pleco be able to use TTS to voice the Hainanese pronunciation phonetics? If that is unsatisfactory, can voice recordings be attached to each of the words?

Can someone explain to me how this can be done? Can this be done in this current Pleco version, or does it need the capabilities in the upcoming Pleco 4.0?

If it is not doable in this version of Pleco, can it be done in Anki? The idea is that, in the meantime, the work on the digital Hainanese dictionary and flashcards can first be done in Anki and then imported from Anki into Pleco 4.0 when that comes out.

Although not a new Pleco user, I am not a digital native and I struggle somewhat to be any sort of 'power user'. Please explain like you are talking to a 5 year old. Thank you!
 

mikelove

皇帝
Staff member
Would Pleco be able to use TTS to voice the Hainanese pronunciation phonetics? If that is unsatisfactory, can voice recordings be attached to each of the words?
Sorry, what TTS would we use? You can attach voice recordings, sure, but I'm not aware of any Hainanese TTS engines we could tap into. If this is only on Android and somebody has created a third-party Hainanese system TTS engine, then you could hook it into even the current version of Pleco by choosing it as your System TTS Engine.

Can someone explain to me how this can be done? Can this be done in this current Pleco version, or does it need the capabilities in the upcoming Pleco 4.0?
You could do it now if you put the Hainanese pronunciation at the start of the definition field. Would not be possible to search it (or display it separately from the definition) until 4.0.
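
As a rough sketch of that workaround, the snippet below turns the four-column rows into tab-separated lines with the Hainanese reading at the front of the definition. It assumes a text import that takes one entry per line as characters, pinyin, definition separated by tabs; the square-bracket style and the function name `to_import_lines` are arbitrary choices for illustration, not anything Pleco requires.

```python
import csv
import io


def to_import_lines(rows):
    """Yield 'characters<TAB>pinyin<TAB>[hainanese] english' lines."""
    for hanzi, pinyin, hainanese, english in rows:
        # Assumption: the Hainanese reading goes in brackets at the start of the definition.
        definition = f"[{hainanese}] {english}"
        yield "\t".join((hanzi, pinyin, definition))


SOURCE = "早上,zao3 shang4, da4 jio2, morning\n笨蛋,ben3 dan4, pong1 gang3, stupid\n"

# skipinitialspace drops the blank after each comma in the sample rows.
reader = csv.reader(io.StringIO(SOURCE), skipinitialspace=True)
rows = [[field.strip() for field in row] for row in reader if row]
lines = list(to_import_lines(rows))

print("\n".join(lines))
```

The printed lines would be saved to a UTF-8 text file before importing. A gloss containing commas would need quoting in the source file for this to work.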

The idea is that, in the meantime, the work on the digital Hainanese dictionary and flashcards can first be done in Anki and then imported from Anki into Pleco 4.0 when that comes out.
It would probably be easier to do it from a text file, honestly. Anki import would work OK, but it wouldn't really be any different for this purpose than a text import - in either case you've got a bunch of columns of semantically ambiguous text which Pleco needs to be told what to do with.
 

DavidMars

举人
Is there any preliminary data set available, with 100 or 500 or 1,000 words? Is there any link to the people doing this work? Thanks, David
 

furisas

Member
furisas said:
Would Pleco be able to use TTS to voice the Hainanese pronunciation phonetics? If that is unsatisfactory, can voice recordings be attached to each of the words?
Mike Love: Sorry, what TTS would we use? You can attach voice recordings, sure, but I'm not aware of any Hainanese TTS engines we could tap into. If this is only on Android and somebody has created a third-party Hainanese system TTS engine, then you could hook it into even the current version of Pleco by choosing it as your System TTS Engine.
Mike,
Thanks for responding so quickly! My sincere apologies for being the one who is so slow. I have been so caught up. I really do feel your love for Pleco users. Thank you!

Please do bear with me if I do not properly comprehend the technology and techniques you use in Pleco. That is why I asked for your indulgence to explain to me like I was a 5 year old, even though I am past 60.

My aim is to learn to construct resources to use in Pleco, to help me learn Hainanese more easily, and consequently use it to teach Hainanese to others keen to learn.

I am aiming to help this sort of learner: someone who already knows Chinese at an intermediate level (and can learn more using Pleco) but is very weak in Hainanese. The idea is to use the Chinese characters in a conversation text as scaffolding to help the learner progress to strong conversational skills, and so revive Hainanese speaking skills in our Hainanese community. The observation in our community is that the ability to speak Hainanese is gradually being weakened, even lost, even amongst the younger Hainanese families on Hainan island, as Mandarin and its pinyin take stronger root.

I am thinking of benchmarking it to HSK 1-6. Please guide me if there is a better standard more suited to conversational content. I suppose this learning eventually has to lead to reading Chinese novels or newspapers in Hainanese.

Are you saying that the present Pleco 3.2.59 (on both iOS and Android) is already able to help me if I have resource content where the Chinese characters have their Hainanese pronunciation
1. in a voice file
2. in phonetics (Would such phonetics be pronounceable using some generic text-to-speech software? I do not know if any such software exists)
 

mikelove

皇帝
Staff member
Sorry, no, the 'can' was referring to what you can do in the upcoming 4.0. If you had a Hainanese text to speech system (a complicated program somebody would have to write) you could theoretically install that on Android and have the current version of Pleco play audio through it, but that's not at all straightforward.
 

furisas

Member
Mike,

Hainanese Voice Recordings
Are you saying that attaching Hainanese voice recordings to each of the Chinese words (both simple and compound words) has to wait for the upcoming Pleco 4.0?

If so, could you advise me whether it would be better to trial these Hainanese voice recordings in Anki first (for both the user-defined custom dictionary and the flashcard system)? I am thinking this would iron out whatever production issues there may be, so that a smooth Hainanese voice recording setup is ready to port into Pleco 4.0 when it comes out.

Hainanese Text to Speech:
If there is no further assistance I can call upon, I think the Hainanese Text to Speech system would be a very difficult project. From the little I know, the Hainanese dialect has some initials and finals not present in Mandarin Pinyin or Cantonese Jyutping. Most of the words use just 5 tones. The other words use the 6th, 7th and 8th tones, but these could be neglected or ignored, as an approximation to the 5th could already be passable and good enough in the conversational domain.
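
As a rough illustration of that approximation only, the sketch below rewrites tone numbers 6, 7 and 8 as 5 in a phonetic string. It assumes each syllable is written as lowercase letters followed by a single tone digit, as in the dictionary examples; the function name `collapse_tones` and the sample syllables "bak7 tiu8" are made up for the demonstration and are not real dictionary entries.

```python
import re

# Assumption: a syllable is lowercase letters followed by one tone digit (1-8).
_SYLLABLE = re.compile(r"\b([a-z]+)([1-8])\b")


def collapse_tones(phonetics):
    """Replace tones 6, 7 and 8 with 5; leave tones 1-5 unchanged."""
    def repl(match):
        tone = int(match.group(2))
        return match.group(1) + str(min(tone, 5))
    return _SYLLABLE.sub(repl, phonetics)


print(collapse_tones("da4 jio2"))   # unchanged: da4 jio2
print(collapse_tones("bak7 tiu8"))  # made-up syllables: bak5 tiu5
```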

In this text-to-speech area, I would need to read up. I would really appreciate it if you could point me to a few good resources that can help me get over the technical hump. Otherwise, I am not competent to discuss the matter. But I would dearly love to understand the technical side well enough to discuss with a technically competent person how to build a Hainanese text-to-speech system.

Hainanese Speech to Text:
The reverse effort is also an interesting area to me. I have voice recordings where the teacher explains in Hainanese. I want to turn this into text. How can I go about this?

I would be very grateful for your assistance to explain or to point me to somewhere to get help.

Sincerely,
Tsu Li
 

mikelove

皇帝
Staff member
Are you saying that attaching Hainanese voice recordings to each of the Chinese words (both simple and compound words) has to wait for the upcoming Pleco 4.0?
Yes. You could certainly try it in Anki if you like but I don't know if that would really do anything to help with the process in Pleco 4.0.

If there is no further assistance I can call upon, I think the Hainanese Text to Speech system would be a very difficult project. From the little I know, the Hainanese dialect has some initials and finals not present in Mandarin Pinyin or Cantonese Jyutping.
Android TTS engines just send characters, actually, not romanized text, so this would basically be writing an Android TTS plugin that mapped characters to Hainanese recordings. (in theory you could actually also use that with AnkiDroid, since it'll hook into any configured Android TTS engine just like we do)
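
To give a feel for the character-to-recording mapping, here's a minimal sketch of just the lookup step, written in Python rather than as an actual Android plugin (a real engine would be built in Java or Kotlin on Android's `TextToSpeechService`). It uses greedy longest-match so that a compound word's recording is preferred over its single characters; the function name, the `max_len` limit and the audio file paths are all hypothetical.

```python
def segment(text, recordings, max_len=4):
    """Greedy longest-match: split text into known entries, leftmost first.

    Returns (chunk, recording) pairs; recording is None where nothing matches.
    """
    i, out = 0, []
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            chunk = text[i:i + size]
            if chunk in recordings:
                out.append((chunk, recordings[chunk]))
                i += size
                break
        else:
            # No recording covers this character; report it and move on.
            out.append((text[i], None))
            i += 1
    return out


# Hypothetical file names, for illustration only.
RECORDINGS = {
    "早上": "audio/zaoshang.mp3",
    "笨蛋": "audio/bendan.mp3",
    "早": "audio/zao.mp3",
}

print(segment("早上笨蛋好", RECORDINGS))
```

Here 早上 is matched as a whole word rather than as 早 plus an unknown 上, and 好 comes back with no recording, which is the case a real engine would need some fallback for.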

I would really appreciate it if you could point me to a few good resources that can help me get over the technical hump.
Honestly I haven't ever made more than a cursory investigation of how to write our own Android TTS engine (which we're mostly forbidden from doing for license reasons anyway) so I wouldn't be able to help much.

The reverse effort is also an interesting area to me. I have voice recordings where the teacher explains in Hainanese. I want to turn this into text. How can I go about this?
I don't know of any voice recognition engines that support Hainanese. Making your own would require building a model for an open-source voice recognition engine like DeepSpeech, for which you'd need an AI expert and a *ton* of training data (thousands of hours of transcribed audio of people speaking Hainanese).
 