Beta 7 Audio Error Thread

marsch

举人
mikelove said:
With #7, do you mean "wen1nuan3"? There's no entry for "wen1huan3" in any of our dictionaries.
It was probably wen1nuan3, given that my handwriting is just a bit messy :) Checked that again and can hear it clearly, though, so maybe I was thinking of something different... Anyway: just forget it.
 
Hi Mike

Below are some entries that don't sound quite right as ABC Dictionary flashcards. (Volume set to 10)
Not major issues,
just reporting them so you can check.

美丽 měilì (f) Sounds a little like 'měidì'.
滑冰 huábīng (f) Truncated 2nd syllable.
汉堡 hànbǎo (f) Repeated 2nd syllable.
机场 jīchǎng (f) Truncated 2nd syllable.
烤鸭 kǎoyā (f) '子' added to end.
常常 chángcháng (f) Truncated 2nd syllable.
常 cháng (f) Truncated, sounds like 'cha'.

The Duelist
 

mikelove

皇帝
Staff member
Thanks - I'm hearing issues in most of these too. (though mei3li4 I think is just a combination of her accent and the high-level data compression)
 

mikelove

皇帝
Staff member
Lots of those, unfortunately - we really need to run through the woman's audio samples with a noise filter at some point.
 

Alexis

状元
Can someone listen to "die2" and let me know if it sounds like "bie2" on your hardware?

- Alexis

PS. I've finished checking up to "h" so far. "d" is the most troublesome because a portion of the entries (about 60 or so) sound like they start with a "b" on my iPAQ 210.
 

mikelove

皇帝
Staff member
Thanks again for doing this! Yeah, that's another case of the audio compression algorithm doing a less-than-stellar job - it sounds fine in the original recordings but those tip-of-the-tongue 'd' sounds are the sort of thing that the compression seems to shortchange. Another area that will get improved once we stop supporting Palm (and therefore can switch to MP3 without any patent issues, since support for it is built into every other platform Pleco might be released on - MP3 seems to handle those sounds a lot better than Vorbis for some reason).

Our bigger worry at the moment is words that don't match up with the sound at all, extra syllables / not-even-close incorrect syllables / incorrect tones and the like; small consonant shifts are more likely to be data compression issues than actual errors.
 

Alexis

状元
mikelove said:
Our bigger worry at the moment is words that don't match up with the sound at all, extra syllables / not-even-close incorrect syllables / incorrect tones and the like; small consonant shifts are more likely to be data compression issues than actual errors.

Good to know! I will ignore funny consonants (and consistent accents!) :)
 

mikelove

皇帝
Staff member
The Duelist - sounds OK here, the woman's accent does tend to truncate those 'ai's a bit.

I'm starting to think we should release a less-aggressively-compressed set of audio files as an experiment and see if people find the improvement worth the extra disk space (roughly double if we turned the compression down enough to make a perceptible difference) - everybody seems to have 2 GB SD cards anyway now, and we've got plenty of download bandwidth, so it wouldn't be an unreasonable thing to supply as an optional / alternative add-on.
 

Alexis

状元
Mike,

You may want to check this out (taken from the Vorbis FAQ http://vorbis.com/faq/#_speech)

How does Vorbis fare for speech compression?
It works well, but is generally not the optimal solution. Vorbis is designed for the compression of music and general purpose audio. Special purpose codecs can achieve much greater compression of speech than Vorbis. Vorbis also tends to have a latency that is too high for telephony, a common use of speech codecs. Read the Speech Coding and Compression FAQ (http://www.speech.cs.cmu.edu/comp.speech/FAQ3.html) for more details. Those looking for an open-source, patent-free speech codec should take a look at Speex (http://www.speex.org/).
 

radioman

状元
Just a note with regard to 2 GB cards: I might be wrong, but I do not believe the Palm E2 can support above 1 GB (at least mine can't, as I have tried a number of formats and sizes). But alternative audio file options (compressed vs. less compressed) would certainly not be a problem. Also, not sure how much more designing around the E2 Pleco would want to take on - even with my E2 I would rather have the energy put to the iPhone/iPod touch platform.
 
mikelove said:
I'm starting to think we should release a less-aggressively-compressed set of audio files as an experiment and see if people find the improvement worth the extra disk space (roughly double if we turned the compression down enough to make a perceptible difference) - everybody seems to have 2 GB SD cards anyway now, and we've got plenty of download bandwidth, so it wouldn't be an unreasonable thing to supply as an optional / alternative add-on.

Would a 4x (size) improve quality noticeably more than 2x? I ask because that also gives the choice to limit to a single voice, with the "highest" quality.

You could put up a sample block of words at each setting and let people check it on various platforms.
 

mikelove

皇帝
Staff member
Alexis - it's true Vorbis isn't ideal for voice, but we can't use MP3 on Palm due to patent concerns (we'll switch to MP3 as soon as we stop supporting Palm, since every other mobile platform has it built-in and therefore we don't need to worry about patent licensing) and there were few other codecs that could be easily ported to Palm, so it was the best we could do for now. Speex is more designed for VoIP than static recordings, and it's not as Palm-friendly anyway.

radioman - that's true, so this would definitely be an optional set of files rather than the only set available.

stephanhodges - it's unlikely that would make much of an additional difference - also at some point we have to start worrying about server bandwidth here, going from 300 to 600 MB is manageable but bumping that up to 1.2 GB could start to push our limits a bit.
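To put rough numbers on that, here is a minimal sketch of the size arithmetic. It assumes the 300 MB figure above is the full current audio set and takes the 1 GB card limit reported earlier in the thread for the Palm E2 as 1024 MB; the function names are purely illustrative and have nothing to do with Pleco's actual tooling.

```python
# Rough size arithmetic for the audio sets discussed in this thread.
# Assumptions: 300 MB is the whole current set, and a "1 GB" card
# holds 1024 MB. Both numbers are illustrative, not measured.

BASELINE_MB = 300        # current audio set, per the post above
CARD_LIMIT_MB = 1024     # reported Palm E2 card limit, taken as 1 GB


def audio_set_size_mb(scale, baseline_mb=BASELINE_MB):
    """Size of the audio set if every file grows by `scale`."""
    return baseline_mb * scale


def fits_on_card(size_mb, card_mb=CARD_LIMIT_MB):
    """True if the audio set alone fits on a card of `card_mb`."""
    return size_mb <= card_mb


if __name__ == "__main__":
    for scale in (1, 2, 4):
        size = audio_set_size_mb(scale)
        verdict = "fits" if fits_on_card(size) else "does not fit"
        print(f"{scale}x: {size} MB, {verdict} on a 1 GB card")
```

On these assumptions the 2x set still fits on a 1 GB card with room to spare, while the 4x set would not fit even before counting dictionaries and other files.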
 

mikelove

皇帝
Staff member
Yeah, we keep finding more and more like that - we may actually end up making the male voice the default after all, with the female files a separate download at least initially; if anybody suggests we're being gender-biased we can invite them to try both samples themselves and see which set sounds better.
 

radioman

状元
Just my two cents on female vs. male - I certainly would like to retain the woman's voice or at least be provided the option outside the main download set. I think it is easier to hear (typically higher toned). This would be my view even if cleanup of the female voice did not take place.
 

mikelove

皇帝
Staff member
Oh we're certainly keeping the female audio, it just might be in a separate download - we might keep the female files in "beta" for a while after 2.0 is released while we finish cleaning it up.
 
Hello Mike,

I definitely support having a less compressed audio set out there for those that have the storage capacity. 2 GB cards are pretty cheap these days, so I think it is more of a hardware issue than a financial one (though for some I could be proved wrong on this), where some hardware won't support larger than 1 GB. Of course, I have an iPaq with an SD and CF slot so I am fine :D Personally I would download the least compressed audio file set that you put out there.

Like everyone else, I too have static in many of the woman's audio samples. I must say though, that not only do I like having both male and female sets available, but that I tend to listen to the female voice more. I feel like her pronunciation makes it easier for me to pick out the tones (when the static isn't involved, of course).

Are the users looking at a long lag between Pleco 2.0 being released and the cleaned version of the woman's audio files being released?

Oh. The die2 files, both male and female, sound fine on my iPaq hx2495, and the female "ai's" don't seem to be truncated enough to really bother me. Of course, my friends speak Mandarin with so many different accents that it could just be that I am not picky anymore.

Keep up the excellent work!

Darrol
 

mikelove

皇帝
Staff member
Hopefully not a long lag, just depends on how difficult the cleanup work proves - if batch processing won't cut it for noise removal then it could take quite a while to manually clean up each static-y file.
 