Integrating BCC Corpus Data into Dictionary

Shun

状元
Thank you very much for your detailed explanation! Yes, that makes sense. Also, by importing the cards as a user dictionary, you gain additional benefits without losing anything! So if my understanding is correct, it seems there are no significant downsides. :)

You're welcome! Yeah, it's true, for version 3.2, importing into a user dictionary as well as flashcards is probably the best you can do.

In my case, though, if I create such a user dictionary, I'd like to combine all of the individual corpus frequency rankings into a single dictionary entry for each term, so it will require a little more work. :rolleyes:

Yeah, but if you'd like to have a tag for each ranking, I believe you'd need to have a separate Flashcards category and separate user dictionary entries for each, as well, so you can give one tag to each category a flashcard is in.

Great! It does seem that, as we expected, Python is indeed quite a bit quicker than VBA!

May well be! I don't really know about VBA yet, because the indexing, as far as I can tell from your clear description, relied exclusively on Excel formulæ to do its job. So it's certainly much faster than Excel. The next part (generating flashcards from the indexed word-sentence list) I expect to be a lot less work for the computer than the indexing, though it's trickier to program.

Of course, I will post the source and the results here when I'm done.
 

leguan

探花
Yeah, but if you'd like to have a tag for each ranking, I believe you'd need to have a separate Flashcards category and separate user dictionary entries for each, as well, so you can give one tag to each category a flashcard is in.

Good point! Yes, you are right.

I guess in my case, since I wouldn't be keen on having individual user dictionaries for each corpus, I could go with per-corpus flashcard sets (expanded to also include corpus terms without Pleco dictionary entries) to keep the per-corpus tagging, and create one user dictionary (without tags) with all the per-corpus ranking info included in one entry per term. Do you think there would be any issues doing things this way?
 

Shun

状元
Yeah, but if you'd like to have a tag for each ranking, you'd need to have a separate Flashcards category and separate user dictionary entries for each, as well, so you can give one tag to each category a flashcard is in.
Good point! Yes, you are right.

Thanks, no problem.

I guess in my case, I could go with per-corpus flashcard sets to keep the per-corpus tagging, and one user dictionary (without tags) with all the per-corpus ranking info included in one entry per term. Do you think there would be any issues doing things this way?

No, not at all. You just always add to that user dictionary each time you import from a different corpus' flashcard list. It's no problem because each flashcard just has its own link to a particular user dictionary entry.
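To illustrate the "one entry per term" idea, here is a minimal Python sketch. It assumes each corpus is available as a list of terms already sorted by frequency; the corpus names, the function names, and the tab-separated output layout (headword, empty pinyin field, definition text) are assumptions made for this example, not something confirmed in this thread.

```python
from collections import defaultdict


def merge_rankings(corpora):
    """Collect each term's rank from every corpus it appears in.

    corpora: dict mapping a corpus name to a list of terms in
    descending frequency order (hypothetical input format).
    Returns a dict: term -> {corpus name: rank}, ranks starting at 1.
    """
    merged = defaultdict(dict)
    for corpus, terms in corpora.items():
        for rank, term in enumerate(terms, start=1):
            merged[term][corpus] = rank
    return dict(merged)


def format_entry(term, ranks):
    """Build one tab-separated line per term (assumed layout:
    headword, empty pinyin field, then all rankings as the definition)."""
    body = "; ".join(
        f"{corpus}: #{rank}" for corpus, rank in sorted(ranks.items())
    )
    return f"{term}\t\t{body}"


if __name__ == "__main__":
    # Hypothetical sample data, not real corpus rankings.
    sample = {
        "news": ["的", "是"],
        "literature": ["是", "的", "我"],
    }
    for term, ranks in merge_rankings(sample).items():
        print(format_entry(term, ranks))
```

Because every term ends up with a single combined line, importing another corpus later only means adding one more key to `sample` and regenerating the entries.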

Regards, later,

Shun
 

leguan

探花
The next part (generating flashcards from the indexed word-sentence list) I expect to be a lot less work for the computer than the indexing, though it's trickier to program.

Yes, I believe you will find this to be the case. Certainly the VBA code completes execution in just a few minutes compared to the hours it takes to perform the indexing.

As I have been away from Japan for the last 10 days, I haven't been able to attend to your request for some of the Excel intermediate files, but I will try to look into this tomorrow now that I am back in Japan.
 

Shun

状元
Yes, I believe you will find this to be the case. Certainly the VBA code completes execution in just a few minutes compared to the hours it takes to perform the indexing.

As I have been away from Japan for the last 10 days, I haven't been able to attend to your request for some of the Excel intermediate files, but I will try to look into this tomorrow now that I am back in Japan.

Many thanks! That would be great.
 