help: act in haste, repent in leisure.

pdwalker

状元
Never try to organize things at 4am. Nothing good ever happens at that time.

In a fit of stupidity, I was trying to clean out my user dictionary when I deleted entries I shouldn't have. Now I have a bunch of orphaned flash cards that map to a dictionary entry that "appears to be missing or unavailable".

What I am trying to do is correct these flash cards, recreate or relink them to a dictionary entry (user or otherwise), and keep them without losing my flashcard scores.

Searching for "incomplete entries" will turn up the problem cards.

If I reimport the cards with "add to user dict" enabled, they don't get re-added to the user dictionary

If I edit the flash card and try to "Change dictionary entry", no suitable entries appear.

If I edit the flashcard and try to "edit user dictionary entry", I can refill in all the details, save it, and then the flashcard loses the text it had, the user dictionary entry doesn't appear to get saved, but the blank flash card remains with a score.

At this point, anything I try further just makes things worse, so I'm hoping for some advice from someone who may have experienced something similar.

And no, I didn't back things up before I made my sweeping changes. It was 4am stupidity. Yes, I know I shouldn't have done that.

Any suggestions?

Much appreciated.
 

mikelove

皇帝
Staff member
Set the 'duplicate entries' behavior in the Import Cards screen to "Update Text" or "Update + Merge" - that will update the text of your existing cards with the text from the import file. (by default we don't do that since it's a potentially destructive behavior - we don't want to obliterate carefully constructed flashcards on an import)
 

pdwalker

状元
Hi Mike,

Thanks for taking the time to reply.

That seems to be working, not perfectly as some entries are still weird, but it's definitely making a difference. I'll see how many I can get fixed this way.

Next thing: My user dictionary has lots of duplicate entries. I have no idea how that happened. I've often done an import (+ create dict entries), checked the cards and then undid the import if it wasn't perfect.

Exporting it as text, and I see things like:
三年前 three years before
三年前 three years before
三年前 three years before

This is what I was trying to clean up.

Do you have any thoughts on the best way to remove duplicates from a user dict, or should I just do it, and then reimport my flash cards as above to relink the flash cards?

Much appreciated.

[edit: after reimporting all the flash cards I created (at least I kept those), I've gone from 140 funky disassociated flash cards to 89. An improvement.]
 

pdwalker

状元
note: that's not "find duplicates" in the flashcards (which works well and is helping me clean up my mess considerably)
 

mikelove

皇帝
Staff member
At the moment the best bet would probably be to just do it.

A possibly more complicated alternative would be to back up your flashcard and user dictionary databases, export the flashcards to a text file *with definitions included* and the user dictionary entries to another text file, delete both databases, edit the user dictionary text file to give it a category header like //My User Dictionary Entries, import both files into flashcards, create a new user dictionary, then go into that My User Dictionary Entries category of flashcards, go into Edit / Batch and tap on the command to convert custom definitions to user dictionary entries.
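The "give it a category header" step can be scripted rather than done by hand. This is only a rough sketch: it assumes the exported user dictionary is a plain UTF-8 text file with one tab-separated entry per line, and the function name and sample entry are made up for illustration.

```python
def add_category_header(export_text: str,
                        category: str = "My User Dictionary Entries") -> str:
    """Prepend a //Category line unless the text already starts with it."""
    header = "//" + category
    lines = export_text.splitlines()
    if lines and lines[0].strip() == header:
        return export_text  # header already present, leave the text alone
    return "\n".join([header] + lines) + "\n"


# Hypothetical sample entry (headword <tab> pinyin <tab> definition).
sample = "三年前\tsan1 nian2 qian2\tthree years before\n"
print(add_category_header(sample), end="")
```

Running it twice on the same text does not add a second header, so it should be safe to rerun while testing an import.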

In 4.0 we're merging user dictionaries with flashcards (we'll still have a way to package up separate user dictionaries too but the intent is that those will mainly be for distribution/sharing and not for live editing) so this will all get a lot easier then.
 

pdwalker

状元
Thank you,

I will be trying this on another device this weekend at the latest. I'll let you know how it works.
 

pdwalker

状元
still working on this, and I now know why everything got messed up so badly in the first place.

essentially, what I have done, am doing and will do is:
  1. export, cleanup, and recreate my user dictionary
  2. export all my flash cards as xml
  3. clean up all my flashcards in the xml file (OCD dialed to 11)
  4. reimport the xml flashcards file with no categories to reestablish the scores
  5. reimport my original flash cards with categories and merge.
  6. post cleanup - finding any cards that do not have a dictionary entry, cleaning them up and adding them to the user dictionary
That should leave me with a good user dict, clear of any duplicates, good flashcards with scores preserved and no duplicates, and all flash cards linked to proper dictionary entries, or user dictionary entries.

I'm bouncing between 3 and 4 at the moment as I test my cleanup process

longer post to follow
 
Exporting it as text, and I see things like:
三年前 three years before
三年前 three years before
三年前 three years before
If you have Excel, exporting in CSV format, importing into Excel, and using the "Remove Duplicates" feature (then saving and reimporting) can be useful.
Also FreeOffice has a similar option.
In my opinion.
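If a spreadsheet is not handy, the same exact-duplicate removal can be sketched in a few lines of Python. This assumes the export is plain text with one entry per line; the function name is just for illustration, and it only catches lines that are identical after trimming whitespace.

```python
def dedupe_lines(text: str) -> str:
    """Keep the first copy of each non-blank line, preserving order."""
    seen = set()
    kept = []
    for line in text.splitlines():
        key = line.strip()
        if key and key in seen:
            continue  # an identical entry was already kept
        seen.add(key)
        kept.append(line)
    return "\n".join(kept) + "\n"


export = (
    "三年前\tthree years before\n"
    "三年前\tthree years before\n"
    "三年前\tthree years before\n"
)
print(dedupe_lines(export), end="")
```

Keeping the first occurrence and the original order means the cleaned file still diffs cleanly against the export.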
 

pdwalker

状元
Deduping entries is actually the easiest part.

The hard part is working out everything else while trying to preserve the integrity of the new user dictionary and keeping my flash card scores, and categories intact

For that, I must use the xml export file, and that one is not easily loaded into Excel. Vim and sed are my go-to tools for this job.

I'll document what I've done and why when I'm finished in case some other OCD ridden individual wants to do something similar.
 

pdwalker

状元
Done. And I must say, the discovery process was very time intensive.

In the end, this is what I did:
  1. exported all my flashcards with scores to an xml file
  2. I cleaned out everything except the Putonghua headwords and the scores from that file
    • I took the opportunity to clean out any scores that didn't belong, and flashcards I no longer needed
    • I sorted and deduped the file to remove any duplicates
    • Lastly, cleaned up any definitions, and made everything consistent
  3. Almost all the flashcards I created started out as text files, containing the categories, that I imported. While I cleaned up the xml file, I also cleaned up those source text files so they would match.
  4. On a test device, I cleaned out everything: user dicts, all flashcards, all score files (except default, which I reset)
  5. I imported the xml file containing the cards and scores
    • import settings: prefer dict, do not add to user dict, fill in missing entries, ambiguous entries first, duplicate prompt (I did have some legitimate duplicates)
  6. Next I ran searches to check for and correct:
    • incomplete entries
    • duplicates
    • cards not connected to a dict, select these entries and convert to a user dictionary
  7. Then I imported my text file containing all my cards with categories.
    • this recreated and reassigned my flash cards to the correct categories
    • my tests lose their category entries so I have to reselect them for my tests
    • also, my categories needed to be sorted manually
  8. When I was happy that it was working correctly, I redid the entire process on my main device.
I guess that I only lost about 20ish flashcards in the process, or the scores for those flashcards - say, 1%? That's not too bad, considering how badly things were messed up before I started this cleanup.

There were some other little gotchas that I've not documented, and there may be a way to do it more efficiently than I did, but by this point, I was sick to death of this and just wanted it finished, even if it took slightly longer.
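For step 3, checking that the cleaned card list and the source text files "match" can be done mechanically. A minimal sketch, assuming both sides have been reduced to tab-separated text with the headword in the first column and category lines starting with "//" (the xml export itself is not parsed here, since its format is undocumented; the function names and sample data are hypothetical):

```python
def headwords(text: str) -> set:
    """Collect first-column headwords, skipping blanks and //category lines."""
    words = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("//"):
            continue
        words.add(line.split("\t")[0])
    return words


def compare(cleaned: str, source: str):
    """Return (only in cleaned, only in source) as sorted lists."""
    a, b = headwords(cleaned), headwords(source)
    return sorted(a - b), sorted(b - a)


cleaned = "三年前\tsan1 nian2 qian2\tthree years before\n"
source = (
    "//Lesson 1\n"
    "三年前\tsan1 nian2 qian2\tthree years before\n"
    "给你介绍\tgei3 ni3 jie4shao4\tto introduce to you\n"
)
only_cleaned, only_source = compare(cleaned, source)
print("only in cleaned:", only_cleaned)
print("only in source:", only_source)
```

Anything listed on either side is a card that would lose its category or its score on the reimport, so it is worth fixing before touching the device.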


Things I learned:

1/ When importing new flash cards from a text file, DO NOT IMPORT INTO THE USER DICTIONARY. This is where I started going wrong. I ended up putting a bunch of duplicate and almost-duplicate entries into my dictionary, making a mess in the first place. Bad idea. The "undo import" (great feature, by the way) does not undo user dict entries. I never really thought about it, so I didn't realize the problem I was creating as I would often import, undo, import, undo several times before I was happy with the results.

2/ Saving your import text files is useful, especially if you keep them updated

3/ The import/export flashcard in xml format was a complete lifesaver. Yes, I know the format is undocumented, but it was easy enough to figure out and the format was very flexible and forgiving which allowed me to clean things up and preserve my scores (this was the one thing I was worried about losing the most).

4/ The flashcard search is wonderful. Searching for incomplete and duplicate entries really helped to clean things up.

5/ In the future, when I import new flash card lists, I will always import them to a spare device and get everything cleaned up before even considering doing it on my main device.
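The "almost duplicate" entries from lesson 1/ are harder to spot than exact ones, because a plain dedupe keeps them. A rough sketch of a report that groups a user dictionary export by headword and lists any headword with more than one distinct pinyin/definition pair; it assumes tab-separated lines of headword, pinyin, definition, and the sample data is invented:

```python
from collections import defaultdict


def near_duplicates(text: str) -> dict:
    """Map each headword to its variants when it has more than one."""
    variants = defaultdict(set)
    for line in text.splitlines():
        fields = [f.strip() for f in line.split("\t")]
        if len(fields) < 3 or fields[0].startswith("//"):
            continue  # skip category headers and incomplete lines
        variants[fields[0]].add((fields[1], fields[2]))
    return {hw: sorted(v) for hw, v in variants.items() if len(v) > 1}


export = (
    "三年前\tsan1 nian2 qian2\tthree years before\n"
    "三年前\tsan1 nian2 qian2\tthree years ago\n"
    "给你介绍\tgei3 ni3 jie4shao4\tto introduce to you\n"
    "给你介绍\tgei3 ni3 jie4shao4\tto introduce to you\n"
)
for headword, entries in near_duplicates(export).items():
    print(headword, entries)
```

Exact repeats collapse into one variant and are not reported, so this only surfaces the entries that need a human decision about which definition to keep.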


Other observations:

1/ I cannot search for flashcards without a category. Sure, I can list the "Uncategorized" cards, but I'd also like to include that in the searches in case I want to limit what I am searching for. Unfortunately, that is not an option on my phone.

2/ Mapping a flash card to a user dictionary entry doesn't always work, even when the user dictionary entry was created from that flash card.

For example, I had this in a text file that I imported to a flash card:
太...了 <tab> tai4...le5 <tab> too <something>
给你介绍 <tab> gei3 ni3 jie4shao4 <tab> to introduce to you (give you an introduction to...)

Next, these cards would be converted to a user dictionary entry (and the flash card linked to it - no problem there).

Next, I cleared out all the flashcards and reimported them, using the "prefer dict" option. When I did so, the first flash card would not be linked to the user dictionary, but the second would be. Attempting to link the first to the user dict would fail because it could not find that entry. The second one would link properly.

If I added the first into the user dict, then I'd have duplicate entries in the user dict database

The only thing that the cards showing this problem had in common was using "..." (three dots/periods/full stops) in the Mandarin headword.
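Since the common factor was "..." in the headword, the affected cards can be listed from the import text file before importing. A small sketch, assuming tab-separated lines with the headword first; the function name is made up, and it only looks for three literal periods, as in the cards above:

```python
def ellipsis_headwords(text: str) -> list:
    """Return headwords (first column) that contain three periods."""
    flagged = []
    for line in text.splitlines():
        headword = line.split("\t")[0].strip()
        if "..." in headword:
            flagged.append(headword)
    return flagged


cards = (
    "太...了\ttai4...le5\ttoo <something>\n"
    "给你介绍\tgei3 ni3 jie4shao4\tto introduce to you (give you an introduction to...)\n"
)
print(ellipsis_headwords(cards))
```

Only the first column is checked, so the "..." inside the second card's definition does not cause a false hit.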


Finally. 99% of you won't ever have any need of this. I've documented it in case you're one of those hard core, techie, OCD ridden nerds who have need of it. Good luck if you do!
 

mikelove

皇帝
Staff member
1/ I cannot search for flashcards without a category. Sure, I can list the "Uncategorized" cards, but I'd also like to include that in the searches in case I want to limit what I am searching for. Unfortunately, that is not an option on my phone.

That's odd, it certainly should be - so you can't "AND" your other search terms with category Uncategorized?
 

pdwalker

状元
Category "Uncategorized" doesn't show up on my phone and ipad air, either under "Search for Category" or "Search for Exact Category".

Also, just checked my wife's phone and it's the same, and I've done nothing at all with her settings unlike my devices.

Need screenshots?

I've an ipad 3 in the office which I will check tomorrow to see if it shows anything different.
 