Pleco using ~8GB storage despite minimal visible data

kasim

举人
Hi Pleco team,

I’m trying to understand why Pleco is taking up such a large amount of storage on my iPhone.

Currently, the app shows over 8 GB under “Documents & Data”, making it by far the largest app on my phone, which is quite surprising for a dictionary application.

What confuses me is that the actual visible files (flashcards, dictionaries, and a few PDFs) do not come anywhere close to that size. Based on what I can see, I would expect perhaps a few hundred megabytes at most.

To troubleshoot, I have already performed the following steps inside Pleco:

- Removed orphaned items
- Compacted databases
- Rebuilt database indexes
- Refreshed derived fields
- Cleared search history
- Cleared documents and bookmarks

After all of this, the storage usage decreased by only about 300 MB, and the app is still taking up over 8 GB.

At this point, I genuinely cannot account for where this data is coming from.

Could you please clarify:

- What exactly is included in Pleco’s “Documents & Data” on iOS?
- Are there hidden caches, indexes, OCR data, audio packs, or internal databases that are not visible in the file list?
- Is there any way to see a detailed storage breakdown within the app?
- Is there a safe way to clear unnecessary data without losing flashcards and user content?

Right now, this level of storage usage seems excessive and difficult to justify, especially given the actual content stored in the app.

I would really appreciate a detailed explanation or guidance on how to reduce this.

Best regards,
Kasim
 

mikelove

皇帝
Staff member
I don’t suppose you have any particularly enormous user dictionary files? Or, say, hundreds of thousands of flashcards? That wouldn’t be 8 GB for any individual dictionary, but it could be quite large - there are a few user dictionaries floating around (BKRS for example) with half a million entries or more in them, and a dictionary that size plus a fast full-text index of it can legitimately run as big as 1 GB.

Other than that, my best guess is that this is some sort of a virtual memory bug - we do allocate large amounts of temporary memory in certain circumstances, like a database import, and it’s possible the system might have counted that as long-term disk space usage instead of short-term virtual memory usage. Did you try rebooting your phone? Normally that ought to be enough to reclaim the space.
 

kasim

举人
Hi,

Thanks for your response.

To clarify my setup more precisely, the largest user-side items I currently have are:

- bkrs.txt (~33.2 MB)
- dictionary-import-file.txt (~45.9 MB)
- GPT-240420.pqb (GPT examples dictionary, ~56.5 MB)
- plus a few small text files (radicals, RU-EN, etc.) and a couple of small PDFs

In addition, I have:
- all dictionaries available within Pleco downloaded
- HSK 3.0 flashcards downloaded from here

Previously, I also had multiple “Pre-Schema-Migrate” backup files, but I have now removed the older ones, which freed roughly 1 GB.

Despite this cleanup, Pleco still reports approximately:

- ~7.5 GB in user data (Documents & Data)
- plus the app size itself on top of that

This still appears significantly higher than expected given the actual visible data. Even accounting for dictionary databases and indexing, the current footprint seems disproportionately large.
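To put a number on that gap, here is a rough tally in Python using only the sizes listed above. It is a back-of-the-envelope sketch: the small text files, the PDFs, and the downloaded Pleco dictionaries are left out because their sizes are not given here, and the decimal conversion (1 GB = 1000 MB) is an assumption.

```python
# Sizes of the largest user-side files, in MB, as listed in this post.
visible_mb = {
    "bkrs.txt": 33.2,
    "dictionary-import-file.txt": 45.9,
    "GPT-240420.pqb": 56.5,
}

reported_gb = 7.5   # "Documents & Data" figure reported for Pleco
MB_PER_GB = 1000    # assumption: decimal units

visible_total_mb = sum(visible_mb.values())
unaccounted_gb = reported_gb - visible_total_mb / MB_PER_GB
visible_share = visible_total_mb / (reported_gb * MB_PER_GB)

print(f"Visible large files: {visible_total_mb:.1f} MB")
print(f"Not explained by those files: {unaccounted_gb:.2f} GB")
print(f"Share of reported usage: {visible_share:.1%}")
```

The three files add up to about 135.6 MB, so they explain only a small fraction of the reported figure. The rest would have to come from items not listed here, such as the downloaded dictionaries and any indexes built on top of them.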

I have also:
- run all maintenance options (remove orphaned items, compact databases, rebuild indexes, etc.)
- restarted the device

but the reported storage usage remains unchanged.

At this point, I am trying to understand whether:
- large internal indexes or caches (e.g. full-text search indexes for multiple dictionaries) could realistically account for several GB
- or whether this might be related to the virtual memory / temporary allocation issue you mentioned

However, since a reboot did not reclaim the space, it does not appear to be transient.

Could you advise whether there is any way to:
- inspect or quantify internal storage (e.g. per-dictionary index size)
- clear or rebuild those components specifically
- or whether a full reinstall is currently the only way to reset this usage
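For illustration, this is the kind of breakdown I have in mind, sketched in Python. It assumes a copy of the app's files is available in an ordinary folder on a computer, which may not be possible in practice. The function name `size_by_top_level` is made up for this example and nothing here reflects how Pleco actually lays out its data.

```python
import os
from collections import defaultdict


def size_by_top_level(root: str) -> dict[str, int]:
    """Sum file sizes in bytes under each top-level entry of `root`.

    Files directly inside `root` are grouped under the key ".".
    """
    totals: dict[str, int] = defaultdict(int)
    for dirpath, _dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        top = "." if rel == "." else rel.split(os.sep)[0]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # skip symlinks so nothing is counted twice
            totals[top] += os.path.getsize(path)
    return dict(totals)


def print_report(root: str) -> None:
    """Print the top-level entries of `root`, largest first, in MB."""
    totals = size_by_top_level(root)
    for name, size in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{size / 1_000_000:10.1f} MB  {name}")
```

Calling `print_report` on such a folder would list each top-level directory with its total size, which is roughly the per-component view I am asking about.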

Right now, the reported ~7.5 GB of user data seems difficult to reconcile with the actual files present.

Best regards,
Kasim
 

mikelove

皇帝
Staff member
There isn't currently a system to quantify internal storage use for a specific dictionary, though we could look at adding one. "Rebuild indexes" should erase and rebuild all of the index files for all of the user dictionaries, so if that doesn't reset it there isn't anything else that would.

Could you send me backups of all of your user dictionary databases? (backup command in Settings / Manage Dictionaries) We can try them out on a system here and see how big their indexes are supposed to get. As I said, BKRS can run as big as 1 GB with indexes, so it's not totally inconceivable that BKRS plus two other dictionaries larger than BKRS plus every available Pleco dictionary could start to get close to that 7.5 GB figure. If it turns out that they're a lot lower we can then try to figure out what else might be causing this.

In general, we do have pretty large indexes, and the reason for that is that we need the speed boost they provide in order to make searching/merging/sorting entries from dozens of dictionaries at once work efficiently. Since even low-end iPhones ship with 256 GB of storage these days, we generally optimize around performance rather than storage space, and so we would not necessarily deem this level of disk usage with this many large user dictionaries a problem.

However, it would be quite easy to add options to not index some parts of user dictionaries - if, say, you're only interested in searching your example database by simplified characters and don't need traditional or pinyin or full-text - so if it does turn out that the 7.5 GB is not reflective of some other bug, we can certainly consider adding some options like that.
 

mikelove

皇帝
Staff member
One other thing to note here: part of the reason our indexes are so big is that they're sparse; compressed, they take up around 1/4 as much space as they do uncompressed, and iPhones have been compressing files in their internal storage for like a decade now. So while the file size might be listed as 7.5 GB, the actual impact on your available storage should be considerably lower than that (unlike, say, a video file, which is already compressed and can't be compressed much more).
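To illustrate the general point about sparse data, here is a small Python sketch. It is not our index format and not the compression iOS uses: `zlib` is just a stand-in, and the 5% fill ratio and buffer size are arbitrary assumptions. It compares a mostly-empty buffer with a buffer of random bytes, the latter playing the role of already-compressed data like video.

```python
import random
import zlib


def make_sparse_block(size: int, fill_ratio: float, seed: int = 0) -> bytes:
    """Return `size` bytes that are zero except for a few random non-zero entries."""
    rng = random.Random(seed)
    block = bytearray(size)  # starts out as all zero bytes
    for _ in range(int(size * fill_ratio)):
        block[rng.randrange(size)] = rng.randrange(1, 256)
    return bytes(block)


def make_dense_block(size: int, seed: int = 1) -> bytes:
    """Return `size` random bytes, standing in for already-compressed data."""
    rng = random.Random(seed)
    return bytes(rng.randrange(256) for _ in range(size))


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (lower means more compressible)."""
    return len(zlib.compress(data)) / len(data)


SIZE = 200_000
sparse = make_sparse_block(SIZE, fill_ratio=0.05)
dense = make_dense_block(SIZE)

print(f"sparse buffer compresses to {compression_ratio(sparse):.0%} of its size")
print(f"dense buffer compresses to {compression_ratio(dense):.0%} of its size")
```

The sparse buffer shrinks to a small fraction of its original size, while the random one barely shrinks at all. The exact ratios depend on the data and the compressor, so the numbers printed here should not be read as a measurement of real index files.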
 