Very large lexicon folder
-
hucker last edited by
Using Opera M2 for email and news (one of each account, mail is POP3), version 1.0, build 1044.
I can find very little info on this, only that the lexicon folder should be the same size as or smaller than the store folder. My store was 2GB and my lexicon about 8GB, covering 6 months of emails and news. I closed M2, deleted all store files older than about 1 month (I just deleted everything pre-November), then deleted the lexicon folder. I restarted M2 and it created a new lexicon folder and was usable immediately, but it kept making larger and larger lexicon files, getting up to 8GB again, even though the mail store is now only 1.5GB. Why is it so big?
I'm now trying to find the larger unnecessary files. I think some binary posts were made in a text group, but shouldn't Opera handle binaries? Perhaps yEnc encoding messes up the indexer?
-
hucker last edited by
I think I sorted this myself. I deleted many very large news files (300KB to 3.5MB each, over a GB in total) from within the store folder, using TreeSize to help find them. The lexicon was then recreated at 60% of the size of the store, instead of 600%. I'm assuming Opera was trying to index yEnc attachments (which it can't understand) as text, whereas it successfully ignores MIME and uuencoded attachments.
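For anyone without TreeSize, the same search can be roughly sketched in Python. This is only an illustration: the function name `find_large_files` is made up, the default threshold is just the 300KB figure from above, and you have to point it at wherever your own store folder lives.

```python
import os

def find_large_files(root, min_bytes=300 * 1024):
    """Return (size, path) pairs for files of at least min_bytes, largest first."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if size >= min_bytes:
                hits.append((size, path))
    # Tuples sort by size first, so reverse=True puts the largest files on top.
    return sorted(hits, reverse=True)

# Example call (the path is a placeholder, not a real Opera location):
# for size, path in find_large_files(r"C:\path\to\store"):
#     print(f"{size / 1024:10.0f} KB  {path}")
```

Close M2 before deleting anything it finds, and delete the lexicon folder afterwards so the index is rebuilt.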
I do wish someone would continue M2 development
-
burnout426 Volunteer last edited by
@hucker said in Very large lexicon folder:
I'm assuming Opera was trying to index yenc attachments (which it can't understand) as text, whereas it successfully ignores mime and uu attachments.
Not sure, but that's a possibility. I do not have technical details for how Opera processes messages to build the lexicon files. The lexicon folder holds the search index: Opera splits messages up into words to build an index for fast searching. It could be as big as all the words in all the messages, plus the overhead of storing all the info in the binary format that's used. There may be even more to it than that, so I don't know if there's any general rule about the size of the lexicon folder compared to the store folder. Deleting the lexicon folder just causes Opera to rebuild that search index from all the messages again.
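As a rough illustration of the general idea of a word index (this is not Opera's actual lexicon format, which I don't have details for), here is a toy version in Python. The `build_index` name and the sample messages are made up for the example.

```python
import re
from collections import defaultdict

def build_index(messages):
    """Map each lowercase word to the set of message ids that contain it."""
    index = defaultdict(set)
    for msg_id, text in messages.items():
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(msg_id)
    return index

messages = {
    1: "Lexicon folder is very large",
    2: "Store folder is small",
}
index = build_index(messages)
print(sorted(index["folder"]))  # ids of the messages containing "folder"
```

Every distinct "word" gets its own entry, so text that produces lots of unique tokens would make an index like this grow quickly.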
-
hucker last edited by hucker
@burnout426 Reducing the NUMBER of files in store to a third had little effect on the size of the rebuilt lexicon. Deleting the few large files in store changed the lexicon size dramatically, from several times larger than store to half the size of store. I think it must have been making a bad attempt at finding words in yEnc-encoded data, which would essentially be one 3 million character word, or thousands of very long ones. I used to have binary groups in Opera and it usually coped. I think yEnc just confuses it.
I don't really care if there's an 8GB store. The problem is the huge amount of random disk access while it's messing about doing whatever it's doing, which puts disk usage at 100% continuously and slows everything else down. This machine has a rotary drive, but the last machine I had Opera on had an SSD and suffered a similar problem. I don't like the possibility of wearing out either type of drive, which I'm sure would happen if Opera carried on at 100% usage 24/7.
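A quick way to test that guess on a suspect file, sketched in Python. It assumes only that a yEnc part starts with a line beginning `=ybegin `, which is how the format marks its header; the function names are made up, and this says nothing about what Opera's indexer really does with the data.

```python
def looks_like_yenc(path):
    """Return True if any line in the file starts with the yEnc header marker."""
    with open(path, "rb") as f:
        for line in f:
            if line.startswith(b"=ybegin "):
                return True
    return False

def longest_token_length(path):
    """Length in bytes of the longest whitespace-delimited run in the file."""
    with open(path, "rb") as f:
        tokens = f.read().split()  # bytes.split() splits on ASCII whitespace
    return max((len(t) for t in tokens), default=0)
```

If the big files come back as yEnc with tokens thousands of bytes long, that would fit the idea that the indexer is treating encoded data as giant words.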