Sat, 18 Dec 2010

Archiving E-mails

When you don't use Gmail - or used e-mail before it existed

Over the years I’ve used different E-mail clients and their export functions to back up my E-mails. This, of course, resulted in a huge mess of files with different formats and backups that contain the same E-mails over and over again. So how should I organise all those messages in a coherent and useful way?

MailStore Home seems to be part of the solution. It can import E-mails from all major file formats and applications directly and removes duplicates automatically. So it’s easy to consolidate all E-mails from a few years back into MailStore itself. Of course it allows you to search and view the imported messages, which is great, but there is one drawback: All your E-mail are belong to MailStore. And I don’t really like that. The database format of MailStore seems to be proprietary and the export functionality only allows exporting as .eml or .msg files. That means: thousands of files named message-1-1.eml, message-1-2.eml, ... in one directory. I’ll admit that this is coherent, but for me that doesn’t qualify as “useful”.

However, there is a simple solution for that problem too: ImportExportTools for Mozilla Thunderbird (yep, that’s what I’m using, sorry). Just import all those .eml files into Thunderbird, organise them further if needed, and use ImportExportTools again to export them as standard MBOX files.
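
For anyone who would rather script the .eml-to-MBOX step than drag files around, here is a minimal sketch using Python's standard `mailbox` module. It is not what ImportExportTools does internally, just one possible way to get the same kind of result. The function name `eml_dir_to_mbox` and the paths in the usage line are made up for illustration.

```python
import email
import mailbox
from pathlib import Path


def eml_dir_to_mbox(eml_dir, mbox_path):
    """Append every .eml file in eml_dir to an mbox file and return the count.

    Hypothetical helper for illustration; the mbox file is created if missing.
    """
    box = mailbox.mbox(str(mbox_path))
    count = 0
    try:
        # Sort so the messages land in the mbox in a predictable order.
        for eml in sorted(Path(eml_dir).glob("*.eml")):
            with open(eml, "rb") as fh:
                msg = email.message_from_binary_file(fh)
            box.add(msg)
            count += 1
    finally:
        # close() flushes pending changes to disk.
        box.close()
    return count
```

Usage would look something like `eml_dir_to_mbox("export/2009", "archive-2009.mbox")`. Running it twice on the same directory appends everything again, so it does no duplicate checking of its own.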

Taking thousands of E-mails in different and redundant files, running them through the process described above and ending up with clean MBOX files qualifies as coherent and useful in my book. So this is what I will do over the next few days whenever I’m motivated enough to drag files around. When I’m done I’ll post a short update to let you know if everything went according to plan.


Importing MBOX files into MailStore worked as expected. When importing .eml files, however, no dupe-check is performed, so I ended up with 3 years of duplicated E-mails. Luckily those years were low volume, so manual deletion was relatively painless. Still, an option to search for and eliminate duplicated messages (based on the message ID) would be very helpful. That such a function is missing is a head-scratcher for me.
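
A duplicate check based on the message ID can be approximated outside MailStore before importing. The following is a rough sketch, assuming the duplicates sit as .eml files in one directory; `find_duplicate_emls` is a hypothetical name, and messages without a `Message-ID` header are simply left alone because there is nothing to compare them by.

```python
import email
from pathlib import Path


def find_duplicate_emls(eml_dir):
    """Return paths of .eml files whose Message-ID was already seen.

    Hypothetical helper: the first file with a given ID is kept, and later
    files with the same ID are reported as duplicates.
    """
    seen = set()
    duplicates = []
    for eml in sorted(Path(eml_dir).glob("*.eml")):
        with open(eml, "rb") as fh:
            msg = email.message_from_binary_file(fh)
        msg_id = str(msg.get("Message-ID", "")).strip()
        if not msg_id:
            continue  # no ID to compare, so never treat it as a duplicate
        if msg_id in seen:
            duplicates.append(eml)
        else:
            seen.add(msg_id)
    return duplicates
```

The function only reports the duplicates and deletes nothing, so the list can be checked by hand first.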

Creating searches in MailStore to collect E-mails by year from all the different sources was simple, and so was exporting the resulting messages as .eml files. Even for years that contained more than 14000 messages everything worked perfectly. There was not even a slow-down, even though they are all exported into one single directory (on an NTFS partition, if you are wondering).
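
The by-year grouping could also be done on the exported .eml files themselves. This is only a sketch of the idea, not how MailStore's searches work: it reads each message's `Date` header and sorts the file paths into buckets per year. The name `group_emls_by_year` is invented for the example, and messages with a missing or unparseable date end up under the key `None`.

```python
import email
from collections import defaultdict
from email.utils import parsedate_to_datetime
from pathlib import Path


def group_emls_by_year(eml_dir):
    """Map each year (or None if the date is unusable) to a list of .eml paths."""
    by_year = defaultdict(list)
    for eml in sorted(Path(eml_dir).glob("*.eml")):
        with open(eml, "rb") as fh:
            msg = email.message_from_binary_file(fh)
        try:
            year = parsedate_to_datetime(str(msg.get("Date", ""))).year
        except (TypeError, ValueError):
            # Missing or malformed Date header.
            year = None
        by_year[year].append(eml)
    return dict(by_year)
```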

Finally, the import into Thunderbird was equally painless. Just like exporting from MailStore, it took less than a minute to import the messages into Thunderbird’s Archive folder.

Now I have nearly 15 years of archaic E-mails conveniently archived and backed up. However, 9 months of E-mails are missing because of a damaged ZIP file. Unfortunately, back then I didn’t care much for redundancy in back-ups. Stupid me. Another 15 months are missing because of a RAID that got destroyed. Why I didn’t back up my mails during that time, I don’t know. But a number of years have passed since then and my back-up/archiving concepts have changed. They are much more bullet-proof now than they ever were. That is, until something unforeseen happens….
