Needle in a Haystack
Conclusions
The script presented here does not spare you the unavoidable task of manually separating the wheat from the chaff. If you run the script on a large mailbox, the result will be many files with either cryptic or very similar names (Figure 3). In the latter case, finding out which file is the true final (or initial) version requires manual examination.
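One way to shrink the pile before examining it by hand is to find files that are byte-for-byte identical, since the same attachment is often sent or forwarded several times. The following is a minimal Python sketch, not part of the script presented here; the function name `group_identical_files` and the idea of comparing SHA-256 digests are assumptions made for illustration. It cannot tell you which of several *different* versions is the final one; that still takes manual inspection.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def group_identical_files(root):
    """Return lists of paths under root whose contents are identical.

    Hypothetical helper: files are compared by the SHA-256 digest of
    their full contents, so each file is read into memory once.
    """
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            groups[digest].append(path)
    # Keep only digests shared by more than one file
    return [paths for paths in groups.values() if len(paths) > 1]
```

Each returned list holds copies of the same content, so you can keep one file from every list and review only the rest of the directory by hand.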
Also, be prepared to fix permissions and ownership of files manually. By default, email folder and file permissions on Linux are set to 600, which means "readable and writable only by the owner." Depending on how you configure the script, many of the files it extracts will have the same permissions, which may or may not be what you want.
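If the extracted files should be readable by other users, the permissions can be fixed in bulk rather than file by file. The sketch below is one possible approach in Python, not part of the extraction script itself; the function name `relax_permissions` and the default modes 644 (files) and 755 (directories) are assumptions, so adjust them to your own policy. It leaves the top-level directory and any symbolic links untouched and does not change ownership, which normally requires root privileges.

```python
import os
import stat
from pathlib import Path


def relax_permissions(root, file_mode=0o644, dir_mode=0o755):
    """Apply file_mode to files and dir_mode to directories below root.

    Hypothetical helper: returns the list of paths whose mode was changed.
    """
    changed = []
    # Top-down walk: each subdirectory is fixed before os.walk enters it
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = Path(dirpath) / name
            if path.is_symlink():
                continue  # do not follow or modify symbolic links
            wanted = dir_mode if path.is_dir() else file_mode
            current = stat.S_IMODE(path.stat().st_mode)
            if current != wanted:
                path.chmod(wanted)
                changed.append(path)
    return changed
```

Calling `relax_permissions("/path/to/extracted/files")` on the output directory returns the paths it modified, which makes it easy to log or review what was changed.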
Final thought: Some weird combination of character encodings and recursively embedded messages surely exists out there that would make this extraction script fail and require tweaking or other manual work. Unfortunately, there is nothing to be done about this scenario. However, considering that some files from just 15 or 20 years ago are already unreadable, you should be happy that you can still process all email messages ever created without particular problems. This all goes to prove that the best "innovation" is based on simple and really open standards.
Infos
- MIME: https://www.hunnysoft.com/mime/mime-guide.html
- Email format overview: https://wiki2.dovecot.org/MailboxFormat
- Using Mutt as a mailbox converter: https://foolab.org/node/1737
- Save tagged attachments with Mutt: https://unix.stackexchange.com/questions/37218/how-to-really-easily-save-all-tagged-attachments-in-mutt
- procmail: https://unix.stackexchange.com/questions/421433/procmail-save-attachment-with-received-date-in-filename