
Protecting Your Local Data

Posted by Bjarni on November 23, 2016

Everyone understands physical security.

Starting in play-school, we learn that if we don't want the other kids to play with our favourite toys, then steps must be taken. Hopefully most of us also learned to share and play nice, but that's another matter...

Keeping physical objects safe is such a basic skill that it requires no explanation. We all know the importance of keeping our wallets or our phones safe, our jewellery, our keys. We may not do a great job all the time; some of us are absent-minded, and maybe we live in a safe place and don't worry too much - but we all understand the concept.

One of Mailpile's main goals is to bring this kind of intuitive security to our digital lives, to e-mail. If a physical device in your possession stores your mail, then you already know the basics of keeping your e-mail secure, private and safe.

So far, so good!

But what happens when things go wrong? What happens when your Mailpile machine gets lost or stolen? Or dropped on the floor?

The Risks

When something goes wrong with your Mailpile device, there are two bad things that are likely to happen:

  1. You lose access to your Mailpile data
  2. Someone who shouldn't gains access to your Mailpile data

The first case is more common; it can happen when a machine gets lost, gets stolen, gets damaged in fire or flood, or malfunctions of its own accord. Given enough time, it's almost inevitable.

The second case is much less common, but when it happens the consequences can be dire. If your e-mail contains very sensitive material (confidential documents, nude photos, incriminating evidence) your job, your relationships or even your life may be at risk.

So those are the problems. How about some solutions?

Avoiding Data Loss: Backups

Protecting you from losing access to your Mailpile is conceptually simple. You just need backups. That's all.

Famous last words...

Making backups available and easy is actually quite hard, because it is ultimately a physical problem. The data needs to be copied to another device that won't be lost or damaged by the calamity that made you need backups in the first place.

Any local physical solution ultimately relies on user education and user action, which means our less skilled or more absent-minded users will not end up with working backups. This is unacceptable.

A much better solution would be to use the network to automatically store Mailpile users' backups somewhere else. But where? If Mailpile were to directly provide an online backup solution, that would make us stewards of your data (a role we would like to avoid) and would also cost us a bunch of money to do properly - money which the project simply does not have.

We also need to take care that the backups themselves don't become a security risk - we don't want to create a backup solution that accidentally grants unauthorized parties access to your mail!

Our current strategy is simple: we want to encrypt all your data and then upload it right back to one of the IMAP servers you already have access to.

Simple, right? A mere matter of programming (this will take some time).
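As a rough illustration of the idea (not Mailpile's actual code or storage format), an already-encrypted blob can be wrapped so that it looks like an ordinary e-mail, which any IMAP server will then accept. The function names, the subject line and the `X-Backup-Chunk` header are made up for this sketch.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def wrap_backup_blob(ciphertext: bytes, chunk_id: str) -> bytes:
    """Wrap an already-encrypted blob in an e-mail-shaped container."""
    msg = EmailMessage()
    msg["Subject"] = f"Backup chunk {chunk_id}"  # hypothetical subject line
    msg["X-Backup-Chunk"] = chunk_id             # hypothetical header name
    # Binary content is base64-encoded by default, so it survives mail transport.
    msg.set_content(ciphertext, maintype="application", subtype="octet-stream")
    return msg.as_bytes()


def unwrap_backup_blob(raw: bytes) -> bytes:
    """Recover the encrypted blob from a message made by wrap_backup_blob."""
    msg = message_from_bytes(raw, policy=policy.default)
    return msg.get_content()


# Uploading would then be a single IMAP APPEND, for example with the
# standard library's imaplib: conn.append(mailbox, flags, date_time, raw).
```

The server only ever sees ciphertext inside a normal-looking message; decrypting it still requires the keys discussed in the next section.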

Avoiding Data Loss part II: Keys, Keys, Keys

We have a plan for backups! But what about the encryption keys?

If those are lost, all is lost.

If those are stolen, all is stolen.

Currently, Mailpile requires users to log in to their Mailpile with a password (or passphrase). Internally, this passphrase unlocks a file encrypted using GnuPG, a file which contains the actual encryption key used for everything else.

If we trust GnuPG's security (which we do), it should be perfectly fine to store this along with all the other data. Right?

Unfortunately, no. The problem is, our users may not choose strong passwords. No matter how much we try to educate, some will end up using the same password for their Mailpile as they used for the IMAP server we're uploading the backups to. Since the operator of the IMAP server can access that password with relative ease, that would leave the backups exposed.

So this part of the plan requires more thought; the current strategy we are leaning towards is to create an unguessable "recovery key", made up of 9-12 dictionary words. This recovery key will be used to encrypt the keys to the kingdom, before they too are uploaded to the cloud.
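A minimal sketch of how such a recovery key could be generated, assuming a word list is available. The tiny `DEMO_WORDS` list and both function names are hypothetical; a real list would need thousands of words for the key to be unguessable.

```python
import math
import secrets

# Hypothetical demo list; far too small for real use.
DEMO_WORDS = [
    "apple", "river", "stone", "cloud", "tiger", "piano", "lemon", "frost",
    "candle", "harbor", "meadow", "rocket", "violet", "walnut", "zebra", "anchor",
]


def make_recovery_key(wordlist, num_words=12):
    """Pick words with a cryptographically secure random source."""
    if len(set(wordlist)) != len(wordlist):
        raise ValueError("word list must not contain duplicates")
    return " ".join(secrets.choice(wordlist) for _ in range(num_words))


def entropy_bits(wordlist_size, num_words):
    """Entropy of a key made of independently chosen words."""
    return num_words * math.log2(wordlist_size)


print(make_recovery_key(DEMO_WORDS, 9))
```

With a 2048-word list, `entropy_bits(2048, 9)` is 99 bits and `entropy_bits(2048, 12)` is 132 bits, which gives a feel for why the 9-12 word range is plausible.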

Once backups are enabled, the app will offer the user two options:

  1. Ask the user to print out or write down the recovery key, and keep it safe.
  2. Ask the user to nominate a few trusted friends or family members, and e-mail each of them a part of the recovery key.

The first option is for people who don't have any friends. The second option is for everyone else!

When a user needs to restore their Mailpile from backups, they will either fetch the printout from the shoebox under the bed (or from the bank vault), or they will contact their friends asking for help. Either way, it's a relatively simple procedure.

We think... your feedback is most welcome!

There are of course lots of little details; the length of the recovery key, the number of trusted friends who receive key details versus the number of friends required to respond for successful recovery, the wording of the key fragment e-mails. And so on! But I think the basic idea is sound and the implementation should be able to adapt to the needs of both casual users and users with strong security requirements.
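One well-known way to get "any k of n friends can recover the key" is Shamir's secret sharing. The post does not say which scheme Mailpile would use, so the following is only a sketch of that technique, with hypothetical function names and an arbitrarily chosen prime.

```python
import secrets

# A Mersenne prime; large enough for secrets of up to 65 bytes.
PRIME = 2**521 - 1


def split_secret(secret: bytes, shares: int, threshold: int):
    """Split secret into `shares` points; any `threshold` of them recover it."""
    value = int.from_bytes(secret, "big")
    if value >= PRIME:
        raise ValueError("secret too long for this prime")
    if not 1 <= threshold <= shares:
        raise ValueError("need 1 <= threshold <= shares")
    # Random polynomial of degree threshold-1 whose constant term is the secret.
    coeffs = [value] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]

    def evaluate(x):
        result = 0
        for coeff in reversed(coeffs):  # Horner's rule
            result = (result * x + coeff) % PRIME
        return result

    return [(x, evaluate(x)) for x in range(1, shares + 1)]


def recover_secret(points, length: int) -> bytes:
    """Lagrange-interpolate the polynomial at x=0 to get the secret back."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        total = (total + yi * num * pow(den, -1, PRIME)) % PRIME
    return total.to_bytes(length, "big")


# Five friends each get one share; any three can restore the key.
friend_shares = split_secret(b"apple river stone", shares=5, threshold=3)
```

Here `shares` is the number of friends who receive a fragment and `threshold` is the number who must respond, which are exactly the two numbers left open above. Fewer than `threshold` shares reveal nothing about the secret.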

Incidentally, this procedure can also serve as a decentralized "password reset" feature for people who forget their Mailpile password. They just have to remember who their trusted friends were...

Keeping Your Data Private

The final piece of this puzzle is keeping data private.

This is also conceptually simple: we encrypt everything!

Considering that strong cryptography is available "off the shelf" from multiple mature libraries these days, our job is to choose one of the industry-standard algorithms from one of the standard libraries, and make use of it.

The main constraints are performance and reliability. We would like the encryption to not slow things down too much, and we would like the encryption to not increase the odds of data loss.

Given those two constraints, our current preference is to use AES in CTR (Counter) mode, with a SHA256-based MAC to detect data corruption and/or tampering.

Since CTR mode localizes any damage (unlike AES-CBC which we were mistakenly using before), it can be argued that the encryption doesn't reduce reliability. Of course in practice this will depend on what the app does when corruption is detected and how easy (and safe) it is to recover - but as long as the data is mostly intact, our recovery strategies can evolve and improve over time.
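The damage-localizing property can be demonstrated with a toy counter-mode construction. Python's standard library has no AES, so this sketch deliberately swaps in a SHA-256-based keystream as a stand-in; it is not AES-CTR and not Mailpile's code, and all names and keys are made up. What it does show accurately is how a counter-mode stream cipher paired with an HMAC-SHA256 tag behaves when one bit of ciphertext is damaged.

```python
import hashlib
import hmac


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Stand-in keystream: hash of key, nonce and a block counter (NOT AES)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def ctr_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt by XOR with the keystream (same operation both ways)."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, nonce, len(data))))


def seal(enc_key, mac_key, nonce, plaintext):
    """Encrypt, then compute an HMAC-SHA256 tag over nonce and ciphertext."""
    ciphertext = ctr_xor(enc_key, nonce, plaintext)
    tag = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    return ciphertext, tag


def verify(mac_key, nonce, ciphertext, tag) -> bool:
    expected = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)


# Demo values only; real keys and nonces must be random.
enc_key, mac_key, nonce = b"E" * 32, b"M" * 32, b"N" * 16
plaintext = b"Dear Alice, the meeting is at noon. " * 4
ciphertext, tag = seal(enc_key, mac_key, nonce, plaintext)

damaged = bytearray(ciphertext)
damaged[10] ^= 0x01  # flip a single bit on "disk"
recovered = ctr_xor(enc_key, nonce, bytes(damaged))
changed = [i for i, (a, b) in enumerate(zip(plaintext, recovered)) if a != b]
```

After the bit flip, `changed` contains only position 10 and `verify` rejects the damaged ciphertext: the MAC detects the corruption, while everything outside the damaged byte still decrypts correctly.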

An alternative mode, GCM (Galois/Counter Mode), would also be a good candidate, but it's currently not as widely available as CTR mode. We may switch to it in the future.

Another item for the long-term wishlist is use of error correction codes to automatically recover from minor corruption; potentially making Mailpile Encrypted Storage more resilient than normal unencrypted files.

Current State

This is all well and good. But we're not there yet!

Here is a summary of the current state of things in Mailpile and what we expect for version 1.0:

  1. Automated backups are not implemented yet and probably will not be ready by version 1.0. We have however laid the groundwork by defining an on-disk storage format which can be easily uploaded to IMAP servers (since the format looks like e-mail).

  2. Key backups are not implemented, but are a goal for 1.0 due to their importance and the extreme risk of data loss if the keys are lost. Join the discussion here.

  3. Local data encryption is implemented. We are in the process of switching encryption algorithms to AES-CTR.

Thanks for reading!

Search as a Core Feature

Posted by Bjarni Rúnar on October 26, 2016

One of the main differences between Mailpile and most other Free Software e-mail clients has to do with the approach we take to handling e-mail.

The first generation of e-mail clients focused on the e-mail itself and provided mailboxes or folders as places to store it. Organizing your e-mail meant moving it around, from one mailbox to another. This is how most desktop e-mail clients work today.

Mailpile's approach is different. Inspired by Gmail, we decided to make search the central metaphor. Organizing mail in Mailpile then became a matter of labeling messages in such ways that they could easily be searched for; this is how Mailpile's tags work.

Tags are much more flexible and powerful than mailboxes; once the search engine has indexed all your mail it no longer matters in which mailbox or folder the mail is stored since you can access and organize any combination of messages using a mixture of tags and search terms. Searching a well designed index is actually faster (both for the human and the computer) than finding and opening the right mailbox.

So Mailpile is a search engine first and foremost. Most of the other features it has are built on top of that foundation.

What About Mailboxes?

This is all well and good.

But the fact remains that sometimes we need to open a mailbox and look inside; after all, that is where the mail is!

Mailpile has struggled with this from the beginning. Being built around a single search engine meant Mailpile couldn't really do much with the contents of a mailbox until it had all been processed and added to the search index.

This led to usability problems. If Mailpile was given a large number of mailboxes to process it could take quite some time before it got to the one the user was interested in. If the background indexing process had a problem, mail would just never appear. Users coming from traditional mail clients had expectations which Mailpile could not meet. And last but not least, it made troubleshooting very difficult because there were so many layers of code, each introducing potential bugs or delays, that an e-mail had to travel through before it appeared (or failed to appear) in the user interface.

Sometimes you just want to open up a mailbox and look inside, without having to add all the mail to your Mailpile.

Embracing Search

The solution I have found to this problem wasn't to stop treating search as a core feature. It was to embrace it and take it a step further: who says there should only be one way to search?

Many IMAP servers offer search features. We should be able to make use of them. Similarly, searching a raw mailbox is relatively straightforward - it may not be elegant or as fast as Mailpile's native index, but running "grep" or the equivalent very often gives useful results. It turns out there are many ways to search mailboxes that haven't been fully processed and added to the main Mailpile index.
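To give a feel for the "grep or the equivalent" approach, here is a small sketch using Python's standard `mailbox` module. It is not Mailpile's implementation, and `grep_mbox` is a hypothetical name.

```python
import mailbox


def grep_mbox(path, term):
    """Return the subjects of messages whose raw bytes contain `term`.

    Like grep, this looks at the raw message, so text hidden inside
    base64 or quoted-printable encoded parts will not be found.
    """
    needle = term.lower().encode("utf-8")
    box = mailbox.mbox(path, create=False)
    subjects = []
    try:
        for key in box.iterkeys():
            if needle in box.get_bytes(key).lower():
                subjects.append(str(box.get_message(key)["Subject"]))
    finally:
        box.close()
    return subjects
```

Calling `grep_mbox("/path/to/some.mbox", "invoice")` scans every message on each call, which is slow compared to an index but needs no preprocessing at all.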

So Mailpile now internally supports multiple search indexes. At the moment it only searches one at a time, and some search indexes are not very good at searching yet... but the code is elegant and clean, works well and has interesting potential for the future.

Maybe someday we'll have hybrid search, which searches both remote IMAP servers and the local index. Maybe someday we'll be able to pull in results from notmuch or some alternate index.

But for now, at least this approach makes it easy and quick for Mailpile to look inside raw mailboxes. That alone clears a major roadblock out of the way for a 1.0 release.

I am still mulling over how best to expose this in the user interface. At the very least, it will be the default behaviour when accessing mailboxes from the browsing tool. I may also make it accessible via the per-mailbox tags in the sidebar; doing so is likely to match user expectations better than using Mailpile's internal index. But I'm not entirely sure.

I'll be pushing this up for review once I've finished a few more tests.

Rebooting the Mailpile Development Process

Posted by Bjarni Rúnar on September 23, 2016

Hello again!

As warned about in our last blog post, development has been on hiatus for the last few months as I moved to (and from) Iceland, bought and sold a house and helped the Icelandic Pirate Party prepare for the upcoming parliamentary elections. It was, frankly, an exhausting summer and I was sad not to have any time for Mailpile.

But the summer is well and truly over, and I'm back. Mailpile is again one of my top priorities.

I've warmed up by responding to a few issues on GitHub, fixing a couple of simple bugs and reconnecting with people on the #mailpile IRC channel on Freenode.

Taking a break is almost always healthy for a project, as it makes room for new insights and perspective.

I'm still in the process of "rebooting" the development effort, but I've already decided to make at least one change - I will take advantage of GitHub's new code-review tools to get community members to review my own work. This should improve the code quality in Mailpile, while helping spread knowledge and understanding of the code to a wider group of people - over time improving the project's "bus factor". A long-time contributor, Jocelyn Delande, has volunteered to be my first reviewer, and hopefully more people will pitch in over time.

More soon. It's good to be back!

Delegate, Automate, Collaborate, Pirate

Posted by Bjarni Rúnar on April 18, 2016

Avast! Be welcome to this latest irregular Mailpile status update!

In this episode, I will discuss:

  1. A Strategic Spin-Off Project
  2. Deletion and Tag Automation
  3. GnuPG Collaboration
  4. Piracy in Iceland

Progress towards a release has been very slow. This is entirely due to me being busy with other work - things that pay the bills, looking after my lovely baby daughter, buying an apartment and moving to Iceland. I'm swamped!

As I am often exhausted and pressed for time, I have had a hard time sticking to anything resembling a schedule and have basically indulged any vaguely productive impulses, rather than worry about roadmaps.

So if this doesn't look like progress towards a release, you're probably right. I've been very distracted. But it's progress all the same!

A Strategic Spin-Off Project

As mentioned before, Mailpile's desktop integration on Mac OS X and Windows is currently unacceptable and needs a lot of work.

We do have some code, however, and a rough design. In order to encourage people to help out (and maximize the utility of the code we've already written), I spun off the existing GUI code into a separate project: GUI-o-Matic.

This should both lower the barrier to entry and encourage contributions: you no longer need to check out all of Mailpile to hack on the GUI-o-Matic. And because it's a stand-alone utility, it's more likely that other projects will want to make use of it. We hope!

If you've ever wanted an easy way to add a cross-platform desktop graphical user-interface to your code (not just Python!), take a look: GUI-o-Matic is a bit like "dialog" for modern desktop environments.

Old farts will understand.

Deletion and Tag Automation

Did you know the current incarnation of Mailpile cannot actually delete e-mail? It's true. This was actually a deliberate, conservative choice to avoid losing valuable data during development. It was never meant to be permanent, but temporary hacks do tend to outstay their welcome...

In the context of shipping 1.0... well, a mail client isn't really a mail client if it cannot delete mail, is it? More pressingly, a tool which aims to safeguard user privacy has to support the most basic privacy feature of all: deleting unwanted data.

So I decided to (yet again) break the feature freeze and implement message deletion.

This had a knock-on effect. Mailpile's deletion strategy was supposed to be similar to that of other webmail: once things have sat in the Trash for a while, they get deleted automatically. Similarly, messages should automatically move from Spam to Trash after a while and blank drafts should get purged and deleted.

So Mailpile needed a) a way to detect that messages had been untouched for a period of time and b) a way to trigger actions once a) was satisfied for a message carrying a particular tag.

So now Mailpile has exactly that!

The search engine is used to keep a record of when a message's tags were last modified, and each tag now has an automation section which specifies a number of days and an action to perform. A few times a day, Mailpile will search for idle messages in tags with automation enabled and either retag or delete the matching messages.
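The mechanism can be sketched in a few lines. This is an illustration only: the class names, field names and day counts below are invented and do not reflect Mailpile's real configuration or code.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Automation:
    idle_days: int            # how long tags must have been untouched
    move_to: Optional[str]    # tag to move to; None means delete


@dataclass
class Message:
    msg_id: str
    tags: set
    tags_changed: datetime    # when this message's tags were last modified


def run_automation(messages, rules, now):
    """Apply at most one rule per message; return the messages that remain."""
    kept = []
    for msg in messages:
        deleted = False
        for tag, rule in rules.items():
            idle = now - msg.tags_changed
            if tag in msg.tags and idle >= timedelta(days=rule.idle_days):
                if rule.move_to is None:
                    deleted = True
                else:
                    msg.tags.discard(tag)
                    msg.tags.add(rule.move_to)
                    msg.tags_changed = now  # retagging restarts the idle clock
                break
        if not deleted:
            kept.append(msg)
    return kept


# Hypothetical settings: Spam moves to Trash after 14 days,
# Trash is deleted after 30 days.
RULES = {"Trash": Automation(30, None), "Spam": Automation(14, "Trash")}
```

Because retagging resets `tags_changed`, a message moved from Spam to Trash gets a full Trash period before it is finally deleted.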

While I was implementing the configuration interface for this, I also added an option to enable statistical auto-tagging for any tag, as described in A Plan for Spam ... and Bacon!, and exposed a few more of the technical tag settings in the Tag settings editor. All features that already existed, but weren't really accessible.

So there we go, tags now have automation and you don't need any command-line black magic to create your own statistical tagging or time-based workflows.

These capabilities are now available to all tags, including user-created ones. Some of the potential use-cases include:

  1. Deleting Trash after a while
  2. Moving Spam to Trash
  3. Moving untouched blank drafts to Trash
  4. Creating statistical categories for promotions or paperwork
  5. Creating a "Postponed" tag which hides mail from view for a few days

Now we just need an auto-responder and Mailpile will be able to automatically recognize and reply to tech support requests that have been unanswered for more than a week...

(In the process I also fixed bugs in the bayesian auto-trainer, the periodic scheduler that triggers it and the tag editing tools in the UI - proper release work after all!)

GnuPG Collaboration

I write this, sitting on a train back home from London.

I was in London today to meet with Neal of the GnuPG project. We discussed how the projects could collaborate more closely in the future and some of the difficulties Mailpile has had integrating with GnuPG.

It was an excellent meeting and I'm optimistic that once GnuPG 2.1 (or 2.2) becomes widespread, Mailpile will be able to make full use of it without any horrible hacks.

Conversations will continue!

Piracy in Iceland

Finally, some bad news.

Iceland's government is broken and I feel an obligation to help fix it. I will be dedicating some time this summer to helping the local Pirate Party prepare for our next elections. Mostly I'll be working behind the scenes on internal party tools, but this inevitably means I will continue being distracted from Mailpile work. But don't worry, I'm not running for a seat in parliament. ;-)

If you can help out in some way to help pick up the slack, please get in touch on #mailpile on Freenode.

That's it for now, thanks for reading.

Time to pack some boxes and move to Iceland!
