I read an article this morning about encryption on Cloud Storage Services. The idea was that the latest NSA spying scandals revealed how much the government can access, and the assumption is that a lot of the things you store are sensitive enough that you wouldn’t want the government to have random access.
The article then went on to introduce a series of software/service combinations that sync your files and add encryption. The list was separated into three main categories:
- services that store data unencrypted, like Amazon Cloud, Google Drive, and Microsoft SkyDrive
- services that encrypt the data on the server, but pass around unencrypted data, like Dropbox
- services that encrypt the data on the client, and the server never sees unencrypted data nor does it keep a key
Having a “little” experience in cryptology, just enough to be dangerous, let me start by saying that relying on any service to provide encryption is as good as not relying on encryption at all.
The problem is that since you do not know how the encryption is handled, all the precautions in the world won’t make sure that you actually get what was promised. In particular, since you use the same software to enter the key and to sync, you will never know whether the key stays where it is supposed to stay, or if it is actually used at all. You cannot see the traffic from and to the servers (it is hopefully encrypted in transit), you cannot see what is stored on the servers, and even if you could do either or both, nobody is going to tell you about an update to the server-side software that could compromise the whole scheme.
Plus, you do not have the time to worry about all this. What you really need is a system that separates encryption and storage/syncing. By using that, you can completely do without encryption when you don’t care, and keep the content of encrypted files off the record of the storage company, no matter what it does with your data.
The article suggested as much, using TrueCrypt. I love TC, use it on all my removable drives. I like the fact it works cross-platform (even though I don’t run Windows much anymore), that it encrypts both containers and file systems, and that it’s easy to use.
I don’t like, on the other hand, that TC creates virtual file systems, and that it’s not well integrated with the system. The former means that you have to sync a file that is the entire size of everything you want to store encrypted, the latter that you (typically) have to start a GUI and connect with your drives on every start and wake event.
A different solution that works much better for me is EncFS. EncFS is a FUSE file system that uses shadow directories: you specify a directory that stores encrypted files, another that stores the same decrypted, and EncFS will automatically translate between the two. The decrypted directory is a mount, that is, it’s not present all the time. If you unmount the decrypted directory, the encrypted directory is still around, but unless you mount it again, the files will stay encrypted.
[Note: a similar concept is used in ecryptfs, which is what Linux uses by default to encrypt user home directories. There is nothing wrong with ecryptfs, and you could conceivably use it instead of EncFS – except for the fact that ecryptfs does not allow nested encryption. If a directory (like your user home directory) is already encrypted, you cannot create another encrypted container within it.]
Setting this up is really simple.
I’ll leave installing Dropbox to the experts – just go to dropbox.com and follow the instructions there. [Note: Dropbox mysteriously comes as a Nautilus, i.e. Gnome, plugin. If you use KDE, still install the Gnome package and live with the extra software. You don’t have to use Nautilus to use Dropbox syncing.]
I should note that, considering we are doing the encryption ourselves, we don’t need the servers to encrypt anything. You can use Google Drive or Amazon Cloud Sync just as well as Dropbox (at least until either declares they don’t like not being able to read your data).
You should be careful about double encryption. All software that encrypts on the client will force your machine to double encrypt and decrypt everything, so there is some additional potential for instability or sluggishness. Test it all out! Some of the software does really well!
Once you install the sync software, create or designate a folder for syncing. Since we are basing this off Dropbox, we’ll use the default Dropbox folder, ~/Dropbox.
To install EncFS, follow the instructions for your distro. On (K)ubuntu, it’s a simple:
sudo apt-get install encfs
No additional configuration required.
Shadow the Folders
Create a folder outside the Dropbox folder that will contain the unencrypted files. Let’s call it ~/Lockbox.
Create a folder inside the Dropbox folder that will contain the encrypted files. Let’s call it ~/Dropbox/Crypto.
Tell EncFS which is which. The first argument to encfs is the encrypted folder; the second is the decrypted folder, like this:
encfs ~/Dropbox/Crypto ~/Lockbox
Now encfs will ask about a password/passphrase. This one is really important, for two reasons:
- If you lose it, you lose access to your data completely. There is no way to recover a lost key! So keep it in a safe place!
- If your password/passphrase is too weak (short, non-random, predictable), it can be cracked within a short time. Choose something unreasonably long!
“Unreasonably long” in this context means something you’d never want to remember, because it’s so long. Instead, you will keep it on the machine in a safe file, or on a USB key, or anywhere not online.
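One way to get such a passphrase is to let the machine generate it. The following is a minimal sketch, not the only way to do it: the file location ~/.lockbox-pass and the function names are made up for illustration, and the only requirement is that the file lives outside any synced folder.

```shell
# Hypothetical location for the passphrase file; pick any path that is NOT synced.
PASSFILE="${PASSFILE:-$HOME/.lockbox-pass}"

make_passphrase() {
    # 48 random bytes become 64 base64 characters, printed without a newline
    head -c 48 /dev/urandom | base64 | tr -d '\n'
}

save_passphrase() {
    # Overwriting an existing passphrase would lock you out of your data
    if [ -e "$PASSFILE" ]; then
        echo "refusing to overwrite $PASSFILE" >&2
        return 1
    fi
    # umask 077 makes the new file readable by its owner only
    ( umask 077 && make_passphrase > "$PASSFILE" )
}

# Usage: run save_passphrase once, then paste the contents of "$PASSFILE"
# when encfs asks for the new password.
```

Copy the file to a USB key as well, since losing it means losing the data.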
Remember, we are not securing the files on the machine. We only care about the storage on Dropbox. We can live with the encryption provided by ecryptfs on our local computer, but we cannot live with the encryption chosen by Dropbox.
Syncing itself is obviously handled by Dropbox. Once you copy files into the decrypted folder (outside Dropbox), encfs will encrypt them and store the encrypted version in the shadow folder (inside your Dropbox). From there, Dropbox will sync them to its servers and make them available on every computer to which you have access, in encrypted form.
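If you want to convince yourself that nothing readable ends up in the synced folder, a quick local check is to drop a file with a recognizable marker into the decrypted folder and then search the encrypted folder for it. This is only a sketch: the function name and the marker string are invented for the example, and it inspects the local encrypted folder, not what the provider does on its servers.

```shell
leaks_plaintext() {
    # $1 = directory to scan, $2 = marker string
    # Returns success (0) if the marker shows up anywhere in clear text.

    # Marker inside the contents of any file?
    if grep -rqF -- "$2" "$1"; then return 0; fi
    # Marker inside any file or directory name?
    find "$1" -name "*$2*" | grep -q .
}

# Usage (with ~/Lockbox mounted):
#   echo "canary-12345" > ~/Lockbox/canary-12345.txt
#   leaks_plaintext ~/Dropbox/Crypto canary-12345 && echo "LEAK" || echo "ok"
```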
Once you sync those files onto a different computer, you use the same trick of creating a shadow directory and mounting the EncFS container. The files will be instantly decrypted and accessible.
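On the second machine that boils down to something like the sketch below. The function and variable names are mine, the paths are the ones used in this article, and I am assuming a Linux box where fusermount is available. Let Dropbox finish syncing ~/Dropbox/Crypto before mounting, so that encfs finds the existing volume instead of offering to create a new one.

```shell
# Paths follow the article; adjust them if your layout differs.
CRYPT_DIR="$HOME/Dropbox/Crypto"   # encrypted files, synced by Dropbox
PLAIN_DIR="$HOME/Lockbox"          # decrypted view, never synced

lockbox_mount() {
    # The mount point must exist before encfs can use it
    mkdir -p "$PLAIN_DIR" || return 1
    # Prompts for the same passphrase that was chosen on the first machine
    encfs "$CRYPT_DIR" "$PLAIN_DIR"
}

lockbox_unmount() {
    # Detach the decrypted view; the encrypted files stay where they are
    fusermount -u "$PLAIN_DIR"
}

# Usage: lockbox_mount, work in ~/Lockbox, then lockbox_unmount when done.
```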
Automate the Process
This is all nice, but every time you want to access the decrypted files, you have to first make sure the encfs container is mounted.
There are two different ways to do without this requirement. The first one is to use pam_encfs, which mounts the encrypted folder on login. I haven’t done that because it seemed too complicated and fidgety. (And the documentation stank.)
The other option, which I ended up preferring, is autofs. autofs is a file system that mounts directories automatically when you try to access them. In this case, it needs to start encfs automatically once you access the decrypted version. [Note: details in a separate article on autofs.]
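If neither option appeals, a cruder stopgap (this is not autofs, just a shell function you could call from a login script) is to mount the folder only when it is not mounted yet. The sketch assumes the passphrase sits in a local file, here a hypothetical ~/.lockbox-pass containing only the passphrase, and hands it to encfs through its --extpass option so that no prompt appears.

```shell
CRYPT_DIR="$HOME/Dropbox/Crypto"   # encrypted files, synced by Dropbox
PLAIN_DIR="$HOME/Lockbox"          # decrypted view, never synced
PASSFILE="$HOME/.lockbox-pass"     # hypothetical local passphrase file

lockbox_ensure() {
    # Nothing to do when the decrypted view is already mounted
    if mountpoint -q "$PLAIN_DIR"; then return 0; fi
    mkdir -p "$PLAIN_DIR" || return 1
    # --extpass runs the given command and reads the passphrase from its output
    encfs --extpass="cat '$PASSFILE'" "$CRYPT_DIR" "$PLAIN_DIR"
}

# Usage: call lockbox_ensure from ~/.profile or before any script that
# reads from ~/Lockbox.
```

Unlike autofs, this does not react to the folder being accessed; it only runs when you call it.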
Who is this good for? Really, anyone that wants to store private data on a cloud service. The use case for cloud services is fairly wide: backup, synchronization across machines, sharing with third parties, etc. Whether you want or need your files to be private is up to you. You should know, though, that pretty much anything you store on an Internet server is fair game for employees of the provider or (apparently) the government.
This article is based on the notion that you don’t particularly care about the security of the encrypted files on your machine(s). That means that if someone steals your machine, you either have encryption on the entire file system, or on your home directory, but that you don’t rely on this scheme and EncFS to secure your local files.
The security of the encryption is as good as the parameters you give EncFS. In particular, the one parameter that matters the most is the encryption passphrase. You should get something really good – maybe a key generated by ssh-keygen. Remember, you are to keep the passphrase on each machine, because we’d rather have a really strong key the servers can’t crack and that you can’t remember than something you can remember and the server can reliably crack in a short time.