Digital Signing and Encryption

The purpose of this web page is to give an overview of the most common digital signing and encryption protocols, as well as some of the most common tools that implement them in the context of Fedora Linux.

Basic Encryption

In order to use encryption, it's useful to understand just a little of the underlying theory. Before about 1980, most encryption was done symmetrically, that is, the same key was used for both encryption and decryption. Since then, we have seen the advent of asymmetric encryption, where different keys are used for encryption and decryption. While symmetric encryption has some advantages over asymmetric encryption, asymmetric encryption is particularly well suited to modern communication over public networks.

In an asymmetric encryption scheme such as RSA (described later on this page), keys are created in pairs. If one key in the pair is used to encrypt a message, then the other key must be used to decrypt the same message, and vice versa. Either key in the pair may be used for either operation, but once one key is used, the other key in the pair must be used in the companion operation. This characteristic is what makes asymmetric encryption so useful on public networks.

When a keypair is created, one key is kept secret (the private key), while the other is advertised to the world (the public key). If you have created such a keypair, anyone who wants to send you an encrypted message can encrypt it with your public key, and no matter who else gains access to it on the public network, only someone in possession of your private key can read it. If your private key has indeed been kept private, then the message is secure.

Similarly, if you want to prove a message is from you (which is what is meant by digital signing), all you need to do is encrypt a well known phrase, like the first sentence from the message, with your private key. Anyone can decrypt it with your public key, but then they will know that only someone in possession of your private key could have sent it in the first place. So if you and your correspondent have both created keypairs, and you are each in possession of the other's public key, then you can both encrypt or digitally sign messages to one another.
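The behaviour of a keypair can be sketched with textbook RSA and deliberately tiny numbers. This is an illustration only, assuming Python 3.8 or later. The primes, the exponent and the variable names are all chosen for this example, and real tools use keys thousands of bits long together with padding schemes that are left out here.

```python
# Toy RSA keypair with tiny primes: for illustration only, never for real use.
p, q = 61, 53                 # secret primes (assumed values for this sketch)
n = p * q                     # modulus, part of both keys
phi = (p - 1) * (q - 1)
e = 17                        # public exponent: (e, n) is the public key
d = pow(e, -1, phi)           # private exponent: (d, n) is the private key

message = 65                  # a message encoded as a number smaller than n

# Encrypt with the public key; only the private key recovers the message.
ciphertext = pow(message, e, n)
recovered = pow(ciphertext, d, n)

# The other way round: transform with the private key, and anyone holding
# the public key can reverse it (the basis of digital signing).
signed = pow(message, d, n)
checked = pow(signed, e, n)

print(ciphertext, recovered, signed, checked)
```

Whichever key is applied first, only the other key in the pair undoes the operation, which is the property the paragraphs above rely on.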

This illustrates why asymmetric encryption is used on public networks; it does not matter who sees your public key. In fact, you would like everyone to see it, since all it can be used for is to encrypt messages that only you can decrypt, or to verify your digital signature on a document. There is no reason you would want to prevent anyone from doing these things. On the other hand, to use a symmetric encryption system, once a key is generated it must be transmitted to both parties who are going to communicate. You cannot do this securely on a public network, unless you want to encrypt the key itself, but this raises the question of how you will do that. In fact, symmetric keys are often encrypted with asymmetric keys for distribution across public networks, but we don't need to know about this in order to use the basic tools available for encryption.

For this reason, asymmetric encryption is often known as public key encryption, and we will use that expression here. So now all you need to do is create a keypair, distribute the public key to your correspondents, get them to do the same and acquire their public keys, and you are ready to go. Well, yes, if you want to do all the encryption and decryption yourself, but you probably don't. Rather it would be nice to use existing software, much as we use existing software for just about everything else we do on the net. And unfortunately, different bits of software are good for different things, and many use different formats for the keypairs. So the easiest thing to do is to create more than one keypair, and keep track of which pair is intended for which purpose, and use them appropriately.

Digital Signatures

The brief mention of digital signatures given above really covers the basic ground, but a little more explanation is in order. One thing that is often done to a document as part of an encryption protocol is to hash the text before applying an encryption algorithm to it. Hashing is a process that has wide use in computer science, typically to gather some heterogeneous collection of data into a predetermined format. For digital signing, the message is hashed into a text of an established size, which is then encrypted with the private key. The recipient decrypts the signature with the public key and hashes the message independently; if the result of the hash and the result of the decryption match, then the signature is verified. There are many ways to perform a hash, but if a good method is chosen, then it is extremely unlikely that two different unencrypted messages will hash to the same text. As a result, the digital signature provides authentication, since only someone in possession of the private key could have encrypted it, as well as integrity, since no other message is likely to have hashed to the same result. Thus if the message was changed after it was signed, the signature will no longer verify.
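The hash-then-sign procedure can be sketched as follows. This is a toy under stated assumptions, not how GPG or any real tool does it: the keypair is built from two known Mersenne primes, the hash is SHA-256 truncated to 128 bits so that it fits below the modulus, no padding is applied, and the function names digest, sign and verify are invented for this example. It needs Python 3.8 or later.

```python
import hashlib

# Toy keypair built from two known Mersenne primes. Still far too small for
# real use; chosen only so that a 128-bit digest fits below the modulus.
p, q = 2**61 - 1, 2**89 - 1
n = p * q
e = 65537                                  # public exponent
d = pow(e, -1, (p - 1) * (q - 1))          # private exponent

def digest(message: bytes) -> int:
    """Hash the message to a fixed-size number (SHA-256 truncated to 128 bits)."""
    return int.from_bytes(hashlib.sha256(message).digest()[:16], "big")

def sign(message: bytes) -> int:
    """Sender: hash the message, then transform the hash with the private key."""
    return pow(digest(message), d, n)

def verify(message: bytes, signature: int) -> bool:
    """Recipient: undo the signature with the public key and compare hashes."""
    return pow(signature, e, n) == digest(message)

original = b"Meet me at noon."
signature = sign(original)
print(verify(original, signature))              # the untouched message
print(verify(b"Meet me at one.", signature))    # a message altered after signing
```

The first check should succeed and the second should fail, since the altered message hashes to a different value than the one locked inside the signature.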

Public Keys, Fingerprints, and Certificates

At this point, it should be clear that in order for public key encryption to be useful, your correspondents must all have a copy of your public key, and you must have copies of all of theirs. Since there might be quite a few of these, it makes sense that these keys live in files that contain some information about how to use the key, and to whom it belongs. Such files are called certificates.

There are, unfortunately and inevitably, many formats for certificates. At its most spare, a certificate may include no more than a binary number which is the public key itself, but I have never seen a certificate in actual use that looked like that. At the very least, a certificate will also contain the name of the protocol for which the certificate is intended (since there are many ways to implement an asymmetric cipher), and possibly the name and contact information for the owner of the key, a shortened form of the key known as a fingerprint, and perhaps one or more digital signatures from signing authorities. The certificate may also be nicely formatted so it is easy to read and to include in electronic messages.

The fingerprint for a cipher key is a little like a human fingerprint in that while it is theoretically possible for two keys to have the same fingerprint, it is so unlikely as to be impossible for all practical purposes. Thus a fingerprint can be used to ensure that you have the right key, either in software or face-to-face, when you meet one of your correspondents in person. Fingerprints are often used in this way as a form of authentication.
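A fingerprint is essentially a hash of the public key, printed in a form short enough to read aloud. The sketch below assumes SHA-256 and uses placeholder bytes in place of real keys; the function name fingerprint and both key values are invented for illustration. Real tools compute the fingerprint over a specific encoding of the key, so the output here will not match anything GPG prints.

```python
import hashlib

def fingerprint(public_key: bytes) -> str:
    """Return a readable digest of the key: 64 hex digits in groups of 4."""
    hexdigest = hashlib.sha256(public_key).hexdigest().upper()
    return " ".join(hexdigest[i:i + 4] for i in range(0, len(hexdigest), 4))

# Placeholder bytes standing in for two exported public keys (hypothetical).
alice_key = b"-----hypothetical public key material for Alice-----"
mallory_key = b"-----hypothetical public key material for Mallory-----"

print(fingerprint(alice_key))
print(fingerprint(alice_key) == fingerprint(mallory_key))
```

Comparing two short strings like these over the phone or in person is far easier than comparing the full keys, which is why fingerprints are used for this kind of check.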

Since it is possible to digitally sign a document in a tamper-proof way, so that any digital document that is signed cannot be changed without losing the signature, this provides a form of authentication for certificates. To see this, consider the problem of distributing your certificate. If you meet someone in person (and you are sure it is the right person, and likewise that person is just as sure of you), great. Swap certificates on memory sticks and you are ready to go. But this is impractical most of the time, so independent entities known as signing authorities have sprung up to meet this new demand. You can submit your certificate to a signing authority and they will go to some trouble to verify that you are really the person you claim to be, for example by calling you and making sure the phone number they use is real. Your correspondents will see that the certificate is signed in a tamper-proof way by someone they trust, providing added assurance that the certificate really belongs to you, and not to someone masquerading as you on the Internet. There are both for-profit and non-profit signing authorities who will sign your certificate for you, but unless you are engaged in e-commerce it is probably unnecessary. It also complicates things even further than they would already be if your private key is compromised and you want to rescind the certificate.
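How a signing authority's signature protects a certificate can be sketched in the same toy style. Everything here is an assumption made for illustration: the authority's keypair is built from two known Mersenne primes, the certificate is a plain dictionary, and the names cert_digest, authority_sign and trusted are invented. Real certificate formats are considerably more involved. Python 3.8 or later is assumed.

```python
import hashlib
import json

# Toy signing-authority keypair (hypothetical values, far too small for real use).
CA_P, CA_Q = 2**61 - 1, 2**89 - 1
CA_N = CA_P * CA_Q
CA_E = 65537
CA_D = pow(CA_E, -1, (CA_P - 1) * (CA_Q - 1))   # known only to the authority

def cert_digest(fields: dict) -> int:
    """Hash the certificate fields in a fixed order to a 128-bit number."""
    blob = json.dumps(fields, sort_keys=True).encode()
    return int.from_bytes(hashlib.sha256(blob).digest()[:16], "big")

def authority_sign(fields: dict) -> dict:
    """The authority attaches its signature to the fields it has verified."""
    return {"fields": fields, "signature": pow(cert_digest(fields), CA_D, CA_N)}

def trusted(cert: dict) -> bool:
    """A correspondent checks the signature with the authority's public key."""
    return pow(cert["signature"], CA_E, CA_N) == cert_digest(cert["fields"])

cert = authority_sign({"owner": "alice@example.org", "public_key": "3233:17"})
print(trusted(cert))

# An impostor who swaps in a different key invalidates the signature.
forged = {"fields": dict(cert["fields"], public_key="9999:17"),
          "signature": cert["signature"]}
print(trusted(forged))
```

The correspondent only needs the authority's public key to run the check, which is why one trusted authority can vouch for many certificates.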

Incidentally, fans of "six degrees of separation" can pursue this game in a modestly productive way by signing certificates that they believe to be authentic, either because they have verified them directly, or because they trust one of the previous signers. Certificates typically have an expiration date, so this is an exercise that not only never ends, but recycles itself.

RSA, PGP, GPG, and electronic mail

RSA stands for Rivest-Shamir-Adleman, the three inventors of the eponymous public key protocol. It was not the first public key protocol, nor is it perfect, but it is quite good and easy to understand, and it is also quite old, so the early patents on it have expired. In fact, software patents haven't been a big practical issue in many encryption systems, although the arguments over patents, as well as the US government's interest in encryption, make for very entertaining reading. PGP (GPG) is a software tool that implements RSA and other encryption protocols in a very nice way.

You probably already use encryption on public networks for such things as Internet commerce, but in these applications the encryption is taken care of for you, and you wouldn't be able to configure it even if you wanted to. Electronic mail is one of the first places people generally want to add encryption and digital signing to their own workflow. The most commonly used protocol for this is PGP, also known as GPG. Here is a quick primer on PGP (GPG).

First off, you probably want to know, "well, which is it, PGP or GPG?" It's both. Very briefly, Phil Zimmermann invented PGP (which stands for "Pretty Good Privacy"), and eventually sold his company, but established an OpenPGP initiative at the same time. Anyone can implement PGP, but only PGP Inc., or its successors, can call it "PGP". The GNU organization, the original free software outfit, implemented its own version of PGP and called it GNU Privacy Guard, or GPG. So GPG is a software tool that creates keypairs and certificates just like PGP, and the two can coexist. You can use GPG to decrypt a message that was encrypted with PGP, while using the keys and certificates that GPG manages. I use GPG on my Linux system, but I have heard that PGP is a very nice package with a clean, usable interface that runs on lots of systems.

Since I use GPG and not PGP, I can tell you that in GPG the only thing you must do is generate a keypair by executing the command:

gpg --gen-key

and then answer the questions it asks. You can almost surely answer all the questions it asks with the default that it offers you, such as which algorithm you want to use, or how long your key should be. If you really care about these sorts of things, you will need far more detail than can be found in this document. Once you finish, GPG will create a directory (folder) with your public and private keys in it, as well as some other files it uses for bookkeeping. One thing it will do is ask you for a passphrase, and I and almost everyone else will recommend that you come up with a good one so that anyone who gains access to your computer doesn't automatically also have access to your private key.

Once you have done this one thing that you must do, there are many other things that you may want to do. You may want to create a very nice and comprehensive certificate that even has your photo in it. You may want to send the certificate to a keyserver, so that other people will have easy access to it. Keyservers are computers on the Internet that gather PGP(GPG) certificates and make them available to anyone who wants them. Since the keyservers all talk to each other, you can send it to any one of them and it will propagate everywhere. You can read the PGP(GPG) web page to find out what else you can do, but once you have a keypair, you are ready to sign or encrypt messages.

PGP and GPG have commands that allow you to encrypt or sign a file, and then you can send the file to one of your correspondents either as the body of the message or as an attachment to a dummy message. However, many, if not most, modern mail clients support PGP and make it very convenient to encrypt and sign, as well as to decrypt and verify, messages directly from the client's own user interface. The mail client I use, Thunderbird, does this through an extension called "Enigmail". Once you have Enigmail installed and your keypair created, the mailer includes two icons (a pen and a key) for signing or encrypting messages. If you receive an encrypted or signed message, Enigmail senses this and offers to decrypt it or verify it if you have the appropriate public key, and tries to find the key on a keyserver if you don't. Other mail clients like Eudora, Outlook, or Apple Mail have similar features.

So if you want to send signed or encrypted mail with PGP(GPG), here is all you have to do:

  1. Install PGP(GPG) on your computer, if it is not already there
  2. Generate a keypair, and export the public key to a keyserver
  3. Upgrade your mail client so it handles PGP(GPG), if necessary
  4. Ensure that your correspondents do the same

Once this is done, you will find sending encrypted or signed email very easy. One thing that may cramp your style is that you can't encrypt images or anything else that isn't "vanilla text", but usually sensitive data is not found in images. The one real fly in the ointment, however, is accomplishing step 4 in the list given above. If one of your correspondents doesn't want to use encryption or signing at all, well, you're on your own. But it is possible that he or she will want to use some protocol other than PGP(GPG).

OpenSSL and S/MIME

It isn't really correct to call PGP a protocol; it is more than that. It is really a toolkit that implements many protocols, and defines its own file formats. There are other packages that will implement useful encryption protocols, but the most useful general purpose tools besides PGP(GPG) that you are likely to run into are OpenSSL and S/MIME.

In fact, all of these tools and protocols coexist quite nicely. Indeed, when you download OpenSSL files from the official site, authentication is provided by PGP signatures. And there is a certain amount of convergence of these tools going on right now. Still, if you want a complete encryption package for now, it is best to have some understanding of all three.

You have almost surely used S/MIME's cousin and predecessor MIME, whether you know it or not. Information on MIME can be found in many places, but very briefly it is a protocol for including files that require an extended character set in protocols that only support a restricted set. The most common place where this is used is electronic mail, when one attaches images or other special files, or even sends the letter in HTML format. S/MIME is an extension to MIME specifically intended for sending encrypted messages. It specifies a MIME type for encrypted messages and signatures, as well as a certificate format, and since it is an extension of MIME, it handles all sorts of character sets, so you can "encrypt" a picture of yourself, for example.
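To see what plain MIME does before any encryption is involved, here is a small sketch using the email package from the Python standard library. The addresses, the file name and the attachment bytes are made up for the example, and nothing is encrypted or signed; it only shows binary data being wrapped so that it can travel through a text-only mail system.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "alice@example.org"      # hypothetical addresses
msg["To"] = "bob@example.org"
msg["Subject"] = "Photo"
msg.set_content("The picture is attached.")

# Arbitrary binary bytes standing in for an image file.
raw = bytes(range(256))
msg.add_attachment(raw, maintype="image", subtype="png", filename="photo.png")

wire = msg.as_string()                  # the text that travels through the mail system
attachment = next(msg.iter_attachments())

print(msg.get_content_type())           # the message is now a multipart container
print(wire.isascii())                   # the binary data is carried as plain ASCII
print(attachment.get_content() == raw)  # and decodes back to the original bytes
```

S/MIME builds on the same container idea, with additional MIME types for the encrypted body and for the signature.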