I spent a couple of days struggling with encryption this week, so the weekend seemed like a perfect time to brush up, since I hope to close the door on my issue Monday. Then a funny thing happened: while I was searching Google to understand the syntax of an encryption function, I found my own article.
It turns out I had the same problem almost exactly a year ago, right down to syntax that can be confusing if you don’t know where to look in the python-gnupg documentation. So I had a couple of takeaways:
- If you see a list as the second argument in gpg.encrypt, it’s because we’re using multiple public key fingerprints for the encryption. This way, multiple recipients can decrypt the data with their own private keys.
- Sometimes you have to learn something more than once to really retain the information.
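To make that first takeaway concrete, here is a minimal sketch of encrypting for several recipients at once. The helper name encrypt_for_recipients is my own invention for illustration, and gpg is assumed to be a gnupg.GPG instance whose keyring already holds every recipient's public key.

```python
def encrypt_for_recipients(gpg, message, fingerprints):
    """Encrypt `message` so that any of the listed key holders can decrypt it."""
    # The second argument to gpg.encrypt may be one recipient or a list of them.
    result = gpg.encrypt(message, list(fingerprints))
    if not result.ok:
        # result.status holds gpg's short explanation of what went wrong.
        raise RuntimeError(f"encryption failed: {result.status}")
    # Converting the result to a string gives the ASCII-armored ciphertext.
    return str(result)
```

Passing a list of two fingerprints should produce one message that either key holder can open with their own private key.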
A Demo
I won’t be treading too much new ground in this post, but I’ll do something I probably should have done the first time: generate a real PGP key and write a real Python script for encryption and decryption. Hopefully that’ll help me remember how this works.
input.py
To start, we need a private and a public key. Following this tutorial from Paul Mahon, I created a file that uses the gen_key function to give us what we need. Note that before we run this, we might have to install the gnupg module (it ships in the python-gnupg package), so give this a shot if you’re not sure whether you have it: sudo pip3 install python-gnupg
We instantiate gpg by calling the GPG class we’ve imported from the gnupg library. The argument we pass is the path to where the keys will be stored, so you’ll have to change this if you’re following along at home. We select utf-8 as our encoding value, which will keep our data as text.
The rest of this file is responsible for generating our keys. It starts with input_data, which is a specially formatted command string that includes:
- name_email: just an email address to be associated with our key
- passphrase: this value will be required to access the keys we’re creating
- key_type: RSA or DSA. Both are public-key algorithms that pair a public key with a private key; they just use different math. Per GeeksforGeeks, in the RSA algorithm the encryption key is public but the decryption key is private.
- key_length: this determines the number of bits in the primary key. More bits means more security, but also more time to generate. Since this is a sample, we can go with a relatively small number.
When we pass input_data into the gen_key function, we end up with a public and private key! When we print key, we don’t see those values, but instead see the public key fingerprint.
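As a rough sketch of the key generation step described above: I'm assuming the command string comes from python-gnupg's gen_key_input helper, and the function name generate_key and the 2048-bit default are my own choices for illustration, not values from the original script.

```python
def generate_key(gpg, email, passphrase, key_length=2048):
    """Generate an RSA key pair and return its fingerprint as a string."""
    # gen_key_input builds the specially formatted command string for gpg.
    input_data = gpg.gen_key_input(
        name_email=email,
        passphrase=passphrase,
        key_type="RSA",
        key_length=key_length,  # assumed default; pick what suits your case
    )
    key = gpg.gen_key(input_data)
    # Converting the result to a string yields the fingerprint, not the key itself.
    return str(key)
```

Here gpg would be created along the lines of gnupg.GPG(gnupghome=...) with the path to your own keyring directory, followed by gpg.encoding = "utf-8".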
If we want to confirm that our keys have been saved to the desired location, we can run gpg --list-keys
We’ll see an entry for each key pair saved on our keyring. Again we see the fingerprint, but not the actual values of the keys.
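The same check can probably be done without leaving Python. This is a small sketch using python-gnupg's list_keys method; the helper name keyring_summary is hypothetical, and gpg is assumed to point at the same keyring directory used for key generation.

```python
def keyring_summary(gpg):
    """Return (fingerprint, user IDs) pairs for every public key on the keyring."""
    # list_keys() returns one dict per key; the key material itself is not included.
    return [(key["fingerprint"], key["uids"]) for key in gpg.list_keys()]
```

Printing the result of keyring_summary(gpg) should show the fingerprint from the previous step next to the email address we used.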
encrypt.py
Once we have the keys, we can use them to encrypt a file. I created secret.txt and then passed it as an argument in encrypt.py.
Like in the previous script, we import gnupg and instantiate gpg. Next we start reading my secret.txt file (rb means reading in binary mode, which is helpful for encryption). We run the encrypt_file function, passing three arguments:
- f, the file we’re encrypting
- recipients, whoever is going to be reading the file. I had to use the email associated with the key I just created. If a third party was asking me to use their key instead, I would have to add their public key to my keyring, or this wouldn’t work.
- output, or the name of the encrypted file we’re going to create
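Here is a minimal sketch of the call with those three arguments. The wrapper name encrypt_path and its parameter names are my own; gpg is assumed to be a gnupg.GPG instance whose keyring contains the recipient's public key.

```python
def encrypt_path(gpg, source_path, recipient, output_path):
    """Encrypt the file at `source_path` for `recipient`, writing `output_path`."""
    # Binary mode hands gpg the raw bytes without any text decoding.
    with open(source_path, "rb") as f:
        status = gpg.encrypt_file(f, recipients=[recipient], output=output_path)
    # ok is a boolean; status and stderr carry gpg's own messages.
    print("ok:", status.ok)
    print("status:", status.status)
    print("stderr:", status.stderr)
    return status.ok
```

In my case the call would look something like encrypt_path(gpg, "secret.txt", the email tied to my key, and a new filename for the encrypted copy).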
The return value of this process is saved as status, which we then pass into a print statement to make sure the encryption worked. We also print the standard error output to check for problems. In the demo on YouTube, it worked perfectly. It also worked in my case, but my machine had to find additional entropy before it ultimately succeeded with the encryption.
I don’t know why the entropy message printed so many times (the whole process still only took a few seconds), but I understand that encryption requires entropy because it relies on random values to produce the output. And indeed we ended up with an encrypted output file.
Now let’s confirm that this actually worked by trying to decrypt the message.
decrypt.py
The format of this file should look familiar:
The main differences we see here are:
- Instead of recipients, we’re passing passphrase. We’re using the passphrase we set up earlier to access the private key, which is required for decryption. Remember that we can encrypt with a public key, but the private key, which of course should never be shared, is required for actually reading the encrypted file.
- Our output value is a new filename so that we don’t replace our other file.
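A matching sketch for the decryption side, under the same assumptions as before: decrypt_path is a hypothetical wrapper name, gpg is a gnupg.GPG instance, and the keyring it points to holds the private key that the passphrase unlocks.

```python
def decrypt_path(gpg, encrypted_path, passphrase, output_path):
    """Decrypt `encrypted_path` with the private key, writing `output_path`."""
    with open(encrypted_path, "rb") as f:
        # The passphrase unlocks the private key stored on the keyring.
        status = gpg.decrypt_file(f, passphrase=passphrase, output=output_path)
    print("ok:", status.ok)
    print("status:", status.status)
    print("stderr:", status.stderr)
    return status.ok
```

Writing to a separate output_path keeps both the original and the encrypted file around for comparison.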
If our passphrase is correctly associated with the RSA key, we’ll have a new, decrypted file:
In practice this isn’t so easy, as it’s impractical to store keys on local machines. Instead they’re usually up in a shared drive or, like in our case, on AWS Secrets Manager. This coming week, I’ll have to learn how to update a keyring that’s saved on the secrets manager for a new client while being careful not to remove any of the credentials that are already saved there. As always, a better understanding of the basic version of this process will set me up for success whenever we find a wrinkle.
Sources
- What I Learned at Work this Week: GPG, PGP, Encryption (my old article)
- python-gnupg — A Python wrapper for GnuPG, GnuPG
- GPG/PGP Free Data Encryption with Python, Practical Python Solutions, Paul Mahon
- Difference between RSA algorithm and DSA, GeeksforGeeks
- AWS Secrets Manager