Does Python have a built-in, simple way of encoding/decoding strings using a password?
Something like this:
    >>> encode('John Doe', password = 'mypass')
    'sjkl28cn2sx0'
    >>> decode('sjkl28cn2sx0', password = 'mypass')
    'John Doe'
So the string “John Doe” gets encrypted as ‘sjkl28cn2sx0’. To get the original string, I would “unlock” that string with the key ‘mypass’, which is a password in my source code. I’d like this to be the way I can encrypt/decrypt a Word document with a password.
I would like to use these encrypted strings as URL parameters. My goal is obfuscation, not strong security; nothing mission critical is being encoded. I realize I could use a database table to store keys and values, but am trying to be minimalist.
Assuming you are only looking for simple obfuscation that will obscure things from the very casual observer, and you aren’t looking to use third-party libraries, I’d recommend something like the Vigenère cipher. It is one of the strongest of the simple ancient ciphers.
It’s quick and easy to implement. Something like:
    import base64

    def encode(key, string):
        # Python 2 code: str is a byte string and xrange() is available
        encoded_chars = []
        for i in xrange(len(string)):
            key_c = key[i % len(key)]
            # Add the key character to the plaintext character, modulo 256
            encoded_c = chr((ord(string[i]) + ord(key_c)) % 256)
            encoded_chars.append(encoded_c)
        encoded_string = "".join(encoded_chars)
        return base64.urlsafe_b64encode(encoded_string)
Decode is pretty much the same, except you subtract the key.
It is much harder to break if the strings you are encoding are short, and/or if it is hard to guess the length of the passphrase used.
If you are looking for something cryptographic, PyCrypto is probably your best bet, though previous answers overlook some details: ECB mode in PyCrypto requires your message to be a multiple of 16 characters in length. So, you must pad. Also, if you want to use the results as URL parameters, use base64.urlsafe_b64encode(), rather than the standard one. This replaces a few of the characters in the base64 alphabet with URL-safe characters (as its name suggests).
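The padding and URL-safe encoding steps can be sketched with the standard library alone. This is a minimal illustration, not part of the answer above: PKCS#7 is one common padding scheme (the text does not prescribe one), and the names pkcs7_pad and pkcs7_unpad are hypothetical helpers.

```python
import base64

BLOCK_SIZE = 16  # AES block size in bytes


def pkcs7_pad(data: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Append N bytes of value N so the result is a multiple of block_size."""
    pad_len = block_size - len(data) % block_size
    return data + bytes([pad_len]) * pad_len


def pkcs7_unpad(padded: bytes) -> bytes:
    """Strip padding added by pkcs7_pad (only the length byte is checked)."""
    pad_len = padded[-1]
    if not 1 <= pad_len <= len(padded):
        raise ValueError("invalid padding")
    return padded[:-pad_len]


padded = pkcs7_pad(b"John Doe")
print(len(padded))  # 16

# The two base64 alphabets differ only in the last two symbols
raw = bytes([0xFB, 0xFF, 0xFE])
print(base64.b64encode(raw))          # b'+//+'
print(base64.urlsafe_b64encode(raw))  # b'-__-'
```

Note that a message that is already a multiple of 16 bytes still gains a full block of padding, which is what lets the unpad step work unambiguously.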
However, you should be ABSOLUTELY certain that this very thin layer of obfuscation suffices for your needs before using this. The Wikipedia article I linked to provides detailed instructions for breaking the cipher, so anyone with a moderate amount of determination could easily break it.
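To make the weakness concrete, here is a minimal sketch of a known-plaintext attack, which is simpler than the ciphertext-only analysis the warning above refers to. It assumes a byte-wise variant of the cipher; vigenere_bytes and recover_keystream are illustrative names, not functions from the answer.

```python
def vigenere_bytes(key: bytes, data: bytes) -> bytes:
    """Byte-wise Vigenère: add the repeating key to the data, modulo 256."""
    return bytes((b + key[i % len(key)]) % 256 for i, b in enumerate(data))


def recover_keystream(plain: bytes, cipher: bytes) -> bytes:
    """Known-plaintext attack: subtracting plaintext from ciphertext yields the key."""
    return bytes((c - p) % 256 for p, c in zip(plain, cipher))


key = b"mypass"
known_plain = b"John Doe, 42 Example Street"
cipher = vigenere_bytes(key, known_plain)

# Anyone who knows (or guesses) one plaintext sees the key repeat
keystream = recover_keystream(known_plain, cipher)
print(keystream)
```

A single guessed message is enough to read the passphrase straight out of the recovered keystream, after which every other message encoded with it can be decoded.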
Python has no built-in encryption schemes, no. You also should take encrypted data storage seriously; a trivial encryption scheme that one developer understands to be an insecure toy may well be mistaken for a secure scheme by a less experienced developer. If you encrypt, encrypt properly.
You don’t need to do much work to implement a proper encryption scheme, however. First of all, don’t re-invent the cryptography wheel; use a trusted cryptography library to handle this for you. For Python 3, that trusted library is cryptography.
I also recommend that encryption and decryption apply to bytes; encode text messages to bytes first. stringvalue.encode() encodes to UTF-8, easily reverted again by calling .decode() on the resulting bytes.
Last but not least, when encrypting and decrypting, we talk about keys, not passwords. A key should not be human memorable, it is something you store in a secret location but machine readable, whereas a password often can be human-readable and memorised. You can derive a key from a password, with a little care.
But for a web application or process running in a cluster without human attention to keep running it, you want to use a key. Passwords are for when only an end-user needs access to the specific information. Even then, you usually secure the application with a password, then exchange encrypted information using a key, perhaps one attached to the user account.
Symmetric key encryption
Fernet – AES CBC + HMAC, strongly recommended
The cryptography library includes the Fernet recipe, a best-practices recipe for using cryptography. Fernet is an open standard, with ready implementations in a wide range of programming languages, and it packages AES CBC encryption for you with version information, a timestamp and an HMAC signature to prevent message tampering.
Fernet makes it very easy to encrypt and decrypt messages and keep you secure. It is the ideal method for encrypting data with a secret.
I recommend you use Fernet.generate_key() to generate a secure key. You can use a password too (next section), but a full 32-byte secret key (16 bytes to encrypt with, plus another 16 for the signature) is going to be more secure than most passwords you could think of.
The key that Fernet generates is a bytes object with URL and file safe base64 characters, so printable:
    from cryptography.fernet import Fernet

    key = Fernet.generate_key()  # store in a secure location
    print("Key:", key.decode())
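For readers who want to see what such a key contains, the following standard-library sketch builds a value of the same shape. It is an assumption-labelled illustration based on the public Fernet specification (32 random bytes, URL-safe base64 encoded, signing half first), not a replacement for Fernet.generate_key().

```python
import base64
import os

# Assumption: same shape as a Fernet key per the Fernet spec,
# i.e. 32 random bytes in URL-safe base64.
key = base64.urlsafe_b64encode(os.urandom(32))

raw = base64.urlsafe_b64decode(key)
# The spec uses the first 16 bytes for signing, the last 16 for encryption
signing_key, encryption_key = raw[:16], raw[16:]

print(len(key))                               # 44 base64 characters
print(len(signing_key), len(encryption_key))  # 16 16
```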
To encrypt or decrypt messages, create a Fernet() instance with the given key, and call Fernet.encrypt() or Fernet.decrypt(); both the plaintext message to encrypt and the encrypted token are bytes objects. encrypt() and decrypt() functions would look like:
    from cryptography.fernet import Fernet

    def encrypt(message: bytes, key: bytes) -> bytes:
        return Fernet(key).encrypt(message)

    def decrypt(token: bytes, key: bytes) -> bytes:
        return Fernet(key).decrypt(token)
    >>> key = Fernet.generate_key()
    >>> print(key.decode())
    GZWKEhHGNopxRdOHS4H4IyKhLQ8lwnyU7vRLrM3sebY=
    >>> message = 'John Doe'
    >>> encrypt(message.encode(), key)
    'gAAAAABciT3pFbbSihD_HZBZ8kqfAj94UhknamBuirZWKivWOukgKQ03qE2mcuvpuwCSuZ-X_Xkud0uWQLZ5e-aOwLC0Ccnepg=='
    >>> token = _
    >>> decrypt(token, key).decode()
    'John Doe'
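The version information, timestamp and HMAC mentioned earlier can be seen by taking the demo token apart with the standard library. This is a sketch that assumes the field layout from the public Fernet specification (1 version byte, 8 timestamp bytes, 16 IV bytes, ciphertext, 32 HMAC bytes); it only inspects the token and does not verify or decrypt it.

```python
import base64
from datetime import datetime, timezone

# The token from the demo session above
token = (b"gAAAAABciT3pFbbSihD_HZBZ8kqfAj94UhknamBuirZWKivWOukgKQ03qE2m"
         b"cuvpuwCSuZ-X_Xkud0uWQLZ5e-aOwLC0Ccnepg==")

raw = base64.urlsafe_b64decode(token)
version = raw[0]                             # always 0x80 for Fernet
timestamp = int.from_bytes(raw[1:9], "big")  # seconds since the epoch
iv = raw[9:25]
ciphertext = raw[25:-32]                     # AES-CBC output, padded to 16 bytes
hmac_tag = raw[-32:]

print(hex(version))  # 0x80
print(datetime.fromtimestamp(timestamp, tz=timezone.utc))
print(len(iv), len(ciphertext), len(hmac_tag))  # 16 16 32
```

The 8-byte message 'John Doe' occupies one full 16-byte ciphertext block, which is the padding behaviour of AES CBC.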
Fernet with password – key derived from password, weakens the security somewhat
You can use a password instead of a secret key, provided you use a strong key derivation method. You do then have to include the salt and the HMAC iteration count in the message, so the encrypted value is not Fernet-compatible anymore without first separating salt, count and Fernet token:
    import secrets
    from base64 import urlsafe_b64encode as b64e, urlsafe_b64decode as b64d

    from cryptography.fernet import Fernet
    from cryptography.hazmat.backends import default_backend
    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.kdf.pbkdf2 import PBKDF2HMAC

    backend = default_backend()
    iterations = 100_000

    def _derive_key(password: bytes, salt: bytes, iterations: int = iterations) -> bytes:
        """Derive a secret key from a given password and salt"""
        kdf = PBKDF2HMAC(
            algorithm=hashes.SHA256(), length=32, salt=salt,
            iterations=iterations, backend=backend)
        return b64e(kdf.derive(password))

    def password_encrypt(message: bytes, password: str, iterations: int = iterations) -> bytes:
        salt = secrets.token_bytes(16)
        key = _derive_key(password.encode(), salt, iterations)
        # Output layout: 16-byte salt, 4-byte iteration count, raw Fernet token
        return b64e(
            b'%b%b%b' % (
                salt,
                iterations.to_bytes(4, 'big'),
                b64d(Fernet(key).encrypt(message)),
            )
        )

    def password_decrypt(token: bytes, password: str) -> bytes:
        decoded = b64d(token)
        salt, iter, token = decoded[:16], decoded[16:20], b64e(decoded[20:])
        iterations = int.from_bytes(iter, 'big')
        key = _derive_key(password.encode(), salt, iterations)
        return Fernet(key).decrypt(token)
    >>> message = 'John Doe'
    >>> password = 'mypass'
    >>> password_encrypt(message.encode(), password)
    b'9Ljs-w8IRM3XT1NDBbSBuQABhqCAAAAAAFyJdhiCPXms2vQHO7o81xZJn5r8_PAtro8Qpw48kdKrq4vt-551BCUbcErb_GyYRz8SVsu8hxTXvvKOn9QdewRGDfwx'
    >>> token = _
    >>> password_decrypt(token, password).decode()
    'John Doe'
Including the salt in the output makes it possible to use a random salt value, which in turn ensures the encrypted output is guaranteed to be fully random regardless of password reuse or message repetition. Including the iteration count ensures that you can adjust for CPU performance increases over time without losing the ability to decrypt older messages.
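The effect of the salt can be checked with the standard library. This sketch swaps in hashlib.pbkdf2_hmac for the cryptography library's PBKDF2HMAC class purely so that it runs without third-party packages; the password and variable names are illustrative.

```python
import hashlib
import secrets

password = b"mypass"
iterations = 100_000

salt_a = secrets.token_bytes(16)
salt_b = secrets.token_bytes(16)

key_a = hashlib.pbkdf2_hmac("sha256", password, salt_a, iterations, dklen=32)
key_b = hashlib.pbkdf2_hmac("sha256", password, salt_b, iterations, dklen=32)
key_a_again = hashlib.pbkdf2_hmac("sha256", password, salt_a, iterations, dklen=32)

print(key_a != key_b)        # True: same password, different salt, different key
print(key_a == key_a_again)  # True: salt plus count are enough to re-derive the key

# The iteration count as a 4-byte big-endian field
print(iterations.to_bytes(4, "big").hex())  # 000186a0
```

Because the salt and count travel with the message, the decrypting side needs only the password to rebuild the exact key.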
A password alone can be as safe as a Fernet 32-byte random key, provided you generate a properly random password from a similar size pool. 32 bytes gives you 256 ** 32 possible keys, so if you use an alphabet of 74 characters (26 upper, 26 lower, 10 digits and 12 possible symbols), then your password should be at least math.ceil(math.log(256 ** 32, 74)) == 42 characters long. However, a well-selected larger number of HMAC iterations can mitigate the lack of entropy somewhat, as this makes it much more expensive for an attacker to brute force their way in.
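The same arithmetic, expressed in bits of entropy, can be wrapped in a small helper to try other alphabets. The function name min_password_length is hypothetical, and the result only holds for passwords whose characters are chosen uniformly at random.

```python
import math


def min_password_length(alphabet_size: int, key_bytes: int = 32) -> int:
    """Characters needed for a random password to match a key of key_bytes bytes."""
    bits_needed = key_bytes * 8               # 256 bits for a 32-byte key
    bits_per_char = math.log2(alphabet_size)  # entropy of one random character
    return math.ceil(bits_needed / bits_per_char)


print(min_password_length(74))  # 42, the figure quoted above
print(min_password_length(26))  # 55 for lowercase letters only
```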
Just know that choosing a shorter but still reasonably secure password won’t cripple this scheme, it just reduces the number of possible values a brute-force attacker would have to search through; make sure to pick a strong enough password for your security requirements.
An alternative is not to encrypt. Don’t be tempted to just use a low-security cipher, or a home-spun implementation of, say, Vigenère. There is no security in these approaches, but they may give an inexperienced developer who is later given the task of maintaining your code the illusion of security, which is worse than no security at all.
If all you need is obscurity, just base64 the data; for URL-safe requirements, the base64.urlsafe_b64encode() function is fine. Don’t use a password here, just encode and you are done. At most, add some compression (like zlib):
    import zlib
    from base64 import urlsafe_b64encode as b64e, urlsafe_b64decode as b64d

    def obscure(data: bytes) -> bytes:
        return b64e(zlib.compress(data, 9))

    def unobscure(obscured: bytes) -> bytes:
        return zlib.decompress(b64d(obscured))
This turns b'Hello world!' into an opaque, URL-safe token.
If all you need is a way to make sure that the data can be trusted to be unaltered after having been sent to an untrusted client and received back, then you want to sign the data. You can use the hmac library for this with SHA1 (still considered secure for HMAC signing) or better:
    import hmac
    import hashlib

    def sign(data: bytes, key: bytes, algorithm=hashlib.sha256) -> bytes:
        assert len(key) >= algorithm().digest_size, (
            "Key must be at least as long as the digest size of the "
            "hashing algorithm"
        )
        return hmac.new(key, data, algorithm).digest()

    def verify(signature: bytes, data: bytes, key: bytes, algorithm=hashlib.sha256) -> bool:
        expected = sign(data, key, algorithm)
        # compare_digest() runs in constant time to avoid timing attacks
        return hmac.compare_digest(expected, signature)
Use this to sign data, then attach the signature with the data and send that to the client. When you receive the data back, split data and signature and verify. I’ve set the default algorithm to SHA256, so you’ll need a 32-byte key:
    import secrets

    key = secrets.token_bytes(32)
You may want to look at the itsdangerous library, which packages this all up with serialisation and de-serialisation in various formats.
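The attach-and-split step described above can be sketched as follows. The token layout (signature first, then data, all base64-encoded) and the names pack_signed and unpack_signed are assumptions made for illustration; itsdangerous uses its own format.

```python
import hashlib
import hmac
import secrets
from base64 import urlsafe_b64encode as b64e, urlsafe_b64decode as b64d

DIGEST_SIZE = hashlib.sha256().digest_size  # 32 bytes


def pack_signed(data: bytes, key: bytes) -> bytes:
    """Prefix the data with its HMAC-SHA256 signature and base64 the result."""
    signature = hmac.new(key, data, hashlib.sha256).digest()
    return b64e(signature + data)


def unpack_signed(token: bytes, key: bytes) -> bytes:
    """Split signature and data, returning the data only if the signature matches."""
    raw = b64d(token)
    signature, data = raw[:DIGEST_SIZE], raw[DIGEST_SIZE:]
    expected = hmac.new(key, data, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, signature):
        raise ValueError("signature mismatch")
    return data


key = secrets.token_bytes(32)
token = pack_signed(b"user_id=42", key)
print(unpack_signed(token, key))  # b'user_id=42'
```

The data itself stays readable to the client; signing only guarantees that it comes back unmodified.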
Using AES-GCM encryption to provide encryption and integrity
Fernet builds on AES-CBC with an HMAC signature to ensure integrity of the encrypted data; a malicious attacker can’t feed your system nonsense data to keep your service busy running in circles with bad input, because the ciphertext is signed.
The Galois / Counter mode block cipher produces ciphertext and a tag that serve the same purpose, so it can be used in the same way. The downside is that unlike Fernet there is no easy-to-use one-size-fits-all recipe to reuse on other platforms. AES-GCM also doesn’t use padding, so the ciphertext matches the length of the input message (whereas Fernet / AES-CBC pads messages to a whole number of fixed-size blocks, obscuring the message length somewhat).
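The length difference is simple arithmetic, sketched here without any cryptography calls. The helper names are illustrative; the CBC figure assumes PKCS#7 padding with 16-byte AES blocks, and both figures exclude the IV and the authentication tag or HMAC.

```python
def cbc_pkcs7_length(message_length: int, block_size: int = 16) -> int:
    """Ciphertext length for AES-CBC with PKCS#7 padding (always adds 1..16 bytes)."""
    return (message_length // block_size + 1) * block_size


def gcm_length(message_length: int) -> int:
    """Ciphertext length for AES-GCM: no padding, same as the message."""
    return message_length


for n in (1, 8, 15, 16, 17):
    print(n, cbc_pkcs7_length(n), gcm_length(n))
```

Messages of 1, 8 and 15 bytes all produce 16 bytes of CBC ciphertext, whereas the GCM ciphertext reveals the exact length.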
AES256-GCM takes the usual 32 byte secret as a key:
    import secrets

    key = secrets.token_bytes(32)
    import binascii, secrets, time
    from base64 import urlsafe_b64encode as b64e, urlsafe_b64decode as b64d

    from cryptography.exceptions import InvalidTag
    # Reuse Fernet's InvalidToken exception for malformed or expired tokens
    from cryptography.fernet import InvalidToken
    from cryptography.hazmat.backends import default_backend
    from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

    backend = default_backend()

    def aes_gcm_encrypt(message: bytes, key: bytes) -> bytes:
        current_time = int(time.time()).to_bytes(8, 'big')
        algorithm = algorithms.AES(key)
        iv = secrets.token_bytes(algorithm.block_size // 8)
        cipher = Cipher(algorithm, modes.GCM(iv), backend=backend)
        encryptor = cipher.encryptor()
        # The timestamp is authenticated but not encrypted
        encryptor.authenticate_additional_data(current_time)
        ciphertext = encryptor.update(message) + encryptor.finalize()
        return b64e(current_time + iv + ciphertext + encryptor.tag)

    def aes_gcm_decrypt(token: bytes, key: bytes, ttl=None) -> bytes:
        algorithm = algorithms.AES(key)
        try:
            data = b64d(token)
        except (TypeError, binascii.Error):
            raise InvalidToken
        timestamp, iv, tag = data[:8], data[8:algorithm.block_size // 8 + 8], data[-16:]
        if ttl is not None:
            current_time = int(time.time())
            time_encrypted = int.from_bytes(data[:8], 'big')
            if time_encrypted + ttl < current_time or current_time + 60 < time_encrypted:
                # too old, or created more than 60 seconds in the future
                # (the allowance accounts for clock skew)
                raise InvalidToken
        cipher = Cipher(algorithm, modes.GCM(iv, tag), backend=backend)
        decryptor = cipher.decryptor()
        decryptor.authenticate_additional_data(timestamp)
        ciphertext = data[8 + len(iv):-16]
        # finalize() raises InvalidTag if the ciphertext or timestamp was altered
        return decryptor.update(ciphertext) + decryptor.finalize()
I’ve included a timestamp to support the same time-to-live use-cases that Fernet supports.
Other approaches on this page, in Python 3
AES CFB – like CBC but without the need to pad
This is the approach that All Is Vanity follows, albeit incorrectly. This is the cryptography version, but note that I include the IV in the ciphertext; it should not be stored as a global (reusing an IV weakens the security of the key, and storing it as a module global means it’ll be re-generated on the next Python invocation, rendering all ciphertext undecryptable):
    import secrets
    from base64 import urlsafe_b64encode as b64e, urlsafe_b64decode as b64d

    from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes
    from cryptography.hazmat.backends import default_backend

    backend = default_backend()

    def aes_cfb_encrypt(message, key):
        algorithm = algorithms.AES(key)
        # A fresh random IV per message, stored in front of the ciphertext
        iv = secrets.token_bytes(algorithm.block_size // 8)
        cipher = Cipher(algorithm, modes.CFB(iv), backend=backend)
        encryptor = cipher.encryptor()
        ciphertext = encryptor.update(message) + encryptor.finalize()
        return b64e(iv + ciphertext)

    def aes_cfb_decrypt(ciphertext, key):
        iv_ciphertext = b64d(ciphertext)
        algorithm = algorithms.AES(key)
        size = algorithm.block_size // 8
        iv, encrypted = iv_ciphertext[:size], iv_ciphertext[size:]
        cipher = Cipher(algorithm, modes.CFB(iv), backend=backend)
        decryptor = cipher.decryptor()
        return decryptor.update(encrypted) + decryptor.finalize()
This lacks the added armoring of an HMAC signature and there is no timestamp; you’d have to add those yourself.
The above also illustrates how easy it is to combine basic cryptography building blocks incorrectly; All Is Vanity’s incorrect handling of the IV value can lead to a data breach or to all encrypted messages being unreadable because the IV is lost. Using Fernet instead protects you from such mistakes.
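Why IV reuse matters can be shown with a toy stream cipher. To be clear about the assumptions: this is not AES CFB, and toy_keystream is a hypothetical stand-in built from SHA-256 so the sketch runs on the standard library alone. It shares the relevant property with stream-like modes, namely that the same key and IV produce the same keystream.

```python
import hashlib


def toy_keystream(key: bytes, iv: bytes, length: int) -> bytes:
    """Toy keystream (NOT AES): hash of key, IV and a block counter."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + iv + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def toy_encrypt(key: bytes, iv: bytes, message: bytes) -> bytes:
    return xor(message, toy_keystream(key, iv, len(message)))


key = b"k" * 32
reused_iv = b"\x00" * 16
p1, p2 = b"attack at dawn!!", b"retreat at dusk!"
c1 = toy_encrypt(key, reused_iv, p1)
c2 = toy_encrypt(key, reused_iv, p2)

# With a reused IV the keystream cancels out: the attacker learns p1 XOR p2
print(xor(c1, c2) == xor(p1, p2))  # True
```

An attacker who knows or guesses one of the two messages can then recover the other without ever touching the key.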
AES ECB – not secure
If you previously implemented AES ECB encryption and still need to support this in Python 3, you can do so with cryptography too. The same caveats apply: ECB is not secure enough for real-life applications. Re-implementing that answer for Python 3, adding automatic handling of padding:
    from base64 import urlsafe_b64encode as b64e, urlsafe_b64decode as b64d

    from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes
    from cryptography.hazmat.primitives import padding
    from cryptography.hazmat.backends import default_backend

    backend = default_backend()

    def aes_ecb_encrypt(message, key):
        # message must be bytes; encode text before calling
        cipher = Cipher(algorithms.AES(key), modes.ECB(), backend=backend)
        encryptor = cipher.encryptor()
        padder = padding.PKCS7(cipher.algorithm.block_size).padder()
        padded = padder.update(message) + padder.finalize()
        return b64e(encryptor.update(padded) + encryptor.finalize())

    def aes_ecb_decrypt(ciphertext, key):
        cipher = Cipher(algorithms.AES(key), modes.ECB(), backend=backend)
        decryptor = cipher.decryptor()
        unpadder = padding.PKCS7(cipher.algorithm.block_size).unpadder()
        padded = decryptor.update(b64d(ciphertext)) + decryptor.finalize()
        return unpadder.update(padded) + unpadder.finalize()
Again, this lacks the HMAC signature, and you shouldn’t use ECB anyway. The above is there merely to illustrate that cryptography can handle the common cryptographic building blocks, even the ones you shouldn’t actually use.
As you explicitly state that you want obscurity not security, we’ll avoid reprimanding you for the weakness of what you suggest 🙂
So, using PyCrypto:
    import base64
    from Crypto.Cipher import AES

    msg_text = b'test some plain text here'.rjust(32)
    secret_key = b'1234567890123456'

    cipher = AES.new(secret_key, AES.MODE_ECB)  # never use ECB in strong systems obviously
    encoded = base64.b64encode(cipher.encrypt(msg_text))
    print(encoded)
    decoded = cipher.decrypt(base64.b64decode(encoded))
    print(decoded)
If someone gets a hold of your database and your code base, they will be able to decode the encrypted data. Keep your secret key safe.
The “encoded_c” mentioned in @smehmood’s Vigenère cipher answer should be “key_c”.
Here are working encode/decode functions.
    import base64

    # Python 2 code: str is a byte string here
    def encode(key, clear):
        enc = []
        for i in range(len(clear)):
            key_c = key[i % len(key)]
            enc_c = chr((ord(clear[i]) + ord(key_c)) % 256)
            enc.append(enc_c)
        return base64.urlsafe_b64encode("".join(enc))

    def decode(key, enc):
        dec = []
        enc = base64.urlsafe_b64decode(enc)
        for i in range(len(enc)):
            key_c = key[i % len(key)]
            dec_c = chr((256 + ord(enc[i]) - ord(key_c)) % 256)
            dec.append(dec_c)
        return "".join(dec)
Disclaimer: As implied by the comments, this should not be used to protect data in a real application, unless you read this and don’t mind talking with lawyers:
Here’s a Python 3 version of the functions from @qneill’s answer:
    import base64

    def encode(key, clear):
        enc = []
        for i in range(len(clear)):
            key_c = key[i % len(key)]
            enc_c = chr((ord(clear[i]) + ord(key_c)) % 256)
            enc.append(enc_c)
        # str -> UTF-8 bytes -> base64 bytes -> str
        return base64.urlsafe_b64encode("".join(enc).encode()).decode()

    def decode(key, enc):
        dec = []
        enc = base64.urlsafe_b64decode(enc).decode()
        for i in range(len(enc)):
            key_c = key[i % len(key)]
            dec_c = chr((256 + ord(enc[i]) - ord(key_c)) % 256)
            dec.append(dec_c)
        return "".join(dec)
The extra encode/decode calls are needed because Python 3 has split strings and byte arrays into two different concepts, and updated its APIs to reflect that.
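A short sketch of that split, using only the base64 module: in Python 3 the encoder rejects str input, so text has to be converted to bytes on the way in and back to str on the way out.

```python
import base64

text = "John Doe"

try:
    base64.urlsafe_b64encode(text)  # str is not accepted in Python 3
except TypeError as exc:
    print("str rejected:", exc)

encoded = base64.urlsafe_b64encode(text.encode())  # bytes in, bytes out
print(encoded)           # b'Sm9obiBEb2U='
print(encoded.decode())  # Sm9obiBEb2U=

# Reverse the steps: base64 bytes -> raw bytes -> str
print(base64.urlsafe_b64decode(encoded).decode())  # John Doe
```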
Disclaimer: As mentioned in the comments, this should not be used to protect data in a real application.
As has been mentioned, the PyCrypto library contains a suite of ciphers. The XOR “cipher” can be used to do the dirty work if you don’t want to do it yourself:
    from Crypto.Cipher import XOR
    import base64

    def encrypt(key, plaintext):
        cipher = XOR.new(key)
        return base64.b64encode(cipher.encrypt(plaintext))

    def decrypt(key, ciphertext):
        cipher = XOR.new(key)
        return cipher.decrypt(base64.b64decode(ciphertext))
The cipher works as follows without having to pad the plaintext:
    >>> encrypt('notsosecretkey', 'Attack at dawn!')
    'LxsAEgwYRQIGRRAKEhdP'
    >>> decrypt('notsosecretkey', encrypt('notsosecretkey', 'Attack at dawn!'))
    'Attack at dawn!'
Credit to https://stackoverflow.com/a/2490376/241294 for the base64 encode/decode functions (I’m a python newbie).
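For environments where PyCrypto is not installed, the same repeating-key XOR can be sketched with the standard library. This is an illustration operating on bytes, not the PyCrypto API; xor_bytes, xor_encrypt and xor_decrypt are hypothetical names, and the scheme offers obfuscation only.

```python
import base64
from itertools import cycle


def xor_bytes(key: bytes, data: bytes) -> bytes:
    """XOR each data byte with the key, repeating the key as needed."""
    return bytes(d ^ k for d, k in zip(data, cycle(key)))


def xor_encrypt(key: bytes, plaintext: bytes) -> bytes:
    return base64.b64encode(xor_bytes(key, plaintext))


def xor_decrypt(key: bytes, ciphertext: bytes) -> bytes:
    return xor_bytes(key, base64.b64decode(ciphertext))


token = xor_encrypt(b"notsosecretkey", b"Attack at dawn!")
print(token)                                  # b'LxsAEgwYRQIGRRAKEhdP'
print(xor_decrypt(b"notsosecretkey", token))  # b'Attack at dawn!'
```

The output matches the session shown above, which confirms that the "cipher" is nothing more than a byte-wise XOR against the repeated key.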
Here’s an implementation of URL-safe encryption and decryption using AES (PyCrypto) and base64.
    import base64
    from Crypto import Random
    from Crypto.Cipher import AES

    AKEY = b'mysixteenbytekey'  # AES key must be either 16, 24, or 32 bytes long

    iv = Random.new().read(AES.block_size)

    def encode(message):
        obj = AES.new(AKEY, AES.MODE_CFB, iv)
        return base64.urlsafe_b64encode(obj.encrypt(message))

    def decode(cipher):
        obj2 = AES.new(AKEY, AES.MODE_CFB, iv)
        return obj2.decrypt(base64.urlsafe_b64decode(cipher))
If you face an issue like https://bugs.python.org/issue4329 (TypeError: character mapping must return integer, None or unicode), use str(cipher) when decoding, that is, pass str(cipher) to base64.urlsafe_b64decode().
    In : encode(b"Hello World")
    Out: b'67jjg-8_RyaJ-28='

    In : %timeit encode("Hello World")
    100000 loops, best of 3: 13.9 µs per loop

    In : decode(b'67jjg-8_RyaJ-28=')
    Out: b'Hello World'

    In : %timeit decode(b'67jjg-8_RyaJ-28=')
    100000 loops, best of 3: 15.2 µs per loop
Working encode/decode functions in python3 (adapted very little from qneill’s answer):
    import base64

    def encode(key, clear):
        enc = []
        for i in range(len(clear)):
            key_c = key[i % len(key)]
            # Collect integer byte values so bytes() can build the result
            enc_c = (ord(clear[i]) + ord(key_c)) % 256
            enc.append(enc_c)
        return base64.urlsafe_b64encode(bytes(enc))

    def decode(key, enc):
        dec = []
        enc = base64.urlsafe_b64decode(enc)
        for i in range(len(enc)):
            key_c = key[i % len(key)]
            # Indexing bytes yields an int in Python 3, so no ord() is needed
            dec_c = chr((256 + enc[i] - ord(key_c)) % 256)
            dec.append(dec_c)
        return "".join(dec)