Pseudonymization with a keyed-hash function in Python and AWS

A salted-hash function is not enough to secure personally identifiable information (PII)! I strongly recommend a keyed-hash function as a pseudonymization technique for GDPR compliance.

Read more about pseudonymization hashing technique…

Disclaimer: I do not represent my current/previous employers on my personal Medium blog.

The guide below covers the main steps for GDPR compliance in the data science area. Do not use this guide in the operations or security area. Salt or key reuse is a common mistake.

Step 1: create a KMS key. Navigate to the AWS KMS console and click the Create a key button:

Enter an Alias (e.g. medium) and a Description (e.g. Key for AWS Secrets Manager). Click the Next button:

Provide tags such as Team, Owner, and Impact. Click the Next button:

Click the Next button on each of the next two screens. Review the key policy and click the Finish button:
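The same key can also be created from code instead of the console. The following is a minimal sketch, assuming a boto3 KMS client is passed in; the helper name `create_kms_key` and the tag values are illustrative and not part of the original walkthrough.

```python
def create_kms_key(kms, alias, description, tags):
    """Create a KMS key and attach an alias to it.

    `kms` is assumed to be a boto3 KMS client, e.g. boto3.client('kms').
    `tags` is a plain dict such as {'Team': 'data-science'}.
    """
    response = kms.create_key(
        Description=description,
        # KMS expects tags as a list of TagKey/TagValue pairs.
        Tags=[{'TagKey': k, 'TagValue': v} for k, v in tags.items()],
    )
    key_id = response['KeyMetadata']['KeyId']
    # Alias names must start with the 'alias/' prefix.
    kms.create_alias(AliasName=f'alias/{alias}', TargetKeyId=key_id)
    return key_id


# Example call (needs AWS credentials; all values are placeholders):
# import boto3
# create_kms_key(boto3.client('kms'), 'medium',
#                'Key for AWS Secrets Manager', {'Team': 'data-science'})
```

Unlike the console wizard, this sketch leaves the key policy at its default, so review the permissions separately.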

Step 2: store the hash key as a secret. Navigate to the AWS Secrets Manager console and click the Store a new secret button:

Select the Other type of secrets secret type. Enter a variable name for the hash key (e.g. hash_key) and a value for it (e.g. passwd). Select the KMS key from Step 1 (e.g. medium). Click the Next button:

Enter a name (e.g. Medium) and a description (e.g. Secret key for keyed-hash function). Provide tags such as Team, Owner, and Impact. Click the Next button:

Select the Disable automatic rotation option and click the Next button:

Review the settings and click the Store button:
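For reference, the secret can also be stored programmatically. This is a rough sketch, assuming a boto3 Secrets Manager client is passed in; the helper name `store_hash_key` is hypothetical, and generating the key with `secrets.token_hex` is my assumption (a long random value rather than a memorable one such as passwd).

```python
import json
import secrets


def store_hash_key(secretsmanager, name, description, kms_alias, tags):
    """Store a freshly generated hash key as a JSON secret.

    `secretsmanager` is assumed to be boto3.client('secretsmanager').
    """
    # Assumption: a random 256-bit key, hex-encoded (64 characters).
    hash_key = secrets.token_hex(32)
    return secretsmanager.create_secret(
        Name=name,
        Description=description,
        KmsKeyId=f'alias/{kms_alias}',  # the KMS key alias from Step 1
        # Same JSON layout the console creates for key/value secrets.
        SecretString=json.dumps({'hash_key': hash_key}),
        Tags=[{'Key': k, 'Value': v} for k, v in tags.items()],
    )


# Example call (needs AWS credentials; all values are placeholders):
# import boto3
# store_hash_key(boto3.client('secretsmanager'), 'Medium',
#                'Secret key for keyed-hash function', 'medium',
#                {'Team': 'data-science'})
```

Automatic rotation stays off unless it is configured separately, which matches the console choice above.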

Step 3: pseudonymize an e-mail address with a keyed-hash function in Python.

Import the hashlib and json Python standard libraries:

import hashlib
import json

Install the boto3 Python library. Set up AWS credentials and a region for development, or configure the AWS CLI. Import the boto3 library:

import boto3

Initialize the email variable with the address to pseudonymize:

email = ''

Get your secret (e.g. Medium), created in Step 2:

secretsmanager = boto3.client('secretsmanager')
response = secretsmanager.get_secret_value(SecretId='Medium')
secret_string = response['SecretString']
hash_key = json.loads(secret_string)['hash_key']

Use the sha3_512() function from the standard hashlib Python library. Append the secret key to the e-mail address, then hash the combined value with the keyed-hash function:

sha3 = hashlib.sha3_512()
data = email + hash_key
sha3.update(data.encode('utf-8'))  # feed the keyed input into the hash
digest = sha3.hexdigest()
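As an alternative to concatenating the key by hand, the standard library's hmac module implements the HMAC keyed-hash construction. The sketch below is a swapped-in variant, not the approach used in this guide. The function name `pseudonymize_hmac`, the sample values, and the lowercasing and trimming step are all assumptions for illustration.

```python
import hashlib
import hmac


def pseudonymize_hmac(email, hash_key):
    """Return a hex HMAC-SHA3-512 pseudonym for an e-mail address."""
    # Assumption: addresses that differ only by case or surrounding
    # whitespace should map to the same pseudonym.
    normalized = email.strip().lower()
    return hmac.new(
        hash_key.encode('utf-8'),
        normalized.encode('utf-8'),
        hashlib.sha3_512,
    ).hexdigest()


token = pseudonymize_hmac('Jane.Doe@example.com', 'not-a-real-key')
same = pseudonymize_hmac(' jane.doe@example.com ', 'not-a-real-key')
other = pseudonymize_hmac('jane.doe@example.com', 'another-key')
print(len(token), token == same, token == other)  # 128 True False
```

The same address and key always give the same pseudonym, so records can still be joined across datasets, while a different key gives unrelated pseudonyms.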

See all parts of code in one file below:
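Put together, the snippets above might look like the following minimal sketch. The function names `get_hash_key`, `pseudonymize`, and `main` and the sample address are illustrative assumptions. boto3 is imported lazily so that the hashing part can be tried without AWS access.

```python
import hashlib
import json


def get_hash_key(secret_id, client=None):
    """Read the 'hash_key' field from a JSON secret in AWS Secrets Manager."""
    if client is None:
        import boto3  # needs AWS credentials and a region to be configured
        client = boto3.client('secretsmanager')
    response = client.get_secret_value(SecretId=secret_id)
    return json.loads(response['SecretString'])['hash_key']


def pseudonymize(email, hash_key):
    """Hash the e-mail address with the secret key appended (SHA3-512)."""
    sha3 = hashlib.sha3_512()
    sha3.update((email + hash_key).encode('utf-8'))
    return sha3.hexdigest()


def main():
    email = 'jane.doe@example.com'  # placeholder address
    hash_key = get_hash_key('Medium')  # secret created in Step 2
    print(pseudonymize(email, hash_key))


# main()  # uncomment to run against your own AWS account
```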

