Easy-to-use GDPR guide for Data Scientists. Part 1/2

Apr 9, 2019

As a successful Data Scientist, what can I do, and what can’t I do, to stay GDPR compliant? Amazon Web Services (AWS) vs on-premise. De-identification vs anonymization. Anonymization: removing, masking or suppression, generalization, k-anonymization, scrambling, blurring. Pseudonymization: tokenization, hashing, encryption, key deletion or crypto-shredding.

Disclaimer: I do not represent my current/previous employers on my personal Medium blog.

The guide below covers the main steps for a data processor, not a data controller.

Processor means a natural or legal person, public authority, agency or other body which processes personal data on behalf of the controller
Source: http://bit.ly/2OYpwVA

AWS vs on-premise

I strongly recommend storing data in a cloud such as AWS or GCP. Do not use storage services like Google Drive, Dropbox, Box, or OneDrive. If you keep data on USB flash drives or external HDDs/SSDs, store it only in encrypted form.

Amazon S3 is object storage built to store and retrieve any amount of data. In the next sections, I’ll show you what a GDPR compliant S3 bucket looks like.

Amazon S3 bucket encryption

This step is optional. As a successful Data Scientist, I can use Amazon S3 bucket encryption.

Navigate to https://console.aws.amazon.com/kms/ and click the Create a key button.

Enter an Alias (e.g. medium) and a Description (e.g. Key for Amazon S3 bucket encryption). Click the Next button.

Provide tags like Team, Owner, and Impact. Click the Next button.

Click the Next button twice more. Review the policy and click the Finish button.
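If you prefer to script this instead of clicking through the console, the same key can probably be created with boto3. The snippet below is only a sketch under a few assumptions: the helper name `create_bucket_key` and the tag values in the usage comment are made up for illustration, `kms` is expected to be a boto3 KMS client, and no key administrators or key users are selected, so the key gets the default key policy rather than whatever you would pick in the wizard.

```python
def create_bucket_key(kms, alias, description, tags):
    """Create a KMS key and attach a human-readable alias to it.

    kms  -- a boto3 KMS client, e.g. boto3.client("kms")
    tags -- a plain dict such as {"Team": "data-science"}
    Returns the ID of the new key.
    """
    response = kms.create_key(
        Description=description,
        # KMS expects tags as a list of TagKey/TagValue pairs.
        Tags=[{"TagKey": key, "TagValue": value} for key, value in tags.items()],
    )
    key_id = response["KeyMetadata"]["KeyId"]
    # Alias names must carry the "alias/" prefix.
    kms.create_alias(AliasName=f"alias/{alias}", TargetKeyId=key_id)
    return key_id


# Usage (needs boto3 and AWS credentials; tag values are examples only):
# import boto3
# key_id = create_bucket_key(
#     boto3.client("kms"),
#     alias="medium",
#     description="Key for Amazon S3 bucket encryption",
#     tags={"Team": "data-science", "Owner": "me", "Impact": "high"},
# )
```

Passing the client in as an argument keeps the helper easy to try against a stub before pointing it at a real account.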

Navigate to https://console.aws.amazon.com/s3/ and select your bucket. Select the Properties tab. Click the Default encryption button.

Select the AWS-KMS option. Select the KMS key (e.g. medium) and click the Save button.

Amazon S3 now automatically encrypts new objects stored in the bucket.
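For reference, default encryption can also be switched on from code. This is a minimal sketch, assuming `s3` is a boto3 S3 client and `kms_key_id` is the ID of a KMS key you already own. The function names are hypothetical; `put_bucket_encryption` is the boto3 call that does the work.

```python
def default_encryption_config(kms_key_id):
    """Build the rule set that makes S3 encrypt new objects with a KMS key."""
    return {
        "Rules": [
            {
                "ApplyServerSideEncryptionByDefault": {
                    "SSEAlgorithm": "aws:kms",      # the AWS-KMS option in the console
                    "KMSMasterKeyID": kms_key_id,   # the key selected in the console
                }
            }
        ]
    }


def enable_default_encryption(s3, bucket, kms_key_id):
    """Apply the default encryption rule to one bucket."""
    s3.put_bucket_encryption(
        Bucket=bucket,
        ServerSideEncryptionConfiguration=default_encryption_config(kms_key_id),
    )


# Usage (needs boto3 and AWS credentials; the key ID is a placeholder):
# import boto3
# enable_default_encryption(boto3.client("s3"), "korniichuk.enc", "<your-key-id>")
```

Keeping the configuration in its own function makes it easy to inspect or log the exact rule before it is sent to AWS.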

Amazon S3 bucket access

This step is required! Navigate to https://console.aws.amazon.com/s3/ and find your bucket (e.g. korniichuk.enc). Verify that its access setting reads "Bucket and objects not public".

If you need to fix the access settings, select your bucket (e.g. korniichuk.enc) and click the Edit public access settings button.
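The same check can be sketched in code. The example below assumes `s3` is a boto3 S3 client; the helper names and the `BLOCK_ALL` constant are hypothetical. Note that it inspects the bucket's Block Public Access settings, which is related to, but not the same thing as, the access label shown in the console (that label also depends on bucket policies and ACLs).

```python
# All four Block Public Access flags switched on.
BLOCK_ALL = {
    "BlockPublicAcls": True,
    "IgnorePublicAcls": True,
    "BlockPublicPolicy": True,
    "RestrictPublicBuckets": True,
}


def is_public_access_blocked(s3, bucket):
    """Return True only if every Block Public Access flag is enabled.

    With a real client, get_public_access_block raises an error when the
    bucket has no such configuration at all; that case is not handled here.
    """
    response = s3.get_public_access_block(Bucket=bucket)
    config = response["PublicAccessBlockConfiguration"]
    return all(config.get(flag, False) for flag in BLOCK_ALL)


def block_public_access(s3, bucket):
    """Enable all four Block Public Access flags on the bucket."""
    s3.put_public_access_block(
        Bucket=bucket,
        PublicAccessBlockConfiguration=BLOCK_ALL,
    )


# Usage (needs boto3 and AWS credentials):
# import boto3
# s3 = boto3.client("s3")
# if not is_public_access_blocked(s3, "korniichuk.enc"):
#     block_public_access(s3, "korniichuk.enc")
```

Blocking all four flags is the strictest choice; if a bucket legitimately serves public content, this sketch is not the right tool for it.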

Summary: the Amazon S3 bucket and its objects are not public. You can use Amazon S3 bucket encryption as an extra option.

Navigate to Part 2

Written by Ruslan Korniichuk