Easy-to-use GDPR guide for Data Scientists. Part 1/2

As a successful Data Scientist, what can and can't I do to stay GDPR compliant? Amazon Web Services (AWS) vs on-premise. De-identification vs anonymization. Anonymization: removing, masking or suppression, generalization, k-anonymization, scrambling, blurring. Pseudonymization: tokenization, hashing, encryption, key deletion or crypto-shredding.


Disclaimer: I do not represent my current/previous employers on my personal Medium blog.

The guide below covers the main steps for a data processor, not a data controller.

"Processor" means a natural or legal person, public authority, agency or other body which processes personal data on behalf of the controller.
Source: http://bit.ly/2OYpwVA

AWS vs on-premise

Amazon S3 is object storage built to store and retrieve any amount of data. I'll show you what a GDPR-compliant S3 bucket looks like in the next section.

Amazon S3 bucket encryption

Navigate to https://console.aws.amazon.com/kms/ and click the Create a key button.

Enter an Alias (e.g. medium) and a Description (e.g. Key for Amazon S3 bucket encryption). Click the Next button.

Provide tags such as Team, Owner, and Impact. Click the Next button.

Click the Next button on each of the following two screens. Then review the key policy and click the Finish button.
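The console steps above can also be scripted. Below is a minimal sketch, assuming you use boto3 and already have AWS credentials configured. The helper name `create_encryption_key` and the tag values are hypothetical; `create_key` and `create_alias` are regular boto3 KMS client calls. The key policy is left at the AWS default here, which may differ from what you pick in the console wizard.

```python
def create_encryption_key(kms_client, alias, description, tags):
    """Create a KMS key, attach an alias to it, and return the key ID.

    kms_client is expected to behave like boto3.client("kms").
    tags is a plain dict, e.g. {"Team": "...", "Owner": "...", "Impact": "..."}.
    """
    response = kms_client.create_key(
        Description=description,
        # KMS expects tags as a list of TagKey/TagValue pairs.
        Tags=[{"TagKey": key, "TagValue": value} for key, value in tags.items()],
    )
    key_id = response["KeyMetadata"]["KeyId"]
    # Alias names must start with the "alias/" prefix.
    kms_client.create_alias(AliasName=f"alias/{alias}", TargetKeyId=key_id)
    return key_id


# Example usage (requires boto3 and AWS credentials; tag values are made up):
# import boto3
# key_id = create_encryption_key(
#     boto3.client("kms"),
#     alias="medium",
#     description="Key for Amazon S3 bucket encryption",
#     tags={"Team": "data-science", "Owner": "me", "Impact": "high"},
# )
```

Passing the client in as an argument keeps the helper easy to try out with a stub before pointing it at a real account.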

Navigate to https://console.aws.amazon.com/s3/ and select your bucket. Select the Properties tab. Click the Default encryption button.

Select the AWS-KMS option. Select your KMS key (e.g. medium) and click the Save button.

Amazon S3 now automatically encrypts new objects stored in the bucket.
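If you prefer code over the console, default encryption can be set with boto3 as well. This is a rough sketch under the same assumptions as before (boto3 installed, credentials configured). The helper names `default_encryption_config` and `enable_default_encryption` are hypothetical; `put_bucket_encryption` is the boto3 S3 client call that sets default encryption on a bucket.

```python
def default_encryption_config(kms_key_id):
    """Build the default-encryption rule that points the bucket at a KMS key."""
    return {
        "Rules": [
            {
                "ApplyServerSideEncryptionByDefault": {
                    "SSEAlgorithm": "aws:kms",
                    # Key ID or key ARN of the KMS key created earlier.
                    "KMSMasterKeyID": kms_key_id,
                }
            }
        ]
    }


def enable_default_encryption(s3_client, bucket, kms_key_id):
    """Turn on SSE-KMS default encryption for the given bucket.

    s3_client is expected to behave like boto3.client("s3").
    """
    s3_client.put_bucket_encryption(
        Bucket=bucket,
        ServerSideEncryptionConfiguration=default_encryption_config(kms_key_id),
    )


# Example usage (requires boto3 and AWS credentials):
# import boto3
# enable_default_encryption(boto3.client("s3"), "korniichuk.enc", key_id)
```

Objects that were already in the bucket before this setting was applied keep whatever encryption state they had.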

Amazon S3 bucket access

If you need to fix the access settings, select your bucket (e.g. korniichuk.enc) and click the Edit public access settings button.

Summary: the Amazon S3 bucket and its objects are not public. You can use Amazon S3 bucket encryption as an extra layer of protection.
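The same public access settings can be applied and checked from Python. The sketch below assumes boto3 and that you want all four "block public access" options switched on; that choice is my assumption, so adjust it if your setup differs. The names `BLOCK_ALL_PUBLIC_ACCESS`, `block_public_access`, and `is_fully_blocked` are hypothetical; `put_public_access_block` and `get_public_access_block` are boto3 S3 client calls.

```python
# Assumption: every public access option is blocked.
BLOCK_ALL_PUBLIC_ACCESS = {
    "BlockPublicAcls": True,
    "IgnorePublicAcls": True,
    "BlockPublicPolicy": True,
    "RestrictPublicBuckets": True,
}


def block_public_access(s3_client, bucket):
    """Block all public access to the bucket and its objects."""
    s3_client.put_public_access_block(
        Bucket=bucket,
        PublicAccessBlockConfiguration=dict(BLOCK_ALL_PUBLIC_ACCESS),
    )


def is_fully_blocked(s3_client, bucket):
    """Return True only if all four public access options are enabled."""
    response = s3_client.get_public_access_block(Bucket=bucket)
    config = response["PublicAccessBlockConfiguration"]
    return all(config.get(flag, False) for flag in BLOCK_ALL_PUBLIC_ACCESS)


# Example usage (requires boto3 and AWS credentials):
# import boto3
# s3 = boto3.client("s3")
# block_public_access(s3, "korniichuk.enc")
# print(is_fully_blocked(s3, "korniichuk.enc"))
```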

Navigate to Part 2

Python Developer and Artificial Intelligence Engineer
