Easy-to-use GDPR guide for Data Scientists. Part 1/2
As a successful Data Scientist, what can and can't I do to stay GDPR compliant? Amazon Web Services (AWS) vs on-premise. De-identification vs Anonymization. Anonymization: removing, masking or suppression, generalization, k-anonymization, scrambling, blurring. Pseudonymization: tokenization, hashing, encryption, key deletion or crypto-shredding.
Table of Contents
- AWS vs on-premise
- Amazon S3 bucket encryption
- Amazon S3 bucket access
- De-identification vs Anonymization (part 2)
- Removing (part 2)
- Masking or suppression (part 2)
- Generalization (part 2)
- K-anonymization (part 2)
- Scrambling (part 2)
- Blurring (part 2)
- Pseudonymization (part 2)
- Tokenization (part 2)
- Hashing (part 2)
- Encryption (part 2)
- Key deletion or crypto-shredding (part 2)
Disclaimer: I do not represent my current/previous employers on my personal Medium blog.
The guide below covers the main steps for a data processor, not a data controller.
Processor means a natural or legal person, public authority, agency or other body which processes personal data on behalf of the controller
Source: http://bit.ly/2OYpwVA
AWS vs on-premise
I strongly recommend storing data in a cloud such as AWS or GCP. Do not use storage services like Google Drive, Dropbox, Box, or OneDrive. Store only encrypted data on your USB flash drives and external HDDs/SSDs.
Amazon S3 is object storage built to store and retrieve any amount of data. In the next sections I'll show you what a GDPR-compliant S3 bucket looks like.
Amazon S3 bucket encryption
This step is optional. As a successful Data Scientist, I can use Amazon S3 bucket encryption.
Navigate to https://console.aws.amazon.com/kms/ and click the Create a key button.
Enter an Alias (e.g. medium) and a Description (e.g. Key for Amazon S3 bucket encryption). Click the Next button.
Provide tags like Team, Owner, and Impact. Click the Next button.
Click the Next button on each of the following two screens. Review the key policy and click the Finish button.
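The console steps above can also be scripted. The following is a minimal sketch, assuming the boto3 SDK (the guide itself only uses the web console). The function name `create_bucket_key` and the tag values are hypothetical; the alias and description are the examples from the text.

```python
def create_bucket_key(kms_client, alias, description, tags):
    """Create a KMS key with a description and tags, then give it an alias.

    kms_client is assumed to be a boto3 KMS client, e.g. boto3.client("kms").
    tags is a plain dict such as {"Team": "...", "Owner": "...", "Impact": "..."}.
    """
    response = kms_client.create_key(
        Description=description,
        # KMS expects tags as a list of TagKey/TagValue pairs.
        Tags=[{"TagKey": key, "TagValue": value} for key, value in tags.items()],
    )
    key_id = response["KeyMetadata"]["KeyId"]
    # KMS alias names must start with the "alias/" prefix.
    kms_client.create_alias(AliasName=f"alias/{alias}", TargetKeyId=key_id)
    return key_id


# Usage (requires boto3 and AWS credentials; tag values are placeholders):
#   import boto3
#   key_id = create_bucket_key(
#       boto3.client("kms"),
#       alias="medium",
#       description="Key for Amazon S3 bucket encryption",
#       tags={"Team": "data-science", "Owner": "me", "Impact": "high"},
#   )
```

This sketch relies on the default key policy instead of the permission screens that the console walks through, so review the resulting policy before using the key.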
Navigate to https://console.aws.amazon.com/s3/ and select your bucket. Select the Properties tab. Click the Default encryption button.
Select the AWS-KMS option. Select your KMS key (e.g. medium) and click the Save button.
Amazon S3 now automatically encrypts new objects stored in the bucket.
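Default encryption can also be switched on from code. Here is a minimal sketch, assuming the boto3 SDK rather than the console; the function names are hypothetical, and `kms_key_id` stands for the ID or ARN of the KMS key you created.

```python
def default_encryption_config(kms_key_id):
    """Build the S3 default-encryption rule that uses a KMS key (SSE-KMS)."""
    return {
        "Rules": [
            {
                "ApplyServerSideEncryptionByDefault": {
                    "SSEAlgorithm": "aws:kms",
                    "KMSMasterKeyID": kms_key_id,
                }
            }
        ]
    }


def enable_default_encryption(s3_client, bucket, kms_key_id):
    """Apply the default-encryption rule to a bucket.

    s3_client is assumed to be a boto3 S3 client, e.g. boto3.client("s3").
    """
    s3_client.put_bucket_encryption(
        Bucket=bucket,
        ServerSideEncryptionConfiguration=default_encryption_config(kms_key_id),
    )


# Usage (requires boto3 and AWS credentials; the bucket name is a placeholder):
#   import boto3
#   enable_default_encryption(boto3.client("s3"), "my-bucket", key_id)
```

Objects that were already in the bucket before this setting was applied are not re-encrypted by it.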
Amazon S3 bucket access
This step is required! Navigate to https://console.aws.amazon.com/s3/ and find your bucket (e.g. korniichuk.enc). Verify that its access setting reads Bucket and objects not public.
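The same check can be approximated from code. The sketch below assumes the boto3 SDK and treats a bucket as safe only when all four S3 Block Public Access settings are turned on, which is stricter than the console label; the function names are hypothetical.

```python
REQUIRED_SETTINGS = (
    "BlockPublicAcls",
    "IgnorePublicAcls",
    "BlockPublicPolicy",
    "RestrictPublicBuckets",
)


def is_public_access_blocked(config):
    """Return True only if all four block-public-access settings are True."""
    return all(config.get(name) is True for name in REQUIRED_SETTINGS)


def bucket_blocks_public_access(s3_client, bucket):
    """Fetch the bucket's public access block settings and check them.

    s3_client is assumed to be a boto3 S3 client, e.g. boto3.client("s3").
    """
    response = s3_client.get_public_access_block(Bucket=bucket)
    return is_public_access_blocked(response["PublicAccessBlockConfiguration"])


# Usage (requires boto3 and AWS credentials):
#   import boto3
#   print(bucket_blocks_public_access(boto3.client("s3"), "korniichuk.enc"))
```

Note that `get_public_access_block` raises a client error when the bucket has no public access block configuration at all; treat that case as "not verified" rather than as safe.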
If you need to fix the access settings, select your bucket (e.g. korniichuk.enc) and click the Edit public access settings button.
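If you prefer to fix the settings from code, one option is the sketch below. It assumes the boto3 SDK and simply turns on all four S3 Block Public Access settings for the bucket; the function name is hypothetical.

```python
def block_all_public_access(s3_client, bucket):
    """Turn on every block-public-access setting for a single bucket.

    s3_client is assumed to be a boto3 S3 client, e.g. boto3.client("s3").
    """
    s3_client.put_public_access_block(
        Bucket=bucket,
        PublicAccessBlockConfiguration={
            "BlockPublicAcls": True,        # reject new public ACLs
            "IgnorePublicAcls": True,       # ignore existing public ACLs
            "BlockPublicPolicy": True,      # reject new public bucket policies
            "RestrictPublicBuckets": True,  # restrict access granted by public policies
        },
    )


# Usage (requires boto3 and AWS credentials):
#   import boto3
#   block_all_public_access(boto3.client("s3"), "korniichuk.enc")
```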
Summary: make sure the Amazon S3 bucket and its objects are not public. You can use Amazon S3 bucket encryption as an extra option.