Easy-to-use GDPR guide for Data Scientists. Part 1/2
As a successful Data Scientist, what can and can't I do to stay GDPR compliant? Amazon Web Services (AWS) vs on-premise. De-identification vs anonymization. Anonymization: removing, masking or suppression, generalization, k-anonymization, scrambling, blurring. Pseudonymization: tokenization, hashing, encryption, key deletion or crypto-shredding.
Table of Contents
- AWS vs on-premise
- Amazon S3 bucket encryption
- Amazon S3 bucket access
- De-identification vs Anonymization (part 2)
- Removing (part 2)
- Masking or suppression (part 2)
- Generalization (part 2)
- K-anonymization (part 2)
- Scrambling (part 2)
- Blurring (part 2)
- Pseudonymization (part 2)
- Tokenization (part 2)
- Hashing (part 2)
- Encryption (part 2)
- Key deletion or crypto-shredding (part 2)
Disclaimer: I do not represent my current/previous employers on my personal Medium blog.
The guide below covers the main steps for a data processor, not a data controller.
"Processor" means a natural or legal person, public authority, agency or other body which processes personal data on behalf of the controller (GDPR, Article 4).
AWS vs on-premise
I strongly recommend storing data in a cloud like AWS or GCP. Do not use storage services like Google Drive, Dropbox, Box, and OneDrive. Store only encrypted data on your USB flash drives and external HDDs/SSDs.
Amazon S3 is object storage built to store and retrieve any amount of data. I’ll show you what a GDPR-compliant S3 bucket looks like in the next sections.
Amazon S3 bucket encryption
This step is optional. As a successful Data Scientist, I can use Amazon S3 bucket encryption.
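Navigate to https://console.aws.amazon.com/kms/ and click the Create a key button. Enter a description (e.g. Key for Amazon S3 bucket encryption) and click the Next button. Provide tags if you use them, then click the Next button through the remaining steps. Review the key policy and confirm to create the key.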
Navigate to https://console.aws.amazon.com/s3/ and select your bucket. Select the Properties tab and click the Default encryption button. Choose the AWS-KMS option, select your KMS key (e.g. medium), and save the setting.
Amazon S3 now automatically encrypts objects as they are stored in the bucket.
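The console steps in this section can also be scripted. Here is a minimal sketch with boto3, assuming the library is installed and AWS credentials are already configured. The helper names (encryption_config, create_bucket_key, enable_default_encryption) are hypothetical, and the alias medium and bucket korniichuk.enc are only the example values from this post.

```python
def encryption_config(kms_key_id):
    """Build the default-encryption rule that points a bucket at a KMS key."""
    return {
        "Rules": [
            {
                "ApplyServerSideEncryptionByDefault": {
                    "SSEAlgorithm": "aws:kms",
                    "KMSMasterKeyID": kms_key_id,
                }
            }
        ]
    }


def create_bucket_key(kms_client, alias, description):
    """Create a KMS key, attach an alias to it, and return the key ID."""
    response = kms_client.create_key(Description=description)
    key_id = response["KeyMetadata"]["KeyId"]
    # KMS alias names must start with the "alias/" prefix.
    kms_client.create_alias(AliasName="alias/" + alias, TargetKeyId=key_id)
    return key_id


def enable_default_encryption(s3_client, bucket, kms_key_id):
    """Turn on default SSE-KMS encryption for the bucket."""
    s3_client.put_bucket_encryption(
        Bucket=bucket,
        ServerSideEncryptionConfiguration=encryption_config(kms_key_id),
    )


def main():
    # Imported here so the helpers above can be exercised without AWS access.
    import boto3

    # Example values from this post; replace them with your own.
    key_id = create_bucket_key(
        boto3.client("kms"), "medium", "Key for Amazon S3 bucket encryption"
    )
    enable_default_encryption(boto3.client("s3"), "korniichuk.enc", key_id)
```

Calling main() runs both steps against your own account. The helpers take the client as an argument, so you can try them with a stub client before touching real resources.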
Amazon S3 bucket access
This step is required! Navigate to https://console.aws.amazon.com/s3/ and find your bucket (e.g. korniichuk.enc). Check that its access is shown as Bucket and objects not public. If you need to fix the access settings, select your bucket and click the Edit public access settings button.
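If you prefer code to the console, the public access settings can be applied with boto3 as well. This is a minimal sketch, assuming boto3 is installed and credentials are configured. The helper name block_public_access is hypothetical, and turning on all four flags is my assumption of the strictest setup, not a requirement stated by AWS or the GDPR.

```python
# All four switches on: no public ACLs and no public bucket policies.
PUBLIC_ACCESS_BLOCK = {
    "BlockPublicAcls": True,
    "IgnorePublicAcls": True,
    "BlockPublicPolicy": True,
    "RestrictPublicBuckets": True,
}


def block_public_access(s3_client, bucket):
    """Apply the block-public-access configuration to a single bucket."""
    s3_client.put_public_access_block(
        Bucket=bucket,
        # Pass a copy so the module-level constant is never mutated.
        PublicAccessBlockConfiguration=dict(PUBLIC_ACCESS_BLOCK),
    )


def main():
    # Imported here so the helper above can be exercised without AWS access.
    import boto3

    # Example bucket name from this post; replace it with your own.
    block_public_access(boto3.client("s3"), "korniichuk.enc")
```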
Summary: make sure your Amazon S3 bucket and objects are not public. You can use Amazon S3 bucket encryption as an extra option.
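Both points of the summary can be checked with read-only calls. The sketch below assumes boto3 and configured credentials. The helper names are hypothetical, and the public access check is deliberately conservative: it passes only when all four block flags are on. Both S3 calls may raise botocore.exceptions.ClientError when the bucket has no such configuration.

```python
REQUIRED_FLAGS = (
    "BlockPublicAcls",
    "IgnorePublicAcls",
    "BlockPublicPolicy",
    "RestrictPublicBuckets",
)


def is_public_access_blocked(s3_client, bucket):
    """Return True only if every block-public-access flag is enabled."""
    response = s3_client.get_public_access_block(Bucket=bucket)
    config = response["PublicAccessBlockConfiguration"]
    return all(config.get(flag, False) for flag in REQUIRED_FLAGS)


def default_encryption_algorithm(s3_client, bucket):
    """Return the bucket's default encryption algorithm, e.g. 'aws:kms'."""
    response = s3_client.get_bucket_encryption(Bucket=bucket)
    rules = response["ServerSideEncryptionConfiguration"]["Rules"]
    return rules[0]["ApplyServerSideEncryptionByDefault"]["SSEAlgorithm"]


def main():
    # Imported here so the helpers above can be exercised without AWS access.
    import boto3

    s3 = boto3.client("s3")
    bucket = "korniichuk.enc"  # example bucket name from this post
    print("Public access blocked:", is_public_access_blocked(s3, bucket))
    print("Default encryption:", default_encryption_algorithm(s3, bucket))
```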