Skip to main content

AWS S3

Prepare the storage

In this quickstart, you will create an AWS S3 bucket and set up credentials with the required access.

Create an S3 bucket using the command below, using your own bucket name (or do so in the AWS console):

$ aws s3 mb s3://<your-bucket-name>
note

S3 has evolved into a protocol, instead of just an Amazon product. It is possible to use our s3 data connector with a non-AWS storage solution, as long as it provides an S3 compatible API. For example min.io.

Create a file with the policy document below and save it in the current directory. This file contains the permissions the data connector needs.

strm-policy.json
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [ "s3:GetBucketLocation" ],
"Resource": "arn:aws:s3:::<your-bucket-name>"
},
{
"Effect": "Allow",
"Action": [ "s3:PutObject" ],
"Resource": "arn:aws:s3:::<your-bucket-name>/<optional-prefix>/*"
},
// if your data-connector is used in batch jobs it also needs GetObject rights
{
"Effect": "Allow",
"Action": [ "s3:GetObject" ],
"Resource": "arn:aws:s3:::<your-bucket-name>/<optional-prefix>/*"
}
]
}
  1. fill in your bucket name
  2. fill in your bucket name and an optional path prefix. You might add additional constraints via a file suffix (*.jsonl, *.csv).
info

Make sure you replace all occurrences of <your-bucket-name> with the actual name of your S3 bucket and replace the <optional-prefix> with the prefix directory in which STRM Privacy should put the files. If there is no prefix, also leave out the last slash (as a double slash will not work).

The provided policy document shows the minimal set of permissions needed by the data connector. They are used as follows:

  • GetBucketLocation: This is an unfortunate necessity, because the AWS SDK requires to connect to the same region as from where the bucket has originally been created. STRM Privacy cannot know this in advance, so we need to query it using this operation.
  • PutObject: the data connector will be able to write files to the specified location (bucket + prefix).
  • GetObject: the data connector will be able to read files from the specified location (bucket + prefix).

No more permissions than these are required. STRM Privacy prefers to have as few permissions as possible.

Next, an IAM user needs to be created that adheres to this policy. This quickstart uses the username strm-export-demo, but it is recommened that you use a more descriptive name for your organization.

First create the user:

$ aws iam create-user --user-name strm-export-demo

Then attach the policy that you have just created. This example assumes the policy document is in the same directory. Replace the file name strm-policy.json with the correct file name.

aws iam put-user-policy --user-name strm-export-demo \
--policy-name strm-bucket-write-access \
--policy-document file://strm-policy.json

Finally, create the access key for this user and download the credentials (keep these safe, as they provide access to the bucket):

aws iam create-access-key --user-name strm-export-demo > s3.json
note

STRM Privacy validates the user and credentials configuration by writing an empty JSONL file (file name: .strm_test_<random UUID>.jsonl) to the specified bucket/prefix upon creation of the batch exporter. If the data connector used has delete permissions, this file will be deleted after the test and therefore absent.

First, make sure you have a file named s3.json in your current directory, with the following contents:

s3.json
{
"AccessKey": {
"UserName": "<your user name>",
"AccessKeyId": "<your access key>",
"Status": "Active",
"SecretAccessKey": "<your secret access key>",
"CreateDate": "<the creation date>"
}
}
tip

This is the same JSON as returned by aws iam create-access-key.

Create the data connector

With the AWS credentials in the file s3.json, you can create the data connector using the following command:

$ strm create data-connector s3 my-s3 strmprivacy-export-demo --credentials-file=s3.json
{
"ref": {
"name": "my-s3",
"projectId": "30fcd008-9696-...."
},
"s3Bucket": {
"bucketName": "strmprivacy-export-demo"
}
}

This will create a data connector named my-s3 for the bucket strmprivacy-export-demo, using the provided credentials. Specify the actual name of your bucket, and any name for the data connector itself.