Exporting encryption keys

The output streams feature manages the decryption of data for you. The keys exist only within the STRM Privacy keys database, for slightly longer than the key rotation period [1]. The output streams have a default retention of 7 days, so if you don’t have the keys and don’t consume or export the output streams within 7 days, you lose the ability to decrypt the personal data attributes.

Assuming your company decides that it wants to have the actual encryption keys [2], you need to configure STRM Privacy to provide you with the keys. A prerequisite is that this capability is enabled for your account. If it is not, the features below are unavailable to you.

Exporting keys is only permitted if your account allows this. There is currently no way to enable this setting from the customer portal. Contact us if you need this feature.

The encryption keys

We use Google Tink as an abstraction library for standard AES-256 encryption with a synthetic initialization vector (SIV). The SIV makes encryption deterministic: for a given encryption key, a given plaintext value always produces the same ciphertext.
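The deterministic property can be illustrated with a toy sketch. This is not AES-SIV and not what Tink does internally: it only derives a "synthetic" IV from the key and the plaintext with HMAC-SHA256 (an assumed stand-in for the real construction), to show why equal inputs lead to equal outputs. The function name toy_synthetic_iv and the keys are hypothetical.

```python
import hashlib
import hmac


def toy_synthetic_iv(key: bytes, plaintext: bytes) -> bytes:
    """Derive a 16-byte IV from key and plaintext (toy stand-in, NOT AES-SIV)."""
    return hmac.new(key, plaintext, hashlib.sha256).digest()[:16]


key_a = b"a" * 32  # hypothetical 256-bit keys, for illustration only
key_b = b"b" * 32

iv_1 = toy_synthetic_iv(key_a, b"alice@example.com")
iv_2 = toy_synthetic_iv(key_a, b"alice@example.com")
iv_3 = toy_synthetic_iv(key_b, b"alice@example.com")

print(iv_1 == iv_2)  # same key, same plaintext: the IV is identical
print(iv_1 == iv_3)  # different key: the IV differs
```

Because the IV depends only on the key and the plaintext, nothing random enters the encryption, which is why the ciphertext repeats for repeated values under one key.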

Have a sink ready

We’re using a demo gcloud sink that we have created with

strm create sink demo strm-demo --credentials-file gcloud.json

using service account credentials created via the Google Cloud console.

When using an AWS S3 sink, the mechanism is identical and described here.

You can access the bucket contents with the AWS CLI for S3, or with the gsutil command-line tool for Google Cloud.

Creating an exporter

Currently, we only provide batch exporters for the encryption keys, which work very similarly to the events batch exporters. You therefore need the same setup, with authenticated and authorized IAM users.

$ strm create batch-exporter --help
Create batch exporter

Usage:
  strm create batch-exporter [stream-name] [flags]

Flags:
      --export-keys          Do we want to export the keys stream
  -h, --help                 help for batch-exporter
      --interval int         Interval in seconds between batches (default 60)
      --name string          optional batch exporter name
      --path-prefix string   path prefix on bucket
      --sink string          name of the sink. Optional if you have only one defined sink.

We’re looking for the --export-keys option. Provided key exporting is enabled for your account, you can do the following:

strm create batch-exporter demo --export-keys \
  --interval 30 --path-prefix demo-keys --sink demo
{
  "ref": { (1)
    "billingId": "demo8542234275", "name": "demo-demo-keys"
  },
  "keyStreamRef": { (2)
    "billingId": "demo8542234275", "name": "demo"
  },
  "interval": "30s",
  "sinkName": "demo", (3)
  "pathPrefix": "demo-keys" (4)
}
1 the reference to the batch-exporter
2 the reference to the key stream
3 the name of the sink to use
4 a directory to use in the bucket for storing keys.
If you have more than one sink defined, you must specify the name of the sink to use. If you have only one, it is used by default.
The current implementation (released on 04 May 2021) of this key export mechanism does not export keys that were created more than 7 days earlier [3].
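The 7-day limit can be expressed as a simple check. The sketch below assumes you know when a key was created; the function name is_exportable and the constant are hypothetical, and the 7 days are taken from the retention period stated in this section.

```python
from datetime import datetime, timedelta, timezone

# Retention of the keys topic as stated in this section (assumption: exactly 7 days).
KEY_RETENTION = timedelta(days=7)


def is_exportable(created_at: datetime, now: datetime) -> bool:
    """Return True if a key created at `created_at` is still within retention at `now`."""
    return now - created_at <= KEY_RETENTION


now = datetime(2021, 5, 4, 12, 0, tzinfo=timezone.utc)
print(is_exportable(now - timedelta(days=6), now))  # key created 6 days earlier
print(is_exportable(now - timedelta(days=8), now))  # key created 8 days earlier
```

In practice this means a key batch exporter should be created within 7 days of the first events whose keys you want to keep.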

Exported keys in the bucket

We have been running strm sim run-random demo for a while in another terminal, so there is key data to export.

You can have a look at the output:

gsutil ls gs://strm-demo/demo-keys/
gs://strm-demo/demo-keys/2021-08-18T12:09:00-keys-3b398d5c-2d7c-4673-9f73-3693e137ddbb---0-1-2-3-4.jsonl
gs://strm-demo/demo-keys/2021-08-18T12:09:30-keys-3b398d5c-2d7c-4673-9f73-3693e137ddbb---0-1-2-3-4.jsonl
gsutil cat gs://strm-demo/demo-keys/2021-08-18T12:09:30-keys-3b398d5c-2d7c-4673-9f73-3693e137ddbb---0-1-2-3-4.jsonl | tail -1
{
  "keyLink": "d478e24c-d12d-466e-80dd-055736bba704",
  "tinkKey": {
    "primaryKeyId": 2140201303,
    "key": [
      {
        "keyData": {
          "typeUrl": "type.googleapis.com/google.crypto.tink.AesSivKey",
          "keyMaterialType": "SYMMETRIC",
          "value": "EkAho6Jgghn8m//At...."
        },
        "outputPrefixType": "TINK",
        "keyId": 2140201303,
        "status": "ENABLED"
      }
    ]
  }
}
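Judging from the listing above, each file name appears to start with a timestamp of the form YYYY-MM-DDTHH:MM:SS. Assuming that holds (the naming scheme is inferred from the sample output, not taken from a specification), a sketch like the following could select the key files written at or after a given moment. The function names batch_time and files_since are hypothetical.

```python
from datetime import datetime
from typing import List


def batch_time(object_name: str) -> datetime:
    """Parse the leading timestamp of a key file name (format inferred from the listing)."""
    file_name = object_name.rsplit("/", 1)[-1]
    return datetime.strptime(file_name[:19], "%Y-%m-%dT%H:%M:%S")


def files_since(object_names: List[str], since: datetime) -> List[str]:
    """Keep only the key files whose batch timestamp is at or after `since`."""
    selected = []
    for name in object_names:
        if name.rsplit("/", 1)[-1].startswith("."):
            continue  # skip hidden files, which carry no timestamp
        if batch_time(name) >= since:
            selected.append(name)
    return selected


listing = [
    "gs://strm-demo/demo-keys/2021-08-18T12:09:00-keys-3b398d5c-2d7c-4673-9f73-3693e137ddbb---0-1-2-3-4.jsonl",
    "gs://strm-demo/demo-keys/2021-08-18T12:09:30-keys-3b398d5c-2d7c-4673-9f73-3693e137ddbb---0-1-2-3-4.jsonl",
]
recent = files_since(listing, datetime(2021, 8, 18, 12, 9, 30))
print(recent)
```

The listing here is copied from the gsutil output; in a real script it would come from the output of gsutil ls or aws s3 ls.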

You can do exactly the same for an AWS S3 bucket. Inspect the keys in the sink like so:

aws s3 ls strmprivacy-export-demo/perf-test-keys/
2021-05-04 15:41:37          0 .strm_test...95-dfec21be8251.jsonl (1)
2021-05-04 16:13:01     166008 2021-05-04T14:13:00-keys-e1...-7-8-9.jsonl (2)
2021-05-04 16:13:31     701824 2021-05-04T14:13:30-keys-e1...-7-8-9.jsonl

aws s3 cp \
  s3://strmprivacy-export-demo/perf-test-keys/2021-05-04T14:13:00-keys-e1...-7-8-9.jsonl \
  - | head -1

{ "keyLink": "44861053-6a95-4ec6-8b33-96fd1f748402", (3)
  "tinkKey": {"primaryKeyId":84683988,"key":[
    {"keyData":{"typeUrl":"type.googleapis.com/google.crypto.tink.AesSivKey",
    "keyMaterialType":"SYMMETRIC",
      "value":"EkDzauIHozdnF.....WkpB8Xu"}, (4)
      "outputPrefixType":"TINK","keyId":84683988,"status":"ENABLED"}]}
}
1 This is a test file created by STRM Privacy to verify that we can actually write to this bucket. Because its name starts with a ., it is ignored by most tools. This has not yet been implemented for Google Cloud sinks.
2 Because the interval is 30 seconds, a file is written every 30 seconds. Each file is in JSON Lines format, with one key per line. Each line contains a keyLink attribute, with the key link of the events, and a tinkKey attribute that contains the serialized Tink key. The format is described in this protobuf file. The keyLink value is the same value you’ll find in the strmMeta/keyLink field of each event.
3 the key link that exists on all STRM Privacy events.
4 the actual AES-256 encryption key.
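To use the exported keys, you will usually want to look up a key by the keyLink of an event. As a minimal sketch, assuming a key file has already been downloaded and has the one-JSON-object-per-line layout described above, the following builds an in-memory index. The function name index_keys is hypothetical, the sample line is shortened, and the Tink keyset is kept as a plain dictionary without being interpreted.

```python
import json
from typing import Dict, Iterable


def index_keys(lines: Iterable[str]) -> Dict[str, dict]:
    """Map each keyLink to its serialized Tink keyset, one JSON object per line."""
    keys_by_link = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        keys_by_link[record["keyLink"]] = record["tinkKey"]
    return keys_by_link


# Hypothetical sample line; the key list is left empty, so this is not a usable key.
sample = [
    '{"keyLink": "44861053-6a95-4ec6-8b33-96fd1f748402", '
    '"tinkKey": {"primaryKeyId": 84683988, "key": []}}',
]
keys = index_keys(sample)
print(keys["44861053-6a95-4ec6-8b33-96fd1f748402"]["primaryKeyId"])
```

With a real file, an open file object can be passed directly, since iterating over it yields one line at a time. The value found in the strmMeta/keyLink field of an event is then the lookup key.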

1. default 24 hours
2. with the associated security and personal data hassles!
3. the retention period of the keys Kafka topic