The strmMeta section

To allow STRM Privacy to manage privacy transformations, each schema needs to contain a specific section with meta information that enables this. This article describes the details of the strmMeta section.

note

Most of this article uses stream processing in its examples, though strmMeta is equally applicable and relevant to batch processing.

Every schema needs an strmMeta section

Every schema in STRM Privacy has a section called strmMeta. Why is this? And why does even a private schema that you created yourself require it?

The strmMeta section exists because it provides a link to the rules that need to be applied to an event, once it has been deserialized by the STRM Privacy Event Gateway. The serialization schema defines the shape of the event, and is the first way that STRM Privacy helps in maintaining the quality of event data.

Once deserialized, STRM Privacy needs to apply rules to:

  1. validate event attribute contents
  2. apply encryption to personal data attributes
  3. determine if events belong to the same data owner

To be able to do that, each event is assigned a data contract, and the data contract defines the rules.

An event example

The fields outside the strmMeta section can be used for anything your organization requires, as long as the event fields conform to the schema. The strmMeta section, however, must exist, and it will also end up in your persistent storage. This way, the data contract that was applied to the event, the consent that was provided by the data subject, and the link to the encryption key (among others) are still known, even when the data is at rest in your persistent storage. Below is an example event.

{
  "strmMeta": {
    "eventContractRef": "strmprivacy/example/1.3.0",
    "nonce": 15082564,
    "timestamp": 1629192833072,
    "keyLink": "55c2f72b-cff8-4814-ae33-e125c77e50f9",
    "billingId": "demo8542234275",
    "consentLevels": [ 0, 1, 2, 3 ]
  },
  "uniqueIdentifier": "unique-14",
  "consistentValue": "session-740",
  "someSensitiveValue": "ASB9bJrnYjxjNF5Txc+Wc2k1zvzFAmE03SYK499WK5Du",
  "notSensitiveValue": "not-sensitive-39"
}

  1. eventContractRef (required): the reference to the data contract that governs the privacy and validation rules. The sending application must set this field to a (handle/name/version) reference of an event contract that refers to this serialization schema.
  2. nonce: a random integer added to each event on acceptance. This makes it easy to detect possible data duplications in downstream processing. The sending application does not need to set this field.
  3. timestamp: a timestamp with millisecond precision, added upon acceptance by the STRM Privacy Event Gateway. The sending application does not need to set this field.
  4. keyLink: a random value that provides a link to the encryption key that was used to encrypt the PII fields of this event. The sending application does not need to set this field.
  5. billingId (deprecated): this field was required in the past, but is not anymore. It will be removed in a future version of strmMeta.
  6. consentLevels (required): 0 or more consents that were given by the data subject for the further use of this event. Each value refers to a specific purpose. The sending application must set this field. Read more on purposes here.
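
As an illustration of the field list above, here is a minimal Python sketch of what a sending application might do: fill in only the two required fields and leave the rest to the Event Gateway. The helper name build_strm_meta is hypothetical and not part of any STRM Privacy SDK, and the check on the reference format only assumes the handle/name/version shape mentioned above.

```python
def build_strm_meta(event_contract_ref, consent_levels):
    """Return the part of strmMeta that the sending application must set.

    Hypothetical helper: nonce, timestamp and keyLink are left out on purpose,
    because the text states they are added when the event is accepted.
    """
    # Assumption: a reference has exactly three parts, handle/name/version.
    if len(event_contract_ref.split("/")) != 3:
        raise ValueError("expected a handle/name/version reference")
    return {
        "eventContractRef": event_contract_ref,
        "consentLevels": list(consent_levels),
    }


event = {
    "strmMeta": build_strm_meta("strmprivacy/example/1.3.0", [0, 1, 2, 3]),
    "uniqueIdentifier": "unique-14",
    "consistentValue": "session-740",
    "someSensitiveValue": "a-sensitive-value",
    "notSensitiveValue": "not-sensitive-39",
}
```

An empty list is a valid value for consentLevels; it means the data subject gave no consent.
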
info

The strmMeta section uses eventContractRef rather than dataContractRef for legacy reasons. This will be changed in a backwards-compatible way in the future; until then, the two references can be considered identical.

Reference to the Data Contract (eventContractRef)

An STRM Privacy event is transmitted to the Event Gateway in an HTTP/2 POST call, with the serialized event in the body and a header named Strm-Schema-Ref that tells the Event Gateway how to deserialize the data.
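
The two parts of such a call can be sketched as follows. This is only an illustration under several assumptions: JSON stands in for the real serialization format, the value passed as schema reference is a placeholder, and the function build_gateway_request is hypothetical. The sketch deliberately stops before sending anything, since the gateway URL and authentication are not covered in this article.

```python
import json

SCHEMA_REF_HEADER = "Strm-Schema-Ref"  # header name as given in the text


def build_gateway_request(event, schema_ref):
    """Return (headers, body) for a POST call to the Event Gateway.

    Assumption: the event is serialized as JSON here purely for illustration;
    the actual serialization is determined by the serialization schema.
    """
    headers = {SCHEMA_REF_HEADER: schema_ref}
    body = json.dumps(event).encode("utf-8")
    return headers, body


headers, body = build_gateway_request(
    {
        "strmMeta": {
            "eventContractRef": "strmprivacy/example/1.3.0",
            "consentLevels": [0, 1, 2, 3],
        },
        "uniqueIdentifier": "unique-14",
    },
    schema_ref="strmprivacy/example/1.3.0",  # placeholder value (assumption)
)
```
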

Once the event is deserialized, the Event Gateway looks at the value of strmMeta/eventContractRef (inside the deserialized event) to determine the rules to be applied to this event. More details on this process here.

The sending application must fill in the consentLevels field with a list of consents given by the data owner for the use of this event. Technically, this field holds a list of 0 or more integers, which refer to your organization's purpose maps. If no consent levels are set, the data subject has not given any permission to use their sensitive (personal) data fields. In that case, all sensitive data are permanently hidden in the encrypted stream.

See here for a discussion on purposes in your organization.
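
A downstream consumer could interpret the consentLevels list along these lines. The sketch below assumes that each integer stands for one purpose and that a purpose is permitted only when its integer is present in the list; the function names are hypothetical.

```python
def is_purpose_consented(event, purpose):
    """True if the data subject consented to the given purpose (an integer)."""
    return purpose in event["strmMeta"].get("consentLevels", [])


def has_any_consent(event):
    """False when consentLevels is empty, i.e. no permission was given at all."""
    return bool(event["strmMeta"].get("consentLevels", []))


with_consent = {"strmMeta": {"consentLevels": [0, 1, 2, 3]}}
without_consent = {"strmMeta": {"consentLevels": []}}

print(is_purpose_consented(with_consent, 2))     # True
print(is_purpose_consented(with_consent, 7))     # False
print(has_any_consent(without_consent))          # False
```
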

When the STRM Privacy Event Gateway determines that an event belongs to a new sequence (via the value of the keyField in the data contract), or that an existing sequence has lasted longer than 24 hours (or as the Privacy Algorithm dictates), it will generate a new encryption key for the personal data attributes.

The keyLink field holds a UUID value that is used to look up this encryption key. This lookup is done for a decrypted stream, but it can also be done by the customer if the encryption keys were exported.
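
The rotation rule and the lookup described above can be sketched as follows. This is a simplified model and not the Event Gateway's implementation: the class KeyLinkRegistry is hypothetical, the 24-hour limit is hard-coded (the text notes that the Privacy Algorithm may dictate otherwise), and the random bytes are only a stand-in for a real encryption key.

```python
import secrets
import uuid

MAX_SEQUENCE_AGE_MS = 24 * 60 * 60 * 1000  # 24 hours, per the text


class KeyLinkRegistry:
    """Hypothetical model of key rotation per sequence (keyField value)."""

    def __init__(self):
        self._sequences = {}  # keyField value -> (key_link, started_at_ms)
        self.keys = {}        # key_link -> key material (placeholder bytes)

    def key_link_for(self, key_field_value, now_ms):
        current = self._sequences.get(key_field_value)
        is_new = current is None
        is_expired = not is_new and now_ms - current[1] > MAX_SEQUENCE_AGE_MS
        if is_new or is_expired:
            # New sequence, or the sequence lasted longer than 24 hours:
            # generate a fresh key and a fresh UUID that links to it.
            key_link = str(uuid.uuid4())
            self.keys[key_link] = secrets.token_bytes(32)
            self._sequences[key_field_value] = (key_link, now_ms)
            return key_link
        return current[0]


registry = KeyLinkRegistry()
first = registry.key_link_for("session-740", now_ms=0)
same = registry.key_link_for("session-740", now_ms=60_000)
rotated = registry.key_link_for("session-740", now_ms=MAX_SEQUENCE_AGE_MS + 1)
key = registry.keys[first]  # lookup by keyLink, as a decrypting consumer would do
```

Events within one sequence share a keyLink until the sequence expires, so a consumer holding the exported keys only needs the keyLink value of an event to find the right key.
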

Unique identifier per event (nonce)

This is a convenience attribute; STRM Privacy does not technically need it. Hard experience has taught that data duplication caused by hiccups in stream processing is quite common. This might happen inside STRM Privacy, but also downstream in the customer's further processing. The unique random nonce that the STRM Privacy Event Gateway provides makes it easy to detect duplicates, especially when combined with the event timestamp.
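
Duplicate detection on the consumer side can be sketched like this, using the combination of nonce and timestamp as a fingerprint. The function name drop_duplicates is hypothetical, and the in-memory set is an assumption that only suits small batches; a long-running stream would need a bounded or windowed store instead.

```python
def drop_duplicates(events):
    """Keep the first occurrence of each (nonce, timestamp) pair, in order."""
    seen = set()
    unique = []
    for event in events:
        meta = event["strmMeta"]
        fingerprint = (meta["nonce"], meta["timestamp"])
        if fingerprint in seen:
            continue  # same nonce and timestamp: treat as a duplicate delivery
        seen.add(fingerprint)
        unique.append(event)
    return unique


batch = [
    {"strmMeta": {"nonce": 15082564, "timestamp": 1629192833072}, "uniqueIdentifier": "unique-14"},
    {"strmMeta": {"nonce": 15082564, "timestamp": 1629192833072}, "uniqueIdentifier": "unique-14"},
    {"strmMeta": {"nonce": 99120377, "timestamp": 1629192833080}, "uniqueIdentifier": "unique-15"},
]
deduplicated = drop_duplicates(batch)
print(len(deduplicated))  # 2
```
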

Process-time event time (timestamp)

This is a convenience attribute; STRM Privacy does not technically need it. It contains the time at which the event was accepted by the STRM Privacy Event Gateway, expressed in milliseconds since the Unix epoch (UTC).
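
Converting the value to a readable UTC time is straightforward. The sketch below uses only the Python standard library and integer arithmetic, so no precision is lost on the milliseconds; the function name accepted_at is hypothetical.

```python
from datetime import datetime, timedelta, timezone

UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def accepted_at(event):
    """Return the gateway acceptance time of an event as an aware UTC datetime."""
    millis = event["strmMeta"]["timestamp"]
    return UNIX_EPOCH + timedelta(milliseconds=millis)


example = {"strmMeta": {"timestamp": 1629192833072}}
print(accepted_at(example).isoformat())  # 2021-08-17T09:33:53.072000+00:00
```
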