PII Field Encryption
STRM Privacy aims to protect PII data by encrypting the content of event fields that are marked as sensitive in the data contract.
Privacy Algorithm
The process of encrypting PII data according to the time-based Privacy Algorithm is shown below.
- An event is sent to the STRM Privacy Event Gateway.
- An HTTP header specifies the reference to the schema that was used to serialize the message. The schema is retrieved from the Data Contracts API and the message can be deserialized. Next, the strmMeta section is extracted from the event data.
- The reference to the Data Contract that should be applied to this event is extracted from strmMeta.
- The Data Contract is retrieved, and the names of the sensitive fields and the name of the keyField are extracted from the Data Contract.
- Get an existing / generate a new keyLink based on the value in the keyField of the event data.
- Encrypt the PII fields using the encryption key.
- 24 hours after the keyLink and encryption key have been generated, the keyLink and the encryption key rotate. This is called the time-based Privacy Algorithm.
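The key lookup and rotation steps above can be sketched as follows. This is a minimal illustration and not the STRM Privacy implementation: the `KeyStore` class, the `KeyEntry` fields, and the use of a random UUID as keyLink are all assumptions made for the example.

```python
import secrets
import uuid
from dataclasses import dataclass

ROTATION_SECONDS = 24 * 60 * 60  # the 24-hour rotation period from the text


@dataclass
class KeyEntry:
    key_link: str          # opaque reference to the key (assumed to be a UUID here)
    encryption_key: bytes  # key used to encrypt the PII fields
    created_at: float      # epoch seconds at which this entry was generated


class KeyStore:
    """Hypothetical in-memory store: keyField value -> current KeyEntry."""

    def __init__(self):
        self._entries = {}

    def get_or_create(self, key_field_value, now):
        """Return the current entry, generating a new one if none exists
        or if the existing one is 24 hours old or older."""
        entry = self._entries.get(key_field_value)
        if entry is None or now - entry.created_at >= ROTATION_SECONDS:
            entry = KeyEntry(str(uuid.uuid4()), secrets.token_bytes(32), now)
            self._entries[key_field_value] = entry
        return entry


store = KeyStore()
first = store.get_or_create("session-1", now=0)
same_day = store.get_or_create("session-1", now=3600)
next_day = store.get_or_create("session-1", now=ROTATION_SECONDS)
```

In this sketch, `first` and `same_day` share a keyLink and encryption key, while `next_day` gets a fresh pair because the 24-hour window has elapsed.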
As can be seen, the keyLink and the keyField are closely related, but different.
Read more about the differences here.
If the time-based Privacy Algorithm does not match your needs, please contact us to discuss other algorithms.
Field encryption
STRM Privacy uses Google Tink as an abstraction library for standard AES-256 encryption with a synthetic initialization vector (SIV). The SIV means that, for a given plaintext value and a given encryption key, the corresponding ciphertext will always be identical.
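The deterministic property can be illustrated with a toy construction that uses only the Python standard library. Note that this is not AES-SIV and not Tink: it swaps in HMAC-SHA256 for the synthetic IV and a SHA-256 based keystream purely to show the idea, and it should not be used to protect real data. The function names and the 64-byte key layout are assumptions of this sketch.

```python
import hashlib
import hmac


def _keystream(enc_key, iv, length):
    """Toy keystream: SHA-256 over key, IV and a block counter."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(enc_key + iv + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def encrypt_deterministic(key, plaintext):
    """The IV is derived from the plaintext itself (the 'synthetic' IV),
    so the same key and plaintext always yield the same ciphertext."""
    mac_key, enc_key = key[:32], key[32:]
    iv = hmac.new(mac_key, plaintext, hashlib.sha256).digest()[:16]
    stream = _keystream(enc_key, iv, len(plaintext))
    return iv + bytes(p ^ s for p, s in zip(plaintext, stream))


def decrypt_deterministic(key, ciphertext):
    """Recover the plaintext and check that the IV matches it."""
    mac_key, enc_key = key[:32], key[32:]
    iv, body = ciphertext[:16], ciphertext[16:]
    stream = _keystream(enc_key, iv, len(body))
    plaintext = bytes(c ^ s for c, s in zip(body, stream))
    expected = hmac.new(mac_key, plaintext, hashlib.sha256).digest()[:16]
    if not hmac.compare_digest(iv, expected):
        raise ValueError("authentication failed")
    return plaintext


key_a = bytes(range(64))  # example 64-byte key: MAC half + encryption half
key_b = bytes(64)         # a different key
ct_1 = encrypt_deterministic(key_a, b"user-1234")
ct_2 = encrypt_deterministic(key_a, b"user-1234")
ct_3 = encrypt_deterministic(key_b, b"user-1234")
```

Here `ct_1` and `ct_2` are byte-for-byte identical, while `ct_3` differs because it was produced under another key. This is what makes it possible to correlate events on an encrypted field for as long as the key stays the same.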
Using the encrypted data
When sending data to STRM Privacy, the PII data fields are encrypted. The resulting data stream is called the encrypted stream, or source stream. By design, this data stream no longer contains any sensitive data. This implies that anyone in your company can use it. If these data become compromised, you have a business issue, but not a privacy issue.
The same credentials that are used for sending data to STRM Privacy can be used to consume the encrypted stream.
Though the sensitive data are encrypted, these data are still useful. With a typical clickstream, where url is not considered personal data, you could identify dead ends on your site, or train recommender engines on the encrypted stream, because the attributes that identify the sequence, even though encrypted, remain the same for 24 hours. This is more than long enough to understand typical customer journeys, without compromising the privacy of your users.
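As a rough illustration of such an analysis, the sketch below groups clickstream events by an encrypted session value and counts the last url of each journey. The event layout, the `sessionId` and `timestamp` field names, and the sample values are hypothetical and not taken from a real data contract.

```python
from collections import Counter, defaultdict


def last_pages(events):
    """Count the final url of each journey, grouping on the encrypted
    session value without ever decrypting it."""
    journeys = defaultdict(list)
    for event in events:
        journeys[event["sessionId"]].append((event["timestamp"], event["url"]))
    exits = Counter()
    for visits in journeys.values():
        _, last_url = max(visits, key=lambda visit: visit[0])
        exits[last_url] += 1
    return exits


# Sample events: sessionId holds ciphertext, url is plain text.
events = [
    {"sessionId": "enc-a", "timestamp": 1, "url": "/home"},
    {"sessionId": "enc-b", "timestamp": 1, "url": "/home"},
    {"sessionId": "enc-a", "timestamp": 2, "url": "/pricing"},
    {"sessionId": "enc-c", "timestamp": 1, "url": "/home"},
    {"sessionId": "enc-b", "timestamp": 3, "url": "/pricing"},
    {"sessionId": "enc-c", "timestamp": 2, "url": "/checkout"},
]
exits = last_pages(events)
```

Because the ciphertext for a given session stays stable within the rotation window, grouping on it reconstructs each journey; pages that show up often in `exits` are candidates for dead ends.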
Using the decrypted data
Identify the data purposes you need
Ask the Data Protection Officer which specific data purposes your use case requires, or is allowed to use. Data for purposes that do not apply to your use case will not be decrypted when creating a privacy stream.
Create a privacy stream
Here you instruct STRM Privacy to create a derived stream where event data corresponding to the requested purposes is decrypted. This only happens for events where (data subject) consent is granted for these purposes. As such, STRM Privacy will:
- Exclude all events that have not been allowed to be used for the requested purposes.
- Decrypt event data (fields/attributes) filed under the purposes you requested (in the event's data contract). Attributes corresponding to other purposes will not be decrypted.
This means that data consumers will only receive the data they are (legally) allowed to process.
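The filtering and selective decryption described above might look like the sketch below. It is an illustration under stated assumptions, not the STRM Privacy implementation: the `consent` field, the `field_purposes` mapping, the rule that an event must carry consent for every requested purpose, and the `fake_decrypt` stand-in are all hypothetical.

```python
def derive_privacy_stream(events, field_purposes, requested, decrypt):
    """Yield events whose consent covers all requested purposes, with only
    the fields filed under those purposes decrypted."""
    requested = set(requested)
    for event in events:
        if not requested <= set(event["consent"]):
            continue  # consent does not cover the requested purposes
        derived = dict(event)
        for field, purpose in field_purposes.items():
            if purpose in requested and field in derived:
                derived[field] = decrypt(derived[field])
        yield derived


def fake_decrypt(value):
    """Stand-in for real decryption: strips an 'enc:' marker."""
    return value[len("enc:"):]


field_purposes = {"email": 1, "phone": 2}  # field -> purpose it is filed under
events = [
    {"consent": [1, 2], "email": "enc:a@example.com", "phone": "enc:555", "url": "/home"},
    {"consent": [1], "email": "enc:b@example.com", "phone": "enc:777", "url": "/home"},
]
stream = list(derive_privacy_stream(events, field_purposes, {2}, fake_decrypt))
```

With purpose 2 requested, only the first event ends up in `stream`: its `phone` field is decrypted, while `email`, which is filed under purpose 1, stays encrypted.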
For more info about creating (privacy) streams, see our streams quickstart.