Advanced Configuration
Using a values.yaml
We recommend using a values.yaml file over the --set option, as this keeps secrets out of your terminal history and makes the installation easier to repeat. If you plan to use a values.yaml instead of the inline Helm values:
- Make sure to set license.installationType to AWS_MARKETPLACE.
- The registry.imagePullSecret can be omitted or left blank, as this is handled by your AWS Marketplace deployment.
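As a minimal sketch, the two settings above could look like this in a values.yaml. Only the two keys named in the list come from this guide; the nesting is inferred from the dotted key paths, and any other values your deployment needs are left out here.

```yaml
# values.yaml (sketch; only the keys mentioned above are shown)
license:
  # Required when installing through AWS Marketplace
  installationType: AWS_MARKETPLACE
registry:
  # May be omitted or left blank for AWS Marketplace deployments
  imagePullSecret: ""
```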
Run the following command to install the Helm chart (ensure that your working directory is awsmp-chart, as also shown in the --set example above):
helm install strmprivacy --namespace strmprivacy --create-namespace ./* --values values.yaml
After these steps, you should have a namespace strmprivacy with, by default, all components enabled. If you want a different setup, edit the values.yaml to match your needs.
Using managed prerequisites for the Data Plane
The STRM Privacy Data Plane depends on Kafka, Redis and/or a PostgreSQL database. Use the embedded instances of these prerequisites only to deploy your initial version. For production purposes, we recommend using managed instances.
Purpose of using managed instances
The subcharts for Kafka, Redis and PostgreSQL that are included in the STRM Privacy Data Plane Helm chart are not configured for production use. Furthermore, not all Kubernetes clusters fulfill the prerequisites for running them (e.g. support for persistent storage). The more convenient route is to use managed instances of the prerequisites for your Data Plane. The following sections discuss how to set up these managed instances.
AWS RDS for PostgreSQL
To be able to run Batch Jobs, a PostgreSQL database is required. Follow the steps from the AWS RDS for PostgreSQL guide to set up a PostgreSQL database for your STRM Privacy Data Plane. Make sure to implement the best practices for backing up and restoring data at any point in time, as described here. General remarks regarding the database:
- Create a separate user (following the principle of least privilege) with read and write access rights to the database. Set the credentials for the PostgreSQL user in the values.yaml.
- Data usage will increase over time, so it is wise to enable AWS RDS Storage Autoscaling to avoid manual interventions.
AWS MSK for Apache Kafka
To be able to run any streaming tasks, a Kafka (or Kafka API compatible) cluster is required. Follow the steps from the AWS MSK for Apache Kafka guide to set up a managed Kafka cluster in your AWS account. Take note of the private bootstrap servers (and possibly credentials) and set these values in the values.yaml.
AWS ElastiCache for Redis
To be able to run any streaming tasks, a Redis (or Redis API compatible) deployment is required. Follow the steps from the AWS ElastiCache for Redis guide to set up a managed Redis deployment. Take note of the endpoint (and possibly credentials) and set these values in the values.yaml.
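The exact keys for pointing the chart at managed instances are not listed in this section, so the sketch below uses hypothetical key names (postgresql, kafka, redis, enabled, external and everything beneath them) and placeholder hostnames, purely to illustrate the shape such a configuration might take. Check the values.yaml shipped with the chart for the real key names before using any of this.

```yaml
# HYPOTHETICAL key names and placeholder endpoints throughout.
# Consult the chart's own values.yaml for the actual structure.
postgresql:
  enabled: false                        # assumed switch for the embedded subchart
  external:
    host: "REPLACE_WITH_RDS_ENDPOINT"
    port: 5432                          # default PostgreSQL port
    username: "REPLACE_WITH_DB_USER"    # the least-privilege user created above
    password: "REPLACE_WITH_DB_PASSWORD"
kafka:
  enabled: false                        # assumed switch for the embedded subchart
  external:
    bootstrapServers: "REPLACE_WITH_MSK_PRIVATE_BOOTSTRAP_SERVERS"
redis:
  enabled: false                        # assumed switch for the embedded subchart
  external:
    host: "REPLACE_WITH_ELASTICACHE_ENDPOINT"
    port: 6379                          # default Redis port
```

Keeping these values in a values.yaml rather than passing them with --set keeps the credentials out of your terminal history, in line with the recommendation at the top of this page.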
Routing traffic
By default, the Helm chart creates ClusterIP Kubernetes services to route traffic within the cluster. If you need to route traffic from outside the cluster to one of the STRM Privacy applications, set services.loadbalancer.enabled to true to create a LoadBalancer Kubernetes service.
Arbitrary annotations can be added with services.loadbalancer.annotations, which allows you to configure the Network Load Balancer to fit your needs (view all annotations here):
services:
  loadbalancer:
    enabled: true
    annotations:
      service.beta.kubernetes.io/aws-load-balancer-scheme: "internet-facing"
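If the service should only be reachable from inside your VPC, the same annotation can be set to internal instead of internet-facing. The following is a sketch, assuming the load balancer in your cluster is provisioned by the AWS Load Balancer Controller; the keys are the ones used in the example above and only the annotation value differs.

```yaml
services:
  loadbalancer:
    enabled: true
    annotations:
      # Provision the load balancer with private addresses only (reachable within the VPC)
      service.beta.kubernetes.io/aws-load-balancer-scheme: "internal"
```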