Replicating Data
Some constraints are impossible to write without access to more state than just the object under test. For example, it is impossible to know if a label is unique across all pods and namespaces unless a ConstraintTemplate has access to all other pods and namespaces. To enable this use case, we provide syncing of data into a data client.
Replicating Data with SyncSets (Recommended)
Feature State: Gatekeeper version v3.15+ (alpha)
Kubernetes data can be replicated into the data client using SyncSet resources. Below is an example of a SyncSet:
apiVersion: syncset.gatekeeper.sh/v1alpha1
kind: SyncSet
metadata:
  name: syncset-1
spec:
  gvks:
    - group: ""
      version: "v1"
      kind: "Namespace"
    - group: ""
      version: "v1"
      kind: "Pod"
The resources defined in the gvks field of a SyncSet will eventually be synced into the data client.
Working with SyncSet resources
- Updating a SyncSet's gvks field should dynamically update what objects are synced.
- Multiple SyncSets may be defined and those will be reconciled by the Gatekeeper syncset-controller. Notably, the set union of all SyncSet resources' gvks and the Config resource's syncOnly will be synced into the data client.
- A resource will continue to be present in the data client so long as a SyncSet or Config still specifies it under the gvks or syncOnly field.
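The union rule above can be illustrated with a minimal Python sketch. It only models the semantics described here and is not Gatekeeper's implementation; the GVK type and the synced_gvks function are hypothetical names introduced for this example.

```python
from __future__ import annotations

from typing import NamedTuple


class GVK(NamedTuple):
    """Hypothetical stand-in for one group/version/kind entry."""
    group: str
    version: str
    kind: str


def synced_gvks(syncset_gvks: list[list[GVK]], config_sync_only: list[GVK]) -> set[GVK]:
    """Return the set union of every SyncSet's gvks and the Config's syncOnly."""
    result = set(config_sync_only)
    for gvks in syncset_gvks:
        result.update(gvks)
    return result


namespace = GVK("", "v1", "Namespace")
pod = GVK("", "v1", "Pod")

# One SyncSet lists Namespace and Pod; the Config lists Pod only.
with_syncset = synced_gvks([[namespace, pod]], [pod])

# After the SyncSet is deleted, Pod stays synced because the Config still lists it.
without_syncset = synced_gvks([], [pod])
```

In this model, Namespace drops out of the synced set once no SyncSet or Config references it, while Pod remains.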
Replicating Data with Config
Feature State: Gatekeeper version v3.6+ (alpha)
The "Config" resource must be named config for it to be reconciled by Gatekeeper. Gatekeeper will ignore the resource if you do not name it config.
Kubernetes data can also be replicated into the data client via the Config resource. Resources defined in syncOnly will be synced into OPA. Below is an example:
apiVersion: config.gatekeeper.sh/v1alpha1
kind: Config
metadata:
  name: config
  namespace: "gatekeeper-system"
spec:
  sync:
    syncOnly:
      - group: ""
        version: "v1"
        kind: "Namespace"
      - group: ""
        version: "v1"
        kind: "Pod"
You can install this config with the following command:
kubectl apply -f https://raw.githubusercontent.com/open-policy-agent/gatekeeper/master/demo/basic/sync.yaml
Working with Config resources
- Updating a Config's syncOnly field should dynamically update what objects are synced.
- The Config resource is meant to be a singleton. The set union of all SyncSet resources' gvks and the Config resource's syncOnly will be synced into the data client.
- A resource will continue to be present in the data client so long as a SyncSet or Config still specifies it under the gvks or syncOnly field.
Accessing replicated data
Once data is synced, ConstraintTemplates can access the cached data under the data.inventory document.
The data.inventory document has the following format:
- For cluster-scoped objects: data.inventory.cluster[<groupVersion>][<kind>][<name>]
  - Example referencing the Gatekeeper namespace: data.inventory.cluster["v1"].Namespace["gatekeeper"]
- For namespace-scoped objects: data.inventory.namespace[<namespace>][<groupVersion>][<kind>][<name>]
  - Example referencing the Gatekeeper pod: data.inventory.namespace["gatekeeper"]["v1"]["Pod"]["gatekeeper-controller-manager-d4c98b788-j7d92"]
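To make the two path shapes concrete, here is a small Python sketch that models data.inventory as a nested dictionary. In Gatekeeper the document is read from Rego inside a ConstraintTemplate, so this is only an illustration of the layout. The helper names, the "default" namespace entry, the "web-1" pod, and the "app" label are all assumptions made for the example.

```python
def cluster_object(inventory, group_version, kind, name):
    """Look up a cluster-scoped object: cluster[<groupVersion>][<kind>][<name>]."""
    return inventory["cluster"][group_version][kind][name]


def namespaced_object(inventory, namespace, group_version, kind, name):
    """Look up a namespaced object: namespace[<namespace>][<groupVersion>][<kind>][<name>]."""
    return inventory["namespace"][namespace][group_version][kind][name]


def pods_with_label(inventory, key, value):
    """Return (namespace, pod name) pairs for every synced pod carrying key=value."""
    matches = []
    for ns, by_group_version in inventory.get("namespace", {}).items():
        pods = by_group_version.get("v1", {}).get("Pod", {})
        for name, pod in pods.items():
            labels = pod.get("metadata", {}).get("labels", {})
            if labels.get(key) == value:
                matches.append((ns, name))
    return sorted(matches)


GATEKEEPER_POD = "gatekeeper-controller-manager-d4c98b788-j7d92"

# Hypothetical, heavily truncated objects; real entries hold the full resource.
inventory = {
    "cluster": {
        "v1": {"Namespace": {"gatekeeper": {"metadata": {"name": "gatekeeper"}}}},
    },
    "namespace": {
        "gatekeeper": {
            "v1": {"Pod": {GATEKEEPER_POD: {"metadata": {"labels": {"app": "gatekeeper"}}}}},
        },
        "default": {
            "v1": {"Pod": {"web-1": {"metadata": {"labels": {"app": "gatekeeper"}}}}},
        },
    },
}

duplicates = pods_with_label(inventory, "app", "gatekeeper")
```

A cross-namespace check such as pods_with_label only sees pods that have actually been synced, which is why a label-uniqueness constraint needs both Namespace and Pod replicated.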
Auditing From Cache
The audit feature does not require replication by default. However, when the audit-from-cache flag is set to true, the audit informer cache will be used as the source-of-truth for audit queries; thus, an object must first be cached before it can be audited for constraint violations. Kubernetes data can be replicated into the audit cache via one of the resources above.
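The effect of the flag can be pictured with a short Python sketch. It is a simplified model of the rule stated above, not Gatekeeper's audit code: is_auditable is a hypothetical helper, and the sketch ignores timing (an object of a synced kind still has to reach the cache before an audit can see it).

```python
def is_auditable(gvk, synced_gvks, audit_from_cache):
    """Model whether objects of a given (group, version, kind) can be audited."""
    if not audit_from_cache:
        # Default behavior: audit does not depend on replication.
        return True
    # With audit-from-cache enabled, only replicated kinds are visible to audit.
    return gvk in synced_gvks


# Kinds listed in a SyncSet's gvks or the Config's syncOnly (as in the examples above).
synced = {("", "v1", "Namespace"), ("", "v1", "Pod")}
deployment = ("apps", "v1", "Deployment")

default_mode = is_auditable(deployment, synced, audit_from_cache=False)
cache_mode = is_auditable(deployment, synced, audit_from_cache=True)
```

Under this model, a Deployment would need to be added to a SyncSet or to the Config before it could be audited with audit-from-cache enabled.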