Skip to main content
Version: v3.16.x

Metrics & Observability

Observability

This section covers how to gather more detailed statistics about Gatekeeper's query performance. This can be helpful in diagnosing situations such as identifying a constraint template with a long execution time. Statistics are written to Gatekeeper's stdout logs.

Logging Constraint Execution Stats

  • set --log-stats-audit. This flag enables logging the stats for the audit process.

  • set --log-stats-admission. This flag enables logging the stats for the admission review process.

Example Log Line

To see how long it takes to review a constraint kind at admission time, enable the --log-stats-admission flag and watch the logs for a constraint kind K8sRequiredLabels, for example:

{
"level": "info",
"ts": 1683692576.9093642,
"logger": "webhook",
"msg": "admission review request stats",
"hookType": "validation",
"process": "admission",
"event_type": "review_response_stats",
"resource_group": "",
"resource_api_version": "v1",
"resource_kind": "Namespace",
"resource_namespace": "",
"request_username": "kubernetes-admin",
"execution_stats": [
{
"scope": "template",
"statsFor": "K8sRequiredLabels",
"stats": [
{
"name": "templateRunTimeNS",
"value": 762561,
"source": {
"type": "engine",
"value": "Rego"
},
"description": "the number of nanoseconds it took to evaluate all constraints for a template"
},
{
"name": "constraintCount",
"value": 1,
"source": {
"type": "engine",
"value": "Rego"
},
"description": "the number of constraints that were evaluated for the given constraint kind"
}
],
"labels": [
{
"name": "TracingEnabled",
"value": false
},
{
"name": "PrintEnabled",
"value": false
},
{
"name": "target",
"value": "admission.k8s.gatekeeper.sh"
}
]
}
]
}

In the excerpt above, notice templateRunTimeNS and constraintCount. The former indicates the time it takes to evaluate the number of constraints of kind K8sRequiredLabels, while the latter surfaces how many such constraints were evaluated for this template. Labels provide additional information about the execution environemnt setup, like whether tracing was enabled (TraceEnabled).

Caveats

The additional log volume from enabling the stats logging can be quite high.

Metrics

If you are using a Prometheus client library, for counter metrics, the _total suffix is recommended and sometimes automatically appended by client libraries to indicate that the metric represents a cumulative total.

Below are the list of metrics provided by Gatekeeper:

Constraint

  • Name: gatekeeper_constraints

    Description: Current number of known constraints

    Tags:

    • enforcement_action: [deny, dryrun, warn]

    • status: [active, error]

      Aggregation: LastValue

Constraint Template

  • Name: gatekeeper_constraint_templates

    Description: Number of observed constraint templates

    Tags:

    • status: [active, error]

      Aggregation: LastValue

  • Name: gatekeeper_constraint_template_ingestion_count

    Description: Total number of constraint template ingestion actions

    Tags:

    • status: [active, error]

      Aggregation: Count

  • Name: gatekeeper_constraint_template_ingestion_duration_seconds

    Description: Distribution of how long it took to ingest a constraint template in seconds

    Tags:

    • status: [active, error]

      Aggregation: Distribution

Expansion Template

  • Name: gatekeeper_expansion_templates

    Description: Number of observed expansion templates

    Tags:

    • status: [active, error]

    Aggregation: LastValue

Webhook

  • Name: gatekeeper_validation_request_count

    Description: The number of requests that are routed to validation webhook

    Tags:

    • admission_status: [allow, deny]

    • admission_dryrun: [true, false]

      Aggregation: Count

  • Name: gatekeeper_validation_request_duration_seconds

    Description: The validation webhook response time in seconds

    Tags:

    • admission_status: [allow, deny]

      Aggregation: Distribution

  • Name: gatekeeper_mutation_request_count

    Description: The number of requests that are routed to mutation webhook

    Tags:

    • admission_status: [allow, deny]

      Aggregation: Count

  • Name: gatekeeper_mutation_request_duration_seconds

    Description: The mutation webhook response time in seconds

    Tags:

    • admission_status: [allow, deny]

      Aggregation: Distribution

Audit

  • Name: gatekeeper_violations

    Description: Total number of audited violations

    Tags:

    • enforcement_action: [deny, dryrun, warn]

      Aggregation: LastValue

  • Name: gatekeeper_audit_duration_seconds

    Description: Latency of audit operation in seconds

    Aggregation: Distribution

  • Name: gatekeeper_audit_last_run_time

    Description: Timestamp of last audit run starting time

    Aggregation: LastValue

  • Name: gatekeeper_audit_last_run_end_time

    Description: Timestamp of last audit run ending time

    Aggregation: LastValue

Mutation

  • Name: gatekeeper_mutator_ingestion_count

    Description: Total number of Mutator ingestion actions

    Tags:

    • status: [active, error]

      Aggregation: Count

  • Name: gatekeeper_mutator_ingestion_duration_seconds

    Description: The distribution of Mutator ingestion durations

    Tags:

    • status: [active, error]

      Aggregation: Distribution

  • Name: gatekeeper_mutators

    Description: The current number of Mutator objects

    Tags:

    • status: [active, error]

      Aggregation: Count

  • Name: gatekeeper_mutator_conflicting_count

    Description: The current number of conflicting Mutator objects

    Tags:

    • status: [active, error]

      Aggregation: Count

Sync

  • Name: gatekeeper_sync

    Description: Total number of resources of each kind being cached

    Tags:

    • status: [active, error]

    • kind (examples, pod, namespace, ...)

      Aggregation: LastValue

  • Name: gatekeeper_sync_duration_seconds

    Description: Latency of sync operation in seconds

    Aggregation: Distribution

  • Name: gatekeeper_sync_last_run_time

    Description: Timestamp of last sync operation

    Aggregation: LastValue

Watch

  • Name: gatekeeper_watch_manager_watched_gvk

    Description: Total number of watched GroupVersionKinds

    Aggregation: LastValue

  • Name: gatekeeper_watch_manager_intended_watch_gvk

    Description: Total number of GroupVersionKinds with a registered watch intent

    Aggregation: LastValue