Monitoring Kubernetes Audit Logs

Steve Waterworth
September 5, 2025

Kubernetes audit logs record requests made to the Kubernetes API. The API server processes every change to the Kubernetes state and acts as a gatekeeper to the internal database, etcd, where that state is recorded. Because all changes to the cluster go through the API server, it is the ideal place to keep an eye on what’s happening. Requests to the API server can come from many sources:

  • The control plane (built-in controllers, the scheduler)
  • Node daemons (kubelet, kube-proxy, etc.)
  • Cluster services (autoscaler, kube-state-metrics, CoreDNS, etc.)
  • Users making kubectl requests
  • Applications, controllers and operators via the Kubernetes client libraries (GitOps)
  • The API server itself

Audit logs record all of this activity, detailing which user or service made a request to a cluster resource and why the access was allowed or denied. Audit logs are structured logs in JSON format; each entry carries a comprehensive set of metadata, with key attributes such as the user, the verb, the target resource and the response status.

Kubernetes audit logs are not enabled by default. To enable them, write a Policy that defines the audit level for Kubernetes resources, then start the API server with the policy file and the location for the log:

$ kube-apiserver \
    --audit-log-path=/var/log/kubernetes/apiserver/audit.log \
    --audit-policy-file=/etc/kubernetes/audit-policies/policy.yaml
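
On a kubeadm-provisioned cluster the API server runs as a static Pod, so the flags are added to its manifest rather than typed on a command line. Below is a minimal sketch, assuming the default manifest location and the policy and log paths used above; the policy file and log directory also need to be mounted into the container with hostPath volumes:

# Excerpt from /etc/kubernetes/manifests/kube-apiserver.yaml (paths are assumptions)
spec:
  containers:
  - name: kube-apiserver
    command:
    - kube-apiserver
    # ... existing flags ...
    - --audit-policy-file=/etc/kubernetes/audit-policies/policy.yaml
    - --audit-log-path=/var/log/kubernetes/apiserver/audit.log
    volumeMounts:
    - name: audit-policy
      mountPath: /etc/kubernetes/audit-policies
      readOnly: true
    - name: audit-log
      mountPath: /var/log/kubernetes/apiserver
  volumes:
  - name: audit-policy
    hostPath:
      path: /etc/kubernetes/audit-policies
      type: DirectoryOrCreate
  - name: audit-log
    hostPath:
      path: /var/log/kubernetes/apiserver
      type: DirectoryOrCreate

The kubelet watches the manifest directory and restarts the API server automatically when the file changes.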

Here is a simple policy to get started with, which records metadata for every request:

apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  - level: Metadata
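
Audit policies are evaluated top to bottom and the first matching rule wins, so per-resource rules and stage filters can cut the volume considerably before anything is written. Here is a slightly fuller sketch, in which the specific rule choices are illustrative rather than a recommendation:

apiVersion: audit.k8s.io/v1
kind: Policy
# Skip the RequestReceived stage so each request is not logged twice
omitStages:
  - "RequestReceived"
rules:
  # Ignore health and version probes
  - level: None
    nonResourceURLs:
      - "/healthz*"
      - "/readyz*"
      - "/livez*"
      - "/version"
  # Record access to Secrets and ConfigMaps, but never their contents
  - level: Metadata
    resources:
      - group: ""
        resources: ["secrets", "configmaps"]
  # Log the request body for write operations
  - level: Request
    verbs: ["create", "update", "patch", "delete"]
  # Everything else at Metadata
  - level: Metadata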

Use your regular log shipper to send the audit log to Grepr and then, after processing, on to your log aggregation and storage backend (Datadog, Splunk, New Relic).
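
For example, if the shipper is the Datadog Agent (as in the example event below), a custom log configuration tailing the audit file might look like the following sketch; the configuration file name and the Grepr intake endpoint are assumptions to adapt to your own setup:

# /etc/datadog-agent/conf.d/audit_log.d/conf.yaml (file name is an assumption)
logs:
  - type: file
    path: /var/log/kubernetes/apiserver/audit.log
    service: audit-log
    source: audit-log

# In datadog.yaml, point the Agent's log intake at the Grepr pipeline
# endpoint rather than at Datadog directly (URL is a placeholder):
logs_config:
  logs_dd_url: "<your-grepr-log-intake-endpoint>:443"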

Reasons For Monitoring K8s Audit Logs

Once enabled, audit logs are only kept locally on the API server host and are subject to rotation, and the cluster Events that Kubernetes itself stores in etcd are retained for just one hour by default. Shipping audit logs to longer-term storage makes it possible to go further back in time to answer questions such as: “Why was that Pod evicted?” or “What lifecycle operations were performed when a Deployment was updated yesterday?”

Most observability platforms can set up alerts to notify you of unexpected activity, such as a sudden increase in authorisation failures. From Kubernetes version 1.11, the audit logs include annotations recording the authorisation decision (allow or forbid) and the reason for it. Here is an example audit log entry for a user performing a “kubectl get ns” command:

{
    "id": "AwAAAZjnD23mDcjQxwAAABhBWmpuRDVNa0FBQ1c4OHpkcVJDMXFnQXEAAAAkZjE5OGU3MGYtN2Q4OS00ZjVlLTg1MjUtMWU4MjgwMmYzNWI3AAAAFg",
    "content": {
        "timestamp": "2025-08-26T15:46:44.838Z",
        "tags": [
            "grepr.messagetype:passthrough",
            "dirname:/tmp/kind",
            "filename:audit.log",
            "processor:grepr",
            "pipeline:datadog_eu",
            "source:audit-log",
            "datadog.submission_auth:api_key"
        ],
        "host": "tuxtop",
        "service": "audit-log",
        "attributes": {
            "requestReceivedTimestamp": "2025-08-26T15:46:44.316187Z",
            "auditID": "89ba5fe7-bf05-44fc-8ef5-f5bc0abe045c",
            "objectRef": {
                "apiVersion": "v1",
                "resource": "namespaces"
            },
            "level": "Metadata",
            "kind": "Event",
            "verb": "list",
            "annotations": {
                "authorization": {
                    "k8s": {
                        "io/reason": "RBAC: allowed by ClusterRoleBinding \"kubeadm:cluster-admins\" of ClusterRole \"cluster-admin\" to Group \"kubeadm:cluster-admins\"",
                        "io/decision": "allow"
                    }
                }
            },
            "userAgent": "kubectl/v1.33.4 (linux/amd64) kubernetes/74cdb42",
            "requestURI": "/api/v1/namespaces?limit=500",
            "responseStatus": {
                "code": 200
            },
            "stageTimestamp": "2025-08-26T15:46:44.319034Z",
            "hostname": "tuxtop",
            "sourceIPs": [
                "172.18.0.1"
            ],
            "apiVersion": "audit.k8s.io/v1",
            "stage": "ResponseComplete",
            "service": "audit-log",
            "grepr": {
                "receivedTimestamp": 1756223209273
            },
            "user": {
                "extra": {
                    "authentication": {
                        "kubernetes": {
                            "io/credential-id": [
                                "X509SHA256=0bb5e1ebe3fa2223374546e66c0526ac8a791d593427accb0f10b38a47c07af7"
                            ]
                        }
                    }
                }
            },
            "groups": [
                "kubeadm:cluster-admins",
                "system:authenticated"
            ],
            "username": "kubernetes-admin"
        },
        "status": "info",
        "timestamp": 1756223204838
    }
}

Other alert scenarios include a sudden change in log volume. An increase in the number of calls to the API server could indicate a security threat or a configuration issue, while a decrease could indicate performance problems, with the API server failing to keep up with the request load.

The vast majority of calls to the API are routine and can easily mask the more interesting messages. Fortunately, the way Grepr processes log messages means the noisy routine messages are sent on as summaries while the important, infrequent messages pass straight through. The outcome is a significant improvement in the signal-to-noise ratio for Kubernetes administrators working with the audit log.

Get The Most Out Of Audit Logs

With modern containerised applications relying on Kubernetes to orchestrate their many components and interactions, any further insight into its operation can only be a good thing. Using the Grepr Intelligent Observability Engine lets you maximise the benefit of enabling audit logs on Kubernetes without the penalty of dramatically increasing the cost of your observability platform.
