Cloudflare is a content delivery network (CDN) that enables organisations to accelerate, protect and optimise their websites, applications and APIs. It acts as a reverse proxy between the internet and a website's origin servers, providing DDoS protection, caching and load balancing. Organisations use Cloudflare to improve website performance, reduce egress costs and enhance security.
A Cloudflare Log
Cloudflare logs are structured logs in JSON format. Here are some of the key fields that provide useful information.
- EdgeStartTimestamp - When Cloudflare received the request
- EdgeEndTimestamp - When Cloudflare finished processing the request
- ClientRequestQuery - The request parameters
- EdgeResponseStatus - HTTP status code: 200, 404, 500, etc.
- CacheStatus - Whether the request was served from cache: HIT, MISS, EXPIRED, BYPASS
- OriginIP - The IP address of the webserver
- OriginTLSVersion - TLS version of the connection to the origin server
- OriginResponseDurationMs - Response time in milliseconds of the origin server
- WAFAction - Web Application Firewall action: ALLOW, BLOCK, CHALLENGE
- BotScore - Bot detection score indicating how likely the request came from a bot
- ThreatScore - Threat intelligence score
- ASN - Autonomous System Number of the client IP
- ClientSSLProtocol - SSL/TLS version of the client
- ClientCipher - Cipher suite used by the client
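As an illustration, the fields above can be pulled out of a raw log record with a few lines of Python. The record below is fabricated for the example, and the timestamp unit (nanoseconds) reflects one of Cloudflare's configurable timestamp formats:

```python
import json

# A fabricated Cloudflare log record, truncated to the fields discussed above.
raw = '''{"EdgeStartTimestamp": 1700000000000000000,
          "EdgeEndTimestamp": 1700000000250000000,
          "EdgeResponseStatus": 200,
          "CacheStatus": "HIT",
          "OriginIP": "203.0.113.10",
          "OriginResponseDurationMs": 42,
          "BotScore": 87,
          "ClientSSLProtocol": "TLSv1.3"}'''

record = json.loads(raw)

# Assuming nanosecond epoch timestamps; convert the edge processing time to ms.
processing_ms = (record["EdgeEndTimestamp"] - record["EdgeStartTimestamp"]) / 1_000_000

print(processing_ms)          # 250.0
print(record["CacheStatus"])  # HIT
```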
Acquiring Cloudflare Logs
Cloudflare can be configured to forward logs via a number of different methods. Splunk HEC (HTTP Event Collector) is straightforward to set up on both Cloudflare and Grepr, and works even when the ultimate destination for the logs is not Splunk:
- Create a HEC endpoint in Grepr by adding a Splunk integration.
- If the logs should ultimately land somewhere other than Splunk (Datadog, New Relic), also create an integration for that platform. If Splunk is the destination, the integration from the first step already covers it.
- Configure Cloudflare Logpush to send to the Grepr endpoint.
- Create a Grepr pipeline to process the logs, using the Splunk HEC as the source and the ultimate destination as the sink (removing Splunk from the sink if it is not the destination).
The logs will now start flowing into Grepr.
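For the Cloudflare side, a Logpush job declares the dataset and a destination string. The sketch below builds a plausible job payload for a Splunk-style HEC destination; the host and token are placeholders, and the exact `destination_conf` parameters should be checked against the Cloudflare Logpush documentation:

```python
import json

# Sketch of a Cloudflare Logpush job payload pointing at a Grepr-hosted HEC
# endpoint. <your-grepr-hec-host> and <hec-token> are placeholders, not real
# values; verify the destination_conf format against Cloudflare's docs.
job = {
    "name": "grepr-http-requests",
    "dataset": "http_requests",
    "destination_conf": (
        "splunk://<your-grepr-hec-host>/services/collector/raw"
        "?header_Authorization=Splunk%20<hec-token>&insecure-skip-verify=false"
    ),
    "enabled": True,
}

payload = json.dumps(job)  # body for the Logpush job-creation API call
```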
Using Cloudflare Logs To Manage Cost
Cloudflare logs provide great detail on the traffic flowing through, which can inform configuration tweaks that optimise costs by reducing unnecessary bandwidth, compute and log storage. To find the signal in the noise, consider the following fields and how they can inform decision making.
Typically your cloud platform charges for egress: the more data you send, the more it costs. EdgeResponseBytes records the size of the response served by Cloudflare's edge; cross-referencing it with OriginIP shows which services are generating the most traffic, and the engineers may be able to make some optimisations there. Another source of high egress is cache misses (CacheStatus: MISS), where the payload was served by the origin rather than from Cloudflare's cache. Check that Cache Everything is enabled for static assets such as CSS, JavaScript and images. Additionally, check the Cache-Control headers set by the origin; for example, max-age=31536000 would cache an asset for a year.
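The cross-reference described above amounts to a group-by: sum bytes per OriginIP, restricted to cache misses. A minimal sketch over already-parsed records (the sample data is fabricated):

```python
from collections import defaultdict

# Fabricated, already-parsed Cloudflare log records.
records = [
    {"OriginIP": "203.0.113.10", "CacheStatus": "MISS", "EdgeResponseBytes": 500_000},
    {"OriginIP": "203.0.113.10", "CacheStatus": "HIT",  "EdgeResponseBytes": 500_000},
    {"OriginIP": "203.0.113.20", "CacheStatus": "MISS", "EdgeResponseBytes": 1_200_000},
]

# Bytes served on cache misses approximate per-service origin egress,
# since only misses are fetched from the origin.
miss_bytes = defaultdict(int)
for r in records:
    if r["CacheStatus"] == "MISS":
        miss_bytes[r["OriginIP"]] += r["EdgeResponseBytes"]

# Rank services by origin egress, worst first.
for ip, total in sorted(miss_bytes.items(), key=lambda kv: -kv[1]):
    print(ip, total)
```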
Using Grepr To Stop Log Cost Escalation
Sending copious logs from Cloudflare to your log aggregation and storage platform (Splunk, Datadog, New Relic) could have a serious impact on your next billing cycle. However, Grepr reduces log volume by 90% while retaining 100% of the insight. Repetitive messages are aggregated into summaries, while unique messages pass straight through. This allows you to collect and extract intelligence from varied log sources without being overly concerned about the cost implications.