Unstructured logs are a tax on every engineering team that touches them. They are harder to parse, harder to search, harder to alert on, and more expensive to store. Every time a developer writes logger.info(f"User {user_id} placed order {order_id}") they create a string that requires regex extraction to be useful downstream.
Structured logging fixes this by emitting logs as key-value pairs, typically in JSON format. The data is machine-readable from the moment it leaves the application. Parsers, pipelines, and query engines all benefit.
This post covers the practices that matter most for production systems in 2026, with concrete examples in Python and patterns that apply across languages.
Use JSON as Your Log Format
JSON is the standard. Every major log shipper (Fluent Bit, Vector, OpenTelemetry Collector), every observability platform (Datadog, Splunk, Grafana), and every object storage query engine (Athena, DuckDB, Trino) handles JSON natively.
In Python, use the structlog library or python-json-logger instead of the default logging module's text formatter.
import structlog

structlog.configure(
    processors=[
        structlog.processors.TimeStamper(fmt="iso"),  # ISO 8601 timestamps
        structlog.processors.JSONRenderer(),          # one JSON object per line
    ]
)

logger = structlog.get_logger()
logger.info("order_placed", user_id="u-9823", order_id="ord-4411", total=129.99)
This produces:
{"event": "order_placed", "user_id": "u-9823", "order_id": "ord-4411", "total": 129.99, "timestamp": "2026-03-17T14:22:08Z"}

Every field is immediately queryable. No parsing rules needed downstream.
Include Correlation IDs in Every Log Line
Distributed systems generate logs across dozens of services for a single user request. Without a shared identifier, tracing a request from ingress to database to response is manual detective work.
Assign a correlation ID (also called a request ID or trace ID) at the entry point of every request. Propagate it through every service call. Include it in every log line.
import uuid

import structlog

def middleware(request, call_next):
    # Reuse the caller's ID if one was sent; otherwise mint a new one.
    correlation_id = request.headers.get("X-Correlation-ID", str(uuid.uuid4()))
    # Bind it to the logging context for the duration of this request.
    structlog.contextvars.bind_contextvars(correlation_id=correlation_id)
    return call_next(request)
Now every log line emitted during that request carries the same correlation_id. When something breaks, you search for that single value and get the complete timeline across all services.
If you already use OpenTelemetry, the trace ID serves this purpose. Attach it to your logger context so it appears in every log line alongside the span ID.
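The same attach-once, log-everywhere pattern can be sketched with only the standard library, independent of any framework. This is a minimal illustration, not the structlog or OpenTelemetry API: the `trace_id` field name and the sample ID value are assumptions.

```python
import contextvars
import logging

# Context variable holding the current trace ID (hypothetical field name).
trace_id_var = contextvars.ContextVar("trace_id", default="-")

class TraceIdFilter(logging.Filter):
    """Copy the context's trace ID onto every log record."""
    def filter(self, record):
        record.trace_id = trace_id_var.get()
        return True

handler = logging.StreamHandler()
handler.setFormatter(
    logging.Formatter('{"trace_id": "%(trace_id)s", "event": "%(message)s"}')
)
handler.addFilter(TraceIdFilter())

logger = logging.getLogger("app")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

# At request entry, bind the ID once; every subsequent line carries it.
trace_id_var.set("4bf92f3577b34da6a3ce929d0e0e4736")
logger.info("order_placed")
```

Because `contextvars` is async-aware, the bound ID stays correct even when requests interleave on the same event loop.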
Set Log Levels Correctly in Production
Log levels exist to control signal density. Production systems should emit at INFO level by default. DEBUG should be off in production unless you are actively investigating an issue.
Here is what each level means in practice:
ERROR: Something failed and requires attention. A database query threw an exception. An API call returned a 500. A payment was declined unexpectedly. Alerting systems should watch this level.
WARN: Something unexpected happened but the system handled it. A retry succeeded on the second attempt. A cache miss triggered a fallback. A deprecated endpoint received traffic.
INFO: Normal business events that matter for operations. A user signed up. An order shipped. A deployment completed. A job finished processing.
DEBUG: Internal state useful during development or troubleshooting. Variable values, function entry/exit, intermediate computation steps. Off in production by default.
The common mistake is logging too much at INFO. If a message fires thousands of times per minute during normal operation, it belongs at DEBUG or it should not exist. High-frequency INFO logs are the primary driver of observability cost inflation.
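A quick way to see level filtering at the source: with the stdlib `logging` module, setting the logger to INFO means DEBUG lines are never formatted or emitted at all. This sketch uses a hypothetical minimal JSON formatter; field names follow the schema used in this post.

```python
import io
import json
import logging

class JsonFormatter(logging.Formatter):
    """Minimal JSON formatter (illustrative, not a library API)."""
    def format(self, record):
        return json.dumps({"level": record.levelname.lower(),
                           "event": record.getMessage()})

buf = io.StringIO()
handler = logging.StreamHandler(buf)
handler.setFormatter(JsonFormatter())

logger = logging.getLogger("orders")
logger.addHandler(handler)
logger.setLevel(logging.INFO)  # production default: INFO and above

logger.debug("cache_lookup")   # suppressed: below the threshold
logger.info("order_shipped")   # emitted
```

Suppressed lines cost nothing downstream because they never reach the handler, which is exactly the point of keeping DEBUG off in production.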
Standardize Your Field Names
When every service uses different field names for the same concept, downstream queries become painful. One service logs userId, another logs user_id, a third logs uid. Dashboards and alerts need to account for every variant.
Define a schema for your organization. Document it. Enforce it through shared logging libraries.
A minimal standard schema:
{
  "timestamp": "ISO 8601",
  "level": "info|warn|error|debug",
  "service": "service name",
  "environment": "production|staging",
  "correlation_id": "trace or request ID",
  "event": "human-readable event name",
  "message": "optional detail string"
}
Service-specific fields get added alongside these base fields. The base fields never change names across services. This makes cross-service queries, dashboards, and alerts dramatically simpler.
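A shared helper is the simplest enforcement mechanism: every service calls it, so the base fields can never drift. This is a hypothetical sketch of such a helper; the function name `log_event` and its defaults are assumptions, not an existing library.

```python
import datetime
import json

def log_event(event, service, level="info", environment="production",
              correlation_id="-", **fields):
    """Serialize one log event with the organization's base schema."""
    record = {
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "level": level,
        "service": service,
        "environment": environment,
        "correlation_id": correlation_id,
        "event": event,
        **fields,  # service-specific fields sit alongside the base schema
    }
    return json.dumps(record)

line = log_event("order_placed", service="checkout",
                 correlation_id="req-7f3a", total=129.99)
print(line)
```

In practice this logic would live inside your structlog processor chain or logging library rather than a standalone function, but the contract is the same: base fields are injected centrally, never typed by hand.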
Log Events, Not Sentences
Good structured logs describe what happened, not what the developer was thinking. Compare these two approaches:
Bad:
logger.info("We just finished processing the batch and it went fine")
Good:
logger.info("batch_processed", batch_id="b-1192", records=4500, duration_ms=2340)
The first version is a sentence for humans. The second version is data for machines and humans. The event name batch_processed is searchable. The fields are filterable and aggregatable. You can build dashboards on duration_ms and alert on records dropping below a threshold.
Use past-tense verb phrases for event names: order_placed, payment_failed, user_authenticated, deployment_completed. These read naturally in log viewers and query results.
Do Not Log Sensitive Data
Structured logging makes it easier to accidentally include PII, secrets, or credentials in your log stream. A JSON serializer will happily dump an entire request object, including authorization headers, session tokens, and personal data.
Build sanitization into your logging library. Strip or redact fields that match known sensitive patterns before serialization. Common targets include authorization headers, password fields, credit card numbers, social security numbers, and API keys.
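A redaction pass can be as simple as a recursive walk over the event dict before serialization. This is a minimal sketch; the key list is an assumption and should come from your organization's schema, and real implementations also match value patterns such as card numbers.

```python
import json

# Field names treated as sensitive (illustrative list, not exhaustive).
SENSITIVE_KEYS = {"password", "authorization", "api_key", "ssn", "card_number"}

def redact(obj):
    """Recursively replace sensitive values before serialization."""
    if isinstance(obj, dict):
        return {k: "[REDACTED]" if k.lower() in SENSITIVE_KEYS else redact(v)
                for k, v in obj.items()}
    if isinstance(obj, list):
        return [redact(v) for v in obj]
    return obj

event = {"event": "login_attempt", "user_id": "u-9823",
         "password": "hunter2", "headers": {"Authorization": "Bearer abc"}}
print(json.dumps(redact(event)))
```

Wired in as a structlog processor or formatter step, this runs on every event automatically, so no individual log call can leak a matching field.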
In regulated environments (healthcare, finance), logging PII creates a compliance obligation. Every log line containing a patient identifier or account number is now subject to HIPAA or PCI-DSS retention and access control requirements. It is cheaper and safer to not log that data in the first place.
Control Log Volume at the Source
Structured logging does not solve cost problems by itself. If you emit structured JSON at DEBUG level for every function call, you still generate massive volume. Structure helps you query and process that volume more efficiently, but the cheapest log line is the one you never create.
Review your log output periodically. Ask three questions about every log statement: Does anyone look at this? Does it drive an alert or dashboard? Would we miss it during an incident? If the answer to all three is no, remove it or drop it to DEBUG.
For the patterns that remain, consider what happens downstream. Healthcheck logs, heartbeat pings, and polling responses create repetitive patterns that add volume without adding signal. Filter those at the source or at the pipeline layer. For a step-by-step guide, see How to Drop Noisy Healthcheck Logs Before They Hit Your Observability Platform.
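Dropping healthcheck noise at the source can be a one-class change with a stdlib `logging.Filter`. The `/healthz` path is an assumed convention; substitute whatever your load balancer probes.

```python
import io
import logging

class DropHealthchecks(logging.Filter):
    """Reject healthcheck noise before it is ever emitted."""
    def filter(self, record):
        return "/healthz" not in record.getMessage()

buf = io.StringIO()
handler = logging.StreamHandler(buf)
handler.addFilter(DropHealthchecks())

logger = logging.getLogger("http")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("request_completed path=/healthz status=200")  # dropped
logger.info("request_completed path=/orders status=201")   # kept
```

Filtering at the pipeline layer works too, but a source-side filter means the line never consumes network, storage, or indexing capacity at all.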
What Comes After Good Logging Hygiene
Structured logging with consistent field names, correlation IDs, and appropriate log levels gives your team a foundation. From here, the next place to gain efficiency is your observability pipeline.
Grepr uses the structure in your logs to identify repetitive patterns automatically. When your logs are well-structured, the ML engine can group duplicates more precisely and preserve unique events more accurately. Good logging practices upstream make pipeline-level reduction more effective and more trustworthy.
The combination of disciplined logging at the application layer and intelligent reduction at the pipeline layer is what gets teams to 80 or 90 percent volume reduction without losing the signal they depend on.
Frequently Asked Questions
What is structured logging and why does it matter for modern applications?
Structured logging means emitting log events as machine-readable key-value pairs, typically in JSON format, instead of free-form text strings. It matters because structured logs can be parsed, filtered, queried, and aggregated automatically by every major observability platform, log shipper, and query engine. Unstructured text logs require custom regex patterns to extract useful fields, which is fragile, slow, and expensive at scale.
What is a correlation ID and how do you use it in distributed systems?
A correlation ID is a unique identifier assigned at the entry point of a request and propagated through every service that handles that request. By including the correlation ID in every log line, you can trace a single user action across dozens of microservices by searching for one value. If you use OpenTelemetry, the trace ID serves the same purpose. Attach it to your logger context so it appears in every log event automatically.
What log level should I use in production?
Production systems should run at INFO level by default. ERROR captures failures that require attention. WARN captures handled anomalies like successful retries or cache misses. INFO captures meaningful business events like signups, orders, and deployments. DEBUG should be off in production unless you are actively investigating a specific issue. The most common mistake is logging too much at INFO, which inflates volume and cost.
How do you prevent sensitive data from appearing in application logs?
Build sanitization directly into your logging library or middleware. Automatically strip or redact fields that match known sensitive patterns, including authorization headers, passwords, credit card numbers, API keys, and personally identifiable information. In regulated environments like healthcare or finance, logging PII creates retention and access control obligations under HIPAA or PCI-DSS. Prevention at the source is cheaper than compliance after the fact.
How does structured logging reduce observability costs?
Structured logging itself does not reduce volume, but it makes every downstream optimization more effective. Consistent field names and event types allow pipeline tools to identify and aggregate repetitive patterns with higher accuracy. Proper log levels let you suppress debug noise in production. Standardized schemas eliminate duplicate parsing and transformation costs. Combined with pipeline-level reduction, structured logging is the foundation that makes 80 to 90 percent volume reduction achievable.