Structured Logging Best Practices for Modern Apps in 2026

Steve Waterworth
May 4, 2026

Unstructured logs are a tax on every engineering team that touches them. They are harder to parse, harder to search, harder to alert on, and more expensive to store. Every time a developer writes logger.info(f"User {user_id} placed order {order_id}"), they create a string that requires regex extraction to be useful downstream.

Structured logging fixes this by emitting logs as key-value pairs, typically in JSON format. The data is machine-readable from the moment it leaves the application. Parsers, pipelines, and query engines all benefit.

This post covers the practices that matter most for production systems in 2026, with concrete examples in Python and patterns that apply across languages.

Use JSON as Your Log Format

JSON is the standard. Every major log shipper (Fluent Bit, Vector, OpenTelemetry Collector), every observability platform (Datadog, Splunk, Grafana), and every object storage query engine (Athena, DuckDB, Trino) handles JSON natively.

In Python, use the structlog library or python-json-logger instead of the default logging module's text formatter.

import structlog

structlog.configure(
    processors=[
        structlog.processors.TimeStamper(fmt="iso"),  # add an ISO 8601 timestamp to every event
        structlog.processors.JSONRenderer()           # render each event as a single JSON object
    ]
)

logger = structlog.get_logger()

logger.info("order_placed", user_id="u-9823", order_id="ord-4411", total=129.99)

This produces:

{"event": "order_placed", "user_id": "u-9823", "order_id": "ord-4411", "total": 129.99, "timestamp": "2026-03-17T14:22:08Z"}

Every field is immediately queryable. No parsing rules needed downstream.

Include Correlation IDs in Every Log Line

Distributed systems generate logs across dozens of services for a single user request. Without a shared identifier, tracing a request from ingress to database to response is manual detective work.

Assign a correlation ID (also called a request ID or trace ID) at the entry point of every request. Propagate it through every service call. Include it in every log line.

import uuid
import structlog

def middleware(request, call_next):
    # Reuse the caller's ID when present; otherwise mint a new one.
    correlation_id = request.headers.get("X-Correlation-ID", str(uuid.uuid4()))
    # Clear any context left over from a previous request, then bind the ID
    # so every log line emitted while handling this request carries it.
    structlog.contextvars.clear_contextvars()
    structlog.contextvars.bind_contextvars(correlation_id=correlation_id)
    return call_next(request)

Now every log line emitted during that request carries the same correlation_id. When something breaks, you search for that single value and get the complete timeline across all services.

If you already use OpenTelemetry, the trace ID serves this purpose. Attach it to your logger context so it appears in every log line alongside the span ID.
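Here is a minimal sketch of that attachment using the structlog setup from earlier. It assumes the opentelemetry-api package is installed; the add_trace_context processor name is just for illustration.

from opentelemetry import trace

def add_trace_context(logger, method_name, event_dict):
    # Copy the active span's IDs onto the event so logs line up with traces.
    ctx = trace.get_current_span().get_span_context()
    if ctx.is_valid:
        event_dict["trace_id"] = format(ctx.trace_id, "032x")
        event_dict["span_id"] = format(ctx.span_id, "016x")
    return event_dict

Register add_trace_context in the processors list ahead of the JSONRenderer, and every event emitted inside a span carries trace_id and span_id automatically.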

Set Log Levels Correctly in Production

Log levels exist to control signal density. Production systems should emit at INFO level by default. DEBUG should be off in production unless you are actively investigating an issue.

Here is what each level means in practice:

ERROR: Something failed and requires attention. A database query threw an exception. An API call returned a 500. A payment was declined unexpectedly. Alerting systems should watch this level.

WARN: Something unexpected happened but the system handled it. A retry succeeded on the second attempt. A cache miss triggered a fallback. A deprecated endpoint received traffic.

INFO: Normal business events that matter for operations. A user signed up. An order shipped. A deployment completed. A job finished processing.

DEBUG: Internal state useful during development or troubleshooting. Variable values, function entry/exit, intermediate computation steps. Off in production by default.

The common mistake is logging too much at INFO. If a message fires thousands of times per minute during normal operation, it belongs at DEBUG or it should not exist. High-frequency INFO logs are the primary driver of observability cost inflation.
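One way to hold that line with structlog is a filtering bound logger whose threshold comes from the environment. A minimal sketch; the LOG_LEVEL variable name is a convention of this example, not something structlog requires.

import logging
import os
import structlog

# Default to INFO in production; switch to DEBUG only while investigating.
level = getattr(logging, os.environ.get("LOG_LEVEL", "INFO").upper(), logging.INFO)

structlog.configure(
    # Keep the processors shown earlier; wrapper_class only controls the threshold.
    wrapper_class=structlog.make_filtering_bound_logger(level)
)

logger = structlog.get_logger()
logger.debug("cache_lookup", key="user:u-9823")   # dropped at INFO
logger.info("order_placed", order_id="ord-4411")  # emitted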

Standardize Your Field Names

When every service uses different field names for the same concept, downstream queries become painful. One service logs userId, another logs user_id, a third logs uid. Dashboards and alerts need to account for every variant.

Define a schema for your organization. Document it. Enforce it through shared logging libraries.

A minimal standard schema:

{
  "timestamp": "ISO 8601",
  "level": "info|warn|error|debug",
  "service": "service name",
  "environment": "production|staging",
  "correlation_id": "trace or request ID",
  "event": "human-readable event name",
  "message": "optional detail string"
}

Service-specific fields get added alongside these base fields. The base fields never change names across services. This makes cross-service queries, dashboards, and alerts dramatically simpler.
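Enforcement is easiest through a small processor in the shared logging package every service imports. A minimal sketch, assuming SERVICE_NAME and ENVIRONMENT are set at deploy time; both variable names are illustrative.

import os
import structlog

def add_base_fields(logger, method_name, event_dict):
    # Stamp every event with the organization-wide base fields.
    event_dict.setdefault("service", os.environ.get("SERVICE_NAME", "unknown"))
    event_dict.setdefault("environment", os.environ.get("ENVIRONMENT", "production"))
    return event_dict

structlog.configure(
    processors=[
        add_base_fields,
        structlog.processors.add_log_level,           # emits the "level" field
        structlog.processors.TimeStamper(fmt="iso"),
        structlog.processors.JSONRenderer()
    ]
)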

Log Events, Not Sentences

Good structured logs describe what happened, not what the developer was thinking. Compare these two approaches:

Bad:

logger.info("We just finished processing the batch and it went fine")

Good:

logger.info("batch_processed", batch_id="b-1192", records=4500, duration_ms=2340)

The first version is a sentence for humans. The second version is data for machines and humans. The event name batch_processed is searchable. The fields are filterable and aggregatable. You can build dashboards on duration_ms and alert on records dropping below a threshold.

Use past-tense verb phrases for event names: order_placed, payment_failed, user_authenticated, deployment_completed. These read naturally in log viewers and query results.

Do Not Log Sensitive Data

Structured logging makes it easier to accidentally include PII, secrets, or credentials in your log stream. A JSON serializer will happily dump an entire request object, including authorization headers, session tokens, and personal data.

Build sanitization into your logging library. Strip or redact fields that match known sensitive patterns before serialization. Common targets include authorization headers, password fields, credit card numbers, social security numbers, and API keys.
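With structlog, that sanitization can live in a processor that runs before the JSONRenderer. A minimal sketch built around a fixed set of key names; production setups usually also match value patterns such as card numbers.

SENSITIVE_KEYS = {"password", "authorization", "api_key", "credit_card", "ssn"}

def redact_sensitive(logger, method_name, event_dict):
    # Overwrite values of known-sensitive keys before serialization.
    for key in event_dict:
        if key.lower() in SENSITIVE_KEYS:
            event_dict[key] = "[REDACTED]"
    return event_dict

Register redact_sensitive ahead of the JSONRenderer so the raw values never reach the output stream.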

In regulated environments (healthcare, finance), logging PII creates a compliance obligation. Every log line containing a patient identifier or account number is now subject to HIPAA or PCI-DSS retention and access control requirements. It is cheaper and safer to not log that data in the first place.

Control Log Volume at the Source

Structured logging does not solve cost problems by itself. If you emit structured JSON at DEBUG level for every function call, you still generate massive volume. Structure helps you query and process that volume more efficiently, but the cheapest log line is the one you never create.

Review your log output periodically. Ask three questions about every log statement: Does anyone look at this? Does it drive an alert or dashboard? Would we miss it during an incident? If the answer to all three is no, remove it or drop it to DEBUG.

For the patterns that remain, consider what happens downstream. Healthcheck logs, heartbeat pings, and polling responses create repetitive patterns that add volume without adding signal. Filter those at the source or at the pipeline layer. For a step-by-step guide, see How to Drop Noisy Healthcheck Logs Before They Hit Your Observability Platform.
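At the application layer, a structlog processor can discard those events outright by raising DropEvent. A minimal sketch that assumes a path field is already bound on the event; the /healthz and /readyz paths are examples, not requirements.

import structlog

def drop_healthchecks(logger, method_name, event_dict):
    # Throw away healthcheck and readiness-probe events before rendering.
    if event_dict.get("path") in ("/healthz", "/readyz"):
        raise structlog.DropEvent
    return event_dict

Placed early in the processors list, this removes the noise before it ever leaves the process; the pipeline layer handles whatever the application cannot see.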

What Comes After Good Logging Hygiene

Structured logging with consistent field names, correlation IDs, and appropriate log levels gives your team a foundation. From here, the next place to gain efficiency is your observability pipeline.

Grepr uses the structure in your logs to identify repetitive patterns automatically. When your logs are well-structured, the ML engine can group duplicates more precisely and preserve unique events more accurately. Good logging practices upstream make pipeline-level reduction more effective and more trustworthy.

The combination of disciplined logging at the application layer and intelligent reduction at the pipeline layer is what gets teams to 80 or 90 percent volume reduction without losing the signal they depend on.

Frequently Asked Questions

What is structured logging and why does it matter for modern applications?

Structured logging means emitting log events as machine-readable key-value pairs, typically in JSON format, instead of free-form text strings. It matters because structured logs can be parsed, filtered, queried, and aggregated automatically by every major observability platform, log shipper, and query engine. Unstructured text logs require custom regex patterns to extract useful fields, which is fragile, slow, and expensive at scale.

What is a correlation ID and how do you use it in distributed systems?

A correlation ID is a unique identifier assigned at the entry point of a request and propagated through every service that handles that request. By including the correlation ID in every log line, you can trace a single user action across dozens of microservices by searching for one value. If you use OpenTelemetry, the trace ID serves the same purpose. Attach it to your logger context so it appears in every log event automatically.

What log level should I use in production?

Production systems should run at INFO level by default. ERROR captures failures that require attention. WARN captures handled anomalies like successful retries or cache misses. INFO captures meaningful business events like signups, orders, and deployments. DEBUG should be off in production unless you are actively investigating a specific issue. The most common mistake is logging too much at INFO, which inflates volume and cost.

How do you prevent sensitive data from appearing in application logs?

Build sanitization directly into your logging library or middleware. Automatically strip or redact fields that match known sensitive patterns, including authorization headers, passwords, credit card numbers, API keys, and personally identifiable information. In regulated environments like healthcare or finance, logging PII creates retention and access control obligations under HIPAA or PCI-DSS. Prevention at the source is cheaper than compliance after the fact.

How does structured logging reduce observability costs?

Structured logging itself does not reduce volume, but it makes every downstream optimization more effective. Consistent field names and event types allow pipeline tools to identify and aggregate repetitive patterns with higher accuracy. Proper log levels let you suppress debug noise in production. Standardized schemas eliminate duplicate parsing and transformation costs. Combined with pipeline-level reduction, structured logging is the foundation that makes 80 to 90 percent volume reduction achievable.
