Nobody sets out to overspend on observability. You add a service, spin up some dashboards, and suddenly finance is asking why the Datadog bill tripled.
1. Log Volume is Growing Faster Than Your Infrastructure
Your compute footprint grew 20% last quarter. Your log volume grew 80%. That math doesn't work.
Jitsu, a last-mile delivery company, handles millions of shipments per month. Each shipment generates 400 logs. Over 99% of shipments are successful, meaning most of those logs just confirm things are working. As CTO Evan Robinson put it, "We only need to review logs to understand why a shipment has gone wrong."
That's the pattern. Most log volume comes from healthy operations, not problems. You're paying to ingest and index confirmations that everything is fine.
2. You're Paying Premium Rates to Store Data Nobody Queries
The access logs from your load balancer? Probably untouched unless there's an incident. Debug output from healthy services? Accumulating charges. That Kubernetes audit log you enabled because someone said you should? Burning money in the background.

Goldsky, a Web3 data platform, was ingesting 12 terabytes of logs per month into Datadog. After analyzing what they actually needed for alerting and troubleshooting, they cut that to 795 gigabytes. The other 11+ terabytes? Noise they were paying premium rates to store.
3. Your Observability Bill is Approaching Your Infrastructure Bill
FOSSA, an open-source compliance platform, built over 100 custom logging dashboards and about 30 log-based alerts in Datadog. As their platform scaled, so did log volume, and so did the bill.
According to Dave Bortz, VP of Engineering: "We first tried to reduce our bills by being more selective about what we logged, but that's a slow moving process that requires a lot of engineering time, and ultimately limits how much visibility we have into our system."
That's the trap. You can either pay the bill or pay engineers to manually tune logging across every service. Neither option is good.
4. Engineers Are Making Instrumentation Decisions Based on Budget
Your platform engineers are in sprint planning saying "we can't add tracing to that service, it'll blow our budget" or "let's sample these logs at 10% to save money."
That's not cost-consciousness. That's building blind spots into your system.
The real cost shows up later. An outage hits, someone asks "what happened at 3:47 AM," and the answer is "we don't know, we sampled that out." Teams spend weeks debugging issues that would have taken hours with complete data.
5. You Can't Explain Where the Money Goes
Which services produce the most log volume? What's the split between error, warn, info, and debug? How much is retention versus ingestion?
Some observability platforms break down usage by service and environment. Others hand you a monthly invoice with a single number. If you're in the second category, start tagging your logs by source and tracking volume over time.
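If your platform won't give you that breakdown, a few lines of scripting will get you a first answer. The sketch below is a minimal, hypothetical example: it assumes JSON-lines logs with "service" and "level" fields (rename them to match your own schema) and tallies volume per service and level so you can see which sources dominate the bill.

```python
import json
import sys
from collections import defaultdict

# Tally log volume (bytes and line count) per service and log level.
# Assumes JSON-lines input with "service" and "level" fields -- adjust
# the field names to whatever your own log schema uses.
bytes_by_key = defaultdict(int)
lines_by_key = defaultdict(int)

for raw_line in sys.stdin:
    try:
        record = json.loads(raw_line)
    except json.JSONDecodeError:
        continue  # skip lines that aren't JSON
    key = (record.get("service", "unknown"), record.get("level", "unknown"))
    bytes_by_key[key] += len(raw_line.encode("utf-8"))
    lines_by_key[key] += 1

# Print the biggest contributors first so you can see where the spend goes.
for (service, level), size in sorted(bytes_by_key.items(), key=lambda kv: -kv[1]):
    count = lines_by_key[(service, level)]
    print(f"{service:<30} {level:<8} {count:>10} lines {size / 1e6:>10.1f} MB")
```

Run it against a day's worth of exported logs (for example, `cat sample.log | python log_volume.py`) and the top few rows usually account for most of the cost.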
What Actually Works
The pattern behind all five signs is the same: you're paying premium prices for commodity data.
Keep your raw logs in object storage. Send summarized data and anomalies to your observability platform for alerting. When you need the full detail, pull it from storage.
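Here's a minimal sketch of that pattern, with hypothetical names throughout: the `acme-raw-logs` bucket, the field names, and an error/warn filter standing in for real anomaly detection. It isn't any particular vendor's implementation; it just shows the shape of the pipeline: archive every raw line cheaply, forward anomalies in full, and collapse healthy lines into per-service rollups.

```python
import gzip
import json
import time

import boto3

s3 = boto3.client("s3")
RAW_BUCKET = "acme-raw-logs"  # hypothetical bucket name


def archive_batch(log_lines):
    """Write the full batch of raw logs to cheap object storage."""
    key = f"raw/{time.strftime('%Y/%m/%d/%H')}/{int(time.time())}.json.gz"
    body = gzip.compress("\n".join(log_lines).encode("utf-8"))
    s3.put_object(Bucket=RAW_BUCKET, Key=key, Body=body)
    return key  # handy for backfilling the full detail later


def summarize_batch(log_lines):
    """Forward anomalies in full, plus one rollup line per healthy service."""
    forward, healthy_counts = [], {}
    for line in log_lines:
        record = json.loads(line)
        if record.get("level") in ("error", "warn"):
            forward.append(record)  # anomalies keep full detail
        else:
            svc = record.get("service", "unknown")
            healthy_counts[svc] = healthy_counts.get(svc, 0) + 1
    # One summary line per service replaces thousands of "all good" lines.
    for svc, n in healthy_counts.items():
        forward.append({"service": svc, "level": "info",
                        "message": f"{n} healthy log lines summarized"})
    return forward
```

When an incident hits, the object key returned by `archive_batch` is enough to pull the full-fidelity logs for that window back out of storage.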
Goldsky reduced indexed logs from 5.7 billion messages to 250 million, a 96% reduction. The impact on mean time to resolution? "No impact," according to Lead Engineer Paymahn Moghadasian. "In fact, the reduction in noise made the logs easier to read through and understand."
Read Case Study: Goldsky: 96% Reduction in Datadog Logging Costs
Jitsu deployed in under an hour and saw a 90% volume reduction within 15 minutes. They still meet their 13-month compliance retention requirement by storing raw logs in S3.
FOSSA saw a 95% reduction in log volume. "We didn't have to change a thing in our app," Bortz said.
Read Case Study: How FOSSA Reduced Their Logs by 95% Without Burdening Their Engineers
The raw data still exists. The important signals still hit dashboards. The bill drops.
Try Grepr for free today.
FAQ
How much can you realistically reduce observability costs?
Companies typically see 85% to 96% reductions in log ingestion costs. Goldsky reduced indexed logs from 5.7 billion to 250 million messages per month. FOSSA and Jitsu both achieved over 90% reductions. Results depend on how much repetitive, low-signal data your systems generate.
Will reducing log volume hurt my ability to debug incidents?
No. The approach keeps all raw data in low-cost storage and sends summarized data to your observability platform. When an incident occurs, you backfill the detailed logs you need. Goldsky reported "no impact" on mean time to resolution after cutting log volume by 96%.
How long does it take to implement log volume reduction?
Initial deployment typically takes under an hour. Jitsu saw results within 15 minutes of deployment. A full production rollout with validation takes longer. Goldsky completed their phased rollout in 4 weeks, migrating service by service.
Do I have to change my existing dashboards and alerts?
No. The goal is to reduce costs without disrupting workflows. FOSSA kept their 100+ custom dashboards and 30 log-based alerts intact. Summarized data maintains the same patterns and metrics your existing monitoring depends on.
What happens to compliance and audit requirements?
Raw logs are stored in object storage like S3 for as long as you need them. Jitsu maintains 13 months of log retention for compliance while paying a fraction of what full indexing would cost. You keep the data; you just stop paying premium rates to store it.
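If you manage the bucket yourself, retention becomes a one-time lifecycle rule rather than an ongoing cost decision. A hypothetical example with boto3 (the bucket name and prefix are made up; 396 days approximates 13 months):

```python
import boto3

s3 = boto3.client("s3")

# Hypothetical bucket and prefix. Moves raw logs to a colder storage tier
# after 30 days and expires them automatically after roughly 13 months.
s3.put_bucket_lifecycle_configuration(
    Bucket="acme-raw-logs",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "raw-log-retention",
                "Filter": {"Prefix": "raw/"},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": 30, "StorageClass": "GLACIER_IR"}
                ],
                "Expiration": {"Days": 396},  # ~13 months
            }
        ]
    },
)
```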