Three Weeks, Three Conferences, One Clear Message About Observability Costs

Summer Lambert
December 17, 2025

We attended KubeCon, AWS re:Invent, and Gartner's IT Infrastructure conference over the past few weeks. The same cost concerns came up at all three.

At KubeCon's 10th anniversary, the OpenTelemetry sessions had standing room only. Teams are standardizing collection methods because vendor pricing varies wildly across hosts, metrics, logs, and traces. Pipeline inefficiencies now translate directly to budget overruns.

The retirement of Ingress NGINX, used in nearly half of all Kubernetes clusters, added migration planning to already stretched budgets. (O'Reilly KubeCon Recap)

AWS Focused on Agentic AI and Cost Reality

re:Invent centered on agentic AI: AWS unveiled Graviton5 processors, Trainium3 UltraServers, frontier agents, and the Nova model family. All powerful technology, but the subtext remained constant: AI infrastructure demands observability, and that observability must scale economically.

One observation from the week captured the tension perfectly: "Nobody wants to be Datadog's number one customer." Being the top spender on any observability platform signals a data strategy problem, not a success metric.

AWS positioned CloudWatch updates around predictable costs and integrated data management rather than feature sprawl. The message: as AI infrastructure scales, observability needs to scale economically alongside it.

Gartner Addressed Budget Justification

The Gartner conference drew 4,000 attendees focused on strategic decisions and budget justification. Paul Delory and Autumn Stanish's opening keynote highlighted that 52% of CIOs list cost-cutting as their top 2026 priority, with AI positioned as the enabler.

Nathan Hill's presentation identified misalignment between infrastructure heads and CIOs as the primary infrastructure and operations (I&O) challenge. For observability, this disconnect typically results in collecting data that serves no business purpose while missing signals stakeholders actually need.

Data Volumes vs. Data Value

Teams collect far more data than they analyze. Secoda's research found 80% of log data provides no analytical value, yet companies pay to ingest, store, and query it anyway. (Source)

Here's what that looks like: a traffic spike hits, and your Kubernetes cluster autoscales from 50 to 300 pods. Each new pod starts logging. Ten minutes later the spike ends and the pods terminate, but you've already paid to ingest and index logs from 250 containers that no longer exist. Repeat daily.
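The cost of a spike like that is easy to estimate. Here's a back-of-envelope sketch; the pod counts match the scenario above, but the per-pod log rate and per-GB ingest price are illustrative assumptions, not figures from any specific vendor:

```python
def spike_ingest_cost(extra_pods: int, minutes: float,
                      mb_per_pod_min: float, price_per_gb: float) -> float:
    """Ingest cost for logs from short-lived pods during one spike."""
    total_gb = extra_pods * minutes * mb_per_pod_min / 1024
    return total_gb * price_per_gb

# 250 extra pods for 10 minutes, at an assumed 2 MB of logs per
# pod-minute and an assumed $0.50/GB ingest price:
cost = spike_ingest_cost(250, 10, 2.0, 0.50)
print(f"~${cost:.2f} per spike, ~${cost * 30:.0f}/month if it repeats daily")
```

The per-spike number looks small until you multiply it across every deployment, every cluster, and every daily traffic pattern; that multiplication is where the budget overruns come from.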

Storage costs climb, so teams add more tools to manage them and end up running 10+ monitoring platforms. Meanwhile vendors charge by volume, and volumes keep growing.

Your Team Goals for 2026

Stop collecting everything by default. OpenTelemetry standardizes how you collect data, but you still decide what to keep. Intelligent sampling, storage tiering, and selective retention cut costs by 60-80% according to multiple vendors at the conferences. (Source)

Justify your observability spend. 54% of IT decision-makers now face pressure from leadership to explain observability costs. (Source) Only 17% view it as a growth investment; the remaining 70% focus on optimization.

Focus on signals that matter. Platform teams that enforce data collection standards save money while improving incident response. Keep high-fidelity telemetry for critical paths. Sample aggressively everywhere else.
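One way to encode "high fidelity for critical paths, sample aggressively everywhere else" is a routing rule at the collection layer. A sketch with hypothetical service names and destination labels:

```python
# Hypothetical set of critical services; in a real pipeline this
# would come from configuration, not a hard-coded constant.
CRITICAL_SERVICES = {"checkout", "payments", "auth"}

def destination(record: dict) -> str:
    """Route a log record: premium platform for critical paths and
    errors, low-cost archive storage for everything else."""
    if record.get("service") in CRITICAL_SERVICES:
        return "premium"   # full fidelity, indexed, queryable
    if record.get("level") == "ERROR":
        return "premium"   # errors matter everywhere
    return "archive"       # cheap storage, rehydrate on demand
```

The design choice worth noting: routing happens before ingestion, so the premium platform only ever bills you for the high-signal subset, while the archive preserves everything for the rare incident that needs it.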

The Path Forward

The "collect everything" approach worked when data was cheap and budgets were unlimited. Both conditions have ended. Observability pipelines, intelligent sampling, and adaptive telemetry are table stakes now.

Your observability strategy needs to answer: 

  • What data actually improves incident response time? 
  • What can you safely sample or discard? 
  • Where do you need full-fidelity traces versus aggregated metrics?

Build these decisions into your collection layer from the start. Retrofitting cost controls after you're already paying for petabytes of low-value logs doesn't work.

Ready to optimize observability and reduce your costs while improving visibility? Grab a free demo to get your team started with observability that prioritizes signal over noise.
