All Observability Data Is Equal But Some Is More Equal Than Others

Steve Waterworth
June 24, 2025

Observability vendors like to tell us that all data is important all the time, and that if you do not collect absolutely everything in microscopic detail you’ll regret it. They promote FOMO (fear of missing out), preaching that the piece of data you chose not to collect is exactly the item you need to solve an issue, and that there’s no going back in time to get it; the horse has bolted and you missed the chance to close the stable door. Of course, they often fail to point out that they charge by data volume: the more you send, the better it is for their shareholders.

Different Levels Of Equality

Observability data falls into one of two categories: heartbeat or heart attack. Some data is required all the time to provide heartbeat confirmation; the absence of this data is itself a cause for alarm. Take the access logs for a web server: it’s comforting to see requests passing through with 2XX status codes. This gives us a warm feeling that the website is up and serving requests in a prompt and error-free manner.

Should the web server access log fall silent, that is a cause for concern: either our website has suddenly become immensely unpopular or something broke. It is under these heart attack circumstances that the rest of the observability data suddenly comes in handy.
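
The check itself is easy to express. Here is a minimal sketch in Python (illustrative, not a production monitor): record when the last 2XX entry was seen, and run the silence check on its own timer so that a completely quiet log still fires the alert.

    import time

    TIMEOUT = 60  # seconds of 2XX silence before we worry

    last_ok = time.monotonic()

    def on_log_line(line):
        # Call for every access log entry as it arrives.
        # Status code position assumes combined log format (illustrative).
        global last_ok
        status = line.split()[-2]
        if status.startswith("2"):
            last_ok = time.monotonic()

    def check_heartbeat():
        # Run periodically from a scheduler, independent of log flow,
        # so that total silence still triggers the alert.
        if time.monotonic() - last_ok > TIMEOUT:
            print(f"ALERT: no 2XX responses in the last {TIMEOUT}s")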

Every Heartbeat

Granted, heartbeat observability data is essential to ensure service continuity by providing the data for alerting. But is every log line really necessary? Can we skip a heartbeat or two? There’s that FOMO feeling again. All that is really required to know all is well is that the number of 2XX status log entries is steadily increasing. However, if those log entries are being used to create a metric, then skipping a beat or two has consequences for metric accuracy.
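
To make the consequence concrete, here is an illustrative back-of-envelope in Python, assuming a request-count metric is derived by counting log lines: drop lines without compensation and the metric drops with them.

    # Illustrative only: a request-count metric derived by counting log lines.
    true_requests = 10_000   # requests actually served in the window
    keep_rate = 0.10         # keep 1 line in 10 to cut volume

    lines_kept = int(true_requests * keep_rate)

    naive_metric = lines_kept               # 1000 -- reads as a 90% traffic drop
    scaled_metric = lines_kept / keep_rate  # 10000.0 -- valid only if sampling is uniform

    print(naive_metric, scaled_metric)

Uniform scaling rescues the average but not the tails: at a 10% keep rate, a small burst of rare 5XX lines can vanish entirely, which is exactly the FOMO scenario.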

Step Back In Time

When attempting to resolve an issue, every morsel of observability data is indispensable in determining the root cause. This is where FOMO really kicks in and Murphy’s Law comes into force. It is surely a complete certainty that the data you chose not to collect is exactly the data you need. If only it were possible to step back in time and collect the missing data, then the root cause would be obvious and service level objectives would be promptly restored.

Heartbeat Summaries

Grepr’s AI-powered data pipeline analyzes the semantics of observability data in real time. It dynamically manages a set of transformation rules that reduce data volume by 90%, cutting observability platform costs by a similar amount. Taking web server access logs as an example, frequent messages are summarized while sparse messages pass straight through.
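
A toy version of that idea in Python shows the shape (Grepr’s actual rules are learned and far more sophisticated; the template masking and threshold here are assumptions for illustration): group messages by a template with the variable parts masked out, collapse frequent templates to one summary carrying a repeat count (the grepr.repeatCount field described below), and pass rare messages through untouched.

    import re
    from collections import Counter

    FREQUENT = 10  # templates seen this often in a window get summarized

    def template(msg):
        # Mask variable parts so similar messages share a key (illustrative).
        msg = re.sub(r"\d+\.\d+\.\d+\.\d+", "<ip>", msg)
        return re.sub(r"\d+", "<n>", msg)

    def transform(window):
        counts = Counter(template(m) for m in window)
        out, emitted = [], set()
        for m in window:
            key = template(m)
            if counts[key] < FREQUENT:
                out.append({"message": m, "grepr.repeatCount": 1})  # pass through
            elif key not in emitted:
                out.append({"message": key, "grepr.repeatCount": counts[key]})
                emitted.add(key)  # one summary per template per window
        return out

    # transform(["GET /health 200 from 10.0.0.1"] * 12 + ["disk error on /dev/sda1"])
    # -> one summary with repeat count 12, plus the disk error passed through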

As well as summarizing, the dynamic transformation rules also add supplementary metadata. The extra metadata is used to ensure total accuracy of any derived metrics.

The grepr.repeatCount field is used as the basis for metric generation; this will be 1 for messages that are not summarized.
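
An exact count can therefore be recovered by summing that field instead of counting events; a quick illustration with made-up events:

    # Made-up events: one summarized, one passed straight through.
    events = [
        {"message": "GET /health 200", "grepr.repeatCount": 9542},
        {"message": "GET /checkout 500", "grepr.repeatCount": 1},
    ]

    total = sum(e["grepr.repeatCount"] for e in events)
    print(total)  # 9543 -- identical to counting every raw line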

Time Travel Is Possible

All data sent to Grepr is retained in low-cost storage; no data is dropped. Better still, that data is available to query in a language you are already familiar with: Splunk, Datadog, New Relic, and more coming soon. Using the Grepr web dashboard, the entire dataset is available to assist in identifying the root cause of issues. The results of a query can optionally be submitted as a backfill job, making the data quickly available on your observability platform. It’s like stepping back in time and turning up the detail just before the issue occurred. Continue to use the same workflows in the tool you already know: zero disruption and no learning curve. Backfill jobs can also be triggered automatically via webhook when a query matches.
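
As a sketch of what that automation could look like, here is a minimal webhook receiver in Python that turns an incoming alert into a backfill request. The Grepr API URL, payload fields, and response handling are hypothetical, not the documented interface.

    # Hypothetical sketch: the API URL and payload shape are assumptions,
    # not Grepr's documented interface.
    from flask import Flask, request
    import requests

    app = Flask(__name__)
    GREPR_API = "https://api.grepr.example/v1/backfill-jobs"  # hypothetical

    @app.post("/alert-webhook")
    def on_alert():
        alert = request.get_json()
        job = {
            "query": alert.get("query", "status:5*"),  # e.g. a Datadog-style query
            "start": alert["window_start"],            # rewind to just before the issue
            "end": alert["window_end"],
        }
        resp = requests.post(GREPR_API, json=job, timeout=10)
        return {"backfill_submitted": resp.ok}, 200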

Don’t Miss Out

With Grepr you no longer need to fear missing out: you can collect it all without fearing the cost implications. Heartbeat data is summarized, and heart attack response data is immediately available via backfill. With Grepr you get 100% of the insight from 10% of the data, keeping observability platform expenditure contained.
