Monitored Objects: How Grepr reduces Datadog metrics and host costs

Jad Naous
February 12, 2025
[Diagram: Grepr takes in a grid of approximately 100 hosts, most of them normal (green) and two anomalous (red), and sends only three hosts to Datadog: the aggregated normal hosts and the individual anomalous hosts.]

If you’re managing observability with Datadog, you’ve likely noticed two big cost drivers: custom metrics and host counts. Datadog charges based on unique metric time series (i.e., metric cardinality) and active hosts, both of which can grow quickly. At Grepr, we’ve developed a smarter way to manage observability, one that cuts those costs significantly without compromising developer workflows.
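To make "metric cardinality" concrete, here is a minimal sketch that counts unique time series as distinct combinations of metric name and tag set. The metric names, tags, and the `count_time_series` helper are made up for illustration; they are not Grepr or Datadog APIs.

```python
def count_time_series(points):
    """Count unique (metric name, tag set) combinations in a batch of points."""
    return len({(p["metric"], frozenset(p["tags"].items())) for p in points})


# Illustrative data points (hypothetical names and values).
points = [
    {"metric": "cpu.user", "tags": {"host": "web-1"}, "value": 0.31},
    {"metric": "cpu.user", "tags": {"host": "web-1"}, "value": 0.35},  # same series
    {"metric": "cpu.user", "tags": {"host": "web-2"}, "value": 0.28},
    {"metric": "mem.used", "tags": {"host": "web-1"}, "value": 0.62},
]

print(count_time_series(points))
```

Every additional host (or any other tag value) multiplies the number of series, which is why per-host metrics drive cardinality up.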

Why is metrics observability so expensive?

Metrics storage systems (time series databases, or TSDBs) are expensive because they offer:

  • Fast, low-latency queries for immediate answers
  • Fresh data within seconds of arriving
  • Historical access for comparisons

But do all metrics need this level of service? We studied how metrics are used and found four core use cases:

  1. Monitoring and alerting: Detect issues in real time
  2. Troubleshooting: Pinpoint the source of an issue
  3. Dashboarding: Display aggregated performance views
  4. Optimization & capacity planning: Compare changes, fine-tune performance, understand utilization

Not every use case requires all TSDB features or all the data all the time, and by aligning metrics service to actual needs, we can dramatically cut costs.

The Grepr Approach: Smart Tiering with Integrated Anomaly Detection

The TSDB (Datadog) is your expensive hot storage, so Grepr stores only the most useful data there, and only when it is needed. Grepr pairs the TSDB with an observability-optimized data lake that serves as low-cost cold storage, and automatically decides which data goes to which tier based on the current situation.

How It Works:

  1. Monitored Objects: Metrics are grouped by objects like hosts, containers, jobs, AI models, or HTTP requests. Grepr applies built-in anomaly detection to identify abnormal behavior in monitored objects.
  2. Aggregating Normal Data: “Normal” objects are aggregated into virtual entities to reduce noise and save costs. The original raw data is stored in the data lake, while the aggregated data is sent to the TSDB.
  3. Fine-Grained Anomaly Data: When an object becomes anomalous, Grepr sends its detailed metrics to the TSDB along with historical data from the data lake for full visibility.

Result: If you’re billed by objects (such as hosts), by metric cardinality, or by the number of datapoints sent, you’ll see a major drop in your billable items.
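The three steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not Grepr's implementation: the `route` function, the `is_anomalous` callback, the `"aggregate"` key, and the use of a plain mean as the aggregation are all hypothetical, and the historical backfill from the data lake is left out.

```python
from statistics import mean


def route(samples, is_anomalous):
    """Split per-object samples into data-lake and TSDB outputs.

    samples: dict mapping object id -> dict of metric name -> value
    is_anomalous: callable(object_id, metrics) -> bool (hypothetical detector)
    """
    data_lake = dict(samples)  # raw data for every object goes to cold storage
    tsdb = {}
    normal = {}
    for obj_id, metrics in samples.items():
        if is_anomalous(obj_id, metrics):
            tsdb[obj_id] = metrics  # anomalous objects keep full detail
        else:
            normal[obj_id] = metrics
    if normal:
        # Collapse all normal objects into one virtual entity.
        # Averaging is an assumption; other aggregations are possible.
        names = set().union(*(m.keys() for m in normal.values()))
        tsdb["aggregate"] = {
            name: mean(m[name] for m in normal.values() if name in m)
            for name in names
        }
    return data_lake, tsdb


samples = {
    "host-1": {"cpu": 0.30},
    "host-2": {"cpu": 0.34},
    "host-3": {"cpu": 0.97},
}
data_lake, tsdb = route(samples, lambda _id, m: m["cpu"] > 0.9)
print(sorted(tsdb))  # the TSDB sees the aggregate plus the anomalous host
```

In this toy run the data lake holds all three hosts, while the TSDB receives only two entries.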

Example: Host Reduction

Grepr has customizable built-in anomaly detection for host metrics that identifies hosts behaving differently from the others. By aggregating all normal hosts together, Grepr reduces the number of hosts, and the volume of their metrics, visible to Datadog. Anomalous hosts are sent individually along with their historical data. You are billed only for the anomalous hosts and the virtual “aggregate” hosts.
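As a back-of-the-envelope illustration of that billing effect, the sketch below counts the hosts Datadog would see. The fleet size, the number of anomalous hosts, and the single aggregate host are illustrative assumptions, and `billable_hosts` is a hypothetical helper rather than anything Grepr ships.

```python
def billable_hosts(total_hosts, anomalous_hosts, aggregate_hosts=1):
    """Hosts visible to the TSDB: anomalous hosts plus virtual aggregate hosts."""
    if not 0 <= anomalous_hosts <= total_hosts:
        raise ValueError("anomalous_hosts must be between 0 and total_hosts")
    normal_hosts = total_hosts - anomalous_hosts
    # Aggregate hosts only exist if there is something normal to aggregate.
    aggregates = aggregate_hosts if normal_hosts > 0 else 0
    return anomalous_hosts + aggregates


before = 100
after = billable_hosts(total_hosts=before, anomalous_hosts=2)
print(after, f"{1 - after / before:.0%} fewer hosts")
```

The actual reduction depends on how many hosts are anomalous at any given time and on how many aggregate hosts are configured.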

Benefits for Metrics Use Cases

  • Monitoring and alerting: Instead of periodically issuing TSDB queries against metrics, Grepr uses stream processing to detect anomalies, eliminating the need for a full TSDB. At the same time, Grepr enables more powerful anomaly detection using unsupervised machine learning.
  • Troubleshooting: Grepr automatically loads relevant fine-grained data for anomalies, significantly reducing the amount of data that needs to be in hot storage all the time. Engineers don’t have to compromise on the granularity of metrics for troubleshooting.
  • Dashboarding: Aggregated data powering dashboards is sent in real time, and Grepr can be configured to pass fine-grained metrics through for dashboards that need them.
  • Optimization & capacity planning: Full-granularity metrics are available for as long as needed since they’re stored in low-cost storage.
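To illustrate the difference between querying a TSDB on a schedule and detecting anomalies on the stream, here is a minimal sketch of a streaming detector. It uses a running z-score (Welford's online algorithm) as a simple stand-in; the post does not say which unsupervised method Grepr uses, and the `StreamingZScore` class, its threshold, and its warm-up length are assumptions made for this example.

```python
import math


class StreamingZScore:
    """Flag values far from the running mean, one data point at a time."""

    def __init__(self, threshold=3.0, warmup=5):
        self.threshold = threshold  # z-score above which a value is anomalous
        self.warmup = warmup        # points to observe before flagging anything
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0               # running sum of squared deviations

    def observe(self, value):
        """Return True if value is anomalous relative to the values seen so far."""
        anomalous = False
        if self.n >= self.warmup:
            std = math.sqrt(self.m2 / (self.n - 1))
            if std > 0 and abs(value - self.mean) / std > self.threshold:
                anomalous = True
        # Welford's update: no history is stored, only three numbers.
        self.n += 1
        delta = value - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (value - self.mean)
        return anomalous


detector = StreamingZScore()
cpu_stream = [0.30, 0.32, 0.31, 0.29, 0.33, 0.31, 0.95]
flags = [detector.observe(v) for v in cpu_stream]
print(flags)
```

Because the detector keeps only a count, a mean, and a variance accumulator per series, it can decide on each point as it arrives, without the data first landing in hot storage.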

How Grepr Reduces its Impact

Grepr aims to reduce the impact on any existing workflows with a slew of capabilities:

  • Automatic configuration: Grepr can automatically allow metrics used in alerts or dashboards to pass through unmodified.
  • Default settings: Grepr comes with sensible preconfigured settings so you can get going immediately.
  • Compatible Query Languages: Grepr is multilingual! We support a Datadog-like syntax so users don’t have to learn a new language.

There are many other exciting details that complete this picture, such as clustering monitored objects by cohort and the inner workings of the built-in anomaly detection. Try Grepr free and see what it can do in 10 minutes.



FAQs

1. What are Monitored Objects in Grepr?

Monitored Objects are logical groupings of metrics by entity type, such as hosts, containers, jobs, AI models, or HTTP requests. Grepr applies built-in anomaly detection to these objects to identify abnormal behavior, then intelligently routes data based on whether the object is behaving normally or anomalously.

2. How does Grepr reduce Datadog host costs?

Grepr aggregates all normally behaving hosts into virtual "aggregate" entities, sending only summarized data to Datadog. When a host becomes anomalous, Grepr sends its individual detailed metrics along with historical context. You only pay for anomalous hosts plus the aggregate virtual hosts, rather than every host in your infrastructure.

3. Will I lose access to historical metrics data with Grepr?

No. Grepr stores full-granularity raw data in a low-cost observability-optimized data lake. When an anomaly occurs, Grepr automatically pulls relevant historical data from cold storage and sends it to Datadog for complete visibility. For capacity planning and optimization work, full metrics remain accessible in the data lake for as long as you need them.

4. Does Grepr work with my existing Datadog dashboards and alerts?

Yes. Grepr can automatically detect metrics used in your existing alerts and dashboards and allow them to pass through unmodified. Grepr also supports Datadog-compatible query syntax, so your team does not need to learn a new query language.

5. How does Grepr handle anomaly detection differently than Datadog?

Instead of running periodic queries against the TSDB, Grepr uses stream processing to detect anomalies in real time before data reaches Datadog. This approach enables more powerful unsupervised machine learning detection while eliminating the need to store all metrics in expensive hot storage just for alerting purposes.
