Goldsky Case Study

Jad Naous
June 6, 2025

Goldsky is Web3’s Realtime Data Platform, enabling developers to build powerful dApps faster with high-performance blockchain indexing, instant subgraphs, and custom data streaming pipelines. The team reached out to us six months ago for help managing log volumes and reducing costs: they felt that the value they got from collecting and storing so many logs did not justify the spend. Grepr was deployed soon after and reduced their Datadog logging bill by 96%.

Paymahn Moghadasian, Lead Engineer at Goldsky, was responsible for the deployment.

Deployment 

Goldsky uses Terraform internally to manage their infrastructure. They have separate staging and production environments. The Datadog agents in those environments are also managed via Terraform.

Paymahn created a pipeline and pointed the staging Datadog agents at Grepr within 20 minutes, and Grepr got to work. The volume was minimal, about 8 messages/second, but even at that rate Grepr achieved a reduction of about 80%.

Paymahn left Grepr running in staging for about a week to gain confidence, and then decided to roll out to production. Here the situation was different, because the priority was to minimize risk. For the prod rollout, Paymahn used Datadog’s ability to dual-ship logs, adding Grepr as a destination while continuing to send logs directly to Datadog. Here are the steps he took:

  1. Enable dual shipping in Datadog for logs (https://docs.datadoghq.com/agent/configuration/dual-shipping)
  2. For each service: add a filter in Grepr to drop all logs except those from the service being migrated.
  3. Once logs for that service are passing through Grepr correctly and being received by Datadog, add a Drop Rule to drop that service's logs that do not come from Grepr.
  4. Tune the setup, adding exceptions where needed so that existing alerts and dashboards do not have to be modified.
  5. Run for one day to validate that everything is working as expected.
  6. Repeat for the next service.
  7. Optionally update some alerts or dashboards to use summarized data instead of raw data, which increases the reduction.
  8. Run for two weeks to validate, then turn off dual shipping from the agents.

The whole process took 4 weeks, end-to-end.
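The per-service steps above can be pictured with a small toy model. This is only a sketch of the routing logic, not Grepr's or Datadog's actual API: the `Log` type, the `via_grepr` marker, and the three functions are hypothetical names introduced for illustration, and the model leaves out Grepr's log reduction entirely. It also assumes the Grepr filter is widened to cover each newly migrated service as the steps repeat.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Log:
    service: str
    message: str
    via_grepr: bool = False  # hypothetical marker: True if this copy passed through Grepr


def grepr_filter(logs, migrated):
    """Step 2 (toy): Grepr forwards only logs from services being migrated."""
    return [Log(l.service, l.message, via_grepr=True)
            for l in logs if l.service in migrated]


def drop_rule(logs, migrated):
    """Step 3 (toy): drop a migrated service's logs unless they came via Grepr."""
    return [l for l in logs if l.via_grepr or l.service not in migrated]


def dual_ship(logs, migrated):
    """Each log is sent both directly and through Grepr, then the drop rule applies."""
    direct = list(logs)
    through_grepr = grepr_filter(logs, migrated)
    return drop_rule(direct + through_grepr, migrated)


# Illustrative service names, not Goldsky's real services.
logs = [Log("api", "request ok"), Log("api", "request failed"), Log("indexer", "block synced")]
received = dual_ship(logs, migrated={"api"})
for log in received:
    print(log.service, "via Grepr" if log.via_grepr else "direct", "-", log.message)
```

In this model every log arrives exactly once: migrated services arrive only through Grepr, and services not yet migrated keep arriving directly, which is what makes a service-by-service rollout low risk.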

Cost savings

For May 2025:

  • Indexed Logs: 5.7 billion messages reduced to 250 million messages = ~96% reduction
  • Ingested Logs: 12 Terabytes of logs reduced to 795 Gigabytes = ~93% reduction
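As a quick check of the arithmetic behind those two figures, here is a minimal sketch. The helper name `reduction_pct` is ours, and the ingested calculation assumes decimal units (1 TB = 1,000 GB).

```python
def reduction_pct(before, after):
    """Percentage reduction from `before` to `after`."""
    return (1 - after / before) * 100


# Indexed logs: 5.7 billion messages down to 250 million messages.
indexed = reduction_pct(5.7e9, 250e6)

# Ingested logs: 12 TB down to 795 GB (assuming 1 TB = 1,000 GB).
ingested = reduction_pct(12_000, 795)

print(f"Indexed reduction:  {indexed:.1f}%")   # 95.6%
print(f"Ingested reduction: {ingested:.1f}%")  # 93.4%
```

Rounded to whole percentages, these match the ~96% and ~93% quoted above.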

Dollar savings on Goldsky’s Datadog bill were commensurate. Even after accounting for Grepr’s costs, Goldsky saved over 85% of their logging spend on Datadog.

Impact on MTTR

Two words from Paymahn: “no impact”. In fact, the team reported that the reduction in noise made the logs easier to read through and understand than before Grepr.

Other benefits

  • Get time back: by tackling log costs quickly and decisively with Grepr, Goldsky was able to get back time for building their product.
  • Historical searches with no rehydrations: they were able to search through logs over multiple months without having to rehydrate them and pay an additional cost.
  • Readable logs: when noise is filtered out, it’s easier to read the logs.

In Paymahn’s words:

  1. Grepr’s immediate, high-touch, white-glove support was excellent. We always felt taken care of.
  2. The Grepr UI was good for our needs. Definitely not competitive with Datadog’s UI, but that’s not the point.
  3. Grepr was always up and available for us.
  4. Logs arrived at Datadog with some minimal added latency, but not enough to actually matter in any meaningful sense.

Paymahn’s final comment: “Grepr allowed us to not substantially change any of our established observability use cases or processes by essentially getting rid of the noise in the data. It's the best of both worlds, lower costs without any retraining!”
