Vector is an open source, high-performance observability data pipeline (sponsored by Datadog) that collects, transforms, and routes data from multiple sources to multiple sinks. It is written in Rust and compiles to a single binary, making it easy to install. It is configured with YAML, TOML, or JSON files, with additional processing available through the Vector Remap Language (VRL) and/or Lua.
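To give a feel for the configuration style, here is a minimal Vector pipeline in TOML. The source path and component names (`app_logs`, `console_out`) are illustrative, not from any particular deployment:

```toml
# Read log lines from files and print them to stdout as JSON.
# Component names and the include path are hypothetical examples.
[sources.app_logs]
type = "file"
include = ["/var/log/app/*.log"]

[sinks.console_out]
type = "console"
inputs = ["app_logs"]   # wire the sink to the source by name
encoding.codec = "json"
```

Each component is a named block; sinks (and transforms) reference upstream components through their `inputs` list.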
Grepr is an intelligent observability pipeline that orchestrates data between infrastructure, applications, and observability vendors. Grepr deploys between your agents and your observability platform, automatically transforming, aggregating, analyzing, and routing your observability data. Our customers use Grepr to slash observability costs by 90%, store data long-term, and unlock observability data for business reporting and AI. Grepr is typically provided as SaaS but may also be deployed on your own kit, physical or virtual.
Automatic vs Manual
At a superficial level the two products appear to be the same: both route data between sources and sinks, transforming it along the way. However, there are significant differences in how each is configured and the effort required.

Vector is a Swiss Army knife with a multitude of sources and sinks, together with a wide range of transforms, including support for two scripting languages. You could say it's the tool that has it all. However, as a result of this profusion of options, configuration is non-trivial and requires considerable skill, together with learning yet another domain-specific language (VRL). Production deployment also needs some thought, as it requires multiple installs behind highly available load balancers, which tends to limit protocols to HTTP only. The Splunk S2S protocol is not HTTP-based and is not currently supported by Vector. Although AWS S3 is supported as a sink, the data is written as batched, compressed files, so the contents cannot be queried.
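As a taste of the VRL learning curve, here is a sketch of a `remap` transform. It assumes an upstream source named `app_logs` (a hypothetical name) whose events carry a JSON payload in `.message`:

```toml
# A remap transform running a small VRL program.
# "app_logs" is an assumed upstream source name.
[transforms.parse_logs]
type = "remap"
inputs = ["app_logs"]
source = '''
# Parse the raw JSON payload into structured fields;
# the ! variants abort the event on failure.
. = parse_json!(string!(.message))
# Normalize the level field to lowercase.
.severity = downcase!(.level)
'''
```

Every field access, type coercion, and failure path has to be handled explicitly in VRL, and each transform like this is written and maintained by hand.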
Grepr currently supports the usual suspects for sources and sinks, with more being added in the future. Rather than provide a multitude of transforms that have to be manually configured, it uses machine learning (AI) to analyze the semantics of the observability data and dynamically manages a collection of transforms to reduce the volume of that data by 90%. Typically there are 179,000 dynamically created transforms running against a large dataset; imagine the effort required to do that manually.

No data is dropped: all data sent to Grepr is retained in low-cost storage, typically AWS S3. Grepr writes the data in S3 as Apache Iceberg tables backed by Parquet files, which means the data can be queried. The best bit is that there is not yet another domain-specific language to learn here: the data can be queried using the Datadog, Splunk, and New Relic query languages, with others to be added in the future. The results of a query against the S3 data can optionally be submitted as a backfill job. The matching entries are sent through to the configured sinks, filling in the summary information to provide a rich data set in the tool your engineers use every day.
Buy vs Build
You burn copious days configuring and programming Vector, then deploying a highly available cluster with load balancers for it. Even after all that effort, it would still not have anywhere near Grepr's level of automation. It would also still be a static configuration that would not adapt to any changes in the data stream, such as a change in the log format used by Nginx or the deployment of a new data store.
Alternatively, you can just use Grepr. The SaaS platform is highly available and SOC 2 certified. In just 20 minutes you could have your first pipeline deployed and start saving on your observability platform costs. The dynamic nature of Grepr's AI means that any changes in the data stream are handled automatically.