gettodachoppa
Hah, funny to see this on HN, I just tried this 3 days ago when I saw it being discussed on /r/selfhosted.
As someone running a homelab who hadn't set up logging yet, it was a great find. I didn't have to learn and combine 3+ log technologies; it's just a single all-in-one monitoring server with a web UI, dashboards, log filtering/search, etc. RAM usage of the Docker container was under 100MB.
prabhatsharma
Glad you liked it. Yeah, resource usage is so low that some folks are running it in their homelab even on a Raspberry Pi.
sureglymop
How does this compare to Prometheus + Grafana + Loki?
prabhatsharma
We haven't compared resource usage against the Grafana stack. Rust is generally a lot more efficient than Go, though.
adamm255
The number of times I've started building an ELK stack in my home lab just to get pissed and trash it! Will be giving this a go.
gitowiec
Once I wanted to use Datadog with an internal project. Unfortunately I couldn't make use of the trial; it ended while I was ill at home. So I asked them to prolong the trial. Ohhh, that was a big mistake. They came at me with Zoom meetings, two people from their side, asking so many questions about everything that it made me angry. After our internal meeting we agreed to use another service, so I emailed Datadog that we would pass on their offer. And then the emails started. I was practically stalked: "let's do the meeting, tell us more about why you need 3 more weeks of trial..." and so on. That was the worst corporate behaviour I've experienced.
amacneil
Datadog sales were so annoying I banned their entire domain in google apps.
Then they started sending me emails from their personal gmail accounts…
https://www.linkedin.com/posts/adrianmacneil_sdrs-are-the-fr...
prabhatsharma
Yeah, Reddit is filled with horror stories from Datadog. One thread here - https://www.reddit.com/r/devops/comments/zz4naq/datadog_i_do...
hamandcheese
What differentiates you from the grafana stack? Grafana labs now have solutions for every pillar of observability (metrics, logs, traces), and that whole stack also touts cheap ingest (due to minimal upfront indexing and cheap blob storage) as one of its biggest selling points.
prabhatsharma
The Grafana stack is cool. It solved the observability problem for the first time in a neat and light way. Think about it, though: someone setting up their observability stack will need to set up 4 components: Grafana (dashboarding), Loki (logs), Mimir (metrics) and Tempo (traces). That is a lot of moving parts, and a lot of work to maintain and upgrade. Loki, Mimir and Tempo each have their own query languages and their own quirks - a good amount of stuff to learn. It is really hard for a small team to learn and maintain all of this; junior engineers really struggle with it, and seniors resent the time they have to spend on it. Loki also has issues around high cardinality, which is a problem for many teams.
We need something new and better.
We made sure to build a solution that is easy to set up and maintain: one single binary/container for a single-node setup, and one stack to set up/upgrade/maintain for an HA setup.
We also made sure that it is easy to use and learn. Standard SQL can be used for querying logs, metrics and traces - nothing extra to learn here. Metrics querying is supported using PromQL too. Drag and drop for creating panels and dashboards is there as well :-).
Dashboarding is supported within the same stack. No need to set up Grafana or something else separately.
OpenObserve also offers functions that the Grafana stack does not. Give it a shot - you are going to love it.
Data is stored in the open Parquet format, allowing its use outside of OpenObserve if so desired.
Hope this helps.
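The "one SQL for everything" point above can be illustrated with a tiny sketch using Python's built-in sqlite3. The schema and field names here are invented for the example; OpenObserve's actual log schema and query engine differ.

```python
import sqlite3

# Toy log table; real log stores have far richer schemas.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (ts TEXT, level TEXT, msg TEXT)")
conn.executemany(
    "INSERT INTO logs VALUES (?, ?, ?)",
    [
        ("2023-06-14T10:00:01Z", "info", "request served"),
        ("2023-06-14T10:00:02Z", "error", "upstream timeout"),
        ("2023-06-14T10:00:03Z", "error", "disk full"),
    ],
)

# Plain SQL aggregation instead of a per-backend query language (LogQL etc.).
rows = conn.execute(
    "SELECT level, COUNT(*) FROM logs GROUP BY level ORDER BY level"
).fetchall()
print(rows)  # [('error', 2), ('info', 1)]
```

The appeal is that GROUP BY, WHERE and friends transfer directly from any SQL background, rather than learning a bespoke log query language.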
vlovich123
One of the most annoying things about Grafana is that panels can only be edited via the UI, and the “changes” summary when you commit is near unusable, with a bunch of auto-generated garbage.
Having a simple language/YAML to define those panels, and an easy way to preview them (e.g. paste in your file to preview the page, paste in a panel's source to try it), in addition to editing via the GUI and being able to copy changes back into revision control, would be great.
prabhatsharma
This is good feedback. That would be true DaC (Dashboards as Code). We already have plans to implement versioning of dashboards via git, and will see how to get DaC in place.
Too
How does OpenObserve handle logs with high cardinality?
In Elasticsearch, tuning these indices is one of the bigger operational burdens. Loki solves this by asking users not to put high-cardinality data in indexed fields, and instead brute-forces through it all.
Does OpenObserve have any other approach to this?
valyala
Let's define what "high cardinality" means for logs. Grafana Loki puts log messages with the same set of labels into a single "stream". For example, all log messages with the labels {job="nginx", instance="host123"} go into a single stream. Loki stores every stream in a separate set of files and maintains an in-memory index per stream. So a big number of streams in Loki leads to "high cardinality" issues such as performance degradation, high memory usage and increased storage load. That's why Loki developers recommend avoiding labels with a big number of unique values, such as ip, user_id, trace_id, etc.
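A back-of-the-envelope way to see the problem: each unique combination of label values becomes its own stream, so the stream count grows multiplicatively with label cardinality. This is a hypothetical illustration of the math, not Loki's actual implementation.

```python
def stream_count(label_cardinalities: dict) -> int:
    """Number of distinct streams, given the number of unique values per label.

    Assumes (worst case) that every combination of label values actually occurs.
    """
    n = 1
    for cardinality in label_cardinalities.values():
        n *= cardinality
    return n

# Low-cardinality labels: manageable.
print(stream_count({"job": 10, "instance": 100}))  # 1000 streams

# Adding a user_id label with a million unique values explodes the stream count.
print(stream_count({"job": 10, "instance": 100, "user_id": 1_000_000}))
```

This is why Loki's docs steer users away from labels like user_id or trace_id: every new high-cardinality label multiplies the number of per-stream files and index entries.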
I'm unsure what "high cardinality" means in the Elasticsearch case. As far as I know, it happily accepts log messages with high-cardinality labels such as ip, user_id or trace_id.
I don't know whether OpenObserve has any issues related to high cardinality.
P.S. As a VictoriaMetrics core developer, I'm confident that the upcoming log storage and analytics solution from VictoriaMetrics - VictoriaLogs - will be free from Loki-like high cardinality issues.
prabhatsharma
Loki solves the problem of high cardinality by saying - "don't have high cardinality".
OpenObserve uses Parquet as the storage format, which is much more resilient to high cardinality. OpenObserve also has page indexes and partitions that allow it to handle this very well.
PlutoIsAPlanet
Grafana's log handling also isn't the best, especially when compared to New Relic etc.
Too
Loki itself is rock solid. The Grafana integration isn't the best, yeah - it doesn't even have tabular output. What's the point of doing structured logging if you can't do that? Many similar pieces of low-hanging fruit are missing that could easily make it so much more useful. Right now every service needs its own bespoke query, slowing down exploration.
valyala
VictoriaMetrics core developer here.
OpenObserve looks very promising at first sight! It prioritizes the same properties as VictoriaMetrics products:
- Easy setup and operation
- Low resource usage (CPU, RAM, storage)
- Fast performance
It would be interesting to compare the operational complexity, performance and resource usage of OpenObserve for metrics vs VictoriaMetrics.
P.S. VictoriaMetrics is going to present its own log storage and log analytics solution - VictoriaLogs - at the upcoming Monitorama conference in Portland [1]. It will provide a much smoother experience for ELK and Grafana Loki users, while requiring less RAM, disk space and CPU [2].
heliodor
The folks at VictoriaMetrics have a track record of producing well-documented and easy-to-use software, and they're responsive.
It's easy to overlook these aspects, but they will make all the difference to the team implementing the solution, so if you don't want to gamble, their products are a solid choice.
PS: I have no personal relation or connection with them. I am a user of VictoriaMetrics. Just want to point out things that matter but get ignored when choosing your software stack.
qxip
I'm looking forward to learning more about VictoriaLogs! You guys always come up with elegant solutions and I'm sure this will be no exception.
e12e
Sounds interesting!
Will you compare with qryn? Self-hosted sentry?
valyala
VictoriaLogs will be easier to set up and operate than qryn and self-hosted Sentry. Single-node VictoriaLogs is a single small self-contained statically linked binary with zero external dependencies. It runs with optimal settings out of the box on any hardware - from a Raspberry Pi to high-end servers with hundreds of CPU cores and terabytes of RAM.
Additionally, VictoriaLogs will provide a more user-friendly query language - LogsQL - which is easier to use than ClickHouse SQL or the query language provided by Grafana Loki.
qxip
That's inaccurate: qryn does not require SQL at all. It offers LogQL, PromQL, Tempo and other customizable APIs natively by design, without requiring the user to know/learn anything.
About the "easier" aspect, that's true: qryn is designed to be an "overlay" on top of various backends such as ClickHouse and IOx (pros and cons for each, up to the user), and to provide full granular control over the underlying data (compliance, GDPR, etc.) rather than being an all-in-one solution with its own proprietary formats.
samspenc
This looks great, I had some questions about the company itself if that's OK, I'm just curious to understand the background:
At the bottom of your GitHub project home page, you say the best way to join the project is to join a WeChat group (in Chinese text), but likely only a very small minority of us outside China use WeChat, so that may be a stumbling block if you are trying to encourage people outside Asia to contribute to the project.
Per https://openobserve.ai/about , the address at the bottom says San Francisco, California, but in the same page it says "headquartered in Bangalore, India". So where are you based out of?
Also curious what the relationship is between OpenObserve the open-source project and Zinc Labs, which is referenced in the website (but not in the GitHub project).
prabhatsharma
You can join Slack if you are outside of China, or WeChat if you are in China. Both options are available.
> headquartered in Bangalore, India

This is embarrassing; just fixed it. We are a Delaware-based company, headquartered in San Francisco. Pure copy-paste error.
Zinc Labs (Legal name) is the company behind the product OpenObserve.
lopkeny12ko
How do you typo "San Francisco" into "Bangalore"? This seems extremely shady.
Copy and paste from where? Did you even build this product or just hire cheap labor out of India to build it?
hardwaresofton
Please give the benefit of the doubt on HN.
This company created ZincSearch:
https://github.com/zincsearch/zincsearch
Prabhat is one of the core contributors/maintainers:
https://github.com/zincsearch/zincsearch/graphs/contributors
https://github.com/prabhatsharma
Also the negative insinuation of using “cheap” labor out of India to build the product is unnecessary. If you’re concerned about code quality, look at the code.
Assuming everyone working with devs in India is doing so cynically is not charitable.
I don't know why the headquarters was listed as India versus SF, but does it actually even matter?
reacharavindh
> Did you even build this product or just hire cheap labor out of India to build it?
Assuming _only_ cheap labour exists in India is just plain ignorant. It even carries a discriminatory tone towards some of the outstanding engineers and hackers who live there. Please reconsider your biases, and ask questions in good faith.
reachableceo
Plus one on this. How does that happen as a "typo"?
aliencat
It seems that the 140x lower storage cost comes from:

1. S3 (OO) vs EBS (ES): about 5x

2. No indexing: about 10x?

3. No data duplication in HA deployment (due to using S3, I assume): 3x

Is my math right? Or do you use something different for compression?
Two orders of magnitude of storage savings is pretty impressive.
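The commenter's factors multiply out to 150x, which is in the right ballpark of the claimed 140x. A quick check, keeping in mind that all three factors are the commenter's own estimates, not official numbers:

```python
# Estimated cost-saving factors from the comment above (guesses, not measured):
s3_vs_ebs = 5       # cheaper object storage (S3) vs block storage (EBS)
no_indexing = 10    # no inverted-index storage overhead
no_replication = 3  # no 3x data duplication in HA (S3 handles durability)

combined = s3_vs_ebs * no_indexing * no_replication
print(combined)  # 150 - close to the claimed 140x
```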
hamandcheese
I poked around the user guide[0] and discovered this:
> Currently OpenObserve support steams of type logs, in future streams types metrics & traces will be supported.
- Are metrics and traces queryable yet? I admit I feel a little misled; if only logs are supported for now, that should be made clearer.
- do (or will) metric and trace queries use a similar SQL syntax as log search?
Finally... is there a demo? I would love to be able to try out the product without actually putting in the effort to set it up.
prabhatsharma
Here is a quick demo - https://www.youtube.com/watch?v=fZ-ErfMdF-o . It covers the basics: logs, functions, dashboards. It does not cover traces and metrics.
prabhatsharma
I should fix that doc. Metrics and traces are supported today; logs and traces are in an advanced state.
Metrics is still in its infancy though.
prabhatsharma
Logs, metrics and traces are all queryable using SQL. Metrics can additionally use PromQL for querying.
prabhatsharma
You could also try the cloud version of OpenObserve without setting it up yourself.
prabhatsharma
Logs, Metrics and traces are all queryable.
SergeAx
Interesting product, thank you for your effort, definitely want to give it a try!
For me, though, setting up a system is not the primary pain point today. FWIW, signing up for a cloud service is not hard.
The problem starts at the ingestion point. I am writing my apps according to the 12 factors and running them in Docker containers. What I want is a companion agent that will collect logs and metrics from these apps and containers and forward them to the system. Datadog and Grafana have that; does OpenObserve?
Also, interesting that your quickstart tutorial has a step of unzipping a log file before sending it to the system. I would suggest the ability to send them in (g)zipped form, as that is a natural format for keeping logs.
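Compressing logs before shipping them is cheap to do with nothing but Python's standard library; whether a given ingestion endpoint accepts gzipped bodies is a separate question (hypothetical here):

```python
import gzip

# A few fake log lines; real log files compress even better due to repetition.
log_lines = b'level=info msg="request served" status=200\n' * 3

compressed = gzip.compress(log_lines)
restored = gzip.decompress(compressed)

assert restored == log_lines
print(len(log_lines), "->", len(compressed), "bytes")
```

Shipping the compressed bytes (e.g. with a `Content-Encoding: gzip` header, if the server supports it) saves bandwidth and matches how rotated logs are usually stored on disk anyway.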
prabhatsharma
Got your point. We think there are 3 really good agents out there that anyone can leverage to capture and send telemetry data: Vector, Fluent Bit and the OpenTelemetry Collector. Between the 3 of them they can handle pretty much anything, and if something comes up that they can't handle, it's easy enough to build plugins for Fluent Bit and the OTel Collector.
We did not want to reinvent the wheel and recommend that you use one of these.
The quickstart tutorial does not reflect what you will do in a real-life scenario.
You will want to learn how to use log forwarders like Fluent Bit and Vector; these are the ones you will generally use in real life. They read the log files and send them to OpenObserve. Here is a blog on how to capture Kubernetes logs with Fluent Bit and send them to OpenObserve - https://openobserve.ai/blog/how-to-send-kubernetes-logs-usin...
We will add more tutorials and examples soon.
Hope this helps.
edenfed
Hi SergeAx, I am building Odigos, which does exactly what you asked :) We combine OpenTelemetry and eBPF to automatically generate and deliver distributed traces, metrics and logs to Grafana, Datadog and 15+ other destinations. Check it out here: https://github.com/keyval-dev/odigos
SergeAx
Hi, great to meet another builder here on HN) Do I understand correctly that your product is like grafana_agent, but vendor-agnostic?
asnyder
Was excited about this, as I've been looking to host my own Datadog substitute for aggregating logs, alerts, searches, metrics, etc. Wanted to give it a shot, but unfortunately, unlike Grafana Loki (which is more complicated), it doesn't have clear enough documentation - or at least I couldn't find out whether it has a collection agent to easily ingest logs and metrics into the system.
With Datadog, there is an easy-to-install agent that runs, ingests log data, and can also (somewhat) handle duplication due to disruptions.
Based on the OpenObserve documentation, one essentially has to curl the logs to the web service oneself, standardize on fluentd or equivalent, or tie into one of the other providers/agents.
I'm sure it's possible, and hopefully one of the services/software they briefly mention provides for this, so I don't have to create my own agent or ensure all services have a central logger that curls.
Not every system is on CloudWatch (Linode, etc.), and in many cases I'd like different ways to ingest and manage all the logs one may produce: AWS EC2, Lambda, Linode, etc. I was really hoping for an easy server setup, plus agent, API, etc. Also, for systems you want to observe but not modify, such as 3rd-party systems, it would be nice to have an easy-to-add agent so as not to have to modify or take ownership of those processes.
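The "curl the logs yourself" route mentioned above is straightforward to sketch with the standard library. Note the URL path, port, and auth scheme below are hypothetical placeholders for illustration, not OpenObserve's documented API.

```python
import json
import urllib.request

def build_ingest_request(url: str, records: list, token: str) -> urllib.request.Request:
    """Build (but don't send) an HTTP request shipping a JSON batch of log records."""
    body = json.dumps(records).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",  # placeholder auth scheme
        },
        method="POST",
    )

req = build_ingest_request(
    "http://localhost:5080/api/default/mystream/_json",  # hypothetical endpoint
    [{"level": "info", "msg": "service started"}],
    token="CHANGE_ME",
)
print(req.get_method(), req.full_url)
```

Sending it is then one `urllib.request.urlopen(req)` call; a real agent adds batching, retries, and deduplication on top of this.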
benpacker
How does this compare to Parseable?
https://github.com/parseablehq/parseable
First guess is that the underlying storage / query layer is pretty similar (Parquet + Datafusion), but OpenObserve has more built in use cases?
As an aside, it’s awesome that Datafusion’s existence and maturity makes launching a product with scalable analytical reads 10x easier than before and cool to see so many projects integrating it
prabhatsharma
Parquet and DataFusion are awesome. Parseable is a good project; it just has logs, though, and no HA either. They will get there - I don't know anyone there, but they certainly need to, if they're not there already.
Yes, OpenObserve has a lot more use cases than Parseable, and our focus on ease of use has allowed us to build far more features and join them in an elegant way, making it easier to use than any other platform out there.
Give it a shot and let us know what you think.
We wish the Parseable team the best.
felix_te
Will definitely give this a try. I'm sick of fighting with Loki on upgrades, and a recent migration that refuses to save to S3 properly. Exactly your point about having to dive deep into so many components.
jerrygenser
Intro video is private. I could not access it on YouTube.
prabhatsharma
Thanks for pointing out. Fixed it.
Hello folks,
We are launching OpenObserve, an open-source Elasticsearch/Splunk/Datadog alternative written in Rust and Vue that is super easy to get started with and has 140x lower storage cost compared to Elasticsearch. It offers logs, metrics, traces, dashboards, alerts, and functions (run AWS Lambda-like functions during ingestion and query to enrich, redact, transform, normalize and whatever else you want to do - think redacting email IDs from logs, adding geolocation based on IP address, etc.). You can do all of this from the UI; no messing around with configuration files.
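To make the "functions" idea concrete, here is a sketch of the kind of transform an ingestion function might apply - redacting email addresses from a log line. The regex and function name are illustrative simplifications, not OpenObserve's actual function API (which runs inside the platform, not as standalone Python).

```python
import re

# Deliberately simple email pattern; production redaction needs more care.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

def redact_emails(line: str) -> str:
    """Replace anything email-shaped in a log line with a placeholder."""
    return EMAIL_RE.sub("[REDACTED]", line)

print(redact_emails("login failed for alice@example.com from 10.0.0.7"))
# login failed for [REDACTED] from 10.0.0.7
```

Running this at ingestion time means the sensitive value never lands on disk, which is usually stronger than redacting at query time.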
OpenObserve can use local disk for storage in single-node mode, or S3/GCS/MinIO/Azure Blob or any S3-compatible store in HA mode.
We found that setting up observability often involved setting up 4 different tools (Grafana for dashboarding, Elasticsearch/Loki/etc. for logs, Jaeger for tracing, Thanos/Cortex/etc. for metrics), and it's not simple to do.
Here is a blog on why we built OpenObserve - https://openobserve.ai/blog/launching-openobserve.
We are in early days and would love to get feedback and suggestions.