Dragonflydb – A modern replacement for Redis and Memcached

romange

Guys, I am the author of the project. Would love to answer any questions you have. Meanwhile I will try to do it by replying to comments below.

baobob

Benchmarks done on only one metric are often misleading (and in commercial circumstances usually intentionally!). Would love to see visually what trade offs Dragonfly is making to achieve the numbers from the chart at the top of the README. If excellent technical work means there really is no trade off, that would also be a great reason to chart it visually.

Also as a Redis replacement, it's not clear what durability is offered, and for most Redis use cases this is close to the first question

romange

How would you suggest demonstrating visually that there are no trade-offs?

The tradeoff the way I see it - one needs to implement 200 Redis commands from scratch. Besides, I think DF has a marginally higher 50th percentile latency. Say, if Redis has 0.3ms for 50th percentile, DF can have 0.4ms because it uses message passing for inter-thread communication. 99th percentiles are better in DF for the same throughput because DF uses more cpu power which reduces variance under load.

Re-durability - what durability is offered by Redis? AOF? We will provide similar durability guarantees with better performance than AOF. We already provide snapshotting that can be 30-50 faster than that of Redis.

baobob

There is usually a tradeoff between latency and throughput, although I'm not so sure this would be true for your innovation, since you've eliminated a whole chunk of fixed overhead from the system (syscall batching). However, batching in general often implies added latency

If I recall ScyllaDB has some excellent examples of demonstrating this particular tradeoff visually. A simple option would be a scatter plot where X = latency, Y = load or similar, with points coloured according to the system under test. Probably there is a better option, but this would likely be enough to sell me at least
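A minimal sketch of how the data for such a latency-versus-load plot could be collected, assuming one latency sample per request at each offered load. The `sweep` helper, the system names, and the two `fake_system_*` latency models are hypothetical placeholders so the code runs; they are not measurements of Redis, Dragonfly, or any real system.

```python
import random
import statistics

def latency_percentiles(samples_ms):
    """Return (p50, p99) for a list of latency samples in milliseconds."""
    cuts = statistics.quantiles(samples_ms, n=100)  # 99 cut points
    return cuts[49], cuts[98]

def sweep(systems, loads, samples_per_point=1000, seed=0):
    """Return (system, load, p50, p99) rows, one per scatter-plot point."""
    rng = random.Random(seed)
    rows = []
    for name, measure in systems.items():
        for load in loads:
            samples = [measure(load, rng) for _ in range(samples_per_point)]
            p50, p99 = latency_percentiles(samples)
            rows.append((name, load, p50, p99))
    return rows

# Placeholder latency models (made up): mean latency grows with offered load.
def fake_system_a(load, rng):
    return rng.expovariate(1.0 / (0.5 + load / 1_000_000))

def fake_system_b(load, rng):
    return rng.expovariate(1.0 / (0.6 + load / 4_000_000))

rows = sweep({"system_a": fake_system_a, "system_b": fake_system_b},
             loads=[100_000, 500_000, 1_000_000])
for name, load, p50, p99 in rows:
    print(f"{name:9s} load={load:>9,d} ops/s  p50={p50:.2f} ms  p99={p99:.2f} ms")
```

In a real benchmark, `measure` would issue a request against the system under test and return its observed latency; each row then becomes one point, coloured by system.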

pcthrowaway

Does Dragonfly have a timeseries extension? Or does it support extensions?

thayne

Presumably, most of the performance benefit comes from using Linux io_uring, which limits it to recent Linux kernels.

Of course, it is also possible there are situations where it doesn't perform as well.

YarickR2

Tradeoff is inability to run on anything besides recent Linux kernels

miohtama

Does anyone run production Redis/memcached outside Linux in any case?

tyingq

I suppose because of io_uring?

It does sound like extendible hashing might have downsides in some scenarios also.

Symmetry

Sad that you can't run dragonflydb on DragonFlyBSD.

ignoramous

Thanks for your work.

Curious: Why BSL? Why not open core [0] (or xGPLv3) like what most other commercial OSS projects seem to be doing?

[0] https://archive.is/nT9DF#selection-410.0-410.1

sideeffffect

The Business Source License (BSL) is endorsed by neither the Free Software Foundation nor the OSI

https://www.gnu.org/licenses/license-list.en.html

https://opensource.org/licenses/alphabetical

That's a big red flag.

spiffytech

I'll be interested to see where the community lands on this years from now. It's frustrating to see so many databases relicense under licenses that are almost OSS but not quite. Yet it's totally understandable - it's hard to pay database engineers when you give your product away and then AWS neuters your hosting and support business.

beanjuiceII

yep it's a big red NO for me for sure

oofbey

This looks quite cool from a technical perspective, but the unusual license definitely gives me pause. Largely because I'm not familiar with BSL. Others say it's increasingly popular, but with a little googling the acronym is still somewhat ambiguous - opensource.org lists BSL as the "Boost Software License" which looks more like BSD. This kind of confusion doesn't support the idea that this is a solid trustworthy OSS license.

Still, I really appreciate that you didn't choose a copy-left license.

On the license front, what is the "change license" clause listed? It says something about changing in 5 years. Does this mean it will become Apache licensed in 2027? Why would you put that in there?

romange

It's exactly that. It gives us a little chance to fight against the Giants.

In 5 years the initial version becomes Apache 2.0, then the next version, and so on and so forth. CockroachDB uses a similar license, as do MariaDB, Redpanda Data and others. You are right that the acronym is confusing - it's not the Boost license, it's the Business Source License. Every major technological startup turned away from BSD/Apache 2.0 licenses due to the inability to compete with cloud providers without a technological edge.

felixg3

They probably don't want to end up being replaced by an AWS/GCP/Azure service. In my opinion, the BSL is a fair license model, especially if it is limited in duration (let's say 2-3 years BSL then automatically changing to Apache/BSD).

ethbr0

A duration limit in the license, after which it becomes a permissive license, seems the critical point.

Accomplishes the goal of preventing a cloud provider from stealing customers, but also ensures customers don't get caught in an "always tomorrow" trap when the deadline comes and the company realizes it only hurts them to fully share it.

Seems to align all interests pretty nicely.

(I'm as big of an OSS supporter as anyone, but we can't pretend we still live in a time where Google / Amazon / modern-Microsoft don't exist)

Aeolun

But my entire stack is already in AWS. One service provider deciding they don’t want AWS making a service out of it just means I have to make the effort of self-hosting it, not that I’ll suddenly end up using their service (which is outside my VPC).

OPoncz

GPL is copyleft and more restrictive in some elements. BSL 1.1 is actually quite popular nowadays.

oneepic

Apologies if this was already asked -- I saw you guys already showed benchmarks in AWS itself, but would it make sense to have some benchmarks outside of AWS (or run benchmarks in a VM with Redis/DF/Keydb installed within)?

I ask this because I'm unsure if AWS Redis has any modifications on top of the Redis software itself, which would affect the speed, or even make it a bit slower. For example I know MS Azure's version of Redis restricted certain commands, and from a quick search AWS does something similar: https://docs.aws.amazon.com/AmazonElastiCache/latest/red-ug/...

(edit: added "affect the speed" for clarity)

romange

We compared DF with Redis OSS and KeyDB OSS; we have not used any managed services for the comparison. AWS was used only as a platform, for EC2 and the cloud environment.

thatwasunusual

Two things:

Where's the benchmark compared to memcached?

Several years ago there was memcachedb, which could flush stuff to disk. While this operation was expensive, it was also useful, because you could restart instances without being overwhelmed by missing keys (data).

For the latter: your application quickly grinds to a halt if you need to build your cache from ground up after some kind of crash. This is a deal-breaker for many.

cultofmetatron

I gotta say, I looked through some files. I wish my code was this clean. It's some of the prettiest C++ I've ever seen

ignoramous

When Jens Axboe praises you for your work [0], then it is a given you're at the top of your game.

[0] https://twitter.com/axboe/status/1531371740403617792

romange

Indeed :) The moment I learned about io_uring I got hooked immediately. People like Jens and Pavel are the type of innovators that make sure Linux stays on top of its game for the next generations.

pierrebai

I dunno... I took a look at a single file and found:

        static constexpr unsigned NUM_SLOTS = Policy::kSlotNum;
        static constexpr unsigned BUCKET_CNT = Policy::kBucketNum;
        static constexpr unsigned STASH_BUCKET_NUM = Policy::kStashBucketNum;

NUM_, _CNT, _NUM: three different prefixes/suffixes for what seems to me like the same concept. That just tickled my inner nit-picker.

romange

You are welcome to send a PR with the fix :)

romange

Everyone has his dirty laundry :(

make3

that's so minor

romange

omg, can I marry you?

romange

btw, it's not just pretty - it has brains too. take it for a spin.

AtNightWeCode

From my experience, Redis performance seems to be all over the place depending on the circumstances. In what cases does this solution perform well and where does it fail? Most caches I worked with really love to have everything close by at low latency and work best when the consumers have about as much memory as the cache in the first place.

ddorian43

Did you think about re-using redis open source and "just" changing storage/replication? (like yugabytedb does with postgresql).

Why not reuse seastar framework?

Can you describe your distributed log thing? Is it like facebook-logdevice or apache-bookkeeper?

romange

I honestly think it's impossible to reuse Redis OSS to make it multi-threaded besides what KeyDB did. Btw, I did reuse some of the code - but it's around single value data-structures like zset, hset etc.

With multi-threading you need to think about all things holistically. How you handle backpressure, how you do snapshotting. How you implement multi-key operations or blocking transactions. So you need special algorithms to provide atomicity, you need fibers/coroutines to be able to block your calling context yet unblock the cpu for other tasks etc. All this was designed bottom up from scratch. Seastar could work theoretically but I am not a fan of the coding style with futures and continuations - they are pretty confusing, especially in C++. My choice was using fibers - which provide a more natural way of writing code.
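To illustrate the style difference being described, here is a rough sketch of the same increment written once with continuations and once sequentially. It assumes a made-up `ToyStore` with hypothetical `get_then`/`set_then` and `get`/`set` methods, and it uses Python `asyncio` coroutines as a stand-in for fibers (they are stackless, unlike the fibers in helio); it is not Seastar or helio code.

```python
import asyncio

class ToyStore:
    """In-memory stand-in for a remote store; both APIs are hypothetical."""
    def __init__(self):
        self.data = {}

    # Continuation style: every operation takes a callback for its result.
    def get_then(self, key, on_value):
        on_value(self.data.get(key, 0))

    def set_then(self, key, value, on_done):
        self.data[key] = value
        on_done()

    # Coroutine style: the caller suspends at each await and resumes later.
    async def get(self, key):
        await asyncio.sleep(0)  # give other tasks a chance to run
        return self.data.get(key, 0)

    async def set(self, key, value):
        await asyncio.sleep(0)
        self.data[key] = value

def incr_with_continuations(store, key, on_result):
    # The logic is split across nested callbacks.
    def after_get(value):
        new_value = value + 1
        store.set_then(key, new_value, lambda: on_result(new_value))
    store.get_then(key, after_get)

async def incr_sequential(store, key):
    # Same logic, written top to bottom.
    value = await store.get(key)
    new_value = value + 1
    await store.set(key, new_value)
    return new_value

store = ToyStore()
results = []
incr_with_continuations(store, "hits", results.append)
results.append(asyncio.run(incr_sequential(store, "hits")))
print(results)  # [1, 2]
```

Both versions do the same work; the difference is only in where the control flow lives, which is the readability argument made above.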

I have not designed the distributed log thingy. Will do it in the next 2 months.

reilly3000

This looks like awesome work! I appreciate your operationalizing some of the best things to come from computer science in the past decade or so.

Out of curiosity, are you discovering any new bottlenecks to performance outside of the software, given Dragonfly is able to process far more qps than most systems? I imagine the network and disk I/O could become stressed, but also I wonder if it breaks any assumptions of cross-core performance, hypervisors, etc. I know that cloud offerings typically mean that you can attach ginormous disk IOPS and NICs, but surely there are limits.

romange

Yes, I found many different bottlenecks.

I was mostly running on AWS. In terms of hardware, for small-packet loadtests, most systems are constrained on throughput, i.e. number of packets per second. Some instances saturate on interrupts, reaching 100% CPU on all cores, and some can not even saturate the CPU: you will see that CPU is at 60% but you can not go beyond that in throughput. The best systems network-wise are c6gn family types. They are also better than instances that other clouds provide. btw, you mentioned hypervisors... About 8 months ago I opened a bug with the AWS Graviton team https://github.com/amzn/amzn-drivers/issues/195 - about a performance issue they had on their instances at high throughput. Recently they issued the fix. I suspect it was in their hypervisor.

In terms of my software I found many performance bugs at those speeds. For example, using a default allocator is a big no. I use mimalloc for uncontended allocations. In general, you can not use mutexes and spinlocks at those speeds. Those will just cripple the system. Sometimes it can be very annoying since you can not rely on a 3rd party library without carefully analyzing its design. For example, I could not use the openmetrics c++ library because it was not performant enough. Even implementing a simple counter, say to gather statistics for the INFO command, becomes an interesting engineering problem: with a shared-nothing architecture, I use a lot of thread-local counters that I aggregate only when stats are pulled.
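The thread-local counter idea can be sketched roughly as follows. This is an illustration of the pattern only, not Dragonfly's code: the `ShardedCounter` class and its method names are hypothetical, and in Python the global interpreter lock hides the contention that makes this worthwhile in C++.

```python
import threading

class ShardedCounter:
    """One counter cell per thread; cells are summed only when stats are read."""
    def __init__(self):
        self._local = threading.local()
        self._cells = []                        # every thread's cell, for aggregation
        self._register_lock = threading.Lock()  # taken once per thread, not per increment

    def _cell(self):
        cell = getattr(self._local, "cell", None)
        if cell is None:
            cell = [0]
            self._local.cell = cell
            with self._register_lock:
                self._cells.append(cell)
        return cell

    def incr(self, n=1):
        self._cell()[0] += n  # touches only the calling thread's cell

    def total(self):
        # Aggregation happens on the (rare) read path.
        with self._register_lock:
            return sum(cell[0] for cell in self._cells)

def worker(counter, iterations):
    for _ in range(iterations):
        counter.incr()

commands_processed = ShardedCounter()
threads = [threading.Thread(target=worker, args=(commands_processed, 10_000))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(commands_processed.total())  # 40000
```

The hot path never shares a memory location between threads; the cost moves to `total()`, which is called only when statistics are requested.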

As a general note, I expect that Dragonfly will stay very performant with the tailwinds from recent hardware advancements. For example, c7g (Graviton 3) is much better than c6g and DF shows it.

amluto

I'm curious: since you're targeting Graviton, and AFAICT Graviton 2 and up support ARM LSE, have you tried directly using LSE for metrics? ARM LSE offers STADD to add to a 64-bit memory location without reading the contents or ordering the access. (I think LDADD with XZR as the destination is identical, but I could be missing some subtlety. I'm not an ARM expert.) You might be able to get good performance by using a global counter and just STADDing to it. x86 will perform terribly if you use XADD because it has no corresponding optimized forms.

reflexe

Interesting. Have you considered/tried to use fdio/user space networking? In my experience, it greatly improves throughput (simple ip forwarding can be more than 10mpps per cortex a53 on some platforms). Fdio also has a .so that you can preload in order to use its ip stack in your app (instead of Linux's).

See https://s3-docs.fd.io/vpp/22.06/developer/extras/vcl_ldprelo...

tiffanyh

FoundationDB should have been included in their perf comparison. It’s ACID compliant and a distributed Value/Key store.

For SSD based storage, it’s getting 50k reads/sec PER core and scales linearly with # of cores you have in your cluster. (They achieved 8MM reads/sec with 384 cores)

https://apple.github.io/foundationdb/performance.html

atombender

FoundationDB does have impressive performance, but implementing compound operations like INCR, APPEND, etc. would require at least an additional network round trip between the client and the server.

For example, INCR would require one read followed by one write of the new value, and of course this will result in very inefficient mutation range conflicts (which must be retried for another couple of round trips) if you have frequent updates of the same keys in multiple concurrent transactions.
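A toy model of the round-trip argument, under stated assumptions: `ToyVersionedStore` is a made-up optimistic store that counts round trips, and `read`, `write_if_unchanged`, and `atomic_add` are hypothetical names, not the FoundationDB API. It only contrasts a client-side read-then-write INCR, which must retry on conflict, with a single server-side add.

```python
class ConflictError(Exception):
    pass

class ToyVersionedStore:
    """Toy optimistic store: a write commits only if the key is unchanged since it was read."""
    def __init__(self):
        self._values = {}
        self._versions = {}
        self.round_trips = 0

    def read(self, key):
        self.round_trips += 1
        return self._values.get(key, 0), self._versions.get(key, 0)

    def write_if_unchanged(self, key, value, read_version):
        self.round_trips += 1
        if self._versions.get(key, 0) != read_version:
            raise ConflictError(key)
        self._values[key] = value
        self._versions[key] = read_version + 1

    def atomic_add(self, key, delta):
        # Applied on the server side: one round trip and no conflict to retry.
        self.round_trips += 1
        self._values[key] = self._values.get(key, 0) + delta
        self._versions[key] = self._versions.get(key, 0) + 1

def incr_read_modify_write(store, key, interfere=None):
    """Client-side INCR: read, then conditional write, retrying on conflict."""
    while True:
        value, version = store.read(key)
        if interfere is not None:
            interfere()       # simulate another client committing in between
            interfere = None  # only interfere on the first attempt
        try:
            store.write_if_unchanged(key, value + 1, version)
            return value + 1
        except ConflictError:
            continue

store = ToyVersionedStore()
result = incr_read_modify_write(
    store, "hits", interfere=lambda: store.atomic_add("hits", 10))
rmw_round_trips = store.round_trips - 1  # exclude the interfering client's own trip

before = store.round_trips
store.atomic_add("hits", 1)
atomic_round_trips = store.round_trips - before
print(result, rmw_round_trips, atomic_round_trips)  # 11 4 1
```

With one concurrent writer the read-then-write path costs four round trips here against one for the server-side add, which is the cost being described for hot keys.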

spullara

FDB also supports some atomic operations that may decrease the number of roundtrips and remove conflicts:

https://apple.github.io/foundationdb/api-python.html#api-pyt...

That said, I still don't think that it is necessarily the perfect match for implementing some of the Redis data structures.

atombender

I had completely forgotten about those built-in ones. Right, so some operations can be made fully atomic server-side, but there are still a bunch of ops (append, pubsub, set commands such as SADD, etc.) that would need round trips.

sorenbs

FDB supports atomic operations that you can use to implement at least some of this very easily in a way that avoids the extra round trip and conflicts: https://apple.github.io/foundationdb/developer-guide.html#at...

richieartoul

FDB scales really well with reads, but it will bottleneck on writes a lot faster than you think. For write heavy workloads, Redis and Dragonfly would both leave FDB in the dust (and I say this as a big FDB user/enthusiast).

endisneigh

Isn’t this because FDB persists to disk? What’s the difference compared to Redis persisting to disk?

richieartoul

No there is much more to it than that. FoundationDB has a completely different architecture that is optimized for completely different things than Redis is. It’s not just about writing to disk vs. not.

Redis is basically a very performant, single-threaded (mostly) single-node in-memory datastructures system with an efficient and readable server protocol strapped to it.

FoundationDB is a completely different beast that has like 6+ distinct roles, and is optimized almost exclusively for interactive serializable transactions, range reads, and correctness.

They’re just completely different things, I recommend reading the FoundationDB paper to get a sense for its architecture. The amount of “steps” involved in processing an FDB write is much higher than in Redis.

wutwutwutwut

> scales linearly with # of cores you have in your cluster.

So if I have 1 machine and increase from 2 to 256 cores the throughput will scale linearly without the SSD ever being a bottleneck?

gigatexal

“Distributed” is the keyword here. You scale cores and storage with machines…

jimnotgym

Modern? Eeek I think of Redis as modern (2009). I'm feeling old.

mattbaker

Yeah, I’d like to see a bit more justification on what makes it “modern.” The use of io_uring maybe?

staticassertion

I think the Background section[0] is pretty helpful. One paper they cite is from 2014, another from 2020. The use of io_uring is also somewhat novel.

[0] https://github.com/dragonflydb/dragonfly#background

soheil

Redis and memcache are in memory key value storage. Writing to disk is not a primary function of those systems, it’s only used for taking snapshots or backup of the data. io_uring isn’t used as the core functionality and thus that alone wouldn’t make DF “modern”.

romange

Yes, we use io_uring for networking and for disk. io_uring provides a unified interface on Linux to poll for all I/O events. Re disk - we use it for storing snapshots. We will use it for writing WALs.

And we have more plans for using io_uring in DF in the future.

maffydub

Is it possible they use io_uring for networking?

pluc

See the "Background" section at the bottom

nemothekid

I had the same thought, but then I realized IBM DB2 was 16 years old when Redis was released, close to Redis' age now. There is a whole generation of programmers that may consider MongoDB a "legacy database".

sudarshnachakra

I like the redis protocol compatibility and the HTTP compatibility, but from the initial skim through I guess you are using abseil-cpp and the home-grown helio (https://github.com/romange/helio) library.

Could you give me a one-liner on the helio library? Is it used as a fiber wrapper around the io_uring facility in the kernel? Can it be used as a standalone library for implementing fibers in application code?

Also it seems that spinlock has become a de facto standard in the DB world today, thanks for not falling into the trap (because 90% of the users of any DB do not need spinlocks).

Another curious question would be - why not implement with seastar (since you're not speaking to disk often enough)?

romange

Yes, helio is the library that allows you to build c++ backends easily, similar to Seastar. Unlike Seastar, which is designed as a futures and continuations library, helio uses fibers, which I think are simpler to use and reason about. I wrote a few blog posts a while ago about fibers and Seastar: https://www.romange.com/2018/07/12/seastar-asynchronous-c-fr... is one of them. You will see there a typical Seastar flow with continuations. I just do not like this style and I think C++ is not a good fit for it. Having said that, I do think Seastar is a 5-star framework and the team behind it are all superstars. I learned about shared-nothing architecture from Seastar.

Re helio: You will find an examples folder inside the project with sample backends: echo_server and pingpong_server. Both are similar but the latter speaks RESP. I also implemented a toy midi-redis project https://github.com/romange/midi-redis which is also based on helio.

In fact dragonfly evolved from it. Another interesting point about Seastar - I decided to adopt io_uring as my only polling API and Seastar did not use io_uring at that time.

alecco

That blog post is a bit dated. That's the old way. Seastar supports C++20 coroutines.

http://docs.seastar.io/master/tutorial.html#coroutines

sudarshnachakra

Thanks for taking the time to reply - yes, in fact seastar does not use io_uring, but its Rust equivalent glommio does use it (IIRC it is based on io_uring). Any reasons for using C++ instead of Rust (are u more familiar with it? or does the learning curve just hinder the time to market? or is it the Rc/Arc fatigue with Rust async? I guess Rust should be a fairly easy language to pick up for good C++ programmers like you)

romange

If I would choose another language it would be Rust. Why I did not choose Rust?

1. I speak C++ fluently and learning Rust would take me years. 2. The foodchain of libraries that I am intimately familiar with in C++ and not familiar with in Rust. Take Rust's Tokio, for example. This is the de facto standard for how to build I/O backends. However if you benchmark Tokio's mini-redis with memtier_benchmark you will see it has much lower throughput than helio and much higher latency. (At least this is what I observed a year ago). Tokio is a combination of myriad design decisions that the authors of the framework had to make to serve the mainstream of use-cases. helio is opinionated. DF is opinionated. Shared-nothing architecture is not for everyone. But if you master it - it's invincible.

antirez

I'm no longer involved in Redis, but I would love it if people received clear information: from what the author of Dragonflydb is saying here, that is, memcached and Dragonflydb have similar performance, I imagine that the numbers provided in the comparison with Redis are obtained with Redis on a single core, and Dragonflydb running on all the cores of the machine. Now, it is true that Redis uses a core per instance, but it is also true that this is comparing apple-to-motorbikes. Multi-key operations are possible even with multiple instances (via key tags), so the author should compare N Redis instances (one per core), and report the numbers. Then they should say: "but our advantage is that you can run a single instance with this and that good thing". Moreover I believe it would be fair to memcached to clearly state they have the same performance.
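For readers unfamiliar with key tags: in Redis Cluster a key maps to one of 16384 hash slots via CRC16, and when the key contains a non-empty `{...}` section only that section is hashed, so keys sharing a tag land in the same slot and multi-key operations on them stay possible. A small sketch of that rule follows; the function name `hash_slot` is illustrative, and `binascii.crc_hqx` with an initial value of 0 is used as the CRC16 (XMODEM) implementation.

```python
import binascii

NUM_SLOTS = 16384

def hash_slot(key: bytes) -> int:
    """Slot for a key, following the Redis Cluster hash-tag rule."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        if end != -1 and end != start + 1:  # non-empty tag between the braces
            key = key[start + 1:end]
    # CRC16/XMODEM: polynomial 0x1021, initial value 0.
    return binascii.crc_hqx(key, 0) % NUM_SLOTS

same_slot = (hash_slot(b"{user1000}.following")
             == hash_slot(b"{user1000}.followers")
             == hash_slot(b"user1000"))
print(same_slot)  # True
```

Keys without a shared tag are spread across slots, which is why cross-slot multi-key commands need the tag to be chosen up front.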

EDIT: another quick note: copy-on-write implementations in user space, algorithmically, are cool in certain situations, but it must be checked what happens in the worst case. Because the good thing about kernel copy-on-write is that it is what it is, but it is easy to predict. Imagine an instance composed of just very large sorted sets: snapshotting starts, but there are a lot of writes, and all the sorted sets end up being duplicated in the process. When instead the sorted sets are able to remember their version because the data structure itself is versioned, you get two things: 1. more memory usage, 2. a lot more complexity in the implementation. I don't know what dragonflydb is using as algorithmic copy-on-write, but I would make sure to understand what the failure modes are with those algorithms, because it's a bit of a matter of physics: if you want to capture a snapshot at a given time T0 of a database, somehow changes must be accumulated. Either at page level or at some other level.

EDIT 2: fun fact, I didn't comment something about Redis for two years!

romange

Thanks for providing your feedback. As Redis Manifesto states - our goal is to fight against complexity. antirez - you are our inspiration and I seriously take your manifesto close to heart.

Please allow the possibility that Redis can be improved and should be improved. Otherwise other systems will eventually take its market apart.

I appreciate your comments very much. I've written about you in my blog. I am an engineer and I disagree with some of the design decisions that were made in Redis and I decided to do something about it :) to your points:

1. DF provides full compatibility with single node Redis while running on all cores, compared to Redis cluster that can not provide multi-key operations across slots.

2. Much stronger point - we provide a much simpler system since you do not need to manage k processes, you do not need to *provision* k capacities that are managed independently within each process and you do not need to monitor those processes, load/save k snapshots etc. Our snapshotting is point in time on all cores.

3. Due to pooling of resources DF is more cost efficient. It's more versatile. We have a design partner that could reduce its costs by a factor of 3 just because it could use an x2gd machine with an extra high memory configuration.

Regarding your note about memcached - while we provide performance similar to memcached, our product proposition is nothing like memcached's and is more similar to Redis. Having said that - I will add a comparison to memcached. I do believe that memcached is as performant as DF because essentially it's just an epoll loop over multiple threads.

Re your comment about snapshotting. We also push the data into the serialization sink upon write, hence we do not need to aggregate changes until the snapshot completes. The complex part is to ensure that no key is written twice, and we ensure that with versioning. I do agree that there can be extreme cases where we need to duplicate memory usage for some entries but it's only for the entries in flight - those that are being processed for serialization.

Update: re versioning and memory efficiency. We use DashTable, which is more memory efficient than Redis-Dict. In addition, DashTable has a concept of a bucket that is comprised of multiple slots (14 in our implementation). We maintain a single 64bit version per bucket and we serialize all the entries in the bucket at once. Naturally, it reduces the overhead of keeping versions. Overall, for small value workloads we are 30-40% more efficient in memory than Redis.
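A toy reading of the scheme described above, to make the "no key is written twice" point concrete. This is a sketch built only from the description in this comment (one version per bucket, a whole bucket serialized at once, and the pre-write state pushed to the sink when a write hits a bucket not yet serialized); `ToyTable` and its method names are hypothetical and this is not Dragonfly's DashTable code.

```python
import zlib

class Bucket:
    def __init__(self):
        self.version = 0
        self.entries = {}

class ToyTable:
    def __init__(self, num_buckets=4):
        self.buckets = [Bucket() for _ in range(num_buckets)]
        self.epoch = 0
        self.sink = None  # list receiving serialized entries while a snapshot runs
        self._cursor = 0

    def _bucket(self, key):
        return self.buckets[zlib.crc32(key.encode()) % len(self.buckets)]

    def _flush(self, bucket):
        """Serialize a bucket at most once per snapshot, guarded by its version."""
        if bucket.version < self.epoch:
            self.sink.extend(sorted(bucket.entries.items()))
            bucket.version = self.epoch

    def set(self, key, value):
        bucket = self._bucket(key)
        if self.sink is not None:
            self._flush(bucket)  # push the pre-write state to the sink first
        bucket.entries[key] = value

    def begin_snapshot(self):
        self.epoch += 1
        self.sink = []
        self._cursor = 0

    def snapshot_step(self):
        """Serialize the next bucket; return False once every bucket was visited."""
        if self._cursor >= len(self.buckets):
            return False
        self._flush(self.buckets[self._cursor])
        self._cursor += 1
        return True

    def end_snapshot(self):
        result, self.sink = self.sink, None
        return result

table = ToyTable()
for i in range(8):
    table.set(f"k{i}", i)
expected = {f"k{i}": i for i in range(8)}

table.begin_snapshot()
table.snapshot_step()      # the scan has covered one bucket so far
table.set("k0", 100)       # writes arriving while the snapshot is running
table.set("k7", 700)
table.set("new", 1)
while table.snapshot_step():
    pass
snapshot = table.end_snapshot()
print(dict(snapshot) == expected, len(snapshot))  # True 8
```

The snapshot reflects the state at the moment `begin_snapshot` was called, and the only extra memory is the bucket being flushed, which matches the "entries in flight" caveat above.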

antirez

Thanks for the nice words romange.

The complexity here can be seen in two ways: complexity of deploying more Redis instances, or complexity of the single instance. It's a trade off. But I think that Redis may go fully threaded sooner or later, and perhaps your project may accelerate the process (I'm no longer involved, just speculating).

1. Your point about Cluster, I addressed it many times: the point is, sooner or later even with multi-threading you are going to shard among N machines. So I believe that to have this problem ASAP is better and more "linear".

2. Already addressed in "1" and my premise.

3. Yep there are advantages in certain use cases related to cloud costs and so forth, that's why maybe Redis will end up fully threaded as well.

About memory efficiency, what I meant is that to have versioned data structures, that is an approach to do user-space copy on write even in the case of multiple changes to large single keys (big sorted set example), you need more memory likely, to augment the data structure. Otherwise the trick is to copy the whole value, that has other issues. It's a tradeoff.

phamilton

> sooner or later even with multi-threading you are going to shard among N machines

In a world where cloud providers offer instances with terabytes of memory and 128 vCPUS (e.g. aws x2iedn.32xlarge family maxes out at 4TB, gcp m2 family maxes out at 12TB) is that really inevitable? Applications serving 10s of millions of users likely won't come anywhere close to that limitation.

sanjayio

I’m in nerd heaven.

reconditerose

As one of the folks that currently works on Redis, I want to highlight the "Redis can be improved and should be improved". There are a lot of really good ideas put forth that are likely worth consideration in the Redis project as well. There have been a lot of conversations about renewing multi-threading, especially to address the point of simplifying management and better resource utilization.

Glad to see you guys made a lot of progress, although a little disappointing you chose to go down the path of building yet another source available DB and not contributing to open source.

romange

I think at this point in time and in the state of this 14-year-old project the status quo can be changed only from the outside.

If Chrome was not born you would still use Microsoft explorer with aspx sites.

tayo42

Can someone/outsider realistically show up and start working on making redis multi threaded?

kristoff_it

It seems to me you didn't address the main point from parent: did you benchmark your multithreaded implementation vs a single core Redis? Nevermind the amazing advantages that having to spawn 1 process vs N brings, the question is how does your software compare when Redis is used as intended.

romange

I benchmarked DF vs single core Redis. If there is a constructive suggestion for a different benchmark that compares similar product propositions I will happily oblige and do that. i.e. what do you mean by using Redis as intended?

reconditerose

I actually think Redis really needs to trend towards a forkless approach for replication. Although copy-on-write provides a predictable form of memory usage, it does cause rather high memory utilization and requires a good bit of over provisioning.

There was an optimization built for Redis 7 where we actually start returning copied memory pages back to the kernel, https://github.com/redis/redis/pull/8974, I wonder if the testing provided on Dragonfly includes this optimization.

romange

No, we used Redis 6 for all our tests.

reconditerose

It won't change the conclusion, but we saw a lot less copy-on-write usage for most workloads.

manigandham

Interesting project. Very similar to KeyDB [1] which also developed a multi-threaded scale-up approach to Redis. It's since been acquired by Snapchat. There's also Aerospike [2] which has developed a lot around low-latency performance.

1. https://docs.keydb.dev/

2. https://aerospike.com/

romange

True. KeyDB tackled the same problems as us. But we chose different paths. We decided to go for a complete redesign, feeling that there is a critical mass of innovation out there that can be applied to an in-memory store. KeyDB wanted to stay close to the source and be able to integrate quickly with recent developments in Redis. Both paths have their own pros and cons.

manigandham

I see from the blog posts that you looked at KeyDB and Scylla/Seastar for background. I agree with both approaches - fewer but bigger instances and shared-nothing thread-per-core architecture - and it was a major reason for switching to ScyllaDB in my previous startup.

Will definitely follow this to see how it develops. Good luck.

romange

Thanks!

Xeoncross

I want to take a minute to appreciate and recognize the https://github.com/dragonflydb/dragonfly#background section.

A lot of projects say "faster" without giving some hint of the things they did to achieve this. "A novel fork-less snapshotting algorithm", "each thread would manage its own slice of dictionary data", and "core hashtable structure" are all important information that other projects often leave out.
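As a rough picture of "each thread would manage its own slice of dictionary data", here is a small shared-nothing sketch: every shard thread owns a private dict and is reached only through a message queue, so the data itself needs no locks. The `Shard` and `ShardedStore` classes are hypothetical, Python threads and `queue.Queue` stand in for the fibers and message passing mentioned elsewhere in this thread, and none of this is Dragonfly's actual design.

```python
import queue
import threading
import zlib

class Shard(threading.Thread):
    """Owns a private dict; all access goes through its inbox."""
    def __init__(self):
        super().__init__(daemon=True)
        self.inbox = queue.Queue()
        self.data = {}  # touched only by this thread while it runs

    def run(self):
        while True:
            msg = self.inbox.get()
            if msg is None:  # shutdown sentinel
                return
            op, key, value, reply = msg
            if op == "SET":
                self.data[key] = value
                reply.put("OK")
            elif op == "GET":
                reply.put(self.data.get(key))

class ShardedStore:
    def __init__(self, num_shards=4):
        self.shards = [Shard() for _ in range(num_shards)]
        for shard in self.shards:
            shard.start()

    def _owner(self, key):
        # A key always routes to the same shard.
        return self.shards[zlib.crc32(key.encode()) % len(self.shards)]

    def _call(self, op, key, value=None):
        reply = queue.Queue(maxsize=1)
        self._owner(key).inbox.put((op, key, value, reply))
        return reply.get()  # wait for the owning thread's answer

    def set(self, key, value):
        return self._call("SET", key, value)

    def get(self, key):
        return self._call("GET", key)

    def close(self):
        for shard in self.shards:
            shard.inbox.put(None)
        for shard in self.shards:
            shard.join()

store = ShardedStore()
for i in range(100):
    store.set(f"key:{i}", i)
value = store.get("key:42")
missing = store.get("nope")
store.close()
total_keys = sum(len(shard.data) for shard in store.shards)
print(value, missing, total_keys)  # 42 None 100
```

The extra hop through the inbox is the message-passing cost mentioned earlier in the thread as the reason median latency can be slightly higher than in a single-threaded design.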

staticassertion

Nothing gets me excited for a project like a bunch of cited papers.

tiffanyh

Sounds similar to DragonflyBSD unique “virtual kernels” (lockless SMP with per core hash tables) https://www.dragonflybsd.org/

Fnoord

I thought these projects were related. Perhaps a bit of an unfortunate name clash. Author could mention they're unrelated.

romange

Thank you. And if you are curious to learn more - we would love to share! And we will.

judofyr

Wow, this looks very nice!

I’ve seen the VLL paper before and I’ve wondered how well it would work in practice (and for what use cases). Does anyone know how they handle blocked transactions across threads? Is the locking done per-thread? If so, how do you detect/resolve deadlocks?

It would also be good to see a benchmark comparing single-thread performance between DragonflyDB and Redis. How much of the performance increase is due to being able to use all threads? And how does it handle contention? In Redis it’s easy to reason about because everything is done sequentially. How does DragonflyDB handle cases where (1) 95% of the traffic is GET/SET a single key or (2) 90% of the traffic involves all shards (multi-key transaction)?

romange

These are really good questions. I invite you to try it yourself using memtier_benchmark :) if you pass `--key-maximum=0` you will get a single key contention when doing the loadtest. Spoiler alert - it's still much faster than Redis.

jitl

There are a lot of benchmarks against Redis, but where is the comparison to Memcached? Redis is quite slow for cache use-case already.

romange

Yes, I can confirm that Memcached can reach performance similar to DF's. However, one of the goals of DF was to combine the performance of Memcached with the versatility of Redis. I implemented an engine that provides atomicity guarantees for all its operations plus transparent snapshotting under write-heavy traffic, and all this without reducing performance compared to Memcached.

Having said that, DF also has a novel caching algorithm that should provide better hit rate with less memory consumption.

xiphias2

Do the benchmarks stress test those atomicity guarantees?

Get/set operations look like they don't need it.

romange

You are correct - GET/SET do not require any locking as long as they do not contend, and they do not in those benchmarks. You are right that for MSET/MGET you will see lower numbers. But they will still be much higher than with Redis.

This is our initial release and we just did not have the resources to showcase everything under different scenarios. Having said that, if you open an issue suggesting a benchmark that you would like to see, I will try to run it soon.
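As a rough illustration of why MSET/MGET would cost more than single-key GET/SET in a sharded store, the sketch below groups the keys of one command by owning shard. The shard count, the CRC32 routing, and the `plan_multi_key` helper are hypothetical and do not come from Dragonfly.

```python
import zlib
from collections import defaultdict

NUM_SHARDS = 4  # hypothetical shard (thread) count


def shard_of(key, num_shards=NUM_SHARDS):
    # Assumed routing rule: a stable hash of the key modulo shard count.
    return zlib.crc32(key.encode()) % num_shards


def plan_multi_key(keys, num_shards=NUM_SHARDS):
    """Group the keys of one command by the shard that owns each key."""
    plan = defaultdict(list)
    for key in keys:
        plan[shard_of(key, num_shards)].append(key)
    return dict(plan)


single = plan_multi_key(["counter"])
multi = plan_multi_key([f"key:{i}" for i in range(100)])

print(len(single), "shard involved in a single-key GET")
print(len(multi), "shard(s) involved in a 100-key MGET")
```

A single-key command always lands on exactly one shard, while a multi-key command may have to coordinate several shards to stay atomic, which is where the extra cost would come from.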

anitil

Do you know what tooling is available for testing atomicity guarantees?

OPoncz

I assume DF has the same performance as Memcached. It would be great if someone ran this benchmark and shared the results.

ed25519FUUU

To me the focus on speed is a wash now. They’re all fast. I’d like to hear about easy cross-region replication and failover as well as effortless snapshot and restoring of backups.

OPoncz

Actually snapshot is done in the background and does not use fork like Redis. You can see it here: https://github.com/dragonflydb/dragonfly#memory-efficiency-b...

mamcx

A nit-pick aside: I think it is dangerous to call anything a "db" if the data is not safely stored with ACID guarantees.

People don't read the docs, nor do they know the consequences of words like "eventual" or "in memory", and they start using this kind of software as a primary data store instead of as a cache or ephemeral one...

staticassertion

So Cassandra isn't a database? I'd say "thing that manages data" is a database, which is to say, a lot of things are databases.

cormacrelf

Everything is either a database or a compiler.

c0l0

Actually, everything is a routing problem.

Xeoncross

We've identified the final hacker project

mrkurt

Or a proxy.

morelisp

All of A, C, and I only make sense defined relative to a particular transaction vocabulary. Redis is perfectly ACID, as long as your transactions are those supported by Redis's commands.

Conversely, plenty of DBs with programmable transactions (e.g. SQL) are considered work-a-day "ACID" enough, despite some massive gaps in their transactional model (no DDL in transactions, no nested transactions, atomic only when below a certain size, etc.)

vorpalhex

I think that's an issue with people who don't read/comprehend the docs.

wutwutwutwut

Haha, what. If you run a database without reading the documentation then you're the dangerous part, not the ACID-compliance aspects.

For _any_ database there will be important information only available in the documentation.

mamcx

> If you run a database without reading the documentation then you're the dangerous part

I think that covers almost the whole dev population, from what I see in relation to RDBMSs. Luckily for us, most RDBMSs shield users from a lot of the mistakes in their usage.

That is why I see it as "dangerous" to call ephemeral/eventual stores a "db". Marketing/positioning has an impact...

wutwutwutwut

All databases are ephemeral if the person running them doesn't read the docs. Your comment is hence fully redundant, as opposed to the default single-node install of any DBMS.

romange

OK, you got us. We chose dragonflydb and not dragonflystore just because the former rolls off the tongue better :)

Having said that, we were careful to write everywhere in the docs that we are an in-memory store (and not a database).

Btw, I reserve full rights to provide full durability guarantees for DF and to claim the database title in the future.

vvern

dragonflycache sounds reasonable.

12thwonder

I am amazed at how small the codebase is, and it's also pretty readable. Great to see work like this, thank you!
