sverhagen
petepete
> about their test environment having the ability to automatically fork the database for every new test run, such that they could easily "revert" (in their case ignore/walk away from) the database after a test run
I think this should be fairly straightforward using one database as a template for another. This is how Postgres's CREATE DATABASE works, and it defaults to template1.
https://www.postgresql.org/docs/current/manage-ag-templatedb...
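In Postgres the template approach is a single statement, e.g. `CREATE DATABASE testdb TEMPLATE mytemplate;`, which physically copies the template's files. The same idea can be sketched in a self-contained way with Python and SQLite, where "copy the template" is a plain file copy (the database names and seed data here are made up for illustration):

```python
import os
import shutil
import sqlite3
import tempfile

# Build a "template" database once, seeded with fixture data.
tmpdir = tempfile.mkdtemp()
template = os.path.join(tmpdir, "template.db")
conn = sqlite3.connect(template)
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES ('alice')")
conn.commit()
conn.close()

def fresh_database(run_id):
    """Give each test run its own copy of the template,
    analogous to Postgres's CREATE DATABASE ... TEMPLATE."""
    copy = os.path.join(tmpdir, f"run_{run_id}.db")
    shutil.copyfile(template, copy)
    return copy

# Each run can mutate its copy freely; the template is untouched.
db = fresh_database(1)
run = sqlite3.connect(db)
run.execute("DELETE FROM users")
run.commit()
run.close()

tpl = sqlite3.connect(template)
count = tpl.execute("SELECT COUNT(*) FROM users").fetchone()[0]
tpl.close()
print(count)  # 1 -- the template still has its seed row
```

This also shows why the no-active-connections restriction matters: Postgres copies the template's files directly, so the template must be quiescent while the copy happens.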
jasfi
The intention of pg_branch is to optimize this process. They mention two problems:
1. The template database can't have active connections during the CREATE DATABASE process.
2. The CREATE DATABASE from a template database can be slow for large DBs.
ilyt
> The template database can't have active connections during the CREATE DATABASE process.
Why would that be a problem?
dalore
If you have lots of data, this can take some time.
The appeal of this project is that it uses CoW (copy-on-write), so in theory copying should be fast, since only the writes made after the copy need to be stored.
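The CoW idea can be sketched in a few lines: a "branch" shares the parent's data and records only its own writes, so creating one is O(1) no matter how big the base is. (The class and method names here are illustrative, not pg_branch's API, which operates on filesystem snapshots rather than in-memory maps.)

```python
class CowBranch:
    """A copy-on-write view: reads fall through to the shared base,
    writes land in a private overlay."""

    def __init__(self, base):
        self.base = base        # shared, never mutated through the branch
        self.overlay = {}       # this branch's writes
        self.deleted = set()    # this branch's deletions

    def get(self, key):
        if key in self.deleted:
            return None
        return self.overlay.get(key, self.base.get(key))

    def set(self, key, value):
        self.deleted.discard(key)
        self.overlay[key] = value

    def delete(self, key):
        self.overlay.pop(key, None)
        self.deleted.add(key)

# A large "database" shared by every branch without copying.
base = {i: f"row-{i}" for i in range(1_000_000)}
b1 = CowBranch(base)
b2 = CowBranch(base)
b1.set(0, "changed")
b1.delete(1)
print(b1.get(0), b2.get(0))  # changed row-0
print(b1.get(1), b2.get(1))  # None row-1
```

Each branch only pays for what it changes, which is why "fork the database per test run" becomes cheap.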
Zanfa
A common pattern in the Ruby world is running each test case in a database transaction that is rolled back afterwards. A simple way to keep a clean slate in your test database.
sverhagen
Been there, done that, but beware: this means you're missing out on some nuanced behaviors you would get if your subject under test actually committed the transaction. I moved away from this entirely. A more reliable setup is to let the test commit real transactions and clean up afterwards. You might be able to do that by calling "delete" on all repository instances in the tearDown method of your tests. Or you could use something like Flyway to reset and reinitialize the database entirely before every test. That sounds expensive, and it certainly depends on the kind of project we're talking about, but if this is a moderately sized microservice, and you're running tests scoped to just that microservice, the setup shouldn't be too crazy, and this may actually run pretty smoothly. (We run our database in a Testcontainers instance.)
vb-8448
It's also quite common in Python, and I'd guess in most other languages. It's great for unit tests, but not suitable for integration or end-to-end tests.
prpl
Salesforce does this for customers:
https://developer.salesforce.com/blogs/2022/09/sandboxes-on-...
vasco
We have something like this with gitlab "review apps". Each MR launches an env with its own db that applies migrations and fake data inserters, so each MR has its own db at current state of MR.
mj4e
For anyone wondering what "Neon-like branching" is, this is worth a read: https://neon.tech/blog/database-branching-for-postgres-with-...
nerdyadventurer
Is the branching library open source? AFAIK Neon is.
wdb
How does this differ from Postgres AI's thin cloning and database branching? https://github.com/postgres-ai/database-lab-engine
nalexpear
It's very similar! Postgres.ai is built on ZFS and is a separate application that manages the file system and its branches/"thin clones". pg_branch is an extension that hooks directly into Postgres's own commands (specifically, CREATE DATABASE) to trigger branching.
vb-8448
Postgres AI seems to be a "managed solution" while this is an extension you could use on your self-hosted instance. But, under the hood, they are very similar.
samokhvalov
Underlying DBLab Engine is FOSS (Apache 2.0), with API, CLI and UI https://github.com/postgres-ai/database-lab-engine
klysm
Yeah, but there's an incentive to obscure it, given the managed solution. I view it as high risk.
bhouston
Interesting that branching is now better supported and almost free. I wonder if merging can be simplified or whether it already is as simple and as fast as it can be?
I guess I am inspired by Dolt’s ability to branch and merge: https://github.com/dolthub/dolt
lifty
Dolt is on a whole different level. They use prolly trees as their underlying data structure, which allowed them to replicate most of Git’s features very efficiently. And I think they’re planning on building a Postgres frontend as well.
skeptrune
Holy hell, this thing is awesome. I'm incredibly surprised at how little code it took to achieve an MVP of this. Adding this to a local Docker image would make so many things easier during testing and local dev.
Having it in CI actions would also be amazing.
orangepurple
Don't use BTRFS! I still encounter crippling data loss and error messages in my dmesg log after giving it another shot for the last 10 months. I try btrfs with Arch Linux every few years because of its native transparent compression abilities. Enough is enough! Just say no. Friends don't let friends use btrfs.
bcachefs is the only hope left for normal Linux users but it has not landed yet. I am using ZFS for /home (with zstd compression) on LVM on LUKS and my root partition is ext4 on LVM on LUKS.
I considered VDO for LVM because it supports compression but VDO IS A JOKE! https://github.com/dm-vdo/vdo/issues/38
PlutoIsAPlanet
I have used btrfs on Fedora for 5 years, with transparent compression enabled and have not once had any kind of data loss.
I also use it on servers used in production in RAID 1 (on ones that don't run VMs or Databases) and again, have not had any data loss or filesystem issues.
Facebook has enough faith in btrfs to use it on their server farms. I wouldn't use bcachefs in any kind of critical deployment because it hasn't gone through the decades of testing that ext4, XFS and btrfs have.
orangepurple
You should still use ZFS instead of btrfs for your use case. It's not worth the risk. I have been running ZFS for 6 years and have not had any issues. I have had multiple issues on freshly wiped (zero filled) btrfs partitions in that time.
zoomzoom
If interested in using Neon for ephemeral environments, Coherence (I’m a cofounder) recently launched a first-class integration with Neon for just this purpose: https://docs.withcoherence.com/docs/reference/integrations#n...
Would be cool to try using this extension on top of RDS or Cloud SQL to offer the same experience without the overhead of another SaaS tool!
nalexpear
Author here: thanks for taking a look at pg_branch! I didn't expect to have this posted to HN today, but I'm happy to answer any questions folks have about it.
curt15
Doesn't btrfs suffer crippling performance penalties for DBs unless you disable its marquee features of copy-on-write and checksumming?
nalexpear
For this project, I used btrfs because it was quick to use as a proof-of-concept and I happened to be most familiar with it of the CoW-capable file systems.
But pg_branch is easy to extend to other snapshot-capable filesystems (internally it's a single trait that someone would need to implement). Happy to accept pull requests for supporting e.g. ZFS or XFS.
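Not pg_branch's actual interface, but the shape of such a pluggable backend can be illustrated with an abstract base class (the real extension defines this as a Rust trait; every name below is hypothetical):

```python
from abc import ABC, abstractmethod

class SnapshotBackend(ABC):
    """Hypothetical interface a filesystem backend would implement
    so the branching logic stays filesystem-agnostic."""

    @abstractmethod
    def snapshot(self, source_dir: str, target_dir: str) -> None:
        """Create a CoW snapshot of source_dir at target_dir
        (btrfs: `btrfs subvolume snapshot`; ZFS: `zfs clone`)."""

    @abstractmethod
    def remove(self, target_dir: str) -> None:
        """Drop a snapshot when its branch is deleted."""

class InMemoryBackend(SnapshotBackend):
    """Toy implementation for exercising the orchestration logic."""

    def __init__(self):
        self.snapshots = {}

    def snapshot(self, source_dir, target_dir):
        self.snapshots[target_dir] = source_dir

    def remove(self, target_dir):
        self.snapshots.pop(target_dir, None)

backend = InMemoryBackend()
backend.snapshot("/data/base/16384", "/data/base/16385")
print(backend.snapshots)  # {'/data/base/16385': '/data/base/16384'}
```

Supporting a new filesystem would then mean writing one small class that shells out to (or calls into) that filesystem's snapshot tooling.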
ComputerGuru
I don’t know about btrfs, but zfs (correctly tuned) is also famously CoW and simultaneously a good option for many Postgres deployments.
smilliken
PostgreSQL + ZFS is controversial. My experience was that it was inefficient on space and CPU, and especially memory. It would hang for 10+ minutes at a time, once every day or two (doing a single-threaded sweep across all RAM). And sometimes it would totally hang the system, requiring a reboot. I would describe the experience as hellish. Some of these symptoms only appear when the database is under high load, so basic testing on a staging system doesn't tell you about stability on a production system.
PostgreSQL + ext4 is uneventful, lower latency, and less wasteful of resources.
ComputerGuru
Curious. What OS? We’ve been running it in production on FreeBSD/ZFS for probably five years or so without issue, but not running at near system capacity.
vlovich123
Don’t XFS and ext4 also support CoW?
klysm
If ext4 does, I’m not aware of it. The only widely (up for debate) used file systems I know of that offer snapshotting are btrfs and zfs
ptman
reflink probably, but not filesystem snapshots, or do they?
vlovich123
Ah, I was thinking specifically about reflinks rather than full filesystem snapshots.
mrinterweb
One place I could see this being useful is unit test suites where you want different sets of seed data, but want to be sure that each test doesn't accidentally mutate state and leak it to other tests. There are other ways to do this, but this seems like a natural fit for isolating persistence changes.
mrinterweb
I see APFS (the macOS default file system) supports copy-on-write. I wonder if there are plans for pg_branch to support APFS.
jsmith99
For ad hoc copies I’ve found the third party schema cloning function useful. I assume performance would be an issue for huge databases though. https://github.com/denishpatel/pg-clone-schema
mason55
The difference is actually copying the data vs. copy-on-write. With the latter, you can make infinite copies without worrying about how much space you're using or the time it takes to make a copy, thus making it feasible to start with a fresh copy of the prod DB before you do literally anything. Have CI/CD create a fresh copy before every test. Have every feature branch start with a clean copy of prod and give people a button to do it again if they mess up a migration.
COW unlocks a ton of interesting use cases. It's easy to do on AWS Aurora but unfortunately their minimum instance size is way bigger than the regular RDS instances.
I think I saw an Uber engineer's presentation (YouTube; can't find it anymore) about their test environment having the ability to automatically fork the database for every new test run, such that they could easily "revert" (in their case, ignore/walk away from) the database after a test run. I find that an appealing idea, since regular test runs in the places where I've worked tend to create cruft data in the test environment that is lopsided and voluminous in ways that no longer properly represent the production system. (And no one wants to troubleshoot performance issues that only ever exist in the test environment.) I have to imagine that pg_branch might be part of building a solution similar to this? Could people speak to some practical applications for this? Maybe I'm thinking about entirely the wrong one here...