Show HN: Bi-directional sync between Postgres and SQLite

Hi HN,

Today we’re launching PowerSync, a Postgres<>SQLite bi-directional sync engine that enables an offline-first app architecture. It currently supports Flutter, React Native and web (JavaScript) using Wasm SQLite in the browser, with more client SDKs on the way.

Conrad and I (Ralf) have been working on our sync engine since 2009, originally as part of a full-stack app platform. That version of the system is still used in production worldwide and we’ve learnt a lot from its use cases and scaling. About a year ago we started on spinning off PowerSync as a standalone product that is designed to be stack-agnostic.

If you’d like to see a simple demo, check out the pebbles widget on the landing page here: https://www.powersync.com/

We wrote about our architecture and design philosophy here: https://www.powersync.com/blog/introducing-powersync-v1-0-po...

This covers amongst other things how we designed the system for scalable dynamic partial replication, why we use a server authority architecture based on an event log instead of CRDTs for merging changes, and the approach to consistency.

Our docs can be found here: https://docs.powersync.com/

We would love to hear your feedback! - Ralf, Conrad, Kobie, Phillip and team

Daily Digest email

Get the top HN stories in your inbox every day.

vlovich123

Since the company seems to be answering questions I’ll give a few that come up for me. Congrats on the launch btw.

In the given design, am I understanding correctly that this means that a local commit could be seen as “committed” by the user but then later the server rejects it because of conflicts right? I guess it’s all application defined in that you could build your application in such a way as to show enqueued but uncommitted changes differently from the checkpoint, but it does mean that you could lose data if the application author doesn’t handle that well in their local code or the application server yeah? Not a critique because I think that’s true even for anything handrolled, but just making sure I understand the model.

Also, I’m a little unclear how it’s tied to postgres and what the powersync service is doing. The docs say that you fetch a JWT from your application to talk to the service but then it says that all writes are mediated by your own application code. So presumably the powersync service channel is for synchronization of Postgres -> local SQLite. Is that right? And the replication logic in powersync - is that essentially accomplishing horizontal sharding of the database for reads? Also, for the replication piece is the SQLite bit actually important or could you actually support arbitrary backends and SQLite is just convenient? Eg could you support browser LocalStorage instead of WASM SQLite or is there some piece of functionality of SQLite you’re relying on?

Finally, do you have any support for lazy local hydration / eviction? Eg if I have a Google docs like application, is it synchronizing my entire account at all times or does it support pulling a document and evicting LRU documents when some local storage limit is exceeded?

matharmin

> Also, for the replication piece is the SQLite bit actually important or could you actually support arbitrary backends and SQLite is just convenient? Eg could you support browser LocalStorage instead of WASM SQLite or is there some piece of functionality of SQLite you’re relying on?

SQLite is supported practically everywhere, so that's our first choice. We also modelled our sync rules on SQLite's type system and functions to a large extent.

However, there is nothing technical tying is to SQLite. The sync protocol works purely with JSON, and doesn't impose any restrictions on how the data is stored or queried.

For the JourneyApps platform where PowerSync was originally used, we actually have an implementation on top of IndexedDB for browsers, which ends up being more lightweight. It needs more work before we can expose it as a general library, so we just started with SQLite for now.

ochiba

> In the given design, am I understanding correctly that this means that a local commit could be seen as “committed” by the user but then later the server rejects it because of conflicts right?

> it does mean that you could lose data if the application author doesn’t handle that well in their local code or the application server yeah?

Yes, this is correct and the backend developer needs to make sure that conflicts are handled appropriately (I went into a bit more detail on this in my other comment in reply to langarus which you may have seen)

> So presumably the powersync service channel is for synchronization of Postgres -> local SQLite. Is that right?

Yes, this is correct.

> And the replication logic in powersync - is that essentially accomplishing horizontal sharding of the database for reads?

Assuming I understood this question correctly — yes, the PowerSync Service handles the complexities of dynamic partial replication of the database to different users. In our announcement blog post we wrote a bit more about the trade-offs and design considerations: https://www.powersync.com/blog/introducing-powersync-v1-0-po... (see section "A scalable dynamic partial replication system")

> Also, for the replication piece is the SQLite bit actually important or could you actually support arbitrary backends and SQLite is just convenient?

We do currently use a few different features of SQLite, but something that we are considering is making the client-side more database agnostic and potentially supporting more local database options (details TBD).

> Finally, do you have any support for lazy local hydration / eviction? Eg if I have a Google docs like application, is it synchronizing my entire account at all times or does it support pulling a document and evicting LRU documents when some local storage limit is exceeded?

It is possible to accomplish some of this kind of functionality using PowerSync's Sync Rules. It should be possible to design the Sync Rules such that flags are set (e.g. based on LRU) which would trigger certain rows to be synced/un-synced.

I think my co-founder (matharmin) may also want to weigh in with more detail on some of the answers. We are based in different timezones so we may reply with more information in a few hours.

vlovich123

Yeah I figured that given the static nature of declaring the sync rules ahead of time, dynamic control over what is synchronized may be tricky. You mention using a flag column, but wouldn’t that mean that you’re generating writes for reads to update that flag/timestamp? Or can you choose to have extra local columns that aren’t aren’t part of the replication data?

ochiba

Note that dynamic control from the client-side over what is synced is currently supported to an extent via token parameters from the client, and this will potentially be expanded in the future.

> You mention using a flag column, but wouldn’t that mean that you’re generating writes for reads to update that flag/timestamp?

There may be other ways to solve this, but the solution that came to mind was:

- Set a "last accessed at" timestamp on the client when the user opens a specific item. This would sync to Postgres.

- Have a recurring task on the server-side that updates items in Postgres based on "last accessed at" and sets a flag that causes an item to be de-synced for that user once the elapsed time exceeds some threshold

Persistent local-only columns are not currently supported. Local-only tables are currently supported.

smartmic

Let me throw SymmetricDS into the ring: https://symmetricds.org/

In my opinion it is quite underestimated giving its various cross platform and cross product sync capabilities. My first experiences with it are quite promising.

matharmin

Thanks for the link, I haven't come across SymmetricDS before. The list of supported databases is impressive.

It appears to be primarily aimed at server-to-server replication. It does support Android and iOS, but I don't see much documentation on that. Do you know if it works in practice to implement offline-first applications?

srameshc

Thanks for sharing. Sometimes I feel had it not been for discussion and community at HN, I wouldn't find such stuff.

accra4rx

You stole my words! Thank you again for sharing this option. I just knew about big brands OracleGate , Qlik, IBM IIDR etc but this is interesting. I hope somebody shares the experience with the Pro version and how was the support experience.?

m9t

This looks interesting! How does this compare to ElectricSQL[1]? Next to not using CRDTs.

[1] https://electric-sql.com/

thruflo

Hey, James here from Electric. Congratulations to the PowerSync team :)

As a sibling comment says, PowerSync actually wrote up a comparison here https://www.powersync.com/blog/electricsql-vs-powersync

Aside from Electric being open source and PowerSync a proprietary service, the primary difference is in the programming model on the write path.

Electric provides finality of local writes. So once a non-malicious write is accepted locally, it is final and won’t be rejected by the server. This simplifies the programming model and means you don’t have to code for rollbacks.

PowerSync is a server authoritative system. Local writes are tentative and can be rejected at the server. You run an API on the write path and write logic to handle conflicts and rollbacks.

The different approaches come with different trade offs, both operationally and in terms of the programming model.

On the topic, if interesting, we have a list of alternative projects here: https://electric-sql.com/docs/reference/alternatives

Anticlockwise

I looked at these two systems recently and noted another crucial difference for anyone using them for a use case that included access control. ElectricSQL doesn't seem to (yet) support table joins in a way that would support standard web app access control design patterns. If you're replicating a table to a device, you're replicating the entire table, not a user-limited selection.

Supporting joins is in development, but I'm not yet clear on whether the current dev branch on it goes far enough to support access control use cases.

There's a hack in place that's supposed to help - you can define an electric_user_id on the table - but that isn't actually usable in the majority of use cases, because most ACL cases include records where multiple users can access it. I did explore using views, but electricsql doesn't currently support postgres views.

(if I'm wrong or missed something in electricsql, I'd love to be corrected, as it looks like an exciting project otherwise)

thruflo

This page lists current limitations https://electric-sql.com/docs/reference/roadmap

The key features for us on this are:

1. permissions https://electric-sql.com/docs/usage/data-modelling/permissio... which are defined using DDLX rules, authorise data access and can be used to filter data 2. shapes https://electric-sql.com/docs/usage/data-access/shapes which are the main, more expressive way to control what data syncs on and off the local device, including where clauses, joins, include trees, etc.

These are both in development and due soon. From your comment, I think you’ve seen the shapes branch with where clauses and include trees already working, for example.

In the meantime, the shapes API over syncs the full table. This is temporary and obviously suboptimal but it means you can develop today using the shape APIs and still filter data you display using local queries. Then when the proper functionality lands, the sync will become more fine grained and optimal without your app code needing to change.

Hope that makes sense. We’re very much not a full table sync system. Our role is to provide the best possible model for controlling dynamic partial replication (and to maintain integrity across replication boundaries).

matharmin

Thanks for chiming in! ElectricSQL is great - we see it come up a lot in discussions since on the surface it solves the same problems (syncing between Postgres and SQLite), despite the details being very different. I love seeing all the innovation in the offline-first space!

Fire-Dragon-DoL

Wow the amount of effort you have put in this open source project is incredible!

Congrats, I'm so looking forward an idea or project where I can use electricsql

prawnstar

Hey there's a comparison of the two here https://powersync.com/blog/electricsql-vs-powersync

alfor

I am looking for something like this. We develop IoT devices, thousand of linux computer running behind customer routers and have a central server with web admin, all in python.

I wish i could have "syncronized objects", something were a change in a device would be reflected back to the server, web interface even better and in the other direction too, a way to see the whole system as one thing, not a hodge podge of api, calls, update and so on.

Is there an elegant way to do this?

bongodongobob

It's called RMM and there are tons of products that do exactly this. Most will be geared towards MSPs, but that's likely right up your alley as you probably want to silo customers into their own sites.

jddj

Yeah, during the IoT hype there was a (supposed? I never tried it) solution to this from every cloud provider.

I think Amazon and Google both deprecated it since then, not sure how Azure's is going.

ochiba

This is an interesting use case and sounds like it could be a fit.

What stack is the software that runs on the Linux devices?

Also feel free to shoot me an email if you'd like to discuss the use case privately (email in bio) or chat on Discord https://discord.gg/powersync

alfor

on the linux devices we have a web admin (made with node-red) that control the linux machine, network, config, updates.

Then the web admin run docker container, we have a python backend on the device with fastapi as the base and another with vue.js for the frontend (touchscreen)

Then the server is fastapi/mongodb/vue.js.

We have a continuous websocket from the python program to server and another one from web dashboard to our server admin.

Lot's of moving parts, lot's of problems, not a lot of cohesion and structure.

ochiba

Thanks for the additional info. As another commenter mentioned, since you are using MongoDB, it may be worth looking into MongoDB Realm / Atlas Device Sync.

maxfurman

I'm reminded of RealmDB, which I think Mongo bought at some point

jinjin2

We have been using realm in production for a while and it seems to have steadily improved since moving to MongoDB (surprisingly for an acquisition). It is definitely best of class when it come to doing live sync between a backend and a local database.

Realm itself is amazing, fast and intuitive, and the bidirectional sync to MongoDB just works. The only downside is the messy and confusing web admin interface.

redwood

Out of curiosity, what kind of use case is yours?

kobieps

Yeah they did acquire Realm

halfcat

Turso might be a fit

https://turso.tech/

sgt

Well done on the launch! I have been using PowerSync for several months already ( although our own app hasn't quite reached production yet, due to other reasons).

Very happy with PowerSync. The docs are detailed and well thought out, and the technology is solid. One can see it builds on a decade of sync experience from the Journey team.

prawnstar

Thanks for the feedback! Along with the docs we also have an active Discord that anyone is welcome to join for any queries, requests, discussions etc https://discord.gg/powersync

chaostheory

Thank you - it’s great to have an alternative to electric-sql.

I actually need a local first, disconnected framework. What many people don’t realize is that even in the US, there are many parts without any network connectivity like our parks and preserves. Rangers still need to make queries in those areas.

amit9gupta

You can try Watermelon DB. It is local first disconnected framework. And it has a sync framework as well, but you have to create your own backend (using any DB) for syncing. Stable product and works great

https://watermelondb.dev/docs/Sync/Intro

jpbtavares

Hi everyone, I worked with watermelonDB and recently switched my entire project to powersync. Watermelon has its limitations for an offline applicationFirst, if you really want to make use of the application completely offline, you will have to build a synchronizer like CRDT by hand and manage the request queues. With powersync all of this is managed by them, and with a simple code I can choose which information from the database I will sync for each user. In my first tests with WatermelonDB, synchronization proved to be unfeasible due to the amount of synchronized data. In short, Powersync has proven to be a wonderful tool that has allowed my company to move forward with offline services.

whilenot-dev

You can find more alternatives here: https://localfirstweb.dev/

metadat

I dig it, super cool idea!

What about timestamps, though? SQLite doesn't support them.. does it end up being restricted to the lowest common denominator between the two databases in terms of supported types/functionality?

ochiba

Co-founder (Conrad) here. SQLite does indeed have a narrow set of types. Timestamps in Postgres are converted to text in SQLite in a format that is compatible with ISO8601 and SQLite's functions. Type conversions are documented here: https://docs.powersync.com/usage/sync-rules/types

zlg_codes

I see you're using ISO8601 and including timezone data, which seems best for generalizing.

Somewhat related: if you were storing UNIX timestamps that only need the resolution of a day and will always be UTC, would you use an integer for the UNIX time or just a shortened ISO8601 without H:M:S data?

I've been trying to decide on this for a feature in a tool I'm making. In either case, SQLite has the ability to convert back and forth, so it'd really be the exported JSON containing the data whose readability would be affected more than anything.

matharmin

Personally I prefer the shortened ISO8601 format (YYYY-MM-DD). While it's not the most efficient, efficiency doesn't often matter on that level. It's unambiguous, can be parsed by practically any library, and easily human-readable.

I'd avoid using unix timestamps for day-resolution, since it's very easy to make a mistake with the timezone when parsing, which you only notice when changing between a positive and negative timezone.

Julian day is another option, but support for it is not quite as universal. If efficiency is important, that's the format I'd use.

saurik

SQLite has a number of functions to work with time but the underlying storage format is just going to be--depending on what precision/range of timestamp you want--a float or an integer... to the extent, of course, that SQLite has types at all (as it frankly doesn't, thereby making this question kind of moot).

simonw

SQLite has five types, documented here: https://www.sqlite.org/datatype3.html

    null, integer, real, text, blob

It has a historically cavalier attitude to enforcing them (which I believe it inherited from TCL) but that changed in November 2021 with the release of strict table mode in version 3.37.0: https://www.sqlite.org/stricttables.html

bch

> It has a historically cavalier attitude to enforcing them (which I believe it inherited from TCL)

Richard Hipp (inventor, principal dev of SQLite) called it “manifest typing”[0], indeed likely inspired by Tcl[1], of which he is a Core Team member emeritus, and of which SQLite was initially born, as a loadable Tcl package.

[0] https://www.sqlite.org/different.html

[1] https://wiki.tcl-lang.org/page/Everything%20is%20a%20string?...

saurik

Ah! I mean, only two years old, but OK ;P. Regardless: my point stands that you can trivially store and work with timestamps in SQLite as they are simply a real or integer or text; like, the idea that you need a special timestamp data type or you can't work with such values in an engine that only two years ago got support for checking the types at all and encourages you to just write code is highly strange.

o11c

JSON is a sixth type in practice due to the way `text` acts differently depending on whether it is fed directly from one JSON function to another.

See https://sqlite.org/json1.html#value_arguments

farmeroy

This looks great! I've been planning a project and I wanted exactly this functionality and started looking into running SQlite in the browser. Very excited to try this out

davidlorean1985

I'm excited for the Capacitor SDK that's on the roadmap. Any rough ballpark estimate of when that might be available?

I haven't used Capacitor but I will soon. Is it possible to use the web SDK in a mobile Capacitor app or is there something about the webview browser environment that breaks the normal functioning of Powersync on the web?

rkistner

It may be possible to use the web SDK directly in a Capacitor app - it could be sufficient for initial development, but I wouldn't recommend that for production use. It would use the webview's IndexedDB as for persistence, which is not ideal for a native app. For the Capactitor SDK, we would use native SQLite, which would have better performance and more control over storage.

random_kris

Possible integration with drizzle orm ?

Let me define schema with drizzle and use its generated objects to interact with db. Postgres driver for server SQLite driver for client

ochiba

It is on our near-term roadmap to support ORMs for the client-side SQLite database. Our goal with ORMs is to fit in with existing popular libraries on each platform (e.g. Flutter, React Native, etc.), rather than creating a PowerSync-specific ORM over all platforms. Drizzle is one of the integrations that we're looking at, along with Prisma and others.

JCharante

Heads up:

The submission link https://powersync.com/ results in ERR_ADDRESS_UNREACHABLE

You linked the proper landing page here: https://www.powersync.com/

Your web infrastructure doesn't redirect root visitors to www (or pass those requests on to your web server)

ochiba

That's strange, we are unable to reproduce this on our end — both URLs load correctly for our team. I'll shoot an email to the address in your bio to see if there's any details you could perhaps share that could help us debug. Thanks for letting us know :)

mariopt

This seems really interesting and I do have some questions:

How similar is this to MeteorJs minimongo collections?

Does this provide an architecture where a client subscribes to a collection and the server keeps track of which data the client has and only sends minimal update messages?

Is this ideal for intense real-time applications (ex:Chat)?

Is self-hosted a future option?

matharmin

> How similar is this to MeteorJs minimongo collections?

> Does this provide an architecture where a client subscribes to a collection and the server keeps track of which data the client has and only sends minimal update messages?

It is similar in that PowerSync also supports local queries and real-time/streaming updates. However, I believe minimongo / MeteorJS uses an in-memory database, while PowerSync fully persists the data, allowing full offline capabilities.

The PowerSync client does subscribe to incremental changes, similar to MeteorJS (although the client keeps track of the state, not the server).

> Is this ideal for intense real-time applications (ex:Chat)?

Yes, PowerSync will work well for that. One caveat is that PowerSync may have slightly more latency than some other real-time systems (in the order of tens to hundreds of milliseconds), in favor of consistency and offline persistence of data. This means PowerSync will not be suitable for time-sensitive games as an example.

> Is self-hosted a future option?

Yes - see other comments for details on self-hosting and our open-source plans.

Daily Digest email

Get the top HN stories in your inbox every day.