Hacker News

kiwicopple

Every project on https://supabase.io uses PostgREST. It's an amazing tool, especially when combined with Row Level Security.

We have built client libraries for JavaScript, Rust, Python, and Dart, with a few more on the way thanks to all the community maintainers.

Supabase employs the PostgREST maintainer (Steve), who is an amazing guy. He works full-time on PostgREST, and there are a few cool things coming, including better JSONB and PostGIS support.

We recently benchmarked PostgREST for those interested: https://github.com/supabase/benchmarks/issues/2

nb: i'm a supabase cofounder

steve-chavez

Glad to be part of the Supabase team!

Clickable links to the Supabase client libraries, for those interested:

- JS: https://github.com/supabase/postgrest-js/

- Dart: https://github.com/supabase/postgrest-dart

- Rust: https://github.com/supabase/postgrest-rs

- Python: https://github.com/supabase/postgrest-py

Also, you can see how they're used together on https://pro.tzkt.io.

kiwicopple

Also C#, which is still WIP but moving fast https://github.com/supabase/postgrest-csharp (created and managed by a community contributor)

hrishi

Great work guys, love what you're building at Supabase!

Did you run any loadtests that stressed the system enough to start dropping/failing requests? I'm wondering where that threshold is.

kiwicopple

Yes, we are benchmarking all of the components we use in Supabase, including the end-to-end system here: https://github.com/supabase/benchmarks/issues

We still have a few benchmarks to complete, but PostgREST has been thoroughly tested now. Steve just spent the past month improving throughput on PostgREST, with amazing results (50% increase).

tldr: for simple GET requests on a t3a.nano you can get up to ~2200 requests/s with db-pooling.

hrishi

Thanks! It's amazing what you can get done with optimization vs horizontal scaling.

What would you say are the current failure modes? Say for a t3a.nano, what combination of payload size/queue length/rps/other parameters would absolutely mandate an upgrade in capability?

ErunamoJAZZ

I have been using Postgraphile (similar to this one but for GraphQL) for many months. The great thing about this way of creating systems is that you don't spend time writing glue code, just the real business logic. But a big pain is maintaining your database code, for example version control of your functions. There are no suitable linters, and testing can't be done on Postgres functions directly but must be done through GraphQL instead. Using things like this will save you months of development time! Even so, I agree there are some systems that would not be a good idea to implement this way.

ruslan_talpa

For the things you said are a pain (or can't be done), here is how you do them: https://github.com/subzerocloud/postgrest-starter-kit

Specifically, testing the SQL side with SQL (pgTAP) and having your SQL code in files that get autoloaded to your dev DB when saved. Migrations are autogenerated* with apgdiff and managed with sqitch. The process has its rough edges, but it makes developing with this kind of stack much easier.

* You still have to review the generated migration file and make small adjustments, but it allows you to work with many entities at a time in your schema without having to remember to reflect each change in a specific migration file.

rapind

I’ve used pgTAP. It works, but it’s not awesome if you’re coming from something like RSpec. So... I just used RSpec to test my functions instead (pgTAP lets you easily test types and signatures too, but I was more interested in unit-testing function results given specific inputs).

I’m sure you could argue either way. Just adding this as an option to consider.

awb

> Using things like this will save you months of development time!

When you need to integrate with 3rd parties though you're right back to writing traditional backend code. So now you have extra dependencies and a splintered code base.

Yes, this automates boilerplate which is awesome for small, standalone apps, but in my experience I haven't seen months of development time saved with these tools.

berkes

Yup. Postgres cannot send confirmation mails, push notifications, make calls to Stripe or anything really.

I cannot think of a situation where an API only handles CRUD data and lacks any behaviour.

But if your API really only is pushing data around, such tooling is useful and probably saves a lot of time.

michelpp

> Yup. Postgres cannot send confirmation mails, push notifications, make calls to Stripe or anything really.

You put an item in a queue table, and some out-of-band consumer claims it using SELECT FOR UPDATE ... SKIP LOCKED and sends your email.

PostgREST isn't about doing everything in the database; it's about doing all the same patterns you already do, but with less boilerplate.
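
A minimal sketch of that queue pattern, with hypothetical table and column names; the consumer polls out of band, completely outside the HTTP request:

```sql
-- Jobs land here via a normal INSERT (e.g. from a trigger or an RPC function).
CREATE TABLE job_queue (
    id         bigserial PRIMARY KEY,
    payload    jsonb NOT NULL,
    created_at timestamptz NOT NULL DEFAULT now()
);

-- The out-of-band consumer claims one job at a time. SKIP LOCKED lets
-- multiple workers poll concurrently without blocking on each other.
BEGIN;
DELETE FROM job_queue
WHERE id = (
    SELECT id FROM job_queue
    ORDER BY created_at
    LIMIT 1
    FOR UPDATE SKIP LOCKED
)
RETURNING payload;
-- ...send the email here, then COMMIT; a rollback re-queues the job...
COMMIT;
```

Doing the delete inside the worker's transaction means a crashed worker never loses a job: the lock is released and another worker picks it up.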

berkes

As awb points out: you now have a splintered codebase.

I'd add that you also have hard-to-spec couplings and a difficult-to-manage microservices setup.

Tools like MQTT or paradigms like event sourcing might help. But those all presume your database is a datastore, not the heart of your business logic.

qaq

It sure can; whether you should do it is a different question. You can use PL/Python, which ships with PG.

CuriouslyC

This isn't really true, in most cases you can write FDWs over foreign resources that let you query them like any other table, use them in joins, etc. Postgres is really more of an application platform than a database at this point. Just don't try to have the same PG instance be your master and your platform.

lumberjack24

You might want to check Forest Admin in that case. It creates an extensible admin REST API on top of your Postgres.

Comes with native CRUD routes for all your tables but you can also declare your own to refund a transaction using Stripe's API, send an email using Twilio's API...

Here's an explanatory video - https://bit.ly/ForestAdminIntro5min

michelpp

> When you need to integrate with 3rd parties though you're right back to writing traditional backend code. So now you have extra dependencies and a splintered code base.

Those third parties can still talk to the same database. We use this pattern all the time, PostgREST to serve the API and a whole bag of tools in various languages that work behind the scenes with their native postgres client tooling.

berkes

It sounds like it quickly becomes extremely difficult to change the data schema then.

Do you tightly orchestrate releases? Or do you simply never change the schema?

GordonS

Does PostgREST have functionality for authentication and authorization?

I guess you could front it with a reverse proxy if not, but would be nice to have auth built in.

dwwoelfel

You could try https://onegraph.com. It won't allow you to get rid of all your backend code, but you can definitely get further!

steve-chavez

Regarding linters, check plpgsql_check[1]. Also, as snuxoll mentioned, for tests there's pgTAP. Here's a sample project[2] that has some tests using it.

[1]: https://github.com/okbob/plpgsql_check

[2]: https://github.com/steve-chavez/socnet/blob/master/tests/ano...

ggregoire

> But a big pain is to maintain your database code, by example the version control of your functions.

The solution I came up with is to have a git repository in which each schema is represented by a directory and each table, view, function, etc. by a .sql file containing its DDL. Every time I make a change in the database, I make the change in the repository. It doesn't automate anything and it doesn't save me time in the process of modifying the database (it's actually a lot of extra work), but I think it's worth it. If I want to know when, who, why and how a table has been modified over the last 2 years, I just check the logs, commit dates, authors and messages, and display the diffs if I want to see exactly what changed.

411111111111111

Liquibase supports plain sql files (just needs one comment for author at the start) and custom folder structures, so if you actually want to take it one step further, do check it out ;)

It could not only be great for documentation purposes but also actually help maintenance by making sure that all statements are executed in all environments
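
For reference, a Liquibase "formatted SQL" changelog is just a plain .sql file whose magic comments carry the changeset metadata (the author "jane" and the table below are placeholders):

```sql
--liquibase formatted sql

--changeset jane:create-users-table
CREATE TABLE users (
    id    bigserial PRIMARY KEY,
    email text NOT NULL UNIQUE
);
--rollback DROP TABLE users;
```

Liquibase records each applied changeset per environment, so the same file can be replayed safely everywhere.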

gauravphoenix

+1 for Liquibase. I love the simplicity of it.

Twisell

On one of my oldest PostgreSQL projects, which I usually edit on the live database, I also added version control in the form of a one-liner pg_dump call with --schema-only.

This way, whenever I make a change, I call this one-liner, overwrite the previous SQL dump in the git repo, then stage everything and commit.

I also get to enjoy diffs of my schemas over time.
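
Something like this, assuming a repo already initialized and a database named mydb (names are illustrative):

```shell
# Dump only the DDL, overwrite the tracked file, and commit the diff.
pg_dump --schema-only --no-owner mydb > schema.sql
git add schema.sql
git commit -m "schema snapshot $(date +%F)"
```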

djrobstep

I don't use postgraphile myself, but as the author of a schema diff tool, I know a lot of people use schema diff tools to manage function changes automatically, and seems to work pretty well for them.

snuxoll

Lack of linters can be a pain, but testing is easily handled with pgTAP.
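
A minimal pgTAP file looks something like this (the function under test is hypothetical):

```sql
BEGIN;
SELECT plan(2);

-- Structural check: the function exists with the expected signature.
SELECT has_function('compute_total', ARRAY['integer', 'integer']);

-- Behavioural check: unit-test the result for a specific input.
SELECT is(compute_total(2, 3), 5, 'compute_total adds its arguments');

SELECT * FROM finish();
ROLLBACK;
```

Wrapping the whole run in a rolled-back transaction keeps the test database pristine.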

tobyhede

I've looked quite extensively at Postgraphile, and the extensive dependency on database functions and SQL is an issue. It's really hard to write tests, and SQL itself is not the greatest programming language. The whole setup lacks many of the affordances of modern environments.

lukeramsden

> SQL itself is not the greatest programming language

SQL has been the biggest flaw in this stack for me. I love using PostgREST/Postgraphile et al, but actually writing the SQL is just... eh. Maybe (let's hope) EdgeDB's EdgeQL or something similar could rectify this: the same Postgres core database and introspection for Postg{REST,raphile}, but with a much improved devx.

tobyhede

I was wondering if the V8 engine integration would be something to play with, but I think it just adds inefficiency and doesn't really fix some of the core problems.

rishav_sharan

Why would you need to write SQL if you are using GraphQL on Postgraphile? GraphQL queries are much simpler anyway.

w1

How is Postgraphile different from Hasura?

lukeramsden

Not enormously. The biggest difference is Hasura's console, which Postgraphile lacks. However, I don't really see that as an advantage in favour of Hasura; postgraphile + graphile-worker + graphile-migrate (all by the same author) has worked much better for me than Hasura.

rattray

One thing is that it's open source, not proprietary. You can run it on your own servers, modify it, etc, for free.

Another thing is that it's Node and can be used as a plug-in to your Express app. This can make it easier to customize and extend over time, e.g. have parts of your schema be Postgraphile and parts be custom JS.

gavinray

Hasura isn't proprietary though? It's OSS software, code lives on the GH repo:

https://github.com/hasura/graphql-engine

xdanger

The biggest difference, I think, is that Hasura doesn't use RLS for security; it has its own privileges/roles implementation. Postgraphile kinda works in PostgreSQL, Hasura works with PostgreSQL.

mildbyte

PostgREST is great and really reduces the boilerplate around building REST APIs. They recommend implementing access rules and extra business logic using PostgreSQL-native features (row level security, functions etc) but once you get your head around that, it really speeds things up!

If you're interested in playing around with a PostgREST-backed API, we run a fork of PostgREST internally at Splitgraph to generate read-only APIs for every dataset on the platform. It's OpenAPI-compatible too, so you get code and UI generators out of the box (example [0]):

    $ curl -s "https://data.splitgraph.com/splitgraph/oxcovid19/latest/-/rest/epidemiology?and=countrycode.eq.GBR,adm_area_3.eq.Oxford)&limit=1&order=date.desc"
    [{"source":"GBR_PHE","date":"2020-11-20", "country":"United Kingdom", "countrycode":"GBR", "adm_area_1":"England", "adm_area_2":"Oxfordshire", "adm_area_3":"Oxford", "tested":null, "confirmed":3079, "recovered":null, "dead":41, "hospitalised":null, "hospitalised_icu":null, "quarantined":null, "gid":["GBR.1.69.2_1"]}]
[0] https://www.splitgraph.com/splitgraph/oxcovid19/latest/-/api...

dang

(I had to add some whitespace to your JSON because it was breaking the page layout. Sorry; it's our bug. Still valid JSON though!)

chrisweekly

> "OpenAPI-compatible too, so you get code and UI generators out of the box"

That is a pretty awesome feature to be mentioning as an "oh yeah, also..."! :) Bookmarked.

uhoh-itsmaciek

Why a fork?

mildbyte

Couple reasons:

We don't actually have a massive PostgreSQL instance with all the datasets: we store them in object storage using cstore_fdw. In addition, we can have multiple versions of the same dataset. Basically, when a REST query comes in, we build a schema made out of "shim" tables powered by our FDW [1] that dynamically loads table regions from object storage and point the PostgREST instance to that schema at runtime.

When we were writing this, PostgREST didn't support working against multiple schemas (I think it does now but it still only does introspection once at startup), so we made a change to PostgREST code to treat the first part of the HTTP route as the schema and make it lazily crawl the new schema on demand.

Also, at startup, PostgREST introspects the whole database to find, besides tables and their schemas, also FK relations between tables. This is so that you can grab an entity and other entities related to it by FK with a single query [2]. In our case, we might have thousands of these "shim" tables in a database, pointing to actual datasets, so this introspection takes a lot of time (IIRC it does a giant join involving pg_class, pg_attribute and pg_constraint?). We don't support FK constraints between different Splitgraph datasets anyway, so we removed that code in our fork for now.

[1] https://www.splitgraph.com/docs/large-datasets/layered-query...

[2] https://postgrest.org/en/v7.0.0/api.html#resource-embedding

michelpp

There's always confusion that comes up whenever PostgREST is discussed on HN, which is many times at this point. The misconception arises that you use PostgREST to remove your existing stack entirely and do everything in SQL. This is not true: you're not going to send emails from SQL or run entire multi-GB ETL reductions during an HTTP request.

If you want an API call to kick off some external processing, then insert that job into a queue table and do the same thing you always did before, consume the queue out of band and run whatever process you want.

Another one that comes up is that somehow PostgREST is "insecure". Of course, if you invert the problem, you see that PostgREST is actually the most secure option, because it uses Postgres' native security system to enforce access. That access is enforced on your API, and, you know what, it's enforced on every other client of your DB as well. That's a security unification right there. That's more secure.

What PostgREST does is let you stop spending months shuttling little spoonfuls of data back and forth between your tables and your frontend. It's all boilerplate: install it in a day, get used to it, and move on to all those other interesting, possibly-out-of-band tasks that you can't get to because the API work is always such a boring lift.

Fire-Dragon-DoL

Is there an admin UI for postgrest, along the lines of ActiveAdmin?

That would be a game changer for me; a lot of software could be skipped with such a thing.

bbaumgar

There is no built in admin UI but it's about a 15 minute task to connect up to Forest Admin, which plays like a dream.

swyx

i guess supabase.io also counts? since they bundle postgrest and have an admin UI


sopooneo

I am clearly missing something, but at that point, why not just allow the client (presumably a web app running in a browser in most cases) to make requests via SQL? You would of course want to add authentication/authorization layers. You'd want record- and field-level restriction. But if the entirety of your data is in an RDBMS, why put this "REST" paradigm in the middle? Why not just admit you're querying a database, and query the database? If the only reason is access control, it seems a simpler solution must be possible.

dragonwriter

> I am clearly missing something, but at that point, why not just allow the client (presumably a web app running in a browser in most cases) to just make requests via SQL?

Because REST-over-HTTP is low impedance with browser-based web apps, whereas SQL is...not.

Plus, with REST, you abstract whether all, some, or none of your data is an RDBMS; the fact that you've implemented something with PostgREST doesn't mean everything related to it and linked from it is implemented the same way.

piaste

It's difficult to grant an untrusted client a level of raw SQL access that won't let them do harm.

For example, even without access to any specific table or function, even with rate limits, I can denial-of-service the server by asking it to make a massive recursive pure computation.

tiew9Vii

I often thought the same with the use of GraphQL. Instead of building a highly curated API with business logic, optimisations and other general stuff, you are building a direct interface around how your data looks in the DB: a 1:1 mapping and CRUD interface. In software we are taught to prefer loose coupling over tight coupling, yet here people are preferring the tight coupling.

I'm not a fan of this, as the user interface (API) is tightly coupled to how you store your data. Then, like you say, why not just speak SQL, since you have all the same issues: essentially multiple clients writing to/owning the same tables.

taffer

The PostgREST documentation makes it pretty clear that for decoupling and security reasons the data schema should be separated from the API schema. You should only provide the client with views and functions and keep all data in a separate schema. This also solves the versioning problem, because if your API schema is separate from your data, you can simply drop and re-create the entire schema for deployment without needing migrations.
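
That separation might look roughly like this (schema, table, and role names are illustrative; "web_anon" follows the convention used in the PostgREST tutorial):

```sql
-- Real data lives in a schema the API role can't see.
CREATE SCHEMA private;
CREATE TABLE private.users (
    id            bigserial PRIMARY KEY,
    email         text NOT NULL,
    password_hash text NOT NULL
);

-- The API schema exposes only a curated view of it.
CREATE SCHEMA api;
CREATE VIEW api.users AS
    SELECT id, email FROM private.users;

-- The role PostgREST switches to can touch only the api schema.
CREATE ROLE web_anon NOLOGIN;
GRANT USAGE ON SCHEMA api TO web_anon;
GRANT SELECT ON api.users TO web_anon;
```

Redeploying then means dropping and recreating the api schema; the private schema (and the data) is untouched.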

tehlike

graphql kind of does that.

SQL is very complex; T-SQL is Turing complete, meaning you can do lots of damage. You can bring servers to a halt if unchecked. It's pretty hard to restrict what can be done while keeping flexibility.

ruslan_talpa

This is the “correct” response to the question above. :)

CuriouslyC

Postgres isn't 100% set up to let fully untrusted clients do stuff. For example, right now there is no access control around listen/notify. You'd have to do some baseline query pre-processing to be completely secure, which could be a non-trivial task depending on just what you want to let clients do.

I've done it in read-only internal business apps though, it's great.

xwdv

Most web applications are basically just interfaces to a database. Why even have “REST” at all?

jaikant77

You got me thinking. GET/PUT/POST is not really needed for a completely server rendered app.

ntr--

I think we will see more of a push towards SSR technologies like LiveView for exactly this reason:

https://hexdocs.pm/phoenix_live_view/Phoenix.LiveView.html

talolard

I really love PostgREST! I did a lot of Django before and found myself constantly wrestling with the ORM to express what I wanted, then writing endless pages of serializers and viewsets, and having to debug why they were always slow.

PostgREST has a learning curve, but the performance boost vs Django is huge, and I can use more advanced DB features like triggers more easily and in a way that's native to the overall project.

otar

I've witnessed a fully functional HTTP client wrapper written in PL/SQL on Oracle. It worked very well with REST and SOAP APIs. The codebase was a mission-critical system for a large financial organization.

Development speed was an advantage, but the trade-off was that good database developer skills are still rare, and you had to grow and teach [junior] devs for years. They used to stick with the team much longer than the average developer, but I still believe it's a disadvantage.

As for PostgREST, the biggest issue I have with it is the DB server being publicly available on the net; I usually try my best to either place DB servers in a private network or "hide" them.

Other than that, it's a pleasure to develop at that low level. SQL is an important skill, and it's strange that so many devs know it only superficially.

SahAssar

> What about PostgREST, the biggest issue I have with it is a DB server being available publicly in the net, I usually try my best to either place DB servers in the private network or "hide" them.

I've heard this argument many times (and thought it myself), but when dealing with PostgREST it seems that if you have a proper JWT setup (which is how PostgREST handles AuthN) and use Postgres' security features (like row level security), perhaps it should not be thought of as a rule anymore.

IMO it seems like having the API layer only assume a role and having the DB handle AuthZ would mean better security, since you can implement more fine-grained rules that are actually verified by the part of the stack that already knows the data structure.

It's also not allowing arbitrary SQL, it's translating from HTTP to SQL, so nobody can do "SET ROLE 'admin';" unless you write a specific SQL function that does that.
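
A sketch of what such a row-level-security rule can look like, assuming a hypothetical todos table with an owner_id column and a JWT carrying a user_id claim (PostgREST v7 publishes claims as settings like request.jwt.claim.user_id):

```sql
ALTER TABLE todos ENABLE ROW LEVEL SECURITY;

-- Each authenticated user sees and modifies only their own rows;
-- the policy is enforced by Postgres itself, not by the API layer.
CREATE POLICY todos_owner ON todos
    USING (owner_id = current_setting('request.jwt.claim.user_id', true)::bigint);
```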

somurzakov

You can probably hide PostgREST behind a combination of API gateway/load balancer/WAF/IDS+IPS; that could solve most of the security headaches.

FlyingSnake

This is the correct way to go. My PostgREST is behind nginx for example.

robertlagrant

Is this APEX? The gift that keeps on giving, for Oracle.


fulafel

PostgREST is great.

Also

- written in Haskell

- a major building block for YC funded startup supabase.io (https://news.ycombinator.com/item?id=23319901)

xav0989

Surprisingly enough, crt.sh is pretty much all built in PL/SQL and PL/pgSQL.[1] I'm sure there are advantages to running it all from the database engine itself, but finding devs who can work on it must not be easy.

[1]: https://github.com/crtsh/certwatch_db

michelpp

That's surprising, SQL is one of the most well known languages in the world. plpgsql has its quirks, but as languages go, it's pretty straightforward, most things do what you expect them to do.

anthony_doan

It's not about the language but more about how the data is stored (data structures/indexes) and what you're going to do with the data.

At least during my time as a developer, I've come across many people who didn't understand this. When I asked why they wanted to use Elasticsearch over an RDBMS (was it because they wanted a trie over a B+tree?), they didn't understand. Also, their use cases were almost always relational. Postgres actually has good-enough FTS; if anything, Elasticsearch is almost always a complement to an RDBMS, not a replacement.

IggleSniggle

Honestly, folks choose Elasticsearch only in part for the trie over B+tree. It provides a fair amount of magic wrt appropriate defaults for dealing with large datasets in a distributed store.

I think a lot of folks would be better off with an RDBMS, but if you barely know SQL and spend most of your time making UIs, you're lucky to have the breadth of know-how to configure Postgres the right way (no offense intended to frontend developers).

Of course, Elasticsearch's magic defaults may come back to bite you later when you're using it OOTB this way, but it's hard to argue with throwing a ton of data in it and then -POOF- you have highly performant replicated queries, with views written in your "home" programming language, without even really necessarily understanding what your schema was to start with (yikes, but also, I get it).

ivanceras

I made something[1] similar in Rust, inspired by this project. I modified the syntax to make it more ergonomic for grouping filters with parentheses in complex and nested filters. I also added joining of tables, renaming columns, and more.

[1]: https://github.com/ivanceras/restq

SahAssar

Postgrest has joining (called resource embedding) and column renaming, right? Could you clarify what you mean?

ivanceras

Seems like a lot has changed since I looked at postgrest in its early days.


jensneuse

Postgraphile is a better alternative. Then there's also Hasura, 8Base and XGeneCloud. Did I miss anyone?

michelpp

Postgraphile and PostgREST can be used side by side, they provide different interfaces but can talk to the same database. Two for the price of one!

fulafel

That seems to be a GraphQL thing so not a direct alternative.

np_tedious

Most (or all?) of those are GraphQL rather than REST.

xgenecloud

XgeneCloud[1] supports REST as well.

https://github.com/xgenecloud/xgenecloud

masklinn

Given that, despite the project name, the tool described by TFA is pretty obviously RPC over HTTP rather than anything resembling Fielding's description, GraphQL would do just as well.

michelpp

Sorry, but I disagree: PostgREST is REST. For example, URLs map to resources and it uses HTTP verbs. GraphQL does not.

SahAssar

Better based on what?

jensneuse

Just personal taste. Without knowing the actual use case it doesn't make sense to judge. However, there's one thing that GraphQL CRUD APIs will always win over REST CRUD APIs on, and that's n+1. A smart enough GraphQL middleware can turn any query into a single SQL statement, whereas with REST it can very easily become a waterfall of hundreds or thousands of requests. But again, if you don't have nested data you might not need this feature.

steve-chavez

PostgREST has solved n+1 with resource embedding[1] since 2015 (maybe a year before PostGraphQL/Postgraphile was born), so GraphQL does not pose an inherent advantage over REST for doing this.

[1]: http://postgrest.org/en/v7.0.0/api.html#resource-embedding

SahAssar

Postgrest has resource embedding that even works over join tables, so most relations should be no problem to include in a single request.
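
For illustration, an embedded request against a hypothetical films table with a foreign key to a directors table; PostgREST resolves it server-side as a single SQL statement rather than n+1 requests:

```shell
# Each film plus its director's name, in one round trip.
curl "http://localhost:3000/films?select=title,directors(name)&limit=2"
```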
