RFC 10008: The new HTTP Query Method

100ms

Including a strong motivating example might have helped sell this; using an example that could trivially be expressed as a GET is extremely distracting.

Even imagining a QUERY with a large JSON filtering structure, or say an image input as request body, it feels extremely odd to include the request body as part of the cache key. It also implies an unbounded and user-controlled cache key, with the only really meaningful general caching strategy being bitwise compare of the request body (or a hash), which in a hostile scenario implies cache busting would be trivial.

This invokes multiple semantic oddities in one go with obvious difficulties for a very niche use case. If I'm writing a service that needs complex filtering or complex input like an image, any form of caching (e.g. individual data columns of a join, or embeddings keyed by perceptual hashes of a decoded image input) is going to be far away from the HTTP layer and certainly unrelated to the exact bit representation of the request on the wire.

Why even bother trying to capture this in a generic way?

I would be far more inclined to try and capture this caching semantic as a new header for POST. Something like "Vary: request-body" or similar. Perfectly backwards compatible and perfectly ignorable for all but the 0.1% of CDN use cases where the behaviour might turn out useful

Joker_vD

> It also implies an unbounded and user-controlled cache key,

The query part of GET's URI is also barely bounded in practice and user-controlled, and is indeed used as part of the cache key (because it's a part of URI), so I am not sure why you raise this objection at all.

giancarlostoro

> and user-controlled

I've found some sites that tack on a session ID, and if you try to tamper with the URL in any way, they send you back to "Page 1". It really annoys me lol; at that point, let me skip to any page with your web UI.

PunchyHamster

Well, because it is more code. Current caching software caches by headers + query string. It now needs to be expanded to cache by body too.

It feels very pointless, and there is no drawback to just using POST.

OvervCW

There is: your browser or other type of client does not know it can repeat a POST request if it fails, whereas a QUERY request can be freely repeated in case of errors.
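A minimal sketch of the retry difference, assuming a client that only retries methods it knows to be idempotent. The set below is the idempotent methods from RFC 9110 plus QUERY as this thread describes it; `send_with_retry` and the `send` callable are hypothetical names for illustration, not part of any real client library.

```python
from typing import Callable, Optional

# Idempotent methods per RFC 9110, plus QUERY as described in this thread.
IDEMPOTENT_METHODS = frozenset(
    {"GET", "HEAD", "OPTIONS", "TRACE", "PUT", "DELETE", "QUERY"}
)


def send_with_retry(method: str, send: Callable[[], int], max_attempts: int = 3) -> int:
    """Call send() and retry on connection errors only for idempotent methods.

    `send` is a hypothetical callable that performs the request and returns
    a status code, raising ConnectionError on a transport failure.
    """
    # A non-idempotent request (e.g. POST) gets exactly one attempt.
    attempts = max(1, max_attempts) if method.upper() in IDEMPOTENT_METHODS else 1
    last_error: Optional[ConnectionError] = None
    for _ in range(attempts):
        try:
            return send()
        except ConnectionError as exc:
            last_error = exc
    assert last_error is not None
    raise last_error
```

With this policy a dropped connection on a QUERY is retried transparently, while the same failure on a POST surfaces to the caller after the first attempt.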

afavour

Is caching not the primary reason to use this over POST? You should never want to cache POST requests.

CodesInChaos

The browser can simply store a collision resistant hash (e.g. SHA-256) of the body, if it wants a smaller cache key. I can't really think of any caching related attacks that don't equally apply to a query parameter. Generating a unique 30 character query parameter is just as easy as generating a 30 MB request body, if you want to flood the cache.
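As a rough sketch of the fixed-size key idea, assuming a cache that treats the body as opaque bytes: the function name and the choice of fields (method, URL, content type, body) are assumptions for illustration, not something the RFC prescribes.

```python
import hashlib


def query_cache_key(method: str, url: str, content_type: str, body: bytes) -> str:
    """Return a fixed-size (64 hex chars) cache key covering URL and body."""
    digest = hashlib.sha256()
    parts = (
        method.upper().encode(),
        url.encode(),
        content_type.lower().encode(),
        body,
    )
    for part in parts:
        # Length-prefix each field so different field boundaries cannot
        # produce the same byte stream.
        digest.update(len(part).to_bytes(8, "big"))
        digest.update(part)
    return digest.hexdigest()
```

The key is the same size whether the body is 30 bytes or 30 MB, so the body length only affects hashing cost, not cache index size.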

ralferoo

Not necessarily that simple, as you'd have to sort all the input parameters to maintain a usable cache key. Not especially difficult, but if the data is large and re-allocation and sorting are required, then you're starting to open up the attack surface where bugs might be introduced.

dagss

Do you have to? Is it common to treat ?a=1&b=2 the same as ?b=2&a=1 in browser/CDNs/etc?

Seems the spec puts this as a MAY. I doubt it will be implemented in generic ways, except perhaps for urlencoded payloads. After all, you cannot normalize in general without knowing the query language. At the backend it does not matter; you may as well cache one level deeper based on the parsed input, irrespective of QUERY or not.
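For the urlencoded case specifically, a normalization step could look roughly like this. This is only a sketch: it assumes parameter order across different names is insignificant to the application, which a generic cache cannot know, and `normalize_urlencoded` is a made-up name.

```python
from urllib.parse import parse_qsl, urlencode


def normalize_urlencoded(body: bytes) -> bytes:
    """Canonicalize an application/x-www-form-urlencoded body.

    Parameters are sorted by name with a stable sort, so repeated names
    keep their relative order. Percent-encoding is also made uniform.
    """
    pairs = parse_qsl(body.decode("ascii"), keep_blank_values=True)
    pairs = sorted(pairs, key=lambda pair: pair[0])
    return urlencode(pairs).encode("ascii")
```

Here `b=2&a=1` and `a=1&b=2` normalize to the same bytes, as do `q=hello%20world` and `q=hello+world`. Nothing comparable is possible for a body format the cache cannot parse.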

ygouzerh

Regarding the body used as a key for the caching: in the RFC, from my understanding, it's indicated that we can use Location as well:

Example:

```
QUERY /search HTTP/1.1
Content-Type: application/json

{ "filters": { "region": "asia", "status": "active" }, "sort": "created_at", "limit": 500 }
```

to which the server can answer

```
HTTP/1.1 303 See Other
Location: /queries/results/f3a9c1d7
```

And then you can access later `/queries/results/f3a9c1d7` using a pure GET call, and cache this instead
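A toy server-side sketch of that flow, assuming the application derives the result path from a hash of its own canonical form of the query. The in-memory store, the handler names, and the 8-character ID length are all hypothetical; the RFC does not say how the Location is chosen.

```python
import hashlib
import json
from typing import Dict, Optional, Tuple

# Hypothetical in-memory store mapping a query ID to the stored query.
STORED_QUERIES: Dict[str, dict] = {}


def handle_query(body: bytes) -> Tuple[int, Dict[str, str]]:
    """Store the query and answer 303 with a GET-able Location."""
    query = json.loads(body)
    # The application knows its own format, so it can canonicalize it.
    canonical = json.dumps(query, sort_keys=True, separators=(",", ":")).encode()
    query_id = hashlib.sha256(canonical).hexdigest()[:8]
    STORED_QUERIES[query_id] = query
    return 303, {"Location": f"/queries/results/{query_id}"}


def handle_get(path: str) -> Optional[dict]:
    """Look up the stored query behind a /queries/results/<id> path."""
    return STORED_QUERIES.get(path.rsplit("/", 1)[-1])
```

Any cache between client and server then only ever sees a plain GET on the result URL, so the request body never needs to be part of a cache key.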

inigyou

Not all usage scenarios are the public internet, and something doesn't have to be useful on the public internet to be standardized.

Realistically, systems for the public internet will use a secure hash as the cache key so it'll always be the same size. The cache key already includes a URL that can be very long, and an arbitrary set of header values.

ralferoo

Except that by definition, in a URL the data has no implicit meaning so for a cache hit you need an exact match, including order and case, but for a list of POST parameters, they could legitimately be in any order and so you can't just hash it all as a blob, you need to sort the keys, possibly copy data around (unless using keys plus hash), probably allocating more memory, etc. I'm pretty certain we'll see at least one CVE out of the first few implementations of this!

inigyou

POST/QUERY data can be in any format. Who are you to say order doesn't matter? Are you sure you can even parse it? Mine is in DES-encrypted (with key "password") base85 DER, you really gonna implement that in your proxy?

tanepiper

One example - I'm building an MCP server at the moment for a database I'm working on. In ChatGPT I want to do dry-run posts first that roll back before committing - both are POST requests with a property - and it loves to trigger the safety layer in the tools (for various reasons, it's hard to debug exact causes)

But I think this would make it better - QUERY before POST means different request types, not just the same with a safety flag.

cryptonym

Sure you can provide an image as request body, but you could already do it with b64 query parameter. If you try hard enough, you can poorly use any proposed standard. GET with query parameters already is opaque and makes cache busting trivial.

layer8

Query parameters are length-limited, because HTTP URIs are: https://www.rfc-editor.org/info/rfc9110/#section-4.1-5. There is no expectation for arbitrarily long HTTP URLs to be functioning.

cryptonym

Your link doesn't say URIs are length-limited

friendzis

> It also implies an unbounded and user-controlled cache key.

While the concern is valid, caching is entirely optional at query level, therefore it is totally valid to cache only certain "filters".

epolanski

> Why even bother trying to capture this in a generic way?

I guess it's about resolving the odd semantics of using POST, which is not idempotent, and thus allowing easier control flow for caches and retries.

Your perspective is 100% correct if you think at the application layer, but with a dedicated method, you can have that behaviour out-of-the-box from your HTTP infrastructure (whether it's your hyperscaler's router or your apache/nginx/browser/whatever) and stop implementing the post-as-a-query edge case yourself.

CodesInChaos

I wonder if HTML forms will add support for QUERY:

    <form action="..." method="query">

This would avoid the annoying re-submission warnings you're getting if you refresh a page that was returned by a POST form submission, since QUERY is required to be idempotent.

acabal

Supporting more than GET/POST in HTML forms has been my dream for decades. There's a WHATWG proposal to do just that if you want to add your voice: https://github.com/whatwg/html/pull/11347

CodesInChaos

I'm not convinced of the benefit of allowing PUT/PATCH/DELETE in forms. But QUERY sounds more useful, and its lack of side-effects (plus the lack of legacy code using it) would avoid the need for CORS preflight requests.

There is a separate issue for QUERY: https://github.com/whatwg/html/issues/12594

MBCook

What’s the use case for that?

acabal

Because if HTTP is the language of the web, then HTML forms are how humans speak that language to computers. Right now we humans can only speak GET and POST.

In other words, right now if a human wants to DELETE a widget, the human has to click on an HTML form to `POST /widgets/123/delete` - i.e. use an incorrect verb on an incorrect URL/object - or use some other workaround like smuggling a special `_method=DELETE` variable. This is unnatural and semantically incorrect, resulting in ugly hacks that break HTTP-level expectations like idempotency; and it also requires additional app-level logic to process.

Meanwhile a machine is allowed to simply `DELETE /widgets/123` because their interface to HTTP is not clicking on HTML forms.

We humans could converse with websites in semantically correct HTTP, have clean URLs in which both REST APIs and human-facing URLs are identical without hacks, and require no extra app/framework logic, if HTML forms simply allowed all (human-relevant) HTTP verbs.
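For anyone who hasn't seen the `_method` workaround, the extra app-level logic is roughly the following. This is a sketch of the general convention, not any particular framework's implementation; `effective_method` and the allow-list are illustrative names.

```python
from urllib.parse import parse_qs

# Verbs a form is allowed to smuggle through POST in this sketch.
ALLOWED_OVERRIDES = frozenset({"PUT", "PATCH", "DELETE"})


def effective_method(method: str, form_body: str) -> str:
    """Resolve the verb the app should act on for a form submission."""
    if method.upper() != "POST":
        # Only POST may carry an override; everything else is taken as-is.
        return method.upper()
    override = parse_qs(form_body).get("_method", [""])[0].upper()
    return override if override in ALLOWED_OVERRIDES else "POST"
```

Every intermediary still sees a POST, which is exactly why HTTP-level expectations like idempotency are lost with this approach.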

XYen0n

If new method support is added, PUT should be used in this scenario.

amluto

One oddity of forms: the result of a form POST is a page that has a location (the URL) but that cannot be loaded via that location. As far as I know, the fact that the page is a POST and not a GET is not stored anywhere visible to the user or to JS. And refresh works oddly.

If method=QUERY were added, there would be a new variety of this weirdness.

sheept

At least browsers wouldn't have to warn users that they'd be resubmitting data if they reload the page after submitting a query form, since query requests are intended to be idempotent

amluto

You still get the nastiness that the Sec-Fetch-* state gets mostly trashed when you hit refresh. And someone would need to figure out how CORS preflight interacts with refresh, which is not currently an issue with POST. (The current "simple request" behavior or whatever it's called is a real mess and is the cause of a lot of CSRF vulnerabilities.)

bob1029

This is better solved with the post redirect get pattern.

CodesInChaos

The redirect pattern makes sense for a POST request that creates a resource, where you can then redirect to the newly created resource.

QUERY on the other hand makes sense for cases where the request doesn't cause any state changes on the server, and there is no resource to redirect to.

diroussel

That is the good old-fashioned workaround. But why is it better than a form causing an HTTP QUERY?

If we can do QUERY forms, it would be an ideal time to add JSON encoding for forms.

chrismorgan

See https://github.com/whatwg/html/issues/12594.

paradox460

They never added support for any other verbs, but it's a brave new world, so who knows

100ms

Forms, HTTP implementations, public API surfaces, and all for what exactly. Introducing a new verb for this feels profoundly misplaced

jagged-chisel

Idempotency is an important attribute for correctness. Yep, you can document that POSTing to $ENDPOINT is idempotent, but you can't communicate that to caching layers throughout the network. QUERY, by definition, is idempotent and cacheable.

resters

Great point. I wish more people realized that intuitively.

jnewton_dev

[flagged]

alpinisme

At least support - or lack thereof - for a new verb is unambiguous (compared to changing the semantics of GET)

tempfile

Depends whether your form submission should expect side effects or not. Most form submissions have side effects. If the effect is truly idempotent, wouldn't PUT be a better verb? That is also supposed to be idempotent.

echoangle

You can’t use PUT as a form action though.

diroussel

GET and QUERY are both idempotent.

mring33621

Hold my beer!

ynac

Just in case anyone wants to pretend it's still that other century:

https://www.rfc-editor.org/rfc/rfc10008.txt

jjice

I'll forever love a long, totally plain text document like this. So many good times with video game FAQs as a kid. It really is a superior form of information in a lot of ways (not all).

riffic

I have a thesis brewing that explores how rich text WYSIWYG editors create a "what you see is all there is" cognitive bias, while plain text overcomes it.

riffic

beautiful formatting. I should crib this style template for internal work memos, it's timeless.

piterrro

> GET request with a body was heavily considered by the IETF working group, but it was ultimately rejected in favor of creating the new QUERY method. The decision to create a distinct method came down to historical interoperability issues and strict compliance with the core architectural definitions of HTTP.

I've been sending a request body with GET requests for years now.

notatoad

this was the response the last time this came up here.

you can do all kinds of nonstandard stuff if you control the server, the client, and any steps in between. the point of standards is for when you don't control it all.

put your server behind a managed load balancer or a caching proxy, and your get requests with bodies aren't going to do so well anymore.

treve

It's not a great idea. I wrote something a few years ago if it's interesting:

https://evertpot.com/get-request-bodies/

huskyr

Apparently some load balancers drop the body.

inigyou

I expect all sorts of intermediaries may drop the body, since having a body is forbidden by the standard.

When it's your client talking to your server you can obviously do whatever you want - it doesn't cause problems until you want to involve third-party code, such as a reverse proxy (such as nginx) or a CDN. This includes proxies your customers may be using.

jmaw

Where is it forbidden by the standard? I don't see anything in the GET definition in RFC 9110 [1] forbidding that. My understanding was that this is just undefined behavior. And not recommended due to your point about some third-party CDNs and RPs handling that UB in different ways.

[1]: https://datatracker.ietf.org/doc/html/rfc9110#name-get

preisschild

> I've been sending request body along GET method for years now

Generally not a great idea. With some http implementations this is not even possible (for example, fetch)

https://developer.mozilla.org/en-US/docs/Web/API/Fetch_API/U...

> You cannot include a body with GET requests

And transparent caching might result in weird issues.

pseudohadamard

It does seem like a boil-the-ocean "fix". The entire planet has been kludging this with GET or POST as appropriate for the entire lifetime of HTTP, and now there's a proposal to do something that looks identical to POST but with slightly different semantics? So in exchange for solving a problem that's been dealt with (if not perfectly) forever you'd need to convince the entire world to adopt a new, incompatible mechanism? I'll give this one as much of a chance as BEEP.

You've never heard of BEEP? Well, that's why.

drob518

Aside: Wow, we’ve hit 5-digit RFC numbers now!

smashed

Use the QUERY method in your http query to query search results. Do not add query parameters.

I think the name is confusing because the term 'query' is already used to refer to http requests in general.

Just the title of the RFC confused me.

comfydragon

> the term 'query' is already used to refer to http requests in general

In what circles is this the case? I sometimes colloquially refer to a GET request as a query, but definitely not so on a POST, PUT or DELETE.

jbmchuck

I'm guessing the op is referring to https://en.wikipedia.org/wiki/Query_string

rehevkor5

Yeah, and it doesn't even have to be a query, it could be an idempotent effect. I think they'd be better off calling it IPOST (for idempotent post).

Edit: ah, they declare QUERY as "safe" meaning no side effects, for cacheability. My mistake.

andltsemi3

If this is actually going to replace GET requests w/ query strings in the wild, I'm very much hoping for browser bookmarks to support keeping request parameters.

inigyou

Probably won't. Probably will replace whenever POST is currently used for a query.

anitil

That's a good point, I do like to be able to bookmark specific issues (like in Jira I bookmark 'all issues that are assigned to me and not done' )

pwdisswordfishq

Wait, it's already past 10 thousand?

rhplus

Someone has an ambiguous bet predicting when RFC 10000 will be published, but the numbers went straight from 9998 to 10008. No-one wins!

https://manifold.markets/CollectedOverSpread/when-will-rfc-1...

ekr____

RFC 10000 will not be published. They're just going to skip past the number.

https://mailarchive.ietf.org/arch/msg/tools-discuss/EpoQcVt_...

RFC #s are issued sometime before publication, so they can come out out of order. I would expect 9999, 10001, etc. to show up eventually.

pwdisswordfishq

Damn, I was so looking forward to see Robert Downey Jr. as the TLS handshake

echoangle

> This question resolves to the month of publication of the lowest-numbered RFC with a number greater than or equal to 10000.

So if 10008 is the first one after 10000, that date is the one to bet on.

schoen

Is anyone formulating prediction market questions asking AIs to brainstorm about edge cases in order to leave fewer of them uncovered by the market definition?

We do have humans brainstorming about such things, but this feels like something LLMs might be good at.

Imustaskforhelp

Every time I think that prediction market bets can't get worse, they do, all in weird ways. I never expected someone to bet on when RFC 10,000 will be published, but somehow it fits just about right for prediction markets.

Just wow, people seem to have too much money if they're betting on when RFCs are going to be released.

This isn't even one of the worst offenders on prediction markets, or even comparable to them, but I am just amazed (in a negative manner; surprised? it's just strange) by the depth of what people actually bet on in these markets.

networked

People aren't betting real money on this. Manifold uses "mana" points similar to HN karma, which is why you get more for-fun silly bets. I don't see anything inherently wrong with it. Disclosure: my mana net worth is 75k; I haven't been active on Manifold.

inlined

I know agents are out of scope of this RFC but I love that this could easily be extended to make the JS EventSource to work on streaming AI queries.

Due to the need of bodies in requests, everyone uses POST and streaming results often use the text/event-stream protocol for responses. But this is technically a bad fit because no state is actually changing and because EventSource can only use GET for some obstinate reason. So many APIs reimplement the functionality with their own parser

barbazoo

> GET: Content (body) "no defined semantics"

I thought it wouldn't be a terrible idea to open up the GET method to contain a body but according to the original spec the GET body is to be ignored completely. There's also caching which would break because the important bit of the request would live in the stripped body.

angrybards

GET with only a URI has the semantics of retrieving the current representation of the resource. This is the most basic form of hyperlinking and quite important to how the web works. Adding a body parameter to GET would break the constraints of the method, as you couldn't treat two requests using the same URI as referencing the same thing.

toybeaver

This makes me happy tbh, I was never a fan of creating `POST /search` endpoints when working with robust APIs
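For a feel of what a `QUERY /search` endpoint could look like, here is a small local sketch using only the Python standard library. It relies on `http.server` dispatching any method token to a `do_<METHOD>` handler and on `http.client` sending whatever method string it is given. The handler, the echo response shape, and `run_query` are made up for illustration.

```python
import json
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer


class SearchHandler(BaseHTTPRequestHandler):
    def do_QUERY(self):
        # Read the JSON filter document from the request body.
        length = int(self.headers.get("Content-Length", 0))
        filters = json.loads(self.rfile.read(length) or b"{}")
        # Hypothetical response: echo the filters instead of searching.
        payload = json.dumps({"echo": filters}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, *args):
        pass  # keep the demo quiet


def run_query(filters: dict) -> dict:
    """Start a throwaway local server, send one QUERY, return the JSON reply."""
    server = HTTPServer(("127.0.0.1", 0), SearchHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = HTTPConnection("127.0.0.1", server.server_port)
        conn.request(
            "QUERY",
            "/search",
            body=json.dumps(filters).encode(),
            headers={"Content-Type": "application/json"},
        )
        result = json.loads(conn.getresponse().read())
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()
```

This only shows that client and server libraries can already speak the method when nothing sits in between; proxies, load balancers, and CDNs on the path are a separate question.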

mlhpdx

Wow, it still isn’t a standard? I’ve been building with the QUERY method for years now.

I’ve enjoyed the combination with Range headers for paging, despite this tidbit:

> It is expected that these built-in features will be used instead of HTTP Range Requests

Using the QUERY request as the definition of a set, and Range to retrieve subsets seems very natural.
