Hacker News

8 hours ago by wcoenen

I'm a little confused about the issue description, because "mov" is not a MIME type.

Examples of MIME types: "text/plain", "text/html", "image/png", "application/pdf", "video/quicktime", ...

If I was prevented from using the username "wcoenentext/html", then I wouldn't really be bothered by that. (Although I might question the design decisions that would necessitate such a restriction.)

8 hours ago by scblzn

Hello,

I’m the author of the issue on GitLab (small world, isn’t it?)

Yes, the message is confusing, and I agree that .mov isn’t a MIME type, but I was merely reporting the error message shown (plus, they added .mov to their list of file types and aliased it to the .mp4 format; please see: https://gitlab.com/gitlab-org/gitlab/-/blob/master/config/in... )

3 hours ago by codetrotter

> they added .mov in their list of file types and had aliased it to .mp4 format

That’s weird. Why’d they do that? They should make a separate entry for mov and associate it with video/quicktime

Guess it might be something related to https://stackoverflow.com/a/44785870 but like they point out, mov is a container format that can contain one of many different codecs used. And isn’t mp4 just a container too? Referring to mov files as video/mp4 seems straight up incorrect to me

an hour ago by john_cogs

A GitLab team member has opened a merge request to make the error message more clear: https://gitlab.com/gitlab-org/gitlab/-/merge_requests/70374/...

7 hours ago by eyelidlessness

Damn, and I was about to start registering usernames like json/bourne, but I guess that’s overly specific.

6 hours ago by waspight

That will be the name of my next child.

5 hours ago by pulse7

Please don't do that... Elon Musk's son is named "X Æ A-12", don't follow his steps...

8 hours ago by zodiakzz

The word they're looking for is "file extension name".

3 hours ago by qwerty456127

There is no such thing as a file extension name, it's a file name extension. It's called this way because file names could only be 8-byte strings initially and then this was extended with 3 extra byte places.

Neither part of the MIME type format even has to match any existing (commonly used) file name extension anyway. E.g. it can be `text/plain`. Even when it does, it is just a coincidence (although a very common one); it actually references the format name (IIRC `image/jpeg` was used even when almost nobody was using `jpeg` for the extension and the convention was to use `jpg`).

5 hours ago by pastage

This is interesting because compounds are mostly written together in other languages.

Never thought about it, but "file extension name" really is a word. Someone replied saying this is three words, but it is not, is it? It's an open compound word or maybe a "set phrase". I wanted to call it an idiomatic expression, but that was clearly wrong.

7 hours ago by iamtedd

Who are you, Kath Day-Night?

8 hours ago by a-dub

it seems pretty obvious that they mean any file extension registered to a known MIME type.

2 hours ago by jfrunyon

No file extension <-> MIME type "registry" exists. File extensions exist completely outside of MIME. Many file extensions do not correspond to a registered MIME type, or in many cases even a de-facto one (other than application/octet-stream or text/plain).

It seems pretty obvious - based on the failed username in question, and to someone with fairly deep technical knowledge - that they mean anything they consider a file extension. Which is not an excuse for this marvel of awful UI slapped on top of a poorly-thought-out workaround (for some unknown vuln that's been patched for over 2 months and is still private? quite strange for an "open" company eh?).

an hour ago by yuliyp

Officially no such registry exists. In practice, Apache does have such a registry by default: https://svn.apache.org/repos/asf/httpd/httpd/trunk/docs/conf... and other systems do use that mapping or a similar one.
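
For illustration, here is a minimal Python sketch of how an Apache-style `mime.types` mapping can be loaded and queried. The sample table is a hand-written excerpt in that file's format (not the real file), and `load_registry` is a hypothetical helper name; only the standard-library `mimetypes` module is assumed.

```python
import io
import mimetypes

# Hand-written excerpt in the Apache mime.types format:
# one MIME type per line, followed by the extensions mapped to it.
SAMPLE_MIME_TYPES = """\
# MIME type            extensions
video/quicktime        mov qt
video/mp4              mp4 mp4v mpg4
application/json       json
"""


def load_registry(text: str) -> mimetypes.MimeTypes:
    """Build an extension -> MIME type lookup from mime.types-style text."""
    registry = mimetypes.MimeTypes()
    registry.readfp(io.StringIO(text))
    return registry


registry = load_registry(SAMPLE_MIME_TYPES)

# The lookup goes from extension to type; an unknown extension yields None.
mov_type, _ = registry.guess_type("holiday.mov")
unknown_type, _ = registry.guess_type("notes.notarealext")
print(mov_type, unknown_type)
```

The mapping is a convention shipped with software, which is consistent with the point that no official extension registry exists.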

8 hours ago by gpvos

It's easy, basic, and very important for clarity to use the correct terminology in this case. A MIME type is really something different.

3 hours ago by a-dub

"ERROR: The Gitlab server has rejected the proposed username, as it ends in the same suffix as a file extension that is registered to a MIME type in the Ruby runtime under which the server runs. This list is quite long, and possibly difficult to retrieve, so we will not list it here, but you can find a list of extensions commonly used if you Google for MIME. Alternatively, if this makes no sense at all, ask a local alpha geek and they should be able to help. We understand this is weird, but the reasons for doing so are currently embargoed as they potentially have wide ranging security or stability consequences for a large number of installs. If you wouldn't mind, please do us a small one and keep quiet while we have a chance to prepare and distribute a patch without forcing anyone to forego any nights of sleep 18 months into a global pandemic complete with associated societal fracturing and potential economic collapse. Thank you for your cooperation on this easy, basic and very important matter."

fixed it!

5 hours ago by p49k

It’s unrealistic to expect everyone to be able to know and use terminology perfectly. The description itself was well-written and that’s what’s important in terms of finding/fixing the issue.

7 hours ago by hnlmorg

In fairness, it is GitLab's wording the issue reporter is using. Check the error message.

7 hours ago by Grollicus

Gitlab recently exchanged the "WIP" prefix for merge requests (Work in Progress = started to do something but didn't complete it yet) for "Draft", which has connotations of throwing the draft/sketch away to build the final product.

Which is definitely not what is meant there. But I think it shows that Gitlab is not a company I'd go to if I wanted linguistic precision.

an hour ago by hibbelig

> it seems pretty obvious that they mean any file extension registered to a known MIME type.

I guess I'm dense, but I actually thought it's about users such as Mr. Joe R Text/Plain. When I read about "mov" in the actual issue, then it became clear.

6 hours ago by pechay

Once had my Australian production system go down because a js plugin we were using had added .au to its list of media types. Integration had a different TLD. Response I got from the author was 'LOL!'. :)

5 hours ago by CodesInChaos

Why is the application treating a TLD like a file extension?

5 hours ago by detaro

probably because some broken code matched on the end of the URL, without checking if it was just the naked domain?
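
A rough Python sketch of that guess, assuming a hypothetical `MEDIA_EXTENSIONS` list and made-up function names (the actual plugin code is not shown in this thread): matching on the end of the whole URL flags a bare `.au` domain, while checking only the path component does not.

```python
import posixpath
from urllib.parse import urlsplit

# Hypothetical list of media extensions, including ".au" (an audio format).
MEDIA_EXTENSIONS = {".au", ".mov", ".mp4"}


def naive_is_media(url: str) -> bool:
    """Buggy variant: looks at the end of the whole URL string."""
    return any(url.endswith(ext) for ext in MEDIA_EXTENSIONS)


def is_media(url: str) -> bool:
    """Safer variant: only inspects the extension of the URL path."""
    path = urlsplit(url).path
    _, ext = posixpath.splitext(path)
    return ext.lower() in MEDIA_EXTENSIONS


print(naive_is_media("https://example.com.au"))          # bare domain is misclassified
print(is_media("https://example.com.au"))                # path is empty, so no extension
print(is_media("https://example.com.au/sounds/beep.au")) # a real .au file in the path
```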

8 hours ago by ncann

Looks like the fix restricted the check to "usernames that end with dot and MIME type"

Still, what is the attack vector?

3 hours ago by anentropic

Probably: some web frameworks do content negotiation by appending a content type like .json to the end of the url

Not sure if it's an attack vector per se, or just that the behaviour is incompatible with allowing usernames containing . and then having urls where the username is the last segment of the url

seems like a badly designed url scheme :)
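
To make the ambiguity concrete, here is a small Python sketch of the suffix convention described above. `KNOWN_FORMATS` and `split_format` are illustrative names, not any framework's API; the point is only that a router which strips a trailing format cannot tell a user named `alice.json` from a JSON view of `alice`.

```python
from typing import Optional, Tuple

# Hypothetical set of format suffixes the router understands.
KNOWN_FORMATS = {"json", "xml", "atom"}


def split_format(path: str) -> Tuple[str, Optional[str]]:
    """Split '/alice.json' into ('/alice', 'json'); leave other paths alone."""
    head, dot, suffix = path.rpartition(".")
    if dot and suffix in KNOWN_FORMATS:
        return head, suffix
    return path, None


# A profile URL for the username "alice.json" is read as user "alice", format "json".
print(split_format("/alice.json"))
# A username whose suffix is not a known format passes through unchanged.
print(split_format("/isaac.asimov"))
```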

2 hours ago by jfrunyon

> some web frameworks do content negotiation by appending a content type like .json to the end of the url

This has always disturbed me, considering that HTTP has had content negotiation for ... oh, basically its entire history [https://www.w3.org/Protocols/HTTP/1.0/spec.html#Accept].
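
For comparison, a simplified Python sketch of header-based negotiation. `pick_media_type` is a hypothetical helper, and it deliberately ignores partial wildcards such as `text/*`; it only ranks the listed types by their `q` values and handles `*/*`.

```python
from typing import List, Optional


def pick_media_type(accept_header: str, available: List[str]) -> Optional[str]:
    """Return the available media type the client prefers, or None."""
    preferences = []
    for item in accept_header.split(","):
        parts = [part.strip() for part in item.split(";")]
        media_type = parts[0].lower()
        if not media_type:
            continue
        quality = 1.0  # default weight when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        preferences.append((quality, media_type))

    # Highest q first; the sort is stable, so header order breaks ties.
    preferences.sort(key=lambda pref: pref[0], reverse=True)
    for quality, media_type in preferences:
        if quality <= 0:
            continue
        if media_type == "*/*" and available:
            return available[0]
        if media_type in available:
            return media_type
    return None


offered = ["text/html", "application/json"]
print(pick_media_type("text/html;q=0.5, application/json", offered))
```

With this approach the URL path stays the same for every representation, so the username never has to share space with a format suffix.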

7 hours ago by walty8

But in the screen capture in the article, the user name is actually 'issac.asimov', i.e. the mime type does not immediately follow the dot.

6 hours ago by nobody9999

>But in the screen capture in the article, the user name is actually 'issac.asimov', i.e. the mime type does not immediately follow the dot.

A variation on the Scunthorpe Problem[0] then, eh?

[0] https://en.wikipedia.org/wiki/Scunthorpe_problem

6 hours ago by whizzter

Somebody probably put in a regexp like .mov$. In a regexp, though, an unescaped dot (.) matches any character (and $ matches the end), so the i in asimov is consumed by the dot and the rest of the match succeeds.
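
A quick Python sketch of that guess (this is a commenter's hypothesis, not GitLab's actual code; `rejected_by` is a made-up helper): the unescaped pattern rejects a name ending in "imov", while the escaped one requires a literal dot.

```python
import re

unescaped = re.compile(r".mov$")   # '.' matches any single character
escaped = re.compile(r"\.mov$")    # '\.' matches only a literal dot


def rejected_by(pattern: re.Pattern, username: str) -> bool:
    """True if the pattern finds a match anywhere in the username."""
    return pattern.search(username) is not None


print(rejected_by(unescaped, "isaac.asimov"))  # the 'i' in 'asimov' satisfies '.'
print(rejected_by(escaped, "isaac.asimov"))    # no literal '.mov' at the end
print(rejected_by(escaped, "holiday.mov"))     # a real '.mov' suffix
```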

6 hours ago by iechoz6H

Perhaps the sub-clause is redundant there?

'The problem was named after an incident in 1996 in which AOL's profanity filter prevented residents of the town of Scunthorpe, Lincolnshire, England, from creating accounts with AOL, because the town's name contains the substring "cunt".'

7 hours ago by ajkjk

That was before the fix.

8 hours ago by boomskats

This doesn't look like a security issue, unless I'm missing something.

8 hours ago by paxys

Definitely a security issue.

- The merge request which originally added this check is inaccessible (https://gitlab.com/gitlab-org/security/gitlab/-/merge_reques...)

- In the issue comments the Gitlab employee says "Sorry, I cannot go into details right now. I will link the issue here once it goes public, is it ok?"

8 hours ago by nine_k

It could potentially be exploited in a very interestingly crafted email, where there's a link to download something (e.g. the source tarball, or a build artifact) with a URL containing the username, or being otherwise close by, so that the downloaded file would be interpreted differently. But I'm not creative enough at this hour to suggest a working exploit.

6 hours ago by dolmen

I suspect a case of impersonating a user who doesn't have the suffix. Ex: create user "toto.mov" to take over some resources of user "toto".

7 hours ago by amjd

Maybe it's something to do with a MIME sniffing attack. The user profile URL may be detected as a different MIME type by the browser based on the extension: https://gitlab.com/myname.js

I'm not sure how one could exploit it though...

8 hours ago by foota

I've played with this before. A correctly implemented mail library should handle, e.g., subject lines that contain SMTP control characters. I developed a lengthy repro for an email parsing issue in an ancient version of some Java email library that contained a truly horrendous parser, only to find out that the library had been updated internally recently :-)

7 hours ago by a-dub

MIME types are used all over the place:

1) web servers, browsers, proxies

2) graphical OS shells

3) email

every file a webserver returns has a mime type in the header, and that is how the browser knows how to present it.

5 hours ago by tankenmate

I get the sneaking suspicion that this is a case of Ruby's magic being slightly too magic; it's a problem I have tripped across in the past.

Slightly tangentially reminds me of the "More Magic" switch of GLS fame.

8 hours ago by pavon

Indeed the MR template does not have the security box checked.

8 hours ago by kinix

My guess is that the username is used in a url somewhere? So browsers might try and interpret it as a file

8 hours ago by paxys

TL;DR for those wondering:

- There was a yet-undisclosed security vulnerability in Gitlab usernames

- Staff member made a change to disallow usernames ending with `Mime::EXTENSION_LOOKUP.keys`, which I assume is a set of recognized file extensions (hidden – https://gitlab.com/gitlab-org/security/gitlab/-/merge_reques...)

- This was overly broad since it caught a lot of common names (like "asimov") (https://gitlab.com/gitlab-org/gitlab/-/issues/335278)

- The check was updated to additionally look for a "." before the extension (https://gitlab.com/gitlab-org/gitlab/-/merge_requests/65954)
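
The before/after behaviour summarized above can be sketched in Python as follows. This is an illustration of the described logic, not GitLab's Ruby code: `EXTENSIONS` is a small hypothetical stand-in for the keys of `Mime::EXTENSION_LOOKUP`, and both function names are made up.

```python
# Hypothetical stand-in for the keys of Mime::EXTENSION_LOOKUP.
EXTENSIONS = ("mov", "json", "html", "png")


def blocked_before(username: str) -> bool:
    """Original check as described: reject any name ending in an extension."""
    return username.endswith(EXTENSIONS)


def blocked_after(username: str) -> bool:
    """Updated check as described: require a '.' right before the extension."""
    return username.endswith(tuple("." + ext for ext in EXTENSIONS))


print(blocked_before("isaac.asimov"))  # caught: ends in "mov"
print(blocked_after("isaac.asimov"))   # allowed: does not end in ".mov"
print(blocked_after("toto.mov"))       # still rejected
```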

6 hours ago by OskarS

> Staff member made a change to disallow usernames ending with `Mime::EXTENSION_LOOKUP.keys`

This is a hell of a thing to commit and pass code review in 2021 on a project like GitLab. I understand that the staff member was fixing a security issue and was probably not thinking deeply about the ramifications, but even so. How many "Falsehoods programmers believe about names" articles do we need?

5 hours ago by mavhc

That's usually the exact time you want to think deeply about the ramifications

4 hours ago by OskarS

I don't want to be too harsh on the programmer, everyone makes mistakes. It's easy to see how a person focused on a security issue with user names makes a quick fix without thinking through how this will affect account creation. As programmers, we have to make like 7,000 decisions every day, you're bound to fuck up some of them. This is a pretty big one, though.

The bigger question is how this passed code review and testing.

5 hours ago by Maxion

Surely it would be easy to run the new rules against random-list-of-usernames-found-through-google before pushing to prod? Or perhaps the security issue was deemed so great that they needed a fix out yesterday?

3 hours ago by Aeolun

I mean, they’re still not telling us what it was and it’s been fixed for a month. Must’ve been pretty big.

an hour ago by john_cogs

GitLab team member here. This issue provides additional context on the changes: https://gitlab.com/gitlab-org/gitlab/-/issues/26295

an hour ago by cesarb

That seems to be a consequence of what IMO is an unfortunately common bad design: having user-controlled data like usernames as the first path component (without a prefix like ~). There are many things which are expected to be found at the root of the path (the classic example being robots.txt, but there's also favicon.ico, .well-known, and probably others; I vaguely recall that Flash used a fixed filename in the root for cross-domain access control), and you never know when a new one will be invented by someone (though .well-known is supposed to contain the spread of these "magic" names).
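
A tiny Python sketch of the collision, under the assumption of a hypothetical `RESERVED_ROOT_NAMES` set and made-up helper names: with usernames as the first path component, a user called "robots.txt" shadows the well-known file, while a `~` prefix keeps the two namespaces apart.

```python
# Hypothetical set of names that clients expect to find at the site root.
RESERVED_ROOT_NAMES = {"robots.txt", "favicon.ico", ".well-known"}


def profile_path_unprefixed(username: str) -> str:
    """Profile URL path with the username as the first path component."""
    return "/" + username


def profile_path_prefixed(username: str) -> str:
    """Profile URL path with a '~' prefix separating users from root files."""
    return "/~" + username


def collides_with_root_file(path: str) -> bool:
    """True if the first path component is one of the reserved root names."""
    first_component = path.lstrip("/").split("/")[0]
    return first_component in RESERVED_ROOT_NAMES


print(collides_with_root_file(profile_path_unprefixed("robots.txt")))
print(collides_with_root_file(profile_path_prefixed("robots.txt")))
```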

7 hours ago by akie

If anyone ever says that software engineering is completely unlike plumbing, I will point them towards your comment. All systems, however nicely architected, are full of duct-tape solutions like this.

6 hours ago by andix

Professional plumbers don't use duct tape for fixing leaks ;)

5 hours ago by capableweb

As someone who has traveled to some of the lesser traveled places on this planet, they do too! Sometimes even worse techniques are being used to fix things.

7 hours ago by Grollicus

Are they though? What I can think of (broken upstream reverse proxies that do mime type inference by filename) would warrant a WE_USE_BROKEN_LEGACY_SHIT_UPSTREAM config flag so that it doesn't get in the way of normal users.

So I'm probably missing something and I'm really curious for the underlying vulnerability.

5 hours ago by akie

I have no clue about the underlying issue, but I'm guessing it's occurring on a boundary or an interplay between two systems.

Something like "the username can be part of the URL, and if the URL contains .mov, some browsers will misinterpret this and assume it's a movie file, leading to bad things™".

Or: "the username is sometimes used as a folder name, and our syncing software contains rules to exclude certain file extensions, so these folders were never synced, which led to issues on production servers"

I'm guessing it's something along these lines. Something that you control, but not really, leading to these kinds of haphazard workarounds.

2 hours ago by DonHopkins

>- Staff member made a change to disallow usernames ending with `Mime::EXTENSION_LOOKUP.keys`, which I assume is a set of recognized file extensions (hidden – https://gitlab.com/gitlab-org/security/gitlab/-/merge_reques...)

If only github had an established system and procedure for doing code reviews before releasing security fixes...

8 hours ago by fregante

My guess is that they were using /.mov$/ to check the username, which is missing an escape.

8 hours ago by alfiedotwtf

Sounds like a WAF is getting in the way. Wikileaks had the same issue years ago when you couldn't search for programming language binaries, e.g. "Ruby" or "Perl" etc.

7 hours ago by kzrdude

What's a WAF?

6 hours ago by Grollicus

https://en.wikipedia.org/wiki/Web_application_firewall

Basically something to extract boatloads of money from enterprise customers by annoying THEIR customers so they can't write "<script>" in texts in their application.

Or, less tongue-in-cheek, a way to harden web applications against known attack patterns like SQL injection or XSS attacks. As they work on pattern recognition and don't know anything about your application, they sometimes get in the way. But they'll probably check some box for some security audit, so they're used.

Cloudflare for example offers one at https://www.cloudflare.com/waf/

5 hours ago by a-dub

oh weird. i thought they also did stuff like parse and reconstruct requests to try and catch any funny business and centralize/add ease for things like ratelimiting and fail2ban for webapps. looking at this one, it appears not.

8 hours ago by em3rgent0rdr

Reminds me of https://xkcd.com/327/

"Did you really name your son Robert'); DROP TABLE Students;-- ?"

"Oh, yes. Little Bobby Tables, we call him."

2 hours ago by InsomniacL

why would this get a single downvote? i suspect the underlying reason for the restriction is somewhat related to sanitising input.

23 minutes ago by Grollicus

I suspect because Isaac Asimov was named before any of these problems with his name were a thing and his name is not something artificially constructed to cause problems with buggy computer systems.

2 hours ago by FroshKiller

I did not downvote, but I don't like reading the same xkcd references over and over, so I can imagine others feel the same way.
