Posted 2 days ago / 44 comments / github.com

2 days ago by kazinator

> In Rust (as in C, for that matter), two structs are not interchangeable just because they look the same.

In C, two structs with different tags in the same translation unit are indeed incompatible. (That includes two structs with no tag that look the same, because, effectively, each has some internal, machine-generated tag.)

Within the same translation unit, looking the same is not a consideration at all; it's all based on the tag symbol.

Between translation units, two complete struct types that are exactly the same (same members, of the same type, in the same order, with the same names, and other details like bitfield configuration) are compatible if they both have the same tag.

But they are also compatible if neither has a tag: and in that case, they would be incompatible if they were in the same translation unit.

Basically, two machine-generated tags for anonymous structs are considered equivalent if they are in different translation units, in which case compatibility is purely down to structural equivalence.

In practice, C compilers do not police struct tags at all between translation units; you can get away with it if the same object is known as "struct point { double x, y; }" in one translation unit and "struct pointe { double x, y; }" in another. You can even change the member names, for that matter.

You will run aground with a C++ compiler, though, due to type safe linkage which pulls in class names.

FFI implementations obviously don't care about any of this. If I'm declaring an FFI type that is to be compatible with the C "struct stat" in a Lisp, so I can call the stat function, these concepts have gone out the window. The memory layout compatibility is all that matters: correct sizes, alignments, conversions.
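
To make the layout point concrete, here is a minimal Rust sketch (Rust being the language of the linked article). The `Point` and `Pointe` types are hypothetical stand-ins for the two C declarations above, and the helper names are made up for illustration. It only shows that two `#[repr(C)]` structs with different names and different member names can share size, alignment, and field order, which is all an FFI layer depends on.

```rust
use std::mem::{align_of, size_of};

// Hypothetical stand-in for C's `struct point { double x, y; }`.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
struct Point {
    x: f64,
    y: f64,
}

// Hypothetical stand-in for `struct pointe { double x, y; }`, with the
// member names changed on purpose: layout does not depend on names.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
struct Pointe {
    a: f64,
    b: f64,
}

/// True when both types have the same size and alignment.
fn same_size_and_align<A, B>() -> bool {
    size_of::<A>() == size_of::<B>() && align_of::<A>() == align_of::<B>()
}

/// Reinterpret the bytes of a `Point` as a `Pointe`.
fn reinterpret(p: Point) -> Pointe {
    // SAFETY: both types are #[repr(C)] with two f64 fields in the same
    // order, so size, alignment, and field offsets match.
    unsafe { std::mem::transmute::<Point, Pointe>(p) }
}

fn main() {
    println!(
        "same size and alignment: {}",
        same_size_and_align::<Point, Pointe>()
    );
    let p = Point { x: 1.0, y: 2.0 };
    println!("original: x = {}, y = {}", p.x, p.y);
    let q = reinterpret(p);
    println!("reinterpreted: a = {}, b = {}", q.a, q.b);
}
```

The type checker still treats `Point` and `Pointe` as unrelated types; only the explicit `unsafe` reinterpretation crosses between them.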

2 days ago by josephcsible

Doesn't the "common initial sequence" rule imply that structs with the same members must be the same, even within a translation unit? If they weren't, wouldn't putting them both in a union and accessing through the other one not work, even though the standard requires it to work?

2 days ago by forrestthewoods

What is a tag in C?

2 days ago by formerly_proven

The name of a struct ("struct foo", the tag is "foo").

2 days ago by andybak

Not a Rust user but this all sounds remarkably painful. Is it common? The only other compiled, type-safe language I use regularly is C# (via Unity) and I don't recall this level of upheaval.

2 days ago by mbrubeck

This is uncommon, because it's not really necessary except for “foundational” libraries that are used “publicly” (i.e., not just internally) by hundreds of other libraries.

If only three or four of my dependencies use the Foo crate, then I can just wait for all of them to upgrade to Foo 2.0, or I can do it myself and submit pull requests. In this case, I don't really care whether Foo's maintainer uses the semver trick.

However, if I'm developing a really big project like Servo or rustc, and Foo is a foundational crate that is used by dozens of my (transitive) dependencies, then waiting for all of them to upgrade (or doing the work to upgrade all of them myself) starts to become prohibitive. Now a bit of work by Foo's maintainer can save a lot of time for large downstream consumers.

An example of a “foundational” crate that has used the semver trick is `num-traits`, which has over a thousand reverse dependencies: https://crates.io/crates/num-traits/0.1.43

2 days ago by 1wd

C# has assembly binding redirection.

https://stackoverflow.com/a/43366172/3679043

2 days ago by emn13

Yeah, and they're a huge pain if you indeed actually need to use them, and often result in a non-working mess. This wasn't all that uncommon in the early days of .net core, which was particularly bad at this IIRC largely because many foundational libraries were split into packages that essentially could only ever be upgraded in concert. There are a few technical nuances that mean I'm sure this isn't quite the same as the rust case, but it's pretty bad nonetheless. Even well thought-out transitions like .net standard weren't free from gotchas, particularly when mixed with multitargeting and deep transitive dependency graphs (which were pretty easy to get in the early .net core days).

The whole thing is clearly not great (not just in C#). In particular - many of these problems are entirely artificial. If the type system were sufficiently dynamic, or magically psychic - the problems would often go away. E.g. in the rust example - void_c didn't actually change, it was just incompatible because rustc isn't psychic. Even seemingly serious problems like an interface (~ in rust a trait) gaining a new method aren't necessarily fundamentally breaking - as long as you're not calling it. Even truly breaking changes like "best-solution-finder returns a different result and has an incompatible API" might not be breaking in the case of a diamond dependency pattern as long as the method's side effects allow running both versions - one for each transitive dependency requesting that specific version. Even in cases where the break is "real", like a rename, it'd be trivial to consider a shim that allows old consumers to use the new api.

In the vast, vast majority of cases in my experience this kind of breakage is a problem due to technicalities. That doesn't mean I have a clue how to solve that, but it does beg the question: isn't it possible for a dependency resolution system to do fundamentally better, here?

2 days ago by qes

> This wasn't all that uncommon in the early days of .net core

Good heavens was that a mess. It had been a nice 15+ years of easy-peasy with dependency management prior to that in MS-land, though.

It seemed that a fair amount of the problems were MS learning the pitfalls of how they packaged System assemblies for NuGet. Then the compatibility shim mostly brought things back to what we had been used to in .Net (works) - even easier, in fact, now that automatic assembly binding in builds isn't a minefield.

2 days ago by whoisthemachine

In C#, default implementations of new methods on interfaces should at least reduce the pain of additive changes.
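
Rust has a close analogue in default (provided) trait methods. The sketch below uses a hypothetical `Shape` trait and `Square` type, invented for illustration: `describe` plays the role of a method added in a later release, and the existing `impl` keeps compiling because the new method has a default body.

```rust
trait Shape {
    fn area(&self) -> f64;

    // Imagine this method was added in a later release. Because it has a
    // default body, implementors written before it existed still compile.
    fn describe(&self) -> String {
        format!("shape with area {:.1}", self.area())
    }
}

struct Square(f64);

// Written against the original trait: only `area` is implemented.
impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

fn main() {
    println!("{}", Square(2.0).describe());
}
```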

2 days ago by ViViDboarder

I’m new to Rust, but it doesn’t appear to be.

If I’m following correctly, it would only occur if a library is using a struct defined in a dependency in a public API you are using and you also use the same dependency elsewhere in your application. Upgrading the version elsewhere in your application could break passing that struct.

Most of the time I see public APIs which use either base types or structs defined by the library itself.
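
A small sketch of that failure mode, under the assumption that two modules can stand in for two major versions of the same dependency (the names `foo_v1`, `foo_v2`, `Config`, and `library_api` are all hypothetical). The two `Config` structs are written identically, yet Rust treats them as distinct types, so a value of one cannot be passed where the other is expected.

```rust
use std::any::TypeId;

// Hypothetical stand-ins for two major versions of a dependency `foo`.
mod foo_v1 {
    pub struct Config {
        pub retries: u32,
    }
}

mod foo_v2 {
    pub struct Config {
        pub retries: u32,
    }
}

// A library built against v1 exposes that type in its public API.
fn library_api(cfg: foo_v1::Config) -> u32 {
    cfg.retries
}

/// True when `A` and `B` are the same type as far as the compiler is concerned.
fn same_type<A: 'static, B: 'static>() -> bool {
    TypeId::of::<A>() == TypeId::of::<B>()
}

fn main() {
    println!(
        "v1 and v2 Config are the same type: {}",
        same_type::<foo_v1::Config, foo_v2::Config>()
    );

    let newer = foo_v2::Config { retries: 5 };
    println!("v2 value on its own is fine: {}", newer.retries);

    // library_api(newer); // would not compile: mismatched types
    println!("{}", library_api(foo_v1::Config { retries: 3 }));
}
```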

2 days ago by undefined

[deleted]

2 days ago by gazarullz

It would have been nice to tag the post as being about Rust + semver, rather than only semver as the title implies.

2 days ago by mjw1007

You could use the same trick for other languages and packaging systems, as long as they support linking multiple major versions of the same library into one program.

2 days ago by hombre_fatal

I guess we'll have to (ugh) click the title and spend (shudder) 10 seconds reading to get further context.

2 days ago by sixstringtheory

Wouldn’t this still break a consumer who doesn’t realize the trick is being employed? Doesn’t this assume the consumer is making the requisite changes in their library as part of upgrading to the “nonbreaking” upgrade that slips the new type definition in via its dependency on its own breaking upgrade?

2 days ago by Tuna-Fish

No. Someone using the old version can continue using it, completely unaware of the fact that anything changed.
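
A rough sketch of why that works, again using modules as stand-ins for crate versions (the version numbers, `Handle`, `old_consumer`, and `new_producer` are hypothetical). In the trick, a final compatible release of the old series depends on the new major version and re-exports the unchanged items, so both version paths name the very same type.

```rust
use std::any::TypeId;

// Stand-in for the new breaking release (say, `foo` 0.2.0), which owns
// the real definition.
mod foo_0_2 {
    pub struct Handle(pub u32);
}

// Stand-in for a final compatible release of the old series (say, 0.1.1).
// Instead of defining its own `Handle`, it re-exports the one from 0.2.
mod foo_0_1 {
    pub use super::foo_0_2::Handle;
}

// A downstream library that still depends on the 0.1 series.
fn old_consumer(h: foo_0_1::Handle) -> u32 {
    h.0
}

// Another downstream library that has already moved to 0.2.
fn new_producer() -> foo_0_2::Handle {
    foo_0_2::Handle(7)
}

fn main() {
    // Both paths resolve to one and the same type.
    assert_eq!(
        TypeId::of::<foo_0_1::Handle>(),
        TypeId::of::<foo_0_2::Handle>()
    );
    // So a value produced against 0.2 flows into code written against 0.1.
    println!("{}", old_consumer(new_producer()));
}
```

The author of `old_consumer` changes nothing; the re-export lives entirely inside the library's own compatible release.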

2 days ago by diegocg

Wouldn't this be solved easily with symbol versioning?

2 days ago by kibwen

Possibly (I've never seen a concrete proposal of per-symbol versioning, so I can't say for certain), but in that case all downstream consumers of your library would have to pre-emptively declare the version for each and every symbol they use.

Now obviously not all users will be using every symbol from every library that they depend on, but, to use the OP's example of libc, which contains over 4,500 symbols... that starts to look unwieldy.

Of course, you could technically do this today by just having every symbol in its own crate. And while that seems like quite a stretch, I think it is the consensus that the libc crate in particular is too big, and should have been split out into multiple crates in order to better facilitate these sorts of upgrades. So there might be a practical middle ground by having one crate for "crucial, fundamental symbols" and a separate crate for "ancillary symbols", where each could be versioned separately. That might get close enough to the precision of per-symbol versioning without getting unwieldy.

2 days ago by nixpulvis

I would be very interested in how this could be created automatically by `cargo` itself. A cargo semver tool which can bump versions, and create two versions on major/minor breaking changes like this post recommends would be really cool to see.

2 days ago by dmitriid

> Servo found themselves coordinating an upgrade of 52 libraries over a period of three months

And yet... people only complain about npm having lots of dependencies ;)

2 days ago by PhineasRex

Projects of similar size in JS have hundreds of dependencies, so 52 really isn't a lot

2 days ago by dmitriid

That's mostly because JS doesn't have a suitable standard library.

2 days ago by saagarjha

Rust has this same problem. (Although you could make the argument that this specific state of things was explicitly chosen in Rust's case.)

a day ago by heavenlyblue

Or because JS is and was a hype and is full of people marketing their names through many easy, small packages.

2 days ago by ilammy

Well, looking at various crates pulling in lots (dozens) of dependencies, I'd say crates.io is clearly moving somewhere into that direction of microlibraries with extensive code reuse.

2 days ago by kibwen

I only encounter this in Rust when doing anything web-related. There's quite a lot of prominent authors who go out of their way to reduce their dependencies by any means necessary (e.g. https://github.com/tokio-rs/tokio/pull/1324 , which is a notably extreme case).

2 days ago by dmitriid

> I only encounter this in Rust when doing anything web-related.

It's probably because anything web-related requires so many things not readily available in the language or in the standard library: anything from databases (sometimes different breeds of databases) to templating to serialisation (possibly multiple types of serialisation) to rest or graphql to...

Just serialisation (which is almost invariably Serde) will pull in at least 54 dependencies (if you only use serde and serde_json). A framework such as rocket which provides all that, and more, will pull in ... 332 dependencies :)

Edit: Calculation is invalid, see https://news.ycombinator.com/item?id=24024671

2 days ago by ChrisSD

The trouble with monolithic crates is that they're monolithic. People complain about the massive amount of code they have to pull in just for a few functions, and the effect this can have on compile times.

The trouble with small crates is they're small. People complain when the number of dependencies grows larger and mockingly reference "leftpad".

2 days ago by josephcsible

The issue with leftpad wasn't that it was small. The issue was that so much production code relied on it not going away, despite npm letting the author make it go away.

2 days ago by linkdd

If people pull in a massive amount of code to use a few functions, the problem is not the size of the code, the problem is the laziness of the developer who doesn't want to write 3 functions.

Same for leftpad-like packages, I don't need a dependency to a single function that I can rewrite myself. Especially if I'm not writing open-source software and I have to audit the licenses of my dependencies.

Software development is a matter of trade-offs, you get what you choose, people should not be complaining about that.

2 days ago by undefined

[deleted]

2 days ago by Lammy

> The Rust library ecosystem has a history of traumatic library upgrades. The upgrade of libc from 0.1 to 0.2 is known as the "libcpocalypse".

Somebody should come up with a commonly-agreed-upon versioning scheme where we can indicate that breaking changes should be expected so people could avoid putting themselves in situations they regret.

2 days ago by steveklabnik

The versioning scheme is not the issue here.

The issue is that there can only be one copy of certain kinds of libraries, and when there's a major version bump, it creates a fork in the ecosystem.

2 days ago by dtolnay

The article isn't supposed to imply that any regret was involved.

People used early libc and early serde to get a massive benefit (respectively: talk to C code, and process JSON). Independent of version numbers those are things people want to do in Rust.
