Custom Allocators in Rust

nical.github.io

Daily Digest email

Get the top HN stories in your inbox every day.

cormacrelf

> This one is probably closest to what one would write in languages like C++. The data structure just assumes the allocator will outlive it, and it is up to the user to either use a static allocator, or pretend to by using unsafe code to cast away the lifetime while making sure that the allocator outlives the data structure without the compiler's support.

I know many in the Rust community will frown upon this approach, but to be honest I don't think that it is a terrible solution in the context of an advanced feature like custom allocation strategies. Tessellator does not expose an unsafe API, but it documents that if its users were to break the rules they'd simply have to make sure the allocator outlives the data structure. In any other language with this kind of control over memory management, this contract would have to be manually upheld by users of the API and it is considered normal.

… no thanks.

andrepd

The value proposition of Rust is precisely having zero-cost abstractions and highest-performance code with compile-time verification of correctness.

That being said, I don't oppose the occasional judicious use of unsafe. If 99% of your code is verified, it's not 100%, but that's still a massive improvement over a C++ codebase.

cormacrelf

We can talk in specifics: this example was about making people use unsafe to avoid having to type <'static> in the general case. The author had been trying for a number of paragraphs to avoid polluting the type name with generics or lifetimes. This is what going too far looks like.

The std APIs look like this:

   struct Box<T, A: Allocator = Global>

And it’s been this way in stable releases for the last year or so. The same has been done for Vec and all the other std::collections. What percent of Rust programmers do you think even noticed at all? 1%? The most flexible design and the least impacting on regular users. You can use A = &'a dyn Allocator if you like, equally you can choose a ZST and not pay for 16 bytes of storage. The library author has no need to choose in advance at all, which is great if they’re determined to make weirdly constrained choices, ultimately forcing most uses of the API to be unsafe. I stopped reading after that so I don’t know which design they went for.
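A rough sketch of the size trade-off described here, under loud assumptions: `Alloc`, `Global`, and `MyBox` below are hypothetical stand-ins written for illustration, not the std types (as far as I know, implementing the real `Allocator` trait still needs the nightly-only `allocator_api` feature). The point is only that a defaulted type parameter lets the caller pick between a zero-sized allocator and a fat `&dyn` handle.

```rust
use std::mem::size_of;

// Hypothetical stand-in for std's `Allocator` trait (illustration only).
trait Alloc {}

// Zero-sized "global" allocator: storing it inside a container costs nothing.
struct Global;
impl Alloc for Global {}

// A borrowed trait object can serve as an allocator handle too.
impl<'a> Alloc for &'a dyn Alloc {}

// Box-like type with a defaulted allocator parameter, mirroring the
// `Box<T, A: Allocator = Global>` shape quoted above.
#[allow(dead_code)]
struct MyBox<T, A: Alloc = Global> {
    ptr: *mut T,
    alloc: A,
}

// Size when the default zero-sized allocator is used.
fn default_box_size() -> usize {
    size_of::<MyBox<u64>>()
}

// Size when the allocator is a `&dyn Alloc` fat pointer.
fn dyn_box_size() -> usize {
    size_of::<MyBox<u64, &'static dyn Alloc>>()
}
```

On a 64-bit target, `default_box_size()` is one pointer (8 bytes) and `dyn_box_size()` is three (8 bytes for the data pointer plus the 16 bytes mentioned above for the `&dyn` handle). Code that never names `A` is unaffected either way.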

throwawaymaths

> The most flexible design

Not quite. For the most general case, you probably want the allocator to be a proper parameter instead of a type parameter. For example, suppose you have a green thread that you want to have its own isolated heap. Then you can't assume that any given allocator is a singleton in its type; the struct itself must somehow be able to find its "owner" on release. In the green thread case you can't "just use threadlocal" because a green thread might not be sticky to an OS thread.

mikepurvis

And if the generated documentation is really the main sticking point, it would be perfectly possible to special case the allocator type parameter in rustdoc so that it hides or otherwise deemphasizes it.

likeabbas

I think `unsafe` would have been more aptly named `compiler_unverifiable`. IMO there would be less apprehension to using `unsafe` when it's needed.

ZephyrBlu

As someone who mainly uses higher level languages, doesn't care for C/C++ and really likes Rust, I'm glad it was named so strongly and that safety and correctness is very important in the community.

It creates a strong incentive to only write safe Rust, which is great for the vast majority of people.

Yoric

`unchecked` might be more palatable, but I agree that there is some uncomfortable mismatch between the meanings of "unsafe" and `unsafe`.

peyton

Ok Rust is about a lot of things, but correctness is not one of those things. Dafny is an example of a programming language that’s about correctness.

lenkite

Well, I hope we get custom allocation soon in Rust. It's really strange not having the same in an advertised systems programming language.

Can help in incidents like the below:

https://www.svix.com/blog/heap-fragmentation-in-rust-applica...

Diggsey

You can already do custom allocation in Rust. These efforts are about bringing support for custom allocators to the types in `std` so that you can combine use of a normal `Vec` with a custom allocator, instead of needing to define your own `Vec` (or at least pull one in from a 3rd party crate).

tomjakubowski

You might want custom local allocators. You can already replace the global allocator with one designed to reduce fragmentation, as described in the article.
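For reference, replacing the global allocator works on stable Rust through the `GlobalAlloc` trait and the `#[global_allocator]` attribute. The sketch below is only a stand-in: `Counting`, `ALLOCATIONS`, and `allocation_count` are made-up names, and the wrapper just counts calls before forwarding to the system allocator. A real fragmentation-oriented allocator would be plugged in at the same spot.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical wrapper: counts allocations, then forwards to the system allocator.
struct Counting;

static ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for Counting {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATIONS.fetch_add(1, Ordering::Relaxed);
        // SAFETY: the caller upholds `GlobalAlloc::alloc`'s contract for `layout`.
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        // SAFETY: `ptr` was returned by `System.alloc` with this same `layout`.
        unsafe { System.dealloc(ptr, layout) }
    }
}

// Every heap allocation in the program (Box, Vec, String, ...) now goes through `Counting`.
#[global_allocator]
static GLOBAL: Counting = Counting;

// Number of allocations observed so far.
fn allocation_count() -> usize {
    ALLOCATIONS.load(Ordering::Relaxed)
}
```

This swaps the allocator for the whole program, which is why it does not cover the "custom local allocators" case.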

foota

There's been some side work towards defining dynamic scoping for rust. E.g., allow functions to declare a dependency on some value or type, and then require callers to either pass that dependency explicitly, or have it on their dependencies.

One of the proposed uses was for allocators.

https://tmandry.gitlab.io/blog/posts/2021-12-21-context-capa... is one of the more cogent proposals.

I don't think any of these has made substantial progress towards being approved though.

forrestthewoods

That’s pretty neat. Seems very similar to Jai’s implicit context. I’m definitely a fan of context objects. It solves so many painful problems related to allocators, logging, etc.

foota

Yeah, I think this is a really nice thing to have in a language, purists may not like it, but being able to flexibly pass something through layers is a cheat code for coding.

Reitet00

Great to see some movement in this area for Rust. Compared to Zig, which had this from day 1, it may be hard to adjust Rust (the way it was hard for Go to adopt generics).

Diggsey

I think most of the challenges are around making custom allocators safe without harming the ergonomics, which Zig hasn't solved.

If Rust was content to have use of custom allocators be unsafe it would be trivial to add them (since you could just add new variants of allocating methods that take an allocator as a parameter).
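A minimal sketch of what such an "allocator as a method parameter" API could look like, assuming made-up names throughout (`CountingHeap`, `UnmanagedBox`, `new_in`, `free_in` are not std items). The container stores no allocator at all; the caller passes one in on creation and again on release, and nothing but documentation ties the two calls together.

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};
use std::cell::Cell;
use std::ptr::NonNull;

// Hypothetical allocator instance: forwards to the global heap and tracks live blocks.
struct CountingHeap {
    live: Cell<usize>,
}

impl CountingHeap {
    fn allocate(&self, layout: Layout) -> NonNull<u8> {
        assert!(layout.size() > 0, "zero-sized types are not handled in this sketch");
        // SAFETY: `layout` has a non-zero size, as checked above.
        let raw = unsafe { alloc(layout) };
        let ptr = NonNull::new(raw).unwrap_or_else(|| handle_alloc_error(layout));
        self.live.set(self.live.get() + 1);
        ptr
    }

    /// Safety: `ptr` must come from `allocate` on this heap with the same `layout`.
    unsafe fn deallocate(&self, ptr: NonNull<u8>, layout: Layout) {
        self.live.set(self.live.get() - 1);
        // SAFETY: guaranteed by this function's contract.
        unsafe { dealloc(ptr.as_ptr(), layout) }
    }
}

// Box-like type that does not remember which heap it came from.
struct UnmanagedBox<T> {
    ptr: NonNull<T>,
}

impl<T> UnmanagedBox<T> {
    fn new_in(value: T, heap: &CountingHeap) -> Self {
        let ptr = heap.allocate(Layout::new::<T>()).cast::<T>();
        // SAFETY: `ptr` is freshly allocated, properly aligned, and sized for `T`.
        unsafe { ptr.as_ptr().write(value) };
        UnmanagedBox { ptr }
    }

    fn get(&self) -> &T {
        // SAFETY: `ptr` holds an initialized `T` until `free_in` consumes `self`.
        unsafe { self.ptr.as_ref() }
    }

    /// Safety: `heap` must be the heap that was passed to `new_in`.
    /// The compiler cannot check this, which is the unsafe part of the design.
    unsafe fn free_in(self, heap: &CountingHeap) {
        // SAFETY: the value is initialized and, per the contract, owned by `heap`.
        unsafe {
            std::ptr::drop_in_place(self.ptr.as_ptr());
            heap.deallocate(self.ptr.cast(), Layout::new::<T>());
        }
    }
}
```

In this toy every heap forwards to the global allocator, so passing the wrong heap would only corrupt the counters. With real arenas or pools the same mistake would be a memory-safety bug, and forgetting to call `free_in` leaks.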

hansvm

Are you just reminding us that Rust does some checks that Zig doesn't, or are you saying that there's some particular footgun in their allocators above and beyond the fact that UAF and other memory bugs are writeable in general?

Diggsey

Neither, I was disagreeing with the comparison to Go and generics.

Generics in Go don't add anything beyond what generics already do in other languages, so the challenge with bringing generics to Go is "how do we adapt the language to support a feature that already exists in other languages and is generally well understood". Bringing generics to a language that wasn't designed with them in mind has often resulted in a sub-optimal implementation (eg. Java vs C#).

On the other hand, "safe custom allocators" are not a feature that any language (to my knowledge) has solved. It's not as though this was an oversight in Rust's initial design: using a custom allocator in an unsafe context has always been possible in Rust, and it's too early to say whether bringing this feature into the language later will result in a similarly sub-optimal design: in order to be sub-optimal there would have to exist some better solution out there, and there currently doesn't.

_0w8t

Zig's approach essentially parametrizes instances of containers on the instance of the allocator. To properly support that in the type system one needs dependent types.

In theory that can be done, but the consequences for the compilation time will be extreme as the compiler becomes essentially a generic theorem prover.

Plus the noise from the proofs in the sources will be much bigger than that of type parameters.

throwaway17_17

I’m pretty sure the ‘logic’ required for parameterizing container types over allocators can be restricted in a wide majority of cases to a linear set which would allow for refinement types to be used. This would eliminate the requirement for full theorem proving.

In fact, it would probably be acceptable to insist on keeping the scope of parameterization limited to the linearly determinable set of values and operations on those values.

throwawaymaths

I think realistically if you want to prove safety of memory provenance in zig, you assume that allocators are sound, and you just track the lifetime of the memory from alloc/create to free/destroy and call it a day. This is probably "good enough", and in rust you're assuming the allocation is sound as well, it's just implicit.

_0w8t

The article assumes that storing an allocator together with a container inevitably bloats the container by 1-2 CPU words. This is not necessarily so. One can require that it should be possible to get the allocator from a pointer it allocates. With many allocator designs this will cost about one CPU word per CPU page, which is trivial.

This still requires parametrizing the container on the type of the allocator, but there will be no penalty on the size of the containers and the API will be safe.
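One way the "recover the allocator from the pointer" idea could be sketched, under stated assumptions: pages are 4096 bytes and page-aligned, and the first word of each page holds an owner id standing in for a pointer to the allocator. `Page`, `slot`, and `owner_of` are hypothetical names for illustration.

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};
use std::mem::size_of;

// Assumed page size; real allocators would query or configure this.
const PAGE_SIZE: usize = 4096;

fn page_layout() -> Layout {
    Layout::from_size_align(PAGE_SIZE, PAGE_SIZE).unwrap()
}

// One page-aligned block whose first word records which allocator owns it.
struct Page {
    base: *mut u8,
}

impl Page {
    fn new(owner_id: usize) -> Page {
        // SAFETY: the layout has a non-zero size.
        let base = unsafe { alloc(page_layout()) };
        if base.is_null() {
            handle_alloc_error(page_layout());
        }
        // SAFETY: `base` is page-aligned and the page has room for one `usize` header.
        unsafe { (base as *mut usize).write(owner_id) };
        Page { base }
    }

    // Hands out an interior pointer past the header (no real bookkeeping in this sketch).
    fn slot(&self, offset: usize) -> *mut u8 {
        assert!(offset >= size_of::<usize>() && offset < PAGE_SIZE);
        // SAFETY: `offset` stays inside the page, as checked above.
        unsafe { self.base.add(offset) }
    }
}

impl Drop for Page {
    fn drop(&mut self) {
        // SAFETY: `base` was allocated in `new` with `page_layout()`.
        unsafe { dealloc(self.base, page_layout()) }
    }
}

/// Safety: `ptr` must point into a live `Page`.
unsafe fn owner_of(ptr: *const u8) -> usize {
    // Round the address down to the start of its page to find the header.
    let base = ptr.wrapping_sub(ptr as usize & (PAGE_SIZE - 1));
    // SAFETY: per the contract, `base` is the start of a live page with a header word.
    unsafe { (base as *const usize).read() }
}
```

A container built on this scheme stores only its data pointer and calls something like `owner_of` when it needs to free or grow, at the cost of one header word per page.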

stephc_int13

I think that avoiding global allocation and lifetime management is good design and a way to make memory management as simple and flexible as it should be.

But, from my own work in the field, I came to the conclusion that there is a better way than passing custom allocators: passing containers instead.

In practice this is almost the same thing (except that containers are made to be easily traversed) but semantically there is a subtle and positive difference.

The goal is to avoid confusing the allocating policy (let's say arena versus heap) with the allocator instance.

conaclos

What do you mean by Containers?

A good abstraction for passing allocators could be effects and effect handlers.

stephc_int13

A container is basically a structure holding a collection.

In that case, the container also acts as an allocator (the feature can be embedded directly or not)

The only practical difference is that a container should provide some facilities for traversal, while the allocator is only taking care of allocating and freeing.

From my experience, I very rarely allocate things without keeping them in a kind of container, it can be something as simple as an array of references.

phkahler

How much better is Rust's default than C++? I would expect it to be better since they have all that nice lifetime information on every piece of data. It seems like some nice things could be done with that. Are they?

LegionMammal978

By default, Rust just delegates to the ordinary libc malloc(), realloc(), and free(). In safe code, allocations are managed through RAII, just like in modern C++. Box works like unique_ptr, Arc works like shared_ptr, Vec works like vector, and so on. Lifetime annotations are just a way for the programmer to tell the compiler that none of the rules are being broken: they don't actually affect the semantics of the compiled program, they just make it fail to compile if there is an error. The underlying UB rules (except for those around unique references) are not much stricter than those required by C++; meanwhile, the lifetime rules are fairly strict, but the compiler cannot optimize based on them.

At best, the lifetime system could allow you to use a faster design for your library, by placing tighter compile-time constraints on allowed use cases than you could have feasibly documented in a C++ library's interface. But examples of this are not too common in my experience, since most find it acceptable for C++ libraries not to accommodate sufficiently weird use cases.

cbarrick

My problem with the allocator-api is the "big box" problem.

That is, Box<T, A> is a struct containing both a pointer to T and an allocator handle A.

With an arena allocator, the handle A would be a pointer to the arena. But when using an arena, you don't actually need the box to retain the A after the initial allocation. Deallocation is a no-op; storing A is just wasting 8 bytes.

To get around this, Bumpalo (a prominent arena allocator) defines its own box type rather than using the std Box. I'm not sure how I feel about that solution.
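To make the alternative concrete, here is a rough sketch of a box type that carries only a lifetime instead of an allocator handle. This is not Bumpalo's actual implementation: `ArenaBox` and `alloc_in` are hypothetical, and a single `MaybeUninit<T>` slot stands in for the arena's memory. Dropping the box runs the value's destructor but frees nothing, which matches the "deallocation is a no-op" case.

```rust
use std::mem::MaybeUninit;
use std::ops::Deref;

// Hypothetical arena-style box: one borrowed pointer, no allocator handle stored.
struct ArenaBox<'a, T>(&'a mut T);

// Stand-in for an arena allocation: places `value` into caller-provided storage.
fn alloc_in<'a, T>(slot: &'a mut MaybeUninit<T>, value: T) -> ArenaBox<'a, T> {
    ArenaBox(slot.write(value))
}

impl<T> Deref for ArenaBox<'_, T> {
    type Target = T;

    fn deref(&self) -> &T {
        &*self.0
    }
}

impl<T> Drop for ArenaBox<'_, T> {
    fn drop(&mut self) {
        let p: *mut T = &mut *self.0;
        // SAFETY: the value was initialized in `alloc_in`, this box has exclusive
        // access to it, and the `MaybeUninit` storage will not drop it again.
        // Only the destructor runs; the memory stays with the arena.
        unsafe { std::ptr::drop_in_place(p) }
    }
}
```

For sized `T` the box is exactly one pointer wide, and the borrow checker ties it to the storage through `'a`, so it cannot outlive the arena.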

antimora

Does anyone know a custom allocator that allocates memory for the same pattern usage? I am trying to find a good allocator for tensor operations used in Burn's framework (https://github.com/burn-rs/burn). It uses the same vectors without any varying size changes.
