
How safe is Zig?

257 comments · June 23, 2022

AndyKelley

I have one trick up my sleeve for memory safety of locals. I'm looking forward to experimenting with it during an upcoming release cycle of Zig. However, this release cycle (0.10.0) is all about polishing the self-hosted compiler and shipping it. I'll be sure to make a blog post about it exploring the tradeoffs - it won't be a silver bullet - and I'm sure it will be a lively discussion. The idea is (1) escape analysis and (2) in safe builds, secretly heap-allocate possibly-escaped locals with a hardened allocator and then free the locals at the end of their declared scope.

nine_k

I would prefer the compiler to tell me: "Hey, this stack-allocated variable is escaping the function's scope, I can't do that! Allocate it somewhere outside the stack."

Maybe the compiler could offer me a simple way to fix the declaration somehow. But being explicit and transparent here feels important to me; if I wanted to second-guess the compiler and meditate over disassembly, I could pick C++.

flohofwoe

Since HN upvotes are invisible in the UI: +1. IMHO escaped locals should be an error, but not a hidden allocation.

randyrand

I feel like the most user-friendly solution for use-after-free is to just use reference counting. Basically, copy Objective-C's zeroing "weak" reference design.

For every allocation, also store on the heap a "reference object" that keeps track of this new reference.

  struct reference_t {
      // the pointer returned by malloc
      void* ptr;
      // the stack/heap location that ptr was written to,
      // i.e. the reference location
      void* referenceAddr;
  };
Reference tracking: every time ptr is copied or overwritten, create and delete reference objects as needed.

If ptr is freed, visit every reference and set *referenceAddr = NULL, turning all of these references into null pointers.
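A minimal sketch of that nulling walk in Zig (hypothetical helper, not a real std API):

  const std = @import("std");

  // All locations currently known to hold a pointer to one allocation.
  const RefList = std.ArrayList(*?*anyopaque);

  // On free, overwrite every tracked reference with null so that a stale
  // dereference fails loudly instead of silently reading freed memory.
  fn nullAllReferences(refs: *RefList) void {
      for (refs.items) |ref_addr| {
          ref_addr.* = null;
      }
      refs.clearRetainingCapacity();
  }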

kaba0

Reference counting is an incomplete solution in itself - it leaks circular references, and in multithreaded programs it is quite slow, though some optimizations can be done statically to improve on that.

All in all, even a mark-and-sweep will likely be faster (in throughput at least), let alone a good GC.

randyrand

Circular references wouldn’t matter.

The programmer would still decide when to free the object, not some automatic system. Manual memory management, with Ref Counting just to add some additional Use After Free checks.

jhgb

Wouldn't this, if applied by default, complicate Zig's easy C interoperability? I don't quite see how plain C code could play well with this.

skullt

Does that not contradict the Zig principle of no hidden allocations?

kristoff_it

I don't know the precise details of what Andrew has in mind but the compiler can know how much memory is required for this kind of operation at compile time. This is different from normal heap allocation where you only know how much memory is needed at the last minute.

At least in simple cases, this means that the memory for escaped variables could be allocated all at once at the beginning of the program, not too differently from how the program allocates memory for the stack.

messe

Static allocation at the beginning of the program like that can only work for single threaded programs with non-recursive functions though, right?

I’d hazard a guess that the implementation will rely on use-after-free faulting, meaning that the use of any escaped variable will fault rather than corrupting the stack.

remexre

Could this be integrated into the LLVM SafeStack pass? (I don't know how related Zig still is to LLVM, or if your thing would be implemented there.)

randyrand

Just another idea for use after free:

What If we combined the 'non-repeating' malloc idea with 128-bit uuids?

malloc would just return a 128-bit uuid, and to get to the data ptr you'd need to consult a hash table.

  dataPtrArr[hash(uuid)].dataPtr = dataPtr
We'd check whether it's been freed with:

  dataPtrArr[hash(uuid)].uuid == uuid

flohofwoe

At that point it's better to use 'tagged index handles', but IMHO that's outside the scope of the language (but maybe an option for the stdlib):

https://floooh.github.io/2018/06/17/handles-vs-pointers.html

(But for this, an "auto-decaying pointer" would be nice: one which cannot be stored outside the stack and cannot be "carried across" function calls.)
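A rough Zig sketch of the generation check at the heart of that approach (names are illustrative, simplified from the article):

  const Handle = struct { index: u32, generation: u32 };

  fn Pool(comptime T: type) type {
      return struct {
          slots: []T,
          generations: []u32, // bumped each time a slot is recycled

          // A stale handle's generation no longer matches the slot's,
          // so the lookup fails instead of aliasing a recycled object.
          fn get(self: @This(), h: Handle) ?*T {
              if (self.generations[h.index] != h.generation) return null;
              return &self.slots[h.index];
          }
      };
  }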

randyrand

Nice article. Similar idea, of course. Yes, that design could work, though all objects in a pool must be the same size. Not sure how that would be better.

pjmlp

At that point it is much simpler to introduce automatic memory management in some form.

That solution is basically how platforms like Psion or Symbian used handles due to memory constraints.

randyrand

IMO this is still simpler than automatic memory management, and the runtime costs are mostly fixed and predictable.

You also don't need to worry about ref cycles or GC pauses.

anonymoushn

I would like Zig to do more to protect users from dangling stack pointers somehow. I am almost entirely done writing such bugs, but I catch them in code review frequently, and I recently moved these lines out of main() into some subroutine:

  var fba = std.heap.FixedBufferAllocator.init(slice_for_fba);
  gpa = fba.allocator();
slice_for_fba is a heap-allocated byte slice. gpa is a global. fba was local to main(), which coincidentally made it live as long as gpa, but then it became local to some setup subroutine called by main(). gpa contains an internal pointer to fba, so you run into trouble pretty quickly when a later allocation follows a pointer to whatever now occupies that part of the stack, instead of your FixedBufferAllocator.

Many of the dangling stack pointers I've caught in code review don't really look like the above. Instead, they're dangling pointers that are intended to be internal pointers, so they would be avoided if we had non-movable/non-copyable types. I'm not sure such types are worth the trouble otherwise, though. Personally, I've just stopped making structs that use internal pointers. In a typical case, instead of having an internal array and a slice into the array, a struct can have an internal heap-allocated slice and another slice into that slice. Like I said, I'd like these thorns to be less thorny somehow.
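Condensed, the bug looked roughly like this (names simplified):

  const std = @import("std");

  var gpa: std.mem.Allocator = undefined; // global

  fn setup(slice_for_fba: []u8) void {
      var fba = std.heap.FixedBufferAllocator.init(slice_for_fba);
      gpa = fba.allocator(); // gpa now holds a pointer to the local fba...
  } // ...which dies here; later gpa allocations read a dead stack frame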

10000truths

Alternatively, use offset values instead of internal pointers. Now your structs are trivially relocatable, and you can use smaller integer types instead of pointers, which allows you to more easily catch overflow errors.
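For instance, a sketch of the offset-based layout (field names are illustrative):

  // Trivially relocatable: nothing points into the struct itself.
  const Record = struct {
      buf: []u8, // heap-allocated storage
      name_off: u32, // offset into buf instead of an internal pointer
      name_len: u32,

      fn name(self: Record) []const u8 {
          return self.buf[self.name_off..][0..self.name_len];
      }
  };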

anonymoushn

This is a good idea, but native support for slices tempts one to stray from the path.

alphazino

> so they would be avoided if we had non-movable/non-copyable types.

There is a proposal for this that was accepted a while ago[0]. However, the devs have been focused on the self-hosted compiler recently, so they're behind on actually implementing accepted proposals.

[0] https://github.com/ziglang/zig/issues/7769

throwawaymaths

This. I believe it is in the works, but postponed to finish up self-hosted.

https://github.com/ziglang/zig/issues/2301

avgcorrection

A meta point to make here but I don’t quite understand the pushback that Rust has gotten. How often does a language come around that flat out eliminates certain errors statically, and at the same time manages to stay in that low-level-capable pocket? And doesn’t require a PhD (or heck, a scholarly stipend) to use? Honestly that might be a once in a lifetime kind of thing.

But not requiring a PhD (hyperbole) is not enough: it should be Simple as well.

But unfortunately Rust is (mamma mia) Complex and only pointy-haired Scala type architects are supposed to gravitate towards it.

But think of what the distinction between no-found-bugs (testing) and no-possible-bugs (a certain class of bugs) buys you; you don’t ever have to even think about those kinds of things as long as you trust the compiler and the Unsafe code that you rely on.

Again, I could understand if someone thought that this safety was not worth it if people had to prove their code safe in some esoteric metalanguage. And if the alternatives were fantastic. But what are people willing to give up this safety for? A whole bunch of new languages which range from improved-C to high-level languages with low-level capabilities. And none of them seem to give some alternative iron-clad guarantees. In fact, one of their selling points is mere optionality: you can have some safety and/or you can turn it off in release. So you get runtime checks which you might (culturally/technically) be encouraged to turn off when you actually want your code to run out in the wild, where users give all sorts of unexpected input (not just your "asdfg" input) and get your program into weird states that you didn't have time to even think of. (Of course Rust does the same thing with certain non-memory-safety bug checks like integer overflow.)

nyanpasu64

Unsafe Rust is an esoteric language without iron-clad guarantees, and type-level programming and async Rust is an esoteric metalanguage (https://hirrolot.github.io/posts/rust-is-hard-or-the-misery-...). For example, matklad made a recent blog post on "Caches In Rust" (https://matklad.github.io/2022/06/11/caches-in-rust.html). The cache is built around https://docs.rs/elsa, which is built around https://docs.rs/stable_deref_trait/latest/stable_deref_trait..., which is unsound for Box and violates stacked borrows in its current form: https://github.com/Storyyeller/stable_deref_trait/issues/15

There is a recurring trend of sound C programs turning into unsound Rust programs, because shared mutability is often necessary but it's difficult to avoid creating &mut, and Stacked Borrows places strict conditions on constructing &mut T (they invalidate some but not all aliasing *const T).

staticassertion

I don't think this is a great example of a "sound C program turning into unsound Rust program". The crate isn't "unsound" in the way a C program would be - it's unsound in the sense that, given either 'unsafe' elsewhere or changes to how Rust constructs work (which are not guaranteed), a consumer of this crate could accidentally violate one of the necessary guarantees.

For a Rust program the bar is "has to be safe, even if some other part of the program uses unsafe". That seems like it's arguably a higher bar than C where everything is already "unsafe" in that same way.

haberman

I invested a lot of time porting some parsing code I had written to Rust, with the vision that Rust is the memory-safe future. The code I was porting from used arenas, so I tried to use arenas in Rust also.

Using arenas required a bunch of lifetime annotations everywhere, but I was happy to do it if I could get provable memory safety.

I got everything working, but the moment I tried to wrap it in Python, it failed. The lifetime annotation on my struct was a problem. I tried to work around this by using ouroboros and the self-referencing struct pattern. But then I ran into another problem: the Rust arena I was using (Bumpalo) is not Sync, which means references to the arena are not Send. All of my arena-aware containers were storing references to the Arena and therefore were not Send, but wrapping in Python requires it to be Send. I wrote more about these challenges here: https://blog.reverberate.org/2021/12/19/arenas-and-rust.html

You might say "well don't use an arena, use Box, Rc, etc." But now you're telling me to write more complicated and less efficient code, just to make it work with Rust. That is a hard pill to swallow for what is supposed to be a systems language.

himujjal

I did the Crafting Interpreters book to learn Rust after I did The Rust book.

I faced the same problem as you. Somehow it felt like Arenas went beyond Rust's philosophy and added huge amounts of complexity to the interpreter.

Tree traversal and mutating the environment of a `block` internally was an issue I spent like 2-3 days on. I was porting Java code to Rust, after all. Somehow got it working in a Rust way. I used unsafe in one place. But I was left heavily unsatisfied. Something about graphs/trees and Rust doesn't match up.

kaba0

I've written a JVM in Rust where I was pretty much unable to work without unsafe, and while I know the general consensus is that it is a sin to use, I managed to get away with a few usages wrapped tightly inside a safe API. Sure, I had memory problems while writing it, but with Miri and some debug assertions they were not hard to hunt down, and I really only had to get those few lines of code inside unsafe blocks right.

What I'm trying to say with all that: do not be afraid to use unsafe in Rust. It is part of the language for a reason. Sure, do use Arc for some non-performance-critical whatever, because it frankly doesn't really matter. But where it matters, and you decided to use a low-level language, then go for unsafe if that's the only reasonable way. The result will still be much safer than the other low-level languages. I believe the problem here is the same as what C++ tried to achieve: making people believe it is a high-level language. That is just dishonest, and really should not be the goal of Rust.

staticassertion

You can tell pyo3 that your type isn't Send and then it'll panic if the object is accessed from multiple threads. Given that that's the only safe option, that seems fine? You say that that's not acceptable for a production library but I don't see the issue.

You have the same restrictions in C++ except with worse consequences.

haberman

> You have the same restrictions in C++ except with worse consequences.

In C++ it is safe because the arena is only used from one thread at a time.

To model this C++ pattern in Rust, what I would really want is:

1. Arena should be Sync, and not use interior mutability.

2. Arena::alloc() should do a dual borrow: (a) a mut borrow of the Arena metadata, only for the duration of the alloc() call, and (b) a non-mut (shared) borrow of the Arena data.

Because this kind of split borrow cannot be expressed in Rust AFAICS, (2) is not possible, so (1) is not feasible. This forces Bumpalo to be !Sync, which makes a direct Rust port of the C++ pattern impossible.

I've heard this called the "factory problem" for Rust: you cannot easily make a factory type in Rust that returns references, because if the create() operation mutates the factory, then the returned reference will have a mutable borrow on the factory.

The alternative would be to make a truly thread-safe arena/factory, which could be Sync with interior mutability, but that is an efficiency compromise due to synchronization overhead.

himujjal

I understand where you are coming from. Rust and graphs/trees are a hard problem. Somehow it goes beyond the way we usually think about graphs.

avgcorrection

I respect the effort. I won’t argue against such hard-earned experience.

gnuvince

Rust has been my primary language for the past 5 years, but it's moving in a direction that gets it farther away from my own values about what software ought to be like. As more features are added to the language, the ways they interact with each other increases the overall complexity of the language and it becomes hard to keep up.

I really like the safety guarantees that Rust provides and I want to keep enjoying them, but the language -- and more importantly, its ecosystem -- is moving from something that was relatively simple to a weird mish-mash of C++, JavaScript, and Haskell, and I'm keeping an eye out for a possible escape hatch.

Zig, Odin, or Hare are not on the same plane of existence as Rust when it comes to out-of-the-box safety (or, at the moment, out-of-the-box suitability for writing production-grade software), but they are simpler and intend to remain that way. That really jibes with my values. Yes, this means that some of the complexity of writing software is pushed back onto me, the programmer, but I feel that I have a better shot at writing good software with a simple language than with a complex language where I only superficially understand the features.

brabel

> have a better shot at writing good software with a simple language than with a complex language where I only superficially understand the features.

That's exactly how I feel too. No matter how much I use Rust, I find it nearly impossible to claim I understand a lot of its features. Zig OTOH is basically what C would look like if designed today! The improvements it offers over C are very compelling to me... it remains to be seen if the lack of formal guarantees that Rust gives still makes Zig programs similarly as buggy as C programs, but my current impression is that Zig programs are going to be very far away from C's in terms of safety issues... the features the blog post mention go a long way.

socialdemocrat

I think you are too dismissive of the importance of simplicity. Programming is hard. That Rust takes away certain problems doesn't change that. A lot of coding is just reading and understanding some code. If you have problems understanding some code, then that is also code you are more likely to not catch bugs in.

A compiler cannot be a substitute for your brain. The ability to read code and think clearly about it is a massively important feature, because humans at the end of the day are the ones who have to understand code and fix it.

It depends on the person. Programmers are different. Rust works great for some. To me it looks too much like C++, which is something I want to put behind me. I know it is a different language, but it has a lot of that same focus as C++ that leads to slow compilers and complex-looking code.

If I was younger I might have put in the effort, but I am not willing to make the same wrong bet I did with C++. I sunk so much time into perfecting C++ skills and realizing afterwards when using other languages that it was a huge waste.

gurjeet

> ... I am not willing to make the same wrong bet I did with C++. I sunk so much time into perfecting C++ skills ...

What are your top complaints about C++? What parts/patterns are wasteful, and must be avoided?

I have encountered C++ a couple of times in my career. And both those times I was barely able to survive in the short periods of time I spent in those jobs. I'm pretty good at C, but for the life of me, I just can't deal with the hidden behaviours of C++.

So far, I have seen only one example of well-written, understandable C++ code: LLVM. I dabbled in LLVM for a side project, and the tutorials on writing compiler passes, as well as LLVM's own code, seem to use only the rudimentary features of C++ (primarily, single inheritance). And this absence of complex C++ features made me feel comfortable looking at, reading, and understanding the LLVM code.

I am of the firm belief now that good code can be written in any bad language, and bad code can be written in any good language. Perhaps those couple of times that I encountered difficult C++ codebases, they were just instances of bad code, and should not be used as an indictment of the C++ language.

Good code => readable, understandable, maintainable, extensible, rewritable. Bad code => !Good code.

SubjectToChange

> LLVM's own code, seems to use only the rudimentary features of C++ (primarily, single-inheritance). And this absence of complex C++ features made me feel comfortable looking at, reading, and understanding the LLVM code.

Uh, what parts of LLVM were you reading? Because LLVM uses pretty much every C++14 feature allowed by its coding standard. The fact that you felt like no complex machinery was being used is actually a testament to the power of abstraction in C++.

avgcorrection

Readable code is important? Then keep in mind the context: low-level programming, and writing Safe Rust versus writing in some not-quite-memory-safe language because one would rather be bitten by undefined behavior now and again than have to learn the borrow checker.

Knowing for sure that the code you write is at least memory safe is a certain kind of readability win and I don’t see how anyone can claim that it’s not.

kristoff_it

> Of course Rust does the same thing with certain non-memory-safety bug checks like integer overflow.

The problem with getting lost too much in the ironclad certainties of Rust is that you start forgetting that simplicity (papa pia) protects you from other problems. You can get certain programs into pretty messed-up states with an unwanted wraparound.

Programming is hard. Rust is cool, very cool, but it's not a universal silver bullet.

avgcorrection

Nothing Is Perfect is a common refrain and non-argument.

If option A has 20 defects and option B has a superset of 25 defects, then option A is better. The fact that option A has defects at all is completely beside the point with regard to relative measurements.

Karrot_Kream

But if Option A has 20 defects and takes a lot of effort to go down to 15 defects, yet Option B has 25 defects and offers a quick path to go down to 10 defects, then which option is superior? You can't take this in isolation. The cognitive load of Rust takes a lot of defects out of the picture completely, but going off the beaten path in Rust takes a lot of design and patience.

People have been fighting this fight forever. Should we use static types which make it slower to iterate or dynamic types that help converge on error-free behavior with less programmer intervention? The tradeoffs have become clearer over the years but the decision remains as nuanced as ever. And as the decision space remains nuanced, I'm excited about languages exploring other areas of the design space like Zig or Nim.

kristoff_it

Zig keeps overflow checks in the main release mode (ReleaseSafe), Rust defines ints as naturally wrapping in release. This means that Rust is not a strict superset of Zig in terms of safety, if you want to go down that route.

I personally am not interested at all in abstract discussions about sets of errors. Reality is much more complicated; each error needs to be evaluated with regard to the probability of causing it and the associated cost. Both things vary wildly depending on the project at hand.

coldtea

>If option A has 20 defects and option B has the superset of 25 defects then option A is better

Only if "defect count" is what you care for.

What if you don't give a fuck about defect count, but prefer simplicity to explore/experiment quickly, ease of use, time to market, and so on?

Klonoar

>A meta point to make here but I don’t quite understand the pushback that Rust has gotten.

The non-CS "human" answer to this is that so much of tech and programming is unfortunately tied to identity. There are developers who view their choices as bordering on religion (from editors to languages to operating systems and so on) and across the entire industry you can see where some will take the slightest hint that things could be better as an affront to their identity.

The more that Rust grows and winds up in the industry, the more this will continue to happen.

avgcorrection

Yes. I’m as guilty of this as anyone else.

the__alchemist

This is a concise summary of why I'm betting on Rust as the future of performant and embedded computing. You or I could poke holes in it for quite some time. Yet, I imagine the holes would be smaller and less numerous than in any other language capable in these domains.

I think some of the push back is from domains where Rust isn't uniquely suited. Eg, You see a lot of complexity in Rust for server backends; eg async and traits. So, someone not used to Rust may see these, and assume Rust is overly complex. In these domains, there are alternatives that can stand toe-to-toe with it. In lower-level domains, it's not clear there are.

cogman10

> I think some of the push back is from domains where Rust isn't uniquely suited. Eg, You see a lot of complexity in Rust for server backends; eg async and traits. So, someone not used to Rust may see these, and assume Rust is overly complex. In these domains, there are alternatives that can stand toe-to-toe with it. In lower-level domains, it's not clear there are.

The big win for Rust in these domains is startup time, memory usage, and distributable size.

It may be that these things outweigh the easier programming of Go or Java.

Now if you have a big long-running server with lots of hardware at your disposal, then Rust doesn't make a whole lot of sense. However, if you want something like an AWS Lambda or rapid up/down scaling based on load, Rust might start to look a lot more tempting.

modeless

It's pretty simple. Rust's safety features (and other language choices) have a productivity cost. For me I found the cost surprisingly high, and I'm not alone (though I'm sure I'll get replies from people who say the borrow checker never bothers them anymore and made them a better programmer, let's just agree there's room to disagree).

Although I'm a big fan of safety, since experiencing Rust my opinion is that low-pause GC is a better direction for the future of safe but high-performance programming. And there's also a niche for languages that aren't absolutely safe, which I think Zig is a great fit for.

avgcorrection

Please read the fricking context.

My comment was about preferring other new, low-level languages over Rust when they don’t give the same safety guarantees. If you can deal with a GC then fine—my comment has got nothing to do with that.

So it was much, much more narrow than making a case for Rust in general.

Rust and GC both eliminate certain defects. And if you can use a GC then you don’t need Rust (w.r.t. memory safety).

Admittedly maybe I could have made it more clear that my comment does not make an argument against new low-level capable languages when used with some kind of automatic memory management scheme, like I guess Nim.

crabbygrabby

After having seen GC after GC fail to live up to expectations... I'm still voting for Rust. So much more control over whether you even want to allocate or not. I know where you are coming from, but I see it differently, I guess.

modeless

Whether you allocate or not is a property of the language, not the GC. A lot of GC'd languages encourage or even force allocations all over the place. But maybe we could do better in a new language.

dleslie

And here is the table with Nim added; though potentially many GC'd languages would be similar to Nim:

https://uploads.peterme.net/nimsafe.html

Edit: noteworthy addendum: the ARC/ORC features have been released, so the footnote is now moot.

3a2d29

Seeing Nim's danger mode made me think: shouldn't Rust's unsafe be added?

Seems inaccurate to display Rust as safe and not include the thing that actually allows memory bugs to be found in public crates.

IshKebab

I don't know why Rust gets "runtime" and Nim gets "compile time" for type confusion?

shirleyquirk

Yes, for tagged unions specifically (which the linked post refers to for that row), Nim raises an exception at runtime when trying to access the wrong field (or trying to change the discriminant).

jewpfko

Thanks! I'd love to see a Dlang BetterC column too

Snarwin

Here's a version with D included:

https://gist.github.com/pbackus/0e9c9d0c83cd7d3a46365c054129...

The only difference in BetterC is that you lose access to the GC, so you have to use RC if you want safe heap allocation.

kzrdude

I guess Rust should say "wraps" for integer overflow as well, as that's what it does in the default release profile.

verdagon

A lot of embedded devices and safety-critical software don't even use a heap; instead they use pre-allocated chunks of memory whose size is calculated beforehand. It's memory safe, and has much more deterministic execution time.

This is also a popular approach in games, especially ones with entity-component-system architectures.

I'm excited about Zig for these use cases especially; it can be a much easier approach, with much less complexity than using a borrow checker.
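In Zig this style falls out naturally from a fixed buffer; a minimal sketch (the 1 MiB budget is an assumed, precomputed figure):

  const std = @import("std");

  var heap_buf: [1024 * 1024]u8 = undefined; // entire memory budget, sized up front

  pub fn main() !void {
      var fba = std.heap.FixedBufferAllocator.init(&heap_buf);
      const allocator = fba.allocator();
      // Every allocation draws from the fixed budget; exceeding it fails
      // deterministically with error.OutOfMemory instead of growing the heap.
      const entities = try allocator.alloc(u32, 10_000);
      _ = entities;
  }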

jorangreef

This is almost what we do for TigerBeetle, a new distributed database being written in Zig. All memory is statically allocated at startup [1]. Thereafter, there are zero calls to malloc() or free(). We run a single-threaded control plane for a simple concurrency model, and because we use io_uring, multithreaded I/O is less of a necessary evil than it used to be.

I find that the design is more memory efficient because of these constraints; for example, our new storage engine can address 100 TiB of storage using only 1 GiB of RAM. Latency is predictable and gloriously smooth, and the system overall is much simpler and fun to program.

[1] “Let's Remix Distributed Database Design” https://www.youtube.com/channel/UC3TlyQ3h6lC_jSWust2leGg

infamouscow

> Latency is predictable and gloriously smooth, and the system overall is much simpler and fun to program.

This has also been my experience building a database in Zig. It's such a joy.

catlifeonmars

> for example, our new storage engine can address 100 TiB of storage using only 1 GiB of RAM.

I’m a little confused by this statement. I assume by “address” you mean indexing, and the size of an index is related to the number of entries, not the amount of data being indexed. (For example, you could trivially address 100TiB using 1 address width of memory if all 100TiB belongs to the same key).

jorangreef

> I’m a little confused by this statement. I assume by “address” you mean indexing, and the size of an index is related to the number of entries, not the amount of data being indexed.

Thanks for the question!

What's in view here is an LSM-tree database storage engine. In general, these typically store keys between 8 and 32 bytes and values up to a few MiB.

In our case, the question is how much memory is required for the LSM-tree to be able to index 100 TiB worth of key/value storage, where:

  * keys are between 8 and 32 bytes,
  * values are between 8 and 128 bytes,
  * keys and values are stored in tables up to 64 MiB,
  * each table requires between 128 and 256 bytes of metadata to be kept in memory,
  * auxiliary data structures such as a mutable/immutable table must be kept in memory, and where
  * all memory required by the engine must be statically allocated.
That's a lot of small keys and values!
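Back-of-the-envelope, taking the worst case from the numbers above:

  100 TiB / 64 MiB per table          = 1,638,400 tables
  1,638,400 tables * 256 B metadata   = ~400 MiB

So the per-table metadata alone fits comfortably inside the 1 GiB budget, leaving headroom for the mutable/immutable tables and the other auxiliary structures.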

Typically, a storage system might require at least an order of magnitude more than 1 GiB of memory to keep track of that many keys using an LSM-tree as index, even using dynamic allocation, which only needs to allocate as needed.

Another way to think of this is as a filesystem, since it's a very similar problem. Imagine you stored 100 TiB worth of 4096 byte files in ZFS. How much RAM would that require for ZFS to be able to keep track of everything?

pcwalton

Even in this environment, you can still have dangling pointers to freed stack frames. There's no way around having a proper lifetime system, or a GC, if you want memory safety.

verdagon

Yep, or generational references [0] which also protect against that kind of thing ;)

The array-centric approach is indeed more applicable at the high levels of the program.

Sometimes I wonder if a language could use an array-centric approach at the high levels, and then an arena-based approach for all temporary memory. Elucent experimented with something like this for Basil once [1] which was fascinating.

[0] https://verdagon.dev/blog/generational-references

[1] https://degaz.io/blog/632020/post.html

com2kid

> Yep, or generational references [0] which also protect against that kind of thing ;)

First off, thank you for posting all your great articles on Vale!

Second off, I just read the generational references blog post for the 3rd time and now it makes complete sense, like stupid obvious why did I have problems understanding this before sense. (PS: The link to the benchmarks is dead :( )

I hope some of the novel ideas in Vale make it out to the programming language world at large!

yw3410

It feels like it would work really well (you could even swap between arenas per frame). I've been wanting to try something similar but it's early days.

im3w1l

Well, if we get rid of not just the heap but the stack too... turn all variables into global ones, then it will be safe.

This means we lose thread safety and functions become non-reentrant (but easy to prove safe - just make sure the graph of A-calls-B is acyclic).

infamouscow

> Even in this environment, you can still have dangling pointers to freed stack frames.

How frequently does this happen in real software? I learned not to return pointers to stack allocated variables when I was 12 years old.

> There's no way around having a proper lifetime system, or a GC, if you want memory safety.

If you're building an HTTP caching program where you know the expiration times of objects, a Rust-style borrow-checker or garbage collector is not helping anyone.

seoaeu

> > Even in this environment, you can still have dangling pointers to freed stack frames.

> How frequently does this happen in real software? I learned not to return pointers to stack allocated variables when I was 12 years old.

This happens rarely. However, the reason it isn't an issue is that C programmers are (and have to be) extremely paranoid about this kind of thing.

Rust, however, lets you recklessly pass around pointers to local variables while guaranteeing that you won't accidentally use one as a return value. One example is scoped thread pools which let you spawn a bunch of worker threads and then pass them pointers to stack allocated variables that get concurrently accessed by all the threads. The Rust type system/borrow checker ensures both thread safety and memory safety.

Would you trust a novice C programmer to use something like that?

Arnavion

>I learned not to return pointers to stack allocated variables when I was 12 years old.

So, if you slip while walking today, does that mean you didn't learn to walk when you were one year old?

brundolf

Rust's borrow checker would be much calmer in these scenarios too, wouldn't it? If there are no lifetimes, there are no lifetime errors

thecatster

Rust is definitely different (and calmer imho) on bare metal. That said (as much of a Rust fanboy I am), I also enjoy Zig.

the__alchemist

Yep! We've entered a grey area, where some Rust embedded libs are expanding the definitions of memory safety, and of what the borrow checker should evaluate, beyond what you might guess. E.g. structs that represent peripherals are now checked for ownership, the intent being to prevent race conditions; and traits are used to enforce pin configuration.

snicker7

How exactly is pre-allocation safer? If you would ever like to re-use chunks of memory, then wouldn’t you still encounter “use-after-free” bugs?

verdagon

The approach can reuse old elements for new instances of the same type, so to speak. Since the types are the same, any use-after-free becomes a plain ol' logic error. We use this approach in Rust a lot, with Vecs.
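In Zig, the same pattern might look like this typed-pool sketch (hypothetical, using a free list of indices):

  const std = @import("std");

  // Slots are only ever reused for the same type, so a stale index is a
  // logic error rather than type confusion or heap corruption.
  fn TypedPool(comptime T: type) type {
      return struct {
          items: std.ArrayList(T),
          free: std.ArrayList(u32),

          const Self = @This();

          fn init(a: std.mem.Allocator) Self {
              return .{
                  .items = std.ArrayList(T).init(a),
                  .free = std.ArrayList(u32).init(a),
              };
          }

          fn create(self: *Self, value: T) !u32 {
              if (self.free.popOrNull()) |idx| {
                  self.items.items[idx] = value; // reuse a freed slot
                  return idx;
              }
              try self.items.append(value);
              return @intCast(u32, self.items.items.len - 1);
          }

          fn destroy(self: *Self, idx: u32) !void {
              try self.free.append(idx); // slot becomes reusable
          }
      };
  }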

olig15

But if you have a structure that contains offsets into another buffer somewhere, or an index, whatever - the wrong value here could be just as bad as a use-after-free. I don't see how this is any safer. If you use memory after freeing it with malloc, with any luck you'll hit a page fault and your app will crash. If you have an index/pointer to another structure, you could still end up reading past the end of that structure into the unknown.

kaba0

These are 1000 times worse than even a segfault. These are the bugs you won't notice until they crop up at a wildly different place, and you will have a very hard time tracking them back to their origin (slightly easier in Rust, as you only have to revalidate the unsafe parts, but it will still suck).

bsder

Normally you do this on embedded so that you know exactly what your memory consumption is. You never have to worry about Out of Memory and you never have to worry about Use After Free since there is no free. That memory is yours for eternity and what you do with it is up to you.

It doesn't, however, prevent you from accidentally scribbling over your own memory (buffer overflow, for example) or from scribbling over someone else's memory.

nine_k

No; every chunk is for single, pre-determined use.

Imagine all variables in your program declared as static. This includes all buffers (with indexes instead of pointers), all nested structures, etc.
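In Zig terms, something like this (the struct is illustrative):

  const Connection = struct { fd: i32 = -1, in_use: bool = false };

  // All state is static: fixed capacity, no heap, and indices rather than
  // pointers for cross-references.
  var connections = [_]Connection{.{}} ** 16;
  var io_buf: [4096]u8 = undefined;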

LAC-Tech

Safe enough. You can use `std.testing.allocator` and it will report leaks etc in your test cases.

What Rust does sounds like a good idea in theory. In practice it rejects too many valid programs, over-complicates the language, and makes me feel like a circus animal being trained to jump through hoops. Zig's solution is hands down better for actually getting work done; plus it's so dead simple to use arena allocation and fixed buffers that you're likely allocating a lot less in the first place.

Rust tries to make allocation implicit, leaving you confused when it detects an error. Zig makes memory management explicit but gives you amazing tools to deal with it - I have a much clearer mental model in my head of what goes on.

Full disclaimer, I'm pretty bad at systems programming. Zig is the only one I've used where I didn't feel like memory management was a massive headache.

Klonoar

>Zigs solution is hands down better for actually getting work done

Rust has seen significant usage in large companies; they wouldn't be using it unless it was usable for "real work".

>Full disclaimer, I'm pretty bad at systems programming. Zig is the only one I've used where I didn't feel like memory management was a massive headache.

I'd say this about Rust, though. Rust's mental model is very straightforward if you accept the borrow-checker and stop fighting it. Can you list any examples of what you think is a headache...?

>In practice it rejects too many valid programs, over-complicates the language, and makes me feel like a circus animal being trained to jump through hoops.

I've found that jumping through those hoops leads to things running in production that don't make me get up in the middle of the night. Can you show me a "valid program" that Rust rejects?

LAC-Tech

> Rust has seen significant usage in large companies; they wouldn't be using it unless it was usable for "real work".

I didn't say it wasn't usable. I said I found Zig more usable.

> I'd say this about Rust, though. Rust's mental model is very straightforward if you accept the borrow-checker and stop fighting it. Can you list any examples of what you think is a headache...?

Mate, I didn't start learning Rust in order to wage war against the borrow checker. I had no idea what the hell it wanted a lot of the time. Each time I fixed an error I thought I got it, and each time I was wrong. The grind got boring.

As for specific examples no, I've tried to put rust out of my mind. I certainly can't remember specific issues from 3 months ago.

> I've found that jumping through those hoops leads to things running in production that don't make me get up in the middle of the night. Can you show me a "valid program" that Rust rejects?

Yeah, that's how Rust is sold: the compiler is your friend, and stuff will compile and it will never fail.

In reality the compiler was so irritating I hardly got anything done at all. The output wasn't super reliable software, it was no software.

crabbygrabby

Sounds like you got annoyed trying to learn something new, quit, and decided it's not worth your time. When people tell you it takes a few weeks to get the hang of rust they aren't kidding. Most people aren't the exception to that, but once you do get it, it's really great... Not kidding about that either...

voidhorse

I agree with the parent. Rust is hard because it fundamentally inverts the semantics of pretty much every other programming language on earth by making move semantics the default instead of copying.

Yet there's no syntax to indicate this. Worse, actual copies are hidden behind a trait, and you have no way of knowing whether a particular externally defined type implements it or not outside of reading documentation. A lot of Rust's important mechanics are underrepresented syntactically, which makes the language harder to get used to, imo. I agree with the parent that in general it's better for things to be obvious as you're writing them. If Rust had syntax that screamed "you're moving this thing" or "you're copying this thing because it implements Copy", that'd be a lot easier to get used to than what beginners are currently stuck with, which is a cycle of "get used to the invisible semantics by having the compiler yell at you at build time until you've drilled it into your head past the years of accumulated contrary models". And as soon as you have to use another language this model becomes useless, so expertise in it does not translate to other domains (though that will hopefully change in the future).

brabel

> they wouldn't be using it unless it was usable for "real work".

We had to introduce Rust at work because we really needed some WASM functionality in one of our mostly JS frontends... anyway, I was excited about Rust and all and pushed the idea, implemented the whole thing and made presentations for other developers about Rust. I was thinking everyone would be as excited as I was and would jump at the chance of maintaining the Rust module.

In reality, only one of the 20+ devs ever even tried to touch the Rust code. Everyone else thought the code looked like Greek (no offence to my Greek friends!)... today when the code needs changes, I am pretty much the only one who can do it, or the other guy (who is more of a novice than me in Rust, so it takes him a lot longer to do anything, but at least there's someone else).

For reference: we write code in Java/Kotlin/Groovy/Erlang. So, we're not a system programming shop in any way, so I can't speak for places where C and C++ were previously being used.

Klonoar

I would be curious how your shop fares with Swift then, considering it may also look "like greek" to them.

(Legit point, no snark)

Measter

> Can you show me a "valid program" that Rust rejects?

    #[derive(Debug)]
    struct Foo {
        a: i32
    }
    
    fn thing(foo: &mut Foo) {
        match foo {
            f @ Foo { a } if *a > 5 => {
                println!("{:?}", f)
            }
            _ => {}
        }
    }
There's no reason it should reject that, as the use of the `a` reference doesn't interleave with the use of `f`.

Klonoar

By creating `f` you're in essence trying to borrow something that's mutably borrowed already, which the borrow checker doesn't allow. I guess I could see some logic for this being possible, but in practice I've never encountered this in any Rust codebase I've gone through.

The trivial example fix is just to... ensure it can copy, and tweak the match line:

    #[derive(Copy, Clone, Debug)]
    struct Foo {
        a: i32
    }
    
    fn thing(foo: &mut Foo) {
        match *foo {
            f @ Foo { a } if a > 5 => {
                println!("{:?}", f)
            }
        
            _ => {}
        }
    }
      

    fn main() {
        let mut x = Foo { a: 1 };
        thing(&mut x);
    }
If the struct was bigger and/or had types that couldn't copy, I'd refrain from trying to shoehorn matching like that entirely.

woodruffw

This was a great read, with an important point: there's always a tradeoff to be made, and we can make it (e.g. never freeing memory to obtain temporal memory safety without static lifetime checking).

One thought:

> Never calling free (practical for many embedded programs, some command-line utilities, compilers etc)

This works well for compilers and embedded systems, but please don't do it in command-line tools that are meant to be scripted against! It would be very frustrating (and a violation of the pipeline spirit) to have a tool that works well for `N` independent lines of input but not `N + 1` lines.

samatman

There are some old-hand approaches to this which work out fine.

An example would be a generous rolling buffer, with enough room for the data you're working on. Most tools which work on a stream of data don't require much memory; they're either doing a peephole transformation, or building up data with filtration and aggregation, or some combination.

You can't have a use-after-free bug if you never call free; treating the OS as your garbage collector for memory (not other resources, please) is fine.
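A Zig sketch of keeping a stream tool's memory bounded this way: one arena per input line, torn down after each record (processLine is a hypothetical stand-in):

  const std = @import("std");

  pub fn main() !void {
      const stdin = std.io.getStdIn().reader();
      var line_buf: [64 * 1024]u8 = undefined;
      while (try stdin.readUntilDelimiterOrEof(&line_buf, '\n')) |line| {
          // Everything allocated while handling this line dies at deinit,
          // so N+1 lines cost no more than the most expensive single line.
          var arena = std.heap.ArenaAllocator.init(std.heap.page_allocator);
          defer arena.deinit();
          try processLine(arena.allocator(), line);
      }
  }

  fn processLine(allocator: std.mem.Allocator, line: []const u8) !void {
      _ = allocator; // stand-in for real per-line work
      _ = line;
  }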

woodruffw

Yeah, those are the approaches that I've used (back when I wrote more user tools in C). I wonder how those techniques translate to a language like Zig, where I'd expect the naive approach to be to allocate a new string for each line/datum (which would then never truly be freed, under this model.)

anonymoushn

I've been writing a toy `wordcount` recently, and it seems like if I wanted to support inputs much larger than the ~5GB file I'm testing against, or inputs that contain a lot more unique strings per input file size, I would need to realloc, but I would not need to free.

woodruffw

Is that `wordcount` in Zig? My understanding (which could be wrong) is that reallocation in Zig would leave the old buffer "alive" (from the allocator's perspective) if it couldn't be expanded, meaning that you'd eventually OOM if a large enough contiguous region couldn't be found.

anonymoushn

It's in Zig, but I just call mmap twice at startup to get one slab of memory for the whole file plus all the space I'll need. I am not sure whether Zig's GeneralPurposeAllocator or PageAllocator currently use mremap or not, but I do know that when realloc is not implemented by a particular allocator, the Allocator interface provides it as alloc + memcpy + free. So I think I would not OOM. In safe builds when using GeneralPurposeAllocator, it might be possible to exhaust the address space by repeatedly allocating and freeing memory, but I wouldn't expect to run into this by accident.

avgcorrection

> This was a great read, with an important point: there's always a tradeoff to be made, and we can make it (e.g. never freeing memory to obtain temporal memory safety without static lifetime checking).

I.e. we can choose to risk running out of memory? I don't understand how this is a viable strategy unless you know you will only process a certain input size.

woodruffw

Yes. There are many domains where you know exactly how much memory you’ll need (even independent of input size), so just “leaking” everything is a perfectly valid technique.

avgcorrection

You will have to explain this to me. From the original mention (article) it seems that they mean that compilers in general can be written in this way. Is that what they mean? Or do they mean that compilers can be written in that way if they know something about the inputs that it will be fed?

tptacek

"Temporal" and "spatial" is a good way to break this down, but it might be helpful to know the subtext that, among the temporal vulnerabilities, UAF and, to an extent, type confusion are the big scary ones.

Race conditions are a big ugly can of worms whose exploitability could probably be the basis for a long, tedious debate.

When people talk about Zig being unsafe, they're mostly reacting to the fact that UAFs are still viable in it.

jorangreef

I see your UAF and raise you a bleed!

As you know, buffer bleeds like Heartbleed and Cloudbleed can happen even in a memory safe language. They're hard to defend against (padding is everywhere in most formats!), easier to pull off than a UAF, often remotely accessible, difficult to detect, remain latent for a long time, and the impact is devastating. All your RAM are belong to us.

For me, this can of worms is the one that sits on top of the dusty shelf, it gets the least attention, and memory safe languages can be all the more vulnerable as they lull one into a false sense of safety.

tptacek

Has an exploitable buffer bleed (I'm happy with this coinage!) happened in any recent memory safe codebase?

jorangreef

I worked on a static analysis tool to detect bleeds in outgoing email attachments, looking for non-zero padding in the ZIP file format.

It caught different banking/investment systems written in memory safe languages leaking server RAM. You could sometimes see the whole intranet web page that the teller or broker used to generate and send the statement leaking through.

Bleeds terrify me, no matter the language. The thing with bleeds is that they're as simple as a buffer underflow, or forgetting to zero padding. Not even the borrow checker can provide safety against that.

kaba0

Would that work in the case of Java, for example? It nulls every field as per the specification (observably, at least), so unless someone does some byte mangling manually I don't necessarily see it working out.

TinkersW

Falsely representing the state of C & C++ doesn't really lead to a convincing argument. All those safety checks Zig supports are easily enabled in C++, and widely used. Sometimes they are even on by default.

lmh

Question for Zig experts:

Is it possible, in principle, to use comptime to obtain Rust-like safety? If this was a library, could it be extended to provide even stronger guarantees at compile time, as in a dependent type system used for formal verification?

Of course, this does not preclude a similar approach in Rust or C++ or other languages; but comptime's simplicity and generality seem like they might be beneficial here.

pron

Not as it is (it would require mutating the type's "state"), but hypothetically, comptime could be made to support even more programmable types. But could doesn't mean should. Zig values language simplicity and explicitness above many other things.

lmh

Thanks, that's informative. This was meant to clarify the bounds of Zig's design rather than as a research proposal. Otherwise, one might read it as an open invitation to just the sort of demonic meta-thinking that its users abhor.

kristoff_it

Somebody implemented part of it in the past, but it was based on the ability to observe the order of execution of comptime blocks, which is going to be removed from the language (probably already is).

https://github.com/DutchGhost/zorrow

It's not a complete solution, among other things, because it only works if you use it to access variables, as the language has no way of forcing you.

lmh

Thanks, that's interesting.

anonymoushn

It is possible in principle to write a Rust compiler in comptime Zig, but the real answer is "no."

avgcorrection

Why would the mere existence of some static-eval capability give you that affordance?

Researchers have been working on these three things for decades. Yes, "comptime" isn't some Zig invention but a somewhat limited (and anachronistic, to a degree) version of what researchers have added to research versions of ML and OCaml. So can it implement all the static-language goodies of Rust and give you dependent types? Sure, why not? After all, computer scientists never had the idea that you could evaluate values and types at compile time. Now all those research papers about static programming language design will wither on the vine, now that people can just use the simplicity and generality of `comptime` to prove programs correct.

ptato

Not an expert by any means, but my gut says that it would be very cumbersome and not practical for general use.

dkersten

I'm not sure I understand the value of an allocator that doesn't reuse allocations, as a bug-prevention thing. Is it just for performance? (Since memory is never reused, allocation can simply be incrementing an offset by the size of the allocation.) Because beyond that, you can get the same benefit in C by simply never calling free on the memory you want to "protect" against use-after-free.

anonymoushn

The allocations are freed and the addresses are never reused. So heap use-after-frees are segfaults.
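The mechanism can be sketched roughly like this (illustrative only; the real GeneralPurposeAllocator packs small allocations more cleverly, as discussed below):

  const std = @import("std");

  // Give each allocation its own pages; on "free", make them inaccessible
  // but keep the address range reserved so it is never handed out again.
  fn allocPages(len: usize) ![]align(std.mem.page_size) u8 {
      const padded = std.mem.alignForward(len, std.mem.page_size);
      return std.os.mmap(
          null,
          padded,
          std.os.PROT.READ | std.os.PROT.WRITE,
          std.os.MAP.PRIVATE | std.os.MAP.ANONYMOUS,
          -1,
          0,
      );
  }

  fn freePages(mem: []align(std.mem.page_size) u8) !void {
      // Any later dereference of a pointer into `mem` now segfaults.
      try std.os.mprotect(mem, std.os.PROT.NONE);
  }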

dkersten

Does that mean that each allocation is always page-aligned?

anonymoushn

Not sure on the details here. I'd have to try it out and see.

Larger allocations will be page-aligned, but if you make a bunch of very small allocations, they may go into the same pages, and freeing all but one per page of them may leave you with the pages still mapped. I've skimmed the GeneralPurposeAllocator code and know it has this sort of behavior at least sometimes, but I'm not really familiar with which things change in safe builds.

kaba0

I believe it is only for performance, as malloc has to find a place for the allocation, while it is only a pointer bump for a certain kind of allocator.

ajross

> In practice, it doesn't seem that any level of testing is sufficient to prevent vulnerabilities due to memory safety in large programs. So I'm not covering tools like AddressSanitizer that are intended for testing and are not recommended for production use.

I closed the window right there. Digs like this (the "not recommended" bit is a link to a now famous bomb thrown by Szabolcs on the oss-sec list, not to any kind of industry consensus piece) tell me that the author is grinding an axe and not taking the subject seriously.

Security is a spectrum. There are no silver bullets. It's one thing to say something like "Rust is better than Zig+ASan because...", and quite another to refuse to even treat the comparison and pretend that hardening tools don't exist.

This is fundamentally a strawman, basically. The author wants to argue against a crippled toolchain that is easier to beat instead of one that gets used in practice.

klyrs

As a Zig fan, I disagree. I think it's really important to examine the toolchain that beginners are going to use.

> I'm also focusing on software as it is typically shipped, ignoring eg bounds checking compilers like tcc or quarantining allocators like hardened_malloc which are rarely used because of the performance overhead.

To advertise that Zig is perfectly safe because things like ASan exist would be misleading, because that's not what users get out of the box. Zig is up-front and honest about the tradeoffs between safety and performance, and this evaluation of Zig holds no surprises if you're familiar with how Zig describes itself.

ajross

> To advertise that Zig is perfectly safe because things like ASan exist would be misleading

Exactly! And for the same reason. You frame your comparison within the bounds of techniques that are used in practice. You don't refuse to compare a tool ahead of time, especially when doing so reinforces your priors.

To be blunt: ASan is great. ASan finds bugs. Everyone should use ASan. Everyone should advocate for ASan. But doing that cuts against the point the author is making (which is basically the same maximalist Rust screed we've all heard again and again), so... he skipped it. That's not good faith comparison, it's spin.

KerrAvon

ASAN doesn’t add memory safety to the base language. It catches problems during testing, assuming those problems occur during the testing run (they don’t always! ASAN is not a panacea!). It’s perfectly fair to rule it out of bounds for this sort of comparison.

lmm

> You frame your comparison within the bounds of techniques that are used in practice.

Well, is ASan used in practice, by the relevant target audience (i.e. mainstream C++ developers)? My guess is that the vast majority of the people both Rust and Zig are aiming for are people who don't use ASan with C++ today and wouldn't use ASan with Rust or Zig if they switched to them.

klyrs

Wait, are you saying that because the author didn't push your personal agenda, that's spin? Hardly.