The reason I really like Zig is because there's finally a language that makes it...

kibwen · 2025-12-04T23:23:33 1764890613

> No more praying that your program isn't unceremoniously killed just for asking for more memory - all allocations are assumed fallible and failures must be handled explicitly.

But for operating systems with overcommit, including Linux, you won't ever see the act of allocation fail, which is the whole point. All the language-level ceremony in the world won't save you.

pornel · 2025-12-05T03:50:27 1764906627

Even on Linux with overcommit you can have allocations fail, in practical scenarios.

You can impose limits per process/cgroup. In server environments it doesn't make sense to run off swap (the perf hit can be so large that everything times out and it's indistinguishable from being offline), so you can set limits proportional to physical RAM, and see processes OOM before the whole system needs to resort to OOMKiller. Processes that don't fork and don't do clever things with virtual mem don't overcommit much, and large-enough allocations can fail for real, at page mapping time, not when faulting.

Additionally, soft limits like https://lib.rs/cap make it possible to reliably observe OOM in Rust on every OS. This is very useful for limiting memory usage of a process before it becomes a system-wide problem, and a good extra defense in case some unreasonably large allocation sneaks past application-specific limits.

These "impossible" things happen regularly in the services I worked on. The hardest part about handling them has been Rust's libstd sabotaging it and giving up before even trying. Handling of OOM works well enough to be useful where Rust's libstd doesn't get in the way.

Rust is the problem here.

tucnak · 2025-12-05T06:45:16 1764917116

I hear this claim on swap all the time, and honestly it doesn't sound convincing. Maybe ten or twenty years ago, but today? CAS latency for DIMM has been going UP, and so is NVMe bandwidth. Depending on memory access patterns, and whether it fits in the NVMe controller's cache (the recent Samsung 9100 model includes 4 GB of DDR4 for cache and prefetch) your application may work just fine.

10000truths · 2025-12-04T23:50:54 1764892254

Sure, but you can do the next best thing, which is to control precisely when and where those allocations occur. Even if the possibility of crashing is unavoidable, there is still huge operational benefit in making it predictable.

Simplest example is to allocate and pin all your resources on startup. If it crashes, it does so immediately and with a clear error message, so the solution is as straightforward as "pass bigger number to --memory flag" or "spec out larger machine".

kibwen · 2025-12-05T00:38:25 1764895105

No, this is still misunderstanding.

Overcommit means that the act of memory allocation will not report failure, even when the system is out of memory.

Instead, failure will come at an arbitrary point later, when the program actually attempts to use the aforementioned memory that the system falsely claimed had been allocated.

Allocating all at once on startup doesn't help, because the program can still fail later when it tries to actually access that memory.

the_duke · 2025-12-05T00:49:31 1764895771

To be fair, you can enforce this just by filling all the allocated memory with zero, so it's possible to fail at startup.

Or, even simpler, just turn off over-commit.

But if swap comes into the mix, or just if the OS decides it needs the memory later for something critical, you can still get killed.

bluGill · 2025-12-05T02:43:26 1764902606

I would be suprised if some os detects the page of zeros and removes that allocation until you need it. this seems like a common enough case as to make it worth it when memory is low. I'm not aware of any that do, but it wouldn't be that hard and so seems like someone would try it.

knorker · 2025-12-05T09:08:11 1764925691

There's also KSM, kernel same-page merging.

10000truths · 2025-12-05T01:06:56 1764896816

Which is why I said "allocate and pin". POSIX systems have mlock()/mlockall() to prefault allocated memory and prevent it from being paged out.

interroboink · 2025-12-05T02:04:45 1764900285

Random curious person here: does mlock() itself cause the pre-fault? Or do you have to scribble over that memory yourself, too?

(I understand that mlock prevents paging-out, but in my mind that's a separate concern from pre-faulting?)

10000truths · 2025-12-05T03:00:53 1764903653

FreeBSD and OpenBSD explicitly mention the prefaulting behavior in the mlock(2) manpage. The Linux manpage alludes to it in that you have to explicitly pass the MLOCK_ONFAULT flag to the mlock2() variant of the syscall in order to disable the prefaulting behavior.

kibwen · 2025-12-05T01:58:21 1764899901

Aha, my apologies, I overlooked that.

wavemode · 2025-12-04T23:37:56 1764891476

I imagine people who care about this sort of thing are happy to disable overcommit, and/or run Zig on embedded or specialized systems where it doesn't exist.

dlisboa · 2025-12-04T23:44:31 1764891871

There are far more people running/writing Zig on/for systems with overcommit than not. Most of the hype around Zig come from people not in the embedded world.

xyzzy_plugh · 2025-12-05T00:31:29 1764894689

If we can produce a substantial volume of software that can cope with allocation failures then the idea of using something than overcommit as the default becomes feasible.

It's not a stretch to imagine that a different namespace might want different semantics e.g. to allow a container to opt out of overcommit.

It is hard to justify the effort required to enable this unless it'll be useful for more than a tiny handful of users who can otherwise afford to run off an in-house fork.

kibwen · 2025-12-05T00:43:04 1764895384

> If we can produce a substantial volume of software that can cope with allocation failures then the idea of using something than overcommit as the default becomes feasible.

Except this won't happen, because "cope with allocation failure" is not something that 99.9% of programs could even hope to do.

Let's say that you're writing a program that allocates. You allocate, and check the result. It's a failure. What do you do? Well, if you have unneeded memory lying around, like a cache, you could attempt to flush it. But I don't know about you, but I don't write programs that randomly cache things in memory manually, and almost nobody else does either. The only things I have in memory are things that are strictly needed for my program's operation. I have nothing unnecessary to evict, so I can't do anything but give up.

The reason that people don't check for allocation failure isn't because they're lazy, it's because they're pragmatic and understand that there's nothing they could reasonably do other than crash in that scenario.

Dylan16807 · 2025-12-05T06:15:00 1764915300

I used to run into allocation limits in opera all the time. Usually what happened was a failure to allocate a big chunk of memory for rendering or image decompression purposes, and if that happens you can give up on rendering the current tab for the moment. It was very resilient to those errors.

AlotOfReading · 2025-12-05T03:00:12 1764903612

Have you honestly thought about how you could handle the situation better than an crash?

For example, you could finish writing data into files before exiting gracefully with an error. You could (carefully) output to stderr. You could close remote connections. You could terminate the current transaction and return an error code. Etc.

Most programs are still going to terminate eventually, but they can do that a lot more usefully than a segfault from some instruction at a randomized address.

bluGill · 2025-12-05T02:44:58 1764902698

Even when I have a cache - it is probably in a different code path / module and it would be a terrible architecture that let me access that code.

Dylan16807 · 2025-12-05T06:16:57 1764915417

A way to access an "emergency button" function is a significantly smaller sin than arbitrary crashes.

wolvesechoes · 2025-12-05T08:38:31 1764923911

> Most of the hype around Zig come from people not in the embedded world.

Yet another similarity with Rust.

wavemode · 2025-12-05T00:06:07 1764893167

I never said that all Zig users care about recovering from allocation failure.

randyrand · 2025-12-05T05:35:18 1764912918

> you won't ever see the act of allocation fail

ever? If you have limited RAM and limited storage on a small linux SBC, where does it put your memory?

bjackman · 2025-12-05T06:01:08 1764914468

It handles OOM by killing processes.

Guvante · 2025-12-04T23:24:39 1764890679

If you are pre-allocating Rust would handle that decently as well right?

Certainly I agree that allocations in your dependencies (including std) are more annoying in Rust since it uses panics for OOM.

The no-std set of crates is all setup to support embedded development.

incompatible · 2025-12-05T00:14:38 1764893678

I don't know Zig. The article says "Many people seem confused about why Zig should exist if Rust does already." But I'd ask instead why does Zig exist when C does already? It's just a "better" C? But has the drawback that makes C problematic for development, manual memory management? I think you are better off using a language with a garbage collector, unless your usage really needs manual management, and then you can pick between C, Rust, and Zig (and C++ and a few hundred others, probably.)

throwawaymaths · 2025-12-05T02:37:30 1764902250

yeah, its a better c, but like wouldnt it be nice if c had stadardized fat pointers so that if you move from project to project you don't have to triple check the semantics? for example and like say 50+ "learnings" from 40 years c that are canonized and first class in the language + stdlib

bluecalm · 2025-12-05T08:35:30 1764923730

I think the whole idea is to remove some pain points of C while not introducing additional annoyances people writing low level code don't want.

munificent · 2025-12-05T00:55:40 1764896140

> Stack space is not treated like magic - the compiler can reason about its maximum size by examining the call graph, so you can pre-allocate stack space to ensure that stack overflows are guaranteed never to happen.

How does that work in the presence of recursion or calls through function pointers?

10000truths · 2025-12-05T01:22:08 1764897728

Recursion: That's easy, don't. At least, not with a call stack. Instead, use a stack container backed by a bounded allocator, and pop->process->push in a loop. What would have been a stack overflow is now an error.OutOfMemory enum that you can catch and handle as desired. All that said, there is a proposal that addresses making recursive functions more friendly to static analysis [0].

Function pointers: Zig has a proposal for restricted function types [1], which can be used to enforce compile-time constraints on the functions that can be assigned to a function pointer.

[0]: https://github.com/ziglang/zig/issues/1006 [1]: https://github.com/ziglang/zig/issues/23367

smallstepforman · 2025-12-05T05:44:19 1764913459

Linux has overcommit so failing malloc hasnt been a thing for over a decade. Zig is late to the party since it strong arms devs to cater to a scenerio which no longer exists.

veltas · 2025-12-05T08:10:18 1764922218

On Linux you can turn this off. On some OS's it's off by default. Especially in embedded which is a major area of native coding. If you don't want to handle allocation failures in your app you can abort.

Also malloc can fail even with overcommit, if you accidentally enter an obviously incorrect size like -1.