Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

The reason I really like Zig is because there's finally a language that makes it easy to gracefully handle memory exhaustion at the application level. No more praying that your program isn't unceremoniously killed just for asking for more memory - all allocations are assumed fallible and failures must be handled explicitly. Stack space is not treated like magic - the compiler can reason about its maximum size by examining the call graph, so you can pre-allocate stack space to ensure that stack overflows are guaranteed never to happen.

This first-class representation of memory as a resource is a must for creating robust software in embedded environments, where it's vital to frontload all fallibility by allocating everything needed at start-up, and allow the application freedom to use whatever mechanism appropriate (backpressure, load shedding, etc) to handle excessive resource usage.





> No more praying that your program isn't unceremoniously killed just for asking for more memory - all allocations are assumed fallible and failures must be handled explicitly.

But for operating systems with overcommit, including Linux, you won't ever see the act of allocation fail, which is the whole point. All the language-level ceremony in the world won't save you.


Even on Linux with overcommit you can have allocations fail, in practical scenarios.

You can impose limits per process/cgroup. In server environments it doesn't make sense to run off swap (the perf hit can be so large that everything times out and it's indistinguishable from being offline), so you can set limits proportional to physical RAM, and see processes OOM before the whole system needs to resort to OOMKiller. Processes that don't fork and don't do clever things with virtual mem don't overcommit much, and large-enough allocations can fail for real, at page mapping time, not when faulting.

Additionally, soft limits like https://lib.rs/cap make it possible to reliably observe OOM in Rust on every OS. This is very useful for limiting memory usage of a process before it becomes a system-wide problem, and a good extra defense in case some unreasonably large allocation sneaks past application-specific limits.

These "impossible" things happen regularly in the services I worked on. The hardest part about handling them has been Rust's libstd sabotaging it and giving up before even trying. Handling of OOM works well enough to be useful where Rust's libstd doesn't get in the way.

Rust is the problem here.


I hear this claim on swap all the time, and honestly it doesn't sound convincing. Maybe ten or twenty years ago, but today? CAS latency for DIMM has been going UP, and so is NVMe bandwidth. Depending on memory access patterns, and whether it fits in the NVMe controller's cache (the recent Samsung 9100 model includes 4 GB of DDR4 for cache and prefetch) your application may work just fine.

Sure, but you can do the next best thing, which is to control precisely when and where those allocations occur. Even if the possibility of crashing is unavoidable, there is still huge operational benefit in making it predictable.

Simplest example is to allocate and pin all your resources on startup. If it crashes, it does so immediately and with a clear error message, so the solution is as straightforward as "pass bigger number to --memory flag" or "spec out larger machine".


No, this is still misunderstanding.

Overcommit means that the act of memory allocation will not report failure, even when the system is out of memory.

Instead, failure will come at an arbitrary point later, when the program actually attempts to use the aforementioned memory that the system falsely claimed had been allocated.

Allocating all at once on startup doesn't help, because the program can still fail later when it tries to actually access that memory.


To be fair, you can enforce this just by filling all the allocated memory with zero, so it's possible to fail at startup.

Or, even simpler, just turn off over-commit.

But if swap comes into the mix, or just if the OS decides it needs the memory later for something critical, you can still get killed.


I would be suprised if some os detects the page of zeros and removes that allocation until you need it. this seems like a common enough case as to make it worth it when memory is low. I'm not aware of any that do, but it wouldn't be that hard and so seems like someone would try it.

There's also KSM, kernel same-page merging.

Which is why I said "allocate and pin". POSIX systems have mlock()/mlockall() to prefault allocated memory and prevent it from being paged out.

Random curious person here: does mlock() itself cause the pre-fault? Or do you have to scribble over that memory yourself, too?

(I understand that mlock prevents paging-out, but in my mind that's a separate concern from pre-faulting?)


FreeBSD and OpenBSD explicitly mention the prefaulting behavior in the mlock(2) manpage. The Linux manpage alludes to it in that you have to explicitly pass the MLOCK_ONFAULT flag to the mlock2() variant of the syscall in order to disable the prefaulting behavior.

Aha, my apologies, I overlooked that.

I imagine people who care about this sort of thing are happy to disable overcommit, and/or run Zig on embedded or specialized systems where it doesn't exist.

There are far more people running/writing Zig on/for systems with overcommit than not. Most of the hype around Zig come from people not in the embedded world.

If we can produce a substantial volume of software that can cope with allocation failures then the idea of using something than overcommit as the default becomes feasible.

It's not a stretch to imagine that a different namespace might want different semantics e.g. to allow a container to opt out of overcommit.

It is hard to justify the effort required to enable this unless it'll be useful for more than a tiny handful of users who can otherwise afford to run off an in-house fork.


> If we can produce a substantial volume of software that can cope with allocation failures then the idea of using something than overcommit as the default becomes feasible.

Except this won't happen, because "cope with allocation failure" is not something that 99.9% of programs could even hope to do.

Let's say that you're writing a program that allocates. You allocate, and check the result. It's a failure. What do you do? Well, if you have unneeded memory lying around, like a cache, you could attempt to flush it. But I don't know about you, but I don't write programs that randomly cache things in memory manually, and almost nobody else does either. The only things I have in memory are things that are strictly needed for my program's operation. I have nothing unnecessary to evict, so I can't do anything but give up.

The reason that people don't check for allocation failure isn't because they're lazy, it's because they're pragmatic and understand that there's nothing they could reasonably do other than crash in that scenario.


I used to run into allocation limits in opera all the time. Usually what happened was a failure to allocate a big chunk of memory for rendering or image decompression purposes, and if that happens you can give up on rendering the current tab for the moment. It was very resilient to those errors.

Have you honestly thought about how you could handle the situation better than an crash?

For example, you could finish writing data into files before exiting gracefully with an error. You could (carefully) output to stderr. You could close remote connections. You could terminate the current transaction and return an error code. Etc.

Most programs are still going to terminate eventually, but they can do that a lot more usefully than a segfault from some instruction at a randomized address.


Even when I have a cache - it is probably in a different code path / module and it would be a terrible architecture that let me access that code.

A way to access an "emergency button" function is a significantly smaller sin than arbitrary crashes.

> Most of the hype around Zig come from people not in the embedded world.

Yet another similarity with Rust.


I never said that all Zig users care about recovering from allocation failure.

> you won't ever see the act of allocation fail

ever? If you have limited RAM and limited storage on a small linux SBC, where does it put your memory?


It handles OOM by killing processes.

If you are pre-allocating Rust would handle that decently as well right?

Certainly I agree that allocations in your dependencies (including std) are more annoying in Rust since it uses panics for OOM.

The no-std set of crates is all setup to support embedded development.


I don't know Zig. The article says "Many people seem confused about why Zig should exist if Rust does already." But I'd ask instead why does Zig exist when C does already? It's just a "better" C? But has the drawback that makes C problematic for development, manual memory management? I think you are better off using a language with a garbage collector, unless your usage really needs manual management, and then you can pick between C, Rust, and Zig (and C++ and a few hundred others, probably.)

yeah, its a better c, but like wouldnt it be nice if c had stadardized fat pointers so that if you move from project to project you don't have to triple check the semantics? for example and like say 50+ "learnings" from 40 years c that are canonized and first class in the language + stdlib

I think the whole idea is to remove some pain points of C while not introducing additional annoyances people writing low level code don't want.

> Stack space is not treated like magic - the compiler can reason about its maximum size by examining the call graph, so you can pre-allocate stack space to ensure that stack overflows are guaranteed never to happen.

How does that work in the presence of recursion or calls through function pointers?


Recursion: That's easy, don't. At least, not with a call stack. Instead, use a stack container backed by a bounded allocator, and pop->process->push in a loop. What would have been a stack overflow is now an error.OutOfMemory enum that you can catch and handle as desired. All that said, there is a proposal that addresses making recursive functions more friendly to static analysis [0].

Function pointers: Zig has a proposal for restricted function types [1], which can be used to enforce compile-time constraints on the functions that can be assigned to a function pointer.

[0]: https://github.com/ziglang/zig/issues/1006 [1]: https://github.com/ziglang/zig/issues/23367


Linux has overcommit so failing malloc hasnt been a thing for over a decade. Zig is late to the party since it strong arms devs to cater to a scenerio which no longer exists.

On Linux you can turn this off. On some OS's it's off by default. Especially in embedded which is a major area of native coding. If you don't want to handle allocation failures in your app you can abort.

Also malloc can fail even with overcommit, if you accidentally enter an obviously incorrect size like -1.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: