Hacker News | 10000truths's comments

...over the course of 8.5 months, which is way too short for a meaningful result. If their strategy could outperform the S&P 500's 10-year return, they wouldn't be blogging about it.

I really like Zig because it's finally a language that makes it easy to gracefully handle memory exhaustion at the application level. No more praying that your program isn't unceremoniously killed just for asking for more memory - all allocations are assumed fallible and failures must be handled explicitly. Stack space is not treated like magic - the compiler can reason about its maximum size by examining the call graph, so you can pre-allocate stack space and guarantee that stack overflows never happen.

This first-class representation of memory as a resource is a must for creating robust software in embedded environments, where it's vital to frontload all fallibility by allocating everything needed at start-up, and to give the application the freedom to use whatever mechanism is appropriate (backpressure, load shedding, etc.) to handle excessive resource usage.


> No more praying that your program isn't unceremoniously killed just for asking for more memory - all allocations are assumed fallible and failures must be handled explicitly.

But for operating systems with overcommit, including Linux, you won't ever see the act of allocation fail, which is the whole point. All the language-level ceremony in the world won't save you.


Even on Linux with overcommit you can have allocations fail, in practical scenarios.

You can impose limits per process/cgroup. In server environments it doesn't make sense to run off swap (the perf hit can be so large that everything times out and it's indistinguishable from being offline), so you can set limits proportional to physical RAM and see processes OOM before the whole system has to resort to the OOM killer. Processes that don't fork and don't do clever things with virtual memory don't overcommit much, and large-enough allocations can fail for real, at page-mapping time rather than when faulting.

Additionally, soft limits like https://lib.rs/cap make it possible to reliably observe OOM in Rust on every OS. This is very useful for limiting memory usage of a process before it becomes a system-wide problem, and a good extra defense in case some unreasonably large allocation sneaks past application-specific limits.
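For reference, a rough sketch of that setup, going by the cap crate's documented global-allocator usage (the limit and sizes here are made up):

    use std::alloc;
    use cap::Cap;

    // Route every heap allocation through a capped wrapper around the system allocator.
    #[global_allocator]
    static ALLOCATOR: Cap<alloc::System> = Cap::new(alloc::System, usize::MAX);

    fn main() {
        // Soft limit for this process, well below anything the OS would enforce.
        ALLOCATOR.set_limit(512 * 1024 * 1024).unwrap();

        let mut buf: Vec<u8> = Vec::new();
        // Once the cap is hit, allocations fail, and fallible APIs like try_reserve
        // surface that as an Err instead of aborting the process.
        if buf.try_reserve_exact(1024 * 1024 * 1024).is_err() {
            eprintln!("over the soft memory limit, applying backpressure");
        }
    }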

These "impossible" things happen regularly in the services I worked on. The hardest part about handling them has been Rust's libstd sabotaging it and giving up before even trying. Handling of OOM works well enough to be useful where Rust's libstd doesn't get in the way.

Rust is the problem here.


I hear this claim about swap all the time, and honestly it doesn't sound convincing. Maybe ten or twenty years ago, but today? CAS latency for DIMMs has been going UP, and so has NVMe bandwidth. Depending on your memory access patterns, and on whether the working set fits in the NVMe controller's cache (the recent Samsung 9100 model includes 4 GB of DDR4 for cache and prefetch), your application may work just fine.

Sure, but you can do the next best thing, which is to control precisely when and where those allocations occur. Even if the possibility of crashing is unavoidable, there is still huge operational benefit in making it predictable.

Simplest example is to allocate and pin all your resources on startup. If it crashes, it does so immediately and with a clear error message, so the solution is as straightforward as "pass bigger number to --memory flag" or "spec out larger machine".


No, this is still a misunderstanding.

Overcommit means that the act of memory allocation will not report failure, even when the system is out of memory.

Instead, failure will come at an arbitrary point later, when the program actually attempts to use the aforementioned memory that the system falsely claimed had been allocated.

Allocating all at once on startup doesn't help, because the program can still fail later when it tries to actually access that memory.


To be fair, you can force the failure up front just by filling all the allocated memory with zeros, so it's possible to fail at startup.

Or, even simpler, just turn off over-commit.

But if swap comes into the mix, or just if the OS decides it needs the memory later for something critical, you can still get killed.
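For example, a minimal sketch of touching the allocation up front (page size assumed to be 4 KiB) so that, if the memory isn't really there, the failure happens at startup rather than mid-request:

    // Touch one byte per page so the kernel has to back the allocation now.
    fn prefault(buf: &mut [u8]) {
        const PAGE_SIZE: usize = 4096; // assumed; query the real page size in production
        for i in (0..buf.len()).step_by(PAGE_SIZE) {
            // Volatile write so the compiler can't optimize the touch away.
            unsafe { std::ptr::write_volatile(&mut buf[i], 0) };
        }
    }

    fn main() {
        let mut arena = vec![0u8; 256 * 1024 * 1024];
        prefault(&mut arena);
        // ... run against the now-resident arena ...
    }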


I wouldn't be surprised if some OS detected a page of zeros and dropped that allocation until you actually need it. It seems like a common enough case to be worth doing when memory is low. I'm not aware of any that do, but it wouldn't be that hard, so it seems like someone would have tried it.

Which is why I said "allocate and pin". POSIX systems have mlock()/mlockall() to prefault allocated memory and prevent it from being paged out.
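A minimal sketch of that pattern using the libc crate (the size and error handling are illustrative):

    // Pin the whole address space: everything mapped so far gets prefaulted and
    // can never be paged out; future mappings are locked as they're created.
    fn pin_all_memory() -> std::io::Result<()> {
        let rc = unsafe { libc::mlockall(libc::MCL_CURRENT | libc::MCL_FUTURE) };
        if rc != 0 {
            return Err(std::io::Error::last_os_error());
        }
        Ok(())
    }

    fn main() {
        // Allocate the entire working set up front...
        let _arena = vec![0u8; 512 * 1024 * 1024];
        // ...then lock it. If the memory isn't really available, the failure
        // shows up here, at startup, with a clear error.
        pin_all_memory().expect("mlockall failed (check RLIMIT_MEMLOCK / --memory)");
    }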

Random curious person here: does mlock() itself cause the pre-fault? Or do you have to scribble over that memory yourself, too?

(I understand that mlock prevents paging-out, but in my mind that's a separate concern from pre-faulting?)


FreeBSD and OpenBSD explicitly mention the prefaulting behavior in the mlock(2) manpage. The Linux manpage implies it: you have to explicitly pass the MLOCK_ONFAULT flag to the mlock2() variant of the syscall in order to disable the prefaulting behavior.

Aha, my apologies, I overlooked that.

I imagine people who care about this sort of thing are happy to disable overcommit, and/or run Zig on embedded or specialized systems where it doesn't exist.

There are far more people running/writing Zig on/for systems with overcommit than not. Most of the hype around Zig comes from people not in the embedded world.

If we can produce a substantial volume of software that can cope with allocation failures, then the idea of using something other than overcommit as the default becomes feasible.

It's not a stretch to imagine that a different namespace might want different semantics e.g. to allow a container to opt out of overcommit.

It is hard to justify the effort required to enable this unless it'll be useful for more than a tiny handful of users who can otherwise afford to run off an in-house fork.


> If we can produce a substantial volume of software that can cope with allocation failures, then the idea of using something other than overcommit as the default becomes feasible.

Except this won't happen, because "cope with allocation failure" is not something that 99.9% of programs could even hope to do.

Let's say that you're writing a program that allocates. You allocate, and check the result. It's a failure. What do you do? Well, if you have unneeded memory lying around, like a cache, you could attempt to flush it. But I don't know about you - I don't write programs that manually cache things in memory, and almost nobody else does either. The only things I have in memory are things that are strictly needed for my program's operation. I have nothing unnecessary to evict, so I can't do anything but give up.

The reason that people don't check for allocation failure isn't because they're lazy, it's because they're pragmatic and understand that there's nothing they could reasonably do other than crash in that scenario.


I used to run into allocation limits in Opera all the time. Usually what happened was a failure to allocate a big chunk of memory for rendering or image decompression purposes, and if that happens you can give up on rendering the current tab for the moment. It was very resilient to those errors.

Have you honestly thought about how you could handle the situation better than a crash?

For example, you could finish writing data into files before exiting gracefully with an error. You could (carefully) output to stderr. You could close remote connections. You could terminate the current transaction and return an error code. Etc.

Most programs are still going to terminate eventually, but they can do that a lot more usefully than a segfault from some instruction at a randomized address.
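For instance, a sketch of "fail the unit of work, not the process" (names and structure are hypothetical):

    use std::io::Write;

    // On allocation failure, salvage what we can instead of dying at a random address.
    fn process(batch: &[u8], out: &mut std::io::BufWriter<std::fs::File>) -> std::io::Result<()> {
        let mut scratch: Vec<u8> = Vec::new();
        if scratch.try_reserve_exact(batch.len() * 2).is_err() {
            out.flush()?;                // finish writing the data already produced
            eprintln!("out of memory; aborting batch cleanly"); // (careful) stderr output
            std::process::exit(2);       // well-defined error code, not a SIGSEGV
        }
        // ... transform batch into scratch ...
        out.write_all(&scratch)
    }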


Even when I have a cache, it's probably in a different code path / module, and it would be a terrible architecture that let the allocation site reach into that code.

A way to access an "emergency button" function is a significantly smaller sin than arbitrary crashes.

I never said that all Zig users care about recovering from allocation failure.

> you won't ever see the act of allocation fail

Ever? If you have limited RAM and limited storage on a small Linux SBC, where does it put your memory?


It handles OOM by killing processes.

Linux has overcommit, so a failing malloc hasn't been a thing for over a decade. Zig is late to the party, since it strong-arms devs into catering to a scenario which no longer exists.

On Linux you can turn this off. On some OSes it's off by default, especially in embedded, which is a major area of native coding. If you don't want to handle allocation failures in your app, you can abort.

Also, malloc can fail even with overcommit, if you accidentally pass an obviously incorrect size like -1.


If you're pre-allocating, Rust would handle that decently as well, right?

Certainly I agree that allocations in your dependencies (including std) are more annoying in Rust since it uses panics for OOM.

The no-std set of crates is all set up to support embedded development.


I don't know Zig. The article says "Many people seem confused about why Zig should exist if Rust does already." But I'd ask instead: why does Zig exist when C already does? It's just a "better" C? But it keeps the drawback that makes C problematic for development, manual memory management? I think you're better off using a language with a garbage collector, unless your usage really needs manual management, and then you can pick between C, Rust, and Zig (and C++ and a few hundred others, probably).

Yeah, it's a better C, but wouldn't it be nice if C had standardized fat pointers, so that when you move from project to project you don't have to triple-check the semantics? That's one example of, say, 50+ "learnings" from 40 years of C that are canonized and made first-class in the language + stdlib.

> Stack space is not treated like magic - the compiler can reason about its maximum size by examining the call graph, so you can pre-allocate stack space and guarantee that stack overflows never happen.

How does that work in the presence of recursion or calls through function pointers?


Recursion: That's easy, don't. At least, not with a call stack. Instead, use a stack container backed by a bounded allocator, and pop->process->push in a loop. What would have been a stack overflow is now an error.OutOfMemory value that you can catch and handle as desired. All that said, there is a proposal that addresses making recursive functions more friendly to static analysis [0].

Function pointers: Zig has a proposal for restricted function types [1], which can be used to enforce compile-time constraints on the functions that can be assigned to a function pointer.

[0]: https://github.com/ziglang/zig/issues/1006 [1]: https://github.com/ziglang/zig/issues/23367
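The pattern itself is language-agnostic; here's a rough analogue in Rust (names made up), with a fixed bound on pending work standing in for Zig's bounded allocator:

    // Iterative DFS with an explicit, bounded stack: blowing the bound becomes an
    // ordinary error value instead of a stack overflow.
    struct Node {
        children: Vec<usize>, // indices into the node arena
    }

    #[derive(Debug)]
    struct OutOfStack;

    fn visit_all(arena: &[Node], root: usize, max_pending: usize) -> Result<(), OutOfStack> {
        let mut stack = Vec::with_capacity(max_pending);
        stack.push(root);
        while let Some(idx) = stack.pop() {
            // ... process arena[idx] here ...
            for &child in &arena[idx].children {
                if stack.len() == max_pending {
                    return Err(OutOfStack); // would have been a crash with recursion
                }
                stack.push(child);
            }
        }
        Ok(())
    }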


Disclosing an individual student's information to third parties without express consent is a violation of FERPA.

This is addressed in the "information asymmetry" section of the article.

What do you believe needs improving and why?

I think the ambiguity was deliberate.

And very, very clever.

The table is a bit misleading. Most of the resources of a website are loaded concurrently and are not on the critical path of the "first contentful paint", so latency does not compound as quickly as the table implies. For web apps, much of the end-to-end latency hides lower in the networking stack. Here's the worst-case latency for a modern Chrome browser performing a cold load of an SPA website:

DNS-over-HTTPS-over-QUIC resolution: 2 RTTs

TCP handshake: 1 RTT

TLS v1.2 handshake: 2 RTTs

HTTP request/response (HTML): 1 RTT

HTTP request/response (bundled JS that actually renders the content): 1 RTT

That's 7 round trips. If your connection crosses a continent, that's easily a 1-2 second time-to-first-byte for the content you actually care about. And no amount of bandwidth will decrease that, since the bottlenecks are the speed of light and router hop latencies. Weak 4G/WiFi signal and/or network congestion will worsen that latency even further.


Using a CDN is so effective at improving the perceived performance of a web site because it reduces the length (and hence the speed-of-light delay) of these first 7 round trips, by moving the static parts of the web app (HTML+JS) to the "edge", which is just a bunch of cache boxes scattered around the world.

The user no longer has to connect to the central app server; they can connect to their nearest cache edge box, which is probably a lot closer to them (1-10 ms is typical).

Note that stateful API calls will still need to go back to the central app server, potentially an intercontinental hop.


Indeed, at some point, you can't lower tail latencies any further without moving closer to your users. But of the 7 round trips that I mentioned above, you have control over 3 of them: 2 round trips can be eliminated by supporting HTTP/3 over QUIC (and adding HTTPS DNS records to your zone file), and 1 round trip can be eliminated by server-side rendering. That's a 40-50% reduction before you even need to consider a CDN setup, and depending on your business requirements, it may very well be enough.

For context, this article was written when 95%+ of websites used HTTP/1.1 (and <50% used HTTPS).

Yes? Funnily enough, I don't often use indexed access in Rust. Either I'm looping over elements of a data structure (in which case I use iterators), or I'm using an untrusted index value (in which case I explicitly handle the error case). In the rare case where I'm using an index value that I can guarantee is never invalid (e.g. graph traversal where the indices are never exposed outside the scope of the traversal), I create a safe wrapper around the unsafe access and document the invariant.


If that's the case then hats off. What you're describing is definitely not what I've seen in practice. In fact, I don't think I've ever seen a crate or production codebase that documents infallibility of every single slice access. Even security-critical cryptography crates that passed audits don't do that. Personally, I found it quite hard to avoid indexing for graph-heavy code, so I'm always on the lookout for interesting ways to enforce access safety. If you have some code to share that would be very interesting.


My rule of thumb is that unchecked access is okay in scenarios where both the array/map and the indices/keys are private implementation details of a function or struct, since an invariant is easy to manually verify when it is tightly scoped like that. I've seen it used in (see the sketch after this list):

* Graph/tree traversal functions that take a visitor function as a parameter

* Binary search on sorted arrays

* Binary heap operations

* Probing buckets in open-addressed hash tables
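For example, here's roughly what the binary search case looks like: the indices fed to get_unchecked are derived only from slice.len() inside the function, so the justification for the unsafe access never leaves it (a sketch, not production code):

    fn find_sorted(slice: &[u64], target: u64) -> Option<usize> {
        let (mut lo, mut hi) = (0usize, slice.len());
        while lo < hi {
            let mid = lo + (hi - lo) / 2;
            // SAFETY: lo <= mid < hi <= slice.len(), so mid is always in bounds.
            let v = unsafe { *slice.get_unchecked(mid) };
            match v.cmp(&target) {
                std::cmp::Ordering::Less => lo = mid + 1,
                std::cmp::Ordering::Greater => hi = mid,
                std::cmp::Ordering::Equal => return Some(mid),
            }
        }
        None
    }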


> I don't think I've ever seen a crate or production codebase that documents infallibility of every single slice access.

The smoltcp crate typically uses runtime checks to ensure slice accesses made by the library do not cause a panic. It's not exactly equivalent to GP's assertion, since it doesn't cover "every single slice access", but it at least covers slice accesses triggered by the library's public API. (i.e. none of the public API functions should cause a panic, assuming that the runtime validation after the most recent mutation succeeds).

Example: https://docs.rs/smoltcp/latest/src/smoltcp/wire/ipv4.rs.html...


I think this goes against Rust's goals in terms of performance. Good for safe code, of course, but usually Rust users like to have compile-time safety that makes runtime safety checks unnecessary.


> graph-heavy code

Could you share some more details, maybe one fully concrete scenario? There are lots of techniques, but there's no one-size-fits-all solution.


Sure, these days I'm mostly working on a few compilers. Let's say I want to make an SSA IR with fixed-size instructions. Each instruction has an opcode and two operands (which are essentially pointers to other instructions). The IR is populated in one phase, and then lowered in the next. During lowering I run a few peephole and code motion optimizations on the IR, and then do regalloc + asm codegen. During that pass the IR is mutated and indices are invalidated/updated. The important thing is that this phase is extremely performance-critical.


And it's fine for a compiler to panic when it violates an assumption. Not so with the Cloudflare code under discussion.


Idiomatic Rust would have been to return a Result<> to the caller, not to surprise them with a panic.

The developer was lazy.

A lot of Rust developers are: https://github.com/search?q=unwrap%28%29+language%3ARust&typ...
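To be clear about what "idiomatic" means here, a toy illustration (hypothetical function):

    // Propagate the error and let the caller decide what a bad value means.
    fn parse_port(s: &str) -> Result<u16, std::num::ParseIntError> {
        s.parse::<u16>()
    }

    // The lazy version: panics the whole request/process on unexpected input.
    fn parse_port_lazy(s: &str) -> u16 {
        s.parse::<u16>().unwrap()
    }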


One normal "trick" is phantom typing. You create a type representing indices and have a small, well-audited portion of unsafe code handling creation/unpacking, where the rest of the code is completely safe.

The details depend a lot on what you're doing and how you're doing it. Does the graph grow? Shrink? Do you have more than one? Do you care about programmer error types other than panic/UB?

Suppose, e.g., that your graph doesn't change sizes, you only have one, and you only care about panics/UB. Then you can get away with:

1. A dedicated index type, unique to that graph (shadow / strong-typedef / wrap / whatever), corresponding to whichever index type you're natively using to index nodes.

2. Some mechanism for generating such indices. E.g., during graph population phase you have a method which returns the next custom index or None if none exist. You generated the IR with those custom indexes, so you know (assuming that one critical function is correct) that they're able to appropriately index anywhere in your graph.

3. You have some unsafe code somewhere which blindly trusts those indices when you start actually indexing into your array(s) of node information. However, since the very existence of such an index is proof that you're allowed to access the data, that access is safe.

Techniques vary from language to language and depending on your exact goals. GhostCell [0] in Rust is one way of relegating literally all of the unsafe code to a well-vetted library, and it uses tagged types (via lifetimes), so you can also do away with the "only one graph" limitation. It's been a while since I've looked at it, but resizes might also be safe pretty trivially (or might not be).

The general principle though is to structure your problem in such a way that a very small amount of code (so that you can more easily prove it correct) can provide promises that are enforceable purely via the type system (so that if the critical code is correct then so is everything else).

That's trivial by itself (e.g., just rely on option-returning .get operators), so the rest of the trick is to find a cheap place in your code which can provide stronger guarantees. For many problems, initialization is the perfect place: you can bounds-check on init and then not worry about it again. If even bounds-checking on initialization is too slow, you can still use the opportunity at initialization to write out a proof of why some invariant holds and then blindly/unsafely assert it to be true - but you then immediately pack that hard-won information into a dedicated type, so that the only place you ever have to think about it is on initialization.

[0] https://plv.mpi-sws.org/rustbelt/ghostcell/
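A rough sketch of steps 1-3 for the grow-only, single-graph case (all names made up; the SAFETY argument leans on both of those assumptions):

    // A NodeId can only be minted by the arena itself, and the arena never shrinks,
    // so the unchecked access in `get` is justified by construction.
    #[derive(Clone, Copy)]
    struct NodeId(u32);

    struct Node {
        opcode: u8,
        lhs: Option<NodeId>,
        rhs: Option<NodeId>,
    }

    struct Arena {
        nodes: Vec<Node>,
    }

    impl Arena {
        fn push(&mut self, node: Node) -> NodeId {
            let id = NodeId(self.nodes.len() as u32);
            self.nodes.push(node);
            id
        }

        fn get(&self, id: NodeId) -> &Node {
            // SAFETY: ids come only from `push` on this arena, and nodes are never removed.
            unsafe { self.nodes.get_unchecked(id.0 as usize) }
        }
    }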


I do use a combination of newtyped indices + singleton arenas for data structures that only grow (like the AST). But for the IR, being able to remove nodes from the graph is very important. So phantom typing wouldn't work in that case.


I realize that this is meant as an exercise to demonstrate a property of variance. But most investors are risk-averse when it comes to their portfolio - for the example given, a more practical target to optimize would be the worst-case or near-worst-case return (e.g. the p99 tail, i.e. the 1st-percentile return). For calculating that, a summary measure like variance or mean does not suffice - you need the full joint distribution of the rates of return of assets A and B, and then find the value of t that optimizes that quantile of At+B(1-t).
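A toy sketch of that, using empirical quantiles over joint return samples (the helper names and the weight grid are made up):

    // Near-worst-case (1st percentile) return of the blended portfolio t*A + (1-t)*B.
    fn near_worst_case(samples: &[(f64, f64)], t: f64) -> f64 {
        let mut returns: Vec<f64> =
            samples.iter().map(|&(a, b)| t * a + (1.0 - t) * b).collect();
        returns.sort_by(|x, y| x.partial_cmp(y).unwrap());
        returns[returns.len() / 100] // empirical 1% quantile (the p99 bad tail)
    }

    // Scan a coarse grid of weights and keep the one with the best bad-tail outcome.
    fn best_weight(samples: &[(f64, f64)]) -> f64 {
        (0..=100)
            .map(|i| i as f64 / 100.0)
            .max_by(|&a, &b| {
                near_worst_case(samples, a)
                    .partial_cmp(&near_worst_case(samples, b))
                    .unwrap()
            })
            .unwrap()
    }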


It's hard enough to get a reliable variance-covariance estimate.


They are absolutely aware of these sorts of abuses. I'll bet my spleen that it shows up as a line item in the roadmapping docs of their content integrity/T&S teams.

The root problem is twofold: the inability to reliably automate distinguishing "good actors" from "bad actors", and a lack of will to throw serious resources at solving the problem via manual, high-precision moderation.

