When K&R invented C, and when ISO/ANSI standardised C, I don't think this is at all what they had in mind for UB. Whenever discussions like this come up, I like to quote the C standard itself on its definition:
"NOTE: Possible undefined behavior ranges from [...] to behaving during translation or program execution in a documented manner characteristic of the environment"
The whole point of C, as K&R and ISO defined it, is to let you do "what the hardware does". They left parts of the standard purposefully undefined, so it could remain applicable to a wide variety of implementations. The intention was definitely NOT "screw the programmer", as a lot of the compiler writers seem to have interpreted it, but to allow them to do something that makes sense for the environment.
Now we have people, mostly academics from what I've noticed, who are taking the language farther and farther away from reality, focusing only on something uselessly abstract and completely ignoring the practical consequences of what they're doing. Compilers are becoming increasingly hostile to programmers. A programming language that had humble and very practical uses, with straightforward and easily understood behaviour, has been perverted into a theoretical quagmire of uselessness.
Something very very odd is going on, and I don't like it one bit. It's very WTF-inducing.
(I don't know much about Rust, hence why I didn't say anything about it. But I've been using C since the late 80s.)
This is driven by programmers' insatiable thirst for performance. Compiler writers are constantly judged on benchmarks, and the only way to squeeze that last flop out of a piece of code is to take the specification to its extreme.
UB is always about optimisations and performance. Incidentally, this is why I don't think talking about "nasal demons" is productive. The compiler mostly just uses UB to assume: ah, this can't happen, so I can optimise it away. Often that means valid programs go faster. We wanted it: we got it.
From my limited experience (and it's been a while), -O0 (no optimisations) is really quite reliable, even if you do all kinds of UB shenanigans.
> This is driven by programmers' insatiable thirst for performance. Compiler writers are constantly judged on benchmarks, and the only way to squeeze that last flop out of a piece of code is to take the specification to its extreme.
Really? I've seen people switch between competing compilers for licensing reasons, platform support, features---but benchmark performance? Maybe blog posts suggesting that a new compiler wasn't ready.
It might not be true now because LLVM and GCC can generally put a commercial compiler 6 feet under, but if you're paying for a compiler you'd definitely want to choose the one that delivers the best performance (money being no object).
As Patrick mentions, ICC generates code that doesn't follow IEEE-754: https://news.ycombinator.com/item?id=20437375 (I should have mentioned I was talking about that rather than the C standard).
IME, benchmarks aren't enough of an impetus to move between compilers, but are often a not-insignificant piece of what's considered when moving is otherwise motivated.
> This is driven by programmers' insatiable thirst for performance. Compiler writers are constantly judged on benchmarks, and the only way to squeeze that last flop out of a piece of code is to take the specification to its extreme.
Ironically, the strict aliasing rule (which is one of the most common causes of UB) makes writing fast programs much harder, because it forbids type punning (except via memcpy or unions).
BTW, according to WG14 mailings and minutes, the C committee is considering either relaxing it or creating a standard way to suppress it in C2X. I can't wait for it.
Writing fast programs while keeping the strict aliasing rule in mind isn't all that hard: compilers know the semantics of memcpy and can optimize your use to what it's "supposed to be in assembly".
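For illustration, here's the kind of memcpy idiom being described (a minimal sketch of my own, names made up); a decent compiler recognizes the memcpy and lowers it to a plain register move:

    #include <stdint.h>
    #include <string.h>

    /* Read a float's bit pattern without violating strict aliasing.
       Assumes a 32-bit float, as on all common platforms. */
    uint32_t float_bits(float f) {
        uint32_t bits;
        memcpy(&bits, &f, sizeof bits);   /* recognized and optimized away */
        return bits;
    }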
Brian Kernighan didn't invent C, he co-authored a book about it.
> Whenever discussions like this come up, I like to quote the C standard itself on its definition:
"NOTE: Possible undefined behavior ranges from [...] to behaving during translation or program execution in a documented manner characteristic of the environment"
The actual quote is: "NOTE Possible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message)." - but you snipped out the part that contradicted your point.
> The whole point of C, as K&R and ISO defined it, is to let you do "what the hardware does". They left parts of the standard purposefully undefined, so it could remain applicable to a wide variety of implementations. The intention was definitely NOT "screw the programmer", as a lot of the compiler writers seem to have interpreted it, but to allow them to do something that makes sense for the environment.
This seems to be contradicted by the full quote above: "ignoring the situation completely with unpredictable results" doesn't sound like it's going to "make sense for the environment". It's also pretty inflammatory to say that the intention of a compiler author is to "screw the programmer".
> Now we have people, mostly academics from what I've noticed, who are taking the language farther and farther away from reality, focusing only on something uselessly abstract and completely ignoring the practical consequences of what they're doing. Compilers are becoming increasingly hostile to programmers. A programming language that had humble and very practical uses, with straightforward and easily understood behaviour, has been perverted into a theoretical quagmire of uselessness.
So, the "academics" are making compilers "hostile to programmers" and perverting the language into a "theoretical quagmire of uselessness"? That's nonsensical anti-academic garbage unsubstantiated by fact.
> Now we have people, mostly academics from what I've noticed, who are taking the language farther and farther away from reality, focusing only on something uselessly abstract and completely ignoring the practical consequences of what they're doing. Compilers are becoming increasingly hostile to programmers.
I don't think it's the academics, it's the compiler implementers - driven by wanting to win at benchmarks. Our industry is absurdly focused on "performance" over all else (even correctness). But then again, those who care about other things moved on to non-C languages years or decades ago.
Indeed, it's compiler implementors, not academics.
However, the performance impact of optimizations that take advantage of UB is not known, and is potentially very large. It would be a very interesting experiment to modify a C/C++ compiler so that every C/C++ program has defined semantics in terms of a simple array-of-bytes abstract machine, and see how much slower the generated code is compared to the regular compiler.
> It would be a very interesting experiment to modify a C/C++ compiler so that every C/C++ program has defined semantics in terms of a simple array-of-bytes abstract machine, and see how much slower the generated code is compared to the regular compiler.
Alternatively, look at compilers like Intel's ICC --- it has historically been one of the best at code generation, yet it's not known for having anywhere near the same level of UB-craziness as Clang (or GCC, to a lesser extent). The same has been my experience with MSVC, at least the earlier versions.
Uh, ICC does some crazy things to get some of its performance, such as making transformations not allowed by the standard (inventing writes), in order to get better vectorization.
Which major compiler is mostly implemented by academics? Neither GCC nor LLVM, for sure.
Us academics do have a lot of "fun" figuring out a way to put what the compiler developers do on solid footing [1]. But this is a game of whack-a-mole that will never end: each time we find a way to formally approach some new crazy optimization they came up with, and help them weed out all the bugs that were caused by not carefully thinking through all the implications [2], the next crazy optimization comes up [3].
Rust is in a very different situation though. Safe Rust has no undefined behavior - if something is behaving in an unsound manner, then that's a compiler bug.
Safe Rust does have some implementation-specific behavior - for instance, panics can either unwind or abort.
In Unsafe Rust, all bets are off, and the programmer may trigger Undefined Behavior if they're not careful. The upshot is that it's easier to maintain the compiler's invariants because you only need to be careful in a single place about making sure everything is in order. By clearly marking places where the programmer is in charge of maintaining the invariants instead of the compiler, the hope is that mistakes will be minimized (thanks to careful code review and making the programmer think a lot about what they're writing).
Further helping with that, the goal is to have tools to help catch UB at runtime. For instance, miri is an interpreter for MIR (an intermediate language between rust and the LLVM IR) that checks for UB.
I agree that in C, UB has probably been stretched way beyond its intended purpose. But in Rust, this new definition of UB is taken advantage of in much less WTF-inducing ways, allowing us to enjoy those performance guarantees without losing our minds.
I would interpret that to mean that if memory is not initialized to a specific value and you're working in a physical rather than virtual address space, then a read from some random pointer to uninitialized data might return an unpredictable value, such as the value of a GPIO register or a word from an RS-232 buffer. There's clearly a defined behavior for that specific platform.
And when they're talking about doing something hardware-specific I would take that to be about the MMU on certain platforms. Platforms with an MMU and where the processor is executing code in a virtual address space can generate an access violation if the program code reads data at a virtual address that is not mapped to a specific segment of physical memory.
Even when a program runs in a physical address space, the processor may be configured with an address space that is significantly larger than the actual block of memory and registers on the memory bus. And this matters because if you have a 32-bit processor but the upper 3 gigabytes of the address space don't map to a register file or RAM then it's hardware-dependent what happens when you read from those addresses.
Some hardware will roll the map over and read from somewhere in the available section of the address space based on the offset, and some hardware will treat that as an error which triggers an access violation. Some hardware will bug out and do something completely unexpected. It's undefined behavior what happens in this case. It has everything to do with how your specific chip, mainframe, or minicomputer is wired. And that's what they're referring to.
Undefined behavior does not mean "do whatever you want," it means "do what the hardware would probably do in this situation, or do whatever can be reasonably expected." And that's important here because that means that UB is not carte blanche to violate the programmer's expectations here.
As far as the standard goes, you're interpreting undefined behavior to mean unspecified behavior -- behavior that's well-defined, but whose definition "depends on the implementation" (in this case, your hardware). I mean, I can't stop you if you want to read it that way, but you're literally interpreting it to mean exactly what they did not intend it to mean... that's why they chose separate terms and explicitly told you undefined behavior is allowed to be unpredictable, rather than being required to be implementation-dependent.
But let's ignore the standard... who cares what it says...
The thing is, what you're asking for inhibits optimizations that many people would very much like to see from their compilers. Like for example if you have an uninitialized function pointer that you only assign to in 1 or 2 locations -- the compiler should be able to just replace the indirect function calls with direct function calls. You're demanding that it doesn't do that, and that it simply call whatever function or non-function that pointer happened to point to. I mean -- you're welcome to ask for that, and maybe your compiler should have a flag to make it behave that way (or maybe it does already? do you use it if so?), but to me and many other people, the compiler should obviously be permitted to see right through that.
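For concreteness, a hedged sketch of the uninitialized-function-pointer case being described (names made up):

    #include <stdio.h>

    void greet(void) { puts("hi"); }

    void demo(int cond) {
        void (*fp)(void);     /* uninitialized function pointer */
        if (cond)
            fp = greet;
        fp();                 /* calling fp with cond == 0 is UB, so the
                                 compiler may assume cond != 0 and emit a
                                 direct call to greet() here */
    }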
Rather than telling someone what they are demanding, it's often better to ask them --- especially if you are sure that what they are demanding is stupid. Personally, I'd guess that 'jschwartzi' might be fine with a compiler that makes the optimization you refer to, and is instead objecting to a compiler that deletes essential safety checks in other parts of the program on the assumption that all bets are off once "undefined behavior" can be proven to occur. If he's like me, he'd probably also prefer that the compiler issue a warning about the undefined behavior rather than silently making changes to the program. But better to ask him than to guess.
I don't think it's a stupid demand at all -- like I said, I have nothing against implementations behaving more nicely if they wish. I'm just saying that this is an extra demand, not an interpretation of the standard, and that it would have performance repercussions which many C users would rather avoid.
In the case of your safety check example, it'd be nice if you could mention something concrete so we know exactly what situation you're talking about. But I mean, I can't rule out that maybe you'll find a couple situations here and there where the standard shouldn't leave things undefined. But the argument I'm rebutting here is that all instances of UB must behave "like the hardware", not that this particular instance is good but another one is bad, so I'm not sure you two would agree. I agree warnings would be nice too (some of which already exist), and I think despite their current efforts compilers still have some way to go (e.g. a macro expanding to 0 should probably not behave the same as the literal 0 when you multiply, say, by a constant), but again, that's already assuming you're fine with UB...
> In the case of your safety check example, it'd be nice if you could mention something concrete so we know exactly what situation you're talking about.
There are some well-publicised cases of compilers removing NULL-checks[0] on the assumption that the NULL value can't occur as it would be UB.
As another example, the Linux kernel assumes in several places that signed integer overflow wraps, so it has been compiling with -fno-strict-overflow/-fwrapv ever since GCC started optimizing based on this piece of UB (they noticed the compiler behavior change and added the flags before releasing any faulty kernels though, apparently).
I know about that NULL check example, but the thing is, the compiler is being pretty reasonable there. The fact that the pointer is dereferenced means that the NULL check was useless: if the address is NULL, then the NULL check can never be reached because you'd have already segfaulted (your beloved hardware behavior!) on the dereference earlier. [1] So that code is dead, and needs to be removed. The fact that the dereferenced variable is unused means that the compiler needs to eliminate that too.
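In sketch form (not the actual kernel code, just the shape of the pattern, with made-up names):

    struct device { int status; };

    int poll_status(struct device *dev) {
        int status = dev->status;   /* UB if dev == NULL */
        if (!dev)                   /* dead under the abstract machine... */
            return -1;
        return status;              /* ...so only this path gets emitted */
    }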
People want these optimizations individually. They don't want to keep dead code taking up cycles and they also don't want dead variables taking up registers. So you can't really find much support arguing that those optimizations should be removed entirely. The only real possibilities you can propose here are that the compiler should have magically re-inserted the pruned check during the second optimization, or that it should have performed them in the opposite order. But are you sure these are actually possible and if so, practical? I mean, maybe they are, but they are far from obvious to me. I can easily see the compiler thrashing and failing to reach a fixed point if it re-inserts code that a previous optimization pass pruned. Similarly, I don't see how the compiler can just magically detect an optimization order that ensures "surprising" situations like this don't occur. My guesstimation is that it would carry severe downsides people wouldn't want. Now maybe I'm just not smart enough to see a good solution to this that doesn't carry significant downsides, and there's already one out there. If there is, I'm curious to hear about it, and I hope someone implements it under some flag, but I have yet to hear of one.
For signed integer overflow -- that might be one place where I think it would make sense to just define it to either wrap with 2's complement just like unsigned integers do, or to be unspecified behavior that falls back to the implementation's representation. Though in the latter case... you already have an implementation-specific solution: your compiler flags. But again, we might agree on a couple optimizations here and there, but that's a far cry from saying UB should just fall back to hardware behavior. And honestly, I'm not even here supporting C; I hate it. If you want wrapping, I would suggest it's a sign you might want to use C++ already. Then you can define an integer type that will play Beethoven when you overflow, and people who want their UB on overflow can have that too.
[1] I'm ignoring the validity of address 0 in kernel-mode here; there's also some subtleties on what's a null pointer and what's address zero that are rather beside my point.
I think that everyone agrees that NULL checks should be elided when the compiler can prove that they are useless. However, I think most people assume that the way for the compiler to prove that would be "when it sees that the variable has a non-NULL value" - e.g. `int v = 0; int *p = &v;`.
That's the sort of thing I would suggest - don't work back from UB (and I agree, I wouldn't expect the optimizer to backtrack optimizations as new facts come up), work forward from actually known facts.
NULL checks are probably a pretty bad example, since the NULL access would surely SEGFAULT if allowed to execute (though the particular case of address 0 vs NULL, and of code catching segfaults like on Windows, throw a wrench in this assumption even here), but other types of UB are much worse.
If you accidentally issue a read from a point after the end of an array, but only later check that the index was within bounds (e.g. `int x = a[i]; if i < len(a) return x; else return NULL`), the compiler eliding the bounds check by the same logic will take a program that might have been safe in practice to a program that is certainly not safe. Note that I don't know if compilers perform this type of optimization, so this may be a hypothetical.
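Spelled out as C (still hypothetical, with made-up names), the pattern would look like this:

    #include <stddef.h>

    int read_then_check(const int *a, size_t len, size_t i) {
        int x = a[i];        /* out-of-bounds read if i >= len: already UB */
        if (i < len)
            return x;
        return -1;           /* a compiler reasoning backwards from the UB
                                could conclude this branch is unreachable */
    }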
In general though, I think that the tension here comes from C being used in two very different use cases: one is C used as portable assembly, where you expect the compiler to keep a pretty 1:1 mapping with your code; and the second is C used as the ultimate performance language, where you drop to C when you can't optimize further in anything else. I think most of the complaints about "exploiting UB" come from the first camp, whereas the second camp is pretty happy with the current status quo.
I think you're focusing so much on particular examples you're missing the larger point. To repeat what I've said repeatedly, I'm not defending every single instance of UB in the standard. And I can't keep going back and forth with you to debate every single one (yes, int overflow = no UB, NULL deref = yes UB, out-of-bounds = maybe UB, etc.), which is what we're ending up doing rather pointlessly right now. The point to take away here is that UB itself as a notion is something people in both camps desire in many scenarios, so you can't just get rid of it in its entirety and say "map it to hardware" or "always reason forward". Because, again, if you do, in many cases, those would inhibit optimizations that people want. The only real solution is for people to stop seeing C as a portable assembly language, which it is simply not. It's defined in terms of an abstract machine, so either people need to switch languages, or switch their mental models.
I don't think emitting a warning is feasible in most cases. The compiler (generally) doesn't know at compile time that UB definitely occurs, only that some UB exists on some path that may or may not be reachable in theory or practice.
Usually these are not reachable in practice so warnings that these exist would cause a tidal wave of pointless warnings. Instead, the compiler simply prunes those paths which can lead to better code generation for the paths that are taken.
Depends how you look at it. They don't ignore it during translation but that's because they work hard to make the program "ignore it completely" (i.e., not spend a single cycle on the possibility) during execution.
If stripping code whose only relevance is in a UB situation doesn't amount to ignoring the UB situation completely, I don't know what does. It's literally the production of code that is completely ignorant of the UB situation.
The undefined behavior exploitation that compilers do is entirely reasonable and not something that we should go back on. Defining overflow as wrapping, for example, would be bad for security because it means we couldn't check it with UBSan.
Rather than rehashing the same arguments I've made over and over, I'll just link to parts of a Twitter thread where Daniel Micay argues eloquently that keeping the sources of UB that we have today as they are is important:
The problem is that if you did that then you couldn't check regular C programs anymore, since overflow would be valid. Right now we're in the nice situation in which normal C programs are expected not to exhibit signed overflow. Change that state of affairs and suddenly a lot of obvious mistakes become valid.
> The problem is that if you did that then you couldn't check regular C programs anymore, since overflow would be valid.
That also means it's possible to write overflow checks easily, and ones that the compiler won't optimise out. Before compilers became UB-crazy, you could write such checks in the most straightforward way, and get exactly what you expected. I'd consider that a far bigger advantage for security than arguing for the existence of a tool whose sole reason for existence seems to be due to the presence of UB in the first place.
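To make that concrete, here's a sketch of the two styles of check (my own code, not from the thread): the "straightforward" version relies on wrapping, which is UB for signed int, so a modern compiler may delete it; the second version tests before adding and involves no UB:

    #include <limits.h>

    int add_or_zero_naive(int a, int b) {
        int sum = a + b;                          /* may overflow: UB */
        if ((b > 0 && sum < a) || (b < 0 && sum > a))
            return 0;                             /* "overflow detected" */
        return sum;
    }

    int add_or_zero(int a, int b) {
        if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
            return 0;                             /* overflow would occur */
        return a + b;
    }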
You could always say that wrap-around has to either (a) cause SIGILL or (b) return the wrapped-around result. That still allows linting with a sanitizer without involving any UB at all.
This is effectively what Rust does (replace "SIGILL" by "panic").
As far as I can tell, signed overflow is undefined behavior because when the standard was defined there were a lot of machines where it would trap or do something weird, and for some reason the standards authors did not choose to make it implementation-defined behavior.
UB has [developed a lot][1] since then. Now UB is a way for the programmer to help the compiler generate better code by providing extra information that is hard for the compiler to prove itself. I think, in general, this is actually an ingenious idea (I have [written about this][2] in the context of Rust before). But it can surely be taken too far, and it is particularly a problem when the programmer is not aware of the promises they are making. This is more an API design problem though than a fundamental problem with UB itself.
Side note, Kernighan (of K&R) did not have any part in the invention of C, it was all Dennis Ritchie. Kernighan famously wrote the book on it with Ritchie but that was it.
In his own words[1]:
> remember, C is entirely the work of Dennis Ritchie, I am but a popularizer
While I agree with you, it is true that C of the K&R days doesn't exist anymore because computers are no longer like PDP-11s. This bookmark of mine is under "importantkeep"[0] (also cited in the article).
I agree though, the moment abstract machines distract from actually creating something is the moment one has gone too far.
> "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied, including, without limitation, ... FITNESS FOR A PARTICULAR PURPOSE.
which would permit the GCC steering committee to decide tomorrow that every other statement must be "HAIL SATAN;" or your code won't compile.
Of course, "can" doesn't mean "should," and if they actually did it, it wouldn't say anything particularly deep about the GPL being a bad license, it would merely indicate that the GCC steering committee had become untrustworthy, and that the community should act accordingly.
Undefined behavior is exactly analogous. It was a degree of freedom designed to allow compiler writers to smooth over CPU architecture friction back in a time when the CPU architecture scene was far less settled than it is now. Trust and responsibility were always understood to be necessary components to make it work. The fact that trust can be broken isn't deep, isn't surprising, and isn't a failure of C.
If C compiler writers want to use UB as an excuse to abuse the trust they have been given, that's on them. I am moving away from C, for many reasons, but the UB-related reason has very little to do with UB being inherently evil and very much to do with the fact that it seems to be placed in increasingly irresponsible hands.
> the Rust program you wrote does not run on your hardware. It runs on the Rust abstract machine
Excellent point, and very well made… Thank you for writing this!
There is a constant back-and-forth here on Hacker News about whether or not “undefined behavior” is the root of all evil or the root of all real-world optimizations… And your article does a great job of explaining, in real-world terms, what UB really (in part) is.
I don't like this formulation much. I would rather say that the program does indeed run on your hardware, but the program that runs is not the program you wrote. It's the program into which the program you wrote has been transformed by the compiler - better yet, it could be any one of the programs into which the compiler is allowed to transform the program you wrote.
I prefer this because it is more open about the fact that all the weird and counterintuitive behaviour happens because the language specifiers and compiler implementers decided to make it that way (usually, with good reason!). It's not some unavoidable property of the universe, or the machine.
The idea of an abstract machine is still really useful, because it lets you reason directly about the code you are writing, rather than having to express and reason about what the compiler might do with it. But I think we should be clear that it's a tool for thinking, not a truth.
The idea that you can ignore the real hardware is particularly unhelpful in Rust, because it's a great fit to low-level problems where the real hardware is a big deal. For example, at work, we have a Rust program where we routinely need to think about NUMA placement and cache coherency protocols. Those don't exist in the Rust abstract machine at all!
Another argument for considering the abstract machine as the primary way to think about programs in a language is that it is very easy to end up with a set of optimizations that all look reasonable in isolation but are inconsistent, and lead to incorrect code when combined. Both GCC and LLVM suffer from this (and in fact MSVC had/has the exact same bug).
> The idea that you can ignore the real hardware is particularly unhelpful in Rust, because it's a great fit to low-level problems where the real hardware is a big deal. For example, at work, we have a Rust program where we routinely need to think about NUMA placement and cache coherency protocols. Those don't exist in the Rust abstract machine at all!
If I understand Ralf's overall model here, I think they'd argue that this sort of reasoning must be done within the context of the abstract machine's behavior for std::ptr::{read,write}_volatile, no?
The kind of reasoning we do is "if thread A writes to this location, then next time thread B writes to this location, it will have to take ownership of the cache line, which will take N cycles, so let's not do that". I don't think that kind of performance reasoning maps on to anything in Rust.
I think thinking about real hardware (most of the time) just distracts from thinking about what your program "actually does", which is specified by the abstract machine. By thinking in terms of the abstract machine, you can forget about compilers and optimizations when writing your program, and focus on your code and what it does.
Of course, when you ask why the abstract machine is the way it is, optimizations and hardware come up again. But I think these concerns are better separated. This also mirrors how languages like C/C++ are actually designed, at least in theory: optimizations are justified against the abstract machine, not the other way around. Isn't it much easier to have one, albeit weird, machine in your head, than a (only marginally simpler) "real" machine plus a list of optimizations that also contribute to the behavior of the compiled program? And that list can even change any time!
> The idea of an abstract machine is still really useful, because it lets you reason directly about the code you are writing, rather than having to express and reason about what the compiler might do with it. But I think we should be clear that it's a tool for thinking, not a truth.
If you read the C/C++ standard, you can see that the abstract machine is the truth. Same if you read the really well-written WebAssembly standard (which comes with a mathematically precise formal definition).
So IMO you got it backwards. The abstract machine is the truth, the optimizations are just the way that machine gets exploited right now and can change with any compiler update. Of course the abstract machine is not "God-given", but neither are the optimizations. And the abstract machine can only change when switching to a new version of the language standard, the optimizations can change on any minor compiler update. The machine is much more stable than the list of optimizations.
> The idea that you can ignore the real hardware is particularly unhelpful in Rust, because it's a great fit to low-level problems where the real hardware is a big deal. For example, at work, we have a Rust program where we routinely need to think about NUMA placement and cache coherency protocols. Those don't exist in the Rust abstract machine at all!
I admit that once you think about performance, the details of what your compiler and hardware happen to do become very relevant. But when talking about correctness, I think that is an unsuited level of (lack of) abstraction.
EDIT: Based on this feedback and others, I have amended the blog post a bit. It now says
> Maybe the most important lesson to take away from this post is that “what the hardware does” is most of the time irrelevant when discussing what a Rust/C/C++ program does, unless you already established that there is no undefined behavior. [...]
> UB-free programs can be made sense of by looking at their assembly, but whether a program has UB is impossible to tell on that level. For that, you need to think in terms of the abstract machine.
Would have liked to see the machine code generated by the example function, and a deeper dive mapping compiler choices to the unintuitive results.
The article (indeed the point of it) abstracts that all away behind "undefined behavior" and a mental model sitting between your code and its resulting executable. Which is fine, but it leaves a loose end which fails to sate my curiosity.
It depends on whether you use rustc 1.28 or rustc 1.36, and whether you compile with or without optimizations. This does not crash in unoptimized Rust (either 1.36 or 1.28) but it will crash in optimized Rust 1.36.
I think, though (as your question indicates), that the author misses the point of why people care about "what the hardware does". At the end of the day, assembly code is going to execute, and that assembly code is going to (despite the author's protestations to the contrary) have well-defined memory of one value or another. The moment you start saying "Rust has a third value of uninitialized" the question comes up "How is that abstraction enforced by the hardware?" This is valuable information for understanding how the language works.
From the author's discussion, I was expecting some sort of sentinel value to be checked; instead, the uninitialized memory access is detected by the compiler and it panics uniformly regardless of the actual memory state.
The idea that one should only worry about the abstract virtual machine of rust seems like an encouragement of magical thinking. "Don't worry about how any of this works, the compiler will just make it happen". This will not go over well with many people who are curious about learning Rust.
However, if the author is arguing "Don't let the behavior of a naive enforcement of a Rust safety construct dictate how the optimized version should work" this seems like a more interesting position; but it's not clear that is the argument being made here.
> However, if the author is arguing "Don't let the behavior of a naive enforcement of a Rust safety construct dictate how the optimized version should work" this seems like a more interesting position; but it's not clear that is the argument being made here.
This is exactly the point the author is arguing. The focus of all their work on UB is to make sure safe Rust can do all the optimizations we would like, by careful design of the abstract machine.
The immediately visible outcome of this work is a set of rules for what you can do in unsafe Rust, which taken together amount to this weird-looking abstract machine with its extra "uninitialized" values- something that can be implemented efficiently on real hardware assuming no UB.
The point here is that this abstract machine is a better, simpler, easier way to convince yourself whether or not an unsafe Rust program is well-defined, and that "what the hardware does" is too many layers removed to be a good tool here. You can think about "what the hardware does" another time, for other purposes, but trying to do so in this context is actively unhelpful.
shrug. There's a difference between saying "This is a useful abstraction" and saying "understanding what assembly is generated is irrelevant in understanding what your program does so I will try to end all discussions where it comes up".
I mean, the latter seems quite a bit more extreme, and is what the author explicitly is calling for.
There are most certainly guarantees on the generated assembly. The assembly has to enforce the abstract machine. I want to know how it does that. It can change, that's fine, it can be improved, it can be made worse, but the idea that the rust program doesn't run on physical hardware, as explicitly stated in the article, is pure bullshit.
> The assembly has to enforce the abstract machine.
The assembly has to implement the abstract machine only if your program has no UB. The assembly never has to check if memory is "initialized" or not even though that distinction is real on the abstract machine, because if the difference would matter, your program would have UB.
To determine if your program has UB, looking at the assembly is useless. The only way is to consider the abstract machine.
Personally, I like to bind the compiler to implementing the abstract machine in all cases, and in the face of undefined behavior the abstract machine has no requirements on its behavior. Of course, this is just a semantic quibble: in practice, the results are the same ;)
The standard will not specify anything, so what the compiler outputs is gibberish. You are literally looking at a sequence of bytes on which no constraints whatsoever are imposed. LLVM could have compiled my UB program to `0xDEADBEEF` (which I assume is not valid x86 but I do not know) and there would be no compiler bug. Looking at `0xDEADBEEF` here is not useful.
Trying to interpret the assembly of a UB program is like trying to interpret the noise of a radio when there is no station on the given frequency. It has more to do with fortune telling than anything else. There is no signal in there, or at least not enough of it to be useful.
There is no standard mapping between "your C code" and "what your computer will do" if your code has undefined behavior. Your compiler will produce some assembly, which you cannot rely on, and that will be "what your hardware does". If that's what you're trying to say I think we agree.
> The assembly has to enforce the abstract machine.
Yes, but this only really means anything in the absence of undefined behavior. The compiler's job is to generate assembly that produces the results that running the code in the abstract machine would, but the issue is that undefined behavior allows the abstract machine to do arbitrary things, so the compiler is free to generate whatever it likes in this case.
Um... this seems to be a stronger case, then, for asking what the hardware does. If the compiler can generate arbitrary code, then the only recourse to understand what the resulting binary actually does is to look at the actual generated assembly. Understanding what the compiler was trying to do (and yes, this will change based on which compiler version you used) would presumably be helpful in that process. Sure, don't design around this behavior, but if you find yourself deploying an executable with undefined behavior, and you need to figure out the scope of the problem, this seems useful.
I don't get this hostility to understanding the tools you're using.
> the only recourse to understand what the resulting binary actually does is to look at the actual generated assembly
Pretty much, yes.
> I don't get this hostility to understanding the tools you're using.
Don't take me the wrong way: I'm interested in how compilers work, but I accept the concession that I can only really understand their output when my program is free of undefined behavior. It would be nice to have the compiler try its best in the cases where I am violating the rules of the programming language, and often it will do so, but in general I cannot expect this and trying to do so will require making some sort of tradeoff with regards to performance or language power.
> At the end of the day, assembly code is going to execute, and that assembly code is going to (despite the author's protestations to the contrary) have well-defined memory of one value or another.
The point is that you may not get the assembly you assume you’re going to get. Like the example shows, it may never even generate something that accesses the value at all.
Guess there's some bigger context that I'm missing here. I wouldn't have expected saying "Understanding how your compiler enforces its abstract machine is beneficial" would be a controversial position to take.
No, that's when it's least interesting to ask what the hardware does. A program with UB will not reliably compile to any particular hardware behavior, so changing unrelated parts of the program or upgrading your compiler can change which hardware behavior you get.
The actual hardware behavior is useful for other purposes, like understanding why the abstract machine is the way it is, or understanding and improving the performance of well-defined programs, but it is not useful at all once you have UB.
`main` just calls `panic` immediately. The function returns undef. The reason: `x < 150` is undefined, `undef || ...` is undefined, thus the first if statement with side effects may end up treating the undefined value as true.
I wonder why the optimiser didn't choose false and let the assert pass?
> I wonder why the optimiser didn't choose false and let the assert pass?
I don't know exactly, but it's kind of a moot point. I would have negated the statements until I found a way to make it return what I want it to.
The point is that the compiler picks some result, and it does so "locally", so when it picks results for multiple comparisons it makes no attempt to check that these results are all "consistent" and can even arise for a single value. The result that we can observe is that the value is "unstable".
To "fix" this (assuming we wanted to specify that unstable values are not allowed in C/C++/Rust), the compiler would have to keep track of which constant foldings it already did for some uninitialized value, and make sure it remains consistent with that. That's a hard problem and likely undecidable in general. Allowing unstable values frees the optimizer from this burden, letting it optimize more code better.
That's what I expected. And I think that is probably due to an optimisation rather than uninitialised variables (can anyone confirm that?).
I am sceptical the author really knows much, as some of their statements seem blatantly wrong or just nonsense:
"So, one time we 'look' at x it can be at least 150, and then when we look at it again it is less than 120, even though x did not change."
Is talking about "x < 150 || x > 120", but gets it the wrong way around, ouch!
"Memory remembers if you initialized it. The x that is passed to always_return_true is not the 8-bit representation of some number, it is an uninitialized byte."
Benefit of doubt could be extremely poor metaphors, or referencing the wrong code?
Also stating C is not low-level is a conceited attempt to redefine the word.
> "So, one time we 'look' at x it can be at least 150, and then when we look at it again it is less than 120, even though x did not change."
> Is talking about "x < 150 || x > 120", but gets it the wrong way around, ouch!
It's not the wrong way around. The assertion failure being discussed happens when the function returns false, which happens when both sides of the || are false. Technically he should have said "less than or equal to 120" rather than just "less than", but otherwise it's accurate.
> Technically he should have said "less than or equal to 120" rather than just "less than", but otherwise it's accurate.
I suspect it may be because the author is German. In French at least «inférieur» means “less or equal than” and you need to say «strictement inférieur» to say “less than”, and I wouldn't be surprised if it were the same in German.
Fair call. But the real point is that the function gets compiled to:
    xor eax, eax
    ret
i.e. the input variable is not compared with 150 or 120. His intuition about his code is wrong - it has been compiled out (unless I am missing something about choosing a different optimisation level, or declaring things volatile, etc).
You are making exactly the mistake the post is all about. :)
Only UB-free programs can be made sense of by looking at their assembly. Whether a program has UB is impossible to tell on that level. For that, you need to think in terms of the abstract machine.
I mean, look at the code I wrote! It literally compares `x` with 150 and 120. That's the program I wrote. This program has a "meaning"/"behavior" that is entirely independent of compilers and optimizations, and determined by the language specification. How can you argue that it does not compare `x`?
Right. When you are talking about the compiler and the abstract machine, perhaps some sentences could be clearer about that (e.g. the sentences I latched onto - which I admit is my fault for skim reading).
Responding to whether C is "low-level" I like this comment: http://lambda-the-ultimate.org/node/5534#comment-95721 And processors have undefined behaviour so should we say assembly is not "low-level"? e.g. "Grep through the ARM architecture reference manual for 'UNPREDICTABLE' (helpfully typeset in all caps), for example…" - pcwalton
The argument for C not being low-level is not via UB, it is via the fact that a lot happens when C gets translated to assembly, and to explain that you need to consider an abstract machine that is many things, but not low-level.
The author tries to ascribe too much meaning to undefined behavior and gets some parts of this wrong, but they are correct in saying that C is not a low-level language in the context that they're using it.
I talked a bit about it here: https://news.ycombinator.com/item?id=20435309. Basically, the compiler has no need to do things like perform reads in the face of undefined behavior: it could output an empty executable if it wished. Maybe your specific version of the compiler does, but that doesn't mean others (or even a future version of yours) will. Trying to figure out what a compiler might do in the face of undefined behavior is generally not a worthwhile exercise.
> Trying to figure out what a compiler might do in the face of undefined behavior is generally not a worthwhile exercise.
That is exactly the point of my post! If you think I disagree with that statement, we seriously miscommunicated somewhere.
The parts you seem to be concerned about are those where I try to explain why the abstract machine is the way it is. Hardware and compiler concerns do come in at that point, and my feeling is just dogmatically giving an abstract machine won't help convince people of its usefulness.
Is there an example of an actually helpful and practical optimization enabled by undefined behavior in C/C++? All I can remember are discussions of its pitfalls, like this.
> In the case of our example, the program actually compares such an “unobservable” bit pattern with a constant, so the compiler constant-folds the result to whatever it pleases. Because the value is allowed to be “unstable”, the compiler does not have to make a “consistent choice” for the two comparisons, which would make such optimizations much less applicable. So, one time we “look” at x the compiler can pretend it is at least 150, and then when we look at it again it is at most 120, even though x did not change.
One classic helpful example that gets brought up is integer overflow and loops. Because overflow is undefined in C and C++, the loop doesn’t need to check for it on each iteration.
If x <= 7 then the answer will be 420, but if x > 7 then behavior is undefined because x will overflow, and x is signed, and signed integer overflow is UB in C, so the compiler is free to, for example, conclude that blazeit() is never called with x > 7, thus it can prove that blazeit() always returns 420. Besides, if the compiler chooses to implement signed integer overflow much like unsigned integer overflow, then x will eventually come around to 7 anyways, so the answer must be 420.
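The code isn't quoted in this thread, but a hedged reconstruction of the kind of function being described is:

    int blazeit(int x) {
        if (x == 7)
            return 420;
        return blazeit(x + 1);   /* if x starts above 7, reaching 7 would
                                    require x to overflow INT_MAX, which is
                                    UB, so the compiler may assume it never
                                    happens and fold the whole function to
                                    "return 420" */
    }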
I expect the compiler would optimize the tail recursion into a loop, so the likely worst case is not that you blow the stack but that you spin for a while.
The loop (and function) terminates when x == 7, and the compiler can show that x will be 7 at some point i.e. it knows it's on a system with overflowing integers
I disagree that bit- and byte-level uninitialized models are equivalent. Consider a program that uses one bit of a bitfield in a stack-allocated struct. The compiler is free to preserve the value of that bit however it pleases --- e.g., in the carry flag --- and randomize the rest of the bits if you ever read the whole byte.
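A hedged sketch of the scenario being described (made-up names):

    #include <string.h>

    struct flags {
        unsigned ready : 1;   /* the only bit the program ever writes */
        unsigned rest  : 7;   /* left uninitialized */
    };

    unsigned char demo(void) {
        struct flags f;       /* stack allocated */
        f.ready = 1;          /* this single bit may live anywhere, e.g. in a
                                 flag register */
        unsigned char byte;
        memcpy(&byte, &f, 1); /* reading the whole byte: the 7 uninitialized
                                 bits need not come out as any stable value */
        return byte;
    }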
Hm, good point about the bitfields. The paper I cite [1] actually talks specifically about bitfields as their precise semantics in the presence of "poison"-style uninitialized memory is not entirely clear yet.
Why aren't undefined bytes like these treated in the same way as I/O data? That is, arbitrary but fixed data? This seems to align fairly well with how I think about uninitialized data.
If you're asking why the C standard didn't originally define it that way, it's because some architectures might use a trap/invalid representation (that traps when accessed) and we want compilers to be free to reorder memory accesses.
I think I'm mostly asking why this isn't a good solution for Rust, as I think C and C++'s design decisions should be absolutely irrelevant for its development. However, since rustc uses LLVM, this seems to be difficult :(
I suppose it could be enlightening to understand why it wasn't a good decision for C or C++ at the time either.
> some architectures might use a trap/invalid representation
Traps on what? Access of an invalid representation? What if such representations don't exist?
> Traps on what? Access of an invalid representation?
Yeah - certain bit-patterns are just "invalid" rather than representing any given value. It's much nicer to debug, because you get an immediate failure (at the point where your code tries to access the uninitialized variable) rather than having a corrupt value propagate through your program.
> What if such representations don't exist?
Then you can't implement that strategy (other than by emulating it with considerable overhead, e.g. by having an extra marker byte for each variable in your program and checking it on every access). Hence why the C standard doesn't require you to do this.
As originally intended, C left the behaviour undefined so that users on platforms that did have trap representations would be able to take advantage of them. (It's very hard to rigorously specify what accessing a trap representation should do without impeding the compiler's ability to reorder memory accesses). Unfortunately it's ended up being used to do the opposite by modern compilers - not only do they not trap on access to uninitialized values, they abuse the undefined behaviour rules to propagate unexpected behaviour even further from the code that caused it.
But then this should be documented in the type's definition. I don't see this "can also be something other than 0-255" in the type's documentation (which is arguably not at all detailed).
We use types to restrain complexity. It was a mistake in C# to allow every object to be null. A better type system would allow devs to make a contract to easily disallow this, and they are trying to fix this. Now here we have a blog post that seems to be fine with a function parameter of type u8 not actually being of 0-255. That's a huge change to how I always understood the type. Do I now have to implement a null-check equivalent?
Undefined behavior for unsafe code is fine. But there has to be a transition where we go back to classical behavior. And in the blog post's example, this should be somewhere in main. Certainly not the seemingly safe always_returns_true.
I think you're misunderstanding the intent of the post. Everything is indeed meant to be safe outside of an unsafe block; however, it's up to the programmer writing the unsafe code to ensure that. Using `unsafe` is telling Rust "I'm going to break some rules now but don't worry, I know what I'm doing".
So if the programmer doesn't in fact know what they're doing then they can cause bad things to happen outside the `unsafe` block, as this post shows.
> But then this should be documented in the type's definition. I don't see this "can also be something other than 0-255" in the type's documentation (which is arguably not at all detailed).
It's not a valid value of that type - it's not a value you'll ever see if you're using the language in accordance with the spec (and, in the case of Rust, not a value you can ever see in safe Rust). It's an uninitialised value.
> We use types to restrain complexity. It was a mistake in C# to allow every object to be null. A better type system would allow devs to make a contract to easily disallow this, and they are trying to fix this. Now here we have a blog post that seems to be fine with a function parameter of type u8 not actually being of 0-255. That's a huge change to how I always understood the type. Do I now have to implement a null-check equivalent?
The point is for the language to do the null-check equivalent for you. A trap representation is null done better. Silently defaulting to a valid value is even worse than silently defaulting to null, because the value propagates even further from the point where it's wrong - imagine e.g. a Map implementation that, rather than returning null for a key that isn't present, returned an arbitrary value.
(Of course in the case of a Map, returning Maybe is better. But there's no way to do an equivalent thing for uninitialized variables, unless we made every single field of every single struct be Optional, and that's actually just equivalent to reintroducing null - the advantage of using Optional is the ability to have values that aren't Optional, at least in safe code).
> Undefined behavior for unsafe code is fine. But there has to be a transition where we go back to classical behavior.
Unfortunately no, that's not and has never been how undefined behaviour works. Undefined behaviour anywhere in your program invalidates the whole program and can lead to arbitrary behaviour anywhere else in your program (this has always been true with or without trap representations).
Pragmatically, what you want in the blog post's example is to get an error that tells you that the bug is that x was uninitialized, as soon and as close as possible to the point where x is actually used uninitialized. Ideally that would be on the "let x = ..." line (and if you didn't use "unsafe", that line would already be an error), but given that you've made the mistake, you're better off having an error as soon as you touch x (which happens in always_returns_true). Then you can see what the problem is and what's caused it. If always_returns_true runs "successfully", returning false, then you don't actually find out there's a bug until later (potentially much later) in your program, and have to do a lot of detective work to find out what went wrong.
> Unfortunately no, that's not and has never been how undefined behaviour works. Undefined behaviour anywhere in your program invalidates the whole program and can lead to arbitrary behaviour anywhere else in your program (this has always been true with or without trap representations).
Of course it's faster - any program can be compiled into something "faster" by making the entire thing do nothing. What I mean is, why isn't uninitialized memory properly defined to be an arbitrary fixed string of bytes? In the example of the post, the compiler would look into `always_returns_true`, and either say "okay, is `x < 150`: well, I have no idea what `x` is, so I can't tell for sure", OR "Ah, for any value of `x` this expression is true, so let's replace it with `true`". There would be no 257th value of "uninitialized"; the value of `x` would definitely be a single value in the allowed range of that type, but it's indeterminable.
To be clear, the compiler is not forbidden from optimizing `always_returns_true` to unconditionally return true. After all, undefined behavior can cause anything to happen, and that includes returning true. Normally LLVM would perform exactly that optimization. But in this case `always_returns_true` is inlined first, so it becomes something like `undef < 150 || undef > 120`; LLVM propagates the `undef` outward through the expression until the whole condition is `undef`, and then it arbitrarily picks that the condition should be false.
But `always_returns_true` is not an example of how this particular undefined behavior can be useful as an optimization, merely an example of how it can be dangerous. For some examples of how it can be useful:
Basically, it helps to be able to replace "cond ? some_value : undef" with "some_value", especially when the code has been transformed to SSA form.
- Some architectures literally have a 257th possible value, like Itanium [1] [2]. On Itanium, every register can be set to "Not a Thing", i.e. uninitialized, and the CPU will trap if you try to store such a value to memory. Ironically, NaT was created in order to make the CPU's behavior more defined in a certain case, or at least more predictable... argh, I'm too tired to explain it properly; look at section 2.3.1 of [2] for a somewhat confusing explanation.
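To make the "cond ? some_value : undef" case a bit more concrete, here's a minimal C-level sketch of my own (made-up names) of where such a merge shows up once the code is in SSA form:

    int pick(int cond, int some_value) {
        int v;                 /* uninitialized */
        if (cond)
            v = some_value;
        return v;              /* the merge of some_value and "undef"; the
                                  optimizer may simply return some_value
                                  unconditionally and drop the branch */
    }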
I _think_ I understand the rationale behind `undef` in LLVM, but I still think it's a bad one, since it can, and does, lead to very surprising behaviour.
> LLVM propagates the `undef` outward through the expression until the whole condition is `undef`, and then it arbitrarily picks that the condition should be false.
I think this illustrates what I find counter-intuitive about this whole mess; any function of `undef` shouldn't itself be `undef`. `undef < undef` is false in my head. `undef < 150` is just unknown, not undefined, since we don't know what `undef` is.
> Basically, it helps to be able to replace "cond ? some_value : undef" with "some_value"
This feels really contrived; in what setting would this actually be useful?
> This feels really contrived; in what setting would this actually be useful?
The text file I linked to explains it in more detail, so I'll defer to that.
> I think this illustrates what I find counter-intuitive about this whole mess; any function of `undef` shouldn't itself be `undef`. `undef < undef` is false in my head. `undef < 150` is just unknown, not undefined, since we don't know what `undef` is.
Except that for `undef < undef`, what if they're two different undefs? You would have to track each potentially uninitialized value separately. And then the optimizer would want to strategically choose values for the different undefs – e.g. "we want to merge 'cond ? some_value : undef123' with 'some_value', so let's set undef123 equal to some_value... except that could negatively impact this other section of code that uses it". It's certainly possible, but it would make the optimizer's job somewhat harder.
Since it's too late to edit my comment, I'm replying to note I was a bit off. `undef` actually doesn't propagate all the way to the conditional; instead, LLVM replaces `undef < 150` with `false`, and the same for `undef > 120`, and then `false || false` is simplified to `false`. In C, comparing indeterminate values is undefined behavior, so LLVM would be allowed to replace `undef < 150` with `undef`, but it seems that LLVM itself has slightly stronger semantics.
It's like you suggested in your parent comment: uninitialized memory is similar to I/O data, which can change at any moment without warning. That is, it can be 200 at the `x < 150` check and moments later 100 at the `x > 120` check.
> In the example of the post, the compiler would look into `always_returns_true`
You can't look at each function in isolation. One important optimization all modern compilers use is inlining: short functions (like this `always_returns_true`) or functions that are only used once (like this `always_returns_true`) have their body inserted directly into the caller function, which allows further optimizations like constant propagation.
Ah, I knew my analogy would bite me :)
I didn't mean IO as in "can change at any moment", but as in "read from a file but you have no idea what it is".
You definitely _can_ look at each function in isolation (in this example it's even sufficient to get the "best" possible version of that function), but I do know that you'd usually do an inline pass, and further optimization passes afterwards. I don't see how that changes anything, though. If the function was inlined you'd get the same expression, and still you'd be unable to tell anything about `x`, except that it would have a definite value that you cannot observe. Again, you could argue that no matter the value, the expression would be `true`, so you could replace it.
> So each time an uninitialized variable gets used, we can just use any machine register—and for different uses, those can be different registers! So, one time we “look” at x it can be at least 150, and then when we look at it again it is less than 120, even though x did not change. x was just uninitialized all the time.
Is this actually true? I thought these would just be poisoned and then the optimizer would just do whatever it liked in the presence of undefined behavior (like optimize the function to return true).
It could do that, but it doesn't have to. A common example is "a variable is set inside a loop and used outside it"; if the loop runs zero times, the variable will be used uninitialized, but the fastest compilation will assume that the loop runs at least once and read the value from whatever register/memory location the loop writes into.
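A hedged sketch of that common case (the names are made up for illustration):

```c
/* If n == 0 the loop never runs and `last` is read uninitialized (UB). */
int last_element(const int *vals, int n) {
    int last;                  /* set inside the loop, used outside it */
    for (int i = 0; i < n; i++)
        last = vals[i];
    /* The cheapest compilation just returns whatever register the loop
       body wrote into, effectively assuming the loop ran at least once,
       rather than materializing a separate "uninitialized" value. */
    return last;
}
```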
This poison thing is interesting because it seems like one poison value could have the capability of poisoning all the data in your program. You could imagine that a faithful implementation of the Rust abstract machine, after hitting one such code path, would cause your program to start generating complete nonsense.
I appreciate the attempt to explain low-level stuff, but I think this is a high-level-language programmer trying to work through an issue herself/himself without a clear idea of what he/she is talking about.
>> The answer is that every byte in memory cannot just have a value in 0..256,
0..256?? Are these still 8-bit bytes??
>> it can also be “uninitialized”. Memory remembers if you initialized it.
This is plainly wrong.
>> So, one time we “look” at x it can be at least 150, and then when we look at it again it is less than 120, even though x did not change. x was just uninitialized all the time.
You might be dealing with a bug on a non-volatile variable. It has nothing to do with allocated but uninitialized memory.
The problem here (and I caught it too and sort of recoiled at it) is that they were reasoning, and choosing words, as if the "abstract machine" were the actual thing we should think about when reasoning about a programming language, before they were explicit that that's what they were doing. They saved the explicit thesis until the end of the piece; they should have traded the desire to be witty and bury the lede for just leading with it upfront.
Of course you're right in part that their idea of "memory" is an abstraction, but it isn't too wrong. A "variable" in C or any compiled language on a modern machine is an abstraction that could refer to a register one moment, be a place in the cache in another, be a place in memory in the next, and be on a swapfile after. The "variables" are abstractions which lie in "memory" which is another abstraction because it need not be in one place.
Good point, I should have at least mentioned that there is an "abstract machine" when I introduce this strange kind of memory. Thanks for the feedback!
> I don't think this is really right. He claims that uninitialised memory is not just random bytes, but it is!
No it's not. To describe the behavior of a program involving uninitialized memory (like the example in my post), at no point in time do you need to talk about arbitrarily chosen bytes. The "abstract machine" on which a Rust program runs (of which your hardware is a fast implementation, but only accurate for UB-free programs) does not "pick random bytes" when you allocate new memory, it just fills it all with `None`.
You should not think in terms of optimizations when thinking about what your program does. The optimizations the compiler performs can change from version to version and are affected by seemingly random changes at the other end of your program.
> It would be perfectly possible to make a language (or even a C++ compiler) that didn't perform that optimisation.
Sure. That would be a different language though, with a different abstract machine. C/C++/Rust behave the way I described (and that behavior is not defined by what any particular compiler does).
Undefined behavior is not a compiler bug: it's a necessity, required to have language constructs that give you a way of doing things for which there is no good way to define the behavior. Your compiler optimizes your code every day by concluding that you're not doing anything illegal: it would have a rather miserable time if it couldn't make these assumptions.
The real issue is that C and C++ are horrible languages. They're too high-level to correspond to any real machine, yet too low-level for such an abstract machine to be very useful. The C language leaves the precise lengths of types up for grabs, as merely one example. As for Rust, I'd figure it's poor as well, considering it follows in the footsteps of C++.
I can compile an Ada program that has an uninitialized variable and use it, but I get a warning; there's also a Valid attribute that acts as a predicate for whether a scalar value has an acceptable value or not.
To @userbinator: you're mistaken to believe that C has much design behind it. There are many things where one requires a certain range of values and C forces the programmer to use a type that's much larger than necessary and cope with unwanted values. The C language leaves details to the underlying machine, so long as that machine is determined to pretend it's a PDP-11. Most languages that have a standard expect the programmer to follow it; since most C programmers don't know what the standard says, having been lied to about it being a simple and basic language, they're offended when they do something they never should've done; they shouldn't be using C anyway, however.
Abstract language details are necessary for a high-level language and can work quite well if the language is designed well; this then leaves high-level features to be implemented in whichever way is best for the machine; the C language doesn't do this well at all, however, and precisely specifies the nature of irrelevant details and so hinders the machine and implementation possibilities.
The C language doesn't even have true boolean values or arrays thereof. You're expected to use an entire integer that's zero or not and you're left to your own devices if you want an array of these values that isn't grotesque in its wastefulness. Meanwhile, most proper languages have the concept of types that only have two values and can easily use an underlying machine representation for efficiently representing these, without involving the programmer.
In closing, you may argue that C is necessary because it permits specifying these low-level details, albeit required in every case instead of only where necessary. To that, I direct you to look at Ada, which permits the programmer to ignore such details wherever unneeded, and so leave them to the compiler’s discretion, but allows size, address, representation, bit-level organization, and more to be specified in those cases where it's truly necessary.
Here's a link others may like for learning more about Ada and the deficiencies of C:
- Safe code cannot access uninitialized memory under any circumstances (unless unsafe code accidentally vends it to safe code).
- The simple case you mentioned, of using a variable without initializing it, is always a hard error. This applies in both safe and unsafe code.
- ...However, unsafe code can explicitly ask for uninitialized memory, like the code in the blog post does. It's not really useful to ask for an uninitialized integer, but you may want to allocate a large struct on the stack and not initialize it.
- Unsafe code can also obtain uninitialized memory in other ways, such as by calling malloc, which allocates memory that starts in an uninitialized state. (The alternative is to zero the memory after allocating it, but that's slower.)
> Meanwhile, most proper languages have the concept of types that only have two values and can easily use an underlying machine representation for efficiently representing these, without involving the programmer.
Don't other languages still use an entire byte to represent a bool though, since memory access is at the byte level? Having a bool type in the type system is really a language usability concern, I don't think it's at all a performance optimization. And stdbool.h exists now, so that concern has been addressed. When you want a bitmap, you can just use an int of the appropriate length and do bitwise operations on it, instead of wasting space with an array of ints.
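For instance, the hand-rolled bitmap being described looks roughly like this (the helper names are made up for illustration):

```c
#include <stdint.h>

/* One bit per flag, 32 flags per word. */
static inline void bitmap_set(uint32_t *words, unsigned i) {
    words[i / 32] |= UINT32_C(1) << (i % 32);
}

static inline int bitmap_test(const uint32_t *words, unsigned i) {
    return (words[i / 32] >> (i % 32)) & 1u;
}
```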
>Don't other languages still use an entire byte to represent a bool though, since memory access is at the byte level?
While at the discretion of the implementation, Common Lisp is a language that can easily and transparently perform this optimization. Common Lisp even has a specialized array type, BIT-VECTOR, which can only hold values of zero or one, which is more likely to be optimized for size than other types. Ada allows the programmer to specify data structures be optimized for size, which is nice.
Now, representing a lone true or false value is a different matter and I'd expect it to consume an entire register or whatnot under most anything, since you probably wouldn't be able to store anything else with the remaining space.
>Having a bool type in the type system is really a language usability concern, I don't think it's at all a performance optimization.
Ada has a boolean type because there are clearly boolean situations, such as predicates, and having a dedicated type reduces use errors. Programmers are encouraged to define their own boolean types, though, such as (On, Off), say.
>And when you want a bitmap, you can just use an int of the appropriate length and do bitwise operations on it.
That's what I was describing. Why should a high-level language have you making your own arrays? Don't you agree that programs would benefit from a specialized type for this that can more easily be optimized and specialized for the particular machine and whatnot?
AFAIK in Ada, deallocating memory is unsafe. So I'd say it has some catching-up to do when compared with safe Rust in that regard. And Rust of course has a two-element type, it is called `bool`.
That said, Ada certainly got many things right. It was an important milestone. But even Ada has "unchecked" operations (such as deallocation), which is exactly what unsafe Rust is, and then you have all the same problems about undefined behavior and having to describe an abstract machine to specify what exactly is (not) undefined behavior and so on.
This is just terrible. I'm really sad that it's 2019, and not only are we still talking about undefined behaviour, but there are also blog posts arguing for undefined behaviour! I expect a good (programmer-friendly) compiler to at least warn the programmer in any case of provable, or potential, undefined behaviour, or ideally, refuse to compile/use implementation-defined behaviour (i.e. exactly "what the hardware does"). Anything else is basically just inviting security bugs in your code.
But, for a counter-point: what is an example of a code/algorithm that not only uses undefined behaviour (i.e. relies on it in order to compile to fast, optimized code), but also couldn't possibly be rewritten to eliminate undefined behaviour (while keeping the same speed)?
1. Safe Rust has no undefined behavior, by design.
2. The poster’s job is to work on defining unsafe Rust, where UB is still a thing. It has to be, to some degree, as that’s the entire point.
3. Miri, referenced in the post, is an interpreter for the Rust abstract machine (or will be, once we're done defining it) and gives warnings for many kinds of UB already. The hope is that it will be able to do so for all of it in the future. Doing so means defining what "all of it" means, and that's still in progress.
> But, for a counter-point: what is an example of a code/algorithm that not only uses undefined behaviour (i.e. relies on it in order to compile to fast, optimized code), but also couldn't possibly be rewritten to eliminate undefined behaviour (while keeping the same speed)?
For any code that handles signed integers, the compiler is going to assume that overflow/underflow does not happen.
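A minimal, hedged illustration of the kind of assumption meant here:

```c
/* Because signed overflow is undefined, the compiler may fold this whole
   function to `return 1;`. The case x == INT_MAX, where x + 1 would
   overflow, is UB and therefore need not be considered. */
int plus_one_is_bigger(int x) {
    return x + 1 > x;
}
```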
Spent the week trying to figure out how to reimplement __builtin_add_overflow (et al) on Windows and boy is it a chore. The previous implementer had literally just used operator+ in a function called "safe_add" and I was dumbstruck.
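For comparison, a hedged sketch of what such a helper has to look like without the builtin (the name `safe_add_int` is mine; it checks before adding instead of relying on the wrapped result of operator+, which is UB for signed ints):

```c
#include <limits.h>
#include <stdbool.h>

/* Returns false (and leaves *out untouched) if a + b would overflow. */
static bool safe_add_int(int a, int b, int *out) {
    if ((b > 0 && a > INT_MAX - b) ||
        (b < 0 && a < INT_MIN - b)) {
        return false;
    }
    *out = a + b;
    return true;
}
```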
> but also couldn't possibly be rewritten to eliminate undefined behaviour (while keeping the same speed)?
Undefined behavior is, for the most part, meant to enable compiler optimizations. If you're willing to do all your optimizations by hand rather than relying on the compiler to do them – in other words, use C as the "portable assembler" it was originally conceived to be – then you don't really need it. (At least, not to the extent it exists in C.) And for small, tight loops, that's a perfectly reasonable proposition. For large programs, on the other hand, especially if you want to pile on a lot of abstraction and rely on the compiler to turn it into nice code (see C++)... not so much.
You mentioned it briefly with "implementation-defined behavior", but tons of C/C++ code relies on technically undefined behavior that behaves predictably on standard compiler/platform combinations.
> what is an example of a code/algorithm that not only uses undefined behaviour (i.e. relies on it in order to compile to fast, optimized code), but also couldn't possibly be rewritten to eliminate undefined behaviour (while keeping the same speed)?
"couldn't possibly be rewritten" isn't always the issue; sometimes you need to improve the toolchain and language to provide a supported non-undefined solution, for instance.
"NOTE: Possible undefined behavior ranges from [...] to behaving during translation or program execution in a documented manner characteristic of the environment"
The whole point of C as K&R and ISO is to let you do "what the hardware does". They left parts of the standard purposefully undefined, so it could remain applicable to a wide variety of implementations. The intention was definitely NOT "screw the programmer" as a lot of the compiler writers seem to have interpreted it, but to allow them to do something that makes sense for the environment.
Now we have, mostly academics from what I've noticed, that are taking the language farther and farther away from reality; focusing only on something uselessly abstract, completely ignoring the practical consequences of what they're doing. Compilers are becoming increasingly hostile to programmers. A programming language that had humble and very practical uses, with straightforward and easily understood behaviour, has been perverted into theoretical quagmire of uselessness.
Something very very odd is going on, and I don't like it one bit. It's very WTF-inducing.
(I don't know much about Rust, hence why I didn't say anything about it. But I've been using C since the late 80s.)