
This bothered me as well. The only thing I can think of is that one of the calls to libc functions is consuming rax off the stack. The code as displayed by Godbolt is the code as generated by clang, so it's not an error there.

EDIT: I've just noticed this line: leaq -24(%rbp), %rsp

It does a manual pop of rax by restoring rsp to what it was before the prologue.




What is going on is this.

A function activation record contains various parts.

* http://jdebp.uk./FGA/function-perilogues.html

The PUSH and POP instructions for R15, R14, and RBX in the perilogue are manipulating the save area. The space below the save area is the locals area, and there are local variables in the function.

The inner save and restore of the stack pointer in R15 is adding further, block-scoped, locals to the locals area for the duration of that block, which is why it hasn't been combined with the outer perilogue (one of the things asked about in a footnote). If this function were more complex, the restoration of the original stack pointer from R15 would be well before the function epilogue, demonstrating the disconnect more clearly.

The actual epilogue begins with moving the stack pointer back up to the bottom of the save area, from wherever it happens to have been after the locals area was allocated, so that the save area can then be popped. In the absence of alloca() and variable-length arrays, this could be done by ADDing to the stack pointer register, as the size of the locals area is known by the compiler and fixed at compile time.

alloca() and variable-length arrays make things more complex, though, and this function contains at least one of those. Getting the amount to ADD right would involve retaining the sizes of relevant alloca()s and variable-length arrays. (Also, compilers prefer LEA to ADD for the stack pointer, but for the sake of simplicity let's just treat this as ADD.)

Fortunately, the address of the bottom of the save area is known, and working out how much to ADD is thus unnecessary and the problem is moot. The bottom of the save area is by its nature at a fixed offset from the frame pointer, with a size known from which registers had to be saved, so the generated code just resets the stack pointer to that offset relative to the frame pointer, effectively popping off the entire locals area that way.
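To make the shape of the problem concrete, here is a minimal sketch of a C function with a block-scoped variable-length array. It is not the function from the article: the name count_nonzero_copy and its parameters are made up for illustration. Whether a given compiler saves the stack pointer in R15 or resets it relative to the frame pointer for this exact code depends on the compiler and optimization level, so treat it as something to paste into a compiler explorer rather than a guaranteed reproduction.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical example, not the function discussed in the article. */
size_t count_nonzero_copy(const unsigned char *src, size_t n)
{
    size_t count = 0;   /* ordinary local: size fixed at compile time */

    assert(src != NULL || n == 0);

    if (n > 0) {
        /* Block-scoped VLA: its size is only known at run time, so the
           compiler cannot fold it into one fixed-size locals area. */
        unsigned char tmp[n];

        memcpy(tmp, src, n);
        for (size_t i = 0; i < n; i++) {
            if (tmp[i] != 0)
                count++;
        }
    }   /* the VLA's storage can be released here, before the epilogue */

    return count;
}
```

Because n is unknown at compile time, there is no constant the epilogue could ADD to undo the allocation, which is where resetting the stack pointer to a fixed offset from the frame pointer comes in.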

There's no reason to POP what was pushed from RAX back into the register, because that part of the stack is not part of the save area. It's a dummy local variable that ensures that the stack remains 16-byte aligned once all four of return address, saved frame pointers area, save area, and locals area are in place. As another footnote notes, PUSH RAX is just a quick way of subtracting from the stack pointer without using the ALU, which a SUB instruction would use. RAX isn't actually being saved.
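The alignment arithmetic can be checked with a small sketch. It assumes the System V x86-64 convention that the stack pointer is 16-byte aligned immediately before the CALL, and 8-byte stack slots; the helper names frame_overhead and is_aligned are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { SLOT = 8 };   /* bytes per pushed 64-bit register (assumption: x86-64) */

/* Bytes pushed below the caller's aligned stack pointer before the
   variable part of the locals area starts. */
size_t frame_overhead(size_t saved_regs, bool dummy_push)
{
    size_t bytes = SLOT;            /* return address, pushed by CALL */
    bytes += SLOT;                  /* saved frame pointer (PUSH RBP) */
    bytes += saved_regs * SLOT;     /* save area, e.g. R15, R14, RBX */
    if (dummy_push)
        bytes += SLOT;              /* PUSH RAX used purely as padding */
    return bytes;
}

/* True if bytes is a multiple of alignment (which must be a power of two). */
bool is_aligned(size_t bytes, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (bytes & (alignment - 1)) == 0;
}
```

With three saved registers, frame_overhead(3, false) is 40, which is not a multiple of 16, while frame_overhead(3, true) is 48, which is. That also matches the -24(%rbp) in the LEA: the three 8-byte saved registers sit directly below the frame pointer, and the slot pushed from RAX lies below them, outside the save area.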


Thank you.



