I think LLVM missed the boat, on this, by being an early mover. A lot of the optimizations are resource-only analyses; the few that re not are "just" various levels of interpretation. That kind of implies we need a framework to define resource utilization and evaluation at the instruction/machine-code level with a standard API. Having an optimizer for an abstract IR is less useful.
The point being that compilers would then target emitting reasonable machine code at speed, and The Real LLVM would do analysis/transform on the machine code.
Java is similar (but overall infrastructure around compiler makes it slow).
Golang also quite fast.