If you enjoy the science of injecting slowness to determine which component has the largest impact on performance, you would enjoy this work by Emery Berger.
With all the hardware "security" issues discovered in the last few years, CPU designers should provide a way to turn off many hardware features and end up with a basic, brutally simple in-order CPU.
Performance would be destroyed in exchange for somewhat more confidence in their "security".
A 1000x performance loss is what you'd get from turning off the CPU's entire cache hierarchy, not from disabling out-of-order execution. Executing instructions in order wouldn't make every instruction a cache miss.
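To put rough shape on that argument, here is a back-of-the-envelope sketch using the textbook stall model (CPI = base CPI + memory stalls per instruction). Every number in it is an illustrative assumption of mine, not a measurement of any real core, and the model ignores the way out-of-order execution overlaps cache misses, so it understates the real in-order penalty. It does not try to confirm the 1000x figure; it only shows why the two slowdowns live in different ballparks.

```python
def cycles_per_instruction(base_cpi, mem_refs_per_instr, miss_rate, miss_penalty):
    """Textbook stall model: base CPI plus memory stall cycles per instruction."""
    return base_cpi + mem_refs_per_instr * miss_rate * miss_penalty


# All values below are assumptions chosen for illustration only.
MEM_REFS_PER_INSTR = 1.3  # one instruction fetch plus ~0.3 data accesses
MISS_PENALTY = 300        # cycles to reach DRAM
CACHED_MISS_RATE = 0.01   # fraction of references that reach DRAM with caches on
OOO_BASE_CPI = 0.25       # wide out-of-order core with no memory stalls
IN_ORDER_BASE_CPI = 1.0   # simple in-order core with no memory stalls

ooo_cached = cycles_per_instruction(
    OOO_BASE_CPI, MEM_REFS_PER_INSTR, CACHED_MISS_RATE, MISS_PENALTY)
in_order_cached = cycles_per_instruction(
    IN_ORDER_BASE_CPI, MEM_REFS_PER_INSTR, CACHED_MISS_RATE, MISS_PENALTY)
# Caches off: every reference pays the full DRAM latency (miss rate = 1.0).
ooo_uncached = cycles_per_instruction(
    OOO_BASE_CPI, MEM_REFS_PER_INSTR, 1.0, MISS_PENALTY)

in_order_slowdown = in_order_cached / ooo_cached
no_cache_slowdown = ooo_uncached / ooo_cached

print(f"in-order, caches on : {in_order_slowdown:.2f}x slower")
print(f"out-of-order, no cache: {no_cache_slowdown:.0f}x slower")
```

Under these assumptions, going in-order costs a small constant factor, while turning the caches off costs a factor that scales with DRAM latency. Change the constants and the exact ratios move, but the gap between the two stays large.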
>if we step back a few months to Hot Chips 2024, AMD, Intel, and Qualcomm all gave presentations on high performance cores there. All three were eight-wide, meaning their pipelines could handle up to eight micro-ops per cycle in a sustained fashion.
>Zen 5 is the only core out of the three that couldn’t give eight decode slots to a single thread.
If you add Apple and ARM, that makes it the only core out of the five. I wonder whether Zen 6 will be something different. Right now Intel is iterating like crazy, and Zen 6 is still quite far off.
It will be interesting to see ARM Cortex X5 / X730 with MediaTek Dimensity 9500 on N3 vs Qualcomm Oryon 2 on N3, and also Apple's A19 / M5 on N3, all in 2025.
Coz: Finding Code that Counts with Causal Profiling https://arxiv.org/abs/1608.03676
"Performance (Really) Matters" with Emery Berger https://www.youtube.com/watch?v=7g1Acy5eGbE
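For anyone who wants the "inject slowness and watch what happens" idea in miniature, here is a toy sketch. It is not how Coz itself works (Coz simulates a speedup of one piece of code by pausing the other threads, which the paper calls a virtual speedup). The program shape, the stage names `load`, `parse` and `render`, and all durations are hypothetical; the point is only that slowing a component that is off the critical path changes nothing end to end.

```python
def end_to_end_time(durations):
    """Hypothetical program: 'load' and 'parse' run in parallel, then 'render' runs."""
    return max(durations["load"], durations["parse"]) + durations["render"]


def slowdown_impact(durations, injected_delay):
    """Add a delay to each component in turn and report the end-to-end change."""
    baseline = end_to_end_time(durations)
    impact = {}
    for name in durations:
        perturbed = dict(durations)          # copy so each experiment is independent
        perturbed[name] += injected_delay    # inject slowness into one component
        impact[name] = end_to_end_time(perturbed) - baseline
    return impact


# Made-up stage durations in seconds.
durations = {"load": 5.0, "parse": 2.0, "render": 1.0}
impact = slowdown_impact(durations, injected_delay=1.0)

for name, delta in impact.items():
    print(f"{name}: +{delta:.1f}s end to end")
```

Here `parse` is hidden behind the longer `load`, so delaying it costs nothing, while delaying `load` or `render` shows up one-for-one in the total. A conventional profiler would still report `parse` as using a good share of the time, which is the gap causal profiling is meant to close.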