I think Intel's E-cores are quite a bit smaller than the Zen 4c/5c cores, so maybe at that scale it's prohibitive to even double up the register file? That's required even if the logic is double-pumped. AIUI the small Zen cores are mostly the same design as the big ones, just with less cache, silicon layout retuned for density rather than speed, and the removal of the 3D V-Cache stacking vias, while Intel's small cores are clean-sheet designs with next to nothing in common with their big cores, so they have the opportunity to shrink them a lot more.
Yes, while the big Intel cores are much bigger than the big AMD cores (e.g. 5 square mm in Meteor Lake vs. 3.8 square mm for Zen 4), the Intel small cores are much smaller than the AMD compact cores (e.g. 1.5 square mm in Meteor Lake vs. 2.5 square mm for Zen 4c).
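To put those figures side by side, here is a quick back-of-envelope sketch in Python. It uses only the approximate areas quoted above; the dictionary keys and the `ratio` helper are just names picked for illustration.

```python
# Approximate core areas in mm^2, exactly as quoted in the comment above.
areas = {
    "intel_p": 5.0,  # Meteor Lake big core
    "intel_e": 1.5,  # Meteor Lake small core
    "zen4": 3.8,     # AMD big core
    "zen4c": 2.5,    # AMD compact core
}


def ratio(a, b):
    """Return how many times larger core `a` is than core `b`."""
    return areas[a] / areas[b]


p_vs_zen4 = ratio("intel_p", "zen4")       # Intel big vs. AMD big
zen4c_vs_e = ratio("zen4c", "intel_e")     # AMD compact vs. Intel small
p_vs_e = ratio("intel_p", "intel_e")       # big/small gap inside Intel
zen4_vs_zen4c = ratio("zen4", "zen4c")     # big/compact gap inside AMD

print(f"Intel big / Zen 4:       {p_vs_zen4:.2f}x")
print(f"Zen 4c / Intel small:    {zen4c_vs_e:.2f}x")
print(f"Intel big / Intel small: {p_vs_e:.2f}x")
print(f"Zen 4 / Zen 4c:          {zen4_vs_zen4c:.2f}x")
```

By these numbers the big/small gap is a bit over 3x on the Intel side and about 1.5x on the AMD side, which fits the "clean-sheet vs. retuned layout" explanation.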
The smaller size of the Intel E-cores is not only due to their different microarchitecture, but also because only their L1 caches are private, while their L2 cache is shared within each group of 4 E-cores.
The shared L2 cache may not matter much for many general-purpose programs, but for multi-threaded programs that depend on high aggregate L2 throughput, the performance of each group of 4 E-cores becomes similar to that of a single core, instead of being 4 times greater.
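The shared-L2 argument can be sketched as a toy bandwidth model. This is only an illustration: the function names and every bytes-per-cycle figure below are made-up assumptions, not measured values for any real core.

```python
def cluster_l2_throughput(cores, per_core_demand, shared_l2_bw):
    """Aggregate L2 throughput (bytes/cycle) of a cluster sharing one L2.

    The cluster can never move more data than the shared L2 can supply.
    """
    return min(cores * per_core_demand, shared_l2_bw)


def scaling_vs_one_core(cores, per_core_demand, shared_l2_bw):
    """Speedup of `cores` cores over a single core under the same L2 limit."""
    single = cluster_l2_throughput(1, per_core_demand, shared_l2_bw)
    return cluster_l2_throughput(cores, per_core_demand, shared_l2_bw) / single


SHARED_L2_BW = 64  # hypothetical bytes/cycle the shared L2 can deliver

# Hypothetical light workload: each core needs only 8 bytes/cycle from L2.
light = scaling_vs_one_core(4, 8, SHARED_L2_BW)

# Hypothetical L2-bound workload: one core alone can saturate the shared L2.
heavy = scaling_vs_one_core(4, 64, SHARED_L2_BW)

print(f"light workload: {light:.1f}x over one core")
print(f"L2-bound workload: {heavy:.1f}x over one core")
```

With these invented numbers the light workload scales 4x across the cluster while the L2-bound one stays at 1x, which is the effect described above. With private L2 caches the limit would apply per core rather than per cluster.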
The AMD compact cores have the same non-shared cache memories as the big cores. Only the shared L3 cache blocks that service a group of compact cores are smaller than for the same number of big cores.
My non-expert brain immediately jumped to double-pumping + maybe working with their Thread Director to have tasks using a lot of AVX-512 instructions prefer P-cores more. It feels like such an obvious solution to a really dumb problem that I assumed there was something simple I was missing.
The register file size makes sense; I didn't think the register files took up that much of the die on those processors, but I guess they had to be pretty aggressive to meet power goals?
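For a rough sense of scale on the register file point, here is a small sketch. The architectural counts are the standard 64-bit ones (16 YMM registers of 256 bits for AVX2, 32 ZMM registers of 512 bits for AVX-512, mask registers ignored); the physical entry count is a purely hypothetical placeholder, since the real number depends on the core.

```python
def regfile_bytes(entries, width_bits):
    """Storage needed for `entries` vector registers of `width_bits` each."""
    return entries * width_bits // 8


# Architectural vector state visible to software in 64-bit mode.
avx2_arch = regfile_bytes(16, 256)    # ymm0..ymm15
avx512_arch = regfile_bytes(32, 512)  # zmm0..zmm31

# Physical (renamed) register file: entry count is a hypothetical value
# chosen only to show how storage scales with register width.
PHYS_ENTRIES = 128
phys_256 = regfile_bytes(PHYS_ENTRIES, 256)
phys_512 = regfile_bytes(PHYS_ENTRIES, 512)

print(f"architectural: {avx2_arch} B (AVX2) vs {avx512_arch} B (AVX-512)")
print(f"physical, {PHYS_ENTRIES} entries: {phys_256} B vs {phys_512} B")
```

Double-pumping lets the execution units stay 256 bits wide, but the storage for full 512-bit register values still has to exist somewhere, so the register file grows with the width regardless.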