"I made an extra test that wraps matrix4x4 in std::unique_ptr."
And that is your test? One piece of code? Not even presenting disassembly? Who the hell knows what your compiler wrought in response to your "std::unique_ptr" and "matrix4x4"?
Sorry for my ignorance – a linked list matching the performance of a vector is something new to me. I would like to learn more. The best way to prove your point is to show us a benchmark. However, I couldn't find one with a quick Google search.
Well, I’ve provided 1 benchmark to your 0. I’d say I’m ahead.
My benchmark code is actually far more prefetch-friendly than a linked list because it can prefetch more than one “node” at a time. In a linked list you can’t prefetch N+1 and N+2 at the same time because the address of N+2 is only known once N+1 has been loaded.
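To make the dependency argument concrete, here is a minimal sketch of the two traversal shapes. It is not the benchmark code itself: `Matrix4x4`, `Node`, `make_items`, `make_list`, `sum_pointers`, and `sum_list` are hypothetical names made up for illustration, and the sketch measures nothing. It only shows where the next address comes from in each case.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical payload standing in for the matrix4x4 mentioned above.
struct Matrix4x4 {
    float m[16];
};

// Hypothetical singly linked list node.
struct Node {
    Matrix4x4 value;
    Node* next;
};

// Linked-list shape: the address of node N+2 is stored inside node N+1,
// so the load of N+1 must finish before the load of N+2 can even be issued.
float sum_list(const Node* head) {
    float total = 0.0f;
    for (const Node* n = head; n != nullptr; n = n->next) {
        total += n->value.m[0];
    }
    return total;
}

// Vector-of-pointers shape: the pointers themselves are contiguous, so the
// addresses of elements i+1 and i+2 are known without touching element i.
float sum_pointers(const std::vector<std::unique_ptr<Matrix4x4>>& items) {
    float total = 0.0f;
    for (const auto& p : items) {
        total += p->m[0];
    }
    return total;
}

// Builds n heap-allocated matrices with m[0] = 1, 2, ..., n.
std::vector<std::unique_ptr<Matrix4x4>> make_items(std::size_t n) {
    std::vector<std::unique_ptr<Matrix4x4>> items;
    items.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        auto p = std::make_unique<Matrix4x4>();  // value-initialized to zeros
        p->m[0] = static_cast<float>(i + 1);
        items.push_back(std::move(p));
    }
    return items;
}

// Builds a linked list of n nodes with m[0] = 1, 2, ..., n.
// The nodes live in the returned vector purely to keep ownership simple;
// a real list would normally have its nodes scattered across the heap.
std::vector<Node> make_list(std::size_t n) {
    assert(n > 0 && "sketch assumes a non-empty list");
    std::vector<Node> nodes(n);  // value-initialized: zeros, next == nullptr
    for (std::size_t i = 0; i < n; ++i) {
        nodes[i].value.m[0] = static_cast<float>(i + 1);
        nodes[i].next = (i + 1 < n) ? &nodes[i + 1] : nullptr;
    }
    return nodes;
}
```

Both functions compute the same sum; the difference is only in the data dependency between consecutive loads. Whether that turns into a measurable gap depends on the allocator, the access order, and the hardware, so treat it as an illustration of the argument rather than a result.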
I’m always open to learning new things. Modern compiler magic is deep and dark. If sometimes a compiler magically makes it fast and sometimes slow that’d be quite interesting! If you have any concrete examples I’d love to read them.
https://www.forrestthewoods.com/blog/memory-bandwidth-napkin...