
https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/...

Suggests that it should be usable for even shorter copies. And that's really my point. We should have One True memcpy instruction sequence that we use everywhere and stop worrying. And yet...




Unfortunately, we can't change the past, and seemingly in the past it wasn't worth it to have a fast One True memcpy (and perhaps to a decent extent it still isn't). I'm still typing this on a Haswell CPU, which doesn't have FSRM (rep movsb of 16 bytes in a loop takes ~10ns = 36 cycles per iteration on average).
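For anyone who wants to try the same kind of measurement, here is a rough sketch of how a 16-byte rep movsb loop could be timed. It assumes GCC or Clang on x86-64 (it falls back to plain memcpy elsewhere so it still compiles), and the names copy_rep_movsb and ns_per_copy16 are made up for illustration. It is not the exact benchmark behind the number above, and results will vary a lot by CPU.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Copy n bytes with a bare `rep movsb` on x86-64 (GCC/Clang inline asm).
   On other targets, fall back to memcpy so the sketch still builds. */
static void copy_rep_movsb(void *dst, const void *src, size_t n)
{
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
    /* rdi = destination, rsi = source, rcx = byte count; all three are
       modified by the instruction, hence the "+" constraints. */
    __asm__ volatile("rep movsb"
                     : "+D"(dst), "+S"(src), "+c"(n)
                     :
                     : "memory");
#else
    memcpy(dst, src, n);
#endif
}

/* Hypothetical helper: average nanoseconds per 16-byte copy.
   iters must be at least 1. */
static double ns_per_copy16(long iters)
{
    unsigned char src[16], dst[16] = {0};
    struct timespec t0, t1;

    for (int i = 0; i < 16; i++)
        src[i] = (unsigned char)i;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (long i = 0; i < iters; i++)
        copy_rep_movsb(dst, src, sizeof src);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    /* Sanity check that the copy actually happened. */
    assert(memcmp(dst, src, sizeof src) == 0);

    double elapsed_ns = (double)(t1.tv_sec - t0.tv_sec) * 1e9
                      + (double)(t1.tv_nsec - t0.tv_nsec);
    return elapsed_ns / (double)iters;
}
```

Calling ns_per_copy16 with a large iteration count and multiplying by the core clock in GHz gives a cycles-per-iteration estimate. The loop overhead and the timer calls are included in the figure, so treat it as an upper bound.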

But yeah, it does seem that the 128 bytes I got from a quick search was wrong. (Though gcc and clang with `-march=alderlake` both never generate `rep movsb` at `-O3`; at `-Os` gcc starts emitting `rep movsb` for ≥65B, and clang still never does.)
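A minimal way to check this sort of thing yourself is to compile a couple of fixed-size copies and read the assembly. The struct and function names below are made up for illustration, and what you see depends on the compiler version, so don't expect it to match the above exactly.

```c
#include <assert.h>
#include <string.h>

/* Two fixed sizes on either side of the 65-byte mark mentioned above. */
struct blob64 { unsigned char b[64]; };
struct blob65 { unsigned char b[65]; };

static_assert(sizeof(struct blob64) == 64, "blob64 must be exactly 64 bytes");
static_assert(sizeof(struct blob65) == 65, "blob65 must be exactly 65 bytes");

/* The size is a compile-time constant, so the compiler is free to inline
   the copy however its cost model prefers. */
void copy64(struct blob64 *dst, const struct blob64 *src)
{
    memcpy(dst, src, sizeof *dst);
}

void copy65(struct blob65 *dst, const struct blob65 *src)
{
    memcpy(dst, src, sizeof *dst);
}
```

Building with something like `gcc -Os -march=alderlake -S` (or the same flags with clang, or `-O3` in place of `-Os`) and searching the output for `rep movsb` shows which strategy was picked for each size.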



