Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

A while back, I wrote a UTF-8 decoder in Common Lisp, targeting SBCL (it already has one built in, this was an exercise). Pretty much all of the optimization win (after the obvious low-hanging fruit) was structuring the code so that the compiler would generate cmov* instructions rather than branches.


What's some examples of the code changes that you made? And did you just do repeated disassemblies of the functions to see that it was using the correct instructions, or did you do some benchmarking to show your changes were actual improvements?


Gosh, I'd have to see if I can dig it up this was a few years ago.

I did all of the above, plus profiling (sb-sprof combined with disassemble will show assembly level profiling).


Branches are prone to be faster than conditional moves if they are correctly predicted, because they do not increase the critical path length. And utf-8 decoders are commonly run on all-ascii input. What were you benchmarking on?


I ran separate benchmarks on all-ASCII, BMP-only, and ascii with non-BMP. ASCII was not slower on the low-branch version.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: