Without CoT then training them to give specific answers reduces performance. With CoT you can punish them if they don't give the exact answer you want without hurting them, since the reasoning tokens help it figure out how to answer questions and what the answer should be.
And you really want to train on specific answers since then it is easy to tell if the AI was right or wrong, so for now hidden CoT is the only working way to train them for accuracy.
this pretty much summarises my opinion - one nitpick - i assume you meant "omit bounds and other checks", not "emit bounds and other checks" which seems to mean the opposite of what you're intending
Rust does emit bounds and other checks, though. Optimization passes can usually clear some of them away, but you'd need to check the assembly output to be sure.
- trying to access an arbitrary element in a slice, the compiler will emit bounds checks (`if index > len: panic()`) to avoid an uncontrolled out-of-bounds memory access — https://godbolt.org/z/cbY5ebzvK (note how if you comment out the assert, the code barely changes, because the compiler is adding an invisible assert of its own)
- if the compiler can infer that `index` will always be less than `len`, then it will omit the bounds check — https://godbolt.org/z/TTashYnjd
This is false, reasoning models are rewarded/punished based on performance at verifiable tasks, not human feedback or next-token prediction.