chaeronanaut's comments

chaeronanaut · 2025-04-03T18:43:22

> The words that are coming out of the model are generated to optimize for RLHF and closeness to the training data, that's it!

This is false: reasoning models are rewarded or punished based on their performance on verifiable tasks, not on human feedback or next-token prediction.

Xelynega · 2025-04-03T18:46:44

How does that differ from a non-reasoning model rewarded/punished based on performance at verifiable tasks?

What does CoT add that enables the reward/punishment?

Jensson · 2025-04-03T21:50:41

Without CoT, training them to give specific answers reduces performance. With CoT, you can punish them for not giving the exact answer you want without hurting performance, since the reasoning tokens help the model figure out how to answer the question and what the answer should be.

And you really want to train on specific answers, because then it is easy to tell whether the AI was right or wrong. So for now, hidden CoT is the only working way to train them for accuracy.
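
To make "easy to tell if the AI was right or wrong" concrete, here is a minimal sketch of what such a check could look like. Everything in it is an assumption for illustration: the `verifiable_reward` function, the `Answer:` marker, and the 1.0/0.0 scoring are hypothetical and not taken from any real training setup. The point is only that the final answer is graded by exact match, while the reasoning text before it is never scored.

```rust
/// Hypothetical reward for a verifiable task: 1.0 for an exact final answer,
/// 0.0 otherwise. The reasoning before the marker is ignored by the grader.
fn verifiable_reward(model_output: &str, expected: &str) -> f64 {
    // "Answer:" is an assumed delimiter between the reasoning and the answer.
    let answer = match model_output.rsplit_once("Answer:") {
        Some((_reasoning, answer)) => answer.trim(),
        // No final answer at all counts as wrong.
        None => return 0.0,
    };
    if answer == expected.trim() {
        1.0
    } else {
        0.0
    }
}

fn main() {
    let right = "17 * 3 = 51, then 51 + 4 = 55. Answer: 55";
    let wrong = "17 * 3 = 41, then 41 + 4 = 45. Answer: 45";
    println!("right: {}", verifiable_reward(right, "55"));
    println!("wrong: {}", verifiable_reward(wrong, "55"));
}
```

Note that the check never looks at whether the intermediate steps are sensible, only at the last answer, which is what makes it cheap to automate.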

chaeronanaut · on Feb 9, 2024

BT2 is old news, we have BT4 now

chaeronanaut · on June 22, 2023

An excellent explanation of Magic Bitboards can be found here: https://analog-hors.github.io/site/magic-bitboards/

chaeronanaut · on Aug 9, 2022

this pretty much summarises my opinion - one nitpick - i assume you meant "omit bounds and other checks", not "emit bounds and other checks" which seems to mean the opposite of what you're intending

zozbot234 · on Aug 9, 2022

Rust does emit bounds and other checks, though. Optimization passes can usually clear some of them away, but you'd need to check the assembly output to be sure.

kitd · on Aug 9, 2022

Yes, that's "omit".

"Emit" means "to send out", e.g. "emit a strange noise", "emit radiation".

Shish2k · on Aug 9, 2022

It is specifically both:

- when you access an arbitrary element in a slice, the compiler will emit a bounds check (`if index >= len: panic()`) to avoid an uncontrolled out-of-bounds memory access: https://godbolt.org/z/cbY5ebzvK (note how, if you comment out the assert, the code barely changes, because the compiler is adding an invisible assert of its own)

- if the compiler can infer that `index` will always be less than `len`, then it will omit the bounds check — https://godbolt.org/z/TTashYnjd
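
For anyone who doesn't want to click through, here is a rough sketch of the two cases in plain Rust. It is not the code behind the godbolt links; the function names `pick`, `sum_by_index`, and `get_or_zero` are made up for illustration. Whether a particular check is actually removed depends on the optimizer, so checking the assembly is still the only way to be sure.

```rust
/// Arbitrary index: the compiler has to emit a bounds check here, and an
/// out-of-range index panics instead of reading memory past the slice.
pub fn pick(values: &[u32], index: usize) -> u32 {
    values[index]
}

/// Every index comes from `0..values.len()`, so the optimizer can usually
/// prove it is in range and omit the check. (An iterator would be more
/// idiomatic; the indexed loop is kept to show the indexing itself.)
pub fn sum_by_index(values: &[u32]) -> u32 {
    let mut total = 0;
    for i in 0..values.len() {
        total += values[i];
    }
    total
}

/// `get` makes the same check explicit and returns `None` instead of panicking.
pub fn get_or_zero(values: &[u32], index: usize) -> u32 {
    values.get(index).copied().unwrap_or(0)
}

fn main() {
    let values = [1, 2, 3];
    println!("pick(1) = {}", pick(&values, 1));
    println!("sum = {}", sum_by_index(&values));
    println!("get_or_zero(10) = {}", get_or_zero(&values, 10));
}
```

Calling `pick(&values, 3)` on the three-element array above would panic at runtime, which is the emitted check doing its job.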

woodruffw · on Aug 9, 2022

Yes, thank you! That's an embarrassing typo!

(And thanks to the other person as well, who presumably deleted the same comment after seeing yours.)