Does anyone know how this “user decides how much compute” is implemented archite...

zora_goron 4 months ago | parent | context | favorite | on: Claude 3.7 Sonnet and Claude Code

Does anyone know how this “user decides how much compute” is implemented architecturally? I assume it’s the same underlying model, so what factor pushes the model to <think> for longer or shorter? Just a prompt-time modification or something else?