
Also seems very impractical to embed this into a deployed product. How can you possibly hope to control and estimate costs? I guess this is strictly meant for R&D purposes.



You can specify the max length of the response, which presumably includes the hidden tokens (sketch below).

I don't see why this is qualitatively different from a cost perspective than using CoT prompting on existing models.
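
For illustration, a minimal sketch of capping spend per request, assuming the OpenAI Python SDK; reasoning tokens count toward max_completion_tokens, and the model name, prompt, and cap here are placeholders:

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    # Cap everything the model generates: visible output plus hidden
    # reasoning tokens both count toward max_completion_tokens, so the
    # cap is the only hard cost control you have per request.
    response = client.chat.completions.create(
        model="o1-preview",  # illustrative model name
        messages=[{"role": "user", "content": "Summarize this contract."}],
        max_completion_tokens=2000,
    )

    print(response.choices[0].message.content)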


For one, you don't get to see any output at all if you run out of tokens during thinking.

If you set a limit, once it's hit you just get a failed request with no introspection into where and why the CoT went off the rails.
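
A hedged sketch of that failure mode, assuming the SDK's finish_reason and usage.completion_tokens_details fields; the model name, prompt, and cap are placeholders. When the cap is exhausted mid-reasoning, the content comes back empty but the reasoning tokens are still billed:

    from openai import OpenAI

    client = OpenAI()

    response = client.chat.completions.create(
        model="o1-preview",  # illustrative model name
        messages=[{"role": "user", "content": "Prove this theorem."}],
        max_completion_tokens=500,  # deliberately tight cap
    )

    choice = response.choices[0]
    if choice.finish_reason == "length" and not choice.message.content:
        # The hidden reasoning tokens are billed as completion tokens
        # even though none of them are visible in the response.
        details = response.usage.completion_tokens_details
        print("No visible output; billed reasoning tokens:",
              details.reasoning_tokens)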


Why would I pay for zero output? That’s essentially throwing money down the drain.


You can’t verify that you’re paying what you should be if you can’t see the hidden tokens.


With conventional models you don't get the activations or the logits either, even though those would be useful.

Ultimately, if the model's output isn't worth what you end up paying for it, then fine, don't pay for it. I don't see why it really matters to you whether OpenAI is lying about token counts or not.


As a single user, it doesn't really matter, but as a SaaS operator I want tractable, ideally predictable, pricing (see the sketch below).

I wouldn’t just implicitly trust a vendor when they say “yeah we’re just going to charge you for what we feel like when we feel like. You can trust us.”
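
One way to make that tractable is to bound worst-case spend per request from the cap itself. A back-of-the-envelope sketch; the prices here are made-up placeholders, not OpenAI's actual rates:

    # Worst-case cost per request when hidden tokens count toward the cap.
    PROMPT_PRICE_PER_TOKEN = 15 / 1_000_000      # assumed $/input token
    COMPLETION_PRICE_PER_TOKEN = 60 / 1_000_000  # assumed $/output token

    def worst_case_cost(prompt_tokens: int, max_completion_tokens: int) -> float:
        """Upper bound on spend: every completion token, visible or
        hidden, is billed, so the cap is the hard limit you control."""
        return (prompt_tokens * PROMPT_PRICE_PER_TOKEN
                + max_completion_tokens * COMPLETION_PRICE_PER_TOKEN)

    # e.g. a 1,000-token prompt with a 4,000-token completion cap:
    print(f"${worst_case_cost(1_000, 4_000):.4f} max per request")  # $0.2550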


They are currently trying to raise money (there's talk of a new $150B valuation), so that may have something to do with it.





