> You pay a lot upfront for the hardware, but if your usage of the GPU is heavy, then you save a lot of money in the long run.
Last time I saw data on this, it wasn't true. In a like-for-like comparison (same model and quant), the API is cheaper than the electricity alone, so you never make back the hardware cost. That was a year ago, and API costs have plummeted since, so I'd imagine it's even more lopsided now.
Datacenters have cheaper electricity, more efficient cards, and can do batched inference at scale. And that's before you count the huge free allowances from Google etc.
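To make the break-even argument concrete, here's a rough sketch in Python. Every number in it is a made-up placeholder (hardware cost, power draw, electricity rate, throughput, API price) that you'd swap for your own figures; the point is only the shape of the calculation: if the API price per token is below your electricity cost per token, the hardware can never pay for itself.

```python
# Back-of-envelope break-even check: local rig vs API.
# Every number here is an illustrative assumption -- plug in your own
# hardware cost, power draw, electricity rate, throughput, and API price.

hardware_cost_usd = 2000.0          # assumed one-time cost of a GPU rig
rig_power_kw = 0.35                 # assumed average draw under inference load
electricity_usd_per_kwh = 0.30      # assumed residential rate
local_tokens_per_sec = 40.0         # assumed single-stream throughput at home

api_usd_per_million_tok = 0.20      # assumed API price for a comparable small model

# Electricity cost to generate one million tokens locally (hardware excluded).
seconds_per_million = 1_000_000 / local_tokens_per_sec
local_elec_usd_per_million = (seconds_per_million / 3600) * rig_power_kw * electricity_usd_per_kwh

print(f"local electricity per 1M tokens: ${local_elec_usd_per_million:.2f}")
print(f"API price per 1M tokens:         ${api_usd_per_million_tok:.2f}")

# If the API undercuts your electricity alone, the hardware cost can never
# be amortized, no matter how many tokens you generate.
savings_per_million = api_usd_per_million_tok - local_elec_usd_per_million
if savings_per_million <= 0:
    print("API is cheaper than electricity alone -> no break-even point exists")
else:
    breakeven_millions = hardware_cost_usd / savings_per_million
    print(f"break-even after ~{breakeven_millions:,.0f}M tokens")
```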
Is this also the case for token-heavy uses such as Claude Code? I'm not sure I'll end up using CC for development, but if I do lean on it, I wonder whether there would be a desire to essentially have it run 24/7. When run 24/7, could CC incur more in API fees than residential electricity would cost running on your own gear? I have no idea about the numbers, just wondering.
I doubt you're going to beat a datacenter under any conditions with any model that is even vaguely like for like.
The comparison I saw used a small Llama 8B model, i.e. something you can actually get usable numbers for both at home and via API, so something pretty commoditized.
> When run 24/7, could CC incur more in API fees than residential electricity would cost running on your own gear?
Claude is pretty damn expensive, so it's plausible you can undercut it with another model. That throws the like-for-like assumption out the door, though. A valid play in practice, but it kind of undermines the "buy your own rig to save money" argument.
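For the 24/7 agent question specifically, here's a quick sketch of the monthly numbers. Again, all figures are placeholder assumptions, and the comparison is deliberately not like-for-like: the local side is a smaller open model, which is exactly the trade-off being discussed.

```python
# Rough monthly cost of a 24/7 coding agent: hosted API vs running a
# (different, smaller) model on your own rig. Every number is an assumption.

hours_per_month = 24 * 30

# Hosted agent: assumed blended price and sustained token churn.
api_usd_per_million_tok = 10.0       # assumed blended input/output rate for a frontier model
agent_tokens_per_hour = 200_000      # assumed sustained token usage of the agent

api_monthly = hours_per_month * agent_tokens_per_hour / 1_000_000 * api_usd_per_million_tok

# Local rig running a smaller open model around the clock: electricity only.
rig_power_kw = 0.35                  # assumed average draw
electricity_usd_per_kwh = 0.30       # assumed residential rate
local_monthly = hours_per_month * rig_power_kw * electricity_usd_per_kwh

print(f"API agent at ~{agent_tokens_per_hour:,} tok/h: ${api_monthly:,.0f}/month")
print(f"Local rig electricity, 24/7:        ${local_monthly:,.0f}/month")
# Note: not like-for-like -- the local model is smaller and slower, so this
# only shows that undercutting an expensive API is possible, not equivalent.
```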
Owning your own AI gear is cool… but not because of the economics.