From what you've written, I don't see why any of this would require the LLM to "be trained to the point where a subset of the graph represents all the nand gates necessary for a cpu and ram" - you'd just be emulating a CPU, but slower.
Tool usage is better, because the LLM can access the relevant computation/simulation at the highest fidelity, running as fast as it does on a real or virtual computer, rather than having it emulated poorly in a giant pyramid of matrix multiplications.
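To make that concrete, here's a rough sketch of what I mean by tool usage: the model emits a structured call, and the host runs the actual computation in native code and feeds the result back. The tool name and the JSON call format below are made up for illustration, not any particular framework's API.

```python
# Illustrative only: a model-emitted tool call dispatched to real code,
# instead of the model trying to "compute" the answer inside its weights.
import json

def run_simulation(steps: int, dt: float) -> float:
    # Stand-in for "the relevant computing/simulation": ordinary native code,
    # here a crude damped-free oscillator integration.
    x, v = 1.0, 0.0
    for _ in range(steps):
        v -= x * dt
        x += v * dt
    return x

TOOLS = {"run_simulation": run_simulation}

def handle_tool_call(message: str) -> str:
    """Parse a (hypothetical) JSON tool call and run it at full speed/precision."""
    call = json.loads(message)
    result = TOOLS[call["name"]](**call["arguments"])
    return json.dumps({"result": result})

# What the model would output instead of emulating the arithmetic itself:
print(handle_tool_call('{"name": "run_simulation", "arguments": {"steps": 1000, "dt": 0.01}}'))
```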
Am I missing the point?