TypeScript and Java are pretty much the GOATs of LLM codegen, because they have strong static type systems and a metric fuck ton of training code. The main problem with Java is that a lot of the training code is Java 6-8, which really poisons the well, so honestly I'd give the crown to TypeScript for best LLM codegen language.
Python is good because of the sheer volume of training data, but without an enforced static type system (type hints are optional, and a lot of code has none) you can't reliably automate a codegen -> typecheck -> codegen cycle. Instead you have to get the LLM to produce tests and run those, which is mostly fine but not as efficient.
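For anyone who hasn't built one, here's roughly what I mean by the codegen -> typecheck -> codegen cycle. This is a minimal sketch, not anyone's actual tooling: `repairLoop`, `Generator`, `TypeChecker`, and `Diagnostic` are names I made up, and the model call and the type checker are passed in as plain functions so the loop itself stays self-contained.

```typescript
// Hypothetical shapes: none of these names come from a real library.
type Diagnostic = { line: number; message: string };

// Stand-in for an LLM call: takes the task prompt plus the type errors
// from the previous attempt and returns a new candidate source file.
type Generator = (prompt: string, feedback: Diagnostic[]) => Promise<string>;

// Stand-in for a type checker: returns an empty list when the code is clean.
type TypeChecker = (source: string) => Promise<Diagnostic[]>;

interface LoopResult {
  source: string;   // last candidate produced
  attempts: number; // how many generations were run
  ok: boolean;      // true if the last candidate type-checked cleanly
}

async function repairLoop(
  prompt: string,
  generate: Generator,
  typecheck: TypeChecker,
  maxAttempts = 3,
): Promise<LoopResult> {
  let feedback: Diagnostic[] = [];
  let source = "";

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    // Generate, feeding back whatever the checker complained about last time.
    source = await generate(prompt, feedback);
    feedback = await typecheck(source);

    if (feedback.length === 0) {
      return { source, attempts: attempt, ok: true };
    }
  }

  // Out of attempts: hand back the last candidate, flagged as failing.
  return { source, attempts: maxAttempts, ok: false };
}
```

In practice `typecheck` would probably shell out to something like `tsc --noEmit` and parse the errors, and `generate` would wrap whatever model API you use. The point is that no human and no test suite is in the loop: the checker's diagnostics are the feedback signal. With untyped Python you'd have to swap the checker for "generate tests, run them", which is the slower path.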