Well, as Wolfram has stated, we need to mix LLMs with some deterministic reasoning: real knowledge-based output that queries can be routed to for these purposes. The hard part is determining when to route.
We need to go back to using the AI tech stack as a tool for (broadly speaking) automated pattern recognition. For example, it can be great at image analysis and at spotting early cancerous changes in lungs, but it ought to stop at the pre-qualification stage and leave it up to the human to decide what the change is and what to do about it. What the AI crowd today want is for all the photos in the world to be used to train models to recognise cancerous changes, and then for everyone to blindly accept the following diagnosis: "It is a cancerous change of human lung tissue, therefore the patient is a labrador with chickenpox and we recommend immediate removal of the patient's left wing." When you tell them it's pure garbage, they come back with answers worthy of the slimiest orators in history and solutions that Rube Goldberg would not have thought of.