The minimum you can do is not allow the AI to perform actions on behalf of the user without informed consent.
That still doesn't prevent spam mail from convincing the LLM to suggest an attacker-controlled library, GitHub action, password manager, payment processor, etc. No links required.
The best you could do is not allow the LLM to ingest untrusted input.
I am not too familiar with the latest hype, but "reasoning" has a very straightforward definition in my mind: can the program in question derive new facts from old ones in a logically sound manner? Things like applying modus ponens: (A and (A => B)) => B. Or: all men are mortal, and Socrates is a man, therefore Socrates is mortal. If the program cannot deduce new facts, then it is not reasoning, at least not by my definition.
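To make "logically sound" concrete, here is a minimal Lean sketch of the two inferences above (purely illustrative, not something the models are asked to produce):

    -- modus ponens: from A and A → B, conclude B
    example (A B : Prop) (ha : A) (hab : A → B) : B := hab ha

    -- the syllogism: all men are mortal, Socrates is a man, therefore Socrates is mortal
    example (Person : Type) (Man Mortal : Person → Prop)
        (h : ∀ p, Man p → Mortal p) (socrates : Person) (hm : Man socrates) :
        Mortal socrates := h socrates hm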
People then say: "Of course it could do that, it just pattern matched a logic textbook. I meant a real example, not an artificially constructed one like this. In a complex scenario LLMs obviously can't do modus ponens."
I do not know whether the state of the art is able to reason or not. The textbook example you gave is admittedly not very interesting. What you are hearing from people is that parroting is not reasoning, which is true.
I wonder if the state of the art can reason its way through the following:
"Adam can count to 14000. Can Adam count to 13500?"
The response needs to be affirmative for every pair of numbers X1 and X2 (14000 and 13500 in the example) with X2 <= X1. That is reasoning. Anything else is not reasoning.
The response when X2 > X1 is less interesting. But, as a human, it might be "Maybe, if Adam has time", or "Likely, since counting up to any number uses the same algorithm", or "I don't know".
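For what it's worth, a minimal sketch of how such a benchmark could generate prompts and the expected verdicts; the prompt wording and the random sampling are just my illustration:

    /* Generate (X1, X2) pairs and the verdict a reasoning model must reach:
     * whenever X2 <= X1 the only acceptable answer is an affirmative one. */
    #include <stdio.h>
    #include <stdlib.h>

    int main(void) {
        for (int i = 0; i < 5; i++) {
            int x1 = rand() % 20000 + 1;
            int x2 = rand() % 20000 + 1;
            printf("Prompt: \"Adam can count to %d. Can Adam count to %d?\"\n", x1, x2);
            printf("Expected: %s\n\n", x2 <= x1 ? "affirmative"
                                                : "open-ended (\"maybe, given time\")");
        }
        return 0;
    }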
Seems ChatGPT can cope with this. Other examples are easy to come up with, too. There must be benchmarks for this.
Input to ChatGPT:
"Adam can lift 1000 pounds of steel. Can Adam lift 1000 pounds of feathers?"
Output from ChatGPT:
"1,000 pounds of feathers would be much easier for Adam to lift compared to 1,000 pounds of steel, because feathers are much lighter and less dense."
gemma3-27b, a small model, had an interesting take:
> This is a classic trick question!
> While Adam can lift 1000 pounds, no, he likely cannot lift 1000 pounds of feathers.
> Volume: Feathers take up a huge amount of space for their weight. 1000 pounds of feathers would be an enormous volume – likely far too large for Adam to even get under, let alone lift. He'd be trying to lift a massive, bulky cloud.
> Practicality: Even if he could somehow get it under a barbell, the feathers would shift and compress, making a secure grip impossible.
> The question plays on our understanding of weight versus volume. It's designed to make you focus on the "1000 pounds" and forget about the practicalities of lifting something so voluminous.
Tried the counting question on the smallest model, gemma-3n-34b, which can run on a smartphone:
> Yes, if Adam can count to 14000, he can definitely count to 13500. Counting to a smaller number is a basic arithmetic operation. 13500 is less than 14000.
Thanks for trying these out :). This highlights the often subtle difference between knowing the answer and deducing the answer. Feathers could be ground into a pulp and condensed, too. I am not trying to be clever; it just seems like the response is a canned answer.
I think it was more of a PoC. I would be more impressed if it were deployed in production. "we want to reiterate that these are highly experimental results". If the dividends are massive, would they not deploy it in production and tell the world about it?
It's reasonable to perceive most of the value in math and computer science as being "at the scale" where unpredictability arises from complexity, though scale may not really be the reason for the unpredictability.
But a lot of the trouble I have observed in these domains comes from unmodeled effects, which must be modeled and reasoned about. The GPZ work shows the same thing the researcher here shows: it takes a lot of tinkering and a lot of context to produce semi-usable results, and the SNR appears quite low for now. In security specifically, there is much value in sanitizing input data and ensuring correct parsing. Do you think LLMs are in a position to do so?
I see LLMs as tools, so, sure, I think they're in a position to do so in the same way pen-testing tools or spreadsheets are.
In the hands of an expert, I believe they can help. In the hands of someone clueless, they will just confuse everyone, much like any other tool the clueless person uses.
Maybe very very soft "engineering". Do you have metrics on which prompt is best? What units are you measuring this in? Can you follow a repeatable process to obtain a repeatable result?
I agree with the sentiment and analysis that most humans prefer short term gains over long term ones. One correction to your example, though. Dynamic bounds checking does not solve security. And we do not know of a way to solve security. So, the gains are not as crisp as you are making them seem.
Bounds checking solves one tiny subset of security. There are hundreds of other subsets that we know how to solve. However, these days the majority of the bad attacks are social, and no technology is likely to solve them - as more than 10,000 years of history of the same attacks has shown. Technology makes the attacks worse because they now scale, but social attacks have been happening for longer than recorded history (well, there is every reason to believe that - there is unlikely to be evidence going back that far).
> However these days the majority of the bad attacks are social
You're going to have to cite a source for that.
Bounds checking is one mechanism that addresses memory safety vulnerabilities. According to MSFT and CISA[1], nearly 70% of CVEs are due to memory safety problems.
You're saying that we shouldn't solve one (very large) part of the (very large) problem because there are other parts of the problem that the solution wouldn't address?
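For concreteness, here is a minimal sketch (my own illustration, not from the linked report) of the kind of bug dynamic bounds checking catches; -fsanitize=bounds is the GCC/Clang option that inserts the run-time checks:

    /* Built with -fsanitize=bounds, the out-of-bounds store below is reported
     * at run time instead of silently corrupting memory. */
    #include <stdio.h>

    int main(int argc, char **argv) {
        (void)argv;
        int buf[4] = {0};
        int i = argc + 6;   /* 7 when run with no arguments: well past the end of buf */
        buf[i] = 42;        /* caught at run time under -fsanitize=bounds */
        printf("%d\n", buf[0]);
        return 0;
    }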
While I do not have data comparing them, I have a few remarks:
1. Scammer Payback and others are documenting ongoing attacks that involve social engineering and that are not getting the attention they deserve.
2. You did not provide any actual data on the degree to which bounds checks are "large". You simply said they were because they are a subset of a large group. There are diseases that affect fewer than 100 people in the world and do not get much attention. You could point out that the people affected are humans, a group that consists of all people in the world, and thus say that one of these rare diseases affects a large group and should be a priority. That is essentially what you just did with bounds checks. I doubt they are as rare as my analogy would suggest, but the point is that the percentage is somewhere between 0 and 70%, and without any real data, your claim that it is large is unsubstantiated. That being said, most C software I have touched barely uses arrays in ways where bounds checks would matter, and when it does use arrays, it is for strings. There are safe string functions like strlcpy() and strlcat() that largely solve the string issues by doing bounds checks (see the sketch after this list). Unfortunately, people keep using the unsafe functions like strcpy() and strcat(). You would have better luck suggesting that people use safe string handling functions rather than suggesting that compilers insert bounds checks.
3. Your link mentions CHERI, which is a hardware solution to this problem. It is a shame that AMD/Intel and ARM do not modify their ISAs to incorporate the extension. I do not mean the Morello processor, which is a proof of concept; I mean the ISA specifications used in all future processors. You might have more luck if you lobbied for CHERI adoption by those companies.
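To illustrate point 2, a minimal sketch of the safe string handling I mean; strlcpy()/strlcat() take the full destination size and truncate rather than overflow (they are BSD/macOS functions, also in recent glibc and POSIX 2024, so older Linux toolchains may need a compat shim):

    #include <stdio.h>
    #include <string.h>

    int main(void) {
        char dst[16];
        const char *input = "a string longer than the destination buffer";

        /* strcpy(dst, input) would overflow dst with no warning at all. */
        if (strlcpy(dst, input, sizeof dst) >= sizeof dst)
            fprintf(stderr, "input truncated\n");   /* return value reveals the overflow attempt */

        strlcat(dst, "!", sizeof dst);              /* bounded append, never writes past dst */
        printf("%s\n", dst);
        return 0;
    }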
You don't have to "solve" security in order to improve security hygiene by a factor of X, and thus reduce the risk of negative consequences by that same factor.
I am not suggesting we refuse to close one window because another window is open. That would be silly. Of course we should close the window. Just pointing out that the "950X" example figure cited fails to account for the full cost (or overestimates the benefit).
The security-through-compartmentalization approach actually works. Compare the number of CVEs of your favorite OS with those for Qubes OS: https://www.qubes-os.org/security/qsb/