I haven't read the paper yet, but from a structural 'attention' perspective, being unable to detect unclassified omissions is completely expected. (Though I think it can be solved with structured thought.)
For needle in a haystack you have to pay attention to the thing that you are trying to find. Attention can do this pretty well.
When looking for an omission, that omission can be anything; you can only reason about it by comparing one whole context to another whole context. The attention layers can't really do that.
This is similar to the "rank a long set of things" problem. Absent some meta cognition process, they just can't do that.
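To make the "needle in a haystack" half of this concrete, here's a toy sketch (not from the paper; all values are made up) of scaled dot-product attention: when the query vector is aligned with the needle's key, that one position soaks up nearly all of the attention weight. This is why retrieval of a *known* target is easy for attention:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Hypothetical toy example: a query "looking for" one known pattern.
d = 4
needle = np.array([2.0, 4.0, -2.0, 6.0])            # the fact we want to find
hay = np.random.default_rng(1).normal(size=(9, d))  # 9 distractor keys
K = np.vstack([hay, needle])                        # needle hidden at index 9

q = needle                                          # query matches the needle
w = softmax(K @ q / np.sqrt(d))                     # attention weights

assert w.argmax() == 9    # the needle's position dominates the weights
```

Finding an omission has no such target to align the query with, which is the asymmetry being described above.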
> When looking for an omission, that omission can be anything,
In this benchmark they give the LLM the necessary information to determine what is missing. For example: “here is a poem, and here is a version of that same poem that may or may not be missing lines. Are any lines missing?”
It’s more a tuning issue IMHO than an inherent weakness in LLMs.
If I were asked to find an omission in an ML paper, my brain compares it with other ML papers; it does not need to compare it to Star Wars, Top Gear, Greek history, pottery, and the thousands of other contexts I may know about.
Sorry, I meant the omission can be anything in the context, not anything in the world... lol.
That is still hard. You only have so many attention heads looking for things... you can't pay attention to EVERYTHING... which is what is required to find the omission.
To pay attention to everything, set the query vector to 0. Then all attention scores will be equal and the attention output is the average of the value vectors.
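The zero-query claim is easy to verify numerically. A minimal sketch (toy shapes, made-up data): with q = 0, every score is 0, softmax gives a uniform distribution, and the output collapses to the mean of the value vectors.

```python
import numpy as np

d, n = 4, 5                       # head dimension, sequence length (toy values)
rng = np.random.default_rng(0)
K = rng.normal(size=(n, d))       # keys
V = rng.normal(size=(n, d))       # values
q = np.zeros(d)                   # zero query vector

scores = K @ q / np.sqrt(d)       # all zeros, regardless of K
weights = np.exp(scores) / np.exp(scores).sum()  # softmax -> uniform 1/n
out = weights @ V                 # attention output

assert np.allclose(weights, 1 / n)        # equal attention everywhere
assert np.allclose(out, V.mean(axis=0))   # output = average of values
```

Of course, "attending to everything equally" is just averaging: it carries no information about which position matters, which is rather the point of the objection above.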
We should note that "where is there a line missing from this poem: ____?" contains sufficient information to answer correctly without needing a copy of the original to compare to.
Here are two verses of a poem (song) in Mandarin Chinese:
yi quan ting ni de
er gei ni hao de
shu dao san yong yuan ai ni yi ge
si bu hui fan cuo
wu bu hui luo suo
shuo ni xiang shuo de
zuo ni xiang zuo de
bie pa shi bai yin wei ni you wo
pei ni kan ri luo
pei ni yi qi chang wan wo men ai de ge
I removed two lines. Where did that happen?
Would your answer be different if I told you that I might or might not have removed some lines?
> Here are two verses of a poem (song) in Mandarin Chinese:
> …
> I removed two lines. Where did that happen?
If you read the paper you will see they provide the original as well as the version with missing information.
I did mention this in my comment too.
I am quite sure I could find your two missing lines if you provide me the full poem.
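And with the full poem in hand, the task really is mechanical; a plain line diff recovers the removed lines exactly. A minimal sketch with made-up stand-in lines:

```python
import difflib

# Hypothetical stand-ins for the original poem and the redacted version.
original = ["line one", "line two", "line three", "line four", "line five"]
redacted = ["line one", "line three", "line five"]   # two lines removed

# ndiff marks lines present only in `original` with a "- " prefix.
removed = [
    line[2:]
    for line in difflib.ndiff(original, redacted)
    if line.startswith("- ")
]
assert removed == ["line two", "line four"]
```

That's the sense in which providing the original turns the benchmark into a comparison task rather than an open-ended detection task.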
Given that you are a prolific commenter on HN, I am sure an LLM could be fine-tuned to detect missing text from your comments without additional information. For example …
> WinForms is still around. There have been further tec lly just a big tire fire and about the best you can do is to ignore all of them and develop in WinForms.
It’s probably possible to detect that information is missing between “tec” and “lly”. But knowing what was there is not possible even for a human, beyond plausible guesses.
...did you read my comment? The first - and really only - thing I say is that the original isn't necessary. Then there's an example. You shouldn't have trouble identifying where lines have been removed from the Chinese poem.
The fact that the original was provided doesn't demonstrate that it's necessary to the task. You can identify missing text without needing to know what was there.
> Given that you are a prolific commenter on HN, I am sure a LLM could be fine tuned to detect missing text from your comments without additional information.
Same thing. Why would you need to do tuning on text authored by me? You can easily detect missing text of that style by the fact that the sentence you have fails to be English. You can do the same thing in text for which you have no prior experience of the author.
> I am quite sure I could find your two missing lines if you provide me the full poem.
I’ll take the bait :-).
Endings of lines seem to come in pairs (de, de; cuo, suo; de, de; wo, luo).
I’d therefore conjecture that lines are missing after ‘ge’ and ‘ge’.
This of course assumes Chinese poetry is based on matching vowel sounds, as is the case in e.g. German, and not on rhythm, as would be the case in Latin and Arabic.