It's the name at the top. This particular author has been active with LLM posts at least since ChatGPT's popularity exploded, and all of their posts on that topic seem well-informed (and they are otherwise community-famous for co-authoring Django). To your point, the content is only as special as the author's reputation makes it, which will differ from reader to reader.
His posts on AI are often very insightful and, unusually for someone so involved in AI, he's not connected to any of the big AI companies, which makes him relatively impartial.
Agreed. "Dessert" vs "desert" - mistaking these two is often not a grammatical error (they're both nouns), but is a spelling error (they have quite different meanings, and the person who wrote the word simply spelled it wrongly).
I agree, but this is definitely the kind of spelling error (along with complementary/complimentary, discrete/discreet, etc.) that we normally don't expect our spellcheckers to catch.
Judges impose the punishment set out in the law; they don't make this stuff up.
The alternative is judges letting people off just because they're politicians. That seems like an extremely poor precedent to set; those in political life should be held to higher standards.
I didn't say we should let someone off; a crime has been committed and the person should be punished. But forbidding someone from running for office? I don't think that should be a power of the judiciary. That should be the power and responsibility of the electorate (by not voting for them).
You do understand that this is explicitly mandated by law, and that it can be lifted only in special cases (here, the judge cited the defendant's lack of remorse or admission as a deciding factor)?
Here is a reference for that: https://www.vie-publique.fr/questions-reponses/297965-inelig...
>> Trump should be allowed to run for a 3rd term right?
From the 22nd Amendment:
"No person shall be elected to the office of the President more than twice, and no person who has held the office of President, or acted as President, for more than two years of a term to which some other person was elected President shall be elected to the office of the President more than once."
Trump might not be able to "be elected to the office of the President" again, but he could, for example, run as Vice President and then have the President resign, allowing Trump to serve another term.
Of course the 12th Amendment says, "no person constitutionally ineligible to the office of President shall be eligible to that of Vice-President of the United States", but the 22nd Amendment doesn't say a two-term President is ineligible to the office of President, it says he can't "be elected to the office of President".
The Supreme Court recently decided that a law prohibiting false statements did not prohibit misleading statements. If the legislators had wanted to prohibit misleading statements, they would have prohibited false and misleading statements, not just false ones. Words matter to them.
And there are many other possibilities for creative types.
As far as the French eligibility rules go, would you be comfortable with a system where anyone who Trump's DOJ can get a conviction on is ineligible to run for office, with no right of appeal on that holding? That would be a really terrible incentive.
Not only Trump. Without the rules, Musk or Putin could run as well, the latter even work-from-home style. Also, if justice being blind is so bad before an election, why not after? Figuring out who won shouldn't involve any courts either. The public will just need to figure out who really won for themselves!
Within UK dialects there would be some significant differences in many of these words, even ignoring the meddle/mettle examples - farrow/pharaoh is easily distinguishable, too.
I would say, though, that to people _outside_ the dialect, there may be many more words that are indistinguishable. Listening to Scots speakers requires a lot more effort for me because to my ears, many of the differences in the words are extremely subtle.
I agree it's heavily accent dependent and I suspect the original compiler wasn't that aware of non-mainstream US accents.
It's interesting that many of these are only the same (initially at least) if you've been sloppy or ignorant in your pronunciation, and then those become baked-in ways of saying something.
We're due to get a lot more of these given how often you hear influencers guessing at what to me seem fairly mainstream pronunciations!
These are often where TTS systems slip up most obviously. A lockdown project I tinkered with several years back was a small (traditional) LM that had been fed tagged examples and could thus predict fairly well the best sense for a particular case. It made a huge difference to perceived quality. Now, of course, many TTS systems cope with this fairly well, but you still hear the odd slip-up!
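For the curious, here's roughly what that kind of sense predictor looks like in miniature. This is a from-scratch toy sketch rather than the actual project (the words, senses, and training sentences are all invented): count which context words co-occur with each tagged pronunciation, then pick the best-scoring sense for a new sentence.

    from collections import Counter, defaultdict

    # Invented tagged examples: (sentence, homograph, pronunciation sense).
    TAGGED = [
        ("she shed a single tear", "tear", "TIER"),    # rhymes with "fear"
        ("a tear rolled down his cheek", "tear", "TIER"),
        ("don't tear the paper", "tear", "TARE"),      # rhymes with "fair"
        ("the fabric will tear easily", "tear", "TARE"),
    ]

    def train(tagged):
        # Count which context words co-occur with each tagged sense.
        counts = defaultdict(Counter)
        for sentence, word, sense in tagged:
            counts[(word, sense)].update(w for w in sentence.split() if w != word)
        return counts

    def predict(counts, sentence, word):
        # Score each known sense by context-word overlap with the input.
        context = set(sentence.split()) - {word}
        senses = {s for (w, s) in counts if w == word}
        return max(senses, key=lambda s: sum(counts[(word, s)][w] for w in context))

    counts = train(TAGGED)
    print(predict(counts, "a tear fell from her eye", "tear"))  # -> TIER
    print(predict(counts, "try not to tear the page", "tear"))  # -> TARE

A real system would use part-of-speech tags and n-gram context rather than a bag of words, but even this crude scoring hints at why tagged examples made such a difference.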
"farrow"/"pharaoh" is more than easily distnguishable - to me, the first vowel in these are nowhere even close to the same - I use "a" from "apple" for "farrow" and "ai" from "air" for pharoah, along with a contrast in vowel lengths, again like "apple" and "air".
EDIT: interesting grammar note - as a native speaker, I can't even decide if that should be "first vowel in these is" or "first vowels in these are" or what I actually wrote above which is what seems more natural to me, although immediately stood out to me as grammatically inconsistent when I re-read it after posting...
> as a native speaker, I can't even decide if that should be "first vowel in these is" or "first vowels in these are" or what I actually wrote above which is what seems more natural to me, although immediately stood out to me as grammatically inconsistent when I re-read it after posting...
I would say that the former (“first vowel in these is”) is ‘more correct’, but it sounds weird because it contains the plural “these” immediately before the singular “is”. What you actually wrote is inconsistent strictly speaking, but it feels better because the verb agrees with the immediately preceding word. (This kind of thing is rather common in languages with agreement.)
I think you're right about the essential ingredient in this finding, but I feel like this is a pretty ARC-AGI specific result.
Each puzzle has a similar format, and the data that changes in the puzzle is almost precisely that needed to deduce the rule. By reducing the amount of information needed to describe the rule, you almost have to reduce your codec to what the rule itself is doing - to minimise the information loss.
I feel like if there was more noise or arbitrary data in each puzzle, this technique would not work. Clearly there's a point at which that gets difficult - the puzzle should not be "working out where the puzzle is" - but this only works because each example is just pure information with respect to the puzzle itself.
I agree with your observation about the exact noise-free nature of the problem. It allows them to formulate the problem as "minimize complexity such that you memorize the X-y relationship exactly". This would need to be generalized to the noisy case: instead of demanding exact memorization, you'd need to prescribe an error budget. But then this error budget seems like an effective complexity metaparameter, doesn't it, and we're back to square zero of cross-validation.
If we think of the 'budget' as being similar to a bandwidth limit on video playback, there's a kind of line below which the picture becomes pretty unintelligible, but for the most part it's a slider: the lower the budget, the less accurate the playback.
But because this is clean data, I wonder if there's basically a big gap here: the codec that encodes the "correct rule" can achieve a step-change lower bandwidth requirement than similar-looking solutions. The most elegant ruleset - at least in this set of puzzles - always compresses markedly better. And so you can kind of brute-force the correct rule by trying lots of encoding strategies, and just identify which one gets you that step-change compression benefit.
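To make that concrete, here's a toy sketch of the selection rule we're both circling (my own illustrative framing, not the paper's actual method): keep every candidate rule whose training error fits within a budget, then pick the shortest description. Budget zero is the exact, noise-free ARC case, and the "step change" shows up as the winner being dramatically shorter than the runner-up.

    def select_rule(candidates, examples, error_budget=0):
        # candidates: list of (source_text, callable); examples: (x, y) pairs.
        admissible = []
        for source, fn in candidates:
            errors = sum(1 for x, y in examples if fn(x) != y)
            if errors <= error_budget:
                # Description length proxied crudely by source length.
                admissible.append((len(source), source, fn))
        admissible.sort()
        if len(admissible) >= 2:
            # The step-change signal: how much shorter is the winner?
            print("gap to runner-up:", admissible[1][0] - admissible[0][0])
        return admissible[0][2] if admissible else None

    examples = [(1, 2), (3, 6), (10, 20)]  # secretly "double it"
    candidates = [
        ("x * 2", lambda x: x * 2),
        ("x + x if x < 100 else x", lambda x: x + x if x < 100 else x),
        ("x + 1", lambda x: x + 1),  # wrong: fails exact memorisation
    ]
    rule = select_rule(candidates, examples)
    print(rule(7))  # -> 14

With a nonzero error_budget the admissible set stops collapsing onto the true rule, which is exactly the cross-validation regress the parent comment describes.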
No, it's incorrect and/or badly worded. The author is right that a machine cannot author things, and the stuff that the LLM might create de novo would not have copyright protection. But it misses the point when the argument is that existing authored works could be regenerated via an LLM, where the authorship/copyright is already established.
> the stuff that the LLM might create de novo would not have copyright protection
Can you expand on this? From my academic studies (which are indeed growing a bit stale), a language model (Large, Medium, Small, it doesn't matter) is a deterministic machine. Given the same input x n times, it will produce the same output y, n times. Some implementations of LMs might introduce noise to randomise output, but that is not intrinsic to all LMs.
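To make the determinism point concrete, a toy sketch (the numbers are invented and this isn't any real model's API): the model proper is a pure function of its input; sampling is a bolt-on stage, and greedy decoding skips it entirely.

    import math
    import random

    def next_token_logits(context):
        # Stand-in for a trained model: a fixed, deterministic function
        # of its input (scores are made up for illustration).
        return {"cat": 2.0, "dog": 1.5, "pelican": 0.5}

    def greedy(context):
        # Same context in, same token out: fully deterministic.
        logits = next_token_logits(context)
        return max(logits, key=logits.get)

    def sample(context, temperature=1.0):
        # The randomness is injected here, outside the model proper.
        logits = next_token_logits(context)
        weights = [math.exp(v / temperature) for v in logits.values()]
        return random.choices(list(logits), weights=weights, k=1)[0]

    print(greedy("the quick brown"))  # always "cat"
    print(sample("the quick brown"))  # varies run to run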
A language model has no volition, no intent, it does not start without the intervention of a human (or another machine if it is a part of an automated chain).
How is this different compared to a compiler?
With a compiler I craft something in a specific language, often a programming language, and commit it; then a long chain of automated actions happens:
1. The code gets automatically pushed to a repository by my machine
2. The second machine automatically runs tests and fuzzes
3. The second machine automatically compiles binaries
4. The second machine packages the binaries
5. The second machine publishes the binaries to a third machine
How is the above workflow any different from someone crafting something in a specific language (a prompt) and sending it through a deterministic LM?
edit: re-reading my own question, I think I need to clarify a bit: how can an LLM be said to create anything, and if so, how is that really any different from a run-of-the-mill developer workflow?
Having read MS code and then starting to generate new code that is heavily inspired by it - sure, that's not copyright infringement. But if you had memorized a bunch of code (and this is within human capability; people can recite many works of literature of varying length with total accuracy, given sufficient study), that would be copyright infringement once the amount of code was non-trivial. The test in copyright is whether the copying is literal, not how the copying was done or whether it passed through a human brain.
This scenario rarely comes up because humans are, generally, an awful medium for accurate repetition. However, it's not really been shown that LLMs are not: in fact, Copilot claims (at least in its Enterprise agreements) to check that its output _does not_ parrot existing code identically. The specific commitment they made in their blog post is/was, "We have incorporated filters and other technologies that are designed to reduce the likelihood that Copilots return infringing content". To be clear, they only propose to reduce the possibility, not remove it.
LLMs rely on a form of lossy compression which can sometimes give back verbatim content. I think it's pretty clear and unarguable that this is a copyright infringement.
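To illustrate how simple such a filter can be in principle (this is my own guess at the shape of the thing; GitHub hasn't published their actual mechanism): flag any output that shares a long enough verbatim token run with a known corpus.

    def ngrams(tokens, n):
        return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

    def looks_verbatim(output, corpus, n=12):
        # True if any n consecutive tokens of the output appear verbatim
        # in any corpus document. n is the sensitivity knob: lower values
        # catch more, at the cost of flagging idiomatic boilerplate.
        out_grams = ngrams(output.split(), n)
        return any(out_grams & ngrams(doc.split(), n) for doc in corpus)

    corpus = ["int main ( void ) { return 0 ; }"]
    print(looks_verbatim("int main ( void ) { return 0 ; }", corpus, n=5))  # True
    print(looks_verbatim("fn main ( ) { }", corpus, n=5))                   # False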
We don't really know what GPT-4 "is". I remember reading a number of relatively well-informed suggestions that there are a number of models inside there, and that the API being interacted with is some form of outer loop around them.
I don't think the location of the outer loop or its design really makes much difference. There is no flock of birds without the individuals; the flock itself doesn't really exist as a tangible thing, but what arises out of the collective adjustments between all those individuals gives rise to a flock. Similarly, we may find that groups of LLMs and various outer control loops give rise to emergent phenomena much greater than the sum of their parts.
To be clear, I think brute force generally means an iterative search of a solution space. I don't think that's what this system is doing; it's not following some search path and returning as early as possible.
It's similar in that a lot of wrong answers are being thrown up, but I think this is more like a probabilistic system being pruned than a walk of the solution space. It's much smarter, but not as smart as we would like.
> To be clear, I think brute force generally means an iterative search of a solution space.
Sure, but not an exhaustive one - you stop when you get a solution. Brute force does not require an exhaustive search in order to be called brute-force.
GP was using the argument that because it is not exhaustive, it cannot be brute-force. That's the wrong argument. Brute-force doesn't have to be exhaustive to be brute-force.
A brute force search can be expected to find a solution after a more thorough search of the space of possibilities. If it really is only searching 0.000001% of that space before finding solutions, then some structure of the problem is guiding the search and it's no longer brute force.
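A toy contrast with my own example: both searches below stop at the first solution, so early exit alone doesn't disqualify brute force; what disqualifies it is exploiting problem structure (here, monotonicity) to skip almost the whole space.

    def brute_force_sqrt(n):
        # Try every candidate in order; stopping early doesn't make it
        # any less brute-force.
        for x in range(n + 1):
            if x * x == n:
                return x
        raise ValueError("no integer square root")

    def guided_sqrt(n):
        # Binary search: monotonicity of x*x prunes half the remaining
        # space at every step, so almost no candidates are ever examined.
        lo, hi = 0, n
        while lo <= hi:
            mid = (lo + hi) // 2
            if mid * mid == n:
                return mid
            if mid * mid < n:
                lo = mid + 1
            else:
                hi = mid - 1
        raise ValueError("no integer square root")

    print(brute_force_sqrt(1_048_576))  # 1024, after 1025 candidate checks
    print(guided_sqrt(1_048_576))       # 1024, after about 20 probes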