inimino · 2025-04-29T02:31:20 1745893880

No, no, no. Each of them gives you information.

bcoates · 2025-04-29T02:39:59 1745894399

In the formal, information-theory sense, they literally don't, at least not on their own without further constraints (like band-limiting or bounded polynomial degree or the like)

nurettin · 2025-04-29T05:14:07 1745903647

They give you relative information. Like word2vec

inimino · 2025-04-29T02:41:02 1745894462

...which you always have.

inimino · 2025-03-31T19:39:58 1743449998

This is an embezzlement case. No-one tampered with election results, so can you explain how there is any logic to your argument?

bryanlarsen · 2025-03-31T19:41:14 1743450074

They used the money on the election campaign.

inimino · 2025-03-31T20:36:30 1743453390

All I have seen is that four people worked for the party while being paid by the EU. Nothing like routing money to advertising campaigns or anything that would actually swing an election. And the headlines are all about embezzlement, not election fraud. So this seems like a stretch.

bisRepetita · 2025-03-31T21:56:33 1743458193

I read that 9 European representatives, plus 12 assistants, plus 4 other members of the party were found guilty as part of a scheme to illegally obtain EUR 2.9M for the party.

https://fr.wikipedia.org/wiki/Affaire_des_assistants_parleme...

bryanlarsen · 2025-03-31T20:41:06 1743453666

In my experience, boots on the ground are far more effective at affecting election outcomes than advertising campaigns are.

machomaster · 2025-03-31T22:33:14 1743460394

Go and congratulate Kamala for her win against Trump.

This is just the latest example showing how wrong your take is.

lesuorac · 2025-03-31T23:09:23 1743462563

So let's say you have a 6-sided die and you have two options.

1. You win when a 1 is rolled

2. You win when a 1 or a 2 is rolled

Even though both options win less than 50% of the time, the 2nd one is still more effective.

(Also your comment implies Trump's campaign doesn't have boots on the ground which obviously isn't true ...)
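
The die comparison above can be checked with a few lines of Python. This is only a sketch of the arithmetic, assuming a fair six-sided die; the helper name `win_probability` is made up for illustration.

```python
from fractions import Fraction

def win_probability(winning_faces, sides=6):
    """Chance of winning on one roll of a fair die with `sides` faces."""
    return Fraction(len(set(winning_faces)), sides)

option_1 = win_probability({1})      # win only when a 1 is rolled
option_2 = win_probability({1, 2})   # win when a 1 or a 2 is rolled

print(option_1, option_2)            # 1/6 1/3
print(option_2 > option_1)           # True: option 2 wins more often
print(option_2 < Fraction(1, 2))     # True: and still loses most of the time
```

Both options lose more often than they win, yet the second is twice as likely to win as the first.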

inimino · 2025-03-14T20:50:16 1741985416

And in case it isn't obvious, with AI "companionship" models and such, this is about to get a lot worse as the cost of the string-along goes to zero.

LoganDark · 2025-03-14T21:05:18 1741986318

I await the day where the companionship models are actually as good at creative writing as even ChatGPT. Of course if they're ever as good as a real person (the holy grail) then I'll be very happy, but I don't see that happening any time soon.

inimino · 2025-03-14T20:47:40 1741985260

I have a paper coming up that I modestly hope will clarify some of this.

The short answer should be that it's obvious LLM training and inference are both ridiculously inefficient and biologically implausible, and therefore there have to be some big optimization wins still on the table.

snowwrestler · 2025-03-14T21:25:18 1741987518

I think the hard question is whether those wins can be realized with less effort than what we’re already doing, though.

What I mean is this: A brain today is obviously far more efficient at intelligence than our current approaches to AI. But a brain is a highly specialized chemical computer that evolved over hundreds of millions of years. That leaves a lot of room for inefficient and implausible strategies to play out! As long as wins are preserved, efficiency can improve this way anyway.

So the question is really, can we shortcut that somehow?

It does seem like doing so would require a different approach. But so far all our other approaches to creating intelligence have been beaten by the big simple inefficient one. So it’s hard to see a path from here that doesn’t go that route.

sockaddr · 2025-03-14T21:49:30 1741988970

Also, a brain evolved to be a stable compute platform in a body that finds itself in many different temperature and energy regimes. And the brain can withstand and recover from some pretty severe damage. So I'd suspect an intelligence that is designed to run in a tighter temp/power envelope with no need for recovery or redundancy could be significantly more efficient than our brain.

fallingknife · 2025-03-14T21:55:27 1741989327

The brain only operates in a very narrow temperature range too. 5 degrees C in either direction from 37 and you're in deep trouble.

choilive · 2025-03-14T22:02:12 1741989732

Most brain damage would not be considered in the realm of what most people would consider "recoverable".

numba888 · 2025-03-16T18:20:50 1742149250

In some cases it doesn't recover even without physical or chemical damage. Psychiatric clinics are full of these stories.

Etheryte · 2025-03-14T22:16:24 1741990584

How does this idea compare to the rationale presented by Rich Sutton in The Bitter Lesson [0]? In short, why do you think biological plausibility has significance?

[0] http://www.incompleteideas.net/IncIdeas/BitterLesson.html

inimino · 2025-03-16T00:37:56 1742085476

I'll have to refer you to my forthcoming paper for the full argument, but basically, humans (and all animals) experience surprise and then we attribute that surprise to a cause, and then we update (learn).

In ANNs we backprop uniformly, so the error correction is distributed over the whole network. This is why LLM training is inefficient.

rsfern · 2025-03-14T22:23:18 1741990998

I’m not GP, but I don’t think their position is necessarily in tension with leveraging computation. Not all FLOPs are equal, and furthermore FLOPs != Watts. In fact a much more efficient architecture might be that much more effective at leveraging computation than just burning a bigger pile of GPUs with the current transformer stack

inimino · 2025-03-16T00:38:33 1742085513

Right.

zamubafoo · 2025-03-14T22:06:42 1741990002

Honest question: Given that the only widely agreed-upon example of anything approaching general intelligence is humans, and that humans are biological systems that have evolved in physical reality, are there any arguments that better efficiency is even possible without leveraging the nature of reality?

For example, analog computers can differentiate near instantly by leveraging the nature of electromagnetism and you can do very basic analogs of complex equations by just connecting containers of water together in certain (very specific) configurations. Are we sure that these optimizations to get us to AGI are possible without abusing the physical nature of the world? This is without even touching the hot mess that is quantum mechanics and its role in chemistry which in turn affects biology. I wouldn't put it past evolution to have stumbled upon some quantum mechanic that allowed for the emergence of general intelligence.

I'm super interested in anything discussing this but have very limited exposure to the literature in this space.

HDThoreaun · 2025-03-14T22:20:37 1741990837

The advantage of artificial intelligence doesn't even need to be energy efficiency. We are pretty good at generating energy; if we had human-level AI, even if it used an order of magnitude more energy than humans use, that would likely still be cheaper than a human.

inimino · 2025-03-16T00:40:56 1742085656

Inference is already wasteful (compared to humans) but training is absurd. There's strong reason to believe we can do better (even prior to having figured out how).

numba888 · 2025-03-16T18:23:37 1742149417

That would mean with current resources AI can get so much more intelligent than humans, right? Aren't you scared?

inimino · 2025-03-17T05:26:56 1742189216

That's a potential outcome of any increase in training efficiency.

Which we should expect, even from prior experience with any other AI breakthrough, where first we learn to do it and then we learn to do it efficiently.

E.g. Deep Blue in 1997 was IBM showing off a supercomputer, more than it was any kind of reasonably efficient algorithm, but those came over the next 20-30 years.

vessenes · 2025-03-14T20:58:07 1741985887

I’m looking forward to it! Inefficiency (if we mean energy efficiency) conceptually doesn’t bother me very much, in that it feels like silicon design has a long way to go still, but I like the idea of looking at biology for both ideas and guidance.

Inefficiency in data input is also an interesting concept. It seems to me humans get more data in than even modern frontier models, if you use the gigabit/s estimates for sensory input. Care to elaborate on your thoughts?

jedberg · 2025-03-14T20:50:55 1741985455

> and biologically implausible

I really like this approach. Showing that we must be doing it wrong because our brains are more efficient and we aren't doing it like our brains.

Is this a common thing in ML papers or something you came up with?

_3u10 · 2025-03-14T21:27:24 1741987644

Nah it’s just physics, it’s like wheels being more efficient than legs.

We know there is a more efficient solution (human brain) but we don’t know how to make it.

So it stands to reason that we can make more efficient LLMs, just like a CPU can add numbers more efficiently than humans.

jonplackett · 2025-03-14T22:39:58 1741991998

Wheels is an interesting analogy. Wheels are more efficient now that we have roads. But there could never have been evolutionary pressure to make them before there were roads. Wheels are also a lot easier to get to work than robotic legs and so long as there’s a road do a lot more than robotic legs.

_3u10 · 2025-03-16T10:54:27 1742122467

People think the first wheel was invented for making pottery. Biological machinery for the most part has to be self-reproducing, so there are a lot of limitations on design. It also has to be able to evolve, so you get inefficient solutions like the vagus nerve (I think that's its name). Basically, there's a really long nerve in your body that takes a route under your trachea and then back up to another part of your brain; in giraffes it's something like 40 feet long to go a few inches by the shortest path.

Wheels other than rolling would likely never evolve naturally because there's no real incremental path from legs to wheels, whereas flippers can evolve from webbed fingers incrementally getting better for moving in water.

I dunno, maybe there's an evolutionary path for wheels, but I don't think so.

esafak · 2025-03-14T20:52:30 1741985550

Evolution does not need to converge on the optimum solution.

Have you heard of https://en.wikipedia.org/wiki/Bio-inspired_computing ?

parsimo2010 · 2025-03-14T20:56:00 1741985760

I don't think GP was implying that brains are the optimum solution. I think you can interpret GP's comments like this: if our brains are more efficient than LLMs, then clearly LLMs aren't optimally efficient. We have at least one data point showing that better efficiency is possible, even if we don't know what the optimal approach is.

esafak · 2025-03-14T21:00:14 1741986014

I agree. Spiking neural networks are usually mentioned in this context, but there is no hardware ecosystem behind them that can compete with Nvidia and CUDA.

leereeves · 2025-03-14T21:31:34 1741987894

Investments in AI are now counted in billions of dollars. Would that be enough to create an initial ecosystem for a new architecture?

vlovich123 · 2025-03-14T22:06:39 1741989999

A new HW architecture for an unproven SW architecture is never going to happen. The SW needs to start working initially and demonstrate better performance. Of course, as with the original deep neural net stuff, it took computers getting sufficiently advanced to demonstrate this is possible. A different SW architecture would have to be so much more efficient to work. Moreover, HW and SW evolve in tandem - HW takes existing SW and tries to optimize it (e.g. by adding an abstraction layer) or SW tries to leverage existing HW to run a new architecture faster. Coming up with a new HW/SW combo seems unlikely given the cost of bringing HW to market. If AI speedup of HW ever delivers like Jeff Dean expects, then the cost of prototyping might come down enough to try to make these kinds of bets.

esafak · 2025-03-14T21:56:58 1741989418

Nvidia has a big lead, and hardware is capital intensive. I guess an alternative would make sense in the battery-powered regime, like robotics, where Nvidia's power hungry machines are at a disadvantage. This is how ARM took on Intel.

jedberg · 2025-03-14T20:54:35 1741985675

It does not, you're right. But it's an interesting way to approach the problem nevertheless. And given that we definitely aren't as efficient as a human brain right now, it makes sense to look at the brain for inspiration.

fluidcruft · 2025-03-14T22:48:58 1741992538

How are you separating the efficiency of the architecture from the efficiency of the substrate? Unless you have a brain made of transistors or an LLM made of neurons how can you identify the source of the inefficiency?

inimino · 2025-03-16T02:22:56 1742091776

You can't, but the transistor-based approach is the inefficient one, and transistors are pretty good at efficiently doing logic, so either there's no possible efficient solution based on deterministic computation, or there's tremendous headroom.

I believe human and machine learning unify into a pretty straightforward model and this shows that what we're doing that ML doesn't can be copied across, and I don't think the substrate is that significant.

inimino · 2024-08-12T17:14:10 1723482850

This reminds me of a rule I have for naming things in code (functions, variables, etc).

Say you add a function, and then the first time you call that function, you call it by a different name. Don't fix the function call to match the original name, but instead go back and change the name to match how you tried to call it. The state of mind you are in when you called the function is a better guide to naming than the state of mind you were in when you implemented it.
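
A small sketch of the rule in practice. All names here are hypothetical, chosen only to illustrate the rename; the function body is a stand-in.

```python
# Step 1: the name picked while implementing was `compute_string_token_count`.
# Step 2: the first call site, written from memory, said `count_words(...)`
#         and failed with a NameError.
# Step 3: instead of fixing the call, rename the definition to match it.

def count_words(text):
    """Return the number of whitespace-separated words in `text`."""
    return len(text.split())

# The call site stays exactly as it was first typed.
print(count_words("name things the way you reach for them"))  # prints 8
```

The call site is treated as the source of truth because it records what the name looked like from the outside, without the implementation details in view.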

inimino · 2024-08-05T05:50:07 1722837007

The person who wrote both the original and new versions isn't qualified to say one is better than the other?

PeeMcGee · 2024-08-05T09:16:13 1722849373

If they are the only user or developer, sure. Otherwise they are the least qualified to say it's better -- like how I'd be the least qualified to declare myself winner of a handsome contest.

cocok · 2024-08-05T07:20:14 1722842414

I'm stealing this for all my future code reviews.

inimino · on May 16, 2024

It's the economically viable part that I think is currently hard, but agreed this is the right approach.

The basic problem is that scaling up understanding over a large dataset requires scaling the application of an LLM and tokens are expensive.

lysecret · on May 16, 2024

And the minutes of latency you currently get with these context lengths.

ZeroCool2u · on May 16, 2024

Yeah, this is why I mentioned Gemini with the context caching. It's not out yet, but supposedly launching soon. You pay a lower rate for storing the system prompt or whatever you dump in before the user query, plus you don't have to wait the full minute or so for all your research to be ingested every time.

https://ai.google.dev/gemini-api/docs/caching#get-started

inimino · on May 16, 2024

As someone who has worked on LLMs somewhat extensively, the idea that we are going to accidentally make a superintelligence by that path is literally laughable.

inimino · on May 16, 2024

> Wetterhahn

Who, just to be clear, was not being cavalier, but followed all the relevant precautions known at the time. They were just inadequate.

Joel_Mckay · on May 16, 2024

There was some controversy over the details, but it was a terrible way to go.

Most people are trained from day 1 that PPE is never perfect.

Arrath · on May 16, 2024

PPE is on the bottom of the hierarchy of controls for good reason.

Joel_Mckay · on May 17, 2024

Exactly, I lost count of the number of times I've done a PSA on YT explaining most carbide dust is tumorigenic.

It is like seeing a dog go after a porcupine... the risk assessment sub-process just isn't enough for some folks =3

inimino · on May 2, 2024

Simpler yet, just tell the model "Reply with 'Yes' or 'No'."
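
A minimal sketch of that approach, assuming the model's reply arrives as a plain string. `build_prompt` and `parse_yes_no` are hypothetical helpers, not part of any library, and no particular model API is assumed.

```python
def build_prompt(question):
    """Append the fixed instruction to a question (hypothetical helper)."""
    return f"{question}\n\nReply with 'Yes' or 'No'."

def parse_yes_no(reply):
    """Map a model reply to True/False, or None if it is neither."""
    # Tolerate surrounding whitespace, quotes, and trailing punctuation.
    cleaned = reply.strip().strip("'\".!").lower()
    if cleaned == "yes":
        return True
    if cleaned == "no":
        return False
    return None  # the model did not follow the instruction

print(build_prompt("Is the sky blue?"))
print(parse_yes_no("Yes."))    # True
print(parse_yes_no(" 'No' "))  # False
print(parse_yes_no("Maybe"))   # None
```

Returning `None` for anything else leaves it to the caller to retry or to treat the reply as a failure.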