

I believe this is a fantastic idea, even if only because I was literally thinking along the same lines just last week.

I think there were some usability issues, but overall, solid stuff.

I think if I were you, I would focus on the UX of legislative bill analysis rather than the LLM side of it, but that's just me.


Happy to talk more about this if you're interested. I have a lot of ideas around this, and I see this kind of setup as really important, and beneficial.


Hey thanks for the quick feedback, I'm updating the UX atm and you can hit me up on twitter: https://x.com/jolerapidee



I'm really interested in the reasons behind Greg leaving. IIUC he was doing great engineering-related work.


My guess is internal disagreements and diverging visions for where things should go.


The Information is also reporting that Greg and others have left, or are about to leave. I wonder how long he's been gone, given he was still on Twitter promoting work until recently...


This was pretty cool to read back in 2021. They also had a previous post in 2018: https://openai.com/index/scaling-kubernetes-to-2500-nodes/ where they had previously hit this limit.

Cool to see infra solutions at OpenAI. I wonder if these are still powering existing solutions. Ben, one of the authors, seems to have left OpenAI.


I wonder if that's only true at OpenAI's current stage, which, thanks to the product-bootstrapping skills of Sam and co., has made his role irrelevant?

I mean, Jakub can take it forward at the current scale, with Sam and the rest of the leadership team, but maybe he couldn't have earlier, which is where Ilya shone?


OpenAI is the Altavista of AI. There's nothing there to scale yet - their product needs another batch of innovations to get good first.


Altman's tweet (https://x.com/sama/status/1790518031640347056?s=46) makes it seem as if he wanted to stay, and Ilya disagreed and "chose" to depart. Very interesting framing.


PR statement. After nearly being ousted, I'm sure Sam is relieved to have a thorn removed from his side.


It could be a PR statement; it could also be genuine. From the outside looking in there's no way to know, so I will just pretend this tweet doesn't exist.


It's PR for sure. A genuine announcement would have addressed the elephant in the room.


Sorry, which elephant?


That this was an employee who conspired against him in a failed palace coup


I think calling it a "palace coup" is an inappropriate framing of what happened.

I definitely think that how the board handled the situation was very inept, and I think the naivety over the blowback they would receive was one of the most surprising things for me. But after reading more about the details of what happened, and particularly writings and interviews given by the former board members, I don't think any of them did this out of any particular lust for power, or even as some sort of payback for a grudge. It seemed like all of them had real, valid concerns over Sam's leadership. Did those concerns warrant Sam's firing? From what I've read, I'm of the opinion they didn't, but obviously as just some rando on the Internet, what do I know. But I do think that there were substantive issues in question, and calling it a "palace coup" diminishes these valid concerns in my mind.


I'm not moralizing. There are palace coups that are justified.


At the time, Sam was more powerful than Ilya for sure. But framing their relationship as employee/employer when they were both on the board doesn't seem correct.


Sam's employer is who, the US taxpayer?


Someone has already tried doing that, and it's pretty close:

https://twitter.com/eli_schein/status/1790520139164614820


Altman is the biggest con artist in tech.


"Con artist" is a bad description. The guy is legit dangerous. He's not after swindling you out of your money; that wouldn't be worth it.


What’s the con? Aren’t they constantly delivering frontier models?


I would say specifically the sort of chuunibyou cringe endemic to "AI safety": claiming that their models are an existential threat.


Agree with the sentiment.

But the impression of Sam as a 'conman' might just be because he is more on the promotion/marketing side.

I've been under the impression that Ilya is the brains. So this seems bad for long term growth.


Surely he would never have gotten his current role if that were the case. There's way too much money and visibility involved.


Exactly. He’s only founded and led a company that’s built some of the most easily adoptable and exciting innovations in human-computer interactions in the last decade. Total fraud!


and which company would that be?


Loopt


And WorldCoin.


This was easily the most PR tweet of our generation.

The fact Ilya himself tweeted about it too was also easily the most PR tweet of our generation.

:D


Yeah, like when Cheney shot Harry Whittington and it was Whittington that apologized.


Since it's all in proper casing, I'm going to assume he wrote it with ChatGPT.


Or GPT-5 went rogue, took out the senior staff, and is running the game now, Westworld style.


He also literally mentioned Ilya's personal project; something that ChatGPT would do (it repeats parts of the prompt).


Ironically built by Ilya


Real life Miles Dyson


He used “easily one of the greatest minds of our generation” for two different people in the same message. 100% AI generated.


it said one was "easily one of the greatest", and it said the second was "also easily one of the greatest"... it's puffery but it's not an awkward or mindless formulation.


doesn't read like chatgpt to me, but i certainly wouldn't call it good quality writing


AI or GPT usually doesn't repeat itself like that, although it usually is easy to tell if something is GPT.


They probably take care to avoid making it look like their writing was written with ChatGPT.


I thought so, but they must've changed a lot then. In any case, it's not like the type of message they wrote is anything special; it's just the usual polite PR.


Good observation lol.


Nah. It’s the same platitudes that are always said when someone high-profile is fired.


> Ilya is easily one of the greatest minds of our generation ...

> Jakub is also easily one of the greatest minds of our generation ...

Phew, I was worried he'd be irreplaceable or something. Hopefully they've already standardized the comp package.


Easily the most plat of all platitudes of our generation


Personally I don’t trust much of anything Sam says, so I’d take any framing with a large grain of salt.


> a large grain of salt

You mean a lump of salt? I've always wondered what the right word is to describe this amount of salt /!jk


I've been offered a "lump" of sugar before, and it was not a single sugar crystal. When I hear "large grain of salt" I imagine something like this https://crystalverse.com/sodium-chloride-crystals/, quite different from a lump.


Those huge salt cubes are so fascinating! New side project added to the list...


That is absolutely brilliant and will make a fantastic week-long father-daughter science project. Thanks.

Now back to figuring out something new for a 6-year-old to program using Scratch...


I'm 37 and it's also going to make a great me-me project.


I usually use "a few bags of salt" to imply that I don't trust the source.


perhaps a "mountain of salt" in this case?


A handful of salt


A pinch of salt


a highway road side storage yard of salt


Boulder?


It’s too nice. Nobody is this nice. It’s like Truman Show nice.


While he does say he is leaving for some personal and meaningful project, let’s see what it ends up being.


That "personal and meaningful" can just mean anything.


More like hustle culture’s “spend more time with the family”


Spend more time with my side projects.


He’s got a good PR team.


I don't have a horse in this race, and maybe the root comment came off as flippant and disparaging. But I'm not reading "outsiders" as the kind of gatekeeping you describe.

Maybe another perspective is that "outsiders" may not have the same view of the issue as experts in the field, and may not (historically, in OP's experience) seem to want to work together with the experts to develop that view. Handwaving away complexities and not being willing to get their hands dirty is something I've seen as well, so maybe I'm a bit more empathetic, but cold shoulders from "experts" towards newcomers is definitely a thing.

Both of which could help both sides: bringing more depth to the fresh view of the "outsiders", and valuable freshness to the depth of the "experts".


I'm not sure folks who're putting out strong takes based on this have read this paper.

This paper uses a GPT-2-scale transformer on sinusoidal data:

> We trained a decoder-only Transformer [7] model of GPT-2 scale implemented in the Jax based machine learning framework, Pax4 with 12 layers, 8 attention heads, and a 256-dimensional embedding space (9.5M parameters) as our base configuration [4].

> Building on previous work, we investigate this question in a controlled setting, where we study transformer models trained on sequences of (x,f(x)) pairs rather than natural language.
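
For anyone skimming, here is a minimal sketch of what that controlled setting looks like (the sinusoid family, ranges, and sequence length below are my own illustrative assumptions, not the paper's exact configuration): there is no natural language at all, just sequences of (x, f(x)) pairs that the model has to continue.

    import numpy as np

    def make_icl_sequence(rng, n_points=32, x_range=(-5.0, 5.0)):
        """One in-context-learning example: (x, f(x)) pairs drawn from a
        randomly sampled sinusoid f(x) = a * sin(b * x + c).
        (Function family and ranges are illustrative, not the paper's.)"""
        a = rng.uniform(0.5, 2.0)
        b = rng.uniform(0.5, 2.0)
        c = rng.uniform(0.0, 2 * np.pi)
        xs = rng.uniform(x_range[0], x_range[1], size=n_points)
        ys = a * np.sin(b * xs + c)
        # A decoder-only transformer is trained to predict each y_i from the
        # preceding (x_1, y_1, ..., x_i) prefix, i.e. to infer f in-context.
        return np.stack([xs, ys], axis=1).reshape(-1)  # [x1, y1, x2, y2, ...]

    rng = np.random.default_rng(0)
    batch = np.stack([make_icl_sequence(rng) for _ in range(8)])  # shape (8, 64)
    print(batch.shape)

The generalization question is then whether the model can continue sequences drawn from function families it never saw during pretraining, not whether it can handle novel natural-language prompts.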

Nowhere near definitive or conclusive.

Not sure why this is news outside of the Twitter-techno-pseudo-academic-influencer bubble.


It would be news if somebody showed transformers could generalize beyond the training data. Deep learning models generally cannot, so it's not a surprise this holds for transformers.


Define "generalize beyond training data", please. Because in my opinion, asking a model to produce an avocado chair is generalizing beyond training data.


It depends on what "generalize beyond the training data" means. If I invent a new programming language and teach it (in-context) to the model, and it's able to use it to solve many tasks, is it generalizing beyond the training data?


No. The way I'd look at it is that generalization, or specifically extrapolation, would mean that different features are needed to make a prediction (here, the next token) than what is seen in the training data. Something like a made-up language could still result in the same patterns being relevant. That's why out-of-distribution research often uses mathematical extrapolation as a task.
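
As a toy illustration of that interpolation vs. extrapolation distinction (my own example, not the parent's or the paper's): fit a model only on a restricted input range and it will track the target inside that range but drift arbitrarily once the inputs fall outside it, which is what mathematical extrapolation tasks probe.

    import numpy as np

    # Fit a degree-9 polynomial to sin(x) on the "training" interval [0, 2*pi].
    x_train = np.linspace(0, 2 * np.pi, 200)
    coeffs = np.polyfit(x_train, np.sin(x_train), deg=9)

    # Inside the training range the fit tracks sin(x) closely; outside it the
    # polynomial diverges, so the prediction error explodes.
    for x in (np.pi / 3, 3 * np.pi):
        err = abs(np.polyval(coeffs, x) - np.sin(x))
        print(f"x = {x:.2f}, |error| = {err:.3g}")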


"Supercharged Interpolation" is not something that actually exists.

Learning in High Dimension Always Amounts to Extrapolation

https://arxiv.org/abs/2110.09485

What you're asking for is not "generalization" but magic and humans would also fail.


Can humans do this?

Because I'm not convinced humans can do this.

Or that it reasonably means anything.


Can you provide a real world example? Because this sounds like nonsense. As in, not a weakness of any architecture but just the very concept of pattern matching.

What you might be asking for is a system that simply continually learns.


> somebody showed transformers could generalize

I read an interesting paper recently that had a great take on this: If you add enough data, nothing is outside training data. Thus solving the generalization problem.

Wasn’t the main point of that paper, but it made me go "Huh yeah … I guess … technically correct?". It raises an interesting thought: yes, if you just train your neural network on everything, then nothing falls outside its domain. Problem solved … now if only compute were cheap.


Compute and data collection. This is the long tail problem.


doesn't the long tail just need to be a little outside the range of human capability for it to be AGI?


Not sure I understand but people don't need the long tail because we don't write rules and then blindly act on them when we encounter new things. We can reason about stuff we haven't seen before.


OpenAI showed it in 2017 with the sentiment neuron (https://openai.com/research/unsupervised-sentiment-neuron). Basically, the model learned to classify the sentiment of a text, which I would agree is a general principle, so the model learned a generalized representation based on the data.

Having said that, the real question is what percentage of the learned representations do generalize. For a perfect model, it would learn only representations that generalize and none that overfit. But, that's unreasonable to expect for a machine *and* even for a human.

Maybe we just don't know. We are staring at a black box and doing some statistical tests, but actually don't know whether the current AI architecture is capable enough to get to some kind of human intelligence equivalent.


Has it even been shown that the average human can generalize beyond their training data? Isn't this the central thrust of the controversy around IQ tests? For example, some argue that access to relevant training data is a greater determinant of performance on IQ tests than genetics[1].

[1] https://www.youtube.com/watch?v=FkKPsLxgpuY


Just because we can be genuinely creative in some way doesn't mean that having access to data doesn't make it easier.


Indeed! And even then, can we be genuinely creative? As far as I can tell, everything is derivative.


Considering the human race bootstrapped itself from nothing, what’s your mental model here? Where did everything come from?


Humans and AIs both evolve as the result of some iterations dying. In both cases, we tacitly erase the ones who don't make it (by framing the discussion around the successful, alive ones). The difference is that humans have had a broader training set.


Remembering the outcomes of accidents and remixing those memories to produce derivative works.


Nature.


> I'm not sure folks who're putting out strong takes based on this have read this paper.

They haven't read the other papers either. It's really striking to me to watch people retweet this and see it get written up in pseudo-media like Business Insider when other meta-learning papers on the distributional hypothesis of inducing meta-learning & generalization, which are at least as relevant, can't even make a peep on specialized research subreddits - like "Pretraining task diversity and the emergence of non-Bayesian in-context learning for regression", Raventós et al 2023 https://arxiv.org/abs/2306.15063 (or https://arxiv.org/abs/2310.08391 ), both of which explain & obsolete OP, and were published months before! OP is a highly limited result which doesn't actually show anything that you wouldn't expect on ordinary Bayesian meta-reinforcement-learning grounds, but there's so much appetite for someone claiming that this time, for real, DL will 'hit the wall' that any random paper appears to be definitive to critics.


> Not sure why this is news outside of the Twitter-techno-pseudo-academic-influencer bubble.

The paper is making the rounds despite being a weak result because it confirms what people want, for non-technical reasons, to be true. You see this kind of thing all the time in other fields: for decades, the media has elevated p-hacked psychology studies on three undergrads into the canon of pop psychology because these studies provide a fig leaf of objective backing for pre-determined conclusions.

