More

lazypenguin · 2025-05-27T23:54:34 1748390074

I’ve played with Slint and it’s nice but depending on how this plays out it might be a concern. QML is of course much more mature than Slint. It would be interesting to read an in depth blog post by Slint about why fundamentally Slint is better than QML since one of the main advantages (multi-language) is potentially being nullified here.

lazypenguin · 2025-05-03T19:48:17 1746301697

That’s absurd, that’s like saying we should only use C++ for backend code because my CRUD business app might one day scale to infinity. Better be safe than sorry and sling pointers and CMake just in case I need that extra juice!

quantadev · 2025-05-04T00:31:52 1746318712

imo, even if the only "interactivity" a web app has is just a login page, then even that alone is enough to warrant using a framework rather than doing direct DOM manipulation (or even worse, full page refreshes after a form submit).

It's not about using the most powerful tool always, it's about knowing how to leverage modern standards rather than reinventing and solving problems that are already solved.

yawaramin · 2025-05-09T20:21:22 1746822082

> ...a web app has is just a login page, then even that alone is enough to warrant using a framework rather than doing direct DOM manipulation (or even worse, full page refreshes

Maybe, but it doesn't necessarily need to be a SPA framework though. There are simpler libraries/frameworks like htmx that considerably reduce complexity and also let you avoid direct DOM manipulation.

quantadev · 2025-05-11T15:07:18 1746976038

Yeah, the debate about which framework is best is like debating which programming language is best, in that they all have pros and cons, that can be debated endlessly.

However, in general, whatever's most popular is a strong signal (like a crowd-sourced signal) that it's probably indeed "the best" all around, which are two words that can also, of course, be debated endlessly based on one's personal definition of "best".

willsmith72 · 2025-05-03T19:59:41 1746302381

Not really. If you use react router, you can have a client side js app and add SSR with a couple of hours work. You have your cake and eat it

lazypenguin · 2025-04-12T19:22:58 1744485778

Strange comment, there’s plenty of videos of both locations on YouTube to make the comparison and I think it’s quite apt. Chinese (and other SEA) major cities definitely feel much more modern than most American cities these days. Most American metropolitan areas are quite bland/bleak outside the “beautified” green areas.

sepositus · 2025-04-12T19:25:11 1744485911

> A trip to one of the major cities in China made it clear to me that they are ahead of the world right now

Sorry, I should have been more clear, this is what I was referencing. I have been to SF recently and would agree it's not hard to make a lot of cities look better in comparison.

lazypenguin · 2025-03-19T01:24:12 1742347452

I also married in my early twenties (21 actually) and am still happily married over a decade later. No regrets for me; I was lucky to find someone compatible early on and we were able to grow together. Yes undoubtedly you sacrifice some experiences choosing this path but I cherish that bond and antidote to loneliness. There's something comforting about having someone so close for important parts of your life spanning a long time. A real partner to the challenges so you don't have to do everything alone.

I was hesitant to share but I thought it would be better in the end. We often hear of negative experiences (divorce, abuse, etc.) but I regularly hear about couples quietly enjoying their long relationships together. Feels right to offer a counter-point to those stories to say it's not always a mistake.

lazypenguin · 2025-03-06T01:23:23 1741224203

Interesting correlation, if one doesn’t have offspring to “leave a better world” to why should they care? Hadn’t thought of that before.

2025-03-06T01:32:44 1741224764

[dead]

Aeolun · 2025-03-06T04:01:18 1741233678

How about being a dick makes you less inclined to have children? I’m inclined to believe selfish people are less inclined to have children (there will be ‘some’ that see them as a trophy, but fewer than the ones that don’t want them at all).

evrenesat · 2025-03-06T06:19:06 1741241946

I have a kid, and I'm very sad about the state of the world my daughter going to live. I was relatively lucky, she is not. If I didn't have her, I'm sure I would care much much less about stuff I failed to change for the better, it would be much easier to say "f.k it, that's natural order of things, history repeats itself", give up and live my life fast and lose.

lazypenguin · 2025-03-04T22:48:39 1741128519

Surprisingly the Russian army is mostly volunteer since the average pay right now is quite high by Russian standards. There are also standard conscripted soldiers as part of the country’s required service. Allegedly the conscripts are kept away from front lines since the death of conscripted young men in Afghanistan was part of the political death knell that led to the fall of the Soviet Union.

Unfortunately Ukraine is having to rely more on involuntary conscription to fill the ranks as volunteer numbers have dwindled. There are many documented cases of TASS “kidnapping” military age men.

I don’t see the parent poster blaming anybody. Maybe you can say they provided a one-sided view but what they wrote was factual.

lazypenguin · 2025-02-26T01:08:50 1740532130

If they want to have 6-8 hour gym sessions to simulate the real thing then sure.

lazypenguin · 2025-02-22T04:23:07 1740198187

You are right, building a good UI for the desktop has become exceedingly difficult. In my experience new UI toolkits try to mimic the web experience (kivy, QML, Slint, Flutter, etc.) and end up being threadbare with the simplest widgets available. In my opinion, every new UI toolkit since QtWidgets/WinForms/Delphi/WPF/Win32/Gtkmm era has missed the point. The desktop is a power tool and requires powerful widgets. Virtualized lists, data grid with complex interactions, drag and drop, OS integration, modals, background tasks, docking windows, etc. A toolkit that puts a slider, some text and buttons on the screen doesn’t solve the harder problems and the web browser will run circles around that workflow. Easier UX can be built in anything really. Until there’s a desktop app framework that solves the hard problems in desktop UI dev then desktop will languish and hard stuff will continue to be hard.

ogoffart · 2025-02-22T07:46:02 1740210362

I'm developer of Slint and I'm glad to see Slint mentioned. I want to clarify that we're not trying to mimic the web. We have a great vision for the desktop integration, but we unfortunately have limited resources. Our team is fairly small and we are mainly working with paying customers to pay the bills, which are mostly in embedded space at the moment. I agree with your points, and while there’s still a lot to do, I hope Slint can become a strong option for building powerful complex desktop applications.

lazypenguin · 2025-02-22T13:07:44 1740229664

I like Slint, the technical work is extremely impressive. I even contributed to the project! I hope you can realize those goals, you would fill a massive vacuum in the wider ecosystem.

lazypenguin · 2025-02-21T02:40:52 1740105652

Linus response here seems relevant to this context: https://lore.kernel.org/rust-for-linux/CAHk-=wgLbz1Bm8QhmJ4d...

thesuperbigfrog · 2025-02-21T02:54:51 1740106491

Linus's reply is perfect in tone and hopefully will settle this issue.

He is forceful in making his points, but respectful in the way he addressed Christoph's concerns.

This gives me great hope that the Linux maintainer community and contributers using Rust will be able to continue working together, find more common ground, and have more success.

j16sdiz · 2025-02-21T05:25:20 1740115520

The response addressed Christoph's concerns _in word_.

According to the policy, rust folks should fix the rust binding when C changes breaks the binding. The C maintainer don't need to care rust, at all.

In practice, though, I would expect this needs lots of coordination. A PR with C only changes that breaks the whole building (because rust binding is broken) is unlikely to be merged to mainline.

Linus can reiterate his policy, but the issue can't be resolved without some rust developers keep on their persistent work and builds up their reputation.

Tomte · 2025-02-21T05:57:55 1740117475

> rust folks should fix the rust binding when C changes breaks the binding

I have never understood how that could work long-time. How do you release a kernel, where some parts are broken? Either you wait for Rust people to fix their side or you drop the C changes. Or your users suddenly find their driver doesn‘t work anymore after a kernel update.

As a preliminary measure when there isn‘t a substantial amount of Rust code, yet, sure. But the fears of some maintainers that the policy will change to "you either learn Rust and fix things or your code can be held up until someone else helps you out" are well-founded, IMO.

sanxiyn · 2025-02-21T06:05:14 1740117914

Are you familiar with Linux kernel development process? Features can be merged only in two weeks long merge window. After the merge window closes, only fixes are merged for eight weeks. Rust binding can be fixed in that time. I don't see any problems.

tremon · 2025-02-21T19:06:02 1740164762

That's a gross simplificaftion of the development process. Yes, new features are mostly merged in that two-weeks window -- but you're now talking about the Linux release management process more than its development.

Before features are merged to Linus' release branch, pretty much all changes are published and merged to linux-next first. It is exactly here that build issues and conflicts are first detected and worked out, giving maintainers early visibility into changes that are happening outside their subsystem. Problems with the rust bindings will probably show up here, and the Rust developers will have ample time to fix/realign their code before the merge window even starts. And it's not uncommon for larger features (e.g. when they require coordination across subsystems) to remain in linux-next for more than one cycle.

Tomte · 2025-02-21T06:07:57 1740118077

And if no Rust developer has time or interest in those eight weeks? I don‘t claim that it can never work (or it cannot work in the common case), but as a hard rule it seems untenable.

josefx · 2025-02-21T06:34:24 1740119664

> And if no Rust developer has time or interest in those eight weeks?

What if Linus decided to go on a two month long vacation in the middle of the merge window?

> I don‘t claim that it can never work (or it cannot work in the common case), but as a hard rule it seems untenable.

There are quite a few rust developers already involved, if they cannot coordinate that at least some are available during a release critical two month period then none of them should be part of any professional project.

Lutger · 2025-02-21T09:46:05 1740131165

I'm not familiar with kernel development, but what's the difference anyway with C code? If you change the interface of some part, any users of it will be broken Rust or not. It will require coordination anyway.

It is customary for maintainers to fix _all_ usage of their code themselves? That doesn't seem scalable.

cozzyd · 2025-02-21T11:00:29 1740135629

Yes, that is the custom and is a key advantage of getting drivers in tree. I believe often the changes are applied automatically with a tool like coccinelle,

swiftcoder · 2025-02-21T10:22:28 1740133348

Keep in mind that actual breaking changes are by design incredibly rare in a project like the linux kernel. If you have a decade's-worth of device drivers depending on your kernel subsystem's API, you don't get to break them, you have to introduce a new version instead.

rcxdude · 2025-02-21T11:53:24 1740138804

I think it's more a degree of how much effort it is to adjust to the new interface. If it's just 'added a new parameter to a function and there's an obvious default for existing code', then it'll (potentially mechanically) be applied to all the users. If it's 'completely changed around the abstraction and you need to think carefully about how to port your driver to the new interface', then that's something where there needs to be at least some longer-term migration plan, if only because there's not likely one person who can actually understand all the driver code and make the change.

(I do have experience with this causing regressions: someone updates a set of drivers to a new API, and because of the differences and lack of a good way to test, breaks some detail of the driver)

cozzyd · 2025-02-21T10:58:51 1740135531

This isn't true; internal API's change all the time (e.g. adding extra arguments) Try running out of tree drivers on bleeding edge kernels to see for yourself.

tialaramex · 2025-02-21T13:18:40 1740143920

Of course, for trivial mechanical changes like adding an argument the Rust binding changes are also trivial. If you've just spent half an hour walking through driver code for hardware you've never heard of changing stuff like

quaff(something, 5, Q_DOOP) ... into ... quaff(something, 5, 0, Q_DEFAULT | (Q_DOOP << 4))

Then it's not beyond the wits of a C programmer to realise that the Rust binding

quaff(var1, n, maybe_doop) ... can be ... quaff(var1, n, 0, Q_DEFAULT | (maybe_doop << 4))

Probably the Rust maintainer will be horrified and emit a patch to do something more idiomatic for binding your new API but there's an excellent chance that meanwhile your minimal patch builds and works since now it has the right number and type of arguments.

mananaysiempre · 2025-02-21T23:30:47 1740180647

> If you've just spent half an hour walking through driver code for hardware you've never heard of changing stuff [...].

Isn’t the point of Coccinelle that you don’t have to spend time walking through (C) driver code you’ve never heard of?

tialaramex · 2025-02-22T10:43:03 1740220983

I have never used Coccinelle but yes, sort of. However, you're on the hook for the patch you submit, Coccinelle isn't a person so if you blindly send out a patch Coccinelle generated, without even eyeballing it, you should expect some risk of thrown tomatoes if (unknown to you) this utterly broke some clever code using your previous API in a way you hadn't anticipated in a driver you don't run.

sanxiyn · 2025-02-21T06:09:54 1740118194

If so kernel is released with broken Rust. That is the policy, and I am flabbergasted why everyone is going "that policy must not be literal".

Tomte · 2025-02-21T06:14:58 1740118498

Because if in a few years I have a device whose driver is written in Rust, a new kernel version might have simply dropped or broken my device driver, and I cannot use my device anymore. But sure, if R4L wants to stay a second-class citizen forever, it can still be acceptable.

adgjlsfhk1 · 2025-02-21T06:48:40 1740120520

this isn't policy forever. it's policy for now. if r4l succeeds, the policy will change.

mschuster91 · 2025-02-21T08:14:52 1740125692

> Because if in a few years I have a device whose driver is written in Rust, a new kernel version might have simply dropped or broken my device driver, and I cannot use my device anymore.

At least for Debian, all you need to do if you hit such a case is to simply go and choose the old kernel in the Grub screen. You don't even need to deal with installing an older package and dealing with version conflicts or other pains of downgrading.

account42 · 2025-02-21T10:50:53 1740135053

I hope you're not seriously suggesting this as a reasonable workflow.

mschuster91 · 2025-02-21T11:05:34 1740135934

For my server or laptop at home, sure. Why not. For servers in commercial fleets you should have staged rollouts as a policy anyway so if you do it right you shouldn't get hit.

prmoustache · 2025-02-21T11:44:41 1740138281

It is only a problem if you compile the kernel directly from the source tree instead of using the packages provided by your Linux distribution.

darthrupert · 2025-02-21T09:46:31 1740131191

Distros should be your firewall against that sort of thing. Just don't use a distro with a non-existent kernel upgrade process.

kelnos · 2025-02-21T06:10:19 1740118219

I think the way you do this is set things up so that no bits that are written in Rust are built by default, and make sure that the build system is set up such that Rust bindings for C code are only built when there's Rust code that's enabled that requires them.

Then sure, some people who download a kernel release might enable a Rust driver, and that makes the build fail. But until Rust is considered a first-class, fully-supported language in the kernel, that's fine.

In practice, though, I would expect that the Rust maintainers would fix those sorts of things up before an actual release is cut, after the 2-week merge window, during the period when only fixes are accepted. Maybe not every single time, but most of the time; if no one is available to fix a particular bit of breakage, then it's broken for that release. And that's fine too, even if it might be annoying to some users.

renox · 2025-02-21T12:25:10 1740140710

> I think the way you do this is set things up so that no bits that are written in Rust are built by default, and make sure that the build system is set up such that Rust bindings for C code are only built when there's Rust code that's enabled that requires them.

Which is currently the only way possible and it will stay that way for a long time because remember that clang support less targets than gcc and gcc cannot compile Rust.

Once gcc can /reliably/ compile Rust, then and only then Rust could be "upgraded" to a first class citizen in Linux. The "C-maintainers don't want to learn Rust" issue, will still be here of course, but there will already be many years of having a mixed code base..

Tomte · 2025-02-21T06:16:27 1740118587

I agree with all you say, but with longterm I really mean when we've arrived here

> But until Rust is considered a first-class, fully-supported language in the kernel, that's fine

A first-class language whose kernel parts may always break does seem unreasonable. I still think policy will have to change by that point.

tw04 · 2025-02-21T07:30:52 1740123052

Because nothing is forcing a distro to adopt a kernel that has items that are broken. Not a lot of folks out there are manually compiling and deploying standalone kernels to production systems.

C can break rust, and Debian/Ubuntu/Redhat/suse/etc can wait for it to be fixed before pushing a new kernel to end users.

NewJazz · 2025-02-21T07:07:44 1740121664

You can merge it into your branch as e.g. the DMA maintainer, then the rust folk can pull your changes and fix the bindings. Maybe you as maintainer could give them a heads up and a quick consideration of the error.

account42 · 2025-02-21T10:47:20 1740134840

Yes, Rust as something optional doesn't really make sense long term. Either it will continue to only be used in nieche drivers (in which case why bother?) or eventually you need to build Rust code to have a usable kernel for common hardware. Any promises to the contrary need to be backed up with more than "trus me bro".

sanxiyn · 2025-02-21T05:45:16 1740116716

Why wouldn't it be merged? No Rust code is built unless CONFIG_RUST is on, and it is off by default. It won't be on by default for a long time.

robinei · 2025-02-21T07:27:17 1740122837

That's the theory. However, isn't likely that as things like the new Nova Nvidia driver is written in Rust, the things that depend on Rust are suddenly so important, that shipping with it disabled is unrealistic, even without a policy change. (I don't think this is bad)

danieldk · 2025-02-21T08:02:25 1740124945

Rust for Linux is currently experiment. If a larger number of widely-used drivers get written in Rust and developers prefer writing them in Rust over C, then I guess it's time to declare the experiment a success and flip the switch?

cyanydeez · 2025-02-24T11:34:54 1740396894

When rust is important, the problem bootstraps itswlf.

_blk · 2025-02-21T06:03:20 1740117800

How many kb ram is enough for everyone?

nxobject · 2025-02-21T03:33:18 1740108798

Linus is one of the few people who can forcefully argue the case for moderation, and I've recognized some of the lines I've used to shift really contentious meetings back into place. There's the "shot-and-chaser" technique (a) this is what needs to happen now for the conversation...

"I respect you technically, and I like working with you[...] there needs to be people who just stand up to me and tell me I'm full of shit[...] But now I'm calling you out on YOURS."

...and (b) this is me recognizing that me taking charge of a conversation is a different thing than me taking control of your decisions:

"And no, I don't actually think it needs to be all that black-and-white."

(Of course Linus has changed over time for the better, he's recognized that, and I've learned a lot with him and have made amends with old colleagues.)

kelnos · 2025-02-21T06:15:42 1740118542

I really liked this reply from Torvalds. I've seen a lot of his older rants, and while I respect his technical achievements, it really turned me off on the guy himself. I was skeptical of his come-to-jesus moment back in 2018 (or whenever it was), but these days it's great to read his measured responses when there's controversy.

It's really cool to see someone temper their language and tone, but still keep their tell-it-like-it-is attitude. I probably wouldn't feel good if I were Christoph Hellwig reading that reply, but I also wouldn't feel like someone had personally attacked me and screamed at me far out of proportion to what I'd done.

SeanAnderson · 2025-02-21T03:38:38 1740109118

The whole "I respect you technically, and I like working with you." in the middle of being firm and typing in caps is such a vibe shift from the Linus of a decade ago. We love to see it!

atq2119 · 2025-02-21T03:52:25 1740109945

My impression is that this has always been part of the core of his character, but he had to learn to put it into writing.

Contrast this to people who are good at producing the appearance of an upstanding character when it suits them, but being quite vindictive and poisonous behind closed doors when it doesn't.

Ballas · 2025-02-21T05:50:05 1740117005

Yeah, from my limited view point it really looks like Linus is a genuine person. He says what he thinks and there are no hidden agendas. That is very refreshing in current times.

account42 · 2025-02-21T10:55:14 1740135314

[flagged]

dxdm · 2025-02-21T12:41:22 1740141682

In my eyes, this sentence by Linus is as straightforward as it gets. But if you think it's corporate lie-speak, I wonder which words you'd interpret as genuine - because there has to be a way to get something like that across if you really mean it, or we're all doomed. :)

ksec · 2025-02-21T03:16:57 1740107817

I have always thought Linus may not like Rust or at least not Pro-Rust, and the only reason Rust is marching inside the Kernel is most of his close lieutenant are extremely pro rust. So there is this Rust experiment.

But looking at all the recent responses it seems Rusted Linux is inevitable. He is Pro Rust.

dralley · 2025-02-21T05:30:16 1740115816

There was no reason to ever think otherwise. The experiment wouldn't have happened if he didn't want to try it and give it a certain amount of backing. He's been pretty vocal about his motivations.

https://www.youtube.com/watch?v=OvuEYtkOH88&t=6m07s

Symmetry · 2025-02-21T13:42:07 1740145327

He certainly didn't have any trouble keeping C++ out.

__s · 2025-02-21T17:34:51 1740159291

Also in that thread is Greg KH: https://lore.kernel.org/rust-for-linux/2025021954-flaccid-pu...

> C++ isn't going to give us any of that any decade soon, and the C++ language committee issues seem to be pointing out that everyone better be abandoning that language as soon as possible if they wish to have any codebase that can be maintained for any length of time.

miohtama · 2025-02-21T14:03:23 1740146603

Benefits C Vs. Rust are much more impactful than C Vs C++

Symmetry · 2025-02-21T14:56:23 1740149783

Oh certainly. And many fewer potentially dangerous complex corner cases than C++ brings.

the_duke · 2025-02-21T06:16:08 1740118568

I'm pretty sure there is significant pressure from corporate sponsors in the Linux foundation to make Rust happen. That includes Google, Microsoft, AWS, ...

snailmailstare · 2025-02-24T12:13:07 1740399187

I think the pressures are from everywhere all the time. There's always a risk that something picks up enough steam to replace an OS even if it is a derivative or fork of the same OS. For C++ adoption, Linus gauged correctly that it's steam is limited and even directly taunting its supporters had low risk in a project that could challenge Linux because its set of choices over time has had many foot guns, where many developers would intentionally use what any user would consider a foot gun. As long as Linus didn't step in to define strong rules for how C++ would be used there was no likely challenger over any reasonable amount of time (for Linux/C). This is not true for rust.

the_duke · 2025-02-25T10:18:44 1740478724

The difference is that at least Google is actively deploying Rust-based Linux code in Android already.

And the big Linux foundation members have more influence than is publically known.

snailmailstare · 2025-02-26T09:05:22 1740560722

Linus was pretty clear about his views on working with vendors, etc, in his Autobiography so I'm not sure if/why there would be surprise.

Solaris, Java and C/C++ conpilers were all owned by Sun. I feel that is a situation without anyone even trying to maintain whatever separation some might expect from Linus.

ksec · 2025-02-21T12:40:13 1740141613

That may actually make a little more sense.

tonyhart7 · 2025-02-21T06:02:45 1740117765

"Rusted Linux is inevitable" for a good reason because Rust is objectively good language or better compared to C (Rust is designed what to fix flaws many language has)

smidgeon · 2025-02-21T09:24:30 1740129870

The real reason being the longing for the 5 hour kernel compile of yore.

dc443 · 2025-02-21T09:50:43 1740131443

Well he DOES have a threadripper now.

account42 · 2025-02-21T10:56:55 1740135415

And I'm sure the one thread used by the rust build will be blazing fast.

billfruit · 2025-02-21T04:08:02 1740110882

But why though? What about legacy systems, which may not have a rust toolchain? What about new architectures that may come up in the future?

kelnos · 2025-02-21T06:22:37 1740118957

Well there are a few ways to deal with this.

- Systems not supported by Rust can use older kernels. They can also -- at least for a while -- probably still use current kernel versions without enabling any Rust code. (And presumably no one is going to be writing any platform-specific code or drivers for a platform that doesn't have a Rust toolchain.)

- It will be a long time before building the kernel will actually require Rust. In that time, GCC's Rust frontend may become a viable alternative for building Linux+Rust. And any arch supported by GCC should be more-or-less easily targetable by that frontend.

- The final bit is just "tough shit". Keep up, or get left behind. That could be considered a shame, but that's life. Linux has dropped arch support in the past, and I'm sure it will do so in the future. But again, they can still use old kernels.

As for new architectures in the future, if they're popular enough to become a first-class citizen of the Linux kernel, they'll likely be popular enough for someone to write a LLVM backend for it and the glue in rustc to enable it. And if not, well... "tough shit".

cyanydeez · 2025-02-24T11:43:53 1740397433

Linus wouldnt accept rust unless it had technical merit.

If we never planned to evolve hardware and platforms, it of course would be senseless.

barkingcat · 2025-02-21T07:30:53 1740123053

Thinking pragmatically, the legacy systems where there is no current rust toolchain most likely do not need the drivers and components that are being written in rust.

Unless you somehow want to run Apple M1 GPU drivers on a device that has no rust toolchain ... erm...

or you want to run a new experimental filesystem on a device that has no rust toolchain support?

The answer to the "new and emerging platforms" question is pretty much the same as before: sponsor someone to write the toolchain support. We've seen new platforms before and why shouldn't it follow the same pathway? Usually the c compiler is donated by the company or community that is investing into the new platform (for example the risc-v compiler support for gcc and llvm are both getting into maturity status, and the work is sponsored by the developer community, various non-profit[1][2] and for-profit members of the ecosystem as well as from the academic community.)

realistically speaking, it's very hard to come up with examples of the hypothetical.

[1] https://github.com/lowRISC/riscv-llvm

[2] https://lists.llvm.org/pipermail/llvm-dev/2016-August/103748...

remexre · 2025-02-21T04:15:22 1740111322

I suspect gcc-rs will be in good working order for a few years before any kernel subsystems require a Rust compiler to build; if the legacy system can't run a recent GCC, why does it need a much-newer kernel? (e.g., how would it cope with the kernel requiring an additional GCC extension, bumping the minimum standard version of C, etc.)

I honestly suspect new architectures will be supported in LLVM before GCC nowadays; most companies are far more comfortable working with a non-GPL toolchain, and IMHO LLVM's internals are better-documented (though I've never added a new target).

lmm · 2025-02-21T04:14:48 1740111288

> What about legacy systems, which may not have a rust toolchain?

Linux's attitude has always been either you keep up or you get dropped - see the lack of any stable driver API and the ruthless pruning of unmaintained drivers.

> What about new architectures that may come up in the future?

Who's to say they won't have a Rust compiler? Who's to say they will have a C one?

rat87 · 2025-02-21T04:20:55 1740111655

Linux also cant be built by any minimal c compiler for obscure arch, it requires many gcc extensions. Its only because llvm added them that its also can be compiled with llvm

cwillu · 2025-02-21T13:23:28 1740144208

> Linux's attitude has always been either you keep up or you get dropped

Gonna need a citation on that one. Drivers are removed when they don't have users anymore, and a user piping up is enough to keep the driver in the tree:

For example:

   > As suggested by both Greg and Jakub, let's remove the ones that look
   > are most likely to have no users left and also get in the way of the
   > wext cleanup. If anyone is still using any of these, we can revert the
   > driver removal individually.

https://lore.kernel.org/lkml/20231030071922.233080-1-glaubit...

Or the x32 platform removal proposal, which didn't happen against after some users showed up:

   > > > I'm seriously considering sending a patch to remove x32 support from
   > > > upstream Linux.  Here are some problems with it:
   > >
   > > Apparently the main real use case is for extreme benchmarking. It's
   > > the only use-case where the complexity of maintaining a whole
   > > development environment and distro is worth it, it seems. Apparently a
   > > number of Spec submissions have been done with the x32 model.
   > >
   > > I'm not opposed to trying to sunset the support, but let's see who complains..
   >
   > I'm just a single user. I do rely on it though, FWIW.
   > […snipped further discussion]

https://lore.kernel.org/lkml/CAPmeqMrVqJm4sqVgSLqJnmaVC5iakj...

danieldk · 2025-02-21T08:05:15 1740125115

Curious: what widely-used (Linux) legacy systems do not have a Rust toolchain?

In the end the question is whether you want to hold back progress for 99.9% of the users because there are still 200 people running Linux on an Amiga with m68k. I am pretty sure that the number of Linux on Apple Silicon users outnumbers m68k and some other legacy systems by at least an order of magnitude (if not more). (There are currently close to 50000 counted installs. [1])

[1] https://stats.asahilinux.org

openmarkand · 2025-02-21T09:07:14 1740128834

I currently dont.

pipe01 · 2025-02-21T14:16:31 1740147391

snvzz · 2025-02-22T00:59:35 1740185975

There are enough of them that some (e.g. me) actually read this comment.

rcxdude · 2025-02-21T12:05:17 1740139517

I think that'll become a question if/when rust starts to move closer to core parts of the kernel, as opposed to platform-specific driver code. It's already been considered for filesystems which could in theory run on those systems, and the project seems to be OK with the idea that it's just not supported on those platforms. But that's likely a long way off, after there's a significant body of optional rust code in the kernel, and the landscape may already be quite different at that point (both in terms of if those systems are still maintained, and in terms of the kind of targets rust can support, especially if the gcc backend matures)

yxhuvud · 2025-02-21T08:44:22 1740127462

You don't get to run legacy systems with rust based drivers. You were not going to do that anyhow, so what is the issue, really?

evidencetamper · 2025-02-21T04:20:37 1740111637

Those are the tradeoffs, and it seems to me that Linux doesn't have to run in everything under the Sun as Doom ports do, and there might be other kernels that are better suited to such cases.

j-krieger · 2025-02-21T08:44:04 1740127444

You can compile Rust for Win98. They‘ll be fine.

gulbanana · 2025-02-21T04:15:44 1740111344

The legacy systems are not very important. The new ones will be supported.

tonyhart7 · 2025-02-21T05:55:46 1740117346

"But why though? What about legacy systems" their called legacy for a reason right

I'm sorry you cant hinder kernel development just because some random guy/corpo cant use your shit in obscure system, like how can that logic is apply to everything

if your shit is legacy then use legacy kernel

jfbfkdnxbdkdb · 2025-02-21T10:10:27 1740132627

Uhhhhh IIRC rust uses llvm under the hood so ... Change the back end and you are good?

rcxdude · 2025-02-21T12:07:48 1740139668

There are some platforms which linux supports that LLVM does not (and GCC does). There is quite a lot of effort in making a decent LLVM backend, and these older systems tend to have relatively few maintainers, so there may not be the resources to make it happen.

Pet_Ant · 2025-02-21T16:29:47 1740155387

> There is quite a lot of effort in making a decent LLVM backend, and these older systems tend to have relatively few maintainers

Well, it also takes effort to be held back with outdated tools. Also, the LLVM backend doesn't have to be top-notch, just runnable. If they want to run legacy hardware they should be okay with running a legacy or taking the performance hit of a weaker LLVM back-end.

Realistically

At version 16[1], LLVM supports: * IA-32 * x86-64 * ARM * Qualcomm Hexagon * LoongArch * M68K * MIPS * PowerPC * SPARC * z/Architecture * XCore * others

in the past it had support for Cell and Alpha, but I'm sure that the old code could be revived if needed, so how many users are effected here? Lets not forget the Linux dropped Itanium support and I'm sure someone is still running that somewhere.

Looking through this list [2], what I see missing is Elbrus, PA-RISC, OpenRisc, and SuperH. So pretty niche stuff.

[1] https://en.wikipedia.org/wiki/LLVM#Backends

[2] https://en.wikipedia.org/wiki/List_of_Linux-supported_comput...

dannymi · 2025-02-22T15:01:40 1740236500

Then use the GCC backend. https://github.com/rust-lang/rustc_codegen_gcc

ddtaylor · 2025-02-21T12:47:25 1740142045

Aren't those already situations we use cross compiler for?

Filligree · 2025-02-21T13:42:19 1740145339

A cross compiler is just compiler backend for machine X running on machine Y. You still need the backend.

foota · 2025-02-21T08:04:16 1740125056

I don't know why he didn't write this email 3 weeks ago.

KingMob · 2025-02-21T11:00:41 1740135641

He wrote that he was hoping the email thread would improve the situation without his involvement, but that turned out not to be the case.

rcxdude · 2025-02-21T11:58:26 1740139106

It didn't seem super likely that this would be the case, because a lot of the contention was around what Linus specifically thought about it.

darthrupert · 2025-02-21T18:02:37 1740160957

Isn't it obvious? He thought about it.

AdmiralAsshat · 2025-02-21T13:46:26 1740145586

Boy that response would've been helpful like a week ago, before several key Rust maintainers resigned in protest due to Linus's radio silence on the matter.

geodel · 2025-02-21T17:26:23 1740158783

Oh several resigned. I thought all of them.

yolovoe · 2025-02-21T02:48:29 1740106109

Huh, thanks. Really good to know where Linus stands here. Seems to me like Linus is completely okay with introduction of Rust to the kernel and will not allow maintainers blocking its adoption.

Really good sign. Makes me hopeful about the future of this increasingly large kernel

kragen · 2025-02-21T07:56:59 1740124619

This is indeed an excellent response and will hopefully settle the issues. Aside from the ones already settled by Linus's previous email, such as whether social media brigading campaigns are a valid part of the kernel development process.

kennysoona · 2025-02-21T07:01:12 1740121272

Honestly I was waiting for a reply from Linux like this to put Hellwig in his place.

> The fact is, the pull request you objected to DID NOT TOUCH THE DMA LAYER AT ALL.

> It was literally just another user of it, in a completely separate subdirectory, that didn't change the code you maintain in _any_ way, shape, or form.

> I find it distressing that you are complaining about new users of your code, and then you keep bringing up these kinds of complete garbage arguments.

Finally. If he had been sooner maybe we wouldn't have lost talented contributors to the kernel.

kennysoona · 2025-02-21T10:06:13 1740132373

Ah I can't believe I misspelled Linus as Linux, seems like it should happen often enough but honestly I think I rarely make that typo.

sophacles · 2025-02-21T17:03:43 1740157423

I've made that mistake, and the inverse, often enough that I try to make sure to check I've written the correct word... and I still mess it up. Between the words being similar and the 'x' being right next to the 's' on US keyboard, it's bound to happen.

ON the flip side - when I (and I suspect many others) read Linux where Linus should be written, I rarely even notice and never really care because I've been there.

All this is a long winded way of saying: don't sweat it :) .

kombine · 2025-02-21T08:44:47 1740127487

> Finally. If he had been sooner maybe we wouldn't have lost talented contributors to the kernel.

I feel that departure of the lead R4L developer was a compromise deliberately made to not make Hellwig feel like a complete loser. This sounds bad of course.

fdrs · 2025-02-21T10:44:38 1740134678

no lead R4L left because of the current situation. Marcan was the lead of Asahi Linux, not R4L. Wedson (which was one of the leads of R4L) left some time ago, before all of this, and his problem was not with Hellwig (or, at least it was not the one that brought the last drop).

edit: whitespace

iknowstuff · 2025-02-22T16:15:45 1740240945

Hellwig had a spat with asahi lina back then as well.

dralley · 2025-02-21T15:57:33 1740153453

Marcan quitting wasn't a compromise, the resignation of a maintainer would never be used that way. Dude was just burnt out. I don't blame him at all, hopefully some time away from the situation does him some good.

josefx · 2025-02-21T09:21:35 1740129695

Who quit?

__s · 2025-02-21T17:42:14 1740159734

https://asahilinux.org/2025/02/passing-the-torch

josefx · 2025-02-21T17:55:01 1740160501

Aren't R4L and Asahi Linux separate projects?

__s · 2025-02-22T13:27:32 1740230852

Yes, but that's the most recent one I assume people are talking about

But maybe they mean https://lore.kernel.org/lkml/20240828211117.9422-1-wedsonaf@...

chris_wot · 2025-02-21T03:03:42 1740107022

Thank god for common sense.

mschuster91 · 2025-02-21T08:09:33 1740125373

Finally, took him long enough.

lazypenguin · 2025-02-05T19:19:29 1738783169

I work in fintech and we replaced an OCR vendor with Gemini at work for ingesting some PDFs. After trial and error with different models Gemini won because it was so darn easy to use and it worked with minimal effort. I think one shouldn't underestimate that multi-modal, large context window model in terms of ease-of-use. Ironically this vendor is the best known and most successful vendor for OCR'ing this specific type of PDF but many of our requests failed over to their human-in-the-loop process. Despite it not being their specialization switching to Gemini was a no-brainer after our testing. Processing time went from something like 12 minutes on average to 6s on average, accuracy was like 96% of that of the vendor and price was significantly cheaper. For the 4% inaccuracies a lot of them are things like the text "LLC" handwritten would get OCR'd as "IIC" which I would say is somewhat "fair". We probably could improve our prompt to clean up this data even further. Our prompt is currently very simple: "OCR this PDF into this format as specified by this json schema" and didn't require some fancy "prompt engineering" to contort out a result.

Gemini developer experience was stupidly easy. Easy to add a file "part" to a prompt. Easy to focus on the main problem with weirdly high context window. Multi-modal so it handles a lot of issues for you (PDF image vs. PDF with data), etc. I can recommend it for the use case presented in this blog (ignoring the bounding boxes part)!

kbyatnal · 2025-02-06T00:46:48 1738802808

This is spot on, any legacy vendor focusing on a specific type of PDF is going to get obliterated by LLMs. The problem with using an off-the-shelf provider like this is, you get stuck with their data schema. With an LLM, you have full control over the schema meaning you can parse and extract much more unique data.

The problem then shifts from "can we extract this data from the PDF" to "how do we teach an LLM to extract the data we need, validate its performance, and deploy it with confidence into prod?"

You could improve your accuracy further by adding some chain-of-thought to your prompt btw. e.g. Make each field in your json schema have a `reasoning` field beforehand so the model can CoT how it got to its answer. If you want to take it to the next level, `citations` in our experience also improves performance (and when combined with bounding boxes, is powerful for human-in-the-loop tooling).

Disclaimer: I started an LLM doc processing infra company (https://extend.app/)

TeMPOraL · 2025-02-06T09:40:17 1738834817

> The problem then shifts from "can we extract this data from the PDF" to "how do we teach an LLM to extract the data we need, validate its performance, and deploy it with confidence into prod?"

A smart vendor will shift into that space - they'll use that LLM themselves, and figure out some combination of finetunes, multiple LLMs, classical methods and human verification of random samples, that lets them not only "validate its performance, and deploy it with confidence into prod", but also sell that confidence with an SLA on top of it.

wraptile · 2025-02-06T18:59:32 1738868372

That's what we did with our web scraping saas - with Extraction API¹ we shifted web scraped data parsing to support both predefined models for common objects like products, reviews etc. and direct LLM prompts that we further optimize for flexible extraction.

There's definitely space here to help the customer realize their extraction vision because it's still hard to scale this effectively on your own!

1 - https://scrapfly.io/extraction-api

quantumPilot · 2025-02-16T21:58:52 1739743132

What's the value for a customer to pay a vendor that is only a wrapper around an LLM when they can leverage LLMs directly? I imagine tools being accessible for certain types of users, but for customers like those described here, you're better off replacing any OCR vendor with your own LLM integration

sitkack · 2025-02-06T13:38:23 1738849103

Software is dead, if it isn't a prompt now, it will be a prompt in 6 months.

Most of what we think software is today, will just be a UI. But UIs are also dead.

SketchySeaBeast · 2025-02-06T15:16:39 1738854999

I wonder about these takes. Have you never worked in a complex system in a large org before?

OK, sure, we can parse a PDF reliably now, but now we need to act on that data. We need to store it, make sure it ends up with the right people who need to be notified that the data is available for their review. They then need to make decisions upon that data, possible requiring input from multiple stakeholders.

All that back and forth needs to be recorded and stored, along with the eventual decision and the all supporting documents and that whole bundle needs to be made available across multiple systems, which requires a bunch of ETLs and governance.

An LLM with a prompt doesn't replace all that.

sitkack · 2025-02-06T20:38:16 1738874296

We need to think terms of light cones, not dog and pony take downs of whatever system you are currently running. See where thigns are going.

I have worked in large systems, both in code and people, compilers, massive data processing systems, 10k business units.

collingreen · 2025-02-06T20:49:26 1738874966

I don't know what light cones or dog and pony mean here but I'm interested in your take - would you care to expand a bit on how the future can reshape that very complicated set of steps and humans described in the parent?

SketchySeaBeast · 2025-02-07T00:34:15 1738888455

I think collingreen followed-up better than I ever could, so I'm hoping you can respond to them with more details.

victorbjorklund · 2025-02-06T16:46:51 1738860411

Can you prompt a salesforce replacement for an org with 100 000 employees?

mrbungie · 2025-02-06T20:33:42 1738874022

Yesterday I read an /r/singularity post in awe cus of a screenshot of a lead management platform from OAI in a japan convention supposedly meant a direct threat to SalesForce. Like, yeah sure buddy.

I would say most acceleracionist/AI bulls/etc don't really understand the true essential complexity in software development. LLMs are being seen as a software development silver bullets, and we know what happens with silver bullets.

sitkack · 2025-02-06T20:35:55 1738874155

Come back your comment in 18 months.

collingreen · 2025-02-06T20:51:55 1738875115

I assume this is a slap intended to imply that ai actually IS a silver bullet answer to the parent's described problem and in just 18 months they will look back and realize how wrong they are.

Is that what you mean and, if so, is there anything in particular you've seen that leads you to see these problems being solved well or on the 18 month timeline? That sounds interesting to look at to me and I'd love to know more.

sitkack · 2025-02-06T23:41:05 1738885265

It isn't a silver bullet in that it can just "make software" but it is changing the entire dynamic.

You can't do point sampling to figure out where things are going. We have to look at the slope. People see a paper come out, look at the results and say, "this fails for x, y and z. doesn't work", that is now how scientific research works. This is why two minute papers has the tag line, "hold on to your papers ... two papers down the line ..."

Copy and paste the whole thread into a SOTA model and have meta me explain it.

ethbr1 · 2025-02-08T04:23:25 1738988605

That's not why more experienced people are doubting you.

They're doubting you because the non-digital portions of processes change at people/org speed.

Which is to say that changing a core business process is a year political consensus, rearchitecture, and change management effort, because you also have to coordinate all the cascading and interfacing changes.

sitkack · 2025-02-08T19:38:00 1739043480

> changing a core business process is a year political consensus, rearchitecture, and change management effort

You are thinking within the existing structures, those structures will evaporate. All along the software supply chain, processes will get upended, not just because of how technical assets will be created, but also how organizations themselves are structured and react and in turn how software is created and consumed.

This is as big as the invention of the corporation, the printing press and the industrial revolution.

I am not here to tutor people on this viewpoint or defend it, I offer it and everyone can do with it what they will.

ethbr1 · 2025-02-08T20:51:59 1739047919

Ha. Look back on this comment in a few years.

cpursley · 2025-02-06T14:02:05 1738850525

Software without data moats, vender lock-in, etc sure will. All the low handing fruit saas is going to get totally obliterated by LLM built-software.

fragmede · 2025-02-06T18:42:16 1738867336

If I'm an autobody shop or some other well-served niche, how unhappy with them do I have to be to decide to find a replacement, either a competitor of theirs that used an LLM, or bring it in house and go off and find a developer to LLM-acceleratedly make me a better shopmonkey? And there are the integrations. I don't own a low hanging fruit SaaS company, but it seems very sticky, and since the established company already exists, they can just lower prices to meet their competitors.

B2B is different from B2C, so if one vendor has a handful of clients and they won't switch away, there's no obliterating happening.

What's opened up is even lower hanging fruit, on more trees. A SaaS company charging $3/month for the left-handed underwater basket weaver niche now becomes viable as a lifestyle business. The shovels in this could be supabase/similar, since clients can keep access to their data there even if they change frontends.

sitkack · 2025-02-06T20:35:08 1738874108

Which means that the current vc-software-ecosystem is the walking dead. The front end webdev is now going to do things that previously took a 10 person startup.

cpursley · 2025-02-07T12:22:54 1738930974

Integrations is part of the data moat I mentioned.

Vrondi · 2025-02-08T15:19:04 1739027944

The only thing that will be different for most is vendor lock-in will be to LLM vendors.

sitkack · 2025-02-06T18:14:48 1738865688

Totally agree.

Cumpiler69 · 2025-02-06T10:37:29 1738838249

>A smart vendor will shift into that space - they'll use that LLM themselves

It's a bit late to start shifting now since it takes time. Ideally they should already have a product on the market.

TeMPOraL · 2025-02-06T11:01:25 1738839685

There's still time. The situation in which you can effectively replace your OCR vendor with hitting LLM APIs via a half-assed Python script ChatGPT wrote for you, has existed for maybe few months. People are only beginning to realize LLMs got good enough that this is an option. An OCR vendor that starts working on the shift today, should easily be able to develop, tune, test and productize an LLM-based OCR pipeline way before most of their customers realize what's been happening.

But it is a good opportunity for a fast-moving OCR service to steal some customers from their competition. If I were working in this space, I'd be worried about that, and also about the possibility some of the LLM companies realize they could actually break into this market themselves right now, and secure some additional income.

EDIT:

I get the feeling that the main LLM suppliers are purposefully sticking to general-purpose APIs and refraining from competing with anyone on specific services, and that this goes beyond just staying focused. Some of potential applications, like OCR, could turn into money printers if they moved on them now, and they all could use some more cash to offset what they burn on compute. Is it because they're trying to avoid starting an "us vs. them" war until after they made everyone else dependent on them?

anon84873628 · 2025-02-06T19:11:40 1738869100

To the point after your edit, I view it like the cloud shift from IaaS to PaaS / SaaS. Start with a neutral infrastructure platform that attracts lots of service providers. Then take your pick of which ones to replicate with a vertically integrated competitor or manager offering once you are too big for anyone to really complain.

bayindirh · 2025-02-06T12:47:50 1738846070

Never underestimate the power of the second mover. Since the development is happening in the open, someone can quickly cobble up the information and cut directly to the 90% of the work.

Then your secret sauce will be your fine tunes, etc.

Like it or not AI/LLM will be a commodity, and this bubble will burst. Moats are hard to build when you have at least one open source copy of what you just did.

SoftTalker · 2025-02-06T17:26:08 1738862768

And next year your secret sauce will be worthless because the LLMs are that much better again.

Businesses that are just "today's LLM + our bespoke improvements" won't have legs.

pmarreck · 2025-02-17T17:05:50 1739811950

I have some out-of-print books that I want to convert into nice pdf's/epubs (like, reference-quality)

1) I don't mind destroying the binding to get the best quality. Any idea how I do so?

2) I have a multipage double-sided scanner (fujitsu scansnap). would this be sufficient to do the scan portion?

3) Is there anything that determines the font of the book text and reproduces that somehow? and that deals with things like bold and italic and applies that either as markdown output or what have you?

4) how do you de-paginate the raw text to reflow into (say) an epub or pdf format that will paginate based on the output device (page size/layout) specification?

raghavsb · 2025-02-10T12:33:23 1739190803

Great, I landed on the reasoning and citations bit through trial and error and the outputs improved for sure.

MajorData · 2025-02-09T20:50:29 1739134229

`How did you add bounding boxes, especially if it is variety of files?

bitdribble · 2025-02-16T16:23:51 1739723031

In my open source tool http://docrouter.ai I run both OCR and LLM/Gemini, using litellm to support multiple LLMs. The user can configure extraction schema & prompts, and use tags to select which prompt/llm combination runs on which uploaded PDF.

LLM extractions are searched in OCR output, and if matched, the bounding box is displayed based on OCR output.

Demo: app.github.ai (just register an account and try) Github: https://github.com/analytiq-hub/doc-router

Reach out to me at andrei@analytiqhub.com for questions. Am looking for feedback and collaborators.

montecruiseto · 2025-02-07T11:44:33 1738928673

So why should I still use Extend instead of Gemini?

panta · 2025-02-06T17:35:55 1738863355

How do you handle the privacy of the scanned documents?

bitdribble · 2025-02-16T16:29:12 1739723352

With the docrouter.ai, it can be installed on prem. If using the SAAS version, users can collaborate in separate workspaces, modeled on how Databricks supports workspaces. Back end DB is Mongo, which keeps things simple.

One level of privacy is the workspace level separation in Mongo. But, if there is customer interest, other setups are possible. E.g. the way Databricks handles privacy is by actually giving each account its own back end services - and scoping workspaces within an account.

That is a good possible model.

kbyatnal · 2025-02-06T17:48:00 1738864080

We work with fortune 500s in sensitive industries (healthcare, fintech, etc). Our policies are:

- data is never shared between customers

- data never gets used for training

- we also configure data retention policies to auto-purge after a time period

panta · 2025-02-06T18:41:52 1738867312

But how to get these guarantees from the upstream vendors? Or do you run the LLMs on premises?

Karrot_Kream · 2025-02-06T19:20:11 1738869611

If you're using LLM APIs there are SLAs from the vendors to make sure your inputs are not used as training data and other guarantees. Generally these endpoints cost more to use (the compliance fee essentially) but they solve the problem.

makeitdouble · 2025-02-05T23:35:29 1738798529

> After trial and error with different models

As a mere occasional customer I've been scanning 4 to 5 pages of the same document layout every week in gemini for half a year, and every single week the results were slightly different.

To note the docs are bilingual so it could affect the results, but what stroke me is the lack of consistency, and even with the same model, running it two or three times in a row gives different results.

That's fine for my usage, but that sounds like a nightmare if everytime Google tweaks their model, companies have to reajust their whole process to deal with the discrepancies.

And sticking with the same model for multiple years also sound like a captive situation where you'd have to pay premium for Google to keep it available for your use.

tomrod · 2025-02-06T00:25:03 1738801503

Consider turning down the temperature in the configuration? LLMs have a bit of randomness in them.

Gemini 2.0 Flash seems better than 1.5 - https://deepmind.google/technologies/gemini/flash/

mejutoco · 2025-02-06T08:51:06 1738831866

> and every single week the results were slightly different.

This is one of the reasons why open source offline models will always be part of the solution, if not the whole solution.

rafaelmn · 2025-02-06T09:27:04 1738834024

Inconsistency comes from scaling - if you are optimizing your infra to be cos effective you will arrive at same tradeoffs. Not saying it's not nice to be able to make some of those decisions on your own - but if you're picking LLMs for simplicity - we are years away from running your own being in the same league for most people.

mejutoco · 2025-02-06T12:53:25 1738846405

And if you are not you wont.

You can decide if you change your local setup or not. You cannot decide the same of a service.

There is nothing inevitable about inconsistency in a local setup.

iandanforth · 2025-02-06T00:32:01 1738801921

At temperature zero, if you're using the same API/model, this really should not be the case. None of the big players update their APIs without some name / version change.

pigscantfly · 2025-02-06T02:58:17 1738810697

This isn't really true unfortunately -- mixture of experts routing seems to suffer from batch non-determinism. No one has stated publicly exactly why this is, but you can easily replicate the behavior yourself or find bug reports / discussion with a bit of searching. The outcome and observed behavior of the major closed-weight LLM APIs is that a temperature of zero no longer corresponds to deterministic greedy sampling.

brookst · 2025-02-06T04:51:06 1738817466

If temperature is zero, and weights are weights, where is the non-deterministic behavior coming from?

wodenokoto · 2025-02-06T08:48:48 1738831728

Temperature changes the distribution that is sampled, not if a distribution is sampled.

Temperature changes the softmax equation[1], not weather or not you are sampling from the softmax result or choosing the highest probability. IBM's documentation corroborates this, saying you need to set do_sample to True in order for the temperature to have any effect, e.g., T changes how we sample, not if we sample [2].

A similar discussion on openai forum also claim that the RNG might be in a different state from run to run, although I am less sure about that [3]

[1] https://pelinbalci.com/2023/10/16/Temperature_parameter.html

[2] https://www.ibm.com/think/topics/llm-temperature#:~:text=The...

[3] https://community.openai.com/t/clarifications-on-setting-tem...

zelphirkalt · 2025-02-06T09:19:35 1738833575

I have dealt with traditional ML models in the past and things like tensorflow non-reproducibility. Managed to make them behave reproducibly. This is a very basic requirement. If we cannot even have that or people who deal with Gemini or similar models do not even know why they don't deliver reproducible results ... This seems very bad. It becomes outright unusable for anyone wanting to do research with reliable result. We already have a reproducibility crisis, because researchers often do not have the required knowledge to properly handle their tooling and would need a knowledgeable engineer to set it up. Only that most engineers don't know either and don't show enough attention to the detail to make reproducible software.

sidkshatriya · 2025-02-06T14:17:37 1738851457

Your response is correct. However, you can choose to not sample from the distribution. You can have a rule to always choose the token with the highest probability generated by the softmax layer.

This approach should make the LLM deterministic regardless of the temperature chosen.

P.S. Choosing lower and lower temperatures will make the LLM more deterministic but it will never be totally deterministic because there will always be some probability in other tokens. Also it is not possible to use temperature as exactly 0 due to exp(1/T) blowup. Like I mentioned above, you could avoid fiddling with temperature and just decide to always choose token with highest probability for full determinism.

There are probably other more subtle things that might make the LLM non-deterministic from run to run though. It could be due to some non-deterministism in the GPU/CPU hardware. Floating point is very sensitive to ordering.

TL;DR for as much determinism as possible just choose token with highest probability (i.e. dont sample the distribution).

TeMPOraL · 2025-02-06T07:26:09 1738826769

Here probably routing would be dominating, but in general, unless I missed all the vendors ditching GPUs and switching to ASICs optimized for fixed precision math, floating points are still non-commutative therefore results are non-deterministic wrt. randomness introduced by parallelising the calculations.

zelphirkalt · 2025-02-06T09:24:01 1738833841

Of course which part of the calculations happens where should also be specifiable and be able to be made deterministicor should not have an effect on the result. A map reduce process' reduce step, merging results from various places also should be able to be made to give reproducible results, regardless of which results arrive first or from where.

Is our tooling too bad for this?

TeMPOraL · 2025-02-06T10:45:37 1738838737

> Is our tooling too bad for this?

Floating points are fundamentally too bad for this. We use them because they're fast, which usually more than compensates for inaccuracies FP math introduces.

(One, dealing with FP errors is mostly a fixed cost - there's a branch of CS/mathematics specializing in it, producing formally proven recipes for computing specific things in way that minimize or at least give specific bounds on errors. That's work that can be done once, and reused forever. Two, most programmers are oblivious to those issues anyway, and we've learned to live with the bugs :).)

When your parallel map-reduce is just doing matrix additions and multiplications, guaranteeing order of execution comes with serious overhead. For one, you need to have all partial results available together before reducing, so either the reduction step needs to have enough memory to store a copy of all the inputs, or it needs to block the units computing those inputs until all of them finish. Meanwhile, if you drop the order guarantee, then the reduction step just needs one fixed-size accumulator, and every parallel unit computing the inputs is free to go and do something else as soon as it's done.

So the price you pay for deterministic order is either a reduction of throughput or increase in on-chip memory, both of which end up translating to slower and more expensive hardware. The incentives strongly point towards not giving such guarantees if it can be avoided - keep in mind that GPUs have been designed for videogames (and graphics in general), and for this, floating point inaccuracies only matter when they become noticeable to the user.

Dylan16807 · 2025-02-06T09:22:49 1738833769

Why would the same software on the same GPU architecture use different commutations from run to run?

Also if you're even considering fixed point math, you can use integer accumulators to add up your parallel chunks.

TeMPOraL · 2025-02-06T10:08:04 1738836484

Why would the same multithreaded software run on the same CPU (not just architecture - the same physical chip) have its instructions execute in different order from run to run? Performance. Want things deterministic? You have to explicitly keep them in sync yourself. GPUs sport tens of thousands of parallel processors these days, which are themselves complex, and are linked together with more complexity, both hardware and software. They're designed to calculate fast, not to ensure every subprocessor is always in lock step with every other one.

Model inference on GPU is mostly doing a lot of GPU equivalent of parallelized product on (X1, X2, X3, ... Xn), where each X is itself some matrix computed by a parallelized product of other matrices. Unless there's some explicit guarantee somewhere that the reduction step will pause until it gets all results so it can guarantee order, instead of reducing eagerly, each such step is a non-determinism transducer, turning undetermined execution order into floating point errors via commutation.

I'm not a GPU engineer so I don't know for sure, especially about the new cards designed for AI, but since reducing eagerly allows more memory-efficient design and improves throughput, and GPUs until recently were optimized for games (where FP accuracy doesn't matter that much), and I don't recall any vendor making determinism a marketing point recently, I don't believe GPUs suddenly started to guarantee determinism at expense of performance.

Dylan16807 · 2025-02-06T16:32:39 1738859559

Each thread on a CPU will go in the same order.

Why would the reduction step of a single neuron be split across multiple threads? That sounds slower and more complex than the naive method. And if you do decide to write code doing that, then just the code that reduces across multiple blocks needs to use integers, so pretty much no extra effort is needed.

Like, is there a nondeterministic-dot-product instruction baked into the GPU at a low level?

TeMPOraL · 2025-02-07T09:52:33 1738921953

> Each thread on a CPU will go in the same order.

Not unless you control the underlying scheduler and force deterministic order; knowledge of all the code running isn't sufficient, as some factors affecting threading order are correlated with physical environment. For example, minute temperature gradient differences on the chip between two runs could affect how threads are allocated to CPU cores and order in which they finish.

> Why would the reduction step of a single neuron be split across multiple threads?

Doesn't have to, but can, depending on how many inputs it has. Being able to assume commutativity gives you a lot of flexibility in how you parallelize it, and allows you to minimize overhead (both in throughput and memory requirements).

> Like, is there a nondeterministic-dot-product instruction baked into the GPU at a low level?

No. There's just no dot-product instruction baked into GPU at low level that could handle vectors of arbitrary length. You need to write a loop, and that usually becomes some kind of parallel reduce.

Dylan16807 · 2025-02-07T10:31:25 1738924285

> could affect how threads are allocated to CPU cores and order in which they finish

I'm very confused by how you're interpreting the word "each" here.

> Being able to assume commutativity gives you a lot of flexibility in how you parallelize it, and allows you to minimize overhead (both in throughput and memory requirements).

Splitting up a single neuron seems like something that would only increase overhead. Can you please explain how you get "a lot" of flexibility?

> You need to write a loop, and that usually becomes some kind of parallel reduce.

Processing a layer is a loop within a loop.

The outer loop is across neurons and needs to be parallel.

The inner loop processes every weight for a single neuron and making it parallel sounds like extra effort just to increase instruction count and mess up memory locality and make your numbers less consistent.

TeMPOraL · 2025-02-07T11:35:59 1738928159

I feel like you're imagining a toy network with couple dozen neurons in few layers, done on a CPU. But consider a more typical case of dozens of layers with hundreds (or thousands) of neurons each. That's some thousand numbers to reduce per each neuron.

Then, remember that GPUs are built around thousands of tiny parallel processors, each able to process a bunch (e.g. 16) parallel threads, but then the threads have to run in larger batches (SIMD-like), and there's a complex memory management architecture built-in, over which you only have so much control. Specific numbers of cores, threads, buffer sizes, as well as access patterns, differ between GPU models, and for optimal performance, you have to break down your computation to maximize utilization. Or rather, have the runtime do it for you.

This ain't an an FPGA, you don't get to organize hardware to match your network. If you have a 1000 neurons per hidden layer, then individual neurons likely won't fit on a single CUDA core, so you will have to split them down the middle, at least if you're using full-float math. Speaking of, the precision of the numbers you use is another parameter that adds to the complexity.

On the one hand, you have a bunch of mostly-linear matrix algebra, where you can tune precision. On the other hand, you have a GPU-model-specific number of parallel processors (~thousands), that can fit only so much memory, can run some specific number of SIMD-like threads in parallel, and most of those numbers are powers of two (or a multiple of), so you have also alignment to take into account, on top of memory access patterns.

By default, your network will in no way align to any of that.

It shouldn't be hard to see that assuming commutativity gives you (or rather the CUDA compiler) much more flexibility to parallelize your calculations by splitting it whichever way it likes to maximize utilization.

Dylan16807 · 2025-02-07T18:55:59 1738954559

I'm not imagining toy sizes. Quite the opposite. I'm saying that layers are so big that splitting per neuron already gives you a ton of individual calculations to schedule and that's plenty to get full usage out of the hardware.

You can do very wide calculations on a single neuron if you want; throwing an entire SM (64 or 128 CUDA cores) at a single neuron is trivial to do in a deterministic way. And if you have a calculation so big you benefit from splitting it across SMs, doing a deterministic sum at the end will use an unmeasurably small fraction of your runtime.

And I'll remind you that I wasn't even talking about determinism across architectures, just within an architecture, so go ahead and optimize your memory layouts and block sizes to your exact card.

michalsustr · 2025-02-06T07:48:21 1738828101

I recently attended a STAC conference where they claimed the GPUs themselves are not deterministic. The hand-wavy speculation is they need to temperature control the cores and the flop ops may be reordered during that process. (By temperature I mean physical temperature, not some nn sampling parameter). On such large scale of computation these small differences can show up in the actually different tokens.

fancyfredbot · 2025-02-06T08:09:21 1738829361

I can assure you this isn't true. Having worked with GPUs for many years in an application where consist results are important it's not only possible but actually quite easy to ensure consistent inputs produce consistent results. The temperature and clock speed do not affect the order of operations, only the speed, and this doesn't affect the results. This is the same as with any modern CPU which will also adjust clock for temperature.

petesergeant · 2025-02-06T04:55:06 1738817706

The parent is suggesting that temperature only applies at the generation step, but the choice of backend “expert model” that a request is given to (and then performs the generation) is non-deterministic. Rather than being a single set of weights, there are a few different sets of weights that constitute the “expert” in MoE. I have no idea if that’s true, but that’s the assertion

brookst · 2025-02-06T05:27:05 1738819625

I don't think it makes sense? Somewhere there has to be a RNG for that to be true. MOE itself doesn't introduce randomness, and the routing to experts is part of the model weights, not (I think) a separate model.

pigscantfly · 2025-02-06T06:01:20 1738821680

The samples your input is batched with on the provider's backend vary between calls and sparse mixture of experts routing when implemented for efficient utilization induces competition among tokens with either encouraged or enforced balance of expert usage among tokens in the same fixed-size group. I think it's unknown or at least undisclosed exactly why sequence non-determinism at zero temperature occurs in these proprietary implementations, but I think this is a good theory.

[1] https://arxiv.org/abs/2308.00951 pg. 4 [2] https://152334h.github.io/blog/non-determinism-in-gpt-4/

kettleballroll · 2025-02-06T06:25:09 1738823109

I thought the temperature only affects randomness at the end of the network (when turning embeddings back I to words using the softmax). It cannot influence routing, which is inherently influenced by which examples get batched together (ie, it might depend on other users of the system)

menaerus · 2025-02-06T07:40:10 1738827610

You don't need RNG since the whole transformer is an extremely large floating-point arithmetic unit. A wild guess - how about the source of non-determinism is coming from the fact that, on the HW level, tensor execution order is not guaranteed and therefore (T0 * T1) * T2 can produce slightly different results than T0 * (T1 * T2) due to rounding errors?

daralthus · 2025-02-06T12:28:44 1738844924

I have seen numbers come differently in JAX just depending on the batch size, simply because the compiler optimizes to a different sequence of operations on the hardware.

kiratp · 2025-02-06T09:50:17 1738835417

Quantized floating point math can, under certain scenarios, be non-associative.

When you combine that fact with being part of a diverse batch of requests over an MoE model, outputs are non-deterministic.

bushbaba · 2025-02-06T14:33:28 1738852408

That’s why you have azure openAI APIs which give a lot more consistency

itissid · 2025-02-05T21:56:35 1738792595

Wait isn't there atleast a two step process here one is semantic segmentation followed by a method like texttract for text - to avoid hallucinations?

One cannot possibly say that "Text extracted by a multimodal model cannot hallucinate"?

> accuracy was like 96% of that of the vendor and price was significantly cheaper.

I would like to know how this 96% was tested. If you use a human to do random sample based testing, well how do you adjust the random sample for variations in distribution of errors that vary like a small set of documents could have 90% of the errors and yet they are only 1% of the docs?

themanmaran · 2025-02-05T22:04:06 1738793046

One thing people always forget about traditional OCR providers (azure, tesseract, aws textract, etc.) is that they're ~85% accurate.

They are all probabilistic. You literally get back characters + confidence intervals. So when textract gives you back incorrect characters, is that a hallucination?

kapitalx · 2025-02-05T22:22:30 1738794150

I'm the founder of https://doctly.ai, also pdf extraction.

The hallucination in LLM extraction is much more subtle as it will rewrite full sentences sometimes. It is much harder to spot when reading the document and sounds very plausible.

We're currently working on a version where we send the document to two different LLMs, and use a 3rd if they don't match to increase confidence. That way you have the option of trading compute and cost for accuracy.

LeafItAlone · 2025-02-06T04:38:53 1738816733

>We're currently working on a version where we send the document to two different LLMs, and use a 3rd if they don't match to increase confidence.

I’m interested to hear more about the validation process here. In my limited experience, I’ve sent the same “document” to multiple LLMs and gotten differing results. But sometimes the “right” answer was in the minority of responses. But over a large sample (same general intent of document, but very different possible formats of the information within), there was no definitive winner. We’re still working on this.

nnurmanov · 2025-02-06T04:06:34 1738814794

What if you use a different prompt to check the result, did this work? I am thinking to use this approach, but now I think maybe it is better to use two different LLM like you do.

anon373839 · 2025-02-05T22:17:07 1738793827

It’s a question of scale. When a traditional OCR system makes an error, it’s confined to a relatively small part of the overall text. (Think of “Plastics” becoming “PIastics”.) When a LLM hallucinates, there is no limit to how much text can be made up. Entire sentences can be rewritten because the model thinks they’re more plausible than the sentences that were actually printed. And because the bias is always toward plausibility, it’s an especially insidious problem.

themanmaran · 2025-02-05T23:17:45 1738797465

It's a bit of a pick your poison situation. You're right that traditional OCR mistakes are usually easy to catch (except when you get $30.28 vs $80.23). Compared to LLM hallucinations that are always plausibly correct.

But on the flip side, layout is often times the biggest determinant of accuracy, and that's something LLMs do a way better job on. It doesn't matter if you have 100% accurate text from a table, but all that text is balled into one big paragraph.

Also the "pick the most plausible" approach is a blessing and a curse. A good example is the handwritten form here [1]. GPT 4o gets the all the email addresses correct because it can reasonably guess these people are all from the same company. Whereas AWS treats them all independently and returns three different emails.

[1] https://getomni.ai/ocr-demo

miki123211 · 2025-02-06T07:14:18 1738826058

The difference is the kind of hallucinations you get.

Traditional OCR is more likely to skip characters, or replace them with similar -looking ones, so you often get AL or A1 instead of AI for example. In other words, traditional spelling mistakes. LLMs can do anything from hallucinating new paragraphs to slightly changing the meaning of a sentence. The text is still grammatically correct, it makes sense in the context, except that it's not what the document actually said.

I once gave it a hand-written list of words and their definitions and asked it to turn that into flashcards (a json array with "word" and "definition"). Traditional OCR struggled with this text, the results were extremely low-quality, badly formatted but still somewhat understandable. The few LLMs I've tried either straight up refused to do it, or gave me the correct list of words, but entirely hallucinated the definitions.

Scoundreller · 2025-02-05T22:27:52 1738794472

> You literally get back characters + confidence intervals.

Oh god, I wish speech to text engines would colour code the whole thing like a heat map to focus your attention to review where it may have over-enthusiastically guessed at what was said.

You no knot.

gioazzi · 2025-02-05T23:09:30 1738796970

We did this for a speech to text solution in healthcare. Doctors would always review everything that was transcribed manually (you don’t want hallucinations in your prescription), and using a heatmap it was trivial to identify e.g. drugs that were pretty much always misunderstood by STT