Why Async Rust? (without.boats)
212 points by withoutboats3 on Oct 15, 2023 | 133 comments



A lot of discussion about async Rust assumes that the reason one would want to use async/Futures is for performance and scalability reasons.

Personally, though, I would strongly prefer to use async rather than explicit threading even for cases where performance wasn’t the highest priority. The conceptual model is just better. Futures allow you to cleanly express composition of sub-tasks in a way that explicit threading doesn’t:

https://monkey.org/~marius/futures-arent-ersatz-threads.html


Futures are nicely equivalent to one-shot channels with threads.


Not quite: Rust futures also have immediate cancellation and easy timeouts that can be imposed externally on any future.

In threads that perform blocking I/O you don't get that, and need to support timeouts and cancellation explicitly in every blocking call.


With a nice syntax sugar on top, but yeah pretty much.


Futures as a concept are orthogonal to async; they can totally work in an explicit threading model.


They can be used that way, but you end up with exactly the same problems that async programming aims to avoid (performance, deadlocks, your business logic being cluttered with low level implementation details).


> that async programming aims to avoid (performance, deadlocks, your business logic being cluttered with low level implementation details).

I disagree: my code looks safe and simple with explicit blocking threading, and at the same time it is much easier to reason about what is going on and to tune, in contrast to async frameworks, which hide most of the details under the hood.

You can argue about performance, that async/epoll/etc. avoids spawning thousands of threads and removes some overhead, but there aren't many benchmarks on the internet (per my research) showing that this overhead is large.


If you are using explicit blocking, sharing data between threads, and have not run into deadlocks, then your application is trivial (which is great if it solves your problem).


Could you explain how sharing data between threads is different in async programming and blocking programming?


You can minimize sharing data between threads because it's easier to have data affinity with threads (ie only thread A will read or write to a piece of data). You can still access that data from multiple modules because the whole thread is never blocked waiting for IO (because of async). An extreme example is nodejs, where you only have one thread, can concurrently do thousands of things and never have to coordinate (ie via mutexes) data access.


that may be true if you are OK with having only one thread and not utilizing parallelism.


It's not either or, you can combine the two. I've worked on a system that did real time audio mixing for 10000s of concurrent connections, utilizing >50 cores, mostly with one thread each. Each thread had thread-local data, was receiving/sending audio packets to hundreds/thousands of different IP addresses just fine without worrying about mutexes at all. Try that with tens of thousands of actual OS threads and the associated scheduling overhead.

Having data affinity to cores is also great for cache hit rates.

Here is part of the C++ runtime this is based on: https://github.com/goto-opensource/asyncly. I was the principal author of it when it was created (before it was open sourced).


> Each thread had thread-local data, was receiving/sending audio packets to hundreds/thousands of different IP addresses just fine without worrying about mutexes at all.

it doesn't sound like they're really sharing data with each other; it looks like your logic is easily linearizable and the data is localized, and you can't implement access to some global hashmap that way, for example.

> Try that with tens of thousands of actual OS threads and the associated scheduling overhead.

I run this (10k threads blocked on DB access) in prod and it works fine for my needs. There are lots of statements on the internet about the overhead, but not many benchmarks showing how large it actually is.

> Here is part of the C++ runtime this is based on

yeah, so I need one runtime on top of another runtime, with unknown quality, support, longevity, and number of gotchas.


> it doesn't sound like they're really sharing data with each other; it looks like your logic is easily linearizable and the data is localized, and you can't implement access to some global hashmap that way, for example.

Yes, because data can have thread affinity. Data doesn't need to be shared by _all_ connections, just by a few hundred/thousand. This enables connections to be scheduled to run on the same thread so that they can share data without synchronization.

> I run this (10k threads blocked on DB access) in prod and it works fine for my needs. There are lots of statements on the internet about the overhead, but not many benchmarks showing how large it actually is.

The underlying problem is old and well researched: https://en.wikipedia.org/wiki/C10k_problem


> Data doesn't need to be shared by _all_ connections,

data doesn't need to be shared in your specific case, not in general.

> The underlying problem is old and well researched: https://en.wikipedia.org/wiki/C10k_problem

a wiki page doesn't mean it is well researched; where can I see the results of overhead measurements on modern hardware?


> a wiki page doesn't mean it is well researched; where can I see the results of overhead measurements on modern hardware?

Here is how this works: at the bottom of the wiki page, there are referenced papers. They contain measurements on modern hardware. You read those, then perhaps go to Google and see if there is any newer research that cites those papers.

If you don't feel like reading papers, HN has a search bar at the bottom that yields a wealth of results: https://hn.algolia.com/?dateRange=all&page=0&prefix=false&qu...


I spent a short time looking and found that most papers on that page are very outdated or don't have the relevant info (no measurements of the overhead). Give a specific paper and citation, or we can finish this discussion.


https://blog.erratasec.com/2013/02/multi-core-scaling-its-no...

Maybe you should just take a college computer architecture course along the lines of Hennessy/Patterson. This is nothing new, I learned much of this in college 15 years ago. The problem has only gotten worse since then, computers have not become more single threaded.


my reading is that the graphs in that post were simply invented by the author to illustrate his idea and are not backed by any benchmarks or measurements; at least I don't see any links to code in the article, nor any mention of what logic he actually tried to run or how many threads/connections he spawned.

> The problem has only gotten worse since then, computers have not become more single threaded.

Computers can now handle 10k blocking connections with ease.


> yeah, I need one runtime on top of another runtime, with unknown quality, support, longevity and number of gotchas.

It's a library. It solved our problems at the time, years ago. It's still used in production and piping billions of audio minutes per month through it. You don't have to use it, I merely referred to it as an example. A similar library is proposed to be included in C++23: https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/p23...


> It's still used in production and piping billions of audio minutes per month through it.

there is tons of overengineered, unmaintainable code in prod; that doesn't mean I need to follow it as an example without much justification.

> A similar library is proposed to be included in C++23

hm, I went through the code example, and I would prefer my current approach as much simpler and more readable.


At the lowest level Rust's async is a syntax for generators (yield).

I've (ab)used them that way, without any async runtime, just to easily write stateful iterators.


Do you know an example of such case in the Rust ecosystem?


No, I don't know, I am talking about general concept.

In Java, Future.get() blocks the current thread, and it is trivially integrated into explicitly threaded programming. In Rust, Future::poll() is non-blocking, and one would need to rely on some async framework, or build one's own event loop, which can potentially block the thread.


It's worth noting that a thread's JoinHandle provides a similar interface.

You can spawn your tasks, store the JoinHandle "futures", and wait for completion whenever you need the result.

A difference being that Futures do nothing until polled, while threads start on their own, but that's arguably a helpful simplification for this purpose.


this I really found disconcerting: the model I often want is to start some work, and then join at some later point, or even chain directly into the next task.

but instead I start a future, and then to run it at all I need to wait for the result. I understand that there are tools to effect this, but it really leaves you wondering: what did I just do? Start an async task and then... block on it in order to get it to execute?


I'm not an expert, but my understanding is that Rust makes a distinction between Futures and Tasks, where Tasks are the top-level Futures run by the executor.

In JavaScript terms Futures are more like sugar around callbacks, they don't do anything until you call/poll them. Tasks are independent entities like Promises which are being run by the executor, though they may currently be blocked on other tasks.

> the model I often want is I want to start some work, and then join at some later point - or even chain directly into the next task.

Rust wants you to do this the other way around. First chain together your futures so that when you start the top level one as a task there is a single state machine for it to run.


> I don’t know what to tell users who would rather just use threads and blocking IO. Certainly, I think there are a lot of systems for which that is a reasonable approach. And nothing in the Rust language prevents them from doing it. Their objection seems to be that the ecosystem on crates.io, especially for writing network services, is centered on using async/await. Occasionally, I see a library which uses async/await in a “cargo cult” way, but mostly it seems safe to assume that the author of the library actually wants to perform non-blocking IO and get the performance benefits of user-space concurrency.

This seems like a glib dismissal of a real problem. If you want to do threads-and-blocking programming in the large in Rust, you basically can't, because async has become a very strong default in the ecosystem. And I don't think anyone could look at the ecosystem and honestly form the opinion that it's that way because every library developer has made the informed decision that their use-case requires async, when use cases which require async are so incredibly rare. It is absolutely a social default.

> None of us can control what everyone else decides to work on, and the fact of the matter is just that most people who release networking-related libraries on crates.io want to use async Rust, whether for business reasons or just out of interest. I’d like it to be easier to use those libraries in a non-async context (e.g. by bringing a pollster-like API into the standard library), but it’s hard to know what to say to people whose gripe is that the people putting code online for free don’t have exactly the same use case as them.

The criticism is not about the library authors, but about the stewards of the language and community who laid the path for them to follow.


Ok, I feel a bit dense here, but I have an application that has threads and blocking. Thread pools for processing bulk data, and tokio for serving up the results. Am I missing something?


I believe your criticism is answered by one of the proposals in the post: to pull the 'pollster' crate or something similar into the standard library.

It's not quite "colorless", but it's practical.

Pulling pollster into the standard library is a very minor change technically, but is much larger socially. The blessing of pollster will make blocking a first class citizen again.


This is an excellent post about the evolution and the milestones that lead to the current async Rust approach. I thought the technical arguments were clearly articulated, dispassionately responding to the recent critical viewpoints regarding async Rust.

I hope that people, especially the ones that have voiced these criticisms, take the time to read and digest this article. It may not change their mind, but perhaps it will help them understand the current situation better.

Unfortunately, there are already some comments that prove my optimism misguided elsewhere in this thread.


Although the articles arguments show that the decisions made were understandable and perhaps the "right decisions" for the language, there's a common theme that backwards compatibility was a motivator. I'm not saying they should have broken backwards compatibility, but "This is actually the best way to do it," and, "This is the best way that we could do it given our tough situation," are two very different points. I don't really care about the latter when I'm evaluating whether I actually like the way it's done.


The only thing constrained by backward compatibility was using Pin instead of Move. Async/await itself is the natural progression for Rust for the reasons I outlined at length regarding how Rust represents zero-cost coroutines.


I think the final remark about a hypothetical language, Rust-like but without all the low-level requirements, is important here. There is essentially no widely-adopted programming language out there that feels like a modern ML with a good tooling situation. Until that happens, Rust will continue to awkwardly serve the audience of such a language while never truly being what they want it to be.


No love for F# ? The tooling is pretty dang good, all things considered.


It could be even better if, despite being from Microsoft, it weren't handled like a 3rd-party guest language in .NET design decisions and the VS tooling roadmap.


absolutely


ReasonML is at least as widely adopted as any other niche language, and seems to fit the bill as a ML-like which plays nice with modern tooling.


Why wouldn’t Swift fit that description?


Like Rust, it's not an actual FP language. Borrowing some FP features is not sufficient.


What's missing, in your view?


As a Rust user, I have to say that it was an incredibly confusing feature.

Not because of the technical decisions behind async/await, but because of the async ecosystem, especially the runtimes.

Picture a Rust user when the feature came:

- You can use it with future combinators!

- No, actually, use it with a macro library that almost works but not really.

- Here is Tokio (with a lot of moving parts).

- Wait, async std is much simpler (again, almost seems to work but not really).

- Wait, here is Smol, which is truly simpler (or at least smaller, though not widely used).

=> You get "async fatigue."

I can understand the position: let's not commit to anything before seeing what sticks on the wall.

It might have worked for Serde (even that is debatable by some).

But as a user, it's hard to follow, and you get the impression that this feature is not the stable foundation you can build on.


Honestly, it gets better once you learn to stop worrying and love tokio.

I don't like the idea of having a single framework shaping the entire Rust ecosystem, but at the same time, once you jump on the tokio ship, everything just works out of the box and you don't have to worry. I wonder what you are referring to when talking about “lots of moving parts” when tokio reached 1.0 three years ago and has been pretty much stable since then.


Tokio is awesome and I end up using it on almost all of my server projects.

But, particularly for library developers, one reason not to just jump on tokio and instead strive for (some reasonable) compatibility across executors is the embedded world: async is an amazing match for programming tiny microcontrollers, because it provides elegant syntactic sugar for all of those little interacting state machines you would otherwise write by hand, but you don't want a heavyweight thread-based executor to be mandatory.

(The embassy embedded framework for Rust is an example of this. It doesn't yet have as wide support as some non-async frameworks, but for the things it's compatible with, it's an absolute delight.)


> heavyweight thread-based executor

While it's not the default, Tokio is usable in full without threads using the current-thread executor: https://docs.rs/tokio/latest/tokio/runtime/index.html#curren...



> By fixing yourself into the current Tokio API, you're locking your app into some decisions that might not be the best.

The impact of tokio on your app's code base isn't actually particularly big, and it wouldn't be too much trouble to change (the interfaces of other runtimes are very close to tokio's, AFAIK). The main issue is the ecosystem: most of it is using tokio already, so opting out of tokio also means cornering yourself in a place where there are few external libraries available to use.

The ecosystem's strong ties to tokio aren't a good thing, and I wish there were ways to make things generic over the runtime, but that's not an application developer's decision in any case.


You didn't really look at the links. Other ways of arranging the computation & I/O, like the thread-per-core model of glommio and io_uring, fundamentally change the API. There's even a second implementation of Tokio with an API different from the first one!


Sorry, I did not read your 6 links… (3 out of 6 are already grey though, which means I had already been there before)

> Other ways of arranging the computation & I/O, like the thread-per-core model of glommio and io_uring, fundamentally change the API. There's even a second implementation of Tokio with an API different from the first one!

It's an incompatible API in the sense that you need to update your code, but it doesn't require deep re-architecture work or anything (going from thread-per-core like glommio to tokio would be harder, for instance, because then you'd need your futures to be Send).


The tokio ecosystem is also a world of its own.

Tokio's internals are plentiful and more complex than those of other runtimes. But the common difficulty is choosing what to use for an HTTP server.

- hyper? just an http library but bring your own boilerplate?

- actix web? it was quite opinionated and there was some drama from its main author.

- axum? wait that seems to be the latest consensus actually.

And that's just the remaining popular choices.


Honestly, Axum is a breeze to use.


Axum looks interesting, haven't seen that before.

I've been using Rocket.


A nicety of axum is it uses hyper under the hood.


> OS threads have a large pre-allocated stack, which increases per-thread memory overhead.

It should be noted that AFAIK on all modern operating systems, only one page of the stack is actually allocated on thread creation, and the rest is merely reserved address space that will be allocated on demand.

That doesn't make this point wrong: 4 KB is still a lot more than the couple of bytes a future might need. And setting up the page table is part of what makes spawning threads so expensive.


That doesn't really matter, because in a long-running process it'll grow to use whatever your deepest call stack is. It matters less when you have few threads, and a lot more when you have lots.


Can you have the runtime run a job every second, to trim thread stack depths and return memory to the kernel (whatever was meant by https://www.youtube.com/watch?v=kPR8h4-qZdk&t=1150s) while holding a FFI mutex (or a sharded one), and have every C function call first lock the mutex and expand the stack to the necessary depth?


Yes, you can use madvise(MADV_FREE) to return the RAM on Linux. I've seen a thread pool library that does this. I don't remember how this part works, but I think it's possible to maintain a high water mark in some inexpensive fashion to know when it's worth making this call.


Fucking with memory maps like that is super expensive and I wouldn’t advise it in any part that’s performance critical due to the cross CPU TLB flush that syscall will entail.


People have written code that tries to keep a short stack to prevent this.

That was one way to scale up to a large number of clients when memory was more limited.


> That doesn't make this point wrong: 4 KB is still a lot more than the couple of bytes a future might need.

It makes the memory usage acceptable in many contexts. E.g., 4 KiB is small compared to a socket buffer. YMMV.

> And setting up the page table is part of what makes spawning threads so expensive.

Stacks can be reused.


I have a question that maybe someone here knows. It has been noted multiple times in the comments here that at this point, one should just use tokio as the async runtime. (From my limited experience, I agree)

The async Rust book [0] says this about runtimes:

> Importantly, executors, tasks, reactors, combinators, and low-level I/O futures and traits are not yet provided in the standard library. In the meantime, community-provided async ecosystems fill in these gaps.

Notably, it says "not yet". My question is if someone knows if there are actual plans to incorporate any (existing) async runtime, and if so, whether there is a timeline? Also, is tokio in the talks to be the runtime, or is this still open?

[0]: https://rust-lang.github.io/async-book/08_ecosystem/00_chapt...


There are no plans to incorporate a big async runtime like tokio. That doesn't match Rust's vision of a small, well-focused std.

There are possible plans to implement a minimal runtime that only lets you execute futures. No I/O or anything like that. Mainly for tests or examples. No timeline though, at least as far as I'm aware.

There are also plans to standardize the async traits, e.g. spawning a new task, AsyncRead and AsyncWrite, etc. I don't think there is a timeline, but they are waiting at least for async fn in traits.


> I don't think there is a timeline, but they are waiting at least for async fn in traits.

That was just merged! [1] It should be stable in 1.75 on December 28th.

[1] https://github.com/rust-lang/rust/pull/115822


I know it was merged, but currently there is only support for statically-dispatched async fn in traits. I believe we also need dynamic dispatch to compete with the current style of `poll_X()` functions in the I/O traits.


Fair enough. I'm still excited!


The article proposes putting "pollster" or similar into the standard library.

https://docs.rs/pollster/latest/pollster/

Technically pollster is a runtime, although in practice it's an anti-runtime.


Is anyone working on an async rust book? I'm aware of the draft that was last updated several years ago. I'm talking about a comprehensive, complete text on par with TRPL. The most difficult subject for rust programmers remains the least written about.


I am not aware of one, but I agree with you that it is sorely needed.


Steve, could we sponsor you to write one or coordinate it?

You've done so much great work on Rust education, documentation, community building, etc. I'd be happy to send money your way.


I don't know if I have the energy in me. The Rust Project burned me out and continues to burn me out. Writing a book is a tremendous amount of work, and I am focused on my job over more broad open source contribution.

That said, I appreciate the kind words. But I am struggling to have the energy to accept conference invitations and do smaller open source work, so I wouldn't commit to anything like that any time soon. (that said I literally arrived in Raleigh for All Things Open earlier today, where my talk will actually be about async/await, so... never say never.)


Totally understandable. You've worked like crazy for Rust.

Everything you've done is greatly appreciated!


Can a book of this kind succeed while having many authors? Could this be a book project?


Creative works are hard. I’m sure it’s possible in theory, but it depends on the people who are interested in making it.


Steve I'll donate $100 to whomever wants to lead it now.


as someone who's followed along (in large production systems) from `eventual` to a large combinator-based `futures`-0.1 system to async/await today, I really enjoyed this post and all its historical context.

despite its complexity and slight ergonomic annoyances, async rust is a monumental achievement: safe userspace concurrency without heap allocation!


I used Python's Twisted framework back in 2008 or so. This is before any language had async / await keywords (of course, the concept was there, just not the syntax level support).

Twisted built async on top of Python's generator functions, as far as I understand. I see now that the Rust community talks about supporting generators, and I've wondered if Rust did things backwards. If Rust had received generators first, could async have been built on top of generators without async specific special syntax?


Rust had generators first and used them to experiment with async. They're still there, stuck in nightly/unstable.

https://doc.rust-lang.org/beta/unstable-book/language-featur...

I think that's because there was a huge demand for async specifically, and Rust could ship higher level async without solving all the design details of less desired generators first.


I didn't know that, thanks for telling me.


> Context-switching between the kernel and userspace is expensive in terms of CPU cycles.

> OS threads have a large pre-allocated stack, which increases per-thread memory overhead.

It feels like it would be totally possible to improve OSes to remove this limitation. Is anyone actually working on that?

For example I don't see why you couldn't have growable stacks for threads. Or have first class hardware support for context switching. (Yes that would take a long time to arrive.)


I’m pretty sure that modern CPUs do generally have shadow registers to make context switching faster. However, you also have to consider cache contention, the fact it needs to access an indefinite amount of machine code during the context switch, it may service other processes during the context switch, and the effects all this will have on the instruction pipeline. Not to mention bugs in the CPU that might require it to flush the pipeline to maintain complete process separation.

So you can keep dedicating more and more silicon to redundant components to get closer to physically representing each thread, or you can code more efficiently.

Eg even if you did nothing but loop over a do-nothing system call, it would still need to have two separate executable pages in the cache instead of just one.

Not only that, but often the kernel is acting as a mediator for the hardware - which could mean synchronizing between cores, which brings its own obstacles (using slower shared cache, waiting for other cores, etc)


One problem is that the stack is allocated within the same virtual address space as everything else, and "growing" it might overlap other address space allocated to the rest of the program. That's why there are guard pages around the stack (also worth knowing: the direction of stack growth is architecture-dependent; on x86 it grows downward). And that's why the post points out that growable stacks have to be moveable.

The main solution is to just reserve a ton of virtual address space but avoid committing to it until the process actually writes to it, which is exactly what OS threads do. They reserve a large amount of virtual address space to start but it's more or less free until a thread actually uses it. However you may not see it released back to the OS until the process exits.


Yeah maybe an issue on 32-bit, but on 64-bit virtual address space is basically unlimited.


The problem isn't the amount of virtual address space but guaranteeing that when you grow you don't hit anything else. And the solution is to just reserve a ton of it. But then the problem becomes reclaiming memory.


At this point though you start to accumulate other performance related issues, such as TLB pressure from all the reservations.


> It feels like it would be totally possibly to improve OSes to remove this limitation. Is anyone actually working on that?

This is just my guess, but I assume they don't use growable stacks for the same reason Rust doesn't. C doesn't have a garbage collector, and a segmented stack would incur unacceptable performance penalties for many workloads. Getting around that would require a ton of work, though perhaps hardware support, like automatically following the stack pointer to bring it into cache as a normal stack would, could get around it.


I kinda recall some comments by pcwalton about APIs that would ameliorate context-switching costs at the kernel level:

https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...


Ah that Switchto thing looks really interesting. From my brief skimming it sounds like the actual biggest cost of context switching is not saving and restoring all the state, but deciding which thread to switch to.

The Switchto patch allows you to just tell Linux which thread to switch to, so it doesn't need to figure it out. Looks like they reduced the costs by ~20x. I wonder why it was never merged.

I guess my intuition was right then. I mean, async is still useful for WASM and microcontrollers, but for "I need to support 10000000 concurrent connections" (which is the usual motivation), it's a hack around poor OS APIs.


io_uring is an interesting effort to drastically reduce the context switch overhead by not context switching.


Good match for async/await, too: I wrote a wrapper in Swift for it. [1]

[1] https://github.com/PADL/IORingSwift


I've been feeling annoyed by async rust too, and have been brought back to Python for some of my side projects. That said, a lot of the cool features that I like from Rust feel tacked on in Python, so I've been looking at Ocaml. The ecosystem looks even worse than Rust, but it seems like it could be a solid replacement in terms of pure language features.


Have you looked at F#? If you like the MLish aspects of Rust but want to not have to deal with some of the manual memory management complexity it's pretty nice. There's access to the whole .NET ecosystem too, as well as really good tooling and IDE support, though not all .NET libraries written in / for C# feel totally natural to consume from F#.


Some of them, going forward, aren't even possible to consume from F# unless there is a C# shim, e.g. anything that relies on code annotations for code generators or, post .NET 8, interceptors.

OCaml is a much better option in the UNIX world.


The syntax isn't the problem. The syntax is fine: I've never had a serious problem with Rust's async syntax, and the language has been and continues to remove the papercuts that have emerged since async appeared.

The thing about async Rust that absolutely destroys my efforts with it is the legion of problems involving dependencies and their async runtime peculiarities.

As a result I've had to set Rust aside for most things where I'd otherwise love to employ it. I can't risk losing days to some tangle of an async runtime compatibility snafu. The one area that Rust has really astonished me recently is embedded Rust and MCUs: the ecosystem there is still young, but the results that can be produced with Rust in embedded development are really astonishing.

Of course, there is typically no async runtime involved at that level, so problem solved...


The pain only exists if you're trying not to use Tokio, which has 90% "market share" among Rust runtimes.

I get that it is sometimes needed (if you use GTK or browser JS as your async runtime, or write your own executor).

But for the majority of users there is only one runtime, and there's no compatibility problem. Tokio won, and network effects killed everything else. Think of it like Golang's choice of runtimes (there isn't one).


This is my impression, too! The door is open for other async runtimes, but for the majority of us, Tokio is the solution. And that's been just fine for me!


> I can't risk losing days to some tangle of an async runtime compatibility snafu.

At this point in time, if you use Tokio, won't basically everything mainstream work? At least that has been my experience so far.


> I’d like it to be easier to use those libraries in a non-async context (e.g. by bringing a pollster-like API into the standard library),

I think this would simultaneously solve two of the major gripes expressed here and elsewhere.

1) An easy answer for those who wish to avoid being sucked into async just because there's a useful crate that has async.

2) Blessing an executor other than Tokio avoids Tokio lock-in.


Great read! I always enjoy these dives into the reasoning behind the language


Is the question "why async X" just short for "why cooperative multitasking in X language"?

Coming from the Python world, where I've written a ton of async code (which I think is just a short way of saying cooperative concurrency?), it's generally easier to reason about because you don't need to worry about getting preempted anywhere. If your application consists mostly of hurry-up-and-wait I/O code, single thread, single process can be great.


No, if you read the article the question is not "why you should use async in Rust" or anything like that.

It's literally "why async Rust" as in "why async Rust is the thing that it is," why it is that way, how it came to be, etc.

There's some unavoidable level of relationship, but that's not the focus of the article.


Not quite, you can do cooperative multitasking using fibers ('stackful' coroutines).


another async approach that i don't see mentioned often is what apple chose with Grand Central Dispatch

i.e. pre-allocated worker threads and work queues

perhaps all we needed was some nice semantics around that



Honestly, I see no reason why Rust is so focused on async if it's meant to be a C++ alternative (that is, with a smaller runtime); having a dedicated keyword doesn't sit right with me. I won't be surprised if one day there's a popular GC runtime.


Just so you're aware, C++ has three keywords here: co_async, co_await, and co_return, so it feels a bit odd to complain about this aspect, specifically.

Also, as I mentioned below, Rust fares even better than C++ on minimizing allocations here.

C++ is used in high performing network services all the time, it shouldn't be a shock that Rust gets used for them as well.


> Just so you're aware, C++ has three keywords here: co_async, co_await, and co_return...

This is admittedly an unimportant correction to what you said that doesn't in any way change your point; and yet, I still think it is an important one to keep in mind for people who only might end up with an indirect understanding of the C++ feature: C++ additionally chose to add a specialized co_yield... and specifically does not have co_async! This latter tradeoff then relates to the Rust discussions I have seen come up again recently due to the article "Was async fn a mistake?".

https://seanmonstar.com/post/66832922686/was-async-fn-a-mist...

https://news.ycombinator.com/item?id=37789057

https://old.reddit.com/16ugwuc/


C++ strongly depends on actual implementation quality, since it leaves a lot of things to implementations. For example, the person leading the coroutine implementation in Clang is far more insistent on minimizing allocations than the GCC implementers. As an example, they implemented a generator system using C++ coroutines that eventually compiles down to a single instruction (https://godbolt.org/z/nsTjjGbn4).

As a counter-argument though, Clang's support for coroutines is still buggy and not ready for production use.


Optimistically, the explicit coroutine allocation proposal and the Clang-IR project should both make the situation a lot better in the future.


Wow, so C++ is joining the gang too; looks like Rust really is changing the programming ecosystem


Rust was not the impetus for C++ getting this feature, just to be clear about it. Or at least, I never read that in any of the many papers over the years working through the design.


C# added the async/await keywords first, it seems, with JavaScript following a few years later. Rust is not the leader here.


He gives the business case:

It was clear, though usually left unsaid, that what Rust needed to succeed was industry adoption, so that it could continue to receive support once Mozilla stopped being willing to fund an experimental new language. And it was clear that the most likely path to short-term industry adoption was in network services, especially those with a performance profile that compelled them at the time to be written in C/C++ ...

The other advantage of network services was that this wing of the software industry has the flexibility and appetite to rapidly adopt a new technology like Rust. The other domains were - and are! - viable long term opportunities for Rust, but they were seen as not as quick to adopt new technology (embedded), depended on a new platform that had not yet seen widespread adoption itself (WebAssembly), or were not a particularly lucrative industrial application that could lead to funding for the language (CLIs). I drove at async/await with the diligent fervor of the assumption that Rust’s survival depended on this feature.

... Many of the most prominent sponsors of the Rust Foundation, especially those who pay developers, depend on async/await to write high performance network services in Rust as one of their primary use cases that justify their funding. Using async/await for embedded systems or kernel programming is also a growing area of interest with a bright future. Async/await has been so successful that the most common complaint about it is that the ecosystem is too centered on it, rather than “normal” Rust.

I don’t know what to tell users who would rather just use threads and blocking IO. Certainly, I think there are a lot of systems for which that is a reasonable approach. And nothing in the Rust language prevents them from doing it. Their objection seems to be that the ecosystem on crates.io, especially for writing network services, is centered on using async/await. ...

None of us can control what everyone else decides to work on, and the fact of the matter is just that most people who release networking-related libraries on crates.io want to use async Rust, whether for business reasons or just out of interest. I’d like it to be easier to use those libraries in a non-async context (e.g. by bringing a pollster-like API into the standard library), but it’s hard to know what to say to people whose gripe is that the people putting code online for free don’t have exactly the same use case as them.

Well, that says it. Rust has pivoted to web stuff. For which Go is probably better suited. Good-bye, Rust as a systems language, or for game development. Younger programmers will probably still be seeing C/C++ buffer overflows in 2050.

The technical problem is that pure async, like JavaScript, is fine, and pure threading, like classic Rust, is fine. But the combination is awful.


This blog post goes into detail about how technical decisions were made to keep Rust a "systems language." Rust's async/await is even more of a systems feature than C++ coroutines: Rust requires zero allocations, while C++ coroutines require at least one (which can hopefully be optimized out, but that's not guaranteed). People use async/await in embedded contexts, with no or tiny heaps, thanks to this careful design. It doesn't get more "systems" than that.

The end of the post specifically lays out a space for a "less systems Rust" that would make these features nicer, but that cannot exist in Rust due to its strong commitment to being as zero-overhead as possible.

I don't know what it is about this feature that leads to everyone grandstanding all the time. It's incredibly frustrating.


Agreed on the grandstanding! I can’t think of another topic area where the option space is extremely rich and full of trade-offs and the discussion space is so full of strong claims of there being one obvious and unquestionably best solution.


The community needs a language that isn't so extreme, one that is opinionated about what should be zero-cost and what shouldn't. Golang sort of fits this bill, as it has an opinionated take on this, but overall, from my POV, it's a bad opinion: no sum types, no namespaces, garbage collection... I feel it's still the only other language that fills this space that isn't so OOP. It's a bad solution, and it's popular because it's the only solution.

Anyway, to get back to your point: I think this is why so many people complain. It's either Go or Rust, and neither is ideal, so one way to deal with the problem is attempting to shape Rust into this ideal.

Right? Where can I find a language with no garbage collection, nice sum types and no over complicated async syntax? Where? Nowhere. This is what people want.


You can just not write async code in Rust, no? There are a lot of Rust programmers that work this way, and it's fine because async is quite separable from the rest of the language.

Even if you have to pull in a dependency that uses async (though if you are not doing async stuff yourself, why would you??) you can trivially wrap it up with `block_on` and move on with your life.


I could. But I could also write the whole thing in assembly or go. I want a language that does everything I want and that includes a clean async api. I think there's a hole here because what "I" described as what "I" want is what a lot of people want.


You are commenting on a very long post in which I explained in detail how green threading wasn't compatible with Rust's lack of garbage collection. Your response is to just insist that you wish that wasn't the case - that you could have no garbage collection and user-space concurrency without any special syntax. Just repeating a demand in the face of an explanation as to why it isn't possible isn't going to get you far.

I actually think what you want is green threading and garbage collection and the language I described at the end of my post, but you've sort of ideologically decided garbage collection is bad for whatever reason.


No, I'm saying: just do allocations and box under the hood for async/await. Not for Rust, but for the ideal language people want and that is missing.

Rust obviously has a zero cost objective. I'm not directly talking about that.


> Where can I find a language with no garbage collection, nice sum types and no over complicated async syntax?

Austral: https://borretti.me/article/introducing-austral It doesn't have complicated async syntax, because it doesn't have it at all :) https://borretti.me/article/introducing-austral#fn:async


I should add: that is also popular and easy to find a job for. I'm sure plenty of languages fit the bill, but will I find gainful employment with that language? Probably not.


This is a completely ridiculous mischaracterization of what I wrote, John, and you of all people should understand that sometimes systems programming and network programming are the same goddamn thing. I can tell you that there are plenty of systems written in async Rust that you have probably interacted within the past 4 years and that they would otherwise have been written in C or C++. That's what async Rust was designed for - not "web stuff."


> Well, that says it. Rust has pivoted to web stuff. For which Go is probably better suited.

Hard, hard, hard disagree. Oh my god, I disagree so much.

We're using Rust in production in our microservices stack (Actix), and we're developing animation systems in Bevy.

The only places we use other languages are Python for pytorch and Typescript for rapid UI development. We do have a bit of Rust / WASM / Typescript interplay though.

Rust is turning into a truly full stack, cross-domain, cross-discipline language. And that's powerful. It's something Python had going for it (scripting, web, numerical, etc.) when it got picked up for its massive adoption. Rust could totally replicate this.

Rust rocks. Its type system, package manager, and threading / async / memory model kick ass.

I wouldn't choose Go for anything new where it wasn't already entrenched.

Rust is only getting started. It's a phenomenal language and the most exciting "new" language to introduce to problems.


Rust has no specified category when it comes to development. With the number of macro-infested libraries out there, I could write Rust and still end up with the same experience I would have gotten in languages like Go or Java.


Have fun with Go or Java if you like.

Rust is a blissful experience for those that want and enjoy it.


>Well, that says it. Rust has pivoted to web stuff. For which Go is probably better suited.

I think the focus should be entirely on this line:

>The other domains were - and are! - viable long term opportunities for Rust, but they were seen as not as quick to adopt new technology (embedded), depended on a new platform that had not yet seen widespread adoption itself (WebAssembly), or were not a particularly lucrative industrial application that could lead to funding for the language (CLIs)

The problem is "web stuff" was the only domain that could realistically fund the development of Rust. This is further supported by Mozilla dropping the Rust project amid all its funding issues, with the slack largely picked up by companies like Amazon.

The systems/game languages are still possible, but if the Rust team hadn't focused on serving the needs of an industry that could ensure its longevity, there might not be any Rust today to say "Good-bye" to.

And I think that's fair. It's unreasonable to expect a project to prioritize your needs when you are unable or unwilling to fund the project. The same is true of the crates.io async problem: the companies that are ready to adopt Rust and pay developers to write Rust are companies for which async networking is a big deal. If there are other libraries that need to exist in any other domain, well, someone needs to be paid to write them, and it doesn't look like there are very many entities ready to take up that challenge.


> Well, that says it.

It does not say that though. It says that the people doing the work value doing it with async.

Are game developers gonna have that much trouble calling `tokio::runtime::Runtime::block_on`?


> Well, that says it. Rust has pivoted to web stuff.

Rust is just a language - and it's just as suitable to deep embedded and general system programming as it's ever been. The real difference is that it would be insane to write network services and especially web stuff in C/C++, whereas Rust makes this quite feasible. Why are you surprised that web folks are interested in doing that?


Rust doesn't have taint, and C++ does, making C++ the better option for any secure system with user-input.


Leaving aside the fact that I don't know of any of the largest Web properties that use tainting: tainting, like any other type system property, doesn't mean much if the language isn't memory safe.


If this isn't trolling, I worry about you.


Rust can be useful for more than one domain.


If you've got a sync codebase, you can constrain most of the async pieces via block_on[1]. That said, my experience is the async bits tend to mostly pop up in the http/web space, and it's a fine language for the domains you mentioned.

[1] https://docs.rs/tokio/latest/tokio/runtime/struct.Runtime.ht...



