How fast are Linux pipes anyway?

zh3 · 2025-06-22T18:38:17 1750617497

Seared into my soul is the experience of porting a Linux pipe-based application to Windows, thinking it's all posix and given it's all in memory the performance will be more or less the same. The performance was hideous, even after we found that having pipes waiting for a connection more or less ground windows to a halt.

Some years later this got revisited when we needed the same thing under C# on Win10, and while it was better, the size of the performance gap was still a major embarrassment.

dataflow · 2025-06-23T05:08:25 1750655305

> The performance was hideous, even after we found that having pipes waiting for a connection more or less ground windows to a halt.

When you say the performance was hideous, are you referring to I/O after the pipe is already connected/open, or before? The former would be surprising, but the latter not - opening and closing a ton of pipes is not something you'd expect an OS to be optimized for - and it would be somewhat surprising if your use case requires the latter.

zh3 · 2025-06-23T06:16:31 1750659391

Literally just having spare listening sockets, ready for incoming connections (and obv. not busy-waiting on them). Just reducing to the number actually in-use was the biggest speed-up - it was like Windows was busy-waiting internally for new connections (it wasn't a huge number either, something like 8 or 12).

dataflow · 2025-06-23T08:27:13 1750667233

By "spare listening sockets" do you mean having threads on the server calling ConnectNamedPipe? A bit confused by your terminology since these aren't called listening sockets. (You're not referring to socket() or AF_UNIX, right?)

And yeah, that seems more or less what I expected. The implementation is probably optimized for repeated I/O on established connections, not repeated unestablished ones. Which would be similar to filesystem I/O on Windows in that way - it's optimized for I/O on open files (especially larger ones), not for repeatedly opening and closing files (especially small ones). It makes me wonder what kinds of use cases require repeated connections on named pipes.

If the performance is comparable to Linux's after the connection, then I think that's important to note - since that's what matters to a lot of applications.

zh3 · 2025-06-23T09:06:35 1750669595

Yes, it was indeed using ConnectNamedPipe - just had a look at the code (which I can't share) to refresh my memory. The main problem was traced to setup delays in WaitForSingleObject()/WaitForMultipleObjects(); we fixed it as above (once all sessions were connected there were no spares left, so no problems). Actual throughput was noted as quite inferior to Linux but more than enough for our application, so we left it there.

dataflow · 2025-06-23T10:14:52 1750673692

Ah interesting, thanks for checking. Not entirely sure I understand where the waits were happening, but my guess here is that the way Microsoft intended listening to work is for a new pipe listener to be spawned (if desired) once an existing one connects to a client. That way you don't spawn 8 ahead of time, you spawn 1 and then count up to 8.

I would intuitively expect throughput (once all clients have connected) to be similar to that on Linux, unless the Linux side uses syscalls like vmsplice() - but not sure, I've never tried benchmarking.
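For anyone who wants a number instead of an intuition, here is a minimal sketch of a steady-state pipe throughput measurement. It uses only `os.pipe()`, which Python provides on both Linux and Windows (anonymous pipes), so the same script can be run on each. The function name `pipe_throughput` and the 16 MiB / 64 KiB sizes are arbitrary choices for illustration, and a writer thread in the same process is only a rough stand-in for a real two-process setup.

```python
import os
import threading
import time

def pipe_throughput(total_bytes=16 * 1024 * 1024, chunk=64 * 1024):
    """Push total_bytes through an anonymous pipe; return (bytes_read, MiB/s)."""
    r, w = os.pipe()
    payload = memoryview(b"\0" * chunk)

    def writer():
        sent = 0
        while sent < total_bytes:
            # os.write reports how many bytes it actually wrote.
            sent += os.write(w, payload[: min(chunk, total_bytes - sent)])
        os.close(w)  # closing the write end lets the reader see EOF

    t = threading.Thread(target=writer)
    start = time.perf_counter()
    t.start()
    received = 0
    while True:
        data = os.read(r, chunk)
        if not data:  # empty read means the writer closed its end
            break
        received += len(data)
    elapsed = time.perf_counter() - start
    t.join()
    os.close(r)
    return received, received / elapsed / (1024 * 1024)

if __name__ == "__main__":
    n, rate = pipe_throughput()
    print(f"{n} bytes at {rate:.0f} MiB/s")
```

This only measures I/O on an already-open pipe, which is the case being discussed here; it says nothing about connection setup cost.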

yndoendo · 2025-06-23T17:17:46 1750699066

The Windows API is built on a kludge of functionality, not performance. For example, GetPrivateProfileString [0] does exactly what you stated for files: it opens the file, parses a single key value, and closes it. So much time and so many resources are wasted with the GetPrivateProfileXXXX APIs.

[0] https://learn.microsoft.com/en-us/windows/win32/api/winbase/...

dataflow · 2025-06-23T17:50:47 1750701047

> This function is provided only for compatibility with 16-bit Windows-based applications. Applications should store initialization information in the registry.

They literally provided the registry to solve this very problem from the days of 16-bit Windows. Holding it against them in 2025 when they have given you a perfectly good alternative for decades is rather ridiculous and is evidence for the exact opposite of what you intended.

asveikau · 2025-06-22T20:09:57 1750622997

Some years back Windows added AF_UNIX sockets, I wonder how those would perform relative to Win32 pipes. My guess is better.

manwe150 · 2025-06-23T00:20:00 1750638000

Seems to reportedly be slightly faster in a few cases, but nothing particularly dramatic https://www.yanxurui.cc/posts/server/2023-11-28-benchmark-tc...

asveikau · 2025-06-23T16:21:00 1750695660

Are we reading the same tables? It seems to be about 3x faster than named pipes, and marginally faster than local TCP.

It's worth noting that in Win32, an unnamed pipe is just a named pipe with the name discarded. So this "3x faster" is, I think, the exact comparison we're interested in.

SoftTalker · 2025-06-22T20:24:44 1750623884

Well, POSIX only defines behavior, not performance. Every platform and OS will have its own performance idiosyncrasies.

klysm · 2025-06-22T21:23:23 1750627403

How on earth would POSIX define performance of something like pipes?

SoftTalker · 2025-06-22T21:30:36 1750627836

I was addressing "it's all posix and given it's all in memory the performance will be more or less the same."

Not claiming that POSIX should or could attempt to address performance.

pjmlp · 2025-06-23T04:14:38 1750652078

By using Big O notation, or deadlines like on RTOS APIs, as two possible examples on how to express performance on a standard.

variadix · 2025-06-23T17:07:22 1750698442

Some standards do define performance requirements, e.g. for operations on data structures, in Big O notation.

vardump · 2025-06-23T15:28:59 1750692539

Last I checked, on Windows local TCP outperforms pipes by a large margin.

andrewmcwatters · 2025-06-22T20:02:11 1750622531

Did you find that you needed interprocess communication to replace the gap?

spacechild1 · 2025-06-23T03:20:18 1750648818

pipes are a form of interprocess communication :) I guess you meant shared memory?

andrewmcwatters · 2025-06-23T03:39:59 1750649999

Yes. Yeah, you're right. Sockets could also be used, but I guess when I think of IPC, I generally think of shared memory.

hk1337 · 2025-06-23T10:15:16 1750673716

I remember years ago, we had an opposite experience. Not necessarily with pipes. We were running on Linux with a php app that would communicate with a soap api on .net and found that a .net implementation had better response time.

johnisgood · 2025-06-22T18:39:04 1750617544

FWIW there is readv() / writev(), splice(), sendfile(), funopen(), and io_buffer() as well.

splice() is great when transferring data between pipes and UNIX sockets with zero-copy, but it is Linux-only.

splice() is the fastest and most efficient way to transfer data through pipes (on Linux), especially for large volumes. It bypasses memory allocations in userspace (as opposed to read(v)/write(v)), there is no extra buffer management logic, there is no memcpy() or iovec traversal.

Sadly on BSDs, for pipes, readv() / writev() is the most performant way to achieve the same if I am not mistaken. Please correct me if I am wrong.
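For comparison, a minimal sketch of the portable readv() / writev() route, using Python's `os.writev` and `os.readv` (available on Unix-like systems). The function name and the header/body split are hypothetical. Unlike splice(), the data is still copied into and out of userspace buffers; the gain is only that several buffers go through in one syscall.

```python
import os

def gather_scatter_over_pipe():
    """Write two buffers with one writev(), read them back with one readv()."""
    r, w = os.pipe()
    try:
        header, body = b"HDR:", b"payload bytes"
        # Gather: both buffers are submitted in a single syscall.
        written = os.writev(w, [header, body])
        # Scatter: the kernel fills the buffers in order.
        hdr_buf = bytearray(len(header))
        body_buf = bytearray(len(body))
        read = os.readv(r, [hdr_buf, body_buf])
        return written, read, bytes(hdr_buf), bytes(body_buf)
    finally:
        os.close(r)
        os.close(w)
```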

At any rate, this is a great article.

messe · 2025-06-22T18:52:47 1750618367

> sendfile() is file-to-socket (zero-copy as well), and has very high performance as well, for both Linux and BSDs. It only supports file-to-socket, however, and well, to stay relevant, sendmsg() can't be used with pipes in the general case, it is for UNIX domain sockets, INET sockets, and other socket types.

On Linux, sendfile supports more than just file to socket, as it's implemented using splice. I've used it for file-to-block-device in the past.
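A small sketch of that Linux behavior, with a regular file as the destination instead of a socket. It uses Python's `os.sendfile(out_fd, in_fd, offset, count)`; the helper name `sendfile_copy` is invented for the example, and it is expected to fail on systems where the output must be a socket.

```python
import os
import tempfile

def sendfile_copy(data: bytes) -> bytes:
    """Copy `data` from one temp file to another with sendfile() (Linux)."""
    with tempfile.TemporaryFile() as src, tempfile.TemporaryFile() as dst:
        src.write(data)
        src.flush()  # make sure the bytes are in the file, not Python's buffer
        offset = 0
        while offset < len(data):
            # Reads from src at `offset`, writes at dst's current file position.
            sent = os.sendfile(dst.fileno(), src.fileno(), offset, len(data) - offset)
            if sent == 0:
                break
            offset += sent
        dst.seek(0)
        return dst.read()
```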

johnisgood · 2025-06-22T20:13:37 1750623217

On BSDs probably not, as they don't have splice, but that is good to know. I wonder if on BSDs it really is readv() and writev() that are the fastest way to achieve the same thing as has been done in the article. Maybe I am missing something. I would like to be corrected.

messe · 2025-06-22T20:22:07 1750623727

AFAIK, neither OpenBSD nor NetBSD has sendfile. On FreeBSD, I think you're correct regarding it being file-to-socket only.

zambal · 2025-06-22T20:35:02 1750624502

Indeed, if I'm not mistaken Netflix at least used to use (and commit to kernel) FreeBSD on content servers because of its superior sendfile performance

messe · 2025-06-23T07:28:27 1750663707

Not only that, but they even contributed patches to allow the FreeBSD kernel to handle the TLS part of SSL_sendfile as well[1].

[1]: https://man.freebsd.org/cgi/man.cgi?ktls(4)

wavesquid · 2025-06-23T02:37:24 1750646244

> splice() is the fastest and most efficient way to transfer data through pipes (on Linux), especially for large volumes. It bypasses memory allocations in userspace (as opposed to read(v)/write(v)), there is no extra buffer management logic, there is no memcpy() or iovec traversal.

Proper use of io_uring should finally have it beat or at least matched.

tedunangst · 2025-06-22T22:56:54 1750633014

Shared memory, like shm_open and fd passing, would be even faster and fully portable.
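A minimal sketch of the shared-memory route, using Python's `multiprocessing.shared_memory`, which on POSIX systems is built on shm_open and mmap. The function name is hypothetical, and both "sides" live in one process here only to keep the example short; a second process would attach with the same `name`. It does not show fd passing or any synchronization, which a real design would need.

```python
from multiprocessing import shared_memory

def shared_memory_roundtrip(message: bytes) -> bytes:
    """Write `message` into a named shared segment and read it via a second mapping."""
    # Producer: create the segment (message must be non-empty; size > 0 is required).
    producer = shared_memory.SharedMemory(create=True, size=len(message))
    try:
        producer.buf[: len(message)] = message
        # Consumer: attach to the same segment by name, as another process would.
        consumer = shared_memory.SharedMemory(name=producer.name)
        try:
            # The mapped size may be rounded up, so slice to the message length.
            return bytes(consumer.buf[: len(message)])
        finally:
            consumer.close()
    finally:
        producer.close()
        producer.unlink()  # remove the segment once nobody needs it
```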

gkfasdfasdf · 2025-06-22T14:55:06 1750604106

Great article, discussed previously on HN:

https://news.ycombinator.com/item?id=31592934 (200 comments)

https://news.ycombinator.com/item?id=37782493 (105 comments)

gigatexal · 2025-06-22T17:05:08 1750611908

This is such a dope article. I love that it comes from time to time.

gigatexal · 2025-06-23T03:02:40 1750647760

s/comes/comes up

aeonik · 2025-06-22T14:17:35 1750601855

I feel bad that this doesn't have any comments, the article was really great.

I'd like to use splice more, but the end of the article talked about the security implications and some ABI breaking.

I'm curious to know if long term plans are to keep splice around?

I'd also be curious how hard it would be to patch the default pipe to always use splice for performance improvements.

amelius · 2025-06-22T16:27:13 1750609633

For more comments, see: https://news.ycombinator.com/item?id=44347412

lukeh · 2025-06-22T21:15:42 1750626942

Does modern Linux have anything close to Doors? I’ve an embedded application where two processes exchange small amounts of data which are latency sensitive, and I’m wondering if there’s anything better than AF_UNIX.

the8472 · 2025-06-22T23:02:09 1750633329

Shared memory provides the lowest latency, but you still need to deal with task wakeup, which is usually done via futexes. Google was working on a FUTEX_SWAP call for Linux which would have allowed direct handover from one task to another; not sure what happened to that.

Galanwe · 2025-06-23T06:44:30 1750661070

If you really want low latency, then you should be OK to trade power/CPU for it, and you can just spin instead of being woken up.

themerone · 2025-06-23T01:55:51 1750643751

What are Doors? It's too common a word to Google.

kjellsbells · 2025-06-23T02:51:00 1750647060

Lightweight IPC invented by Sun.

https://en.m.wikipedia.org/wiki/Doors_(computing)

Look for Doors solaris and there are quite a few articles.

mort96 · 2025-06-22T23:15:36 1750634136

Would be helpful to know what your problem is with AF_UNIX at the moment. Is it lacking in features you want? Is it higher latency than you'd want? Is the server/client socket API style not appropriate for your use-case?

lukeh · 2025-06-22T23:46:21 1750635981

Well, it’s probably fine, but it’s an audio application where metering (not audio) is delivered from a control plane process to a UI process. Lower latency is better. But I haven’t measured it.
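A rough way to get a first measurement is a ping-pong over an AF_UNIX socket pair. The sketch below is only that: the function names, the 64-byte message size and the round count are assumptions, and because the echo side is a thread in the same Python process, the result overstates what two native processes would see. It is still enough to get an order of magnitude.

```python
import socket
import threading
import time

def _recv_exact(sock, n):
    """Read exactly n bytes from a stream socket."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed early")
        buf += chunk
    return bytes(buf)

def unix_socket_rtt(rounds=1000, size=64):
    """Return the median round-trip time in microseconds over AF_UNIX."""
    client, server = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)

    def echo():
        for _ in range(rounds):
            server.sendall(_recv_exact(server, size))

    t = threading.Thread(target=echo)
    t.start()
    payload = b"m" * size
    samples = []
    for _ in range(rounds):
        start = time.perf_counter_ns()
        client.sendall(payload)
        _recv_exact(client, size)
        samples.append(time.perf_counter_ns() - start)
    t.join()
    client.close()
    server.close()
    samples.sort()
    return samples[len(samples) // 2] / 1000.0

if __name__ == "__main__":
    print(f"median RTT: {unix_socket_rtt():.1f} us")
```

The median is reported rather than the mean because the first few rounds and occasional scheduler hiccups skew an average badly at this scale.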

layer8 · 2025-06-22T17:43:08 1750614188

(2022)