Okay here's a peeve of mine: just because something is in C ("no overhead!") doesn't mean it's faster than something in another language in all situations.
As an example, this server uses a thread pool architecture. This architecture will perform poorly with slow clients (common on the public Internet) and with servers that have to interact with slow disks or external services, and it is useless for long-polling. It's only useful for CPU-bound applications where you can assume fast clients and short requests.
In fact, I could make this server grind to a halt by opening one connection per worker, issuing partial requests to each, then letting the connection hang. So to be used in production, this server will have to sit behind something like nginx, which can insulate your application from pathologically slow clients.
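To make the attack concrete, here is a rough sketch of the slowloris-style technique described above (the helper names, host, and port are hypothetical illustrations, not anything from this server's code):

```python
import socket

def partial_request(host):
    """Build an intentionally incomplete HTTP request: the blank line
    (CRLF CRLF) that terminates the header block is missing, so a
    thread-per-connection server keeps waiting for more headers."""
    return (f"GET / HTTP/1.1\r\n"
            f"Host: {host}\r\n"
            f"X-Padding: x\r\n").encode()  # note: no terminating "\r\n\r\n"

def occupy_workers(host, port, n_workers):
    """Open one connection per worker, send a partial request on each,
    then simply hold the sockets open (hypothetical target)."""
    conns = []
    for _ in range(n_workers):
        s = socket.create_connection((host, port))
        s.sendall(partial_request(host))
        conns.append(s)  # never closed: each one pins a worker
    return conns
```

With as many connections as the server has workers, every worker sits blocked on a read that never completes.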
It looks like the README is accurate. A number of child processes are forked off and then event processing is done with kqueue on the BSD platform and epoll on Linux.
Threaded servers are not necessarily slower than asynchronous architectures, though they can be. In fact, threaded models are coming back into fashion these days, and can (sometimes) be faster than asynchronous models when each thread is affine to a particular physical CPU. This is due to cache effects and the substitution of fast stack allocation for slow heap allocation.
Bazillions of threads using too much stack space have been a problem in the past, but these days servers have plenty of memory, and to some extent stack usage can also be mitigated by tail call optimizations.
I don't think your particular DoS attack will work with this server, which implements an idle timer to recover stack space. However, they could go one step further: only allow worker threads to use n% of server memory, and kill old, slow connections when a thread is needed and the memory limit has been reached.
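As a sketch of that suggested policy (purely illustrative; the names, thresholds, and logic are made up, not the server's actual behavior), the eviction decision could look like this:

```python
def evict_idle(last_active, now, max_conns, idle_timeout):
    """Decide which connections to close under the policy suggested
    above: drop anything idle past idle_timeout, and if the number of
    survivors still exceeds max_conns, drop the oldest of them too.
    last_active maps a connection id to its last-activity timestamp.
    Returns the set of connection ids to close."""
    doomed = {c for c, t in last_active.items() if now - t > idle_timeout}
    live = [c for c in last_active if c not in doomed]
    if len(live) > max_conns:
        # sort survivors oldest-first and trim down to the cap
        live.sort(key=lambda c: last_active[c])
        doomed.update(live[: len(live) - max_conns])
    return doomed
```

In a real server the cap would be derived from a memory budget rather than a fixed connection count, but the shape of the decision is the same.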
Modern threading implementations are pretty good. Plus, a lot of the memory threads use is just virtual address space, not real memory. On 32-bit systems you ran into the problem of running out of virtual address space with a few hundred threads. With the popularity of 64-bit platforms this problem has gone away. Plus, because this is C, it's trivial to reduce the stack size to, say, 512 KB without too many problems.
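For illustration, the same knob exists in Python's threading module (in C you would use pthread_attr_setstacksize); this just demonstrates that threads can be given an explicit 512 KB stack and still run fine:

```python
import threading

# Ask for a 512 KB stack for threads created after this call.
# threading.stack_size() returns the previous setting; the platform
# minimum is at least 32 KiB per the CPython docs.
old = threading.stack_size(512 * 1024)

results = []
t = threading.Thread(target=lambda: results.append(sum(range(1000))))
t.start()
t.join()
# The thread runs fine on its small explicit stack.
```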
That's a terrible example, both because it has nothing to do with speed (your complaint is about scalability, not speed) and because it is wrong. Having a pool of workers to handle requests while using event driven interfaces is normal now, and gives you the best of both worlds. Your theoretical attack works the same as it would on nginx: you'd need to run it out of file handles.
I choose HTTP/1.1 pipelining. Uncompressed headers are useful. Ordered records are returned (unlike SPDY), where "HTTP/1.1 200 OK" is the record separator. Been using this for a decade. Can't see the benefit of SPDY.
Anyway, pipelining is only useful where numerous resources are coming from the same host. But the way the www has evolved, so much (unneeded) crap gets served from ad servers and CDNs. Pipelining isn't going to speed that up.
HTTP/1.1 pipelining was never broken. It was usually just turned off (e.g. in Firefox), while most web servers have their max keep-alive set around 100. In plain English, what does that mean? It means "Dear User, You have permission to download 100 files at a time from http://stupidwebsite.com. That is, you can make one request for 100 files instead of 100 separate requests, each for a single file." And what do Firefox and other braindead web browsers do? They make a separate request for each file. But hey, never mind all those numerous connections to ad servers to retrieve marketing garbage (i.e. not the content you are after), let's concentrate on compressing HTTP headers instead. Brilliant.
It's trivial to use pipelining:

1. Feed your HTTP requests through netcat or some equivalent to retrieve the files and save them to a concatenated file.
2. Split the concatenated file into separate files if desired.
3. View in your favorite browser.
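A rough sketch of that netcat workflow in Python (the helper names are mine; note that splitting on the status line is naive and would break if a response body happened to contain it):

```python
def build_pipelined(host, paths):
    """Concatenate one GET per path for a single connection, exactly
    as you would feed to netcat. keep-alive is the HTTP/1.1 default;
    the last request asks the server to close the connection."""
    reqs = []
    for i, path in enumerate(paths):
        conn = "close" if i == len(paths) - 1 else "keep-alive"
        reqs.append(f"GET {path} HTTP/1.1\r\n"
                    f"Host: {host}\r\n"
                    f"Connection: {conn}\r\n\r\n")
    return "".join(reqs).encode()

def split_responses(blob):
    """Split the concatenated reply on the status line, which acts as
    the record separator (naive: assumes it never occurs in a body)."""
    sep = b"HTTP/1.1 "
    return [sep + part for part in blob.split(sep) if part]
```

You would pipe the output of build_pipelined into `nc host 80 > out.bin` and then run split_responses over the saved bytes.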
Pipelining falls short of SPDY in several respects. The biggest problem is that it suffers from head of line blocking. One slow request or response prevents others from making progress.
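A toy model of the difference (my own illustration, not from any spec): suppose the server could compute all responses concurrently, so response i is ready at time ready[i]. Pipelining must still deliver responses in request order, while multiplexing delivers each one as soon as it is ready:

```python
def delivery_times(ready, in_order):
    """ready[i] is the time at which response i is ready at the server.
    With pipelining (in_order=True), HTTP/1.1 requires responses in
    request order, so a ready response waits behind any slower one
    ahead of it. With multiplexing, each is delivered when ready."""
    if not in_order:
        return list(ready)
    out, latest = [], 0
    for r in ready:
        latest = max(latest, r)  # can't overtake an earlier response
        out.append(latest)
    return out
```

With ready = [5, 1, 1], pipelined delivery is [5, 5, 5] while multiplexed delivery is [5, 1, 1]: the two cheap responses wait behind the slow one.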
I trust in theory this is true, but I've never personally observed this in practice.
I guess SPDY fans' marketing of this "feature" would be more convincing if I could see a demonstration.
I just don't see any noticeable delays when using pipelining.
What strikes me as peculiar about the interest in SPDY is that I never saw any interest in pipelining before SPDY. And I really doubt it was because of potential head of line blocking or lack of header compression. I think users just were not clued in about pipelining.
The speed up between not using pipelining and using it is, IME, enormous. 1 connection for 100 files versus 100 connections for 100 files. It is a huge efficiency gain.
Yet most users have never even heard of HTTP pipelining, or never tried it. If they really wanted such a big speed up, why wouldn't they use pipelining, or at least try it? Why wouldn't they demand that browsers implement it and turn it on by default?
Users are being encouraged to jump right into SPDY, a very recent and relatively untested internal project of one company (e.g. see the CRIME incident). Most users, if not all, have never previously experimented with even basic pipelining, which has been around since the 1999 HTTP/1.1 spec and is supported via keep-alives in almost all web servers.
Noticeable speed gains would be seen if www pages were not so burdened with links to resources on external hosts. That's what's really slowing things down, as browsers make dozens of connections just to load a single page with little content. The speed gains from cutting out all that third party host cruft would make any speed gains from avoiding theoretical potential head of line blocking during pipelining seem miniscule and hardly worth all the effort.
If you want to see how much pipelining speeds up getting many files from the same host, you do not need SPDY to do that. Web servers already have the support you need to do HTTP/1.1 pipelining. (Though on rare occasions site admins have keep-alives disabled, like HN for example. In effect these admins are saying, "Sorry, no pipelining for you.")
HTTP pipelining is turned off by default in most browsers due to concerns with buggy proxies and servers (see https://bugzilla.mozilla.org/show_bug.cgi?id=264354 ). It may work for you and the particular set of servers you visit, but I suspect browser developers would rather have a browser that by default works with the widest possible range of configurations.
Unfortunately, it being turned off by default in most browsers means that most people won't see the benefits from it. Hopefully, the upcoming HTTP/2 standard will fare better (latest draft: https://tools.ietf.org/html/draft-unicorn-httpbis-http2-01 ).
Note that HTTP/2 will be based on SPDY (in particular, SPDY/4 with the new header compressor). Hopefully, when the standard is finalized and we have multiple strong implementations, that will allay the concerns you seem to have with SPDY today.
(Disclaimer: I work on SPDY / HTTP/2 for Chromium.)
Yes, I understand there are buggy servers and proxies... and I use a browser that has settings to accommodate them. However... I do not know about HTTP bugs that affect *pipelining*. And... in addition, for pipelining, I do not use a browser to do the initial retrieval. I use something like netcat to fetch and then I view the results with a browser.
Can you give me a list of buggy servers where my HTTP/1.1 pipelining will not work as desired? I've been doing pipelining for 10 years (that's quite a few servers I've tried) with no problems.
The arguments made by SPDY fans (e.g. Google employees) all seem plausible. But I wonder why they are never supported by evidence? IOW, please show me, don't just tell me. SPDY seems to solve "problems" I'm not having. Where can I see these HTTP/1.1 pipelining problems (not just problems with browsers like Firefox or Chrome) in action? I'd love to try some of the buggy servers you allude to and see if they slow down pipelining with netcat.
I didn't have to look hard to find bug reports for pipelining. An example is https://bugs.launchpad.net/ubuntu/+source/apt/+bug/948461 for Amazon's S3. I'd be interested if the problem is still reproducible now. Also, one of the comments mentions Squid 2.0.2 as being buggy.
Most of the improvements in SPDY are latency improvements, so if you're downloading sites with netcat and then viewing them in a browser, I'm pretty sure the overhead of that would dwarf anything SPDY would save. That having been said, there's ample evidence of SPDY improving things. From http://bitsup.blogspot.com/2012/11/a-brief-note-on-pipelines... :
"Also see telemetry for TRANSACTION_WAIT_TIME_HTTP and TRANSACTON_WAIT_TIME_HTTP_PIPELINES - you'll see that pipelines do marginally reduce queuing time, but not by a heck of a lot in practice. (~65% of transactions are sent within 50ms using straight HTTP, ~75% with pipelining enabled).... Check out TRANSACTON_WAIT_TIME_SPDY and you'll see that 93% of all transactions wait less than 1ms in the queue!"
You omitted the sentence before your excerpt where Mr. McManus suggests we move to a multiplexed pipelined protocol for HTTP.
I'll go further. I say we need a lower level, large framed, multiplexed protocol, carried over UDP, that can accommodate HTTP, SMTP, etc. Why restrict multiplexing to HTTP and "web browsers"? Why are we funnelling everything through a web browser ("HTTP is the new waist") and looking to the web browser as the key to all evolution?

It seems obvious to me that what we all want is end to end peer to peer connectivity. Although the user cannot articulate that, it's clear they expect to have "stable connections". This end to end connectivity was the original state of the internet, before "firewalls". Client-server is only so useful. It seems to me we want a "local" copy of the data sources that we need to access. We want data to be "synced" across locations. A poor substitute for such "local copies" has been moving data to network facilities located at the edge, shortening the distance to the user.
But, back to reality, in the case of http servers, common sense tells me that opening myriad connections to (often busy) web servers to retrieve myriad resources is more prone to potential delays or other problems (and such delays could be due to any number of reasons) than opening a single connection to retrieve said myriad resources. Moreover, are his observations in the context of one browser?
I guess when you work on a browser development team, you might get a sort of tunnel vision, where the browser becomes the center of the universe.
If you dream of multiplexing over stable connections, then you should dream bigger than the web browser. IMO.
I'm aware of a bug in some PHP databases with keep alive after POST. I mainly use pipelining for document retrieval (versus document submission) so I am not a good judge of this. What I'm curious about is where keep alives after POST would be desirable. You alluded to that usage scenario (a series of GET's after a large POST).
Re. Patrick's sentence, you're right, but as I mentioned above, SPDY/4 will become HTTP/2 (we're working through the standardization process). So I think most of the major players are on board with "fixing" HTTP pipelining by using SPDY-style multiplexing.
Re. thinking bigger, you might want to read up on QUIC, which was announced recently: http://en.wikipedia.org/wiki/QUIC . Based on that, I would contend that at least we on the Chromium team don't have tunnel vision. :)
Re. your question, Patrick's data is from Firefox only, I believe. You're right that it's not surprising his stats show that SPDY helps over HTTP without pipelining. But the more interesting thing is that HTTP with pipelining still doesn't help much over HTTP without pipelining (on average), and SPDY still beats it by orders of magnitude. I'd have to dig, but I'm pretty sure there are similar stats on the Chromium side.
Yes, a major appeal of pipelining to me is efficiency with respect to open connections. It's easier to monitor the progress of one connection sending multiple HTTP verbs than multiple connections each sending one verb.
Whether multiple verbs over one connection are processed by the given httpd more efficiently than single verbs over single connections is another issue. IME, a purely client-side perspective, pipelining does speed things up. But then I'm not using Firefox to do the pipelining.
I'm sure the team responsible for Googlebot would have some insight on this question. (And I wonder how much SPDY makes the bot's job easier?)
In any event, multiplexing would appear to solve the open connections issue. And I don't doubt it will consistently beat HTTP/1.1 pipelining alone. I'm a big fan of multiplexing (for peer-to-peer "connections"), but I am perplexed by why it's being applied at the high level of HTTP (and hence restricted to TCP, and all of its own inefficiencies and limitations).
I'm curious about something you said earlier. You said something about the "overhead" of using netcat. It's a relatively small, simple program with modest resource requirements. What did you mean by overhead?
Re. multiplexing at the HTTP layer, because an HTTP replacement has to be deployable and testable. However, now that the ideas in SPDY have been proven and are on their way to being standardized, you can look at QUIC to see what can be done when not limited to TCP and HTTP.
By overhead I mean latency overhead -- running a program to download a site to a local file and then displaying it in a browser will almost certainly have a higher time to start render. Not to mention you're hitting everything cold (i.e., not using the browser's cache).
I don't measure latency as including rendering time. Maybe I'm not "rendering" anything except pure html.
I measure HTTP latency as the time it takes to retrieve the resources.
Whatever happens after that is up to the user. Maybe she wants to just read plain text (think text-only Google cache). Maybe she wants to view images. Maybe she wants to view video. Maybe she only wants resources from one host. Maybe she does not want resources from ad servers. We just do not know. Today's webpages are so often collections of resources from a variety of hosts. We can't presume that the user will be interested in each and every resource.
Of course those doing web development like to make lots of presumptions about how users will view a webpage. Still, these developers must tolerate that the speed of users' connections vary, the computers they use vary, and the browsers they use vary, and some routinely violate "standards". Heck, some users might even clear their browser cache now and again.
But HTTP is not web development. It's just a way to request and submit resources. Nothing more, and nothing less.
This isn't a problem when the primary request is dynamic and served from one server/domain/connection and the remaining requests are for static assets stored on and served from another server/domain/connection.
Consider if the client makes a large POST followed by a few GETs. If the client has little upload bandwidth, the GETs will be delayed until the POST completes. With SPDY, they all can proceed concurrently. Similarly, if the client makes 5 GET requests, with the first being a heavy/slow/expensive resource for the server, the cheap resources can't be delivered until the slow resource finally is computed and returned.
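A back-of-the-envelope model of that POST-then-GET scenario (the function and numbers are entirely illustrative): on a 10 KB/s uplink, a 1 MB POST holds back 200 bytes of GET requests for about 100 seconds on a serial connection, versus a fraction of a second when frames interleave:

```python
def time_to_send_gets(upload_bytes_per_s, post_bytes, get_bytes, multiplexed):
    """Toy model: when does the last byte of the GET requests hit the
    wire? Only upload bandwidth is considered. On a serial (pipelined)
    connection the GETs queue behind the entire POST body; with
    multiplexed framing the tiny GET frames interleave with the POST
    and go out almost immediately."""
    if multiplexed:
        return get_bytes / upload_bytes_per_s
    return (post_bytes + get_bytes) / upload_bytes_per_s
```

The asymmetry is what makes this matter on consumer links, where upload bandwidth is often a tenth of download bandwidth.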
Yes, that's the reason SPDY exists, but my point is that that's actually a rare setup in the real world. As was said elsewhere in this thread, and as I was saying, the most likely setup is that one request goes to, say, www.example.com, which serves a single HTML file, and the remaining requests for the resources that HTML references go to outsourced-cdn.example.com. So it's actually more important to have pipelining and SPDY support on outsourced-cdn.example.com than on www.example.com. That is, chances are you don't need to worry about it; that's what you pay outsourced-cdn for. There is also less need for multiple simultaneous requests when you have good client-side caching of resources. The usefulness of multiplexed requests is negated if you only serve one request.
The above has been the exact case at a number of companies I've worked at.
For sites and companies like Google or Facebook, which serve all their own traffic, it becomes more important.
However, it's more difficult for other SPDY implementations to use the patched zlib, so this isn't an ideal solution. For SPDY/4 / HTTP/2, we will have a custom header compressor which is intended to eliminate CRIME-like attacks: https://tools.ietf.org/html/draft-ietf-httpbis-header-compre... .
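For background on why the zlib approach had to go, the CRIME side channel is easy to demonstrate: when attacker-controlled text is compressed in the same DEFLATE context as a secret header, a correct guess deduplicates against the secret and the output gets shorter (a toy illustration, not an actual SPDY frame):

```python
import zlib

def compressed_len(attacker_text, secret_headers):
    """Length of the DEFLATE output when attacker-controlled text is
    compressed alongside a secret header -- the length side channel
    behind the CRIME attack."""
    return len(zlib.compress(attacker_text + secret_headers))

# A secret the attacker wants to recover, and two equal-length guesses.
secret = b"Cookie: sessionid=8f14e45fceea167a5a36dedd4bea2543\r\n"
good   = b"Cookie: sessionid=8f14e45fceea167a5a36dedd4bea2543"  # correct
bad    = b"Cookie: sessionid=d41d8cd98f00b204e9800998ecf8427e"  # wrong
# The correct guess is folded into one LZ77 back-reference against the
# secret, so its compressed stream is measurably shorter than the
# wrong guess's -- byte by byte, the attacker can home in on the cookie.
```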
(Disclaimer: I work on SPDY / HTTP/2 for Chromium.)
I saw an awesome talk by mnot recently about http/2.0 for which I'm really excited. A large part of it was basically "Lessons we learned from SPDY", which is great in the long term.
Personally I feel SPDY was a huge benefit to the internet, but much of that benefit came in the form of "a cautionary tale to others".
You can, but then you've got to write this enormous block comment saying "I realise this looks wrong and broken, but ssl is also broken so don't change this constant", until some junior dev inevitably does anyway.
Having known vulnerabilities baked into a standard with "weird looking" mitigation strategies is a really poor choice IMO.
That said, I do see your point. There are also other edge cases, like serving statics on a separate, uncookied domain, that benefit greatly from SPDY in the here and now.
Would it be an interesting idea to create a framework for making webapps as nginx modules? Sure, it's a pain in the ass, but nginx is evented as opposed to thread pooled, and tried and tested. Because nginx takes up hardly any memory, you could run a single nginx instance and proxy to other nginx instances that are compiled purely to run the app.
Though the scope for creating critical vulnerabilities is huge.
It uses a wide range of 3rd-party nginx modules, and Lua as a scripting language, to form a nice little framework. Extraordinarily fast by any standards.
Stopped reading after mem.c - pool-based memory allocation with cache-locality awareness and data partitioning by access patterns is a must for a modern server. Just malloc'ing every buffer you need is a naive strategy which will probably result in memory fragmentation and cache misses.
Surprisingly accurate and clean C. This guy is a good candidate for hiring (assuming he wishes to be hired).
It's quite a gift for embedded systems. It's very convenient to have a self-contained solution for a web interface, rather than drag an interpreter and a web server on an already crowded rootfs. Plus, you can get the benefit of static analysis and the like, which is extra useful on such systems.
The more "close to the metal" you can get and more awkward and painful it is to program in, the more cred you get. Actual ramifications of slow connections, long polling etc... be damned.
Come to think of it, how come we don't have a web server in Var'aq?
Why the ISC license? The site just says "Kore is licensed under the ISC license allowing it to be used in both free and commercial products." but don't Apache 2, BSD, and MIT all satisfy this as well?