If you want to use Rust and also need HTTP (client or server), I've got https://github.com/chris-morgan/rust-http, which has rapidly become the de facto HTTP library (although it started after Rust 0.7 was released). It's far from complete, but that's an opportunity for you to join in, if you want.
Thanks for reminding me: done. I saw that one earlier when I was looking for places to plug (er, advise people about) rust-http, but at that time I only had an HTTP server. Then when I implemented the client I forgot to add an answer there.
It's good enough that the Servo team decided to use its client (and now do), but the approach I've taken (implementing the HTTP spec thoroughly and putting Rust's type system to good use) will take quite a long time to polish properly (e.g. we must implement types for every type of header, rather than just using strings and leaving it to users to interpret them [often incorrectly]). It's still a little experimental, but it's the only really serious HTTP library out there for Rust.
Just wait until it's done. It will be a really great HTTP library.
Having done occasional database design work in the past, I can sympathise with the impulse to model things with properly restrictive types, and I'd be surprised if 10% of HTTP clients and servers could correctly quote (and unquote) the filename in a Content-Disposition header, but I have to wonder whether such a restrictive HTTP library would be very useful in practice.
For example, last week as a learning exercise I implemented an OAuth client, which involves adding a bunch of stuff to the "Authorization" header of an HTTP request, none of which was ever mentioned in the original HTTP RFCs, let alone specified. Likewise, the HTTP RFCs have a fixed and rather small set of verbs, but things like DAV add a bunch more.
How can you balance the reliability of strict typing with all the HTTP extensions that expect to be able to stick arbitrary strings anywhere?
An important aim is to support everything: for headers, for example, unsupported ones use the enum variant `ExtensionHeader(~str, ~str)`, and for methods there's `ExtensionMethod(~str)`.
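To give a rough idea of the shape (this is only a sketch in later Rust syntax, with String in place of ~str, not rust-http's actual definitions):

enum Method {
    Get,
    Post,
    // ... the other RFC 2616 methods ...
    ExtensionMethod(String),         // e.g. WebDAV's PROPFIND
}

enum Header {
    ContentLength(u64),
    // ... other typed headers ...
    ExtensionHeader(String, String), // (name, raw value) for anything unrecognised
}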
Taking your example of the Authorization header: that uses the `credentials` type, defined in RFC 2617 (https://tools.ietf.org/html/rfc2617), which ends up thus:
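Roughly this shape, structurally (sketched here in later Rust syntax; the field names are only illustrative, not rust-http's actual ones):

pub struct Credentials {
    pub auth_scheme: String,                // e.g. "Basic", "Digest", or an extension scheme
    pub auth_params: Vec<(String, String)>, // the #auth-param list from RFC 2617
}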
But then, Basic and Digest come into the mix, and they've got data that should be treated as data rather than text. I'll probably end up renaming the struct above to ExtensionCredentials (oh no! it doesn't have a proper name!) and using an enum:
enum Credentials {
    BasicCredentials(BasicCredentials),         // A new struct
    DigestCredentials(DigestCredentials),       // Ditto
    ExtensionCredentials(ExtensionCredentials),
}
I've been playing in my mind with having traits to convert such things as custom credentials, so that you wouldn't need to maintain the conversion yourself, but it's not an easy problem however you slice it.
In the end, it is all about balance, as you say, and I'm not sure precisely where the balance falls, yet. But I know it uses the type system a whole lot more than almost all of the code that's out there.
I haven't taken a look at your library, but as a developer who has always wondered "Why use strings for these things that could be enums, constants, or types?" in APIs, I appreciate your effort and your thoroughness.
That's been my feeling exactly, and why I was delighted that Rust didn't have HTTP support yet.
That follows through to many other aspects of e.g. web frameworks; there's a lot of string typing there. When rust-http is stable enough, I'll be getting on to my dream framework, which will be astonishingly safe and bamboozlingly quick (to start and to run, if not quite to compile), incorporating and extending various ideas currently found only in Haskell frameworks and a couple of other similar language-frameworks (e.g. Ur/Web). It'll be fun!
I've been mostly a Python developer hitherto, but I'd never have tried something like this in Python—it simply wouldn't work. You need a type system like Rust's before it can work, but then it really works.
The answer is because that's what's in the spec. For HTTP, it's probably a mistake to hardcode header types. They are defined as key/value pairs of strings in the spec. There are a few keys that are specified, but how they actually work in practice (upper? lower? quoted?) is difficult to predict. There are just too many variations. So you end up with proper types for the most common ones, and then throw the extras into a separate "others" type. Which is great, except that now you have two places to check for things.
In this case (HTTP), it's easier (and more correct) to just leave them as strings. The general principle with network protocols is to be strict in what you send, and forgiving in what you receive.
The headers are data, not text. Somewhere along the way you'll need to interpret them; doing a good job of that at the system boundary is the only sensible approach. (It's not the approach the majority of tools have taken, but it is the only sensible approach). If it gets into the system as text, people will start pulling it apart in even worse and less consistent ways.
I agree with you that the parse behaviour for HTTP headers is poorly defined. That's something I'll be wrestling with all the time.
Supported headers will be in one place and uncommon extension headers in another. Such, alas, is life. But really, the only time when I would expect this to cause any trouble at all is when new headers are added. Compare it with things like the CGI standard and how it handles headers and you'll realise it's not such a bad system.
I should make it quite clear that the specs are (unfortunately) only a starting point for rust-http. Where there are deviations, more leniency may be added. But it'll be added thoroughly and properly.
I get a rather strong sense of déjà vu looking at lots of spray-http code: it's a pretty good model of what I was already starting to do or what I had in mind a lot of the time.
My own header definitions are pretty clumsy at present; I'm just about at the stage of improving that with macros now. (I deliberately didn't do that to start with, so that I could write a few by hand and get a feel for what the macros would need to be like.)
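The sort of thing I have in mind is a macro that stamps out the per-header boilerplate; a toy sketch (later macro_rules! syntax, not rust-http's actual macros) might look like:

use std::fmt;

// Define a newtype for a header and give it a Display impl that renders
// the on-the-wire "Name: value" form.
macro_rules! simple_header {
    ($name:ident, $header_name:expr) => {
        #[derive(Debug, Clone, PartialEq)]
        pub struct $name(pub String);

        impl fmt::Display for $name {
            fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
                write!(f, "{}: {}", $header_name, self.0)
            }
        }
    };
}

simple_header!(Server, "Server");
simple_header!(UserAgent, "User-Agent");

fn main() {
    println!("{}", UserAgent("rust-http/0.1".to_string())); // User-Agent: rust-http/0.1
}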
Interesting. I took a quick look at the methods, and it looks like your library doesn't accept lower-case methods (which is technically correct). That made me a bit curious, so I tried google.com and apache.org, and indeed both give an error for a simple "get /" but give reasonable HTTP/1.0 responses to "GET /".
Method is one of the few places where the spec does indicate case-sensitive rather than the default of case-insensitive: From RFC 2616, section 5.1.1: "The method is case-sensitive."
This is going to be a problem for your users. RFCs are worth following when they represent a superset of the standard implementations, but when the RFC is more restrictive than what people actually use, you're doing yourself a disservice by sticking to the spec.
That is something I'll need to take care over. Real-world usage will be very important. At present, for performance, the parser doesn't preserve the raw header value as it reads it, so an invalid value is lost entirely. (Performance meaning you don't need an extra heap allocation for each header.) Providing the raw value of the header when parsing fails is something I may need to do; I'm not yet sure. I already know that "invalid" values for the Expires header (especially -1, as noted in RFC 2616) are normal, and so the Expires header has been switched back from being a Tm to being a ~str for the moment.
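One possible shape for that fallback (purely a sketch of the idea in later syntax, not what rust-http does):

use std::time::SystemTime;

// Keep the properly typed value when parsing succeeds, and the raw text
// when it doesn't (e.g. the very common "Expires: -1").
enum Expires {
    HttpDate(SystemTime),
    Invalid(String),
}

fn parse_expires(value: &str) -> Expires {
    match parse_http_date(value) {
        Some(t) => Expires::HttpDate(t),
        None => Expires::Invalid(value.to_string()),
    }
}

// Placeholder: a real RFC 1123 / asctime HTTP-date parser would go here.
fn parse_http_date(_value: &str) -> Option<SystemTime> {
    None
}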
As I get further along, I intend to use the data from the Common Crawl, which fortuitously includes response headers, to see how my validation goes. Of course, that's only a small set of the real-world headers (cache ones in particular will be scarcely stressed by that at all). Validating request headers will be harder; I've still got to figure out what to do about that.
In the end, though, I'm determined that it will work and work well. Servo using it (and thus demanding robust HTTP support) will help with that goal.
Something I discovered a few hours ago, reading the specs: I believe this header should be valid, with the value being interpreted as the weak entity tag ``Super Encoding™``. I wonder how many clients or servers would support it? No idea yet.
I don't know the Rust type system well enough (nor the internal representation of strings), but if strings allow you to reference sub-strings without re-allocating, then you could keep the headers in one contiguous section of memory and just "point" into it for the values (maybe this is too C-like to be possible in Rust). My recommendation (feel free to ignore it) would be something that supports typed headers as well as arbitrary string headers, because the ability to fall back to strings will make your library usable in a much broader sense.
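Something along these lines is what I'm picturing (a sketch in later Rust syntax; the function name is just illustrative), where the name and value are borrowed slices into the buffer rather than fresh allocations:

// Split "Name: value" into two &str slices that point into `line`;
// nothing is copied or heap-allocated.
fn split_header(line: &str) -> Option<(&str, &str)> {
    let idx = line.find(':')?;
    Some((&line[..idx], line[idx + 1..].trim_start()))
}

fn main() {
    let raw = "Content-Type: text/html";
    let (name, value) = split_header(raw).unwrap();
    assert_eq!(name, "Content-Type");
    assert_eq!(value, "text/html");
}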
I'd need to think about whether that's feasible or not in the overall design. (Locally, it'd work fine, but I don't think I want to be keeping the raw value around once it's validated.)
Arbitrary string headers are essential. Conversion between the typed header and strings is part of the design (though only partially implemented at present). As for other extension-headers (as they are designated in RFC 2616), that's the header enum variant ExtensionHeader(~str, ~str).
I'm not sure that's the case when it comes to HTTP methods, though -- I thought it was, but seeing two pretty high-traffic sites running different "real-world" web servers both give errors on this, apparently it's an area where we've already moved a bit away from "be lenient in what you accept; be strict in what you send".
It's been so long since I've played with netcat and HTTP that I can't remember if 'get' vs 'GET' "used to" work or not...
Still, might be something that should be possible to toggle with a flag (case insensitive parsing on/off or something like that).
Might also be useful to keep in mind that there are very real differences between HTTP/1.0 and HTTP/1.1. For browser-facing stuff, 1.1 should be fine these days(?) -- for APIs etc., I don't know if "proper" 1.0 support makes sense or not.
If they make their next-gen browser using Rust, I hope they make it 64-bit only, and optimize it as much as possible for x64 and ARMv8. None of that "support back to Windows XP!" stuff. It wouldn't be necessary (Firefox will still be around for a few years longer for that), and it would just hold them back in terms of development time, maintenance, performance and security.
Then it would be not just great PR for their "highly-optimized no legacy cruft 64-bit browser", but would also give people a reason to switch from Firefox to it. It would also give Mozilla an excuse to not make a 64-bit Firefox anymore.
I assume it's going to take them at least 2-3 years to do it (if they ever plan to release it), and by then Microsoft will probably release a 64-bit-only Windows 9, iOS will be 64-bit only too (probably not relevant to Mozilla, but could be in the future), and at least half of all Android smartphone users will have ARMv8-based devices.
Mozilla should take full advantage of this, and really push for performance (and security), and they should only make it available from Android 5.0 (probably the first 64-bit Android version) and Windows 7 x64. Support for Linux kernel should probably start with no lower than 3.10 LTS (contains all the ARMv8 support).
You make good points, and the Servo team is doing more or less as you described. Our primary target is 64bit and we plan to leave behind lots of legacy stuff.
The extra parallelism we're after also enables new things. For example, Servo runs cross-origin and sandboxed iframes in parallel.
Very few of today's apps actually benefit from having more than 3GB of address space. Nothing on my desktop PC is using more than 300MB at the moment. All else being equal, 32 bits will be more efficient, particularly on portable devices which tend to have far less memory bandwidth.
Of course, the performance gains have really nothing to do with 64-bit addressing alone, but ARM took the opportunity to almost completely redesign the ISA, and A64 is in general a better and faster ISA than A32. So all else is not equal.
I actually find this 64-bit-has-better-ISA-than-32-bit situation annoying. I would like to have the better ISA without making pointers larger. There is x32 (https://sites.google.com/site/x32abi/); will there be something similar for ARM?
mmapping files without blowing out your address space works a lot better in a 64-bit process (and system libraries love to mmap things like fonts, which can quickly take up a bunch of address space).
I just took a look at the address space usage on my Mac, and the spotlight indexer seems to be using about 1.5GB of address space (and 350MB of actual RAM, for some reason?), and a number of other built-in things (WindowServer, Apple80211Agent, Dashboard) are all north of the 500MB address space mark.
But I think most stuff that runs off of soldered-on DDR3L and flash RAM (e.g., apps on phone and tablet OSes) may be better off sticking with 32 bit addresses for another generation.
OK, yeah, the fundamental number type of JavaScript is the 64-bit floating-point double, so I could see how the ability of a JIT to emit code that handles 53-bit integers might be useful. But it would still be a tradeoff against the cost of doubling the size of every pointer.
Remember that Rust is still a rapidly changing language, and releases have very little connotation of stability, support, or any other guarantees.
They're released on a schedule, not by features. They're useful to be able to refer to periods of time in Rust's rapidly changing lifetime, but not much beyond that.
> The purpose of this milestone is to represent a high degree of confidence in
> the language's suitability for industrial use due to a high level of measured
> correctness and performance in the tests.
> In other words, for each of the aspects described in the "well covered"
> milestone, and for each supported target platform, test success and benchmark
> performance has reached a level deemed suitable for production use in
> environments with substantial business consequences for major defects or low
> performance. Where possible, performance benchmarks are linked to equivalent
> benchmarks in competitive languages, and rust is sufficiently close to
> equivalent in performance.
> maturity #2 - backwards compatible
> we are comfortable making a long-term support commitment to downstream
> users (servo in particular) to keep the set of symbols, definitions
> and passing-tests from contracting
This! I so want to play with Rust. I want to explore it over programs that are several thousand lines long, so I can get a real flavor for it. But the rate of change, especially to important elements of the language, is still too great. I don't want to keep rewriting the code just to stay abreast of revs.
Not a critique by any means--I admire the language and the work being done. More of a wish that the core language syntax would settle down soon.
Of course, the change also means your play might still have an impact: if you use it and find something about the design that stinks, wastes your time, induces errors, etc., there's still the possibility of fixing those things while the language is in flux.
In this regard it couldn't be a better time to try it out.
To be clear, the syntax changes in this release were very minor. It's mostly the standard library and runtime that are changing at this time.
That doesn't mean that your sentiment is wrong; if you don't want to be keeping up with the language's changes, certainly don't write projects in Rust. That said, there are more libraries than you'd expect, including a few that are several thousands of lines.
I've been writing Rust code off and on, and even moving some code from back in February to the recent trunk codebase wasn't that painful: Yes, there were some changes I had to make, and the code isn't perfectly idiomatic modern Rust, but it does run.
I wouldn't devote a crucial project, school assignment, or startup to rust. But for some toy side projects, or small programs, it's not so bad right now.
10 is the number that comes after 9, so I'd expect a 0.10. Rust has SemVer baked into its standard library, and the package manager (hopefully soon) will assume SemVer.
Rust is pretty good as a language but things are still changing a lot, especially on the library side.
I did a project in Rust as a learning exercise. The language is easy to pick up and I was able to hit the ground running from the start. The major learning hurdle, I think, is the memory model, which is different from most languages out there.
Here's my first Rust project after two weeks of on and off hacking. It's a Memcached client library implementing the Memcached protocols in pure Rust. https://github.com/williamw520/rustymem
I just want to note, if you ever release something (anything, be it a version or a whole startup), copy their first paragraph intro! I didn't know what Rust was, but after just 8 seconds I did. Fantastic and all too rare.
I've messed around with Rust a little and I love what I see! The documentation is still poor (understandable given how quickly the target is moving) but that seems to be changing. This is a really exciting, multi-paradigm language and I wish Mozilla all the best in developing it further!
1) I think I've had 4 different people try to explain lifetimes to me and I still don't think I understand.
2) The use of pointer dereferencing in closures is still quite confusing to me. For example, from the tutorial:
let square = |x: int| -> uint { (x * x) as uint };
no pointer dereferencing, yet:
[1, 2, 3].map(|x| if *x > max { max = *x });
uses pointer dereferencing. I can't figure out any rhyme or reason behind it.
3) How do you create traits that can be automatically derived? How do you implement a default method?
4) How do you create and use macros, and in what situations are they the appropriate solution over other forms? (I'm used to using macros in lispy languages, but using them as pervasively in other languages seems to be a form of code smell).
As for the pointer dereferencing, perhaps putting the type there will make it clearer:
[1, 2, 3].map(|x: &int| if *x > max { max = *x });
Now you can see that x isn't actually an int but a reference to one.
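A self-contained version of the same idea that you can run (in later syntax, so the element type is &i32 rather than &int):

fn main() {
    let mut max = 0;
    for x in [1, 2, 3].iter() {    // x is a reference: &i32
        if *x > max { max = *x; }  // so it has to be dereferenced, just as above
    }
    assert_eq!(max, 3);
}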
As for automatically derived traits, those are actually slightly more powerful macros implemented in the compiler itself. You can see it at rust/src/libsyntax/ext/*.
For default methods, you just put the code you want in the trait itself. Like so:
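(Sketched here in later Rust syntax; Animal and Dog are placeholder names, and only sound/make_sound come from the example being described.)

trait Animal {
    // Required method: every implementor must define this.
    fn sound(&self) -> String;

    // Default method: implementors get this for free, but may override it.
    fn make_sound(&self) {
        println!("{}", self.sound());
    }
}

struct Dog;

impl Animal for Dog {
    fn sound(&self) -> String {
        "woof".to_string()
    }
    // make_sound is inherited from the trait's default.
}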
The make_sound method is what's called a default method. If you implemented that trait, at the very least you would have to define the sound method and if you wanted to, you could override the default make_sound method.
> How do you create and use macros, and in what situations are they the appropriate solution over other forms?
I would say to follow a Lisp rule that I like to follow in all languages that have macro support: "only implement a macro if it cannot be done with a function".
I would really love it if the reference docs pointed to the underlying code. Many times I'll skim through the reference docs looking for something that I need... then search GitHub for the same section in code so that I can see how it is actually used (or how it's implemented, since the compiler is often the best place to learn Rust itself).
It would be great if the docs just linked straight to the function/crate on GitHub.
I'm working on something Rusty as my side project: https://github.com/DanielFath/xml_parser ;) But it's so woefully incomplete I am ashamed of showing it. It's mostly based on
rust-http was the very first thing that I worked on when I came to Rust, not knowing the language up until then. And it hasn't been turning out too badly.
Yeah it's been very smooth sailing :) Only pain point was @ pointers in BytesReader interface, but aside from that it's way less bumpy than anticipated.
Your book is definitely an excellent contribution (I bought it as soon as I found out about it). When I first came to the language I was a little surprised that O'Reilly didn't have a book out, but I guess it was too unstable even for them. Having a good reference that covers not only the syntax (which isn't that hard to learn, let's be honest) but the best practices and common patterns in the language is always a real help.
Not yet. As chrismorgan points out above, the http library is still being developed; once it's more stable, web frameworks can be built on top of it.
Why? FastCGI is slow. Controlling the entire HTTP stack is fast, and makes deployment much easier (only one piece of software to configure, not two or three).
Congrats on all the progress, guys! I'm really looking forward to getting to work with Rust. I have a couple of questions:
Is the new fixed-stack FFI arrangement the end goal, or is it a stepping stone to a different system? It seems as though always using a big, fixed stack would cause performance/memory issues. Could the compiler detect which Rust fn's call extern "C" functions so I don't have to write annotations? Thanks!
Fixed-stack is not the end goal. The intent is to migrate back towards small, growable stacks.
There were long discussions over how "smart" the extern stack-size strategy should be. The current arrangement is, as ever, a compromise. In practice, most people writing bindings to C from Rust will wrap the C call in a very thin wrapper function whose job is to handle type conversions and manage the necessary `unsafe` bits. The hope is that putting the annotation on these wrapper functions won't be very onerous, with the result that any Rust code that calls the wrapper functions won't ever have to be bothered with remembering the annotations.
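For instance, a binding might end up looking roughly like this (sketched in later Rust syntax and without the 0.8-era stack annotation; `absolute` is just an illustrative name, and `abs` comes from the C library):

use std::os::raw::c_int;

extern "C" {
    fn abs(value: c_int) -> c_int; // the C library's abs()
}

// Thin safe wrapper: handles the type conversions and keeps the `unsafe`
// (and any required annotations) in one place, so callers never see them.
pub fn absolute(value: i32) -> i32 {
    unsafe { abs(value as c_int) as i32 }
}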
Makes sense, thanks! As an outsider it can be tricky to know which things in the release notes are "This feature is ready" vs "This is simply the present state of things."
Plus there's rather a lot of non-codified knowledge about what's happening and what's going to happen. But if you pay attention to the mailing list and hang around in #rust you'll pick up an awful lot of it.
See also https://github.com/mozilla/rust/issues/3591, where re2 is graydon's expressed preference. (graydon was the project leader at that point; more recently brson has taken up that mantle.)