Looks interesting. Based on prior experience, here are some concerns people will bring up with this approach:
1. Spectre. You may have to assume the plugin code can read anything in the address space, including secrets like passwords, keys, file contents, etc. If the plugin can't communicate with the outside world, this may not be a problem.
2. When you say WASM has a "sandboxing architecture", this is only partially true. It's quite easy to define a simple language that doesn't provide any useful IO APIs and then claim it's sandboxed - that's practically the default state of a new language that's being interpreted. The problems start when you begin offering actual features exposed to the sandboxed code. The app will have to offer APIs to the code being run by the WASM engine and those APIs can/will contain holes through which sandboxed code can escape. If you look at the history of sandboxing, most sandbox escapes were due to bugs in the higher privileged code exposed to sandboxed code so it could be useful, but you can't help devs with that.
3. WASM is mostly meant for low level languages (C, C++, Rust etc). Not many devs want to write plugins in such low level languages these days, they will often want to be using high level languages. Even game engines are like that: "plugin" code is often written in C#, Lua, Blueprint, etc. This is especially true because WASM doesn't try to solve the API typing/object interop problem (as far as I know?), which is why your example APIs are all C ABI style APIs - the world moved on from those a long time ago. You'll probably end up needing something like COM as otherwise the APIs the host app can expose will be so limited and require so much boilerplate that the plugin extension points will just be kind of trivial.
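To make point 2 concrete, here is a hedged sketch of the classic shape of a sandbox escape. The function names and directory layout are hypothetical and this is ordinary host-side code, not tied to any particular Wasm runtime: the engine's memory isolation can be perfectly sound while a host-exposed file API forgets to confine paths.

```python
import os

PLUGIN_ROOT = "/srv/app/plugin-data"  # hypothetical per-plugin directory

def host_read_file(path: str) -> bytes:
    """Naive host API exposed to the sandboxed plugin. The Wasm sandbox
    itself may be sound; this function is the escape hatch."""
    full = os.path.join(PLUGIN_ROOT, path)  # BUG: "../" walks out of the root
    with open(full, "rb") as f:
        return f.read()

def host_read_file_checked(path: str) -> bytes:
    """Same API with the hole closed: resolve the path first, then refuse
    anything that lands outside the plugin's directory."""
    root = os.path.realpath(PLUGIN_ROOT)
    full = os.path.realpath(os.path.join(root, path))
    if os.path.commonpath([full, root]) != root:
        raise PermissionError("path escapes plugin directory: " + path)
    with open(full, "rb") as f:
        return f.read()
```

Note that the fix lives entirely in the privileged host API, not in the sandbox, which is exactly where most historical escapes have lived.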
I've been saying it for years, but I think finally 2023 has the chance of being the year in which Wasm GC ships and managed languages start targeting Wasm more widely. We've made a lot of progress with the design, and V8 has a basically complete implementation. Google is targeting some internal apps to Wasm GC and seeing perf improvements over compile-to-JS, so I think this will likely be a success.
That would be cool but I wonder how much difference it will make. Most managed languages share at least two attributes:
1. Large runtimes and std libs. Python's "batteries included" is the epitome of this but even just java.base is large. This doesn't play well at all with browser cache segmentation.
2. You need a JIT for performance.
A good test of whether WASM is really heading towards generality is whether you could implement V8 as https://www.google.com/v8-js.wasm and just auto-include it into HTML pages for backwards compatibility. I know you have unusual experience and expertise in meta-circular VMs - is WASM really heading in this direction? The two obvious sticking points today are that V8 would get downloaded fresh for each origin and JITted fresh on each page load, and the question of what such a runtime emits as compiled code. Is your JITC being JITCd by a JITC, and if so, is the JITCd output then being JITCd a second time? If so, how on earth does this make sense?
An alternative would be to explore whether process-level sandboxes are now good enough to just allow native code to run inside them and let people use their existing managed language VMs. Google thought that was close to plausible many years ago with NaCl, and kernel sandboxes have gotten a lot stronger since then. It seems we ended up with WASM more due to Mozilla politics than what makes sense technically.
Indeed. With the web's model, it seems tempting to make the browser cache do the work for you by putting the language runtime at a standard URL. That works, modulo the security features today that cache Wasm modules per-origin to avoid an engine JIT bug creating a cross-origin vulnerability. GC helps a bit with that in the sense that the GC algorithm itself moves down into the engine.
> 2. You need a JIT for performance.
I worked a bit on a prototype JVM on Wasm that used the Wasm bytecode format to encode Java constructs. That helps because then you don't have another bytecode interpreter running one level up, but ultimately you want somewhat more control over the JIT and the code it generates. Wasm engines supporting dynamic code generation at all (any finer-grained than a module) would help a lot there.
Compiling V8 whole-hog seems like it would take a lot of doing. In particular, it has JITs and an interpreter that want to spit out machine code. That'd have to be replaced with Wasm backends.
> It seems we ended up with WASM more due to Mozilla politics than what makes sense technically.
The politics were...complicated. I'd tell my side but it's probably best I don't.
Hmm interesting. I thought the cache segmentation was to block timing attacks by the web server (measure time between first page send and js execution to figure out what files are cached i.e. your browser history). I didn't realize it was due to lack of confidence in the JIT. Isn't a WASM JITC a fairly straightforward thing?
Re JVM on wasm with gc: it might be simpler at first to build a Graal/native-image backend for wasm+gc. No runtime JIT needed. Happy to join in on the fun.
> 1. Large runtimes and std libs. Python's "batteries included" is the epitome of this but even just java.base is large. This doesn't play well at all with browser cache segmentation.
Right, but that's only one niche use case (why would you want plugins in a web page?).
Even if you are a web app with plugins:
* that is not a problem for first load, user adds plugins later once they get familiar with the app
* you can still (I assume) just download it in the background and shove it into local storage
The whole use case seems to be "there is an app that would benefit from plugins and we don't want someone to learn the language just to write that plugin" and the idea is pretty sound - WASM embeds well and is fast enough.
I think there’s somewhat of a disconnect between the original idea of WASM (in the browser) versus headless. In the browser, folks get JavaScript for free, which collects its own garbage. WASM is there to supplement higher-level languages for performance-intensive tasks, and as such, “lower level” languages make more sense for these code paths.
I’d like to point out also that providing users a million languages to write plugins in for a product could create a lot of bloat. Imagine an image editor with 5 plugins, each written in its own language running in WASM sandboxes: Go, C#, AssemblyScript, Ruby, Python. That’s 5 runtimes, each running its own garbage collection logic.
I can see the value for compute hosts because the very nature of the provided service is allowing users to write sandboxed apps. But I think for stand-alone applications it’s best to support one or two simple targets, whether sandboxed or otherwise.
There are languages (Lua for example) optimized for this already.
I suppose the benefit is that each application which uses the WASM backend can decide on their “official” language and provide a decent built-in IDE experience.
I don't think WASM should/would unify the GC across memory models; that could be extremely problematic.
The gist of the idea is that polyglot languages can leverage libraries across many languages. The fastest code is the code that was already built (that you didn't need to write).
It's unlikely applications would actually implement libraries from 5 different runtimes (they could, but shouldn't), and if they use Rust libraries, there definitely wouldn't be any GC anyway.
The benefit of this tech is it allows a new language to leverage historical codebases quickly without needing to re-invent every common utility library.
This will inevitably speed adoption of newer languages, zero-code tools, etc., and is the epitome of Proebsting's law; it could even accelerate Proebsting's law (which is every decade) to begin to approach Moore's law (but I'm not specifically saying that will happen, only that it could).
> I don't think WASM should/would unify the GC across memory models
WASM already has a GC proposal[0], which is already at the "Implementation stage"[1], so it looks like this IS going to happen, although it's uncertain whether language runtimes like Go will actually make use of the feature.
A glance at the overview and spec seems to indicate that WASM will provide some primitive data types, and any GC language can build its implementation on top of them. As I understand it, it's heavily based on Reference Types[3], which allows acting on host-provided types, and is already considered part of the spec[4]. It doesn't remove the need for the 5 different runtimes to have their own GC, but it lowers the bulk that the runtimes need to carry around, and offloads some of that onto the WASM runtime instead.
A one size fits all GC is never going to be a great solution though, is it?
Different languages have different GCs that are designed to work with their semantics. Some languages use a flag bit to tell the runtime whether a value is on the stack or heap (Go, OCaml); some languages allocate almost everything and assume a lot of short-lived objects (Java)...
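The flag-bit trick mentioned above can be sketched in a few lines. This is illustrative only; real runtimes such as OCaml's pack the tag into machine words with far more care:

```python
# Low-bit tagging sketch: small integers carry a 1 in the low bit, so
# the collector can tell them apart from aligned pointers without any
# side table. (Illustrative, not any specific runtime's actual scheme.)

def tag_int(n: int) -> int:
    return (n << 1) | 1      # e.g. value 5 becomes word 11

def is_int(word: int) -> bool:
    return (word & 1) == 1   # pointers are aligned, so their low bit is 0

def untag_int(word: int) -> int:
    return word >> 1
```

The point is that this choice is baked into the language's object layout, which is why a single generic GC struggles to serve everyone equally well.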
But the competition is compile-to-javascript, which has exactly the same problem. With WASM, you get to choose between using a generic, potentially sub-optimal GC, or having a larger payload and including a GC bundled with the application.
There's no such thing as one-size-fits-all GC. Wasm GC will probably be just as successful as other attempts at generic "managed" runtimes, which is to say, not very.
There are two separate things here. One is the implementation, e.g. the GC algorithm. Algorithms vary widely in their performance characteristics but are, for the most part, semantically invisible. I fully expect different engines to have different algorithms, and ultimately you can choose and tune the GC algorithm to your application's needs.
Two is the semantics. We're aware of many failed attempts to make generic runtimes, and a critical factor is how universal the object model is. Of the many over the years, most have originated for a single language or paradigm of languages and have, in some sense, too high a level of abstraction. Wasm GC is a lower level of abstraction (think: typed structs), from which higher level constructs are implemented (like vtables, objects, classes, etc). Being lower level is a tradeoff towards universality that we have consciously made. That said, there are downsides, such as more casts, because it gets increasingly harder to safely encode invariants of source languages to avoid such casts at the lower level. We're OK with the overheads we've measured so far but are always looking for mechanisms to reduce or eliminate these.
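As a loose illustration of "higher level constructs implemented from typed structs": here is how a class with virtual dispatch might lower onto plain structs plus an explicit vtable field. This is an ordinary-code sketch with hypothetical names; the comments gesture at Wasm GC struct shapes but are not real Wasm syntax:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class VTable:                  # roughly: (struct $vtable (field $speak funcref))
    speak: Callable[["Obj"], str]

@dataclass
class Obj:                     # roughly: (struct $obj (field $vt (ref $vtable)) (field $name ...))
    vt: VTable
    name: str

# One vtable instance per "class"; the compiler would emit these statically.
DOG_VT = VTable(speak=lambda self: self.name + " says woof")
CAT_VT = VTable(speak=lambda self: self.name + " says meow")

def call_speak(o: Obj) -> str:
    # A virtual call lowers to: load the vtable field, then call through it.
    return o.vt.speak(o)
```

Nothing in the substrate knows about "classes"; they are just a convention built from structs and function references, which is the universality tradeoff being described.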
> We're OK with the overheads we've measured so far but are always looking for mechanisms to reduce or eliminate these.
You could encode arbitrary invariants by implementing verifiable proof-carrying code within Wasm. Then a wasm-to-native compiler could be designed to take advantage of such invariants in order to dispense with these overheads.
That's a legitimately neat idea. There are a couple of projects to improve the safety of Wasm code using linear memory (such as RichWasm by Amal Ahmed and MSWasm by a number of folks at Stanford, UCSD, and CMU). Obviously, unrestricted aliasing of the giant byte array that is memory makes this more difficult. I hope that Wasm GC can offer an abstraction base to express even more invariants. In some sense that will be a study in adding more powerful types and more powerful proof constructs that are either on the side or embedded in the code. So, exciting future directions!
There are; Go can also be compiled to WASM, but it has to carry the whole Go runtime with it (including, but not limited to, the GC), so the WASM files are a bit fat. You can, however, use TinyGo, a "Go compiler for small places" (https://tinygo.org/).
Yeah, I get that the runtime can be big, but letting GC cross module boundaries seems really bad for sandboxing. I also kind of wonder why we'd want to use those languages in wasm other than for their libraries and runtimes. If all you want is GC and performance, why not use JavaScript or TypeScript? C# or Go after a ~20% wasm penalty isn't going to be that much faster than JS, right? Or use something like Nim that compiles to JS and get libraries as well. The case for wasm always seemed to be absolute max performance in a safer package, which would send you towards languages that support manual memory management anyway.
One of the improvements that you get from the coming Wasm GC proposal is an object model that is inherently more space-efficient than JavaScript. While a lot of speculative optimizations can make JITed JS (of the right form) run fast with few checks, the object model inherently requires boxing or tagging and is not as memory-efficient as what Wasm GC structs give.
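A back-of-envelope sketch of the space argument. All sizes here are illustrative assumptions, not engine measurements, and real JS engines mitigate boxing with tagging tricks; the point is only that an inline struct avoids that machinery entirely:

```python
import struct

# A Wasm-GC-style struct with two i32 fields stores the payload inline.
wasm_gc_point = struct.pack("<ii", 3, 4)
inline_bytes = len(wasm_gc_point)       # 8 payload bytes (plus one object header)

# A fully boxed object model: each field is a 64-bit reference to a heap
# cell that carries its own header plus the payload. (Assumed widths.)
ref_bytes = 8                           # assumed reference width
box_bytes = 8 + 8                       # assumed header + payload per box
boxed_bytes = 2 * ref_bytes + 2 * box_bytes
# 48 bytes of boxing machinery vs. 8 inline payload bytes
```

Even granting that engines rarely pay the fully boxed worst case, the inline layout is what makes the Wasm GC struct model inherently more compact.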
Lots of good points here. I'll try my best to address them. But you are right: there aren't really complete answers to each of these problems.
Regarding #1, it's good to point out that Spectre is still an ongoing problem. Wasmtime, the runtime we use, has mitigations for Spectre, but things will likely come up. I think we need more time and tools to work on things like detecting attacks and mitigating future problems. Fortunately, there are some big companies working on it; our team cannot solve that problem alone.
Regarding #2, sandboxing here refers to the memory access model and fault isolation. Each capability you give the plugin may introduce risk. We don't claim to solve this problem, but I think education and mitigation are good goals for us.
Regarding #3, I think there has been some good progress here. I have an experimental C# plugin PDK and a JS one based on QuickJS. We also support Haskell. I think there will be improvement here. And regarding the ABI comment, there are going to be layers on top of this to make it more ergonomic.
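To illustrate why those ergonomic layers are needed: at the raw C-ABI level, even passing a single string across the boundary means copying bytes into linear memory and handing over a (pointer, length) pair. A hedged sketch with the linear memory simulated as a byte array and all names hypothetical (a real host would go through wasmtime's API, but the shape of the boilerplate is the same):

```python
memory = bytearray(64 * 1024)   # stand-in for the plugin's linear memory
_heap_top = 0                   # bump pointer for the fake allocator

def plugin_alloc(size: int) -> int:
    """Stands in for an `alloc` function the plugin would have to export."""
    global _heap_top
    ptr = _heap_top
    _heap_top += size
    return ptr

def write_string(s: str) -> tuple[int, int]:
    """Host side: copy the string into linear memory, hand over (ptr, len).
    No types, no ownership, no objects cross the boundary - only integers."""
    data = s.encode("utf-8")
    ptr = plugin_alloc(len(data))
    memory[ptr:ptr + len(data)] = data
    return ptr, len(data)

def read_string(ptr: int, length: int) -> str:
    """Plugin side: decode the (ptr, len) pair back into a string."""
    return memory[ptr:ptr + length].decode("utf-8")

# Every single string argument to every host call needs this dance.
ptr, n = write_string("on_save")
```

The ergonomic layers exist to generate exactly this marshalling code so neither side writes it by hand.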
At least the common plugin language Lua runs in wasm, and MicroPython is less than a megabyte in wasm, so that point is a bit weak. (Having an extra level of abstraction may make the plugins minimally slower, but running Lua in wasm adds an extra level of security.)