Scheduler: introduce WASM runtime to scheduler and load scheduler plugins via WASM runtime #112851
Comments
@sanposhiho: This issue is currently awaiting triage. If a SIG or subproject determines this is a relevant issue, they will accept it by applying the The Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
I'm an author of go-plugin. Please let me know if you need any help. |
This sounds great, and is one of the exact use cases I have expected to happen with wazero. JFYI
|
I appreciate that wazero has no external dependencies, but it is still on the order of 50k LOC. Since we may run untrusted code, I would expect us to carefully audit any runtime we use. This is non-trivial, to say the least. Also, Go's lack of WASI support makes it impossible to use most of the k8s library code because tinygo cannot run it (and using an alternative compiler is not a valid approach in many envs). https://github.com/prep/wasmexec is an interesting workaround for that, but such an approach seems like a poor fit for k8s. I did a recent experiment with k8s+wasm for admission. Until Go itself has complete WASI support, I would not recommend this approach. |
interesting feedback. ps wazero has GOOS=js (wasmexec) support but I also wouldn't recommend it, as the performance is horrendous and also the wasm code in go is borderline hackweek. Seems safe to park this until there's a go source compiler that can work with k8s library code + wasi, regardless of which compiler that is! |
Definitely agree there is a lot of work to do to both prove the safety of a runtime as well as the (go) tooling for actually building wasm modules. That said I do not think wasm code should be considered "untrusted". Not to say that all wasm code should be inherently trusted, but rather that the trust should come from the admin wiring up the wasm bits to the scheduler. |
Since I may have caused confusion, I'll restate a couple of things about wasm (and wazero), especially as pertains to syscalls.
At the moment, there are only two compilers that can produce wasm from Go source: golang/go and tinygo-org/tinygo. Both can target the "js" operating system, and both will be inefficient when doing so. tinygo can alternatively target wasi, without forcing data through a js model, which allows it to perform better. However, tinygo currently doesn't fully support reflection, so some porting and expertise are required to dodge libraries that fail. There will be porting notes in general because, even with reflection supported, not all functionality works in wasm yet (ex parallelism). There are notes about these two approaches here: https://wazero.io/languages/
wasm has no instructions that can result in system calls. In freestanding/standalone form (something that will be added to tinygo), there's no way for wasm to result in a system call, because the only way out of the VM is host functions (like wasi). If you don't import any host functions, the worst you can do (in simple terms) is consume up to 4GB memory and spin CPU.
wasi host functions are a subset of system calls and, at least in wazero, all of them are stubbed by default. Ex you can't even read the host clock or RNG, much less files or sockets, unless you (the embedder) choose to allow it. Some code can work with a small subset of functionality, typically stdout/stderr if anything, for debug logging. For example, Envoy loads wasm, but currently forbids host imports that can do file I/O. That's fine because many don't need to. Also, faking files isn't difficult (ex in wazero it is fs.FS).
I'm not sure of the general way k8s audits library dependencies, but there are a few things to consider with wazero. Firstly, if the audit is about whether it is a working wasm runtime, we run both v1 and v2 spectests on each commit, and you can audit that this actually happens. Knowing there are no instructions that can cause syscalls (except indirectly by calling a function that does), you can also audit our wasi imports, which are a small part of the codebase, and many are stubbed out. I think how to proceed really depends on the security posture of the project and how much rigor is needed. I'm not sure, because while I've noticed k8s has a lot of deps, I have no idea what they are subject to.
If, on the other hand, this is about the idea of auditing wasm, I agree that's tough because it is a blob and there is no standard library. You can detect if host imports (like wasi) exist, and you can also decide what to do if they are called (ex override wazero's noop stuff). You can also consult with security folks on the idea of auditing wasm. Maybe trivy folks like @knqyf263 have some ideas. I suspect it might be easier to audit the repo that creates it, though, as then what's inside is more transparent.
Anyway, I hope these pointers help. While people certainly need to think about security aspects, I would also agree that gatekeeping experimentation will hurt. For example, there are many things about webassembly that require experience. Similar to dapr, it could be a good idea to put wasm under a flag, so that users and maintainers can feel it out, as that experience takes time to build. my 2p |
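To make the point about detecting host imports concrete, here is a minimal sketch in Go that lists the entries of a module's import section by reading the binary format directly. It uses only the standard library and is not wazero's API; the names parser and listImports are made up for this illustration, and it assumes a core WebAssembly module (not a component) without validating anything beyond what it needs to skip.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

// parser reads wasm binary primitives and remembers the first error it hits.
type parser struct {
	r   *bytes.Reader
	err error
}

func (p *parser) u8() (b byte) {
	if p.err == nil {
		b, p.err = p.r.ReadByte()
	}
	return b
}

// uleb reads an unsigned LEB128 integer, the same encoding as Go's uvarint.
func (p *parser) uleb() (v uint64) {
	if p.err == nil {
		v, p.err = binary.ReadUvarint(p.r)
	}
	return v
}

func (p *parser) name() string {
	n := p.uleb()
	if p.err == nil && n > uint64(p.r.Len()) {
		p.err = io.ErrUnexpectedEOF
	}
	if p.err != nil {
		return ""
	}
	buf := make([]byte, n)
	_, p.err = io.ReadFull(p.r, buf)
	return string(buf)
}

func (p *parser) limits() {
	flag := p.u8()
	p.uleb() // minimum
	if flag&1 == 1 {
		p.uleb() // maximum
	}
}

// importDesc skips the descriptor that follows an import's two names.
func (p *parser) importDesc() {
	switch kind := p.u8(); kind {
	case 0: // function: type index
		p.uleb()
	case 1: // table: element type, then limits
		p.u8()
		p.limits()
	case 2: // memory: limits
		p.limits()
	case 3: // global: value type, then mutability flag
		p.u8()
		p.u8()
	default:
		if p.err == nil {
			p.err = fmt.Errorf("unknown import kind %d", kind)
		}
	}
}

// listImports returns "module.name" for each entry in the import section.
func listImports(wasm []byte) ([]string, error) {
	if len(wasm) < 8 || !bytes.Equal(wasm[:4], []byte("\x00asm")) {
		return nil, fmt.Errorf("not a wasm binary")
	}
	p := &parser{r: bytes.NewReader(wasm[8:])} // skip magic and version
	var names []string
	for p.r.Len() > 0 && p.err == nil {
		id, size := p.u8(), p.uleb()
		if id != 2 && p.err == nil { // skip sections other than imports (id 2)
			_, p.err = p.r.Seek(int64(size), io.SeekCurrent)
			continue
		}
		for n := p.uleb(); n > 0 && p.err == nil; n-- {
			names = append(names, p.name()+"."+p.name())
			p.importDesc()
		}
		break
	}
	if p.err != nil {
		return nil, p.err
	}
	return names, nil
}

func main() {
	// A module consisting only of the 8-byte header imports nothing.
	fmt.Println(listImports([]byte("\x00asm\x01\x00\x00\x00")))
}
```

An embedder could run a check like this on a plugin file before loading it and refuse modules that import anything outside an allow-list. An empty result means the module has no way out of the VM at all.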
here's another resource that might help with the decision outcome here. @mathetake presents how FFI works in webassembly and specifically wazero also https://speakerdeck.com/mathetake/cgo-less-foreign-function-interface-with-webassembly |
Thanks, everyone, for all the worthwhile information. I checked a few things in https://github.com/sanposhiho/kubernetes/tree/sanposhiho/poc-wasm-scheduler-plugin.
But, for the first problem, it doesn't matter when users want to write the wasm plugin in a language other than Go. My opinion/conclusion here:
|
I have some feedback since I was pinged.
Waiting for Go to support WASI will likely pause this feature for years. I would recommend waiting only if you actively want to defer wasm support. In other words, "let's wait for WASI" is a byword for putting things on ice, as it is unlikely to happen.
I wouldn't make it a primary requirement to implement this with go-plugin. While the kit looks convenient, it wasn't designed for performance and also has some gotchas. It is more typical to define an ABI (kind of like an API, but for wasm). This also allows non-Go implementations. A better example would be dapr. We recently designed an ABI for that project called http-wasm, and it is put together cleanly and benchmarked. For example, it instantiates wasm in a pool, and in certain use cases the overhead can be only dozens of microseconds. Using an ABI implies an SDK and docs, but it is a proven model which also allows maintainability and performance. See also the Envoy or Fastly SDKs etc.
For reference, here's https://http-wasm.io/ and the benchmarks, which you can run for yourself to see that wasm isn't required to be slow: https://github.com/dapr/components-contrib/blob/master/middleware/http/wasm/benchmark_test.go
In any case, nothing has ever started out ideal from a POC; there's always work to do. If you choose to leverage go-plugin despite the above, expect to add effort to make it performant in the use cases you want. That's much easier than trying to get Google to introduce WASI into the language, something that's been tried before. Even if WASI were added to the language, you'd have to proceed with all the normal evolution above, and deferring that would stunt the experience people need to accrue if wasm is important.
my 2p and thanks for the ping! |
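As a rough illustration of the pooling idea mentioned above, the following Go sketch keeps already-instantiated modules around and only instantiates a new one when none is idle. It does not use a real wasm runtime: instance, its score method, and pool are hypothetical stand-ins, and this is not how http-wasm or dapr implement it.

```go
package main

import (
	"fmt"
	"sync"
)

// instance is a hypothetical stand-in for an instantiated wasm module.
type instance struct{ id int }

// score is a placeholder for calling a function exported by the guest.
func (in *instance) score(node string) int { return len(node) }

// pool hands out idle instances and only instantiates when none is idle.
type pool struct {
	mu      sync.Mutex
	idle    []*instance
	created int
}

func (p *pool) get() *instance {
	p.mu.Lock()
	defer p.mu.Unlock()
	if n := len(p.idle); n > 0 {
		in := p.idle[n-1]
		p.idle = p.idle[:n-1]
		return in
	}
	p.created++ // in a real runtime, this is the expensive step
	return &instance{id: p.created}
}

func (p *pool) put(in *instance) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.idle = append(p.idle, in)
}

// scoreNode borrows an instance for one call and returns it afterwards.
func (p *pool) scoreNode(node string) int {
	in := p.get()
	defer p.put(in)
	return in.score(node)
}

func main() {
	p := &pool{}
	for _, node := range []string{"node-a", "node-b", "node-c"} {
		fmt.Println(node, p.scoreNode(node))
	}
	fmt.Println("instances created:", p.created)
}
```

With sequential calls, a single instance serves every request; concurrent callers would grow the pool up to the peak level of parallelism. A real implementation would also need to reset or discard guest state between calls.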
yep agreed. This work should come with thorough ABI design work, as I've done for Proxy-Wasm in the Envoy community and for http-wasm as Adrian mentioned. Then performance shouldn't be an issue, as I don't expect the number of nodes in a k8s cluster to exceed the number of requests we handle in Dapr. The point to make here is that, regardless of the Wasm runtime you choose, we have to copy data between the "Wasm world" and the "Go world". That's inevitable by the design of Wasm (sandboxing the memory), and hence it can easily become a bottleneck. In other words, a specific ABI design determines how to reduce the memory pressure across the boundary, and therefore gives better perf than copying all the data and results back and forth with go-plugin. In any case, I'm thrilled @sanposhiho did this PoC, and we (me and @codefromthecrypt) could help design the ABI and implementation with perf primarily in mind. |
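The boundary cost described above can be sketched without a real runtime. The Go example below simulates guest linear memory as a byte slice and compares two hypothetical ABI styles: pushing a whole serialized object to the guest versus letting the guest pull the single field it needs. Node, guestMemory, and both filter functions are assumptions made up for this illustration, not part of go-plugin, http-wasm, or Kubernetes.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Node is a hypothetical, trimmed-down stand-in for a Kubernetes node.
type Node struct {
	Name        string            `json:"name"`
	Labels      map[string]string `json:"labels"`
	Annotations map[string]string `json:"annotations"`
}

// guestMemory simulates a module's sandboxed linear memory and counts every
// byte the host copies into it.
type guestMemory struct {
	buf    []byte
	copied int
}

// write copies data into guest memory and returns where it landed.
func (m *guestMemory) write(data []byte) (offset, length int) {
	offset = len(m.buf)
	m.buf = append(m.buf, data...)
	m.copied += len(data)
	return offset, len(data)
}

// filterPushAll models an ABI in which the host serializes the whole node
// and the guest decodes it, although the filter needs only one label.
func filterPushAll(m *guestMemory, n Node, zone string) (bool, error) {
	raw, err := json.Marshal(n)
	if err != nil {
		return false, err
	}
	off, size := m.write(raw)
	var seen Node // the guest's decoded copy
	if err := json.Unmarshal(m.buf[off:off+size], &seen); err != nil {
		return false, err
	}
	return seen.Labels["zone"] == zone, nil
}

// filterPullLabel models an ABI in which the guest calls a host function
// for a single label, so only that value crosses the boundary.
func filterPullLabel(m *guestMemory, n Node, zone string) bool {
	off, size := m.write([]byte(n.Labels["zone"]))
	return string(m.buf[off:off+size]) == zone
}

func main() {
	node := Node{
		Name:        "node-a",
		Labels:      map[string]string{"zone": "us-east-1a"},
		Annotations: map[string]string{"notes": strings.Repeat("x", 4096)},
	}
	push, pull := &guestMemory{}, &guestMemory{}
	okPush, err := filterPushAll(push, node, "us-east-1a")
	okPull := filterPullLabel(pull, node, "us-east-1a")
	fmt.Println(okPush, okPull, err)
	fmt.Println("bytes copied:", push.copied, "vs", pull.copied)
}
```

Both styles give the same answer, but the pull style copies only the label value while the push style copies the full object, including the large annotation the filter never reads. The pull style trades that for more host calls, which is the kind of balance an ABI design has to measure.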
Thanks. This was actually the biggest misunderstanding for me. If we can improve performance on our side (I mean outside wazero), then there is no blocker on proceeding straight ahead; we can just pursue and pursue and see if the performance after that is acceptable or not. We cannot get started on the project in k/k from the beginning either way; we need to have another place for it to gain some experience. either of sigs/scheduling-plugins, creating new sub-project, etc etc. But, I think it should be maintained somewhere in the community so that the experience won't be just mine, but that of the community. |
@sanposhiho I'm glad you are interested in getting started and gaining experience. In dapr, this is our third iteration and we are readying to do more now. If we never had iteration one and two, we'd not be able to progress mutually between core team and wasm contributors. |
FYI golang/go#58141 @Pryz, @johanbrandhorst, and other folks are working on pushing the "real" WASI target forward as a revived proposal. |
/cc |
The discussion is on-going ↓. mostly about the place where we should put the experimental implementation. |
/cc |
The repository is created. 🎉 |
@sanposhiho I think it has been a while since your POC, so a lot of things have changed, do you think it is worthwhile re-basing that against latest wazero etc so we can start the conversation from there? We now have a lot of tools available, as we get to defining an ABI and measuring it. For example, @achille-roussel and gang have pprof support for wasm. I'm very confident we can proceed with high signal as we iterate. Thanks tons for getting us together! |
apologies I think maybe this thread should be re-created on the new repo, my mistake as I thought I was commenting on it 😊 |
The Kubernetes project currently lacks enough contributors to adequately respond to all issues. This bot triages un-triaged issues according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale |
/remove-lifecycle stale |
What would you like to be added?
(Disclaimer: It's just an idea now, will confirm the technical feasibility for this after getting some feedback on this idea)
Summary: introduce a WASM runtime in the Kubernetes scheduler.
We can allow users to write WASM plugins and load them in the scheduler much more easily without building their custom schedulers. (Probably we just need to pass a path to the wasm files in CC)
Furthermore, we probably will be able to allow users to write their own plugins in other languages than Golang.
As you may know, istio/envoy allows developers to write Filter plugins in WASM.
https://istio.io/latest/docs/reference/config/proxy_extensions/wasm-plugin/
I've always felt it was an interesting idea and wondering if we can implement a similar one on the scheduler.
As far as I knew, there was no stable option for introducing a WASM runtime in Golang without using CGO (i.e., one that can be cross-compiled easily).
But I finally found one: tetratelabs/wazero, a zero-dependency WebAssembly runtime for Golang that does not require CGO.
https://github.com/tetratelabs/wazero
By using it, I believe we can achieve something like we wanted to see in #100723 and #106705.
We can define a new interface to communicate with WASM plugins, or may be able to use the scheduling framework's interface (e.g., Filter(), Score(), etc.) as it is.
Seems trivy is using it to provide a plugin feature, which is something like what I want to propose in this issue.
Also, there has already been a tool to implement a general plugin by tetratelabs/wazero. (which also seems good)
https://github.com/knqyf263/go-plugin
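To give a feel for what a Filter-style interface over a WASM plugin might look like on the host side, here is a minimal Go sketch. Everything in it is hypothetical: the guest interface, the JSON request/response shapes, and the simplified Filter signature are assumptions for illustration and do not match the real scheduling framework types or the go-plugin API. A fake in-process guest stands in for a loaded wasm module.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// guest abstracts "call an exported function of a loaded module with some
// bytes and get bytes back". A real wasm runtime would sit behind it.
type guest interface {
	Call(function string, payload []byte) ([]byte, error)
}

// filterRequest and filterResponse are a hypothetical wire format.
type filterRequest struct {
	Pod  string `json:"pod"`
	Node string `json:"node"`
}

type filterResponse struct {
	Schedulable bool   `json:"schedulable"`
	Reason      string `json:"reason,omitempty"`
}

// wasmFilter adapts a guest to a Filter-style method on the host side.
type wasmFilter struct{ g guest }

func (f wasmFilter) Filter(pod, node string) (filterResponse, error) {
	var resp filterResponse
	req, err := json.Marshal(filterRequest{Pod: pod, Node: node})
	if err != nil {
		return resp, err
	}
	raw, err := f.g.Call("filter", req)
	if err != nil {
		return resp, err
	}
	err = json.Unmarshal(raw, &resp)
	return resp, err
}

// fakeGuest stands in for a wasm module; it rejects a node named "cordoned".
type fakeGuest struct{}

func (fakeGuest) Call(function string, payload []byte) ([]byte, error) {
	if function != "filter" {
		return nil, errors.New("unknown export: " + function)
	}
	var req filterRequest
	if err := json.Unmarshal(payload, &req); err != nil {
		return nil, err
	}
	resp := filterResponse{Schedulable: req.Node != "cordoned"}
	if !resp.Schedulable {
		resp.Reason = "node is cordoned"
	}
	return json.Marshal(resp)
}

func main() {
	f := wasmFilter{g: fakeGuest{}}
	for _, node := range []string{"node-a", "cordoned"} {
		resp, err := f.Filter("pod-1", node)
		fmt.Println(node, resp.Schedulable, resp.Reason, err)
	}
}
```

Keeping the runtime behind a small interface like guest would let the scheduler side be tested without any wasm toolchain, and leaves the choice of runtime and serialization format open.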
Some concerns:
Why is this needed?
#100723 describes a lot about the current extensions of the scheduler and its difficulties for users.
This feature will hugely contribute to the scheduler's extendability.
With this feature, users can use their own plugins without building their custom schedulers. That would be amazing.
And furthermore, users may be able to write their own scheduler plugins in languages other than Golang. That would be too amazing!
We tried to support a similar feature with Golang's standard plugin package in #100723 and #106705.
But, as described in the discussion in #100723, we found that Golang's plugin feature has so many limitations that we probably need to give up on using it.
/assign
/sig scheduling
/cc @kubernetes/sig-scheduling-leads