🐛 Bug Report — Runtime APIs: large wasm memory allocations bring a DurableObject into broken state #3264
Comments
Hey @yarolegovich. We believe that the root cause here is that the compile target @LuisDuarte1 has developed a pretty complicated work around to handle panics at the application layer. LMK if you are interested.
Hey @joshthoward, I'm definitely interested 👀. We have a few different use cases for larger wasm binaries in Durable Objects, including CRDTs. Some of them are our own Rust code; others are libraries that use wasm under the hood.
Hi @joshthoward
Sure, knowing what exactly messes up the state will definitely be very helpful.
The 128MiB limit is documented here (not sure if you can ask to increase it via the form): https://developers.cloudflare.com/workers/platform/limits/#worker-limits
Not sure about implementation details in If resetting the WASM instance is fine in your use-case (this is not a general fix) - I would personally recommend re-creating the WASM instance (probably will need a patch in |
Right, so worker limits apply to durable objects (I was looking at this page). Just to confirm my understanding: requests are load balanced across servers where worker runtime processes run, hosting isolates of different tenants. Inside a runtime process, each tenant will only have a single isolate, with tasks in the event loop potentially corresponding to different requests and durable object invocations, and the 128 MB limit applies to that isolate, including wasm memory?
Yeah, so far that was the only possible "workaround" we could think of.
@joshthoward Hi, thanks for taking a look at this. Would a panic inside WASM put the WebAssembly.Memory instance into an invalid state? Note that the error doesn't originate inside the WASM bytecode, but when we try to create a Uint8Array view of the WASM memory in JS.
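The "re-create the WASM instance" workaround mentioned above could look roughly like the sketch below. It is only an illustration under stated assumptions: `ResettableWasm`, `run`, and the tiny hand-written module (which exports nothing but a one-page memory) are hypothetical names and test data, not part of automerge, wasm-bindgen, or the Workers runtime. With real wasm-bindgen output the generated glue caches its own reference to the instance, so it would need to be reset as well. Everything stored inside the old instance is lost, and nothing in this thread confirms that the approach clears the broken state described in the report.

```javascript
// Minimal hand-written wasm module (assumption: stands in for the real binary).
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic + version
  0x05, 0x03, 0x01, 0x00, 0x01,                   // memory section: one memory, min 1 page
  0x07, 0x0a, 0x01, 0x06, 0x6d, 0x65, 0x6d, 0x6f, // export section: name "memory",
  0x72, 0x79, 0x02, 0x00,                         // kind = memory, index 0
]);

// Hypothetical wrapper: keeps the compiled module and swaps in a fresh
// instance (with fresh linear memory) whenever a call throws.
class ResettableWasm {
  constructor(compiled, imports = {}) {
    this.compiled = compiled;
    this.imports = imports;
    this.resets = 0;
    this.instance = new WebAssembly.Instance(compiled, imports);
  }

  run(fn) {
    try {
      return fn(this.instance.exports);
    } catch (err) {
      // Treat the old instance as unusable; all state inside it is discarded.
      this.instance = new WebAssembly.Instance(this.compiled, this.imports);
      this.resets += 1;
      throw err; // the caller still sees the original failure
    }
  }
}

const wasm = new ResettableWasm(new WebAssembly.Module(wasmBytes));

// Dirty the first instance's memory, then simulate a failing call.
wasm.run((ex) => { new Uint8Array(ex.memory.buffer)[0] = 42; });

let caught = null;
try {
  wasm.run(() => { throw new RangeError("simulated failure"); });
} catch (err) {
  caught = err;
}

// The replacement instance starts from zeroed memory.
const firstByteAfterReset = wasm.run((ex) => new Uint8Array(ex.memory.buffer)[0]);
```

Rethrowing after the reset keeps the failure visible to the caller, which then has to reload whatever state the library held (for example, a saved document) into the new instance.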
Panics in the |
In the above error we never get to the WASM instance. The error gets thrown in JS on this line right here: cachedUint8ArrayMemory0 = new Uint8Array(wasm.memory.buffer); // RangeError: Invalid typed array length: undefined
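For context on why the glue code rebuilds that view at all: growing a non-shared WebAssembly.Memory detaches the previous ArrayBuffer, so any cached Uint8Array over it drops to length 0 and has to be re-created from the new `memory.buffer`. The standalone sketch below shows that mechanism using only a bare WebAssembly.Memory. It does not reproduce the RangeError from this report, and `getView` and `cachedView` are illustrative names modelled on the quoted line rather than the actual generated code.

```javascript
// A bare memory is enough to show the behaviour; sizes are in 64 KiB pages.
const memory = new WebAssembly.Memory({ initial: 1, maximum: 4 });

let cachedView = null;

// Rebuild the cached view when it is missing or its buffer was detached.
function getView() {
  if (cachedView === null || cachedView.byteLength === 0) {
    cachedView = new Uint8Array(memory.buffer);
  }
  return cachedView;
}

const before = getView();
const lengthBeforeGrow = before.byteLength; // one page

memory.grow(1); // detaches the ArrayBuffer that `before` points at

const staleLength = before.byteLength;      // the old view is now empty
const lengthAfterGrow = getView().byteLength; // fresh view over two pages

// Growing past the declared maximum fails with a RangeError.
let growError = null;
try {
  memory.grow(10);
} catch (err) {
  growError = err;
}
```

The check on `byteLength === 0` is what lets a cached view survive normal growth; it offers no protection if `memory.buffer` itself comes back in an unexpected state, which is the situation described in this report.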
I think you got the right idea in the sense that: why AFAICT So my point remains, further calls to an aborted WASM instance (and by extension its own generated JS glue code) are considered UB. (not to say that the bad state of |
So, how do we handle these? 🙂
Details
Reproduction code with steps in README.
We're using automerge library. When we make a large number of allocations a:
Error is thrown in this code produced by wasm-bindgen:
wasm.memory.buffer is an ArrayBuffer with non-zero byteLength.
The worst part is that after this happens for the first time, the object starts failing on any interaction with the library (because of memory access attempts), and it is hard to get the object out of this state. Resetting the DurableObject with an unhandled exception or this.ctx.abort() doesn't help. Other DurableObjects of the same type keep working fine (until overloaded).