`encode` is a crazy memory hog #27
Comments
Thanks for the detailed report 👍 What are the 100M values? Plain strings? Could you please share some code reproducing the issue? Did you try with another MessagePack implementation like …?
But I can offer a few more insights regarding … It does not seem impossible that the … Do you want to try to create your own sample with some dummy arrays containing a ton of strings?
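A sketch of what such a dummy sample might look like (the size is made up, and `encode` stands in for this library's encode function):

```js
// Hypothetical repro: one big array containing a ton of short strings.
const payload = Array.from({ length: 10_000_000 }, (_, i) => 'value-' + i);

const buf = encode(payload); // watch process memory while this runs
console.log('encoded bytes:', buf.length);
```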
Isn't it better to use compression algorithms (like gzip) for data of huge sizes (e.g. 10MB and above)? On the client side (browser), pako could decompress it on the main thread or on a worker thread. uWebSockets.js is also a great alternative to socket.io and to other web frameworks (e.g. express, koa, hapi, ws).
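A minimal sketch of that compression idea, assuming `encoded` is a Buffer already produced by the MessagePack encoder:

```js
// Server side: gzip the encoded payload with Node's built-in zlib.
const zlib = require('zlib');
const compressed = zlib.gzipSync(encoded); // often a big win for string-heavy data

// Browser side (assuming pako is bundled): decompress, then decode.
//   const bytes = pako.ungzip(new Uint8Array(data));
//   const value = decode(bytes);
```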
@joshxyzhimself The issue here is with …

@darrachequesne To answer your question: …
Original report

Due to an unoptimized algorithm (as also discussed in #12), `encode` is a memory hog (I have not looked at `decode` yet). I decided to post this as a separate issue, since the other issue's title does not capture the problem, and its discussion mostly focuses on execution speed, not on memory usage.
In my case, I am sending data with `socket.io`, and this is my journey:

- When running `encode`, node ran out of memory. I had to increase node's RAM limit to `--max-old-space-size=8192`.
- The total buffer `size` is 298,406,623 bytes.
- The `_encode` call itself required 4GB of additional memory (even though, as mentioned above, buffer `size` is less than 300MB total).
- The excess memory consumption must therefore come from the `encode` algorithm itself.
- I tracked memory via `process.memoryUsage()`. All three values (`rss`, `heapTotal`, and `heapUsed`) show the same trend (see the measurement sketch below).
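For illustration, the measurement could be taken like this (a sketch, not the reporter's actual code; `encode` stands in for this library's encode function and `payload` for the data being sent):

```js
// Hypothetical sketch: log the three process.memoryUsage() values
// mentioned above, before and after the encode call.
function logMemory(label) {
  const { rss, heapTotal, heapUsed } = process.memoryUsage();
  const mb = (n) => (n / 1024 / 1024).toFixed(1) + ' MB';
  console.log(label, '- rss:', mb(rss), 'heapTotal:', mb(heapTotal), 'heapUsed:', mb(heapUsed));
}

logMemory('before encode');
const buf = encode(payload); // peak usage spikes far beyond the final buffer size
logMemory('after encode');
console.log('final buffer size:', buf.length, 'bytes');
```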
Possible Solution

I strongly suggest heeding manast's suggestion to use a direct buffer allocation approach. If the buffer size is unknown, run the algorithm once to compute the buffer size and index positions, then run it again to actually populate the buffer, rather than creating temporary utility objects as the current approach does. This should come at a much lower memory (and probably CPU) cost than the current version.
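A minimal sketch of that two-pass idea, covering only strings and arrays and always using the widest MessagePack headers for brevity (a real encoder would handle all types and pick the smallest header per value):

```js
// Pass 1: compute the exact output size without allocating anything.
function sizeOf(value) {
  if (typeof value === 'string') {
    return 5 + Buffer.byteLength(value); // str 32 header (0xdb + uint32 length) + UTF-8 bytes
  }
  if (Array.isArray(value)) {
    let size = 5; // array 32 header (0xdd + uint32 count)
    for (const item of value) size += sizeOf(item);
    return size;
  }
  throw new Error('type not handled in this sketch');
}

// Pass 2: write directly into the preallocated buffer.
function writeInto(buf, offset, value) {
  if (typeof value === 'string') {
    buf.writeUInt8(0xdb, offset);
    buf.writeUInt32BE(Buffer.byteLength(value), offset + 1);
    return offset + 5 + buf.write(value, offset + 5);
  }
  buf.writeUInt8(0xdd, offset);
  buf.writeUInt32BE(value.length, offset + 1);
  offset += 5;
  for (const item of value) offset = writeInto(buf, offset, item);
  return offset;
}

function encodeTwoPass(value) {
  const buf = Buffer.allocUnsafe(sizeOf(value)); // one single allocation
  writeInto(buf, 0, value);
  return buf;
}
```

Because the final size is known before any byte is written, the encoder performs exactly one allocation, and the first pass creates no intermediate objects at all.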
I know the owner currently does not have time to work on this, but one can dream :)