[🐛 Bug]: java.lang.OutOfMemoryError #2528

Open
Doofus100500 opened this issue Dec 24, 2024 · 21 comments

@Doofus100500
Contributor

Doofus100500 commented Dec 24, 2024

What happened?

The event bus container is failing with an OutOfMemoryError (OOM); see the log output below.

Command used to start Selenium Grid with Docker (or Kubernetes)

helm

Relevant log output

{"class": "EventBusCommand","log-level": "INFO","log-message": "Started Selenium EventBus 4.26.0 (revision 69f9e5e): https:\u002f\u002f10.232.86.222:5557","log-name": "org.openqa.selenium.grid.commands.EventBusCommand","log-time-local": "2024-12-14T07:31:37.796Z","log-time-utc": "2024-12-14T07:31:37.796Z","method": "execute"}
Exception in thread "iothread-2" java.lang.OutOfMemoryError: Cannot reserve 8192 bytes of direct buffer memory (allocated: 501211210, limit: 501219328)
    at java.base/java.nio.Bits.reserveMemory(Bits.java:178)
    at java.base/java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:121)
    at java.base/java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:332)
    at zmq.io.coder.DecoderBase.<init>(DecoderBase.java:46)
    at zmq.io.coder.Decoder.<init>(Decoder.java:71)
    at zmq.io.coder.v2.V2Decoder.<init>(V2Decoder.java:18)
    at zmq.io.StreamEngine.handshake(StreamEngine.java:805)
    at zmq.io.StreamEngine.inEvent(StreamEngine.java:386)
    at zmq.io.IOObject.inEvent(IOObject.java:85)
    at zmq.poll.Poller.run(Poller.java:275)
    at java.base/java.lang.Thread.run(Thread.java:840)
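
The trace above reports exhausted direct (off-heap) buffer memory, not exhausted heap. The reported limit of 501219328 bytes is exactly 478 MiB, and on HotSpot the direct-buffer cap defaults to roughly the maximum heap size when -XX:MaxDirectMemorySize is not set, so this limit most likely follows the heap setting. Below is a minimal sketch for watching the direct buffer pool from inside a JVM, using only standard java.lang.management calls. The class name DirectMemoryProbe is hypothetical and is not part of Selenium or JeroMQ.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;
import java.util.List;

// Hypothetical diagnostic helper; not part of Selenium or JeroMQ.
class DirectMemoryProbe {

    // Returns the bytes currently held by the JVM's "direct" buffer pool,
    // or -1 if this JVM does not expose a pool with that name.
    static long directPoolUsedBytes() {
        List<BufferPoolMXBean> pools =
                ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class);
        for (BufferPoolMXBean pool : pools) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1L;
    }

    public static void main(String[] args) {
        long before = directPoolUsedBytes();
        // 8192 bytes is the allocation size that failed in the log above.
        ByteBuffer buffer = ByteBuffer.allocateDirect(8192);
        long after = directPoolUsedBytes();

        System.out.println("direct pool before: " + before + " bytes");
        System.out.println("direct pool after:  " + after + " bytes");
        System.out.println("buffer capacity:    " + buffer.capacity() + " bytes");
        System.out.println("max heap:           " + Runtime.getRuntime().maxMemory() + " bytes");
    }
}
```

If the direct pool grows steadily while heap usage stays flat, raising -Xmx only delays the failure. Finding what holds the direct buffers would be the more useful next step.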

Operating System

k8s

Docker Selenium version (image tag)

4.26.0-20241101

Selenium Grid chart version (chart version)

0.37.1


@Doofus100500, thank you for creating this issue. We will troubleshoot it as soon as we can.


Info for maintainers

Triage this issue by using labels.

If information is missing, add a helpful comment and then the I-issue-template label.

If the issue is a question, add the I-question label.

If the issue is valid but there is no time to troubleshoot it, consider adding the help wanted label.

If the issue requires changes or fixes from an external project (e.g., ChromeDriver, GeckoDriver, MSEdgeDriver, W3C), add the applicable G-* label, and it will provide the correct link and auto-close the issue.

After troubleshooting the issue, please add the R-awaiting answer label.

Thank you!

@VietND96
Member

It looks like the actual memory usage has not reached the request and limit values in the resources config.
In the latest change, I added a default SE_JAVA_OPTS for all components (in the server ConfigMap, which all components refer to) that sets -Xmx and -Xms for the Selenium server JVM.

SE_JAVA_OPTS: "-XX:+UseG1GC -Xmx1024m -Xms256m -XX:MaxGCPauseMillis=1000 -Djdk.httpclient.keepalive.timeout=300 -Djdk.httpclient.maxstreams=10000"

Can you check whether it helps?

@VietND96
Member

@joerg1985, do you have any comments on this?

@Doofus100500
Contributor Author

-Xmx1024m -Xms256m

For all components, this is extremely low. In my opinion, it is necessary to make it possible to configure these parameters for each component individually. Under load, consumption increases significantly.

@VietND96
Member

Via extraEnvironmentVariables in each component, I think you can override the global one

@Doofus100500
Contributor Author

But this is not reflected in the chart for the eventBus and other distributed components

@VietND96
Member

Oh really? Can you give an example of the YAML values you are setting?

@Doofus100500
Contributor Author

For example, to address the issue with the event-bus mentioned in this issue, I added the following through k9s:

- name: SE_JAVA_OPTS  
  value: -Xmx2g

@VietND96
Member

I just checked: in the chart config, all distributed components refer to this setting for extra env vars: components.extraEnvironmentVariables

@Doofus100500
Contributor Author

That’s exactly what I’m saying. I want to set appropriate parameters for each component individually, rather than, for example, setting -Xmx16g for all of them.

@VietND96
Member

Yes, I understand the problem now. I will add that config for each component instead of a common one.

@VietND96
Member

Have you observed anything else that you think should also be fixed in chart 0.38.3?

@Doofus100500
Contributor Author

Unfortunately, I haven’t even looked into it yet. If I find anything, I’ll definitely come back in the future.

VietND96 added a commit that referenced this issue Dec 26, 2024
…ted component

Support #2528

Signed-off-by: Viet Nguyen Duc <nguyenducviet4496@gmail.com>
@joerg1985
Member

@VietND96 I had a short look at the code of EventBusCommand, and from reading it (without debugging) I would expect a leak in the /status call: it adds a listener but never removes it. I will put this on my to-do list.

@joerg1985
Member

The leaking listeners have been fixed in SeleniumHQ/selenium@269a7f6, but I am not sure this is the root cause here: only a few bytes are leaked per call to /status, so the grid would have to be up for several days to see this.
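
To illustrate the pattern described here (a listener added on every status call and never removed), below is a minimal sketch. ToyBus and StatusCheck are hypothetical stand-ins written for this example. They are not Selenium's EventBus API and not the code changed in the referenced commit.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Hypothetical stand-in for an event bus; NOT Selenium's EventBus API.
class ToyBus {
    private final List<Consumer<String>> listeners = new CopyOnWriteArrayList<>();

    void addListener(Consumer<String> listener) { listeners.add(listener); }
    void removeListener(Consumer<String> listener) { listeners.remove(listener); }
    void fire(String event) { listeners.forEach(l -> l.accept(event)); }
    int listenerCount() { return listeners.size(); }
}

class StatusCheck {

    // Leaky variant: every call registers a listener that stays on the bus.
    static boolean leakyStatus(ToyBus bus) {
        boolean[] seen = {false};
        bus.addListener(event -> seen[0] = true);
        bus.fire("status-ping");
        return seen[0];
    }

    // Fixed variant: the listener is always removed once the check is done.
    static boolean fixedStatus(ToyBus bus) {
        boolean[] seen = {false};
        Consumer<String> listener = event -> seen[0] = true;
        bus.addListener(listener);
        try {
            bus.fire("status-ping");
            return seen[0];
        } finally {
            bus.removeListener(listener);
        }
    }

    public static void main(String[] args) {
        ToyBus leaky = new ToyBus();
        ToyBus fixed = new ToyBus();
        for (int i = 0; i < 1000; i++) {
            leakyStatus(leaky);
            fixedStatus(fixed);
        }
        System.out.println("listeners after leaky calls: " + leaky.listenerCount());
        System.out.println("listeners after fixed calls: " + fixed.listenerCount());
    }
}
```

Each leaked listener is small, which matches the point above that such a leak would only become visible after many /status calls over a long uptime.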

@Doofus100500
Contributor Author

Doofus100500 commented Dec 28, 2024

Actually, in our case, we expect the grid (except for the pods with browsers) to always be operational. Could you please check for leaks in the other components as well?

@joerg1985
Member

@Doofus100500 I think the best approach would be to create a heap histogram with jmap and share it here.

@Doofus100500
Contributor Author

Unfortunately, I will only be able to take care of this after the 9th.

@VietND96
Member

VietND96 commented Jan 2, 2025

Via #2546, I added a way to enable HeapDumpOnOutOfMemoryError, or to get a heap dump on demand when the container is terminated/stopped. The dump is written to the directory /opt/selenium/logs. You need to mount a volume at that directory in the container to persist the output files.

@joerg1985
Member

@Doofus100500 please wait for the next release before testing; this might be the fix for your issue: SeleniumHQ/selenium#15011

@Doofus100500
Contributor Author

Hi @VietND96, have you considered using -XX:MaxRAMPercentage and -XX:MinRAMPercentage instead of -Xmx and -Xms? It seems like a good solution for general configuration in:

SE_JAVA_OPTS: "-XX:+UseG1GC -Xmx1024m -Xms256m -XX:MaxGCPauseMillis=1000 -Djdk.httpclient.keepalive.timeout=300 -Djdk.httpclient.maxstreams=10000"
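
To make the percentage idea concrete, here is a rough sketch of the arithmetic. It assumes the JVM derives its heap ceiling as a plain percentage of the container memory limit; the real JVM applies additional alignment and ergonomics, so treat the numbers as approximate. The class HeapSizing and the example limits are hypothetical. As far as I know, the initial heap size is governed by -XX:InitialRAMPercentage, while -XX:MinRAMPercentage only affects the maximum heap on hosts with very little memory, so that is worth checking before it replaces -Xms.

```java
// Hypothetical helper for estimating heap ceilings; not part of the chart or Selenium.
class HeapSizing {

    // Approximate heap ceiling for a container memory limit and a
    // -XX:MaxRAMPercentage value (assumption: simple percentage, no alignment).
    static long maxHeapBytes(long containerLimitBytes, double maxRamPercentage) {
        return (long) (containerLimitBytes * maxRamPercentage / 100.0);
    }

    public static void main(String[] args) {
        long gib = 1024L * 1024L * 1024L;
        // Example limits are illustrative, not values taken from the chart.
        System.out.println("2 GiB limit at 50%: " + maxHeapBytes(2 * gib, 50.0) + " bytes");
        System.out.println("4 GiB limit at 75%: " + maxHeapBytes(4 * gib, 75.0) + " bytes");
        // The ceiling this JVM actually chose, for comparison with the estimate.
        System.out.println("this JVM's max heap: " + Runtime.getRuntime().maxMemory() + " bytes");
    }
}
```

With a percentage, each component's heap would scale with its own memory limit, so one shared SE_JAVA_OPTS value could serve components of different sizes.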
