-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[meta] kv and grok filters timeout handling #10976
Comments
/cc @headius @kares do you think this is a correct assessment of a strategy of directly using Java I see that this strategy was actually suggested back then (1.7.3) by @headius in jruby/jruby#792 because |
will need to look at details, but there's a somehow recent interrupt related change, altough pbly not regexp related (certainly smt where logstash was considered). anyways, there's some support in joni itself for regexp timeouts that isn't used much but could be fine tuned if it would help logstash's case ... |
@kares thanks - you probably refer to jruby/jruby#5683 ? While ultimately the end goal is about interrupting a long regex call in joni (and obviously all suggestions are welcome) my question is more about the potential dangers of using Java |
From what I tested I see no current way of interrupting joni from jruby, be it using interrupt or ruby's Timeout. |
FYI I found some history on joni's interruptibility; jruby/joni#6 and digging further I realize that an option was added to control joni's interruptability from JRuby https://github.com/jruby/jruby/blob/493822c134fe74e9630d2dc49325e291008b3101/core/src/main/java/org/jruby/util/cli/Options.java#L147 and it is off by default. @jsvd running your above interrupt example using
actually works - so that solves the mystery of why joni's interruption stopped working at some point. |
yep, this makes both java interrupt and ruby's Timeout work again. |
pushed kv logstash-plugins/logstash-filter-kv#84 and grok logstash-plugins/logstash-filter-grok#147 modifications |
👏 Well done, gentlemen. I know a few sites I've visited had kv timeout issues, and this is a tremendous bit of investigation and correction. Bravo! |
Also related and somewhat full-circle logstash-plugins/logstash-filter-grok#130 logstash-plugins/logstash-filter-grok#131 |
right, my shady memory forgot that the work-around was specific to mutex-es. shouldn't matter here.
believe you are correct esp. as you dissected JRuby's that being said the interruptible regexp is probably a better way, the reason why its not on by default is likely just performance (slows down matching - not sure how much pbly worth an isolated performance test). |
grok v4.1.0 and kv v4.4.0 have been published both using |
For reference, we traced back to v6.7.0 (where JRuby was updated from 9.1.13.0 to 9.2.6.0) when joni interruptability was disabled by default and resulted in both the kv and grok filters timeout options just not working. This is tracked in #10978 |
I am closing this issue since a fix was pushed. Note that we discovered a substantial performance regression with the current fix as described in #11302 |
Description
Both the kv and grok filters share a similar strategy for supporting the timeout option to abort long running regex operations.
The strategy, initially put in place for grok, is to rely on the Java interrupt status flag. The basic idea is to create a watchdog thread which monitor the time spent from the worker thread inside the regex parsing calls (that are ultimately handled by the Java joni regex library). Upon figuring that a worker thread has spent too much time in a regex call, to abort it, it sets that worker thread underlying Java thread interrupt status by using
Thread.interrupt()
. See kv implementation for example.Problem
We have realized that logstash was sometimes randomly crashing with
InterruptedException
and figured these were tied to the grok and kv filters timeout handling and also figured that the timeout was not working at all in other cases.The problem seems to be related to the direct manipulation of the Java interrupt status flag on the underlying Java thread from a JRuby execution context. JRuby threads + runtime has its own layer to handle the Java interrupt status flag, see the RubyThread implementation. We can see that the
RubyThread.pollThreadEvents()
method callsRubyThreads.anyInterrupted()
which is turn callThread.interrupted()
which reads and resets the interrupt status flag :The problem is that that the JRuby
RubyThread.pollThreadEvents()
methods is invoked in many places in the JRuby runtime and it is very hard to predict the behaviour of explicitly setting a Java thread interrupt flag from a JRuby context because once set in Java from JRuby, the JRuby runtime will continue to be invoked up until the final joni Java code is executed.Also looking at this JRuby issue we can see that figuring the right interruption semantics between Java and Ruby in not trivial and that these implementation details can change. Relying on such implementation details can become future problems if these implementation details change.
Solution
I am not sure. There are possibly two paths;
Timeout
class?The text was updated successfully, but these errors were encountered: