Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added: support for timeout_scope #153

Closed
wants to merge 8 commits into from

Conversation

kares
Copy link
Contributor

@kares kares commented Nov 14, 2019

this is a new feature (off by default) meant to reduce the high cost of timeouts (follow-up on #147)

an attempt to address: #152

some things left to wrap this one up:

  • naming help - does timeout_grouped sound okay?
  • doc review (on the new config)
  • numbers! (derived from a real-world multi match case)
  • changelog entry

eventually a performance test guard (could be added later).

@kares
Copy link
Contributor Author

kares commented Nov 14, 2019

"smoke" performance test - using 10 (simple) always failing patterns :

input {
    generator {
      lines => ["aaaaaaaaaa", "bbbbbbbbbb", "cccccccccc", "ddddddddddd", "eeeeeeeeee"]
      count => 1000000
    }
}

filter{
  grok {
    timeout_millis => 30000
    match => {
      "message" => [
        "foo1: %{NUMBER:bar}", "foo2: %{NUMBER:bar}", "foo3: %{NUMBER:bar}", "foo4: %{NUMBER:bar}", "foo5: %{NUMBER:bar}",
        "foo6: %{NUMBER:bar}", "foo7: %{NUMBER:bar}", "foo8: %{NUMBER:bar}", "foo9: %{NUMBER:bar}", "foo10: %{NUMBER:bar}"
        ]
    }
  }
}

output{ stdout { codec => dots {} } }

baseline

timeout_millis => 30000

[64,9KiB/s] [65,8KiB/s] [62,1KiB/s]

timeout_millis => 0

[78,2KiB/s] [75,6KiB/s] [73,3KiB/s]

updated plugin (from PR)

timeout_millis => 30000

[56,2KiB/s] [54,3KiB/s] [58,7KiB/s]

timeout_millis => 0

[74,6KiB/s] [75,0KiB/s] [70,2KiB/s]

timeout_millis => 30000 timeout_grouped => true

[69,8KiB/s] [66,6KiB/s] [69,1KiB/s]

@colinsurprenant
Copy link
Contributor

Great stuff @kares - left a naming suggestion comment. I really like the TimeoutSupport abstraction.
LGTM code-wise so far.

@kares
Copy link
Contributor Author

kares commented Nov 15, 2019

based on (above smoke test) numbers - surprisingly the current code does get (~5%) slower for the
timeout_millis => 30000 (guessing its either the Struct or the fast that blocks are not inlining)

@jsvd
Copy link
Member

jsvd commented Nov 15, 2019

btw I used https://gist.github.com/jsvd/23dbb156904e9ba770d48bb971b6735e#file-gistfile1-txt
to stress test your change and the difference is dramatic: about 20k eps with current 7.4.2, and 90k eps with this patch and timeout_grouped enabled

@kares
Copy link
Contributor Author

kares commented Nov 15, 2019

yy - more patterns more it should improve ... that part I am happy with 🥇
just do not like that we're a bit slower for the default case - maybe its not that relevant.

@jsvd
Copy link
Member

jsvd commented Nov 15, 2019

I have tested with a single pattern using https://gist.github.com/jsvd/23dbb156904e9ba770d48bb971b6735e#file-stress_single_pattern
and could not see a significant difference at all

@kares
Copy link
Contributor Author

kares commented Nov 15, 2019

~5% degradation mostly impacts timeout_grouped: false but we can advice to flip the switch! 🤜
looked into it and it's due the additional block being passed and yielded (they do not yet inline in JRuby).

this is motivated by reduced performance for the default case
since JRuby does not yet inline blocks the additional 'dummy'
block passing (due timeout) had a small impact on performance.
@kares
Copy link
Contributor Author

kares commented Nov 15, 2019

with some "oop" (to avoid dummy block passes) - smoke performance now shows close to base line.

Copy link
Contributor

@yaauie yaauie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good work on getting eliminating overhead from the baseline case, and this should certainly improve the performance issues we saw with many-pattrrn fall-through 🎉

I've left a few comments, mostly just things that can be cleaned up after your latest refactor.

lib/logstash/filters/grok.rb Outdated Show resolved Hide resolved
lib/logstash/filters/grok/timeout_support.rb Outdated Show resolved Hide resolved
lib/logstash/filters/grok/timeout_support.rb Outdated Show resolved Hide resolved
lib/logstash/filters/grok/timeout_support.rb Outdated Show resolved Hide resolved
@kares kares changed the title Added: support for grouped timeout (per group of patterns) Added: support for timeout_scope Nov 15, 2019
@karenzone karenzone self-requested a review November 15, 2019 16:33
Copy link
Contributor

@colinsurprenant colinsurprenant left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM pending doc review.

Copy link
Contributor

@karenzone karenzone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made a couple of comments as suggestions in line.
Please add an entry to the configuration options table around line 199.
Thanks for this.

docs/index.asciidoc Outdated Show resolved Hide resolved
docs/index.asciidoc Outdated Show resolved Hide resolved
logstash-filter-grok.gemspec Outdated Show resolved Hide resolved
lib/logstash/filters/grok.rb Show resolved Hide resolved
docs/index.asciidoc Outdated Show resolved Hide resolved
Copy link
Contributor

@karenzone karenzone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docs build cleanly and LGTM. Thanks!

Copy link
Contributor

@yaauie yaauie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One last thing, I promise.

Other than the changelog version mismatch, this LGTM. Feel free to resolve that and merge/publish.

CHANGELOG.md Outdated Show resolved Hide resolved
Co-Authored-By: Ry Biesemeyer <yaauie@users.noreply.github.com>
Copy link
Contributor

@yaauie yaauie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 🙌🏼

@elasticsearch-bot
Copy link

Karol Bucek merged this into the following branches!

Branch Commits
master 3527b14, d4aac7c, 3c5e4c5, 8fcb5f8, d118d93, cd7d92e, 7a2c212, a317df8

elasticsearch-bot pushed a commit that referenced this pull request Nov 18, 2019
this is motivated by reduced performance for the default case
since JRuby does not yet inline blocks the additional 'dummy'
block passing (due timeout) had a small impact on performance.

Fixes #153
elasticsearch-bot pushed a commit that referenced this pull request Nov 18, 2019
elasticsearch-bot pushed a commit that referenced this pull request Nov 18, 2019
elasticsearch-bot pushed a commit that referenced this pull request Nov 18, 2019
elasticsearch-bot pushed a commit that referenced this pull request Nov 18, 2019
elasticsearch-bot pushed a commit that referenced this pull request Nov 18, 2019
Co-Authored-By: Ry Biesemeyer <yaauie@users.noreply.github.com>

Fixes #153
kares added a commit to kares/logstash-filter-grok that referenced this pull request Nov 22, 2019
kares added a commit to kares/logstash-filter-grok that referenced this pull request Nov 26, 2019
kares added a commit to kares/logstash-filter-grok that referenced this pull request Dec 2, 2019
kares added a commit that referenced this pull request Dec 2, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants