-
Notifications
You must be signed in to change notification settings - Fork 900
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use OpenStruct over MiqHashStruct #17359
Conversation
ef7672b
to
6d0f956
Compare
@miq-bot assign @Fryguy @Fryguy I was informed that you might have a opinion against this change, so looping you in. Feel free to pass this off to @gmcculloug if you don't really care (though, wouldn't mind both of your opinions on this if possible). |
6d0f956
to
ccb17fd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@NickLaMuro - This looks good to me, though with your comment in the description I'll defer to @Fryguy in regards to his concerns.
This pull request is not mergeable. Please rebase and repush. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am good with this one. Will leave it to @Fryguy who has more knowledge on the internals of MiqHashStruct.
ccb17fd
to
361e2b5
Compare
This is almost a straight 1 for 1 conversion, and performs much better than our implementation: MiqHashStruct. * * * There is a small portion of this that I had to make a change on, and I _think_ it is correct, but here is the explaination for it: There was a chunk of code in this model that silently did nothing when it was an MiqHashStruct because of how often it relied on method_missing to function: field_values[:values].each do |tz| if tz[1].to_i_with_method == val.to_i_with_method # Save [value, description] for timezones array init_values[field_name] = [val, tz[0]] end end If `tz` in the above was a MiqHashStruct, and when `tz[1]` is called, it hit's `method_missing` with `:[]` as the method argument. This of course will almost never have a match in the underlying `@hash` in the MiqHashStruct, return nil and not blow up at the very least. When it is an OpenStruct, this fails with an error since `1` can't be converted to a symbol properly. Since the `tz` object can also potentially be an `Array`, I added a simple check to the code: field_values[:values].each do |tz| next unless tz.kind_of?(Array) # changed this line if tz[1].to_i_with_method == val.to_i_with_method # Save [value, description] for timezones array init_values[field_name] = [val, tz[0]] end end That allowed this to work when it should, and skip over when it wouldn't.
361e2b5
to
58a6a6b
Compare
Checked commit NickLaMuro@58a6a6b with ruby 2.3.3, rubocop 0.52.1, haml-lint 0.20.0, and yamllint 1.10.0 app/models/miq_provision_virt_workflow.rb
|
@@ -35,7 +35,7 @@ | |||
|
|||
result = workflow.allowed_customization_templates(options) | |||
customization_template = workflow.instance_variable_get(:@values)[:customization_template_script] | |||
template_hash = result.first.instance_variable_get(:@hash) | |||
template_hash = result.first.instance_variable_get(:@table) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: This change and the one below it are here because the previous version of the spec is grabbing the internal variable inside the MiqHashStruct
, and using that to test the resulting values. @hash
is the internal hash variable in MiqHashStruct
, and @table
is the equivalent in OpenStruct
.
I personally think it make more sense to avoid the instance_variable_get
all together, and just do a send(:attr)
instead, but I didn't want to change the tests that I hardly understand in the first place. They were added recent, however, in these two PRs:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can just call result.first.to_h
if all you want is the hash. That works equivalently in MiqHashStruct and OpenStruct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. To be fair, I didn't write this test in the first place, so I was just trying to keep the spec as close to as it was originally. I can change it to the .to_h
suggestion if you prefer.
The purpose of MiqHashStruct is specifically because of memory/performance issues with OpenStruct. It was created as a drop-in replacement for OpenStruct, with mostly a similar interface (at least the parts we used of it). In this particular PR's case it might be ok to use it. If you're doing performance analysis you should take into account what MiqHashStruct is trying to solve, which I'm not sure your numbers are doing or not. The reason for it (at least in the past...I haven't checked recently) is that when you initialize an OpenStruct it creates ( In other words, MiqHashStruct is more efficient in high-instantiation / low-read/write cases (i.e. create 1000 OpenStructs, and only read 1 property from each), whereas OpenStruct is more efficient is low-instantiation / high-read cases (i.e. a cached global object created on boot where most attributes are read 1000s of times). Here is a gist of the performance concerns: https://gist.github.com/Fryguy/ac9adffcc48d343fb30d1c0373819357
|
@Fryguy Well, unfortunately that isn't the case. This whole process is basically taking So this is incredibly read heavy on the attribute end, and exactly the rational that https://gist.github.com/NickLaMuro/36f9f394a5b3cecc4aaf33d3e84dfb27#cpu-usage you will see that the biggest 'time suck' following So not sure the point you are making here. This clearly, based on the data for this particular use case that I have provided, proves that Frankly, I don't think either struct option makes sense if at the end of the day they are just getting feed into a XML blob, and we can just remove the intermediate object all together. But that is a much bigger and risker change than I was willing to do that the time for something here that seems like a pretty quick win. |
@NickLaMuro Can you explain what your benchmarks show? You highlight pending and manageiq/app, but I don't know what those mean. I also don't understand how MiqHashStruct -> OpenStruct can have such a dramatic change, considering they don't do very much. In other words, it's possible you could get the same benefit with some minor tweak to MiqHashStruct as well. |
I'll be honest, there is just way too much information, organized in a way that doesn't work for me (I find the Links section too complicated for how I process information, as it basically dumps a lot of P.S. after my mind has already context switched away from whatever paragraph each individual thing should be related to), so I didn't see that you had linked a gist at all. Separately, in the gist I see the time went from 15s to 1s, but here the time goes from 15s to 14s. I don't see how they can be so drastically different. The gist talks about SQL statements, so is this just one part of a set of changes just to keep it isolated?
Just educating on what MiqHashStruct's purpose for existence is, since there seems to be a misunderstanding, and it's what was asked of me. I really don't have time to read through everything, but I felt that giving the information and saying "pick the right tool for the job" was sufficient. As I said, "in this particular PR's case it might be ok to use it", but I couldn't tell if this fell into the high-read or low-read case.
Yup, that's what I was looking for, and your analysis seems to prove that out, so I am 👍
💯 Agree. A lot of the older code uses OpenStructs and such just to get dot-access, when it's really not necessary. Much better to just access the hashes directly, or if there is a reason for making a struct, then we should make a real-proper struct. |
Rereading this PR, I see discussion of CPU and memory differences in the I'm afraid we're trading memory for cpu or vice versa for different scenarios (high read/ low read / large struct/small struct) and it's hard to measure which is "better" for the most cases for this code area if we don't test the expected scenarios and show the CPU/memory with each struct. Note, I don't generally think allocations is a good measure for comparing things. Long lived objects vs. short lived objects have completely different performance characteristics so we can't really compare one allocation to another. If allocations are important, I'd expect us to pay the penalty in memory/CPU time (doing GC)... so maybe we can just watch memory/CPU time since that's really all anyone cares about. |
@Fryguy Sure. This is an excerpt from the Specifically, What you are seeing highlighted is there is a significant reduction in objects being allocated specifically in the
I actually tried this first, believe it or not. Since this is boarding to months old now, though, I forget where I put the diff of the changes I made around
(update: found the diff ... and guess were it was... the links section...) The overarching theme is that since At the end of the day, I found that I was either going to "rewrite
Sorry, my reaction may have been a bit harsh. I read your first comment as "we should keep using
Yes it is long, but this is trying to help rationalize why these 15+ PRs have a benefit together, but also the the individual impact when compared to all of the changes combined. I tried to omit as much information as possible in this PR (and the rest) specifically because of your And, as mentioned in the description at the end of the benchmarks section:
|
@jrafanie I tried to keep the data across all of the related PRs in a format that you could infer CPU, Memory, and SQL performance improvements/stagnation with the minimal amount of data (again, trying to solve for the With this PR specifically, I assumed that a "savings in CPU" could be inferred by a drop in 1s taken to run this scenario with only the change of Again, more detailed info for the "verbose crowd" can be found in the gist
I do all of this. The gist I linked to has a script that I have been running to validate these changes before and after. It is runnable by doing a: $ bin/rails r sample_profile_script.rb I have omitted the profiling code in to make it easier to grok, but you can profile it with whatever you want.
I assume you are referring to other controller actions or portions of the codebase that call to this model specifically not being tested. To that, I have no clue what that might be, or where to start.
Disagree. While this doesn't give you a good ballpark figure of how much memory is actually used, and the 24mil objects allocated could just be ` byte strings, it is superior in being incredibly consistent and reliable between runs. Having a consistently reproducible metrics/telemetry allows the changes I make be much more scientific and the resulting reports much more concrete. Also, when paired with the CPU metrics that I provide in the gist, it is much more obvious how much of an impact all of these extra objects are having on CPU performance as well. |
This pull request is not mergeable. Please rebase and repush. |
This pull request has been automatically closed because it has not been updated for at least 6 months. Feel free to reopen this pull request if these changes are still valid. Thank you for all your contributions! |
@NickLaMuro Should this be reopened? I think it just fell by the wayside, and I had already 👍ed it here: #17359 (comment) @jrafanie Since Nick's numbers showed a benefit, and I didn't really see a direct 👎 from you, are you ok with it? |
@Fryguy So I am actually working on something in the same area, and found that messing with The PRs in question:
Specifically, the last two PRs (required together), show that we can basically get down to about 4 seconds to run Something to consider, but I didn't originally plan on bringing this back from the dead since it seems like we can get some wins in other ways. |
Yeah, I was ok with this change but in light of what @NickLaMuro said, it makes sense to go after the more valuable changes than this one. We can always resurrect it if we can pinpoint it as a performance problem later. |
@NickLaMuro I vote to re-open as part of ManageIQ/manageiq-gems-pending#231 |
Might be Will look into doing this in a bit. |
This is almost a straight 1 for 1 conversion, and performs much better than our implementation:
MiqHashStruct
.Regarding the [WIP] portion:Update: This section is no longer relevant as I have found the solution to the issue (tests caught what was supposed to be happening). Keeping the below for posterity, but see the last commit for updates to the rational here.
The only part that doesn't work (but also doesn't make any sense that it ever did), was the portion of this code that was commented out:
tz
in the above always seems to be aMiqHashStruct
(in the previous commit), and whentz[1]
is called, it hit'smethod_missing
with:[]
as the method argument. This of course will almost never have a match in the underlying@hash
in theMiqHashStruct
, but will not blow up at least.When it is an
OpenStruct
, this fails with an error since1
can't be converted to a symbol properly. I am uncertain if this code is ever activated, so for now, I have just commented it out to observe the advantages to usingOpenStruct
overMiqHashStruct
.Benchmarks
The benchmark script used here basically runs the following:
And is targeting a fairly large EMS, with about 600 hosts.
Another analysis that included other fixes in addition to this, but gives an alternative solution as well (though it becomes quite obvious why this was chosen), can be found here.
Note: The benchmarks for this change do NOT include the changes from other PRs (#17354 for example). Benchmarks of all changes can be found here.
Links