fix: faster parse options #535

Merged
merged 4 commits into from
Apr 10, 2023

H4ad
Contributor

@H4ad H4ad commented Apr 6, 2023

The performance of parse-options before:

includePrerelease x 12,935,785 ops/sec ±3.66% (80 runs sampled)
includePrerelease + loose x 6,915,009 ops/sec ±1.54% (85 runs sampled)
includePrerelease + loose + rtl x 7,350,781 ops/sec ±1.54% (88 runs sampled)

After:

includePrerelease x 1,122,022,296 ops/sec ±0.81% (96 runs sampled)
includePrerelease + loose x 1,133,747,998 ops/sec ±0.11% (89 runs sampled)
includePrerelease + loose + rtl x 1,135,218,097 ops/sec ±0.35% (95 runs sampled)

The performance improvement also extends to any method that uses parseOptions, such as satisfies:

Before:

satisfies(1.0.6, 1.0.3||^2.0.0) x 281,862 ops/sec ±2.84% (89 runs sampled)
satisfies(1.0.6, 2.2.2||~3.0.0) x 312,593 ops/sec ±0.90% (95 runs sampled)
satisfies(1.0.6, 2.3.0||<4.0.0) x 323,066 ops/sec ±1.15% (89 runs sampled)
satisfies(1.0.6, 1.0.3||^2.0.0, {"includePrelease":true}) x 275,542 ops/sec ±1.24% (91 runs sampled)
satisfies(1.0.6, 2.2.2||~3.0.0, {"includePrelease":true}) x 295,613 ops/sec ±1.07% (94 runs sampled)
satisfies(1.0.6, 2.3.0||<4.0.0, {"includePrelease":true}) x 314,728 ops/sec ±1.22% (90 runs sampled)
satisfies(1.0.6, 1.0.3||^2.0.0, {"includePrelease":true,"loose":true}) x 253,795 ops/sec ±0.93% (94 runs sampled)
satisfies(1.0.6, 2.2.2||~3.0.0, {"includePrelease":true,"loose":true}) x 268,212 ops/sec ±1.18% (95 runs sampled)
satisfies(1.0.6, 2.3.0||<4.0.0, {"includePrelease":true,"loose":true}) x 280,362 ops/sec ±1.18% (94 runs sampled)
satisfies(1.0.6, 1.0.3||^2.0.0, {"includePrelease":true,"loose":true,"rtl":true}) x 218,464 ops/sec ±2.21% (90 runs sampled)
satisfies(1.0.6, 2.2.2||~3.0.0, {"includePrelease":true,"loose":true,"rtl":true}) x 232,007 ops/sec ±1.05% (91 runs sampled)
satisfies(1.0.6, 2.3.0||<4.0.0, {"includePrelease":true,"loose":true,"rtl":true}) x 244,718 ops/sec ±0.90% (94 runs sampled)

After:

satisfies(1.0.6, 1.0.3||^2.0.0) x 323,247 ops/sec ±1.44% (90 runs sampled)
satisfies(1.0.6, 2.2.2||~3.0.0) x 335,885 ops/sec ±1.22% (92 runs sampled)
satisfies(1.0.6, 2.3.0||<4.0.0) x 358,881 ops/sec ±0.89% (96 runs sampled)
satisfies(1.0.6, 1.0.3||^2.0.0, {"includePrelease":true}) x 321,664 ops/sec ±0.85% (94 runs sampled)
satisfies(1.0.6, 2.2.2||~3.0.0, {"includePrelease":true}) x 341,290 ops/sec ±1.00% (96 runs sampled)
satisfies(1.0.6, 2.3.0||<4.0.0, {"includePrelease":true}) x 367,722 ops/sec ±0.95% (96 runs sampled)
satisfies(1.0.6, 1.0.3||^2.0.0, {"includePrelease":true,"loose":true}) x 294,952 ops/sec ±0.56% (93 runs sampled)
satisfies(1.0.6, 2.2.2||~3.0.0, {"includePrelease":true,"loose":true}) x 311,517 ops/sec ±0.36% (94 runs sampled)
satisfies(1.0.6, 2.3.0||<4.0.0, {"includePrelease":true,"loose":true}) x 320,496 ops/sec ±1.18% (91 runs sampled)
satisfies(1.0.6, 1.0.3||^2.0.0, {"includePrelease":true,"loose":true,"rtl":true}) x 274,429 ops/sec ±0.58% (94 runs sampled)
satisfies(1.0.6, 2.2.2||~3.0.0, {"includePrelease":true,"loose":true,"rtl":true}) x 282,862 ops/sec ±0.72% (91 runs sampled)
satisfies(1.0.6, 2.3.0||<4.0.0, {"includePrelease":true,"loose":true,"rtl":true}) x 304,038 ops/sec ±0.44% (94 runs sampled)

I also added a parseOptions call at the start of every method that passes options down to Range or SemVer. Already-parsed options are then recognized via isParsedConfigSymbol, which reduces the number of comparisons.

With this new way of parsing options, no memory is allocated for parsed options.

The only downside is maintenance: the number of variables needed to represent the option combinations grows with the number of options. With just 3 options this is not a problem, but each added option means more comparisons.
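
To illustrate the trade-off described above, here is a minimal sketch of the idea, not the code from this PR. It assumes one pre-built frozen object per combination of the three options; the names `parseOptionsSketch`, `preOnly`, `preLoose` and so on are hypothetical.

```javascript
'use strict';

// Hypothetical sketch: one frozen object per option combination, built once.
// With 3 boolean options there are 2^3 = 8 combinations to maintain.
const emptyOpts = Object.freeze({});
const preOnly = Object.freeze({ includePrerelease: true });
const looseOnly = Object.freeze({ loose: true });
const rtlOnly = Object.freeze({ rtl: true });
const preLoose = Object.freeze({ includePrerelease: true, loose: true });
const preRtl = Object.freeze({ includePrerelease: true, rtl: true });
const looseRtl = Object.freeze({ loose: true, rtl: true });
const allOpts = Object.freeze({ includePrerelease: true, loose: true, rtl: true });

function parseOptionsSketch(options) {
  if (!options) return emptyOpts;                      // undefined, null, false
  if (typeof options !== 'object') return looseOnly;   // legacy boolean form: true
  const pre = Boolean(options.includePrerelease);
  const loose = Boolean(options.loose);
  const rtl = Boolean(options.rtl);
  // Every branch returns a shared object, so no allocation happens per call.
  if (pre) {
    if (loose) return rtl ? allOpts : preLoose;
    return rtl ? preRtl : preOnly;
  }
  if (loose) return rtl ? looseRtl : looseOnly;
  return rtl ? rtlOnly : emptyOpts;
}
```

Adding a fourth option to a scheme like this would double the number of frozen objects and add another level of branching, which is the maintenance cost mentioned above.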

Alternative Version

I also made a version using bit masks, which you can see in this commit: h4ad-forks@6c7a68a

In that version, the performance benefits are the same and maintenance is lower, because bit flags are very easy to handle, but it takes more memory and breaks one parse-options test (about the number).

Also, in that version options is now just a getter, and I introduced a new variable called flagOptions, which holds all the flags.
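
As a rough illustration of the bit-mask alternative, the sketch below packs the three options into one integer. It is an assumption about the general shape, not the code from the linked commit: the flag constants, `toFlags`, and the `VersionLike` class are hypothetical, and only the names `options` and `flagOptions` come from the description above.

```javascript
'use strict';

// Hypothetical flag values, one bit per option.
const FLAG_INCLUDE_PRERELEASE = 0b001;
const FLAG_LOOSE = 0b010;
const FLAG_RTL = 0b100;

// Convert the public options argument into a single integer.
function toFlags(options) {
  if (!options) return 0;
  if (typeof options !== 'object') return FLAG_LOOSE; // legacy boolean form: true
  let flags = 0;
  if (options.includePrerelease) flags |= FLAG_INCLUDE_PRERELEASE;
  if (options.loose) flags |= FLAG_LOOSE;
  if (options.rtl) flags |= FLAG_RTL;
  return flags;
}

class VersionLike {
  constructor(options) {
    // All flags live in one number; checks become a bitwise AND.
    this.flagOptions = toFlags(options);
  }

  // options is rebuilt on demand, so reading it allocates a new object.
  get options() {
    return {
      includePrerelease: (this.flagOptions & FLAG_INCLUDE_PRERELEASE) !== 0,
      loose: (this.flagOptions & FLAG_LOOSE) !== 0,
      rtl: (this.flagOptions & FLAG_RTL) !== 0,
    };
  }
}
```

In a design like this, adding an option means adding one constant and one line in `toFlags`, rather than doubling the number of pre-built objects.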

Conclusion

In terms of fewer changes and lower memory usage, the best option is Object.freeze.

If you are willing to make more aggressive changes in exchange for lower maintenance in the future, at the cost of allocating some memory, the bit-mask option is the best.

References

Related to #528

benchmark.js
const Benchmark = require('benchmark');
const satisfies = require('./functions/satisfies');
const suite = new Benchmark.Suite();

const versions = ['1.0.3||^2.0.0', '2.2.2||~3.0.0', '2.3.0||<4.0.0'];
const versionToCompare = '1.0.6';
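// Note: the key below is spelled `includePrelease`, not `includePrerelease`,
// so semver does not recognize it as the prerelease option. It is left as-is
// to match the labels in the recorded results.
const option1 = { includePrelease: true };
const option2 = { includePrelease: true, loose: true };
const option3 = { includePrelease: true, loose: true, rtl: true };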

for (const version of versions) {
  suite.add(`satisfies(${versionToCompare}, ${version})`, function () {
    satisfies(versionToCompare, version);
  });
}

for (const version of versions) {
  suite.add(`satisfies(${versionToCompare}, ${version}, ${JSON.stringify(option1)})`, function () {
    satisfies(versionToCompare, version, option1);
  });
}

for (const version of versions) {
  suite.add(`satisfies(${versionToCompare}, ${version}, ${JSON.stringify(option2)})`, function () {
    satisfies(versionToCompare, version, option2);
  });
}

for (const version of versions) {
  suite.add(`satisfies(${versionToCompare}, ${version}, ${JSON.stringify(option3)})`, function () {
    satisfies(versionToCompare, version, option3);
  });
}

suite
  .on('cycle', function (event) {
    console.log(String(event.target));
  })
  .run({ async: false });

@H4ad H4ad requested a review from a team as a code owner April 6, 2023 02:54
@H4ad H4ad requested review from lukekarrys and removed request for a team April 6, 2023 02:54
@H4ad
Contributor Author

H4ad commented Apr 6, 2023

To draw more attention to it: @ljharb pointed out something very interesting at #536 (review) that could affect this PR.

If the assumption is true, we can remove all the varN variables and just return the original options object.
This will only affect the tests, so we will need to fix them.

@H4ad H4ad force-pushed the fix/faster-parse-options branch from eccb1f2 to cc7aadd Compare April 6, 2023 15:13
@jakebailey

Here's my DT stress test using pnpm. Before:

Done in 2m 14.5s
total time:  134.82s
user time:   162.81s
system time: 31.61s
CPU percent: 144%
max memory:  1752 MB

This PR ("semver": "github:h4ad-forks/node-semver#fix/faster-parse-options"):

Done in 1m 42.2s
total time:  102.43s
user time:   115.96s
system time: 30.80s
CPU percent: 143%
max memory:  1774 MB

Which according to https://how-much-faster.glitch.me/ is:

  • 31.62% faster
  • 1.32x faster
  • 24.02% less time
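
For reference, the three figures above follow directly from the two total times. This is a small sketch of the arithmetic only; the variable names are illustrative.

```javascript
'use strict';

// Total wall-clock times in seconds, taken from the two runs above.
const before = 134.82;
const after = 102.43;

const speedup = before / after;                  // ratio of old time to new time
const pctFaster = (speedup - 1) * 100;           // throughput gain in percent
const pctLessTime = (1 - after / before) * 100;  // wall-clock reduction in percent

console.log(`${pctFaster.toFixed(2)}% faster`);      // 31.62% faster
console.log(`${speedup.toFixed(2)}x faster`);        // 1.32x faster
console.log(`${pctLessTime.toFixed(2)}% less time`); // 24.02% less time
```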

@jakebailey

I forgot to say the obvious: well done!

Feel free to ping me if you want this rerun and I haven't already done so.

@wraithgar
Member

wraithgar commented Apr 6, 2023

Stepping back and broadly asking the question "what is parseOptions for"? I think we have the following:

  1. standardize object for memoization
  2. coercion of true|false into { loose: true|false }
    • required, semver major change to remove this
  3. coercion of empty options into {}
    • not absolutely required, but definitely a best practice. Relying on the fact that true doesn't have a loose attribute is quite the antipattern
  4. filtering out of unknown options
    • we already don't do anything if extra options are passed.
    • potentially a memory savings though? Is the tradeoff worth it for the extra cpu time?

If all we did were 2 and 3 I think we would be ok?

ETA: if the function returns an object as-is then the memory concerns of 4 are nonexistent, it would be passing the same object around everywhere.
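
A minimal sketch of what "only 2 and 3" could look like, assuming the object is returned as-is. The name `parseOptionsMinimal` is hypothetical and this is not presented as the final implementation.

```javascript
'use strict';

// Shared frozen objects, so the coerced cases never allocate.
const looseOption = Object.freeze({ loose: true });
const emptyOpts = Object.freeze({});

function parseOptionsMinimal(options) {
  // Point 3: empty options (undefined, null) become {}.
  // false also lands here, which is equivalent to { loose: false }
  // for any code that only checks whether options.loose is truthy.
  if (!options) return emptyOpts;

  // Point 2: the legacy boolean form `true` becomes { loose: true }.
  if (typeof options !== 'object') return looseOption;

  // Everything else is passed through untouched: no filtering of unknown
  // keys (point 4) and no copy, so the same object is shared everywhere.
  return options;
}
```

One consequence of this shape is that unknown keys stay on the object, so behavior only changes for callers that inspect the parsed options themselves.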

@jakebailey

jakebailey commented Apr 6, 2023

Personally, if I were to rewrite this library from scratch (something I'm trying to do to explore perf improvements, though I don't know if I will actually publish it), I wouldn't be storing these options in SemVer at all.

Looking at SemVer, only two options are used, loose and includePrerelease.

loose means that the parser can accept things like leading zeros. But once it's parsed, the loose-ness isn't important anymore. If it's being compared with another version, that's a different version, and the loose-ness is not important; everyone's already parsed.

includePrerelease, on the other hand, isn't used at all. That option really only makes sense for Range. And even then, I still think that it's not a property of a Range; it's a property of satisfies and other helpers and how they decide to interpret a Range.

So to me, it feels like an equivalently good change to this PR would be to just check loose once, store it as a prop for people to know if a SemVer came from a loose parse, but otherwise not try and do anything.

But options is part of the API, so the current API is sorta locked in.

@H4ad H4ad force-pushed the fix/faster-parse-options branch from cc7aadd to 7e4e016 Compare April 6, 2023 16:47
@wraithgar
Member

rtl is used in coerce and only there, and that coerce is not used by anything else in semver.

@H4ad
Contributor Author

H4ad commented Apr 6, 2023

Well, I think we agree to keep the object as-is, without any modification, and just handle the undefined and boolean cases.

In that case, I will add a new commit with the updated tests and the new behavior, instead of modifying the first one.

Also, can this be considered a breaking change?
If so, I would prefer to keep this PR with Object.freeze, and then create another PR to simplify parseOptions.

@wraithgar
Member

Also, can this be considered a breaking change?

It depends on what the changes needed are. Just cause our tests have to change doesn't necessarily mean it's a breaking change. If folks are going to get different results after this change by passing the exact same things as before, then it is a breaking change.

@H4ad
Contributor Author

H4ad commented Apr 6, 2023

@wraithgar I thought it might be a breaking change if someone got the parsed object and accessed something in the options, but I don't think that will be the case, so maybe it's not a breaking change.

I pushed the version that just returns the options, so we can now see whether it's worth keeping it like that.

@wraithgar
Member

I think once the linting errors are fixed we are in a good place here.

@H4ad how do you feel about this PR in contrast to your other idea of using bitmasks? Do you still think that is worth exploring?

@wraithgar
Member

main

satisfies(1.0.6, 1.0.3||^2.0.0) x 365,321 ops/sec ±0.83% (90 runs sampled)
satisfies(1.0.6, 2.2.2||~3.0.0) x 380,982 ops/sec ±0.56% (93 runs sampled)
satisfies(1.0.6, 2.3.0||<4.0.0) x 394,928 ops/sec ±0.92% (95 runs sampled)
satisfies(1.0.6, 1.0.3||^2.0.0, {"includePrelease":true}) x 344,723 ops/sec ±0.85% (93 runs sampled)
satisfies(1.0.6, 2.2.2||~3.0.0, {"includePrelease":true}) x 349,690 ops/sec ±1.50% (86 runs sampled)
satisfies(1.0.6, 2.3.0||<4.0.0, {"includePrelease":true}) x 358,409 ops/sec ±2.81% (89 runs sampled)
satisfies(1.0.6, 1.0.3||^2.0.0, {"includePrelease":true,"loose":true}) x 289,201 ops/sec ±1.26% (88 runs sampled)
satisfies(1.0.6, 2.2.2||~3.0.0, {"includePrelease":true,"loose":true}) x 299,104 ops/sec ±1.22% (89 runs sampled)
satisfies(1.0.6, 2.3.0||<4.0.0, {"includePrelease":true,"loose":true}) x 307,864 ops/sec ±1.27% (87 runs sampled)
satisfies(1.0.6, 1.0.3||^2.0.0, {"includePrelease":true,"loose":true,"rtl":true}) x 241,164 ops/sec ±1.90% (87 runs sampled)
satisfies(1.0.6, 2.2.2||~3.0.0, {"includePrelease":true,"loose":true,"rtl":true}) x 249,600 ops/sec ±1.52% (87 runs sampled)
satisfies(1.0.6, 2.3.0||<4.0.0, {"includePrelease":true,"loose":true,"rtl":true}) x 259,557 ops/sec ±1.50% (88 runs sampled)

this PR as of commit 585125

satisfies(1.0.6, 1.0.3||^2.0.0) x 397,658 ops/sec ±0.71% (85 runs sampled)
satisfies(1.0.6, 2.2.2||~3.0.0) x 429,146 ops/sec ±0.69% (95 runs sampled)
satisfies(1.0.6, 2.3.0||<4.0.0) x 446,333 ops/sec ±0.79% (93 runs sampled)
satisfies(1.0.6, 1.0.3||^2.0.0, {"includePrelease":true}) x 344,022 ops/sec ±1.09% (92 runs sampled)
satisfies(1.0.6, 2.2.2||~3.0.0, {"includePrelease":true}) x 354,640 ops/sec ±1.01% (89 runs sampled)
satisfies(1.0.6, 2.3.0||<4.0.0, {"includePrelease":true}) x 364,129 ops/sec ±1.06% (92 runs sampled)
satisfies(1.0.6, 1.0.3||^2.0.0, {"includePrelease":true,"loose":true}) x 314,245 ops/sec ±1.30% (86 runs sampled)
satisfies(1.0.6, 2.2.2||~3.0.0, {"includePrelease":true,"loose":true}) x 321,380 ops/sec ±1.46% (90 runs sampled)
satisfies(1.0.6, 2.3.0||<4.0.0, {"includePrelease":true,"loose":true}) x 338,411 ops/sec ±1.45% (89 runs sampled)
satisfies(1.0.6, 1.0.3||^2.0.0, {"includePrelease":true,"loose":true,"rtl":true}) x 284,783 ops/sec ±1.62% (85 runs sampled)
satisfies(1.0.6, 2.2.2||~3.0.0, {"includePrelease":true,"loose":true,"rtl":true}) x 274,453 ops/sec ±2.18% (83 runs sampled)
satisfies(1.0.6, 2.3.0||<4.0.0, {"includePrelease":true,"loose":true,"rtl":true}) x 306,652 ops/sec ±1.68% (84 runs sampled)

@H4ad H4ad force-pushed the fix/faster-parse-options branch from 585125b to d0c20e3 Compare April 6, 2023 18:21
@H4ad
Contributor Author

H4ad commented Apr 6, 2023

@wraithgar About bitmasks: I think they could be good for a big refactoring of the library. For this PR, I don't think it's worth refactoring the entire lib just to use bitmasks, since the perf is almost the same as with this current PR.

@wraithgar
Member

Like #536 I'm giving this a 👍 and letting it sit through the weekend to let folks catch up.

This PR should definitely not land before #536; it is predicated on the fact that the parsed options are no longer being used to memoize a cache.

@wraithgar wraithgar removed the request for review from lukekarrys April 6, 2023 18:32
6 participants