-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fast and correct #22
base: master
Are you sure you want to change the base?
Fast and correct #22
Conversation
Hi, thank you for this. I'm back from my trip. I will be addressing this after finishing up some Aura work and releasing Beta 3 of that. Thanks again for your patience. |
Thanks for the transparency! |
Hi, I've completed the major project I had been working on, and it's time to return to this. In the interim you would have noticed that I've patched the library a few times for work purposes, and there are some conflicts now. We'll need to resolve those. Conflicts aside, I will begin to review this in earnest. To confirm: this is the PR I should be looking at? What's the status of the other two PRs? |
Hi @fosskers , yes this is the PR to review first. There is already a follow-up PR that we use internally but I think it makes sense to address that one separately once we reached consensus on this one. I'll have a look at the changes that happened on |
I just updated this PR to be in sync with master again. I removed the I'm back from parental leave now, and I hope we can use the coming weeks to push this over the finish line to avoid doing double work again. I'm happy to answer any questions you may have. |
Thanks a lot, I will be going over this fully soon. |
Hi @fosskers , are you planning to follow-up on this? It's totally fine if you don't have the interest or time, I just would like to know because in that case I will probably publish this (and several follow-ups) in a separate repository. |
I do, I'm sorry to not have followed up yet. |
First of all, thank you for reading this. I know this PR is very big and contains a lot of breaking changes. I will try my best to explain the major changes in this description. The changes in this PR are the result of an iterative development process over the past 2 months. At my company we have been linking to this PR (and it's successor) in an actively developed codebase for the past month and that usage has informed a lot of the changes. If something is still unclear, please reach out.
Fast and correct
As we discussed before, the current API has issues with speed and soundness. The design put forward in this PR fixes both.
In the new design, the non-empty collection essentially encapsulates the corresponding possibly empty collection, for example:
The implementation of
NEVec
forwards all calls to the underlyingVec
while guaranteeing non-emptiness. This simplified inner representation also greatly simplifies a lot of complexity that previously existed. Take for exampleVec::insert()
:This is now:
Advantages of this approach are:
Non-emptiness is enforced by ensuring that an
NEVec
can never be constructed without any elements (quite similar to before), and by ensuring that the inner data structure can not be accessed (i.e. it is properly encapsulated).For
NEVec
I added a benchmark inbenches/vec.rs
that tests that performance ofNEVec
is equal toVec
. Because of this new simplified design I belief that the other datastructures are also on par with their possibly empty counterpart. Additionally, the codebase at work gets regularly benchmarked and there we found no performance impact with this new version (we mostly useNEVec
andNEIndexMap
), this is contrary to the current version which slowed our code down noticeably.A sound
NonEmptyIterator
As discussed in #16 there are some problems with the current implementation, most notably:
next()
andfirst()
in the incorrect order could lead to data lossNonEmptyIterator
s method after callingnext()
at least once, leads to unexpected results because the first value is always included even if the iterator has already passed it.Additionally I noticed that to implement the
NonEmptyIterator
a lot of repetitive boiler plate code is required.In this PR, the
NonEmptyIterator
looks as follows:Note that
NonEmptyIterator
is bound byIntoIterator
and that all its methods are already implemented, this means that it is a marker trait and therefore trivial to implement. This allows for very simple implementations of non-empty iterators, take for exampleIter
ofVec
:This is the entire implementation.
All
NonEmptyIterator
s are safe to use when they are created by a non-empty collection because in that case it is guaranteed that the underlying iterator (std::slice::Iter<'a, T>
in the above example) is non-empty. In case an implementor would make a mistake and create aNonEmptyIterator
that is in fact empty, a panic would occur when callingnext()
. This means that such mistakes are easy to detect and test for.Other changes
New features
In our codebase we converted some datastructures to their non-empty equivalent. During this refactoring we found missing functionality in this library. To fill the gaps, I created two new features:
itertools
: adds theNonEmptyItertools
trait similar to theItertools
trait forIterator
. So far it only contains a few methods that we needed. It is easy to extend this implementation to cover more ofitertools
in the future.either
: this feature adds theNEEither
enum, which is a non-empty variant of theEither
enum.Linting, formatting, and
justfile
Cargo.toml
now contains configuration for both Rust lints and Clippy lints. I've updated the code where needed to satisfy the lints. In order to ensure that lints don't get violated accidentally, they are automatically checked as part of CI.I've pinned the Rust toolchain to the most recent version (at the time of writing). This is to ensure that the build is stable. If we don't do this, it could be that a new version of Rust is released with new lints and that will break the build which will prohibit any updates/releases until it is fixed. Pinning the toolchain allows for choosing the time when to update to the new Rust version, which avoids breaking the build.
I added a
justfile
which can be run by the just runner. I've defined several useful commands for the project such astest
,fmt
, andlint
. I've also updated CI to use these definitions.The
fmt
command formats all Rust code. There is arustfmt.toml
file that configures the formatter. To run the code formatter a version of Rust nightly is required, but this introduces no dependency on nightly for users of the library. I've addedjust install-nightly
for easy installing the correct version. Part of the lint command is also a formatting check, this ensures that all code in the repository follows consistent formatting. You will see some minor changes in the diff of this PR that fixes some formatting inconsistencies.Breaking changes
I updated the changelog with every noteworthy change.
Follow-up
I've continued my work in a follow-up PR.