-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
P2728R7 #1
Conversation
70f178b
to
b279f17
Compare
#define UTFVIEW_CODE_UNIT_VIEW_HPP | ||
|
||
#include <UtfView/detail/concepts.hpp> | ||
#include <boost/stl_interfaces/iterator_interface.hpp> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No change needed here, but just a heads-up. I've pulled iterator_interface from consideration for C++26, because I was to see what it looks like to do it using the new reflection language feature we'll get in 26. Sine this paper is also probably targeting 29, this is maybe a non-issue; I fully intend to resurrect the iterator_interface paper for 29.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We'll see how the scheduling goes, but I ideally was hoping to target 26 because P2871 and P2873 are targeting 26 and I wanted this to replace the removed std::codecvt
facets. That's not a showstopper, though; if it turns out I have to wait for 29, I'll leave iterator_interface
in, and if I do manage to target 26, I can take out iterator_interface
at the expense of verbosity.
class as_char8_t_view : public std::ranges::view_interface<as_char8_t_view<V>> { | ||
V base_ = V(); // @*exposition only*@ | ||
|
||
template <bool Const> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are these exposition-only? Most (all?) C++ views have nested iterator
and sentinel
templates.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They have nested exposition-only iterator and sentinel templates, as far as I can tell: https://en.cppreference.com/w/cpp/ranges/elements_view/iterator
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ha! So they are. I think I just misremembered.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, the correct way to get the types is iterator_t
and sentinel_t
which are just decltype(ranges::begin(r))
and similarly for end. So the types don't need to have publicly visible names as nested types of the view.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks really good, except that I don't have that much confidence in the tests. It looks like you don't cover:
- back-and-forth iteration using the transcoding iterator in bidirectional mode;
- the exact place that replacement characters are produced by the transcoding iterators; and
- the combinatoric coverage of value category+as_utfN.
For the last one that would include foo | as_utfN, where foo is a borrowed/unborrowed range, and its value category is rvalue/lvalue+const/non-const.
I looked pretty closely at the implementation, and a lot less closely at the tests. I may have missed these things.
On a design level, I think this version of things is simpler than the previous one (since it drops the unpacking logic). That's a marked improvement, IMO. We do lose compatibility in some rare cases by not having the CPO, but . I don't think we'll miss it.
The only other things to note are the lack of direct support for working with NTBSs, and direct support for working with std::string{,_view} without going to char8_t first. The former is fine, given how easy null_term is to use and recommend. The latter... could I see an example of exactly how bad it will be? Did I maybe miss that somewhere?
friend constexpr bool operator==(I const& it, null_sentinel_t) { | ||
return *it == std::iter_value_t<I>{}; | ||
} | ||
template <std::forward_iterator I> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need both? If you get rid of the forward overload, and get rid of the not-forward constraint on the input overload, the only thing you lose is passing an iterator by reference instead of by value. Are the two here just for that small optimization?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not a big thing; I'm essentially just wondering.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that's the only reason why. The paper has the following language about it:
The
null_sentinel_t
'soperator==
has a separate overload for input iterators that takes the iterator by reference instead of by value. We want to take input iterators by reference because they are not required to be copyable. However, for forward iterators, we want to take by value because otherwise we incur a double indirection (e.g.int* const& it
) that compilers may not optimize.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha. Thanks.
class as_char8_t_view : public std::ranges::view_interface<as_char8_t_view<V>> { | ||
V base_ = V(); // @*exposition only*@ | ||
|
||
template <bool Const> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ha! So they are. I think I just misremembered.
|
||
} // namespace detail | ||
|
||
inline constexpr detail::null_term_impl null_term; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotta say, I like this a lot more than I thought I would. :)
Those two are both tested by the function The repo implements automated test coverage, which is at 100%-- this is misleading because if a template isn't instantiated it doesn't count towards the total, but
I've added this to my TODO list.
There is direct support for working with NTBSs if they're in the form of non-decayed string literals; that's implemented by the |
Right. You still may be missing some coverage though if you only go all the way through the seqeunce each time (forward or backward). I had tests that, for N code points would do something like:
The result was checked at each step. In the initial implemenation, I did find a problem or two doing this.
It's fine to me to remove direct support for NTSBs, but references to arrays are not NTSBs. |
8b8b837
to
dae4300
Compare
9c0c210
to
d68becd
Compare
No description provided.