Port Qt API to use QAnyStringView #479
QAnyStringView lets us save UTF-8 -> UTF-16 -> UTF-8 conversions. All QAnyStringViews passed to the library must be encoded in UTF-8, without a null terminator. Tests don't pass.
qPrintable returns a valid pointer that points to a null character. Change stringViewToChar to match
May lead to lower performance since it might require resizing the array.
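The copy these commits describe can be sketched in standard C++. This is only an illustration under stated assumptions: `std::string_view` stands in for a UTF-8 `QAnyStringView`, and `viewToCString` is a hypothetical name, not the PR's actual `stringViewToChar`. As with `qPrintable`, an empty view yields a valid pointer to a single null character.

```cpp
#include <cstring>
#include <memory>
#include <string_view>

// Hypothetical stand-in for the PR's stringViewToChar helper.
// std::string_view plays the role of a UTF-8 view with no terminator.
std::unique_ptr<char[]> viewToCString(std::string_view utf8)
{
    // One extra byte for the terminator that the view does not carry.
    auto buf = std::make_unique<char[]>(utf8.size() + 1);
    // Guard the copy: an empty view may have a null data() pointer.
    if (!utf8.empty())
        std::memcpy(buf.get(), utf8.data(), utf8.size());
    buf[utf8.size()] = '\0';
    return buf;
}
```

The cost is one allocation and one copy per call, which is what the zero-copy variants discussed in this thread try to avoid.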
Hmm, I'd need to learn more about this first to properly judge it, but first of all thank you for the patch! Using Not sure if someone would be constructing a lot of components from the Qt interface, but there is a chance I guess... And given how string-heavy AppStream is, saving a copy operation can actually be quite huge (internally, AppStream does a lot to avoid too much reallocation and string duplication).
We have multiple options, it seems:
Now that I look at these options, I prefer 2i with automatically converting string views that do not end in a null terminator or 3.
Maybe it's selfish of me, but it's a matter of whether Discover can leverage this safely in its queries. If we need to keep converting to QString (UTF-16), then it's probably not worth it. If we can do it in ways where we save a good number of copies and share memory, I'd do it by all means. Since we are still a few days away from being able to have Discover master using AppStream 1.0, I suggest postponing this decision until we can test and see what the final code would look like. I'm also worried the resulting code will be considerably worse than the current one. We'll see.
Using I really need to read up more on this, but from a first glance I am puzzled by the design decisions here. First of all, getting QString off of UTF-16 would have been nice a decade ago, but even when wanting to avoid that breakage, why not spend just one more byte to add a null terminator for interfacing with C code? That would make this whole thing infinitely more usable...
Implementing an appropriate Qt hash function for QUtf8StringView would likely be necessary, since I see QStrings are used inside QHashes.
I just remembered: QUtf8StringView does not own the data it points to, and appstream expects whoever called the library to own the data.
Do you think we can get this resolved for the AppStream 1.0 release? (Intended to be released in ~2 months)
Also, how would this be implemented if we had a Qt5 version compiled as well? (Most likely that will happen, to make the transition to AppStream 1.0 easier).
This was discussed a bit at #445 , so I thought to actually implement it to see how it actually would work.
I use QAnyStringView instead of the more specific QUtf8StringView because the former has the size_bytes() method, which is the simplest way to copy it to a UTF-8 char*.
However, a QString can be implicitly converted to a QAnyStringView, so any software that used to work with the old version of the library will still compile, but since the library expects all QAnyStringViews to be in UTF-8, it won't work.
We could try to check if the string is UTF-16 or UTF-8 and then do the conversion if needed.
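A rough sketch of that check-then-convert idea, written in standard C++ so it builds without Qt. Everything here is an assumption for illustration: `AnyView` is a hypothetical stand-in for `QAnyStringView` (a view that is either UTF-8 or UTF-16), and `appendUtf8` and `toUtf8` are made-up names. Unpaired surrogates are replaced with U+FFFD, which is one possible policy, not necessarily what the library should do.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <variant>

// Hypothetical stand-in for QAnyStringView: either a UTF-8 or a UTF-16 view.
using AnyView = std::variant<std::string_view, std::u16string_view>;

// Append one Unicode code point to `out`, encoded as UTF-8.
void appendUtf8(std::string &out, char32_t cp)
{
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Return the view's contents as UTF-8, converting only when it holds UTF-16.
std::string toUtf8(const AnyView &view)
{
    if (const auto *utf8 = std::get_if<std::string_view>(&view))
        return std::string(*utf8); // already UTF-8: plain copy

    const std::u16string_view utf16 = std::get<std::u16string_view>(view);
    std::string out;
    out.reserve(utf16.size());
    for (std::size_t i = 0; i < utf16.size(); ++i) {
        char32_t cp = utf16[i];
        // Combine a high surrogate with the low surrogate that follows it.
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < utf16.size()) {
            const char32_t lo = utf16[i + 1];
            if (lo >= 0xDC00 && lo <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
                ++i;
            }
        }
        // Anything still in the surrogate range is unpaired: use U+FFFD.
        if (cp >= 0xD800 && cp <= 0xDFFF)
            cp = 0xFFFD;
        appendUtf8(out, cp);
    }
    return out;
}
```

The UTF-8 branch costs one copy, the same as today; only callers that still pass UTF-16 pay for the transcoding.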
We could also use QUtf8StringView, if we manipulate how the view is created so it includes the null character:
As an upside, it would save a copy. But it would require anyone passing QUtf8StringViews to the library to do this as well.
We could also use QUtf8StringView's iterator to copy and reassemble the Unicode codepoints in UTF-8.
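The terminator-inside-the-view idea could look roughly like this. It is a sketch in standard C++ with `std::string_view` standing in for `QUtf8StringView`; `viewWithTerminator` and `asCString` are hypothetical helpers, not part of this PR or of Qt.

```cpp
#include <string>
#include <string_view>

// Caller side: build a view whose length also covers the owner's trailing '\0'.
// The owner must outlive the view, since the view does not own the data.
std::string_view viewWithTerminator(const std::string &owner)
{
    return std::string_view(owner.c_str(), owner.size() + 1);
}

// Library side: hand the bytes to C code without copying, but only when the
// view really ends in '\0'. Returns nullptr otherwise, so the caller can fall
// back to making a terminated copy.
const char *asCString(std::string_view v)
{
    return (!v.empty() && v.back() == '\0') ? v.data() : nullptr;
}
```

One consequence of this trick is that the view's size now counts the terminator, so any code that compares or measures such views has to account for the extra byte.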