Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add StringView #1436

Merged
merged 1 commit into from
Jan 10, 2025
Merged

add StringView #1436

merged 1 commit into from
Jan 10, 2025

Conversation

Yu-zh
Copy link
Collaborator

@Yu-zh Yu-zh commented Jan 8, 2025

Basic usage:

let str = "Hello🤣🤣🤣"
guard let Some(start) = CharOffset::offset_by(str, 1)
guard let Some(end) = CharOffset::offset_by(str, 6)
let view = str[start:end]
inspect!(view, content="ello🤣")

To create a CharsView from a String, we need to use the CharOffset as indices. CharOffset is an opaque type that represents a position in the utf-16 String. It can be constructed via CharOffset::offset_by(str: String, offset: int) where the offset is the number of characters skipped between the start of the string and the returning position. It can also be constructed via CharOffset::unsafe_from_int(index: Int).

This is to avoid confusion with indexing String using indices, i.e. indices indicate utf-16 code unit offset when accessing String whereas when viewing String the indices represents offsets of characters.

Once we hold a CharsView, we can use integers directly to access chars or create subviews, and here the index always represents the codepoint offset.


Update:
renamed the types and methods:

CharsView -> StringView
CharOffset -> StringIndex
offset_by -> index_at

Copy link

peter-jerry-ye-code-review bot commented Jan 8, 2025

‼️ This code review is generated by a bot. Please verify the content before trusting it.

Here are three potential issues or observations from the provided git diff output:

  1. Inconsistent Documentation for index_at Function:

    • The index_at function in view.mbt has a parameter start~ : StringIndex = 0, but the documentation example uses start=offset instead of start~=offset. This inconsistency in naming conventions between the function signature and the example could lead to confusion for users. The documentation should align with the actual parameter names used in the function.
  2. Potential Performance Issue in op_as_view:

    • In the op_as_view function for StringView, there is a comment mentioning a TODO: "provide index_at_rev or index_at2 to avoid repeatedly iterate the string." This suggests that the current implementation might have performance issues due to repeated string iteration. This should be addressed to improve efficiency, especially for large strings or frequent operations.
  3. Error Handling in op_get Function:

    • The op_get function in view.mbt checks for invalid surrogate pairs and aborts with an error message. However, the error message "invalid surrogate pair" is not very descriptive. It would be more helpful to include additional context, such as the position or the specific characters involved, to aid in debugging.

These issues should be reviewed and addressed to improve code clarity, performance, and maintainability.

@coveralls
Copy link
Collaborator

coveralls commented Jan 8, 2025

Pull Request Test Coverage Report for Build 4682

Details

  • 38 of 44 (86.36%) changed or added relevant lines in 1 file are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.02%) to 83.217%

Changes Missing Coverage Covered Lines Changed/Added Lines %
string/view.mbt 38 44 86.36%
Totals Coverage Status
Change from base Build 4681: 0.02%
Covered Lines: 4765
Relevant Lines: 5726

💛 - Coveralls

@Yu-zh Yu-zh requested a review from bobzhang January 8, 2025 09:08
@Yu-zh Yu-zh changed the title add CharsView add StringView Jan 9, 2025
builtin/charsview.mbt Outdated Show resolved Hide resolved
Copy link
Contributor

@bobzhang bobzhang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

try to move StringIndex/StringView into string package

builtin/charsview.mbt Outdated Show resolved Hide resolved
builtin/builtin.mbti Outdated Show resolved Hide resolved
builtin/builtin.mbti Outdated Show resolved Hide resolved
builtin/charsview_test.mbt Outdated Show resolved Hide resolved
@Yu-zh Yu-zh force-pushed the charsview branch 2 times, most recently from 9dc7f2a to 01b0efd Compare January 9, 2025 10:46
@Yu-zh Yu-zh requested a review from bobzhang January 9, 2025 14:07
@bobzhang bobzhang merged commit f40719f into moonbitlang:main Jan 10, 2025
14 checks passed
@Yu-zh Yu-zh deleted the charsview branch January 10, 2025 04:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants