Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[stdlib] Fix make String.join() work on StringSlice #3677

Open
wants to merge 21 commits into
base: nightly
Choose a base branch
from

Conversation

martinvuyk
Copy link
Contributor

@martinvuyk martinvuyk commented Oct 16, 2024

Fix make String.join() work on StringSlice by using the fast_join() method now renamed _join_bytes() without needing to allocate for each slice. Changes to AsBytes were needed to allow for non-owning types to work with it (StringSlice).

Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
@martinvuyk martinvuyk marked this pull request as ready for review October 28, 2024 15:18
@martinvuyk martinvuyk requested a review from a team as a code owner October 28, 2024 15:18
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
@martinvuyk martinvuyk marked this pull request as draft October 31, 2024 01:31
@JoeLoser
Copy link
Collaborator

JoeLoser commented Nov 1, 2024

Is this still something you want to keep in Draft or push forward with while other things are stalled on compiler issues?

@martinvuyk
Copy link
Contributor Author

This depends on #3528, I reimplemented the code for as_bytes_read() here but I figured I better wait for the generic as_bytes() and this will be cleaner and change a lot less things

Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
@martinvuyk martinvuyk marked this pull request as ready for review November 4, 2024 14:22
@martinvuyk
Copy link
Contributor Author

@JoeLoser I reimplemented the as_bytes() logic here. I think this will be a smaller PR to merge that change. Also, I don't know what you mean by compiler mutability issues. I had to fix calls to as_bytes() that were assuming it returns a mutable span:

fn test_string_byte_span() raises:
    var string = String("Hello")
    var str_slice = string.as_bytes() # removed this
    var str_slice = string.as_bytes[is_mutable=True]() # added this
    ...
    var sub1 = str_slice[:5]
    ...
    var sub2 = str_slice[2:5]
    ...
    #
    # Test mutation through slice
    #

    sub1[0] = ord("J")
    assert_equal(string, "Jello")

    sub2[2] = ord("y")
    assert_equal(string, "Jelly")

If that is what you mean then it's a matter of just changing all calls that do that

Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
@JoeLoser
Copy link
Collaborator

JoeLoser commented Nov 4, 2024

!sync

@modularbot modularbot added the imported-internally Signals that a given pull request has been imported internally. label Nov 4, 2024
@JoeLoser
Copy link
Collaborator

JoeLoser commented Nov 4, 2024

This is still crashing internally in the parser, e.g.

mojo: KGEN/lib/LITDialect/LITTypes.cpp:680: static OriginType M::KGEN::LIT::OriginType::get(TypedAttr): Assertion `isMutable.getType().isSignlessInteger(1) && "isMutable bit should be i1"' failed.

which is an internal assertion in the parser. The parser can't even parse object.mojo, for example - repros with

mojo stdlib/builtin/object.mojo

Can you rebase this PR and see if it also repros in the OSS repo with latest mojo compiler?

@martinvuyk
Copy link
Contributor Author

martinvuyk commented Nov 4, 2024

which is an internal assertion in the parser. The parser can't even parse object.mojo, for example - repros with

mojo stdlib/builtin/object.mojo

This branch is up to date with latest nightly and I'm using mojo 2024.11.405, and I can't repro it. Neither does the CI.

Assertion 'isMutable.getType().isSignlessInteger(1) && "isMutable bit should be i1"' failed

This makes me think that there is some transformation missing on the way, AFAIK the Mojo compiler progressively lowers stuff, so I'm thinking it keeps it Bool on some place where KGEN expects it to already be __mlir_type.i1.

The trail might start at the generic function (the more generic the later mojo lowers stuff AFAIK) which expects a Stringlike type:

    fn _join_bytes[
        T: AsBytesCollectionElement, //,
    ](self, elems: List[T, *_]) -> String:
        ...
        for e in elems:
            len_elems += len(e[].as_bytes[False, __origin_of(e)]())

An easy way to find out is to comment it and paste the current nightly join implementation:

var result: String = ""
var is_first = True

for e in elems:
    if is_first:
        is_first = False
    else:
        result += self
    result += str(e[])

return result

One idea I'm having is that for these kind of generic functions it might be better to use _lit_mut_cast[__origin_of(e), False].result before passing the origin to a generic like as_bytes() because it might have problems if it's passed a mutable origin. I'll change it and push the change if you want to test it out.

Beyond that I have no idea if it's solvable on the user side, and I can't repro it so I can't help :(

Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
@martinvuyk
Copy link
Contributor Author

@JoeLoser I think I fixed the issue (I believe it was just the origin casting), if you want to test it out whenever you have time. I have no problem waiting since I decoupled this feature from the other PRs where it was needed (just used byte_length() and unsafe_ptr()). And this is a nice to have (for safer generics), not a necessity.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
imported-internally Signals that a given pull request has been imported internally.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants