Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add alteration methods to IArray #34

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

ColonelThirtyTwo
Copy link
Contributor

Includes IArray::make_mut, which works similar to Rc::make_mut, as well as insertion and removal methods, all of which copy-on-write the array.

@kirillsemyonkin
Copy link
Collaborator

kirillsemyonkin commented Oct 22, 2023

Have you participated in #17 discussion?

I believe your solution is the same as using .to_vec(), except some cases where Rc does not have any weak refs and have only one current strong ref. If #17 is considered when making this, then I believe it should close #17 too.

@kirillsemyonkin
Copy link
Collaborator

I imagine this is being blocked by #35.

@ColonelThirtyTwo
Copy link
Contributor Author

Have you participated in #17 discussion?

I believe your solution is the same as using .to_vec(), except some cases where Rc does not have any weak refs and have only one current strong ref. If #17 is considered when making this, then I believe it should close #17 too.

I did not. The im/im-rc crate already implements a tree-based CoW Vec that does structural sharing like #17 seems to talk about, so IMO the authors should use that if that's the semantics they want. I'm not a huge fan of it because of worse random access times and the fact that the array is no longer a slice.

@kirillsemyonkin
Copy link
Collaborator

I was not talking about HAMT, I was talking about the rest...

@cecton
Copy link
Member

cecton commented Oct 23, 2023

I was not talking about HAMT, I was talking about the rest...

Yes actually I'm not sure this API proposed in this PR here is a good idea. Maybe the "to_vec()" is more than enough and more rust-y. It discussed more about it on #17 with my reasoning.

I'm not strongly against either so I will go with the flow.

I did not. The im/im-rc crate already implements a tree-based CoW Vec that does structural sharing like #17 seems to talk about, so IMO the authors should use that if that's the semantics they want. I'm not a huge fan of it because of worse random access times and the fact that the array is no longer a slice.

It might be worth adding an optional dependency on im/im-rc that will allow the user of implicit-clone to get the trait ImplicitClone on im/im-rc's types? This idea is really out of scope with this PR though.

@kirillsemyonkin
Copy link
Collaborator

I didn't even notice all of the methods added by this PR, only paying attention to those in the PR's description, oops...

I personally am not against make_mut and get_mut (where is it?), but I'm still not sold on everything else, as it encourages cloning the whole array on every action (@cecton I'm still shocked IArray's iter() clones every item, even though reads intuitively should happen way more often than writes, and it would be nice to not need them cloned in all of their entirety all of the time. I guess the rationale is that all items are ImplicitClone, and we actually do not allocate the array memory space for containing these clones).

To me, the non-resizable [T] which is given by both of them sounds like a good idea. These two functions would be good for the user to be able to quickly mutate the elements themselves after making an IArray before sending it out to the world.

@cecton
Copy link
Member

cecton commented Oct 24, 2023

@cecton I'm still shocked IArray's iter() clones every item, even though reads intuitively should happen way more often than writes, and it would be nice to not need them cloned in all of their entirety all of the time. I guess the rationale is that all items are ImplicitClone, and we actually do not allocate the array memory space for containing these clones).

haahh you're shocked for nothing :P really, what do you think happens in other programing languages anyway? And yes the rational was that each element of the IArray must implement ImplicitClone so they are supposed to be cheap to clone. Now it's not like I particularly mind about this API at all. It's not very rust-y but it allows writing fp-style code easily

wanna join 2 collections? array1.iter().chain(array2.iter()).collect(), no need to bother with the references. You want to iterate on a list of items like you would in JS, you can do it easily because you don't have the additional cognitive load of the borrow checker of Rust.

To me, the non-resizable [T] which is given by both of them sounds like a good idea. These two functions would be good for the user to be able to quickly mutate the elements themselves after making an IArray before sending it out to the world.

That's what the implicit cloning of IArray is actually trying to bring. Since the elements are cheap-to-clone, we are allowed to write things almost like a dynamic language. But again I don't have a strong opinion on this, it was a 0.x version.

(IMap also has this kind of API in mind btw brining dynamic programming to rust, truly going against the norm and see what does it do. Maybe it is OP, maybe it's terrible lol, maybe both)

Includes IArray::make_mut, which works similar to Rc::make_mut, as well
as insertion and removal methods, all of which copy-on-write the array.
@ColonelThirtyTwo
Copy link
Contributor Author

ColonelThirtyTwo commented Oct 24, 2023

You want to iterate on a list of items like you would in JS, you can do it easily because you don't have the additional cognitive load of the borrow checker of Rust.

I don't agree. The Rust-ism is that .iter() on collections returns references. Having to work with IArray that breaks this convention would increase my cognitive load, because now I have to remember the special case of IArray returning clones and not references.

EDIT: In addition, you do not want to be frivolously cloning Arcs. The atomic operations they do have more overhead than the non-atomic Rc's and can't be optimized out as easily.

wanna join 2 collections? array1.iter().chain(array2.iter()).collect(), no need to bother with the references.

array1.iter().chain(array2.iter()).cloned().collect() isn't that much bigger.

EDIT2: You also can't borrow from the iterator if it clones:

let arr: Vec<Rc<ReallyBigType>> = ...
// can't do this with IArray without cloning every node.string
let field = arr.iter().find(|node| &node.string).collect::<String>();

pub fn insert_many<I: IntoIterator<Item = T>>(&mut self, index: usize, values: I) {
let head = self.as_slice()[..index].iter().cloned();
let tail = self.as_slice()[index..].iter().cloned();
let rc = head.chain(values).chain(tail).collect();
Copy link
Collaborator

@kirillsemyonkin kirillsemyonkin Oct 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess our main complaint in accordance with #17 is this .collect() here allocates a whole new array.

When iterating over items, sure, they can be cloned, since each item iteration makes a clone (ImplicitClone that is, meaning not even allocating anything besides on the stack, just doing things like incrementing an Rc count, for example) and then throws that clone out once its done.

Allocating a new array (also, WASM (since its a yew-stack crate) performance for such allocations I'm not sure about) means looking for a lot of space (both of the arrays combined could be massive - literally, in my native language word "array" translates to "massive"), and then moving all of the made clones into it. This means adding a new single item would require making a whole array clone of a worst-case size. Imagine adding two in a row?

Sure, the user could make some kind of temporary storage to insert all of new items at once, but the API cannot prove that user would not call push for adding single items in a big loop instead. Colloquially in IT and other areas of engineering this is called making the design "idiot-proof". Vec does actually amortize this kind of cost with separating size and capacity. Most of optimizations we could work on (talked over in #17) would probably lead us to just an inferior Vec/Rc<Vec>.

I believe its better to make the user do an intentional to_vec (I wonder if its possible to optimize into a into_vec(self) like make_mut so it just copies bytes (moves?) into newly allocated Vec in case there is only one Rc?) or newly added make_mut for just mutating some items, or get_mut (I already commented that, I'll be waiting for that to be added / explained why it could not be added).

Copy link
Contributor Author

@ColonelThirtyTwo ColonelThirtyTwo Oct 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only way Rc<[T]> -> Vec<T> could be more efficient than moving out of the Rc<[T]> buffer would be if Vec<T> reused the storage of Rc<[T]>, but it can't - the Rc buffer would be prefixed by the refcount, which Vec won't know about and won't deallocate correctly. You can't even move values out of an Rc<[T]> because try_unwrap and into_inner need [T]: Sized. So copying is really the only way.

to_vec -> edit -> collect::<Rc<[T]>> would involve two copies, which isn't great. Using iterators over the original slice would be better but gets complex.

IMO the simple insert/remove methods are convenient for the usual UI things of adding/removing a single element, though I agree they are kinda footguns for more involved usage.

Copy link
Collaborator

@kirillsemyonkin kirillsemyonkin Oct 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only way Rc<[T]> -> Vec<T> could be more efficient than moving out of the Rc<[T]> buffer would be if Vec<T> reused the storage of Rc<[T]>

The thought is that we don't necessarily clone each item into the new Vec, but move them instead. In most cases, a move is just like a copy, if not precisely that (but doing it manually would require unsafe and working with Pin I guess (also, do we need to handle Pin<IArray/IString/etc> somehow?)).

You can't even move values out of an Rc<[T]> because try_unwrap and into_inner need [T]: Sized. So copying is really the only way.

I guess this is our limiting factor. Probably making a tracking issue on this repo that would ask to add to Rust into_vec or something of the sort to the Rc<[...]>, unless using unsafe in this repo is a non-issue.

Copy link
Contributor Author

@ColonelThirtyTwo ColonelThirtyTwo Oct 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even with unsafe, you'd have to somehow be able to free the underlying Rc allocation without dropping the items, and unlike with Vec, there's no guarantees on the allocation that Rc makes, since there's a refcount on it as well. Rust would need to add something like Rc::drain.

EDIT: now that I think about it, you may be able to do it by transmuting to Rc<[MaybeUninit<T>]>, moving the values out, then dropping the transmuted Rc, which will free the memory without running destructors. Assuming that's the only reference, of course.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't mind that this go in a separate PR so we can merge this one already.

Copy link
Member

@cecton cecton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we're going to need this in Yew for the ChildrenRenderer

@kirillsemyonkin
Copy link
Collaborator

I think we're going to need this in Yew for the ChildrenRenderer

Post code comparisons here? Hopefully you are not trying to add individual items in a loop?

@cecton
Copy link
Member

cecton commented Nov 1, 2023

I was going to make the PR now ^_^

@cecton
Copy link
Member

cecton commented Nov 1, 2023

Wait no, not for this. The mutable API would be useful to use IArray in VList instead of Vec. Though I'm not entirely sure how efficient this is... maybe it's terrible

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants