Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

causal sequence type #27

Open
pgte opened this issue Dec 2, 2018 · 15 comments
Open

causal sequence type #27

pgte opened this issue Dec 2, 2018 · 15 comments
Labels
help wanted Extra attention is needed question Further information is requested

Comments

@pgte
Copy link
Collaborator

pgte commented Dec 2, 2018

The only sequence type implemented so far in JS Delta CRDTs is the RGA.
RGA has to keep the tombstones of the deleted elements around to be safe.
I think it would be very beneficial to have an array-like type that is causal and somehow allows to compress history (much like the ORMap when using the causal context and the dot store).
Is there any research done on such a (delta) state-based type?

/cc @gritzko @vitorenesduarte

@pgte pgte added the question Further information is requested label Dec 2, 2018
@pgte pgte added the help wanted Extra attention is needed label Dec 2, 2018
@vitorenesduarte
Copy link

Hi @pgte.
I don't know RGA, but the reason it needs to keep tombstones is because it's always possible to have a concurrent operation referencing the element that was deleted?
If yes, maybe causal stability information could be used?

@pgte
Copy link
Collaborator Author

pgte commented Dec 4, 2018

@vitorenesduarte yes, you're correct, that's a possibility.
OTOH, a nice side-effect of having a type that uses a CC and a DotStore is that you could easily compose (like in the ORMap type)..

@vitorenesduarte
Copy link

You mean composing sequences, or using the current implementations of dot stores to implement a sequence type?

@pgte
Copy link
Collaborator Author

pgte commented Dec 4, 2018

The second, with the goals 1) a sequence to be embedded in any causal type and 2) to embed any causal type inside the sequence.

@parkan
Copy link
Collaborator

parkan commented Dec 5, 2018

+1 for this as a super useful application primitive (I'm assuming this would be append-only, as well)

from a user perspective, what I often want is relatively weak/local ordering, i.e. updates that do not directly reference each can have an undefined order

@vitorenesduarte
Copy link

I believe that's a different data type (let's call it Append-only Log), and if I understand correctly, there's no need to have conflict resolution in this case.

Append-only Log

Consider a grow-only map mapping naturals to a set of operations (or whatever is to be kept in the log).
Initially we have an empty log {}.
When adding an op to the log, we take the highest natural in the log (let's say m), and create an entry in the log mapping m + 1 to a set with op.

Example

If we have two operations, A1 and A2, by node A, we'll have this sequence of states:

{}
{1: {A1}}
{1: {A1}, 2: {A2}}

Let's say that node B has seen the last state, and adds a new operation B1:

{1: {A1}, 2: {A2}, 3: {B1}}

If concurrently, A also adds operations A3 and A4 (without seeing the update from B), we will have:

{1: {A1}, 2: {A2}, 3: {A3}, 4: {A4}}

When these two states are merged, we get:

{1: {A1}, 2: {A2}, 3: {A3, B1}, 4: {A4}}

This means that A saw the order A1, A2, A3, A4, and when the update from B arrived, the order changed to A1, A2, A3, B1, A4 (if we order operations inside a set by the node identifier, and A < B).
Local order is respected, but at any point, new operations may be inserted before your last operation.
Would something like this work in the use case you have in mind @parkan?

@gritzko
Copy link

gritzko commented Dec 7, 2018

An append-only or a sorted container needs no tombstones, indeed. Arbitrary insertions need reliable location ids.

It would be helpful to apply the 5 whys technique here https://en.m.wikipedia.org/wiki/5_Whys
What problem are we trying to solve?

@pgte
Copy link
Collaborator Author

pgte commented Dec 7, 2018

I think @parkan is trying to solve a different problem than I am.
I was looking for a entirely-mutable array-like type that uses a DotStore so that I can embed it in other causal types..

@gritzko
Copy link

gritzko commented Dec 11, 2018

I am not fluent in the dot jargon, but this is how I understand it: you need a scheme where data absence is meaningful. Like, a store has its version clock, data is fed in causal order... if something is covered by the clock, but missing from the storage, we conclude it was deleted. That is a neat tidy scheme and it applies to delta-enabled, op-based and ORDTs, all alike (AFAICS).

I mostly focus on other tricks, because data drop has its issues:

  • we can't check Merkle structure if some data is omitted (may be somewhat fixable)
  • can't distinguish meaningful absence from a fault, because deletes are implicit,
  • apart from faults, there is malice (like, kicking the data out of the database without issuing any deletes),
  • can't recover historical versions (relevant to wikis, distributed revision control, possibly chats too)
  • interferes with partial replication in non-trivial ways (this part depends on many things),

From my younger days, I remember one rule of thumb. The Web grows exponentially, so the Web+history is only heavier than the last snapshot by some factor. Hence, pruning the history does not buy you much.

On the other hand, some use cases may be sensitive to that. So, having a forgetful RGA variant is a plus. AFAIK, the only way to do so is to put a limit on the offline time. Like, if you did not sync for a week, your replica is rotten. Then, older tombstones are prunable, if the tree structure stays intact for the remaining nodes.

@gritzko
Copy link

gritzko commented Dec 11, 2018

P.S.Correction: the original tree structure is only needed for "recent" nodes (those younger than the offline time, e.g. <1 week old). The rest is stable, hence the exact tree structure is less relevant. We may see it as the "ground" new trees grow from :)

@CBaquero
Copy link

Replying to @pgte "I was looking for a entirely-mutable array-like type that uses a DotStore so that I can embed it in other causal types.."

Although my design of ORSeq is flawed I was trying to do such a thing and I think its doable. Here is what I recall of some of the ingredients:

A sequence can refer to a causal context to avoid the use of tombstones. Elements are added into a given position and they get 3 things: a insert position identifier (using some good algorithm for that), a new dot (that is also added to the causal context), and (for now) a immutable content (say, a letter).

If that element is removed we can remove everything and only the dot survives in the causal context. The usual rules that deal with causal contexts can be used to propagate that removal, while avoiding tombstones. This sequence can be embedded in map and share a common causal context. But this sequence is a leaf in terms of embedding, since the inserted elements are immutable.

It should be possible to allow for element mutability. Say, an operation could be applied at a given element position in the sequence. Mutable elements should be causal types that also support causal contexts, and will share them. One could build a sequence of add-win sets, for instance.

This also works with deltas.

I had the plan to try to implement this some years ago, but never did it.

Hope this helps.

@parkan
Copy link
Collaborator

parkan commented Dec 11, 2018

ahh I see @pgte, taking my requirements elsewhere then :)

@pgte
Copy link
Collaborator Author

pgte commented Dec 11, 2018

@parkan it's fine, I think you're talking more about something like ipfs-log or versidag: ipfs-inactive/dynamic-data-and-capabilities#50

@pgte
Copy link
Collaborator Author

pgte commented Dec 11, 2018

@CBaquero Thank you , that looks like it would work for these purposes! I'll try to prototype an immutable version to see where that leads. :)

@RoeeShlomo
Copy link

I'm also interested. @pgte have there been any progress on that?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Extra attention is needed question Further information is requested
Projects
None yet
Development

No branches or pull requests

6 participants