Store trait #21
Design Notes on a Willow Store Trait: Mutation

For our Rust implementation of Willow, we are designing a trait to abstract over data stores. Among the features that a store must provide are ingesting new entries, querying areas, resolving payloads, and subscribing to change notifications. Turns out this trait becomes rather involved. In this writeup, I want to focus on a small subset of the trait: everything that allows user code to mutate a data store.

On the surface, Willow stores support only a single mutating operation: ingesting new entries. Does the following trait (heavily simplified on the type-level) do the trick?

trait Store {
async fn ingest(&mut self, entry: Entry) -> Result<(), StoreError>;
}

Nope, there is actually a whole lot more to consider. We'll start with something simple: while the data model only considers adding new entries to a store, there is a second operation that all implementations should support: locally deleting entries. We want to support this both for explicit entries, and for whole areas:

trait Store {
async fn ingest(&mut self, entry: Entry) -> Result<(), StoreError>;
async fn forget(&mut self, entry: Entry) -> Result<(), StoreError>;
async fn forget_area(&mut self, area: Area3d) -> Result<(), StoreError>;
// We also need ingestion and forgetting of payloads,
// but we'll leave that for later.
}

Now we have a sketch of a somewhat usable API, but it does not admit particularly efficient implementations. We want stores to (potentially) be backed by persistent storage. But persisting data to disk is expensive. Typically, writes are performed to an in-memory buffer, and then periodically (or explicitly on demand) flushed to disk (compare the fsync syscall in POSIX). To support this pattern, we should change the semantics of our methods to "kindly asking the store to eventually do something", and add a `flush` method:

trait Store {
async fn ingest(&mut self, entry: Entry) -> Result<(), StoreError>;
async fn forget(&mut self, entry: Entry) -> Result<(), StoreError>;
async fn forget_area(&mut self, area: Area3d) -> Result<(), StoreError>;
async fn flush(&mut self) -> Result<(), StoreError>;
}

Another typical optimisation is that of efficiently operating on slices of bulk data instead of individual pieces of data. Forgetting should hopefully be rare enough, but we should definitely have a method for bulk ingestion.

trait Store {
async fn ingest(&mut self, entry: Entry) -> Result<(), StoreError>;
async fn ingest_bulk(&mut self, entries: &[Entry]) -> Result<(), StoreError>;
async fn forget(&mut self, entry: Entry) -> Result<(), StoreError>;
async fn forget_area(&mut self, area: Area3d) -> Result<(), StoreError>;
async fn flush(&mut self) -> Result<(), StoreError>;
}

The next issue is one of interior versus exterior mutability. Those async methods desugar to returning Futures. And since the methods take a mutable reference to the store as an argument, no other store methods can be called while such a Future is in scope. Hence, this trait forces sequential issuing of commands, with no concurrency.

While it might be borderline acceptable to force linearization of all mutating store accesses (especially since most method calls would simply place an instruction in a buffer rather than actually performing and committing side-effects), the inability to mutate the store while also querying it (say, by subscribing to changes) is a dealbreaker. Thus, we should change those methods to take an immutable reference (`&self`) instead:

trait Store {
async fn ingest(&self, entry: Entry) -> Result<(), StoreError>;
async fn ingest_bulk(&self, entries: &[Entry]) -> Result<(), StoreError>;
async fn forget(&self, entry: Entry) -> Result<(), StoreError>;
async fn forget_area(&self, area: Area3d) -> Result<(), StoreError>;
async fn flush(&self) -> Result<(), StoreError>;
}

Finally, there is a line of thought that I'm less confident about yet. Starting with simple

Aside from the drawback of forcing the explicit reification of operations, another drawback of the consumer approach would be the question of how to obtain the consumer. If the store itself implemented

This is the main issue I wanted to convey in this writeup. I have glossed over various details (generics and associated types, precise errors, information to return on successful operations, an ingest-unless-it-would-overwrite method, payload handling, etc.), because those are comparatively superficial. But I'd love to hear some feedback on the issues of interior vs exterior mutability and explicit vs implicit (buffered, bulk) consumer semantics.
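To make the combination of `&self` methods, buffering, and explicit flushing a bit more concrete, here is a minimal sketch. It is deliberately synchronous and single-threaded (it uses `RefCell`; a store shared across threads would need a `Mutex` or similar), and every name in it (`BufferedStore`, `Op`, `committed_len`, `contains`, and the `u64` stand-in for `Entry`) is hypothetical rather than part of the actual trait.

```rust
use std::cell::RefCell;
use std::collections::BTreeSet;

/// Hypothetical stand-in for a Willow entry; the real type is far richer.
type Entry = u64;

/// A buffered request: "kindly asking the store to eventually do something".
enum Op {
    Ingest(Entry),
    Forget(Entry),
}

/// Toy store with interior mutability: every method takes `&self`.
#[derive(Default)]
struct BufferedStore {
    /// Operations requested but not yet applied.
    pending: RefCell<Vec<Op>>,
    /// Entries that have been "persisted" by a flush.
    committed: RefCell<BTreeSet<Entry>>,
}

impl BufferedStore {
    fn ingest(&self, entry: Entry) {
        self.pending.borrow_mut().push(Op::Ingest(entry));
    }

    fn ingest_bulk(&self, entries: &[Entry]) {
        self.pending
            .borrow_mut()
            .extend(entries.iter().copied().map(Op::Ingest));
    }

    fn forget(&self, entry: Entry) {
        self.pending.borrow_mut().push(Op::Forget(entry));
    }

    /// Apply all buffered operations in the order they were issued.
    fn flush(&self) {
        let ops = std::mem::take(&mut *self.pending.borrow_mut());
        let mut committed = self.committed.borrow_mut();
        for op in ops {
            match op {
                Op::Ingest(entry) => {
                    committed.insert(entry);
                }
                Op::Forget(entry) => {
                    committed.remove(&entry);
                }
            }
        }
    }

    fn committed_len(&self) -> usize {
        self.committed.borrow().len()
    }

    fn contains(&self, entry: Entry) -> bool {
        self.committed.borrow().contains(&entry)
    }
}
```

Because no method needs `&mut self`, a second shared reference to the store (say, one held by a query or a subscription) can stay alive while entries are being ingested, which is exactly what the `&mut self` version of the trait rules out.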
Storage requirements
I'm starting to work in earnest on this again. One thing I have been thinking about is that what a store is in our implementations differs from what a store is in the spec:
Whereas in our implementations, a store is a set of authorised entries AND a partial set of corresponding payloads. In willow-js, this led to a situation where you need to specify how a store's payloads should be stored and retrieved via PayloadDrivers, ramping up the complexity of the store. I suggested the opportunity to separate the two concepts to @AljoschaMeyer, so that we'd have

But Aljoscha raised a good point:
Which is persuasive, but maybe there's some way around it? Which is why I'm sharing this here.
@AljoschaMeyer I have now added all mutation methods (unless we want a bulk payload ingestion method?)
/// Attempt to ingest many [`AuthorisedEntry`] in the [`Store`].
///
/// The result being `Ok` does **not** indicate that all entry ingestions were successful, only that each entry had an ingestion attempt, some of which *may* have returned [`EntryIngestionError`]. The `Err` type of this result is only returned if there was some internal error.
fn bulk_ingest_entry(
Returning a Vec of errors causes an allocation that might be avoidable (or at least configurable) if the method took a consumer of BulkIngestionResults as an argument instead? Not entirely sure whether that would really be an improvement. CC @Frando
I mean, I think you would always want to know about errors, right? The consumer pattern is cool, but is it maybe a little fancy?
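For comparison, the two shapes discussed in this thread might look roughly like the sketch below. Everything in it is an assumption made for illustration: `Entry` is a `u64` stand-in, `BulkIngestionResult` is modelled as a plain tuple, `EntryIngestionError::Rejected` and `try_ingest` are invented, and the "consumer" is just an `FnMut` closure rather than whatever consumer abstraction the crate would actually use.

```rust
/// Hypothetical stand-in for an authorised entry.
type Entry = u64;

/// Hypothetical error; the real `EntryIngestionError` has its own variants.
#[derive(Debug, PartialEq)]
enum EntryIngestionError {
    Rejected,
}

/// Assumed shape: the entry together with the outcome of its ingestion attempt.
type BulkIngestionResult = (Entry, Result<(), EntryIngestionError>);

/// Invented per-entry ingestion rule: odd entries are rejected.
fn try_ingest(entry: Entry) -> Result<(), EntryIngestionError> {
    if entry % 2 == 0 {
        Ok(())
    } else {
        Err(EntryIngestionError::Rejected)
    }
}

/// Variant A: collect every result into a freshly allocated `Vec`.
fn bulk_ingest_collect(entries: &[Entry]) -> Vec<BulkIngestionResult> {
    entries.iter().map(|&e| (e, try_ingest(e))).collect()
}

/// Variant B: hand each result to a caller-supplied consumer.
/// The caller decides whether to store, count, log, or drop results.
fn bulk_ingest_with(entries: &[Entry], mut on_result: impl FnMut(BulkIngestionResult)) {
    for &e in entries {
        on_result((e, try_ingest(e)));
    }
}
```

With variant B the caller still learns about every error, but can choose to only count failures (no allocation) or to push them into a `Vec` and recover variant A's behaviour.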
Co-authored-by: Aljoscha Meyer <AljoschaMeyer@users.noreply.github.com>
Add query methods to Store trait
Note: Many of the structs still need the standard derives, Debug, Clone, and Eq/PartialEq where applicable.
Work in progress, currently being discussed and iterated upon.