-
Notifications
You must be signed in to change notification settings - Fork 2
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Signed-off-by: Igor Shishkin <me@teran.dev>
- Loading branch information
Showing
1 changed file
with
37 additions
and
1 deletion.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,39 @@ | ||
# archived | ||
|
||
Cloud native service so store versioned data in space-efficient manner | ||
Cloud native service to store versioned data in space-efficient manner | ||
|
||
## How it works | ||
|
||
archived is inspired by `rsync --link-dest` which allowed to store package | ||
mirrors without duplicating data for decades. And now archived makes this | ||
approach unbound from local file systems by using modern era storage services | ||
under the hood like S3. | ||
|
||
To do so archived relies on two storages: metadata and CAS. | ||
|
||
Metadata is a some kind of database to store all of the things: | ||
|
||
* containers - some kind of directories | ||
* versions - immutable version of the data in container | ||
* objects - named data BLOBs with some additional metadata | ||
|
||
Good example of metadata storage is a PostgreSQL database. | ||
|
||
CAS storage is a BLOB storage which stores the data behind objects. | ||
CAS is actually an acronym means Content Addressed Storage which describes | ||
how exactly it operates: stores BLOBs under content aware unique key (SHA256 | ||
is used by default). | ||
|
||
Good example of CAS storage is S3. | ||
|
||
This approach allows to reduce raw data usage by linking duplicates instead | ||
if storing copies. | ||
|
||
## archived components | ||
|
||
archived is built with microservice architecture containing the following | ||
components: | ||
|
||
* access - HTTP server to allow data listing and fetching | ||
* manage - gRPC API to manage containers, versions and objects | ||
* CLI - CLI application to interact with manage component |