Update docs
sergiimk committed Aug 5, 2021
1 parent ccfc6bf commit bd176ee
Showing 2 changed files with 28 additions and 8 deletions.
6 changes: 5 additions & 1 deletion CHANGELOG.md
@@ -4,9 +4,13 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [0.XX.0] - 2021-08-XX
## [0.49.0] - 2021-08-05
### Added
- Datasets can now be associated with remote repositories for ease of pulling and pushing
- `push` and `pull` commands now allow renaming the remote or local datasets
### Fixed
- Improved error reporting of invalid remote credentials
- Ingest will not create a new block if neither data nor watermark has changed

## [0.48.2] - 2021-07-31
### Fixed
30 changes: 23 additions & 7 deletions docs/sharing_data.md
@@ -5,7 +5,7 @@
- [Pulling Data](#pulling-data)
- [Pushing Data](#pushing-data)

While `kamu` is a very powerful tool for managing and processing data on your own computer the real power of it becomes apparent when you start exchanging data with other people. Thanks to its core properties it makes sharing data reliable and safe both within your organization and between multiple completely independent parties.
While `kamu` is a very powerful tool for managing and processing data on your own computer, the real power of it becomes apparent only when you start exchanging data with other people. Thanks to its core properties it makes sharing data reliable and safe both within your organization and between multiple completely independent parties.


## Remote Types
@@ -34,19 +34,35 @@ This remote will now be visible in `kamu remote list`.
If the remote you added already contains a dataset you're interested in you can download it using the `pull` command:

```bash
kamu pull com.acme.shipments --remote acme
# Pulls `acme/com.acme.shipments` into local dataset `com.acme.shipments`
kamu pull acme/com.acme.shipments

# Or pull `acme/com.acme.shipments` into local dataset named `shipments`
kamu pull acme/com.acme.shipments --as shipments
```

This command will download all contents of the dataset to your computer and validate the integrity of metadata.
These commands will associate the local dataset with the remote, so next time you pull you can simply do:

> Note: Currently the `pull` command with `--remote` flag does not create association between the downloaded dataset and the remote it came from, so executing `kamu pull com.acme.shipments` will make `kamu` attempt to perform ingest or derivative transformation (depending on the type of the dataset) instead of refreshing data from the remote. Bear with us while we improve the remotes API and continue to specify the `--remote` option for now.
```bash
# Will pull from associated `acme/com.acme.shipments`
kamu pull shipments
```

These associations are called "remote aliases" and can be viewed using:

```bash
kamu remote alias list
```

## Pushing Data
If you have created a brand new dataset you would like to share or made some changes to a dataset you are sharing with your friends - you can upload the new data using the `push` command:
If you have created a brand new dataset you would like to share, or made some changes to a dataset you are sharing with your friends - you can upload the new data using the `push` command:

```bash
kamu push com.acme.orders --remote acme
# Push local dataset `orders` to remote `acme/com.acme.orders`
kamu push orders --as acme/com.acme.orders

# This creates a push alias, so next time you can push as simply as
kamu push orders
```

This command will analyze the state of the dataset at the remote and will only upload data and metadata that wasn't previously seen.
This command will analyze the state of the dataset at the remote and will only upload data and metadata that wasn't previously seen. It also detects any type of history collisions, so you will never overwrite someone else's changes.
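
The incremental upload and collision check described above can be pictured with a small sketch. This is not how `kamu` is implemented; it is a simplified model that assumes a dataset's history is a space-separated list of block hashes, oldest first. The function name `plan_push` and the hash values are hypothetical.

```bash
# Hypothetical model, not kamu's actual implementation.
# plan_push LOCAL_HISTORY REMOTE_HISTORY
# Each history is a space-separated list of block hashes, oldest first.
plan_push() {
  plan_local=$1
  plan_remote=$2

  if [ "$plan_local" = "$plan_remote" ]; then
    # Both sides hold the same blocks: nothing to send
    echo "up to date"
  elif [ -z "$plan_remote" ]; then
    # Remote has no blocks yet: the whole history is new to it
    echo "upload: $plan_local"
  else
    case "$plan_local" in
      "$plan_remote "*)
        # Remote history is a prefix of the local one:
        # send only the blocks the remote has not seen
        plan_rest=${plan_local#"$plan_remote "}
        echo "upload: $plan_rest"
        ;;
      *)
        # Histories differ, or the remote is ahead of the local copy:
        # refuse rather than overwrite someone else's changes
        echo "diverged: refusing to overwrite remote history"
        return 1
        ;;
    esac
  fi
}
```

In this model, calling `plan_push "b1 b2 b3" "b1 b2"` reports that only `b3` needs uploading, while `plan_push "b1 x2 b3" "b1 b2"` fails because the two histories disagree at the second block.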
