chore: sync up with Kellys changes from PR #9
Merge branch 'master' of https://github.com/CCBR/parkit
kopardev committed Apr 16, 2024
2 parents 03e2829 + 0b63a3d commit 361fee1
Showing 26 changed files with 260 additions and 161 deletions.
14 changes: 14 additions & 0 deletions .github/workflows/add_reponame_labels.yml
@@ -0,0 +1,14 @@
name: Auto add reponame as label to all issues and PRs

on:
  issues:
    types:
      - opened
  pull_request:
    types:
      - opened

jobs:
  add_label:
    uses: CCBR/.github/.github/workflows/add_reponame_issue_label.yml@v0.2.0
    secrets: inherit
16 changes: 16 additions & 0 deletions .github/workflows/auto-add-user-project.yml
@@ -0,0 +1,16 @@
name: Add to personal projects

on:
  issues:
    types:
      - assigned
  pull_request:
    types:
      - assigned

jobs:
  add-to-project:
    uses: CCBR/.github/.github/workflows/auto-add-user-project.yml@6af5593b1ad6d7ee2b7f4c23b351902d4baaacd6
    with:
      username: ${{ github.event.assignee.login }}
    secrets: inherit
20 changes: 20 additions & 0 deletions .github/workflows/docs.yml
@@ -0,0 +1,20 @@
name: docs
on:
  workflow_dispatch:
  push:
    branches:
      - main
    paths:
      - "docs/**"

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-python@v2
        with:
          python-version: 3.11
      - run: pip install --upgrade pip
      - run: pip install -r docs/requirements.txt
      - run: mkdocs gh-deploy --force
33 changes: 33 additions & 0 deletions .github/workflows/test.yml
@@ -0,0 +1,33 @@
name: test

on:
  push:
    branches:
      - main
      - develop
  pull_request:
    branches:
      - main
      - develop

jobs:
  test:
    runs-on: ubuntu-latest
    timeout-minutes: 2
    strategy:
      matrix:
        python-version: ["3.11"]
    steps:
      - uses: actions/checkout@v3
      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v4
        with:
          python-version: ${{ matrix.python-version }}
          cache: "pip"
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip setuptools
          pip install .[dev,test]
      - name: Test
        run: |
          python -m pytest
3 changes: 2 additions & 1 deletion .gitignore
@@ -1,7 +1,8 @@
build/
**/site/*
**/dist/*
**/*.egg_info/*
**/parkit_pkg.egg-info/*
**/parkit.egg-info/*
**/.koparde*
**/*.pyc
**/.prettierrc
1 change: 0 additions & 1 deletion README.md

This file was deleted.

133 changes: 133 additions & 0 deletions README.md
@@ -0,0 +1,133 @@
## parkit :parking: :blue_car:

**Park** an **arc**hived project tool**kit**!

> DISCLAIMERS:
>
> - works only on [BIOWULF](https://hpc.nih.gov/) or HELIX
> - moves files to [HPC-DME](https://hpcdmeweb.nci.nih.gov/login)

### Prerequisites:

- On helix or biowulf you can get access to `parkit` by loading the appropriate conda env

```bash
%> . "/data/CCBR_Pipeliner/db/PipeDB/Conda/etc/profile.d/conda.sh"
%> conda activate parkit
```

If you are not on helix or biowulf, you will have to **clone** the repo and **pip install** it.

- The [HPC_DME_APIs](https://github.com/CBIIT/HPC_DME_APIs) package must be cloned and set up correctly. Run `dm_generate_token` to generate a token before running `parkit`.

- The **HPC_DM_UTILS** environment variable must be set before calling `parkit`. It must also be passed as an argument (`--hpcdmutilspath`) to the `parkit_folder2hpcdme` and `parkit_tarball2hpcdme` end-to-end workflows.
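
A minimal pre-flight sketch for the last prerequisite. The function name `check_hpc_dm_utils` is hypothetical and not part of `parkit`, and the directory check assumes that `HPC_DM_UTILS` points at a directory (for example the `utils` folder of the HPC_DME_APIs clone); adjust it if your setup differs.

```bash
# Hypothetical helper, not shipped with parkit.
# Returns 0 when HPC_DM_UTILS looks usable and 1 otherwise.
check_hpc_dm_utils() {
    # Fail if the variable is unset or empty.
    if [ -z "${HPC_DM_UTILS:-}" ]; then
        echo "HPC_DM_UTILS is not set" >&2
        return 1
    fi
    # Assumption: the variable should name an existing directory.
    if [ ! -d "${HPC_DM_UTILS}" ]; then
        echo "HPC_DM_UTILS is not a directory: ${HPC_DM_UTILS}" >&2
        return 1
    fi
    return 0
}
```

Calling `check_hpc_dm_utils && parkit --help` would then run `parkit` only when the variable passes both checks.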

### Usage:

```bash
%> parkit --help
usage: parkit [-h] {createtar,createmetadata,createemptycollection,deposittar} ...

parkit subcommands to park data in HPCDME

positional arguments:
  {createtar,createmetadata,createemptycollection,deposittar}
                        Subcommand to run
    createtar           create tarball(and its filelist) from a project folder.
    createmetadata      create the metadata.json file required for a tarball (and its filelist)
    createemptycollection
                        creates empty project and analysis collections
    deposittar          deposit tarball(and filelist) into vault

options:
  -h, --help            show this help message and exit
```

### Example:

- Say you want to archive the `/data/CCBR/projects/ccbr_12345` folder to the `/CCBR_Archive/GRIDFTP/CCBR-12345` collection on HPC-DME.
- You can run the following commands sequentially to do this:

```bash
# create the tarball
%> parkit createtar --folder /data/CCBR/projects/ccbr_12345

# create an empty collection on HPC-DME
%> parkit createemptycollection --dest /CCBR_Archive/GRIDFTP/CCBR-12345 --projectdesc "testing" --projecttitle "test project 1"

# create required metadata
%> parkit createmetadata --tarball /data/CCBR/projects/ccbr_12345.tar --dest /CCBR_Archive/GRIDFTP/CCBR-12345

# deposit the tar into HPC-DME
%> parkit deposittar --tarball /data/CCBR/projects/ccbr_12345.tar --dest /CCBR_Archive/GRIDFTP/CCBR-12345

# a bunch of extra files are created in the process
%> ls /data/CCBR/projects/ccbr_12345.tar*
/data/CCBR/projects/ccbr_12345.tar /data/CCBR/projects/ccbr_12345.tar.filelist.md5 /data/CCBR/projects/ccbr_12345.tar.md5
/data/CCBR/projects/ccbr_12345.tar.filelist /data/CCBR/projects/ccbr_12345.tar.filelist.metadata.json /data/CCBR/projects/ccbr_12345.tar.metadata.json

# these extra files can now be deleted
%> rm -f /data/CCBR/projects/ccbr_12345.tar*

# you can also delete the recently parked project folder
%> rm -rf /data/CCBR/projects/ccbr_12345

# test results with
%> dm_get_collection /CCBR_Archive/GRIDFTP/CCBR-12345
# Done!
```
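
The four steps above can be previewed before anything is run. The sketch below only prints the commands and does not call `parkit`. The helper name `parkit_plan` is hypothetical, and it assumes that `createtar` writes the tarball next to the folder as `<folder>.tar`, as in the example above.

```bash
# Hypothetical helper, not part of parkit: print the four commands for one project.
# The body runs in a subshell "( ... )" so its variables do not leak into the caller.
parkit_plan() (
    folder="${1%/}"            # project folder, with any trailing slash removed
    dest="$2"                  # HPC-DME collection path
    desc="$3"                  # project description
    title="$4"                 # project title
    tarball="${folder}.tar"    # assumption: createtar writes <folder>.tar

    printf 'parkit createtar --folder %s\n' "$folder"
    printf 'parkit createemptycollection --dest %s --projectdesc "%s" --projecttitle "%s"\n' \
        "$dest" "$desc" "$title"
    printf 'parkit createmetadata --tarball %s --dest %s\n' "$tarball" "$dest"
    printf 'parkit deposittar --tarball %s --dest %s\n' "$tarball" "$dest"
)
```

For example, `parkit_plan /data/CCBR/projects/ccbr_12345 /CCBR_Archive/GRIDFTP/CCBR-12345 "testing" "test project 1"` prints the same four `parkit` commands used in the example, in order, so they can be reviewed before being run by hand.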

We also have end-to-end slurm-supported folder-to-hpcdme and tarball-to-hpcdme workflows:

- `parkit_folder2hpcdme`
- `parkit_tarball2hpcdme`

```bash
%> parkit_folder2hpcdme --help
usage: parkit_folder2hpcdme [-h] [--restartfrom RESTARTFROM] [--executor EXECUTOR] [--folder FOLDER] [--dest DEST]
                            [--projectdesc PROJECTDESC] [--projecttitle PROJECTTITLE] [--cleanup] --hpcdmutilspath HPCDMUTILSPATH
                            [--version]

End-to-end parkit: Folder 2 HPCDME

options:
  -h, --help            show this help message and exit
  --restartfrom RESTARTFROM
                        if restarting then restart from this step. Options are: createemptycollection, createmetadata, deposittar
  --executor EXECUTOR   slurm or local
  --folder FOLDER       project folder to archive
  --dest DEST           vault collection path (Analysis goes under here!)
  --projectdesc PROJECTDESC
                        project description
  --projecttitle PROJECTTITLE
                        project title
  --cleanup             post transfer step to delete local files
  --hpcdmutilspath HPCDMUTILSPATH
                        what should be the value of env var HPC_DM_UTILS
  --version             print version
```
```bash
%> parkit_tarball2hpcdme --help
usage: parkit_tarball2hpcdme [-h] [--restartfrom RESTARTFROM] [--executor EXECUTOR] [--tarball TARBALL] [--dest DEST]
                             [--projectdesc PROJECTDESC] [--projecttitle PROJECTTITLE] [--cleanup] --hpcdmutilspath HPCDMUTILSPATH
                             [--version]

End-to-end parkit: Tarball 2 HPCDME

options:
  -h, --help            show this help message and exit
  --restartfrom RESTARTFROM
                        if restarting then restart from this step. Options are: createemptycollection, createmetadata, deposittar
  --executor EXECUTOR   slurm or local
  --tarball TARBALL     project tarball to archive
  --dest DEST           vault collection path (Analysis goes under here!)
  --projectdesc PROJECTDESC
                        project description
  --projecttitle PROJECTTITLE
                        project title
  --cleanup             post transfer step to delete local files
  --hpcdmutilspath HPCDMUTILSPATH
                        what should be the value of env var HPC_DM_UTILS
  --version             print version
```
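
A possible way to wrap the folder-to-HPC-DME workflow, using only flags that appear in the help text above. The wrapper name `run_folder2hpcdme` and the `DRY_RUN` variable are hypothetical conveniences for this sketch, and defaulting `--executor` to `local` is a choice made here, not a documented default.

```bash
# Hypothetical wrapper, not part of parkit.
# Usage: run_folder2hpcdme FOLDER DEST PROJECTDESC PROJECTTITLE [EXECUTOR]
# The body runs in a subshell so a missing HPC_DM_UTILS aborts only this call.
run_folder2hpcdme() (
    # Build the command line; ${VAR:?msg} stops here if HPC_DM_UTILS is unset or empty.
    set -- parkit_folder2hpcdme \
        --folder "$1" \
        --dest "$2" \
        --projectdesc "$3" \
        --projecttitle "$4" \
        --executor "${5:-local}" \
        --hpcdmutilspath "${HPC_DM_UTILS:?HPC_DM_UTILS must be set}"

    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Only show what would be run (quoting is simplified for display).
        echo "dry run: $*"
    else
        "$@"
    fi
)
```

Setting `DRY_RUN=1` before the call prints the assembled command instead of running it, which is a cheap way to check the paths before submitting to slurm.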
133 changes: 0 additions & 133 deletions parkit_pkg/README.md

This file was deleted.
