Fixed repo config for Maven and add docker build (#62)
* Configure cache directly in actions/setup-java
* Remove build config for Azure Pipelines and Cloud Build
* Fixed documentation since targetUri option and export endpoint is removed
* Set up docker build
bjornandre authored Nov 14, 2023
1 parent f16b050 commit a52f241
Showing 7 changed files with 209 additions and 252 deletions.
66 changes: 59 additions & 7 deletions .github/workflows/build.yml
@@ -4,9 +4,22 @@ on:
push:
branches:
- master
paths:
- src/**
- conf/**
- Dockerfile
pull_request:
branches:
- master
paths:
- src/**
- conf/**
- Dockerfile

env:
REGISTRY: europe-north1-docker.pkg.dev/artifact-registry-5n/dapla-pseudo-docker/ssb/dapla
IMAGE: dapla-dlp-pseudo-service
TAG: ${{ github.ref_name }}-${{ github.sha }}
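
For readers unfamiliar with how these workflow variables combine, here is a minimal local shell sketch of the image reference the later push steps use. `REF_NAME` and `SHA` are placeholders standing in for `github.ref_name` and `github.sha`:

```shell
# Local sketch: how REGISTRY, IMAGE and TAG combine into the full image
# reference. REF_NAME and SHA are placeholder values, not workflow output.
REGISTRY="europe-north1-docker.pkg.dev/artifact-registry-5n/dapla-pseudo-docker/ssb/dapla"
IMAGE="dapla-dlp-pseudo-service"
REF_NAME="master"
SHA="a52f241"
TAG="${REF_NAME}-${SHA}"
echo "${REGISTRY}/${IMAGE}:${TAG}"
```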

jobs:
build:
@@ -23,6 +36,7 @@ jobs:
with:
java-version: 21
distribution: zulu
cache: maven

- name: Authenticate to Google Cloud
id: auth
@@ -32,13 +46,51 @@
service_account: "gh-actions-dapla-pseudo@artifact-registry-5n.iam.gserviceaccount.com"
token_format: access_token

- name: Cache Maven packages
uses: actions/cache@v3
with:
path: ~/.m2
key: ${{ runner.os }}-m2-${{ hashFiles('**/pom.xml') }}
restore-keys: ${{ runner.os }}-m2

- name: Build with Maven and deploy to Artifact Registry
run: mvn --batch-mode -P ssb-bip deploy

- name: Clean up artifacts that are no longer needed
run: |
rm -f target/dapla-dlp-pseudo-service-*-sources.jar
rm -f target/dapla-dlp-pseudo-service-*-javadoc.jar
ls -al target/dapla-dlp-pseudo-service-*.jar
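
The cleanup step above matters because the Dockerfile copies the application jar with a wildcard, which must resolve to a single file. A quick local sketch (using a made-up version number) shows that removing the `-sources` and `-javadoc` jars leaves exactly one match:

```shell
# Why the cleanup step exists: the Dockerfile's
# `COPY target/dapla-dlp-pseudo-service-*.jar ...` must match exactly one
# file. The version number below is illustrative only.
demo=$(mktemp -d)
mkdir -p "$demo/target"
touch "$demo/target/dapla-dlp-pseudo-service-1.0.0.jar" \
      "$demo/target/dapla-dlp-pseudo-service-1.0.0-sources.jar" \
      "$demo/target/dapla-dlp-pseudo-service-1.0.0-javadoc.jar"
rm -f "$demo"/target/dapla-dlp-pseudo-service-*-sources.jar
rm -f "$demo"/target/dapla-dlp-pseudo-service-*-javadoc.jar
ls "$demo"/target/dapla-dlp-pseudo-service-*.jar  # exactly one jar remains
```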
- name: Set up Docker Buildx
id: buildx
uses: docker/setup-buildx-action@v2

- name: Login to Artifact Registry
uses: docker/login-action@v2
with:
registry: ${{ env.REGISTRY }}
username: "oauth2accesstoken"
password: "${{ steps.auth.outputs.access_token }}"

- name: Docker meta
id: metadata
uses: docker/metadata-action@v5
with:
images: ${{ env.REGISTRY }}/${{ env.IMAGE }}
# Docker tags based on the following events/attributes
tags: |
type=ref,event=branch
type=ref,event=pr
type=raw,value=latest,enable={{is_default_branch}}
type=semver,pattern=v{{version}}
type=semver,pattern=v{{major}}.{{minor}}
type=semver,pattern=v{{major}}
type=raw,value=${{ env.TAG }},enable=true
- name: Build and push
id: docker_build
uses: docker/build-push-action@v4
with:
file: Dockerfile
push: true
context: .
tags: |
${{ steps.metadata.outputs.tags }}
labels: ${{ steps.metadata.outputs.labels }}

- name: Image digest
run: echo ${{ steps.docker_build.outputs.digest }}
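
A rough local equivalent of the login-and-push steps above, as a sketch only: it assumes `gcloud` is already authenticated with access to the Artifact Registry repository, the `dev` tag is illustrative, and the docker commands are left commented out so nothing is pushed when run as-is:

```shell
# Sketch of reproducing the workflow's registry login and push locally.
# Assumes an authenticated gcloud; the "dev" tag is illustrative.
REGISTRY="europe-north1-docker.pkg.dev/artifact-registry-5n/dapla-pseudo-docker/ssb/dapla"
IMAGE_REF="${REGISTRY}/dapla-dlp-pseudo-service:dev"
# gcloud auth print-access-token \
#   | docker login -u oauth2accesstoken --password-stdin "${REGISTRY}"
# docker buildx build --push -t "${IMAGE_REF}" .
echo "${IMAGE_REF}"
```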
2 changes: 1 addition & 1 deletion Dockerfile
@@ -1,4 +1,4 @@
-FROM azul/zulu-openjdk:17
+FROM azul/zulu-openjdk:21
RUN apt-get -qq update && apt-get -y dist-upgrade && apt-get -y --no-install-recommends install curl
COPY target/dapla-dlp-pseudo-service-*.jar dapla-dlp-pseudo-service.jar
COPY target/classes/logback*.xml /conf/
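
To build and smoke-test the image locally, something like the following sketch works; the tag name and service port are assumptions (not taken from this diff), and the build/run commands are commented out so the snippet is inert as-is:

```shell
# Local build/run sketch for the Dockerfile above. The tag is illustrative
# and the port is an assumption (check the service config in conf/).
IMAGE_TAG="dapla-dlp-pseudo-service:local"
# mvn --batch-mode package            # produces target/dapla-dlp-pseudo-service-*.jar
# docker build -t "${IMAGE_TAG}" .
# docker run --rm -p 8080:8080 "${IMAGE_TAG}"
echo "${IMAGE_TAG}"
```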
46 changes: 1 addition & 45 deletions README.md
@@ -11,49 +11,6 @@ Browse the API docs as:

## Examples

### Export a dataset
```sh
curl "${root_url}/export" -i -H "Authorization: Bearer ${dapla_auth_token}" --data @export-request.json
```
Where `root_url` points to an instance of the pseudo-service, `dapla_auth_token` is a JWT and
`export-request.json` is a file containing the request, e.g.:

```json
{
"sourceDataset": {
"root": "gs://ssb-dev-demo-enhjoern-a-data-produkt",
"path": "/path/to/data",
"version": "123"
},
"targetContentName": "test",
"targetContentType": "application/json",
"targetPassword": "kensentme",
"depseudonymize": true,
"pseudoRules": [
{
"name": "kontonummer",
"pattern": "**/kontonummer",
"func": "fpe-anychar(secret1)"
}
]
}
```

This example exports a dataset located in a GCS bucket at `gs://ssb-dev-demo-enhjoern-a-data-produkt/path/to/data/123`.
Columns matching `**/kontonummer` will be depseudonymized using the function `fpe-anychar(secret1)`, and the result is
then compressed, encrypted and uploaded (as JSON) to the preconfigured data export bucket (see config).

Note that the above will export all data. If you only need a subset of fields, you can specify this with column selector
glob expressions, like so:
```
"columnSelectors": [
"**/foedsel*",
"**/kontonummer"
]
```


### Pseudonymize JSON file and stream back the result

```sh
@@ -94,13 +51,12 @@ curl "${root_url}/depseudonymize/file" \
}'
```

-### Depseudonymize JSON file and upload to google cloud storage as zipped CSV-file
+### Depseudonymize JSON file and download a zipped CSV-file
```sh
curl "${root_url}/depseudonymize/file" \
--header "Authorization: Bearer ${dapla_auth_token}" \
--form 'data=@src/test/resources/data/15k-pseudonymized.json' \
--form 'request={
"targetUri": "gs://ssb-dev-demo-enhjoern-a-data-export/path/to/depseudonymized-csv.zip",
"targetContentType": "text/csv",
"pseudoConfig": {
"rules": [
47 changes: 0 additions & 47 deletions azure-pipelines.yml

This file was deleted.

21 changes: 0 additions & 21 deletions cloudbuild.yml

This file was deleted.

