Refactor for lambda #1

Open
wants to merge 40 commits into
base: master
40 commits
18a5e87
Added gitignore for python artifacts
datadavev Apr 1, 2020
823fa41
Added some initial test cases; list of requirements
datadavev Apr 2, 2020
1c6ed76
Refactored verification view, added openapi spec.
datadavev Apr 2, 2020
4ae2bc9
Adjusted openapi spec, adjust docker build
datadavev Apr 2, 2020
8066c45
Added example default values for get request
datadavev Apr 2, 2020
49c3aa9
Updating README
datadavev Apr 5, 2020
0b1871d
Updating README
datadavev Apr 5, 2020
060ef31
added graphic
datadavev Apr 5, 2020
30c9c2d
Updating README
datadavev Apr 5, 2020
b47fcb2
Updating README
datadavev Apr 5, 2020
5edd1fb
Updating README
datadavev Apr 5, 2020
37cd4db
cropped
datadavev Apr 5, 2020
ed171bd
Edit image
datadavev Apr 5, 2020
5ed6284
requirements for tangram_web
datadavev Apr 5, 2020
89d191a
Simplified and removed spurious code
datadavev Apr 5, 2020
27bb3e5
Simplified and removed spurious code
datadavev Apr 5, 2020
4550746
Files for AWS lambda deployment
datadavev Apr 5, 2020
3d09daf
files for AWS Lambda deployment
datadavev Apr 5, 2020
a20061a
Inc version
datadavev Apr 5, 2020
eda5305
adjusted git ignore
datadavev Apr 5, 2020
bcdf8fc
removed requirements.txt
datadavev Apr 5, 2020
57fd0d4
moved image
datadavev Apr 5, 2020
eb2d856
moved tg square
datadavev Apr 5, 2020
e5b6890
Fixes for environment detection
datadavev Apr 5, 2020
3159fa2
Adjusted for environment var
datadavev Apr 5, 2020
3a140a0
Corrected some references
datadavev Apr 5, 2020
6c4b598
Moved test to src folder
datadavev Apr 5, 2020
895ac6a
tocuh
datadavev Apr 5, 2020
9075285
moved source files under src
datadavev Apr 5, 2020
b7a2f22
Adjusted urls
datadavev Apr 5, 2020
ab48dbf
Remove old info from README
datadavev Apr 5, 2020
acf1cdd
bump minor version
datadavev Apr 5, 2020
58334bc
Adjusted for additional requirements for lambda
datadavev Apr 6, 2020
606f952
make requirements dependent cascade
datadavev Apr 7, 2020
79f767f
hacking a headless setup. currently borked...
datadavev Apr 7, 2020
676d130
Headless work moved to develop-headless
datadavev Apr 7, 2020
7b9fb16
prepping for v3
datadavev Apr 7, 2020
ccf5c44
Added options for running validation
datadavev Apr 7, 2020
7d2c071
closes #2, closes #3
datadavev Apr 7, 2020
ce0ea96
updating instructions
datadavev Apr 7, 2020
148 changes: 148 additions & 0 deletions .gitignore
@@ -0,0 +1,148 @@
.idea
lambda_env/
src/zappa_settings.json
src/lambda_role_policy.json

# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
pip-wheel-metadata/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
.pybuilder/
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
# .python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock

# PEP 582; used by e.g. github.com/David-OConnor/pyflow
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# pytype static type analyzer
.pytype/

# Cython debug symbols
cython_debug/

# static files generated from Django application using `collectstatic`
media
static
135 changes: 126 additions & 9 deletions README.md
@@ -1,8 +1,24 @@
# Tangram
# <img src="./docs/tangram_square.svg" width="40px" /> Tangram

## About

A test program using Google Cloud Run for doing SHACL conversion via pyshacl.
Tangram applies [SHACL](https://www.w3.org/TR/shacl/) graphs to [schema.org](https://schema.org) graphs
to evaluate conformance with [ESIP Science-on-Schema.org](https://github.com/ESIPFed/science-on-schema.org) guidelines.

Tangram emerged from a combination of the original [`Tangram`](https://github.com/earthcubearchitecture-project418/tangram)
by Doug Fils and [`so-tools`](https://github.com/datadavev/sotools) by Dave Vieglais with support of ESIP
through a summer project activity. Tangram relies heavily on [RDFLib](https://github.com/RDFLib/rdflib) and
[PySHACL](https://github.com/RDFLib/pySHACL) for logic, and the ESIP Science-on-Schema.org guidelines for
validation rules which are implemented primarily as SHACL shape graphs.

Tangram is provided as a command-line application and as a Flask web service. The web service can be run locally,
behind a web server such as Apache, as a Docker application, on Google Cloud Run, or as an AWS Lambda service.

The core operational workflow of Tangram takes as input a data graph and a shape graph and outputs a
[SHACL validation report](https://www.w3.org/TR/shacl/#validation-report) in plain text or RDF.



## Tangram: Simple service example

@@ -15,21 +31,122 @@ Invoke the tool with something like:
With httpie client:

```bash
httpclient -f POST https://tangram.gleaner.io/uploader datagraph@./datagraphs/dataset-minimal-BAD.json-ld shapegraph@./shapegraphs/googleRecommended.ttl format=human
httpclient -f POST http://localhost:8080/verify \
dg@./datagraphs/dataset-minimal-BAD.json-ld \
sg@./shapegraphs/googleRecommended.ttl \
fmt=human
```

localhost
httpclient -f POST http://localhost:8080/uploader datagraph@./datagraphs/dataset-minimal-BAD.json-ld shapegraph@./shapegraphs/googleRecommended.ttl format=human
Or with good old curl (with format set to human):

```bash
curl -F 'dg=@./datagraphs/dataset-minimal-BAD.json-ld' \
-F 'sg=@./shapegraphs/googleRecommended.ttl' \
-F 'fmt=human' \
http://localhost:8080/verify
```
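
The same request can be assembled from Python using only the standard library. The following is a minimal sketch, not part of Tangram: `build_verify_request` is a hypothetical helper and the part filenames are arbitrary. The field names `dg`, `sg`, and `fmt` and the `/verify` URL are taken from the curl example above.

```python
import uuid
from urllib import request


def build_verify_request(data_graph: bytes, shape_graph: bytes,
                         fmt: str = "human",
                         url: str = "http://localhost:8080/verify") -> request.Request:
    """Build (but do not send) a multipart POST with dg, sg and fmt fields."""
    boundary = uuid.uuid4().hex
    parts = []
    # File fields: the names dg and sg match the curl example.
    # The filenames are placeholders chosen for this sketch.
    for name, filename, payload in (
        ("dg", "data.json-ld", data_graph),
        ("sg", "shape.ttl", shape_graph),
    ):
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"; filename="{filename}"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"
        ).encode("utf-8")
        parts.append(header + payload + b"\r\n")
    # Plain form field selecting the report format.
    parts.append(
        (
            f"--{boundary}\r\n"
            'Content-Disposition: form-data; name="fmt"\r\n\r\n'
            f"{fmt}\r\n"
        ).encode("utf-8")
    )
    # Closing boundary terminates the multipart body.
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return request.Request(url, data=b"".join(parts), headers=headers, method="POST")
```

Passing the returned request to `urllib.request.urlopen` sends it, which requires a running Tangram instance on port 8080.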

Or with good old curl (with format set to human):
## Install

### Local development install and run

This process is for local development, testing, and use. It is not suitable for a production deployment.
```bash
curl -F 'datagraph=@./datagraphs/dataset-minimal-BAD.json-ld' -F 'shapegraph=@./shapegraphs/googleRecommended.ttl' -F 'format=human' https://tangram.gleaner.io/uploader
python -m venv env
. env/bin/activate
pip install -e "git+git://github.com/RDFLib/rdflib.git#egg=rdflib"
pip install -e "git+git://github.com/RDFLib/rdflib-jsonld.git#egg=rdflib_jsonld"
pip install -r requirements-web.txt
python tangram_web.py
#Open http://localhost:8080 in a browser
```

## Tangram testing a web page
### Docker build and run

Docker offers a simple path for deploying Tangram as a production service.

The following provides a local test deployment that is removed when the Docker container is shut down:

```bash
httpclient "https://tangram.gleaner.io/ucheck?url=http://opencoredata.org/doc/dataset/b8d7bd1b-ef3b-4b08-a327-e28e1420adf0&format=human&shape=required"
make docker
docker run --rm --publish 8080:8080 --name tangram "fils/p418tangram:$(cat VERSION)"
#Open http://localhost:8080 in a browser
```


### AWS Lambda

Deployment to AWS Lambda is managed by Zappa, which packages the virtual environment and deploys it to
Lambda in an environment described in `zappa_settings.json`. The included copy of `zappa_settings.json` will need to be
adjusted to work with the AWS account being used to manage the deployment. In particular, adjust the `role_arn`
value to the role created for the deployment.

1. Set up virtual environment

A deployment to AWS Lambda needs to include all dependencies outside the standard Python libraries. Zappa creates
the deployment by examining the Python environment and bundling up extra libraries. The simplest way to do this
is with a virtual environment.

```shell script
python -m venv lambda_env
. lambda_env/bin/activate
pip install -r requirements.txt
pip install -r requirements-web.txt
pip install -r aws_lambda/requirements.txt
```

2. Prepare AWS IAM role for running the task

```shell script
# cd to the folder that contains lambda_web.py
# Set this to the account ID
ACCOUNT_ID="77633809XXXX"

aws iam create-role \
--role-name ZappaLambdaExecution \
--assume-role-policy-document file://aws_lambda/policy.json

aws iam attach-role-policy \
--role-name ZappaLambdaExecution \
--policy-arn "arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole"

#Update the role policy with the account number
sed "s/__ACOUNT_ID__/${ACCOUNT_ID}/g" aws_lambda/role_policy_template.json > "lambda_role_policy.json"

aws iam put-role-policy \
--role-name ZappaLambdaExecution \
--policy-name AWSLambdaBasicExecutionRole \
--policy-document "file://lambda_role_policy.json"

#Create zappa_settings.json with the account id in the current folder
sed "s/__ACCOUNT_ID__/${ACCOUNT_ID}/g" aws_lambda/zappa_settings_template.json > zappa_settings.json
```

3. Deploy the Lambda

The following all assume `cwd` is the folder containing `tangram_web.py`.

```shell script
$ zappa deploy dev
...
Deploying API Gateway..
Deployment complete!: https://7qwcfov1fc.execute-api.us-east-1.amazonaws.com/dev
```

To update the deployment after changes:

```shell script
$ zappa update dev
```

To remove the Lambda deployment:

```shell script
$ zappa undeploy dev
```

Pushing resources to S3

```shell script
aws s3 cp --recursive resources s3://sosov-${ACCOUNT_ID}/resources
```
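
The `sed` substitutions in step 2 can also be done in Python, which makes it easy to fail loudly when the placeholder is missing from a template. This is a sketch under assumptions: `render_template` is a hypothetical helper, and the template content shown is illustrative only. The real `aws_lambda/zappa_settings_template.json` may contain other keys.

```python
import json


def render_template(template_text: str, account_id: str,
                    placeholder: str = "__ACCOUNT_ID__") -> dict:
    """Substitute the account ID into a JSON template and parse the result."""
    if placeholder not in template_text:
        # With sed, a misspelled placeholder passes through silently;
        # here it is reported instead.
        raise ValueError(f"placeholder {placeholder!r} not found in template")
    return json.loads(template_text.replace(placeholder, account_id))


# Hypothetical template content, for illustration only.
EXAMPLE_TEMPLATE = """
{
  "dev": {
    "role_arn": "arn:aws:iam::__ACCOUNT_ID__:role/ZappaLambdaExecution"
  }
}
"""

settings = render_template(EXAMPLE_TEMPLATE, "123456789012")
print(settings["dev"]["role_arn"])
```

Parsing the rendered text as JSON also catches a template that is no longer valid after substitution, before Zappa reads it.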
1 change: 0 additions & 1 deletion VERSION

This file was deleted.
