
Commit

Update to version v1.3.0
aassadza committed Jun 24, 2021
1 parent d8153d4 commit 9b84ed8
Showing 31 changed files with 1,125 additions and 517 deletions.
11 changes: 11 additions & 0 deletions CHANGELOG.md
@@ -5,6 +5,17 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [1.3.0] - 2021-06-24

### Added

- The option to use [Amazon SageMaker Model Registry](https://docs.aws.amazon.com/sagemaker/latest/dg/model-registry.html) to deploy versioned models (see the sketch after this list). The model registry allows you to catalog models for production, manage model versions, associate metadata with models, manage a model's approval status, deploy models to production, and automate model deployment with CI/CD.
- The option to use an [AWS Organizations delegated administrator account](https://docs.amazonaws.cn/en_us/AWSCloudFormation/latest/UserGuide/stacksets-orgs-delegated-admin.html) to orchestrate the deployment of Machine Learning (ML) workloads across the AWS Organizations accounts using AWS CloudFormation StackSets.
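
To make the first item above concrete, here is a minimal sketch of what deploying a versioned model from the registry involves, using boto3. The model package ARN, model name, and role ARN are placeholders for illustration, not values produced by this solution:

```
import boto3

sm = boto3.client("sagemaker")

# hypothetical model package ARN; its model package group name is "xgboost"
package_arn = "arn:aws:sagemaker:us-east-1:123456789012:model-package/xgboost/1"

# mark this package version as approved so it can be deployed
sm.update_model_package(ModelPackageArn=package_arn, ModelApprovalStatus="Approved")

# create a deployable SageMaker model directly from the registered package
sm.create_model(
    ModelName="my-registered-model",  # placeholder name
    ExecutionRoleArn="arn:aws:iam::123456789012:role/MySageMakerRole",  # placeholder role
    Containers=[{"ModelPackageName": package_arn}],
)
```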

### Updated

- The AWS Lambda layer for the Amazon SageMaker SDK is now built using the lambci/lambda:build-python3.8 Docker image.

## [1.2.0] - 2021-05-04

### Added
30 changes: 17 additions & 13 deletions README.md
@@ -31,7 +31,7 @@ to repeat successful processes at scale.

This solution is built with two primary components: 1) the orchestrator component, created by deploying the solution’s AWS CloudFormation template, and 2) the AWS CodePipeline instance deployed from either calling the solution’s API Gateway, or by committing a configuration file into an AWS CodeCommit repository. The solution’s pipelines are implemented as AWS CloudFormation templates, which allows you to extend the solution and add custom pipelines.
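
For illustration, a pipeline provisioning call through the solution's API Gateway might look like the following sketch. The endpoint URL and all values are placeholders; the event keys follow the required keys validated in source/lambdas/pipeline_orchestration/lambda_helpers.py:

```
import json

import requests  # assumed to be available in your environment

# placeholder endpoint; use the API Gateway URL output by the orchestrator stack
api_url = "https://example-api-id.execute-api.us-east-1.amazonaws.com/prod/provisionpipeline"

# example request for a real-time pipeline using a built-in SageMaker algorithm
event = {
    "pipeline_type": "byom_realtime_builtin",
    "model_name": "my-model",
    "model_artifact_location": "s3://my-bucket/model.tar.gz",
    "model_framework": "xgboost",
    "model_framework_version": "1",
    "inference_instance": "ml.m5.large",
    "data_capture_location": "s3://my-bucket/datacapture",
}

response = requests.post(api_url, data=json.dumps(event))
print(response.status_code, response.text)
```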

To support multiple use cases and business needs, the solution provides two AWS CloudFormation templates: **option 1** for single account deployment, and **option 2** for multi-account deployment.
To support multiple use cases and business needs, the solution provides two AWS CloudFormation templates: **option 1** for single account deployment, and **option 2** for multi-account deployment. In both templates, the solution provides the option to use Amazon SageMaker Model Registry to deploy versioned models.

### Template option 1: Single account deployment

@@ -41,7 +41,7 @@ The solution’s single account architecture allows you to provision ML pipeline

### Template option 2: Multi-account deployment

The solution uses [AWS Organizations](https://aws.amazon.com/organizations/) and [AWS CloudFormation StackSets](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/what-is-cfnstacksets.html) to allow you to provision or update ML pipelines across AWS accounts. Using an administrator account (also referred to as the orchestrator account) allows you to deploy ML pipelines implemented as AWS CloudFormation templates into selected target accounts (for example, development, staging, and production accounts).
The solution uses [AWS Organizations](https://aws.amazon.com/organizations/) and [AWS CloudFormation StackSets](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/what-is-cfnstacksets.html) to allow you to provision or update ML pipelines across AWS accounts. Using an AWS Organizations administrator account (a delegated administrator account or the management account), also referred to as the orchestrator account, allows you to deploy ML pipelines implemented as AWS CloudFormation templates into selected target accounts (for example, development, staging, and production accounts).
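
If you use a delegated administrator account, that account must first be registered from the organization's management account with the CloudFormation StackSets service principal. A minimal sketch of that registration with boto3, assuming trusted access for StackSets is already enabled and using the service principal documented for service-managed StackSets:

```
import boto3

# run this from the AWS Organizations management account; trusted access for
# AWS CloudFormation StackSets must already be enabled in the organization
org = boto3.client("organizations")

org.register_delegated_administrator(
    AccountId="111122223333",  # placeholder delegated administrator account ID
    ServicePrincipal="member.org.stacksets.cloudformation.amazonaws.com",
)
```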

![architecture-option-2](source/architecture-option-2.png)

@@ -79,6 +79,12 @@ Upon successfully cloning the repository into your local development environment

## Creating a custom build

### Prerequisites

- Python 3.8
- [AWS Command Line Interface](https://aws.amazon.com/cli/)
- Docker (required to build the AWS Lambda layer for Amazon SageMaker SDK)

### 1. Clone the repository

Clone this git repository.
@@ -113,7 +119,7 @@ chmod +x ./build-s3-dist.sh
./build-s3-dist.sh $DIST_OUTPUT_BUCKET $SOLUTION_NAME $VERSION
```

- Deploy the distributable to an Amazon S3 bucket in your account. Note: you must have the AWS Command Line Interface installed.
- Upload the distributable assets to an Amazon S3 bucket in your account. Note: Ensure that you own the Amazon S3 bucket before uploading the assets. You can upload the assets using the AWS Management Console or the AWS CLI, as shown below.

```
aws s3 cp ./global-s3-assets/ s3://my-bucket-name-<aws_region>/aws-mlops-framework/<my-version>/ --recursive --acl bucket-owner-full-control --profile aws-cred-profile-name
@@ -130,6 +136,14 @@ $SOLUTION_NAME - The name of this solution (example: aws-mlops-framework)
$VERSION - The version number of the change
```

## Uninstall the solution

Please refer to the [Uninstall the solution section](https://docs.aws.amazon.com/solutions/latest/aws-mlops-framework/uninstall-the-solution.html) in the [solution's implementation guide](https://docs.aws.amazon.com/solutions/latest/aws-mlops-framework/welcome.html).

## Collection of operational metrics

This solution collects anonymous operational metrics to help AWS improve the quality and features of the solution. For more information, including how to disable this capability, please see the [implementation guide](https://docs.aws.amazon.com/solutions/latest/aws-mlops-framework/operational-metrics.html).

## Known Issues

### Image Builder Pipeline may fail due to Docker Hub rate limits
@@ -141,16 +155,6 @@ This is due to Docker Inc. [limiting the rate at which images are pulled under D

For more information regarding this issue and short-term and long-term fixes, refer to this AWS blog post: [Advice for customers dealing with Docker Hub rate limits, and a Coming Soon announcement](https://aws.amazon.com/blogs/containers/advice-for-customers-dealing-with-docker-hub-rate-limits-and-a-coming-soon-announcement/)

### Model Monitor Blueprint may fail in multi-account deployment option

When using the Model Monitor pipeline blueprint with the multi-account deployment option, deploying the stack to the staging ("DeployStaging") account may fail with the following error message:

```
Resource handler returned message: "Error occurred during operation 'CREATE'." (RequestToken:<token-id>, HandlerErrorCode: GeneralServiceException)
```

Workaround: There is currently no known workaround for this issue in the multi-account Model Monitor blueprint.

---

Copyright 2020-2021 Amazon.com, Inc. or its affiliates. All Rights Reserved.
9 changes: 5 additions & 4 deletions deployment/build-s3-dist.sh
@@ -67,6 +67,11 @@ echo "--------------------------------------------------------------------------

echo "cd $source_dir"
cd $source_dir

# setup lambda layers (building sagemaker layer using lambda build environment for python 3.8)
echo 'docker run -v "$source_dir"/lib/blueprints/byom/lambdas/sagemaker_layer:/var/task lambci/lambda:build-python3.8 /bin/bash -c "cat requirements.txt; pip3 install --upgrade -r requirements.txt -t ./python; exit"'
docker run -v "$source_dir"/lib/blueprints/byom/lambdas/sagemaker_layer:/var/task lambci/lambda:build-python3.8 /bin/bash -c "cat requirements.txt; pip3 install --upgrade -r requirements.txt -t ./python; exit"

echo "python3 -m venv .venv-prod"
python3 -m venv .venv-prod
echo "source .venv-prod/bin/activate"
@@ -82,10 +87,6 @@ pip install -r ./lambdas/custom_resource/requirements.txt -t ./lambdas/custom_re
echo "pip install -r ./lambdas/solution_helper/requirements.txt -t ./lambdas/solution_helper/"
pip install -r ./lambdas/solution_helper/requirements.txt -t ./lambdas/solution_helper/

# setup lambda layers
echo "pip install -r ./lib/blueprints/byom/lambdas/sagemaker_layer/requirements.txt -t ./lib/blueprints/byom/lambdas/sagemaker_layer/python/"
pip install -r ./lib/blueprints/byom/lambdas/sagemaker_layer/requirements.txt -t ./lib/blueprints/byom/lambdas/sagemaker_layer/python/

# setup crhelper for invoke lambda custom resource
echo "pip install -r ./lib/blueprints/byom/lambdas/invoke_lambda_custom_resource/requirements.txt -t ./lib/blueprints/byom/lambdas/invoke_lambda_custom_resource/"
pip install -r ./lib/blueprints/byom/lambdas/invoke_lambda_custom_resource/requirements.txt -t ./lib/blueprints/byom/lambdas/invoke_lambda_custom_resource/
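The docker run step added above installs the SageMaker SDK into a python/ directory, which is the folder layout AWS Lambda expects inside a Python layer. For context, a hedged sketch of how such a layer could be declared in CDK (the construct name and usage here are illustrative, not taken from this commit):

```
from aws_cdk import aws_lambda as lambda_, core


def create_sagemaker_layer(scope: core.Construct) -> lambda_.LayerVersion:
    # the asset directory contains the python/ folder produced by the docker build
    return lambda_.LayerVersion(
        scope,
        "SageMakerSDKLayer",
        code=lambda_.Code.from_asset("lib/blueprints/byom/lambdas/sagemaker_layer"),
        compatible_runtimes=[lambda_.Runtime.PYTHON_3_8],
    )
```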
32 changes: 18 additions & 14 deletions source/app.py
@@ -20,72 +20,76 @@
from lib.blueprints.byom.multi_account_codepipeline import MultiAccountCodePipelineStack
from lib.blueprints.byom.byom_custom_algorithm_image_builder import BYOMCustomAlgorithmImageBuilderStack
from lib.aws_sdk_config_aspect import AwsSDKConfigAspect
from lib.blueprints.byom.pipeline_definitions.cdk_context_value import get_cdk_context_value

solution_id = "SO0136"
app = core.App()
solution_id = get_cdk_context_value(app, "SolutionId")
version = get_cdk_context_value(app, "Version")

mlops_stack_single = MLOpsStack(
app, "aws-mlops-single-account-framework", description=f"({solution_id}) - AWS MLOps Framework. Version %%VERSION%%"
app,
"aws-mlops-single-account-framework",
description=f"({solution_id}-sa) - AWS MLOps Framework (Single Account Option). Version {version}",
)

# add AWS_SDK_USER_AGENT env variable to Lambda functions
core.Aspects.of(mlops_stack_single).add(AwsSDKConfigAspect(app, "SDKUserAgentSingle", solution_id))
core.Aspects.of(mlops_stack_single).add(AwsSDKConfigAspect(app, "SDKUserAgentSingle", solution_id, version))

mlops_stack_multi = MLOpsStack(
app,
"aws-mlops-multi-account-framework",
multi_account=True,
description=f"({solution_id}) - AWS MLOps Framework. Version %%VERSION%%",
description=f"({solution_id}-ma) - AWS MLOps Framework (Multi Account Option). Version {version}",
)

core.Aspects.of(mlops_stack_multi).add(AwsSDKConfigAspect(app, "SDKUserAgentMulti", solution_id))
core.Aspects.of(mlops_stack_multi).add(AwsSDKConfigAspect(app, "SDKUserAgentMulti", solution_id, version))

BYOMCustomAlgorithmImageBuilderStack(
app,
"BYOMCustomAlgorithmImageBuilderStack",
description=(
f"({solution_id}byom-caib) - Bring Your Own Model pipeline to build custom algorithm docker images"
f"in AWS MLOps Framework. Version %%VERSION%%"
f"in AWS MLOps Framework. Version {version}"
),
)

batch_stack = BYOMBatchStack(
app,
"BYOMBatchStack",
description=(
f"({solution_id}byom-bt) - BYOM Batch Transform pipeline" f"in AWS MLOps Framework. Version %%VERSION%%"
f"({solution_id}byom-bt) - BYOM Batch Transform pipeline" f"in AWS MLOps Framework. Version {version}"
),
)

core.Aspects.of(batch_stack).add(AwsSDKConfigAspect(app, "SDKUserAgentBatch", solution_id))
core.Aspects.of(batch_stack).add(AwsSDKConfigAspect(app, "SDKUserAgentBatch", solution_id, version))

model_monitor_stack = ModelMonitorStack(
app,
"ModelMonitorStack",
description=(f"({solution_id}byom-mm) - Model Monitor pipeline. Version %%VERSION%%"),
description=(f"({solution_id}byom-mm) - Model Monitor pipeline. Version {version}"),
)

core.Aspects.of(model_monitor_stack).add(AwsSDKConfigAspect(app, "SDKUserAgentMonitor", solution_id))
core.Aspects.of(model_monitor_stack).add(AwsSDKConfigAspect(app, "SDKUserAgentMonitor", solution_id, version))


realtime_stack = BYOMRealtimePipelineStack(
app,
"BYOMRealtimePipelineStack",
description=(f"({solution_id}byom-rip) - BYOM Realtime Inference Pipleline. Version %%VERSION%%"),
description=(f"({solution_id}byom-rip) - BYOM Realtime Inference Pipleline. Version {version}"),
)

core.Aspects.of(realtime_stack).add(AwsSDKConfigAspect(app, "SDKUserAgentRealtime", solution_id))
core.Aspects.of(realtime_stack).add(AwsSDKConfigAspect(app, "SDKUserAgentRealtime", solution_id, version))

SingleAccountCodePipelineStack(
app,
"SingleAccountCodePipelineStack",
description=(f"({solution_id}byom-sac) - Single-account codepipeline. Version %%VERSION%%"),
description=(f"({solution_id}byom-sac) - Single-account codepipeline. Version {version}"),
)

MultiAccountCodePipelineStack(
app,
"MultiAccountCodePipelineStack",
description=(f"({solution_id}byom-mac) - Multi-account codepipeline. Version %%VERSION%%"),
description=(f"({solution_id}byom-mac) - Multi-account codepipeline. Version {version}"),
)


Binary file modified source/architecture-option-1.png
Binary file modified source/architecture-option-2.png
7 changes: 6 additions & 1 deletion source/cdk.json
@@ -2,6 +2,11 @@
"app": "python3 app.py",
"context": {
"@aws-cdk/core:enableStackNameDuplicates": "true",
"aws-cdk:enableDiffNoFail": "true"
"aws-cdk:enableDiffNoFail": "true",
"SolutionId": "SO0136",
"SolutionName":"%%SOLUTION_NAME%%",
"Version": "%%VERSION%%",
"SourceBucket": "%%BUCKET_NAME%%",
"BlueprintsFile": "blueprints.zip"
}
}
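
app.py above imports get_cdk_context_value to read these context entries. Its implementation is not shown in this diff; a minimal sketch of such a helper, assuming it simply wraps CDK's context lookup, might be:

```
from aws_cdk import core


def get_cdk_context_value(scope: core.Construct, key: str) -> str:
    # assumed behavior: read the value from the "context" block of cdk.json
    value = scope.node.try_get_context(key)
    if value is None:
        raise ValueError(f"Context value {key} is not set in cdk.json")
    return value
```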
10 changes: 5 additions & 5 deletions source/lambdas/custom_resource/index.py
@@ -16,7 +16,6 @@
import tempfile
import logging
import traceback
import urllib.request
import boto3
from crhelper import CfnResource

@@ -28,17 +27,18 @@


def copy_assets_to_s3(s3_client):
# get the source and destination locations
source_url = os.environ.get("source_bucket") + "/blueprints.zip"
bucket = os.environ.get("destination_bucket")
# get the source/destination buckets and the file key
s3_bucket_name = os.environ.get("SOURCE_BUCKET")
bucket = os.environ.get("DESTINATION_BUCKET")
file_key = os.environ.get("FILE_KEY")
base_dir = "blueprints"

# create a tmp directory for the zip file to download
zip_tmpdir = tempfile.mkdtemp()
zip_file_path = os.path.join(zip_tmpdir, f"{base_dir}.zip")

# download blueprints.zip
urllib.request.urlretrieve(source_url, zip_file_path)
s3_client.download_file(s3_bucket_name, file_key, zip_file_path)

# unpack the zip file in another tmp directory
unpack_tmpdir = tempfile.mkdtemp()
19 changes: 8 additions & 11 deletions source/lambdas/custom_resource/tests/test_custom_resource.py
@@ -22,35 +22,32 @@

@pytest.fixture(autouse=True)
def mock_env_variables():
os.environ["source_bucket"] = "solutions-bucket"
os.environ["destination_bucket"] = "blueprints-bucket"
os.environ["TESTFILE"] = "blueprints.zip"
os.environ["SOURCE_BUCKET"] = "solutions-bucket"
os.environ["DESTINATION_BUCKET"] = "blueprints-bucket"
os.environ["FILE_KEY"] = "blueprints.zip"


@pytest.fixture
def event():
return {"bucket": os.environ["source_bucket"]}
return {"bucket": os.environ["SOURCE_BUCKET"]}


@pytest.fixture
def mocked_response():
return f"CopyAssets-{os.environ['destination_bucket']}"
return f"CopyAssets-{os.environ['DESTINATION_BUCKET']}"


@mock_s3
@patch("index.os.walk")
@patch("index.shutil.unpack_archive")
@patch("index.urllib.request.urlretrieve")
def test_copy_assets_to_s3(mocked_urllib, mocked_shutil, mocked_walk, mocked_response):
def test_copy_assets_to_s3(mocked_shutil, mocked_walk, mocked_response):
s3_client = boto3.client("s3", region_name="us-east-1")
testfile = tempfile.NamedTemporaryFile()
s3_client.create_bucket(Bucket="solutions-bucket")
s3_client.create_bucket(Bucket="blueprints-bucket")
s3_client.upload_file(testfile.name, os.environ["source_bucket"], os.environ["TESTFILE"])
s3_client.upload_file(testfile.name, os.environ["SOURCE_BUCKET"], os.environ["FILE_KEY"])
local_file = tempfile.NamedTemporaryFile()
mocked_urllib.side_effect = s3_client.download_file(
os.environ["source_bucket"], os.environ["TESTFILE"], local_file.name
)
s3_client.download_file(os.environ["SOURCE_BUCKET"], os.environ["FILE_KEY"], local_file.name)
tmp = tempfile.mkdtemp()
mocked_walk.return_value = [
(tmp, (local_file.name,), (local_file.name,)),
40 changes: 25 additions & 15 deletions source/lambdas/pipeline_orchestration/lambda_helpers.py
@@ -167,6 +167,7 @@ def get_codepipeline_params(is_multi_account, stack_name, template_zip_name, tem
("PRODACCOUNTID", os.environ["PROD_ACCOUNT_ID"]),
("PRODORGID", os.environ["PROD_ORG_ID"]),
("BLUEPRINTBUCKET", os.environ["BLUEPRINT_BUCKET"]),
("DELEGATEDADMINACCOUNT", os.environ["IS_DELEGATED_ADMIN"]),
]
)

@@ -175,12 +176,24 @@ def get_codepipeline_params(is_multi_account, stack_name, template_zip_name, tem

def get_common_realtime_batch_params(event, region, stage):
inference_instance = get_stage_param(event, "inference_instance", stage)
image_uri = (
get_image_uri(event.get("pipeline_type"), event, region) if os.environ["USE_MODEL_REGISTRY"] == "No" else ""
)
model_package_group_name = (
# model_package_name example: arn:aws:sagemaker:us-east-1:<ACCOUNT_ID>:model-package/xgboost/1
# the model_package_group_name in this case is "xgboost"
event.get("model_package_name").split("/")[1]
if os.environ["USE_MODEL_REGISTRY"] == "Yes"
else ""
)
return [
("MODELNAME", event.get("model_name")),
("MODELARTIFACTLOCATION", event.get("model_artifact_location")),
("MODELARTIFACTLOCATION", event.get("model_artifact_location", "")),
("INFERENCEINSTANCE", inference_instance),
("CUSTOMALGORITHMSECRREPOARN", os.environ["ECR_REPO_ARN"]),
("IMAGEURI", get_image_uri(event.get("pipeline_type"), event, region)),
("IMAGEURI", image_uri),
("MODELPACKAGEGROUPNAME", model_package_group_name),
("MODELPACKAGENAME", event.get("model_package_name", "")),
]


@@ -333,25 +346,22 @@ def get_image_uri(pipeline_type, event, region):
raise Exception("Unsupported pipeline by get_image_uri function")


def get_required_keys(pipeline_type):
def get_required_keys(pipeline_type, use_model_registry):
# Realtime/batch pipelines
if pipeline_type in [
"byom_realtime_builtin",
"byom_realtime_custom",
"byom_batch_builtin",
"byom_batch_custom",
]:
common_keys = [
"pipeline_type",
"model_name",
"model_artifact_location",
"inference_instance",
]
builtin_model_keys = [
"model_framework",
"model_framework_version",
]
custom_model_keys = ["custom_image_uri"]
common_keys = ["pipeline_type", "model_name", "inference_instance"]
model_location = ["model_artifact_location"]
builtin_model_keys = ["model_framework", "model_framework_version"] + model_location
custom_model_keys = ["custom_image_uri"] + model_location
# if model registry is used
if use_model_registry == "Yes":
builtin_model_keys = custom_model_keys = ["model_package_name"]

realtime_specific_keys = ["data_capture_location"]
batch_specific_keys = ["batch_inference_data", "batch_job_output_location"]

@@ -403,7 +413,7 @@ def validate(event):
:raises: BadRequest Exception
"""
# get the required keys to validate the event
required_keys = get_required_keys(event.get("pipeline_type", ""))
required_keys = get_required_keys(event.get("pipeline_type", ""), os.environ["USE_MODEL_REGISTRY"])
for key in required_keys:
if key not in event:
logger.error(f"Request event did not have parameter: {key}")
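To illustrate the new model-registry branching, the expected required keys for a real-time custom pipeline under both modes are sketched below, assuming the unshown remainder of get_required_keys simply concatenates the common, model, and pipeline-specific key lists:

```
# USE_MODEL_REGISTRY == "No": a custom image URI and model artifact are required
get_required_keys("byom_realtime_custom", "No")
# expected: ["pipeline_type", "model_name", "inference_instance",
#            "custom_image_uri", "model_artifact_location", "data_capture_location"]

# USE_MODEL_REGISTRY == "Yes": a model package name replaces those keys
get_required_keys("byom_realtime_custom", "Yes")
# expected: ["pipeline_type", "model_name", "inference_instance",
#            "model_package_name", "data_capture_location"]
```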