first try deploy aws
nato-re committed Feb 1, 2024
1 parent 4455234 commit 6eb6587
Showing 19 changed files with 1,428 additions and 174 deletions.
7 changes: 7 additions & 0 deletions .github/scripts/build.sh
@@ -0,0 +1,7 @@
#!/bin/bash
set -xe

# Maven is used to build and create a war file.
mvn -Dmaven.test.skip=true clean install


69 changes: 69 additions & 0 deletions .github/workflows/deploy.yml
@@ -0,0 +1,69 @@
name: Deploy to Amazon ECS

on:
  push:
    branches:
      - main


env:
  applicationfolder: ./
  AWS_REGION: sa-east-1
  #S3BUCKET: spotify-liekd-songs-cluster-webappdeploymentbucket-uigxxoklltki
  ECR_REPOSITORY: 186912366203.dkr.ecr.sa-east-1.amazonaws.com/spotify-liked-songs-clustering-cs50 # set this to your Amazon ECR repository name
  ECS_SERVICE: spotify-liked-songs-clustering-cs50-ecs-service
  # set this to your Amazon ECS service name
  ECS_CLUSTER: spotify-liked-songs-clustering-cs50-ecs-service # set this to your Amazon ECS cluster name
  ECS_TASK_DEFINITION: ./aws/task-definition.json # set this to the path to your Amazon ECS task definition e.g. .aws/task-definition.json
  CONTAINER_NAME: spotify-liked-songs-clustering-docker-image # set this

jobs:
  build:
    name: Build and Package
    runs-on: ubuntu-latest
    permissions:
      id-token: write
      contents: read
    steps:
      - uses: actions/checkout@v4
        name: Checkout Repository

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v1
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: ${{ env.AWS_REGION }}

      - name: Login to Amazon ECR
        id: login-ecr
        uses: aws-actions/amazon-ecr-login

      - name: Build, tag, and push image to Amazon ECR
        id: build-image
        env:
          ECR_REGISTRY: ${{ steps.login-ecr.outputs.registry }}
          IMAGE_TAG: ${{ github.sha }}
        run: |
          # Build a docker container and
          # push it to ECR so that it can
          # be deployed to ECS.
          docker build -t $ECR_REGISTRY/$ECR_REPOSITORY:$IMAGE_TAG .
          docker push $ECR_REGISTRY/$ECR_REPOSITORY:$IMAGE_TAG
          echo "image=$ECR_REGISTRY/$ECR_REPOSITORY:$IMAGE_TAG" >> $GITHUB_OUTPUT
      - name: Fill in the new image ID in the Amazon ECS task definition
        id: task-def
        uses: aws-actions/amazon-ecs-render-task-definition
        with:
          task-definition: ${{ env.ECS_TASK_DEFINITION }}
          container-name: ${{ env.CONTAINER_NAME }}
          image: ${{ steps.build-image.outputs.image }}

      - name: Deploy Amazon ECS task definition
        uses: aws-actions/amazon-ecs-deploy-task-definition@df9643053eda01f169e64a0e60233aacca83799a
        with:
          task-definition: ${{ steps.task-def.outputs.task-definition }}
          service: ${{ env.ECS_SERVICE }}
          cluster: ${{ env.ECS_CLUSTER }}
          wait-for-service-stability: true
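The build step in this workflow tags the image as `$ECR_REGISTRY/$ECR_REPOSITORY:$IMAGE_TAG`, with the registry taken from the ECR login step. The following is a minimal Python sketch of that string composition, assuming the login step returns only the registry host. The helpers `image_reference` and `repository_name` are hypothetical and not part of the commit. It illustrates why `ECR_REPOSITORY` is usually set to the repository name alone: a value that already contains the registry host ends up with the host twice in the final reference.

```python
def image_reference(registry, repository, tag):
    """Compose an image reference the same way the workflow's shell step does."""
    return f"{registry}/{repository}:{tag}"


def repository_name(value, registry):
    """Hypothetical helper: drop a leading '<registry>/' from a repository value."""
    prefix = registry + "/"
    return value[len(prefix):] if value.startswith(prefix) else value


registry = "186912366203.dkr.ecr.sa-east-1.amazonaws.com"
configured = registry + "/spotify-liked-songs-clustering-cs50"  # value of ECR_REPOSITORY
tag = "6eb6587"  # illustrative; the workflow uses the full github.sha

doubled = image_reference(registry, configured, tag)
cleaned = image_reference(registry, repository_name(configured, registry), tag)

print(doubled)  # registry host appears twice
print(cleaned)  # registry host appears once
```

If the assumption about the login output holds, setting `ECR_REPOSITORY` to `spotify-liked-songs-clustering-cs50` would give the second form without any extra helper.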
2 changes: 2 additions & 0 deletions .gitignore
@@ -1,4 +1,6 @@
.env
appspec.yml
cloudformation/
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
2 changes: 1 addition & 1 deletion Dockerfile
@@ -5,6 +5,6 @@ RUN python -m pip install -r requirements.txt
WORKDIR /app
COPY . /app

-EXPOSE 8080
+EXPOSE 80

ENTRYPOINT [ "python app.py" ]
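The `ENTRYPOINT` line uses Docker's exec form, which is a JSON array whose first element is the executable and whose remaining elements are its arguments. The sketch below only parses the two candidate arrays with Python's `json` module to show how many argv entries each one yields; `parse_exec_form` is a hypothetical helper, not a Docker API. As written in the Dockerfile, the array has a single element, so the whole string `python app.py` would be treated as the executable name.

```python
import json
import shlex


def parse_exec_form(value):
    """Parse a Dockerfile exec-form value (a JSON array of strings) into argv."""
    argv = json.loads(value)
    if not isinstance(argv, list) or not all(isinstance(a, str) for a in argv):
        raise ValueError("exec form must be a JSON array of strings")
    return argv


as_written = parse_exec_form('[ "python app.py" ]')
split_form = parse_exec_form('[ "python", "app.py" ]')

print(as_written)  # ['python app.py']
print(split_form)  # ['python', 'app.py']

# Splitting the single string the way a shell would gives the two-element form.
print(shlex.split(as_written[0]) == split_form)  # True
```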
120 changes: 94 additions & 26 deletions README.md
@@ -1,43 +1,111 @@
# Project Title: K-means Cluster Playlist Maker
## Integrating with GitHub Actions – CICD pipeline to Deploy a Web App to Amazon EC2

## Video URL
[Link to Video](https://youtu.be/jSWJxH-pe7s)
Many organizations adopt [DevOps practices](https://aws.amazon.com/devops/what-is-devops/) to innovate faster by automating and streamlining software development and infrastructure management. Beyond cultural adoption, DevOps recommends certain best practices, and Continuous Integration and Continuous Delivery (CI/CD) is among the most important ones to start with. CI/CD reduces the time it takes to release software updates by automating deployment activities. Many tools are available to implement this practice. Although AWS has a set of native tools to help achieve your CI/CD goals, it also offers flexibility and extensibility for integrating with numerous third-party tools.

## Project Description
In this post, you will use [GitHub Actions](https://help.github.com/en/actions) to create a CI/CD workflow and [AWS CodeDeploy](https://aws.amazon.com/codedeploy/) to deploy a sample Java Spring Boot application to Amazon Elastic Compute Cloud ([Amazon EC2](https://docs.aws.amazon.com/ec2/index.html?nc2=h_ql_doc_ec2#amazon-ec2)) instances in an Auto Scaling group.

The K-means Cluster Playlist Maker is a web application that utilizes K-means clustering to group your liked songs on Spotify into distinct clusters. Each cluster represents a playlist, and the songs within a cluster share similar features, allowing you to discover and organize your music in a more meaningful way.

### Features
GitHub Actions is a feature on GitHub’s popular development platform that helps you automate your software development workflows in the same place that you store code and collaborate on pull requests and issues. You can write individual tasks called actions, and then combine them to create a custom workflow. Workflows are custom automated processes that you can set up in your repository to build, test, package, release, or deploy any code project on GitHub.

1. **Authentication**: Users can log in with their Spotify account, granting the application access to their liked songs.
AWS CodeDeploy is a deployment service that automates application deployments to Amazon EC2 instances, on-premises instances, serverless AWS Lambda functions, or Amazon Elastic Container Service (Amazon ECS) services.

2. **Cluster Generation**: Users can specify the number of clusters they want, and the application uses K-means clustering to group their liked songs accordingly.

3. **Visualization**: The application provides a visual representation of the clusters using K-means graphs.
## Solution Overview

4. **Playlist Naming**: The application uses OpenAI's GPT-3.5 language model to generate unique playlist names based on the characteristics of each cluster.
The solution uses the following services:

5. **Playlist Creation**: Users can choose to create unique playlists based on the generated clusters, helping them organize their music on Spotify.
1. [GitHub Actions](https://docs.github.com/en/actions): Workflow orchestration tool that hosts the pipeline.
2. [AWS CodeDeploy](https://aws.amazon.com/codedeploy/): AWS service that manages deployment to the Amazon EC2 Auto Scaling group.
3. [AWS Auto Scaling](https://aws.amazon.com/ec2/autoscaling/): AWS service that helps maintain application availability and elasticity by automatically adding or removing EC2 instances.
4. [Amazon EC2](https://docs.aws.amazon.com/ec2/index.html?nc2=h_ql_doc_ec2#amazon-ec2): Destination compute server for the application deployment.
5. [AWS CloudFormation](https://aws.amazon.com/cloudformation/): AWS infrastructure as code (IaC) service used to create the initial infrastructure on the AWS side.
6. [IAM OIDC identity provider](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_providers_create_oidc.html): Federated authentication service that establishes trust between GitHub and AWS, so that GitHub Actions can deploy to AWS without maintaining AWS secrets and credentials.
7. [Amazon S3](https://docs.aws.amazon.com/AmazonS3/latest/userguide/Welcome.html): Storage for the deployment artifacts.

### Technologies Used
The following diagram illustrates the architecture for the solution:
![Alt Text](aws-coodedeplooy-github-action-deploymentV3.png?raw=true "Title")

- **Backend**: Python with Flask framework
- **Frontend**: HTML, CSS, JavaScript
- **Data Science**: K-means clustering algorithm, pandas, scikit-learn
- **Spotify API**: Integration for user authentication and access to liked songs
- **OpenAI GPT-3.5 Turbo**: Language model for playlist naming suggestions
## Prerequisites
Before you begin, you need to complete the following prerequisites:

* An AWS account with permissions to create the necessary resources.
* A [Git Client](https://git-scm.com/downloads) to clone the provided source code.
* A [GitHub account](https://github.com/) with permissions to configure GitHub repositories, create workflows, and configure GitHub secrets.

### How to Run the Application
## Walkthrough
The following steps provide a high-level overview of the walkthrough:

1. Clone the repository to your local machine.
2. Install the required Python packages using `pip install -r requirements.txt`.
3. Run the Flask application using `python app.py`.
4. Access the application through a web browser at `http://localhost:8080`.
1. Clone the project from the AWS code samples repository.
2. Deploy the AWS CloudFormation template to create the required services.
3. Update the source code.
4. Set up GitHub secrets.
5. Integrate CodeDeploy with GitHub.
6. Trigger the GitHub Action to build and deploy the code.
7. Verify the deployment.

### Note
## Download the source code

Make sure to set up your Spotify API credentials and OpenAI GPT-3.5 Turbo API key before running the application.
Clone the aws-codedeploy-github-actions-deployment repository:

Feel free to explore, discover, and organize your Spotify liked songs with the K-means Cluster Playlist Maker!
git clone https://github.com/aws-samples/aws-codedeploy-github-actions-deployment.git

Be free
Create an empty repository in your personal GitHub account.

git clone https://github.com/<username>/<repoName>.git

Copy the code. We need contents from the hidden .github folder for the GitHub actions to work.

cp -r aws-codedeploy-github-actions-deployment/. <new repository>

e.g. GitActionsDeploytoAWS

## Deploying the CloudFormation template
To deploy the CloudFormation template, complete the following steps:

1. Open the AWS CloudFormation console. Enter your account ID, user name, and password.
2. Check your Region; this solution uses us-east-1.
3. If this is a new AWS CloudFormation account, click Create New Stack. Otherwise, select Create Stack.
4. Select Template is Ready.
5. Click Upload a template file.
6. Click Choose File. Navigate to the template.yml file in your cloned repository at “aws-codedeploy-github-actions-deployment/cloudformation/template.yaml”.
7. Select the template.yml file and select Next.
8. In Specify Stack Details, add or modify values as needed.
   - Stack name = CodeDeployStack.
   - VPC and Subnets = (these are pre-populated for you; you can change these values if you prefer to use your own subnets)
   - GitHubThumbprintList = 6938fd4d98bab03faadb97b34396831e3780aea1
   - GitHubRepoName – Name of the personal GitHub repository that you created.
9. On the Options page, click Next.
10. Select the acknowledgement box to allow the creation of IAM resources, and then select Create.
CloudFormation takes about 5 minutes to create all the resources. The stack creates the following resources:
- Two EC2 Linux instances with Tomcat server and CodeDeploy agent installed
- Auto Scaling group with an internet-facing Application Load Balancer
- CodeDeploy application name and deployment group
- S3 bucket to store build artifacts
- Identity and Access Management (IAM) OIDC identity provider
- Instance profile for Amazon EC2
- Service role for CodeDeploy
- Security groups for ALB and Amazon EC2

## GitHub configuration and Testing

Please follow the [blog post](https://aws.amazon.com/blogs/devops/integrating-with-github-actions-ci-cd-pipeline-to-deploy-a-web-app-to-amazon-ec2/) to set up GitHub Actions and test the CI/CD flow.

## Clean up

To avoid incurring future charges, clean up the resources that you created.

1. Empty the Amazon S3 bucket.
2. Delete the CloudFormation stack (CodeDeployStack) from the AWS console.
3. Delete the GitHub secret (‘IAMROLE_GITHUB’):
    1. Go to the repository settings on the GitHub page.
    2. Select Secrets under Actions.
    3. Select IAMROLE_GITHUB, and delete it.


## Security

See [CONTRIBUTING](CONTRIBUTING.md#security-issue-notifications) for more information.

## License

This library is licensed under the MIT-0 License. See the LICENSE file.
31 changes: 19 additions & 12 deletions api_requests.py
@@ -97,14 +97,17 @@ def get_artists(ids: set):
         artists[artist['id']] = artist
     return artists

+
 def add_playlists_names(data):
     playlist_quantity = len(data)
-    prompt = f"Create {playlist_quantity} unique playlist names based in the following spotify metadata, in JSON format, on the next lines, give only the playlists names separated by line breaks in your response\n"
+    prompt = f"Create {playlist_quantity} unique playlist names based in the following spotify metadata, in JSON format, on the next lines, give only the playlists names separated by line breaks in your response, do not number the items\n"
     for cluster in data:
         prompt += json.dumps(cluster["center"])
-    messages = [{ "content": prompt , "role": "user"}]
-
-    response = client.chat.completions.create(messages=messages, model="gpt-3.5-turbo-1106") # get_data.get_fake_chat_completion(messages=messages, model="gpt-3.5-turbo-1106") #
+    messages = [{"content": prompt, "role": "user"}]
+
+    # get_data.get_fake_chat_completion(messages=messages, model="gpt-3.5-turbo-1106") #
+    response = client.chat.completions.create(
+        messages=messages, model="gpt-3.5-turbo-1106")
     print(response.choices)
     print(response.choices[0])
     choices = response.choices[0].message.content.split("\n")
@@ -119,25 +122,29 @@ def add_playlists_names(data):

     return playlists

+
 def create_playlist(body):
     song_ids = body.get('song_ids')
     playlist_name = body.get('playlist_name')
     user = session.get("https://api.spotify.com/v1/me", headers=req_headers)
     user_id = user.json()['id']
     response = session.post(
-            f'https://api.spotify.com/v1/users/{user_id}/playlists', headers=req_headers,
-            json={
-                "name": playlist_name,
-                "description": "My description",
-                "public": False
-            })
+        f'https://api.spotify.com/v1/users/{user_id}/playlists', headers=req_headers,
+        json={
+            "name": playlist_name,
+            "description": "My description",
+            "public": False
+        })
     body = response.json()
     print(body)
     playlist_id = body['id']
-    #max 100
+    # max 100
     song_ids_to_uri = [f"spotify:track:{id}" for id in song_ids]
     while len(song_ids_to_uri) > 0:
         uris = song_ids_to_uri[:100]
-        session.post(f'https://api.spotify.com/v1/playlists/{playlist_id}/tracks', data={"uris": uris})
+        test = session.post(
+            f'https://api.spotify.com/v1/playlists/{playlist_id}/tracks',
+            json={"uris": uris},
+            headers=req_headers)
         song_ids_to_uri = song_ids_to_uri[100:]
     return playlist_id
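The `# max 100` comment and the slicing loop in `create_playlist` send track URIs in groups of at most 100 per request. The batching itself can be sketched on its own, without any network calls; `batched` is a hypothetical helper name and the song IDs are made up for illustration.

```python
def batched(items, size=100):
    """Yield consecutive slices of `items`, each holding at most `size` entries."""
    if size < 1:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]


song_ids = [f"id{n}" for n in range(250)]  # made-up IDs, not real Spotify tracks
uris = [f"spotify:track:{song_id}" for song_id in song_ids]

batch_sizes = [len(batch) for batch in batched(uris)]
print(batch_sizes)  # [100, 100, 50]
```

Each yielded batch would correspond to one `{"uris": batch}` JSON body in the loop above; unlike the `while` loop, the generator does not rebuild the remaining list on every iteration.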
30 changes: 17 additions & 13 deletions app.py
@@ -1,6 +1,6 @@
 import logging as log
 from logging import debug
-from flask import Flask, render_template, request, jsonify
+from flask import Flask, render_template, request, jsonify, send_from_directory
 import data.get_data as get_data
 import api_requests as api
 import data.treat_data as treat_data
@@ -10,12 +10,12 @@
 import traceback
 import os

-app = Flask(__name__) # Creates the application instance
-
-app.static_folder = "views/static"
-app.template_folder = "views/templates"
-global figure
-global record_data
+app = Flask(
+    __name__,
+    static_folder="views/static",
+    static_url_path='',
+    template_folder="views/templates",
+)


 @app.route("/")
@@ -27,13 +27,15 @@ def main():
 def get_token_from_client():
     return render_template('main/index.html.j2')

+
 @app.route("/playlist", methods=['POST'])
 def generate_fake():
     auth_token = request.headers.get('Authorization')
     api.set_token(auth_token)
     api.create_playlist(request.json)
     return "", 201

+
 @app.route("/generate")
 def generate():
     try:
@@ -45,18 +47,20 @@ def generate():
             return render_template('main', error='DEU RUIM')
         df = get_data.treated_data(
             auth_token
-            )
+        )
         kmeans, centers = model.k_means_clustering(df, int(number_of_clusters))
-        fig = graphs.create_figure(df, kmeans, graphs.draw_k_means, "K-Means Graph")
-        clusters_record = treat_data.group_songs_by_cluster(df, kmeans, centers)
+        fig = graphs.create_figure(
+            df, kmeans, graphs.draw_k_means, "K-Means Graph")
+        clusters_record = treat_data.group_songs_by_cluster(
+            df, kmeans, centers)
         records_with_playlists_names = api.add_playlists_names(clusters_record)
         return jsonify(data=records_with_playlists_names, figure=fig)
     except exceptions.HTTPError as err:
         print(err.response.json())
         return jsonify(err.response.json()), err.response.status_code
     except Exception as e:
         print(traceback.format_exc())
-        #log_error(str(e))
+        # log_error(str(e))
         return jsonify(error=str(e)), 500


@@ -68,5 +72,5 @@ def generate():
     # debug = True, restarts automatically on every file change
     # change the port if it is in use
     app.run(
-            debug=True,
-            host='0.0.0.0', port=8080)
+        debug=True,
+        host='0.0.0.0', port=80)
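The last hunk hard-codes the port change from 8080 to 80, and the comment above `app.run` advises changing the port when it is in use. One way to avoid editing the source for that is to read the port from the environment. This is only a sketch under assumptions: the `PORT` variable and the `resolve_port` helper are hypothetical and do not appear in the commit.

```python
import os


def resolve_port(environ=os.environ, default=80):
    """Return the port from the hypothetical PORT variable, or `default` if unset."""
    raw = environ.get("PORT")
    if raw is None or raw == "":
        return default
    port = int(raw)  # raises ValueError for non-numeric input
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return port


print(resolve_port({}))                # 80
print(resolve_port({"PORT": "8080"}))  # 8080
```

With such a helper, the call could read `app.run(debug=True, host='0.0.0.0', port=resolve_port())`, which keeps port 80 as the container default while allowing 8080 for local runs.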
26 changes: 26 additions & 0 deletions appspec.yml
@@ -0,0 +1,26 @@
version: 0.0
os: linux
files:
  - source: /aws
    destination: /usr/local/codedeployresources
hooks:
  ApplicationStop:
    - location: aws/scripts/application-stop.sh
      timeout: 300
      runas: root
  BeforeInstall:
    - location: aws/scripts/before-install.sh
      timeout: 300
      runas: root
  AfterInstall:
    - location: aws/scripts/after-install.sh
      timeout: 300
      runas: root
  ApplicationStart:
    - location: aws/scripts/application-start.sh
      timeout: 300
      runas: root
  ValidateService:
    - location: aws/scripts/validate-service.sh
      timeout: 300
      runas: root
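The appspec.yml declares five lifecycle hooks, each pointing at a script under `aws/scripts/` with a 300-second timeout. A small sketch of two sanity checks over that declaration follows: the worst-case time spent in hooks, and whether every referenced script is present in the deployment bundle. The `bundle_files` listing is an assumption made up for illustration (the commit page does not show the `aws/scripts/` contents), and both helper functions are hypothetical.

```python
# Hooks as declared in appspec.yml: (lifecycle event, script path, timeout in seconds).
HOOKS = [
    ("ApplicationStop", "aws/scripts/application-stop.sh", 300),
    ("BeforeInstall", "aws/scripts/before-install.sh", 300),
    ("AfterInstall", "aws/scripts/after-install.sh", 300),
    ("ApplicationStart", "aws/scripts/application-start.sh", 300),
    ("ValidateService", "aws/scripts/validate-service.sh", 300),
]


def worst_case_seconds(hooks):
    """Upper bound on hook run time if every script uses its full timeout."""
    return sum(timeout for _, _, timeout in hooks)


def missing_scripts(hooks, bundle_files):
    """Return the hook scripts that are not present in the bundle listing."""
    return [script for _, script, _ in hooks if script not in bundle_files]


# Assumed bundle listing for illustration; one script is deliberately absent.
bundle_files = {
    "appspec.yml",
    "aws/scripts/application-stop.sh",
    "aws/scripts/before-install.sh",
    "aws/scripts/after-install.sh",
    "aws/scripts/application-start.sh",
}

print(worst_case_seconds(HOOKS))             # 1500
print(missing_scripts(HOOKS, bundle_files))  # ['aws/scripts/validate-service.sh']
```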