Adding support for p5.48xlarge and p4de.24xlarge instance types and Fix for FsX monitoring #31

Open
wants to merge 5 commits into
base: main

Conversation

gallanik

New GPU instance types added to the AWS instance family

Issue #, if available:

Description of changes:
Modified prometheus/prometheus.yml to add p4de.24xlarge and p5.48xlarge. These are new GPU instance types released by AWS.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

New GPU instance types added to the AWS instance family
In newer releases, the chef-dna file uses a different key for the FSx ID: it is now fsx_fs_ids instead of cfn_fsx_fs_id. Because of this, FSx metrics were not captured on the dashboard. This update should fix that.
@gallanik changed the title from "Adding support for p5.48xlarge and p4de.24xlarge instance types" to "Adding support for p5.48xlarge and p4de.24xlarge instance types and Fix for FsX monitoring" on Dec 14, 2023
Contributor

@nicolaven left a comment

Why do you need this?

Fixing repo name in the monitoring_url variable
@gallanik
Author

Why do you need this?

@nicolaven - AWS has released new instance types and when I installed this tool, it did not pick up the p5 instances. Hence the need to include them.

@@ -14,7 +14,7 @@ monitoring_dir_name=aws-parallelcluster-monitoring
 monitoring_tarball="${monitoring_dir_name}.tar.gz"

 #get GitHub repo to clone and the installation script
-monitoring_url=https://github.com/aws-samples/aws-parallelcluster-monitoring/archive/refs/tags/${version}.tar.gz
+monitoring_url=https://github.com/gallanik/aws-parallelcluster-monitoring/archive/refs/tags/${version}.tar.gz
Contributor

Change this.

@sean-smith
Contributor

We can't merge this with your branch. Please change.
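
One possible way to test against a fork without committing the fork's URL, sketched here under assumptions: the repo owner is read from an environment variable and falls back to aws-samples. The variable name MONITORING_REPO_OWNER and the helper build_monitoring_url are hypothetical and do not exist in the script under review; only the URL pattern is taken from the diff above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original script): build the release
# tarball URL for a given tag. MONITORING_REPO_OWNER is an assumed override;
# when it is unset or empty, the upstream owner aws-samples is used.
build_monitoring_url() {
    printf 'https://github.com/%s/aws-parallelcluster-monitoring/archive/refs/tags/%s.tar.gz\n' \
        "${MONITORING_REPO_OWNER:-aws-samples}" "$1"
}

# Placeholder tag for illustration only; the real script sets version itself.
version="${version:-v0.0.0}"
monitoring_url="$(build_monitoring_url "${version}")"
echo "${monitoring_url}"
```

With this shape, the committed default keeps pointing at the upstream repository, and a contributor can export MONITORING_REPO_OWNER locally while testing a fork.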

3 participants