Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: support AWS SSM tailscaled state #41

Merged
merged 13 commits into from
Nov 21, 2024
Merged

feat: support AWS SSM tailscaled state #41

merged 13 commits into from
Nov 21, 2024

Conversation

gberenice
Copy link
Member

@gberenice gberenice commented Nov 15, 2024

what

  • This adds support to store tailscaled state in AWS SSM. This helps reusing device state names rather than deleting old devices which is not supported by Tailscale via Terraform ATM.
  • This adds recent Masterpoint's GH + CRabbit configs.
  • This sets trunk update to run less often to reduce noise.
  • This adds ability to confugure ASG min/max size and desired capacity. This is important in case of using an external state to avoid the Duplicate node key issue:

This can occur if you use a backup of one device to create another, or clone a file system from one device to another. The Tailscale configuration files are duplicated. The Tailscale files will need to be removed from one of the two.
You can identify duplicated devices in the Machines page of the admin console by looking for a Duplicate node key badge underneath the device name.
On one of the systems, uninstall and completely delete the Tailscale app. It is especially important to remove the files listed for your platform, the goal is to make a new Tailscale IP address when it is installed again. Then, reinstall the app.

why

  • Ephemeral nodes behave glitchy during the rotation - we had to perform a manual instance restart to run Tailscale. It's hard to reproduce, so keeping a state in external storage in one more option to try to keep the device in order.

references

Summary by CodeRabbit

Summary by CodeRabbit

  • New Features

    • Introduced a new configuration file for CodeRabbit integration, enhancing review and feedback processes.
    • Added new modules for managing SSM state parameters and IAM policies in the Terraform setup.
    • Expanded configuration options with new variables for Auto Scaling Group and Tailscale state management.
    • Updated the workflow for trunk upgrades to run monthly, improving efficiency.
  • Documentation

    • Enhanced README.md with new module and variable details for better user guidance.
  • Chores

    • Updated .gitignore to manage ignored files more effectively.

@gberenice gberenice requested a review from a team as a code owner November 15, 2024 16:01
Copy link

coderabbitai bot commented Nov 15, 2024

Walkthrough

The changes introduce a new configuration file .coderabbit.yaml for CodeRabbit integration, specifying settings for language, tone, knowledge base, and review processes. The trunk upgrade workflow has been modified to run monthly instead of weekly, with an updated action version. The .gitignore file has been adjusted to reintroduce exclusions for .idea and .iml files while also adding .DS_Store. Additionally, the README.md and Terraform files have been updated to include new modules and input variables for managing SSM state parameters.

Changes

File Change Summary
.coderabbit.yaml New configuration file created with settings for language, tone, knowledge base, and reviews.
.github/workflows/trunk-upgrade.yaml Workflow name changed to "Monthly Trunk Upgrade"; schedule updated to run monthly; action version updated.
.gitignore Reintroduced exclusions for .idea and .iml, added .DS_Store to ignored files.
README.md Added modules ssm_state and ssm_policy, and input variables for managing SSM state.
examples/complete/fixtures.us-east-2.tfvars New variable ssm_state_enabled set to true.
main.tf Added local variables and modules for SSM state management; updated flags handling.
userdata.sh.tmpl Modified flags export method to write to configuration file instead of environment variable.
variables.tf New variables declared for controlling Auto Scaling Group configurations and SSM state storage.

Possibly related PRs

Suggested reviewers

  • Gowiem
  • oycyc

🌟 In the land of code where changes flow,
A new config file begins to grow.
With modules added and flags set right,
The Tailscale journey is now in sight!
Monthly upgrades, a smoother ride,
In CodeRabbit's world, we take great pride! 🌟


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (9)
.gitignore (1)

18-21: Consider improving section organization and documentation.

The "Random" section name could be more descriptive. Consider organizing these patterns into separate sections with clear comments.

-# Random
+# IDE - JetBrains
 **/.idea
 **/*.iml
+
+# OS-specific files
 .DS_Store
examples/complete/fixtures.us-east-2.tfvars (2)

Line range hint 6-7: Region value doesn't match filename

The region is set to "us-east-1" but the filename contains "us-east-2". This inconsistency could be confusing for users following the example.

Apply this diff to maintain consistency:

-region                  = "us-east-1"
-availability_zones      = ["us-east-1a", "us-east-1b"]
+region                  = "us-east-2"
+availability_zones      = ["us-east-2a", "us-east-2b"]

Line range hint 15-17: Consider adding security guidance for credentials

While using placeholder values for OAuth credentials in examples is good practice, consider adding a comment to guide users on secure credential management, such as using AWS Secrets Manager or SSM Parameter Store for production environments.

Add a comment like this:

# Replace these values with your own
+# For production environments, consider using AWS Secrets Manager or SSM Parameter Store
tailnet             = "orgname.org.github"
oauth_client_id     = "OAUTH_CLIENT_ID"
oauth_client_secret = "OAUTH_CLIENT_SECRET"
userdata.sh.tmpl (2)

17-18: Enhance the robustness of the sed command

While writing flags to /etc/default/tailscaled is a good practice, the current sed implementation could be made more robust:

  1. It assumes the FLAGS line exists
  2. The command might fail if special characters are present in tailscaled_extra_flags

Consider this safer approach:

-sudo sed -i "s|^FLAGS=.*|FLAGS=\"${tailscaled_extra_flags}\"|" /etc/default/tailscaled
+# Ensure the file exists
+sudo touch /etc/default/tailscaled
+# Remove existing FLAGS line if present
+sudo sed -i '/^FLAGS=/d' /etc/default/tailscaled
+# Append the new FLAGS line
+echo "FLAGS=\"${tailscaled_extra_flags}\"" | sudo tee -a /etc/default/tailscaled > /dev/null

This approach:

  • Ensures the config file exists
  • Cleanly removes any existing FLAGS setting
  • Safely adds the new FLAGS line without sed pattern matching

Line range hint 1-1: Consider verifying AWS SSM access

Since this script is part of a change to support storing tailscaled state in AWS SSM, it would be beneficial to verify SSM access early in the script execution.

Consider adding these checks after the initial logging setup:

+# Verify AWS SSM access
+echo "Verifying AWS SSM access..."
+if ! aws ssm get-parameter --name dummy-test 2>/dev/null; then
+    echo "Warning: Unable to access AWS SSM. State persistence may not work correctly."
+    # Continue anyway as this might be intentional
+fi

This will help catch permission issues early and provide better debugging information.

.coderabbit.yaml (1)

44-90: Consider enabling complementary tools

While disabling tools to avoid overlap with Trunk makes sense, consider enabling tools that provide unique value:

  • gitleaks for sensitive data detection
  • checkov for cloud infrastructure security scanning
 gitleaks:
-      enabled: false
+      enabled: true
 checkov:
-      enabled: false
+      enabled: true
main.tf (2)

7-8: Consider using try function for safer module reference

The module reference module.ssm_state[0] could fail if var.ssm_state_enabled is false. Consider using the try function for safer handling:

- ssm_state_flag       = var.ssm_state_enabled ? "--state=${module.ssm_state[0].arn_map[local.ssm_state_param_name]}" : ""
+ ssm_state_flag       = var.ssm_state_enabled ? "--state=${try(module.ssm_state[0].arn_map[local.ssm_state_param_name], "")}" : ""

72-89: Document required IAM permissions for SSM state management

The SSM state implementation looks good, but it would be helpful to document the required IAM permissions for the EC2 instance role to manage these SSM parameters.

Consider adding a note in the README about the following required permissions:

  • ssm:GetParameter
  • ssm:PutParameter
variables.tf (1)

184-191: Consider enhancing the documentation and adding validation

The implementation could benefit from:

  1. More detailed description explaining the benefits of SSM state storage (e.g., device reuse, rotation stability)
  2. Example usage in the description
  3. Validation block to ensure AWS region supports SSM Parameter Store

Here's a suggested enhancement:

 variable "ssm_state_enabled" {
   default     = false
   type        = bool
   description = <<-EOT
-  Control is tailscaled state (including preferences and keys) is stored in AWS SSM.
-  See more in the [docs](https://tailscale.com/kb/1278/tailscaled#flags-to-tailscaled).
+  Controls whether tailscaled state (including preferences and keys) is stored in AWS SSM Parameter Store.
+  When enabled, this allows:
+  - Reuse of device state names (helpful when Terraform can't delete old devices)
+  - Improved stability during instance rotation
+  - Persistent state across instance replacements
+
+  Example:
+  ```hcl
+  module "tailscale" {
+    # ... other configuration ...
+    ssm_state_enabled = true
+  }
+  ```
+
+  For more details, see [Tailscale daemon documentation](https://tailscale.com/kb/1278/tailscaled#flags-to-tailscaled).
   EOT
+  validation {
+    condition     = !var.ssm_state_enabled || can(regex("^(us|eu|ap|sa|ca|me|af)-\\w+-\\d+$", data.aws_region.current.name))
+    error_message = "SSM state storage requires a valid AWS region."
+  }
 }
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 634d31b and d179278.

📒 Files selected for processing (8)
  • .coderabbit.yaml (1 hunks)
  • .github/workflows/trunk-upgrade.yaml (2 hunks)
  • .gitignore (1 hunks)
  • README.md (2 hunks)
  • examples/complete/fixtures.us-east-2.tfvars (1 hunks)
  • main.tf (4 hunks)
  • userdata.sh.tmpl (1 hunks)
  • variables.tf (1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
main.tf (1)

Pattern **/*.tf: You're a Terraform expert who has thoroughly studied all the documentation from Hashicorp https://developer.hashicorp.com/terraform/docs and OpenTofu https://opentofu.org/docs/.
You have a strong grasp of Terraform syntax and prioritize providing accurate and insightful code suggestions.
As a fan of the Cloud Posse / SweetOps ecosystem, you incorporate many of their best practices https://docs.cloudposse.com/best-practices/terraform/ while balancing them with general Terraform guidelines.

variables.tf (1)

Pattern **/*.tf: You're a Terraform expert who has thoroughly studied all the documentation from Hashicorp https://developer.hashicorp.com/terraform/docs and OpenTofu https://opentofu.org/docs/.
You have a strong grasp of Terraform syntax and prioritize providing accurate and insightful code suggestions.
As a fan of the Cloud Posse / SweetOps ecosystem, you incorporate many of their best practices https://docs.cloudposse.com/best-practices/terraform/ while balancing them with general Terraform guidelines.

🔇 Additional comments (13)
.gitignore (1)

18-21: LGTM! Standard ignore patterns for IDE and system files.

The added patterns will properly exclude common development environment files:

  • .idea and .iml: JetBrains IDE files
  • .DS_Store: macOS system files
examples/complete/fixtures.us-east-2.tfvars (1)

11-12: LGTM: SSM state configuration added correctly

The addition of ssm_state_enabled = true aligns with the PR objective to support AWS SSM for tailscaled state storage. This is appropriately set to true in the complete example to demonstrate the feature.

.github/workflows/trunk-upgrade.yaml (2)

1-1: LGTM: Schedule change reduces notification noise

The change from weekly to monthly trunk upgrades is a good optimization that aligns with the PR's objective to reduce unnecessary notifications. The cron schedule is well-documented with a clear comment.

Also applies to: 4-5


21-21: Verify the trunk-action version compatibility

The action has been updated to v1.1.18 (commit: 2eaee169). While pinning to a specific version is good practice, let's verify there are no breaking changes.

✅ Verification successful

Trunk-action v1.1.18 update looks safe to proceed

The changes in v1.1.18 are minor improvements and dependency updates:

  • Updated action dependencies
  • Better logging by not suppressing launcher output
  • Added new optional labels input feature

No breaking changes or compatibility issues are indicated in the release notes.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for any breaking changes or known issues in trunk-action v1.1.18

# Fetch the changelog or release notes
gh api \
  -H "Accept: application/vnd.github+json" \
  /repos/trunk-io/trunk-action/releases | \
  jq -r '.[] | select(.tag_name=="v1.1.18") | .body'

Length of output: 436

.coderabbit.yaml (3)

13-24: LGTM! Knowledge base configuration is well-structured

The consistent use of global scope across all sections will help leverage organization-wide knowledge effectively.


1-11: Verify schema URL validity

The configuration starts with good documentation links and clear basic settings. Let's verify the schema URL is current.

✅ Verification successful

Schema URL is valid and configuration looks good

The schema URL is accessible and returns valid JSON. The basic configuration section is well-structured with clear documentation links, appropriate language settings, and professional tone instructions.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the schema URL is accessible and returns valid JSON
curl -s -f -o /dev/null https://coderabbit.ai/integrations/schema.v2.json
if [ $? -eq 0 ]; then
    echo "Schema URL is valid and accessible"
else
    echo "Schema URL might be outdated or inaccessible"
fi

Length of output: 172


26-43: Verify documentation URLs in path instructions

The review settings are well-configured with appropriate ignore keywords. Let's verify the documentation URLs referenced in the path instructions.

✅ Verification successful

All documentation URLs in path instructions are accessible and valid

The verification confirms that all three documentation URLs referenced in the path instructions are accessible:

  • HashiCorp Terraform docs
  • OpenTofu docs
  • Cloud Posse best practices
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify documentation URLs in path instructions
urls=(
    "https://developer.hashicorp.com/terraform/docs"
    "https://opentofu.org/docs/"
    "https://docs.cloudposse.com/best-practices/terraform/"
)

for url in "${urls[@]}"; do
    if curl -s -f -o /dev/null "$url"; then
        echo "$url is accessible"
    else
        echo "$url might be outdated or inaccessible"
    fi
done

Length of output: 805

main.tf (3)

12-13: LGTM! Robust flag handling implementation

The use of compact with join ensures clean flag string generation, and the length check is a reliable way to determine if flags are enabled.


26-26: LGTM! Consistent variable usage

The userdata template now correctly uses the local variable for extra flags.


37-37: Verify if the referenced PR has been merged

The TODO comment references a pending PR. Let's verify its status to ensure we're using the latest version.

✅ Verification successful

PR #29 is still open - version update comment remains relevant

The referenced PR (masterpointio/terraform-aws-ssm-agent#29) is still open and hasn't been merged yet. The TODO comment is still valid and should be kept until the PR gets merged.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if the referenced PR has been merged
gh pr view 29 --repo masterpointio/terraform-aws-ssm-agent --json state,mergedAt

Length of output: 115

variables.tf (1)

184-191: Implementation looks good!

The variable follows Terraform conventions and provides a clean way to enable SSM state storage. The default value of false ensures backward compatibility.

README.md (2)

80-84: LGTM! Clear module configuration.

The new ssm_state module entry is well-structured and properly integrated into the documentation.


130-130: Well-documented input variable!

The ssm_state_enabled variable is thoroughly documented with:

  • Clear description of its purpose
  • Appropriate default value
  • Helpful link to Tailscale documentation for further reading

userdata.sh.tmpl Show resolved Hide resolved
main.tf Outdated Show resolved Hide resolved
oycyc
oycyc previously approved these changes Nov 15, 2024
Copy link
Contributor

@oycyc oycyc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (2)
variables.tf (1)

201-208: Enhance documentation for SSM state feature

The description could be more detailed to explain the purpose and benefits of this feature, especially since it's a key solution for the "Duplicate node key" issue mentioned in the PR objectives.

Consider updating the description:

   description = <<-EOT
-  Control is tailscaled state (including preferences and keys) is stored in AWS SSM.
-  See more in the [docs](https://tailscale.com/kb/1278/tailscaled#flags-to-tailscaled).
+  Controls whether tailscaled state (including preferences and keys) is stored in AWS SSM Parameter Store.
+  When enabled, this allows for state persistence across instance replacements and helps prevent
+  "Duplicate node key" issues during ASG operations. This is particularly useful for maintaining
+  device state when using Terraform with ephemeral nodes.
+  See more in the [docs](https://tailscale.com/kb/1278/tailscaled#state-file-location).
   EOT

Also, there's a minor typo in the current description: "Control is" should be "Controls if".

README.md (1)

107-107: Consider enhancing ASG variable descriptions.

While the descriptions are clear, it would be helpful to add a note about the relationship between these variables (e.g., min_size ≤ desired_capacity ≤ max_size) and their impact on instance scaling behavior.

Also applies to: 120-121

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 574177c and afd3879.

📒 Files selected for processing (3)
  • README.md (4 hunks)
  • main.tf (4 hunks)
  • variables.tf (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • main.tf
🧰 Additional context used
📓 Path-based instructions (1)
variables.tf (1)

Pattern **/*.tf: You're a Terraform expert who has thoroughly studied all the documentation from Hashicorp https://developer.hashicorp.com/terraform/docs and OpenTofu https://opentofu.org/docs/.
You have a strong grasp of Terraform syntax and prioritize providing accurate and insightful code suggestions.
As a fan of the Cloud Posse / SweetOps ecosystem, you incorporate many of their best practices https://docs.cloudposse.com/best-practices/terraform/ while balancing them with general Terraform guidelines.

🔇 Additional comments (2)
README.md (2)

80-84: Well-documented module addition!

The new ssm_state module is clearly documented with its source and version. The table formatting maintains consistency with the existing documentation.


133-133: LGTM - Well-documented SSM state control!

The ssm_state_enabled variable is thoroughly documented with a clear description and includes a helpful link to the Tailscale documentation for more details.

variables.tf Outdated Show resolved Hide resolved
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (3)
variables.tf (2)

100-116: Add validation blocks for individual ASG size variables

While we can't cross-reference variables in validation blocks, we can still add individual validations to ensure each variable maintains its own constraints.

 variable "max_size" {
   description = "Maximum number of instances in the Auto Scaling Group. Must be >= desired_capacity."
   type        = number
   default     = 2
+  validation {
+    condition     = var.max_size > 0
+    error_message = "max_size must be greater than 0"
+  }
 }

 variable "min_size" {
   description = "Minimum number of instances in the Auto Scaling Group"
   type        = number
   default     = 1
+  validation {
+    condition     = var.min_size >= 0
+    error_message = "min_size must be greater than or equal to 0"
+  }
 }

 variable "desired_capacity" {
   description = "Desired number of instances in the Auto Scaling Group"
   type        = number
   default     = 1
+  validation {
+    condition     = var.desired_capacity > 0
+    error_message = "desired_capacity must be greater than 0"
+  }
 }

201-208: Enhance description for SSM state variable

The current description could be more detailed about the purpose and benefits of storing state in SSM.

 variable "ssm_state_enabled" {
   default     = false
   type        = bool
   description = <<-EOT
-  Control is tailscaled state (including preferences and keys) is stored in AWS SSM.
+  Controls whether tailscaled state (including preferences and keys) is stored in AWS SSM Parameter Store.
+  When enabled, this allows for state persistence across instance replacements and helps prevent
+  "Duplicate node key" issues during ASG operations. This is particularly useful for maintaining
+  device identity when instances are recycled.
   See more in the [docs](https://tailscale.com/kb/1278/tailscaled#flags-to-tailscaled).
   EOT
 }
README.md (1)

107-107: Enhance ASG configuration documentation

The descriptions for the Auto Scaling Group configuration variables could be more detailed to explain their relationships and constraints. Consider updating them to:

-| <a name="input_desired_capacity"></a> [desired_capacity](#input_desired_capacity) | Desired number of instances in the Auto Scaling Group | `number` | `1` | no |
+| <a name="input_desired_capacity"></a> [desired_capacity](#input_desired_capacity) | The target number of instances that should be running in the Auto Scaling Group. Must be less than or equal to max_size and greater than or equal to min_size. | `number` | `1` | no |

-| <a name="input_max_size"></a> [max_size](#input_max_size) | Maximum number of instances in the Auto Scaling Group. Must be >= desired_capacity. | `number` | `2` | no |
+| <a name="input_max_size"></a> [max_size](#input_max_size) | The maximum size of the Auto Scaling Group. AWS will not scale the group above this number. Must be greater than or equal to both desired_capacity and min_size. | `number` | `2` | no |

-| <a name="input_min_size"></a> [min_size](#input_min_size) | Minimum number of instances in the Auto Scaling Group | `number` | `1` | no |
+| <a name="input_min_size"></a> [min_size](#input_min_size) | The minimum size of the Auto Scaling Group. AWS will not scale the group below this number. Must be less than or equal to both desired_capacity and max_size. | `number` | `1` | no |

Also applies to: 120-121

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between afd3879 and 09b372c.

📒 Files selected for processing (3)
  • README.md (4 hunks)
  • main.tf (4 hunks)
  • variables.tf (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • main.tf
🧰 Additional context used
📓 Path-based instructions (1)
variables.tf (1)

Pattern **/*.tf: You're a Terraform expert who has thoroughly studied all the documentation from Hashicorp https://developer.hashicorp.com/terraform/docs and OpenTofu https://opentofu.org/docs/.
You have a strong grasp of Terraform syntax and prioritize providing accurate and insightful code suggestions.
As a fan of the Cloud Posse / SweetOps ecosystem, you incorporate many of their best practices https://docs.cloudposse.com/best-practices/terraform/ while balancing them with general Terraform guidelines.

🔇 Additional comments (2)
README.md (2)

80-84: LGTM! Well-structured module definition

The ssm_state module is properly defined with a pinned version, which is a good practice for maintaining stability.


133-133: LGTM! Clear documentation with reference

The ssm_state_enabled variable is well-documented with a clear description and includes a helpful link to the official Tailscale documentation.

Copy link
Member

@Gowiem Gowiem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gberenice one suggestion + question, but I think this is looking good 👍

variables.tf Show resolved Hide resolved
variables.tf Outdated Show resolved Hide resolved
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (3)
main.tf (3)

7-8: LGTM! Consider adding parameter name validation

The SSM parameter name construction and state flag handling look good. However, consider adding validation for the parameter name to ensure it follows AWS SSM parameter name constraints.

Add this validation block to your variables file:

variable "ssm_state_enabled" {
  validation {
    condition     = !var.ssm_state_enabled || can(regex("^/[a-zA-Z0-9.-_/]+$", "/tailscale/${var.id}/state"))
    error_message = "SSM parameter names must contain only letters, numbers, and the following symbols: .-_/"
  }
}

77-94: LGTM! Consider explicit KMS key for additional security

Good setup of the SSM parameter with appropriate type and value handling. Consider using a custom KMS key for additional security control.

Add KMS key configuration:

module "kms_key" {
  source  = "cloudposse/kms-key/aws"
  version = "0.12.1"

  description             = "KMS key for Tailscale state encryption"
  deletion_window_in_days = 7
  enable_key_rotation    = true
  alias                  = "alias/${module.this.id}/tailscale-state"
  
  context = module.this.context
}

# Then in ssm_state module:
kms_key_id = module.kms_key.key_id

96-117: LGTM! Consider adding GetParameter permission

The IAM policy is well-scoped. Consider adding GetParameter permission as tailscaled might need to read the state during startup.

Update the policy actions:

-        actions = ["ssm:PutParameter"]
+        actions = ["ssm:PutParameter", "ssm:GetParameter"]
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 09b372c and 1f5acc4.

📒 Files selected for processing (2)
  • README.md (4 hunks)
  • main.tf (4 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
main.tf (1)

Pattern **/*.tf: You're a Terraform expert who has thoroughly studied all the documentation from Hashicorp https://developer.hashicorp.com/terraform/docs and OpenTofu https://opentofu.org/docs/.
You have a strong grasp of Terraform syntax and prioritize providing accurate and insightful code suggestions.
As a fan of the Cloud Posse / SweetOps ecosystem, you incorporate many of their best practices https://docs.cloudposse.com/best-practices/terraform/ while balancing them with general Terraform guidelines.

🔇 Additional comments (8)
main.tf (4)

12-13: LGTM! Robust flag handling

Good use of join with compact to handle the flags, and the length check is a more reliable way to determine if flags are enabled.


17-17: LGTM! Safe policy handling

Good use of the try function with appropriate fallback to handle the optional SSM policy.


28-28: LGTM! Consistent flag usage

Correctly updated to use the local variable that includes the SSM state flag.


119-123: LGTM! Efficient policy attachment

Good use of for_each with toset for policy attachments, making it more maintainable and efficient than using count.

README.md (4)

74-77: Provider version updates look good!

The AWS provider update to 5.76.0 and Tailscale provider update to 0.17.2 align with current best practices.


81-86: Module additions are well-documented!

The new modules ssm_policy and ssm_state are properly documented with their sources and versions, supporting the new AWS SSM state storage feature.


110-110: ASG configuration variables are well-defined!

The new Auto Scaling Group variables (desired_capacity, max_size, min_size) are properly documented with clear descriptions and sensible defaults:

  • desired_capacity: 1
  • max_size: 2
  • min_size: 1

These additions will help prevent the "Duplicate node key" issue mentioned in the PR objectives.

Also applies to: 123-124


136-136: SSM state configuration is well-documented!

The new ssm_state_enabled variable is properly documented with:

  • Clear description
  • Link to Tailscale documentation
  • Default value set to false for backward compatibility

max_size = var.max_size
min_size = var.min_size
desired_capacity = var.desired_capacity
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codebase verification

Add validation blocks to enforce ASG size constraints

The variables are defined with appropriate types and defaults (max_size=2, min_size=1, desired_capacity=1), which satisfy the basic constraint. However, to prevent misconfiguration, consider adding validation blocks to enforce:

  • min_size ≤ desired_capacity ≤ max_size
  • max_size > 0
  • min_size ≥ 0

Example locations needing attention:

  • variables.tf: Add validation blocks to max_size, min_size, and desired_capacity variables
🔗 Analysis chain

LGTM! Verify ASG size constraints

The ASG configuration parameters are correctly added. However, ensure that the values satisfy: min_size ≤ desired_capacity ≤ max_size.

Let's verify the variable constraints:

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if variables.tf has appropriate constraints
rg -A 5 "variable \"(max_size|min_size|desired_capacity)\"" | grep -E "type|default|validation"

Length of output: 296


Script:

#!/bin/bash
# Get full variable definitions to see their context and any validation blocks
rg -A 10 "variable \"(max_size|min_size|desired_capacity)\"" variables.tf

Length of output: 599

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (3)
main.tf (1)

7-9: LGTM! Consider adding parameter path prefix variable

The local variables are well-structured with proper use of conditionals and string manipulation functions. The SSM parameter path follows best practices with a hierarchical structure.

Consider making the /tailscale prefix in the SSM parameter path configurable via a variable to support different naming conventions across organizations:

+variable "ssm_parameter_prefix" {
+  type        = string
+  default     = "/tailscale"
+  description = "Prefix for SSM parameter path"
+}

 locals {
-  ssm_state_param_name = var.ssm_state_enabled ? "/tailscale/${module.this.id}/state" : null
+  ssm_state_param_name = var.ssm_state_enabled ? "${var.ssm_parameter_prefix}/${module.this.id}/state" : null

Also applies to: 12-14, 17-18

README.md (2)

81-86: Consider adding module descriptions.

While the new modules ssm_policy and ssm_state are properly documented with their sources and versions, it would be helpful to add brief descriptions explaining their purposes in the context of the Tailscale Subnet Router.

Consider adding descriptions like this:

 | Name                                                                                                     | Source                             | Version |
 | -------------------------------------------------------------------------------------------------------- | ---------------------------------- | ------- |
-| <a name="module_ssm_policy"></a> [ssm_policy](#module_ssm_policy)                                        | cloudposse/iam-policy/aws          | 2.0.1   |
-| <a name="module_ssm_state"></a> [ssm_state](#module_ssm_state)                                           | cloudposse/ssm-parameter-store/aws | 0.13.0  |
+| <a name="module_ssm_policy"></a> [ssm_policy](#module_ssm_policy)                                        | cloudposse/iam-policy/aws          | 2.0.1   | IAM policy for SSM state access |
+| <a name="module_ssm_state"></a> [ssm_state](#module_ssm_state)                                           | cloudposse/ssm-parameter-store/aws | 0.13.0  | Stores tailscaled state in SSM Parameter Store |

110-110: LGTM: New input variables are well documented.

The new input variables for ASG configuration (desired_capacity, max_size, min_size) and SSM state management (ssm_state_enabled) are properly documented with clear descriptions, types, and default values. The documentation aligns well with the PR objectives.

A small suggestion: Consider adding a note about the relationship between these variables in the description of desired_capacity:

-| <a name="input_desired_capacity"></a> [desired_capacity](#input_desired_capacity)                                                    | Desired number of instances in the Auto Scaling Group                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                | `number`       | `1`                                                                                                                                                                                                                                                                                                                                                                                                                                                               |    no    |
+| <a name="input_desired_capacity"></a> [desired_capacity](#input_desired_capacity)                                                    | Desired number of instances in the Auto Scaling Group. Must be between min_size and max_size.                                                                                                                                                                                                                                                                                                                                                                                                                                                                         | `number`       | `1`                                                                                                                                                                                                                                                                                                                                                                                                                                                               |    no    |

Also applies to: 123-124, 136-136

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 1f5acc4 and 985f26c.

📒 Files selected for processing (2)
  • README.md (4 hunks)
  • main.tf (4 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
main.tf (1)

Pattern **/*.tf: You're a Terraform expert who has thoroughly studied all the documentation from Hashicorp https://developer.hashicorp.com/terraform/docs and OpenTofu https://opentofu.org/docs/.
You have a strong grasp of Terraform syntax and prioritize providing accurate and insightful code suggestions.
As a fan of the Cloud Posse / SweetOps ecosystem, you incorporate many of their best practices https://docs.cloudposse.com/best-practices/terraform/ while balancing them with general Terraform guidelines.

🔇 Additional comments (5)
main.tf (4)

Line range hint 20-32: LGTM! Clean userdata template configuration

The userdata template configuration is well-structured with proper variable interpolation and consistent use of the new tailscaled_extra_flags local variable.


55-59: LGTM! ASG configuration parameters added

The Auto Scaling Group configuration parameters are correctly integrated into the module.


77-94: LGTM! Well-configured SSM parameter for state storage

The SSM state configuration is robust:

  • Uses SecureString type for enhanced security
  • Properly sets ignore_value_changes to prevent Terraform from managing the dynamic state
  • Includes descriptive parameter description

119-123: LGTM! Clean policy attachment implementation

The IAM role policy attachment is well-implemented using for_each for dynamic policy attachments.

README.md (1)

74-77: LGTM: Provider versions are up-to-date.

The AWS and Tailscale provider versions have been updated to recent stable releases.

main.tf Outdated Show resolved Hide resolved
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (1)
README.md (1)

81-86: Consider adding module descriptions

The new SSM-related modules (ssm_policy and ssm_state) have been added to support AWS SSM state storage, but their purposes could be better documented.

Consider adding brief descriptions to clarify their roles:

| Name | Source | Version | Description |
|-|-|-|-|
-| <a name="module_ssm_policy"></a> [ssm_policy](#module_ssm_policy) | cloudposse/iam-policy/aws | 2.0.1 |
+| <a name="module_ssm_policy"></a> [ssm_policy](#module_ssm_policy) | cloudposse/iam-policy/aws | 2.0.1 | Creates IAM policy for SSM state access |
-| <a name="module_ssm_state"></a> [ssm_state](#module_ssm_state) | cloudposse/ssm-parameter-store/aws | 0.13.0 |
+| <a name="module_ssm_state"></a> [ssm_state](#module_ssm_state) | cloudposse/ssm-parameter-store/aws | 0.13.0 | Manages Tailscale state in SSM Parameter Store |
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 985f26c and a497d78.

📒 Files selected for processing (2)
  • README.md (4 hunks)
  • main.tf (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • main.tf
🔇 Additional comments (2)
README.md (2)

74-77: LGTM: Provider versions are up-to-date

The AWS and Tailscale provider versions have been updated to recent stable versions, which is good practice for security and feature support.


110-110: LGTM: Well-documented input variables

The new input variables are well-documented with clear descriptions:

  • desired_capacity: Controls ASG desired instance count
  • max_size: Sets ASG maximum instance limit
  • min_size: Sets ASG minimum instance limit
  • ssm_state_enabled: Controls Tailscale state storage in SSM

The documentation includes default values and proper formatting, which aligns well with the PR objectives of supporting AWS SSM state storage and configurable ASG sizing.

Also applies to: 123-124, 136-136

Co-authored-by: Matt Gowie <matt@masterpoint.io>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (1)
variables.tf (1)

100-116: Enhance ASG variable descriptions and add validations

The ASG size variables would benefit from more detailed descriptions and individual validation blocks.

Consider applying these improvements:

 variable "max_size" {
-  description = "Maximum number of instances in the Auto Scaling Group. Must be >= desired_capacity."
+  description = "Maximum number of instances in the Auto Scaling Group. Sets the upper limit for scaling operations."
   type        = number
   default     = 2
+  validation {
+    condition     = var.max_size > 0
+    error_message = "max_size must be greater than 0"
+  }
 }

 variable "min_size" {
-  description = "Minimum number of instances in the Auto Scaling Group"
+  description = "Minimum number of instances in the Auto Scaling Group. Sets the lower limit for scaling operations."
   type        = number
   default     = 1
+  validation {
+    condition     = var.min_size >= 0
+    error_message = "min_size must be greater than or equal to 0"
+  }
 }

 variable "desired_capacity" {
-  description = "Desired number of instances in the Auto Scaling Group"
+  description = "Desired number of instances in the Auto Scaling Group. This number must fall between min_size and max_size."
   type        = number
   default     = 1
+  validation {
+    condition     = var.desired_capacity > 0
+    error_message = "desired_capacity must be greater than 0"
+  }
 }
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between a497d78 and 9d3f884.

📒 Files selected for processing (1)
  • variables.tf (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
variables.tf (1)

Pattern **/*.tf: You're a Terraform expert who has thoroughly studied all the documentation from Hashicorp https://developer.hashicorp.com/terraform/docs and OpenTofu https://opentofu.org/docs/.
You have a strong grasp of Terraform syntax and prioritize providing accurate and insightful code suggestions.
As a fan of the Cloud Posse / SweetOps ecosystem, you incorporate many of their best practices https://docs.cloudposse.com/best-practices/terraform/ while balancing them with general Terraform guidelines.

🔇 Additional comments (1)
variables.tf (1)

201-208: Verify required IAM permissions for SSM state storage

The variable definition looks good, but we should ensure the EC2 instance has the necessary IAM permissions to interact with SSM Parameter Store.

Let's check the IAM permissions in the codebase:

Copy link
Member

@Gowiem Gowiem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aha I think you said elsewhere that the SSM Agent module already gives it read permissions. Let's go with this 👍

@gberenice gberenice merged commit 4e9ef78 into main Nov 21, 2024
3 checks passed
@gberenice gberenice deleted the feature/ssm-state branch November 21, 2024 13:29
gberenice pushed a commit that referenced this pull request Nov 21, 2024
🤖 I have created a release *beep* *boop*
---


##
[1.5.0](v1.4.0...v1.5.0)
(2024-11-21)


### Features

* support AWS SSM tailscaled state
([#41](#41))
([4e9ef78](4e9ef78))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@coderabbitai coderabbitai bot mentioned this pull request Nov 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants