feat: support AWS SSM tailscaled state (#41)
## what
* This adds support for storing `tailscaled` state in AWS SSM. This helps
reuse an existing device's state and name rather than deleting old devices,
which Tailscale does not currently support via Terraform.
* This adds Masterpoint's recent GitHub + CodeRabbit configs.
* This sets the Trunk upgrade workflow to run less often to reduce noise.
* This adds the ability to configure the ASG min/max size and desired capacity.
This is important when using external state, to avoid the
[`Duplicate node key` issue](https://tailscale.com/kb/1023/troubleshooting):
> This can occur if you use a backup of one device to create another, or
clone a file system from one device to another. The Tailscale
configuration files are duplicated. The Tailscale files will need to be
removed from one of the two.
You can identify duplicated devices in the
[Machines](https://login.tailscale.com/admin/machines) page of the admin
console by looking for a Duplicate node key badge underneath the device
name.
On one of the systems, [uninstall and completely
delete](https://tailscale.com/kb/1069/uninstall) the Tailscale app. It
is especially important to remove the files listed for your platform,
the goal is to make a new Tailscale IP address when it is installed
again. Then, [reinstall the app](https://tailscale.com/kb/install).


## why
* Ephemeral nodes behave erratically during rotation: we had to restart
the instance manually to get Tailscale running. It's hard to reproduce, so
keeping the state in external storage is one more option to try to keep
the device in order.

## references
* https://tailscale.com/kb/1278/tailscaled#flags-to-tailscaled
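
A minimal usage sketch of the inputs this commit adds, assuming the module is consumed from a local checkout (the `source` path is a placeholder) and that its other required inputs are supplied as before. Pinning the Auto Scaling Group to a single instance is one possible reading of the duplicate node key concern described above; it is an assumption, not something the commit prescribes.

```hcl
module "tailscale_subnet_router" {
  # Placeholder source (assumption): point this at wherever the module lives for you.
  source = "../../"

  # ... other required inputs (networking, Tailscale auth, etc.) omitted ...

  # Store tailscaled state in an SSM SecureString parameter named
  # /tailscale/<module id>/state; the parameter ARN is passed to tailscaled
  # through the --state flag.
  ssm_state_enabled = true

  # Assumption: keep at most one instance alive so that two nodes never
  # start from the same stored state at the same time.
  min_size         = 1
  max_size         = 1
  desired_capacity = 1
}
```

With `ssm_state_enabled = false` (the default), no SSM parameter or IAM policy is created and `tailscaled` keeps its state on the instance.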



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
  - Introduced a new configuration file for CodeRabbit integration,
    enhancing review and feedback processes.
  - Added new modules for managing SSM state parameters and IAM policies
    in the Terraform setup.
  - Expanded configuration options with new variables for Auto Scaling
    Group and Tailscale state management.
  - Updated the workflow for trunk upgrades to run monthly, improving
    efficiency.

- **Documentation**
  - Enhanced `README.md` with new module and variable details for better
    user guidance.

- **Chores**
  - Updated `.gitignore` to manage ignored files more effectively.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Matt Gowie <matt@masterpoint.io>
gberenice and Gowiem authored Nov 21, 2024
1 parent 67f76f8 commit 4e9ef78
Showing 8 changed files with 210 additions and 24 deletions.
90 changes: 90 additions & 0 deletions .coderabbit.yaml
@@ -0,0 +1,90 @@
+# Docs: https://docs.coderabbit.ai/configure-coderabbit
+# Schema: https://coderabbit.ai/integrations/schema.v2.json
+# Support: https://discord.gg/GsXnASn26c
+
+language: en
+
+tone_instructions: |
+  Provide feedback in a professional, friendly, constructive, and concise tone.
+  Offer clear, specific suggestions and best practices to help enhance the code quality and promote learning.
+early_access: true
+
+knowledge_base:
+  # The scope of learnings to use for the knowledge base.
+  # `local` uses the repository's learnings,
+  # `global` uses the organization's learnings,
+  # `auto` uses repository's learnings for public repositories and organization's learnings for private repositories.
+  # Default value: `auto`
+  learnings:
+    scope: global
+  issues:
+    scope: global
+  pull_requests:
+    scope: global
+
+reviews:
+  profile: chill
+  auto_review:
+    # Ignore reviewing if the title of the pull request contains any of these keywords (case-insensitive)
+    ignore_title_keywords:
+      - wip
+      - draft
+      - test
+  # Set the commit status to 'pending' when the review is in progress and 'success' when it is complete.
+  commit_status: false
+  # Post review details on each review. Additionally, post a review status when a review is skipped in certain cases.
+  review_status: false
+  path_instructions:
+    - path: "**/*.tf"
+      instructions: |
+        You're a Terraform expert who has thoroughly studied all the documentation from Hashicorp https://developer.hashicorp.com/terraform/docs and OpenTofu https://opentofu.org/docs/.
+        You have a strong grasp of Terraform syntax and prioritize providing accurate and insightful code suggestions.
+        As a fan of the Cloud Posse / SweetOps ecosystem, you incorporate many of their best practices https://docs.cloudposse.com/best-practices/terraform/ while balancing them with general Terraform guidelines.
+  tools:
+    # By default, all tools are enabled.
+    # Masterpoint uses Trunk (https://trunk.io) so we do not need a lot of this feedback due to overlap.
+    shellcheck:
+      enabled: false
+    ruff:
+      enabled: false
+    markdownlint:
+      enabled: false
+    github-checks:
+      enabled: false
+    languagetool:
+      enabled: false
+    biome:
+      enabled: false
+    hadolint:
+      enabled: false
+    swiftlint:
+      enabled: false
+    phpstan:
+      enabled: false
+    golangci-lint:
+      enabled: false
+    yamllint:
+      enabled: false
+    gitleaks:
+      enabled: false
+    checkov:
+      enabled: false
+    detekt:
+      enabled: false
+    eslint:
+      enabled: false
+    rubocop:
+      enabled: false
+    buf:
+      enabled: false
+    regal:
+      enabled: false
+    actionlint:
+      enabled: false
+    pmd:
+      enabled: false
+    cppcheck:
+      enabled: false
+    circleci:
+      enabled: false
8 changes: 4 additions & 4 deletions .github/workflows/trunk-upgrade.yaml
@@ -1,8 +1,8 @@
-name: Weekly Trunk Upgrade
+name: Monthly Trunk Upgrade
 on:
   schedule:
-    # Every Monday @ 5am
-    - cron: 0 5 * * 1
+    # On the first day of every month @ 8am
+    - cron: 0 8 1 * *
   # Allows us to manually run the workflow from Actions UI
   workflow_dispatch: {}
 permissions: read-all
@@ -18,7 +18,7 @@ jobs:
         uses: actions/checkout@v4

       - name: Trunk Upgrade
-        uses: trunk-io/trunk-action/upgrade@d5b1b61d0beee562512f530a278b6a2931fba857
+        uses: trunk-io/trunk-action/upgrade@2eaee169140ec559bd556208f9f99cdfdf468da8 # v1.1.18
         with:
           base: main
           reviewers: "@masterpointio/masterpoint-internal"
8 changes: 5 additions & 3 deletions .gitignore
@@ -7,13 +7,15 @@
 .terraform
 .terraform.tfstate.lock.info

-**/.idea
-**/*.iml
-
 # Cloud Posse Build Harness https://github.com/cloudposse/build-harness
 **/.build-harness
 **/build-harness

 # Crash log files
 crash.log
 test.log
+
+# Random
+**/.idea
+**/*.iml
+.DS_Store
28 changes: 18 additions & 10 deletions README.md

Large diffs are not rendered by default.

2 changes: 2 additions & 0 deletions examples/complete/fixtures.us-east-2.tfvars
@@ -8,6 +8,8 @@ region = "us-east-1"
 availability_zones = ["us-east-1a", "us-east-1b"]
 ipv4_primary_cidr_block = "172.16.0.0/16"

+ssm_state_enabled = true
+
 # Replace these values with your own
 tailnet = "orgname.org.github"
 oauth_client_id = "OAUTH_CLIENT_ID"
66 changes: 62 additions & 4 deletions main.tf
@@ -4,9 +4,14 @@ locals {
   prefixed_primary_tag     = "tag:${local.primary_tag}"
   prefixed_additional_tags = [for tag in var.additional_tags : "tag:${tag}"]

+  ssm_state_param_name = var.ssm_state_enabled ? "/tailscale/${module.this.id}/state" : null
+  ssm_state_flag       = var.ssm_state_enabled ? "--state=${module.ssm_state[0].arn_map[local.ssm_state_param_name]}" : ""
+
   tailscale_tags = concat([local.prefixed_primary_tag], local.prefixed_additional_tags)

-  tailscaled_extra_flags_enabled   = length(var.tailscaled_extra_flags) > 0
+  tailscaled_extra_flags         = join(" ", compact(concat(var.tailscaled_extra_flags, [local.ssm_state_flag])))
+  tailscaled_extra_flags_enabled = length(local.tailscaled_extra_flags) > 0
+
   tailscale_up_extra_flags_enabled = length(var.tailscale_up_extra_flags) > 0

   userdata = templatefile("${path.module}/userdata.sh.tmpl", {
@@ -18,7 +23,7 @@ locals {
     tags = join(",", local.tailscale_tags)

     tailscaled_extra_flags_enabled   = local.tailscaled_extra_flags_enabled
-    tailscaled_extra_flags           = join(" ", var.tailscaled_extra_flags)
+    tailscaled_extra_flags           = local.tailscaled_extra_flags
     tailscale_up_extra_flags_enabled = local.tailscale_up_extra_flags_enabled
     tailscale_up_extra_flags         = join(" ", var.tailscale_up_extra_flags)
   })
@@ -45,8 +50,11 @@ module "tailscale_subnet_router" {
   session_logging_enabled           = var.session_logging_enabled
   session_logging_ssm_document_name = var.session_logging_ssm_document_name

-  ami           = var.ami
-  instance_type = var.instance_type
+  ami              = var.ami
+  instance_type    = var.instance_type
+  max_size         = var.max_size
+  min_size         = var.min_size
+  desired_capacity = var.desired_capacity

   monitoring_enabled          = var.monitoring_enabled
   associate_public_ip_address = var.associate_public_ip_address
@@ -63,3 +71,53 @@ resource "tailscale_tailnet_key" "default" {
   # A device is automatically tagged when it is authenticated with this key.
   tags = local.tailscale_tags
 }
+
+module "ssm_state" {
+  count                = var.ssm_state_enabled ? 1 : 0
+  source               = "cloudposse/ssm-parameter-store/aws"
+  version              = "0.13.0"
+  ignore_value_changes = true
+
+  parameter_write = [
+    {
+      name        = local.ssm_state_param_name
+      type        = "SecureString"
+      overwrite   = "true"
+      value       = "{}"
+      description = "Tailscaled state of ${module.this.id} subnet router."
+    }
+  ]
+  context = module.this.context
+  tags    = module.this.tags
+}
+
+module "ssm_policy" {
+  count   = var.ssm_state_enabled ? 1 : 0
+  source  = "cloudposse/iam-policy/aws"
+  version = "2.0.1"
+
+  name        = "ssm"
+  description = "Additional SSM access for SSM Agent"
+
+  iam_policy_enabled = true
+  iam_policy = [{
+    statements = [
+      {
+        sid     = "SSMAgentPutParameter"
+        effect  = "Allow"
+        actions = ["ssm:PutParameter"]
+        resources = [
+          module.ssm_state[0].arn_map[local.ssm_state_param_name],
+        ]
+      },
+    ]
+  }]
+  context = module.this.context
+  tags    = module.this.tags
+}
+
+resource "aws_iam_role_policy_attachment" "default" {
+  count      = var.ssm_state_enabled ? 1 : 0
+  role       = module.tailscale_subnet_router.role_id
+  policy_arn = module.ssm_policy[0].policy_arn
+}
4 changes: 2 additions & 2 deletions userdata.sh.tmpl
@@ -14,8 +14,8 @@ sudo yum-config-manager --add-repo https://pkgs.tailscale.com/stable/amazon-linu
 sudo yum install -y tailscale

 %{ if tailscaled_extra_flags_enabled == true }
-echo "Exporting FLAGS to environment variable..."
-export FLAGS=${tailscaled_extra_flags}%
+echo "Exporting FLAGS to /etc/default/tailscaled..."
+sudo sed -i "s|^FLAGS=.*|FLAGS=\"${tailscaled_extra_flags}\"|" /etc/default/tailscaled
 %{ endif }

 # Setup tailscale
28 changes: 27 additions & 1 deletion variables.tf
@@ -43,7 +43,6 @@ variable "session_logging_kms_key_alias" {
 EOF
 }

-
 variable "session_logging_ssm_document_name" {
   default = "SSM-SessionManagerRunShell-Tailscale"
   type = string
@@ -98,6 +97,24 @@ variable "associate_public_ip_address" {
   default = null
 }

+variable "max_size" {
+  description = "Maximum number of instances in the Auto Scaling Group. Must be >= desired_capacity."
+  type        = number
+  default     = 2
+}
+
+variable "min_size" {
+  description = "Minimum number of instances in the Auto Scaling Group"
+  type        = number
+  default     = 1
+}
+
+variable "desired_capacity" {
+  description = "Desired number of instances in the Auto Scaling Group"
+  type        = number
+  default     = 1
+}
+
 ################
 ## Tailscale ##
 ##############
@@ -180,3 +197,12 @@ variable "tailscale_up_extra_flags" {
     See more in the [docs](https://tailscale.com/kb/1241/tailscale-up).
   EOT
 }
+
+variable "ssm_state_enabled" {
+  default     = false
+  type        = bool
+  description = <<-EOT
+    Control if tailscaled state is stored in AWS SSM (including preferences and keys). This tells the Tailscale daemon to write + read state from SSM, which unlocks important features like retaining the existing tailscale machine name.
+    See more in the [docs](https://tailscale.com/kb/1278/tailscaled#flags-to-tailscaled).
+  EOT
+}
