thoughtbot · OlamideOl1 · Jun 4, 2026
@@ -0,0 +1,23 @@
+name: elasticache-redis/replication-group-secondary
+on:
+  pull_request:
+    branches:
+      - main
+    paths:
+      - elasticache-redis/replication-group-secondary/**
+    types:
+      - closed
+      - opened
+      - reopened
+      - synchronize
+jobs:
+  terraform:
+    uses: ./.github/workflows/terraform.yml
+    concurrency: ${{ github.workflow }}
+    with:
+      module: elasticache-redis/replication-group-secondary
+    permissions:
+      id-token: write
+      contents: write
+      checks: write
+      pull-requests: write
@@ -0,0 +1,76 @@
+# ElastiCache Redis (Global Datastore Secondary)
+
+Provision a secondary (regional) member of an ElastiCache global datastore.
+
+Use this module instead of `replication-group` when joining an existing global
+datastore via `global_replication_group_id`. A secondary member inherits the
+engine, engine version, node type, encryption settings, parameter group,
+snapshots, and auth token from the global datastore's primary, so those
+arguments are intentionally omitted here -- the AWS provider rejects them when
+`global_replication_group_id` is set.
+
+<!-- BEGIN_TF_DOCS -->
+## Requirements
+
+| Name | Version |
+|------|---------|
+| <a name="requirement_terraform"></a> [terraform](#requirement\_terraform) | >= 1.6.2 |
+| <a name="requirement_aws"></a> [aws](#requirement\_aws) | ~> 6.0 |
+
+## Providers
+
+| Name | Version |
+|------|---------|
+| <a name="provider_aws"></a> [aws](#provider\_aws) | ~> 6.0 |
+
+## Modules
+
+| Name | Source | Version |
+|------|--------|---------|
+| <a name="module_client_security_group"></a> [client\_security\_group](#module\_client\_security\_group) | ../../security-group | n/a |
+| <a name="module_server_security_group"></a> [server\_security\_group](#module\_server\_security\_group) | ../../security-group | n/a |
+
+## Resources
+
+| Name | Type |
+|------|------|
+| [aws_cloudwatch_metric_alarm.check_cpu_balance](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/cloudwatch_metric_alarm) | resource |
+| [aws_cloudwatch_metric_alarm.cpu](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/cloudwatch_metric_alarm) | resource |
+| [aws_cloudwatch_metric_alarm.memory](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/cloudwatch_metric_alarm) | resource |
+| [aws_elasticache_replication_group.this](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/elasticache_replication_group) | resource |
+| [aws_elasticache_subnet_group.this](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/elasticache_subnet_group) | resource |
+| [aws_ec2_instance_type.instance_attributes](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/ec2_instance_type) | data source |
+
+## Inputs
+
+| Name | Description | Type | Default | Required |
+|------|-------------|------|---------|:--------:|
+| <a name="input_alarm_actions"></a> [alarm\_actions](#input\_alarm\_actions) | SNS topics or other actions to invoke for alarms | `list(object({ arn = string }))` | `[]` | no |
+| <a name="input_allowed_cidr_blocks"></a> [allowed\_cidr\_blocks](#input\_allowed\_cidr\_blocks) | CIDR blocks allowed to access the database | `list(string)` | `[]` | no |
+| <a name="input_allowed_security_group_ids"></a> [allowed\_security\_group\_ids](#input\_allowed\_security\_group\_ids) | Security groups allowed to access the database | `list(string)` | `[]` | no |
+| <a name="input_apply_immediately"></a> [apply\_immediately](#input\_apply\_immediately) | Set to true to apply changes immediately | `bool` | `false` | no |
+| <a name="input_client_security_group_name"></a> [client\_security\_group\_name](#input\_client\_security\_group\_name) | Override the name for the security group; defaults to identifier | `string` | `""` | no |
+| <a name="input_create_client_security_group"></a> [create\_client\_security\_group](#input\_create\_client\_security\_group) | Set to false to only use existing security groups | `bool` | `true` | no |
+| <a name="input_create_server_security_group"></a> [create\_server\_security\_group](#input\_create\_server\_security\_group) | Set to false to only use existing security groups | `bool` | `true` | no |
+| <a name="input_description"></a> [description](#input\_description) | Human-readable description for this replication group | `string` | n/a | yes |
+| <a name="input_global_replication_group_id"></a> [global\_replication\_group\_id](#input\_global\_replication\_group\_id) | The ID of the global replication group to which this replication group belongs. | `string` | n/a | yes |
+| <a name="input_name"></a> [name](#input\_name) | Name for this cluster | `string` | n/a | yes |
+| <a name="input_port"></a> [port](#input\_port) | Port on which to listen (used for the security group; the replication group itself inherits the port from the global datastore) | `number` | `6379` | no |
+| <a name="input_replica_count"></a> [replica\_count](#input\_replica\_count) | Number of read-only replicas to add to the cluster | `number` | `1` | no |
+| <a name="input_replication_group_id"></a> [replication\_group\_id](#input\_replication\_group\_id) | Override the ID for the replication group | `string` | `""` | no |
+| <a name="input_server_security_group_ids"></a> [server\_security\_group\_ids](#input\_server\_security\_group\_ids) | IDs of VPC security groups for this instance. One of vpc\_id or server\_security\_group\_ids is required | `list(string)` | `[]` | no |
+| <a name="input_server_security_group_name"></a> [server\_security\_group\_name](#input\_server\_security\_group\_name) | Override the name for the security group; defaults to identifier | `string` | `""` | no |
+| <a name="input_subnet_group_name"></a> [subnet\_group\_name](#input\_subnet\_group\_name) | Override the name for the subnet group | `string` | `""` | no |
+| <a name="input_subnet_ids"></a> [subnet\_ids](#input\_subnet\_ids) | Subnets connected to the database | `list(string)` | n/a | yes |
+| <a name="input_tags"></a> [tags](#input\_tags) | Tags to be applied to created resources | `map(string)` | `{}` | no |
+| <a name="input_vpc_id"></a> [vpc\_id](#input\_vpc\_id) | ID of VPC for this instance. One of vpc\_id or server\_security\_group\_ids is required | `string` | `null` | no |
+
+## Outputs
+
+| Name | Description |
+|------|-------------|
+| <a name="output_client_security_group_id"></a> [client\_security\_group\_id](#output\_client\_security\_group\_id) | ID of the security group created for clients |
+| <a name="output_id"></a> [id](#output\_id) | ID of the created replication group |
+| <a name="output_instance"></a> [instance](#output\_instance) | ElastiCache Redis replication group |
+| <a name="output_server_security_group_id"></a> [server\_security\_group\_id](#output\_server\_security\_group\_id) | ID of the security group created for the server |
+<!-- END_TF_DOCS -->
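The README lists the inputs but does not show a call site. A minimal sketch of how the module might be used, based only on the required inputs in the table above. The module label `redis_secondary`, the relative `source` path, and the three root-level variables are assumptions for illustration, not part of this PR. The AWS provider in scope is assumed to target the secondary region.

```hcl
# Sketch only: the source path, names, and variables are placeholders.
variable "global_replication_group_id" {
  description = "ID of the existing global datastore to join"
  type        = string
}

variable "private_subnet_ids" {
  description = "Subnets in the secondary region for the cache nodes"
  type        = list(string)
}

variable "vpc_id" {
  description = "VPC in the secondary region"
  type        = string
}

module "redis_secondary" {
  # Assumed relative path; adjust to wherever this repository is vendored.
  source = "../elasticache-redis/replication-group-secondary"

  name                        = "example-redis-secondary"
  description                 = "Secondary member of the example global datastore"
  global_replication_group_id = var.global_replication_group_id
  subnet_ids                  = var.private_subnet_ids
  vpc_id                      = var.vpc_id

  # One replica in addition to the first node (two cache clusters in total).
  replica_count = 1
}

# Re-export the values a client workload would typically need.
output "redis_secondary_id" {
  value = module.redis_secondary.id
}

output "redis_client_security_group_id" {
  value = module.redis_secondary.client_security_group_id
}
```

Engine, node type, encryption, and auth token are deliberately absent from the call, matching the README's note that they are inherited from the global datastore's primary.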
@@ -0,0 +1,211 @@
+# Secondary (regional) member of an ElastiCache global datastore.
+#
+# When `global_replication_group_id` is set, the replication group inherits
+# engine, engine_version, node_type, encryption, parameter group, snapshots and
+# the auth token from the global datastore's primary. The provider enforces
+# this with ConflictsWith rules that fire whenever those attributes are
+# *configured* at all -- even when set to null -- so this module omits them.
+resource "aws_elasticache_replication_group" "this" {
+  replication_group_id        = coalesce(var.replication_group_id, var.name)
+  global_replication_group_id = var.global_replication_group_id
+
+  apply_immediately          = var.apply_immediately
+  automatic_failover_enabled = local.replica_enabled
+  description                = var.description
+  multi_az_enabled           = local.replica_enabled
+  num_cache_clusters         = local.instance_count
+  security_group_ids         = local.server_security_group_ids
+  subnet_group_name          = aws_elasticache_subnet_group.this.name
+}
+
+resource "aws_elasticache_subnet_group" "this" {
+  name = coalesce(
+    var.subnet_group_name,
+    var.replication_group_id,
+    var.name
+  )
+
+  description = "Redis subnet group"
+  subnet_ids  = var.subnet_ids
+
+  lifecycle {
+    create_before_destroy = true
+  }
+}
+
+module "server_security_group" {
+  count  = var.create_server_security_group ? 1 : 0
+  source = "../../security-group"
+
+  allowed_cidr_blocks = var.allowed_cidr_blocks
+  description         = "ElastiCache Redis server: ${var.name}"
+  randomize_name      = var.server_security_group_name == ""
+  tags                = var.tags
+  vpc_id              = var.vpc_id
+
+  allowed_security_group_ids = concat(
+    var.allowed_security_group_ids,
+    module.client_security_group[*].id
+  )
+
+  name = coalesce(
+    var.server_security_group_name,
+    "${var.name}-server"
+  )
+
+  ports = {
+    redis = var.port
+  }
+}
+
+module "client_security_group" {
+  count  = var.create_client_security_group ? 1 : 0
+  source = "../../security-group"
+
+  allowed_cidr_blocks        = var.allowed_cidr_blocks
+  allowed_security_group_ids = var.allowed_security_group_ids
+  description                = "ElastiCache Redis client: ${var.name}"
+  randomize_name             = var.client_security_group_name == ""
+  tags                       = var.tags
+  vpc_id                     = var.vpc_id
+
+  name = coalesce(
+    var.client_security_group_name,
+    "${var.name}-client"
+  )
+}
+
+resource "aws_cloudwatch_metric_alarm" "cpu" {
+  count = local.instance_count
+
+  alarm_name          = "${var.name}-${count.index}-high-cpu"
+  alarm_description   = "${var.name}-${count.index} is using more than 90% of its CPU"
+  comparison_operator = "GreaterThanOrEqualToThreshold"
+  evaluation_periods  = "5"
+  metric_name         = "CPUUtilization"
+  namespace           = "AWS/ElastiCache"
+  period              = "60"
+  statistic           = "Average"
+  threshold           = 90 / data.aws_ec2_instance_type.instance_attributes.default_cores
+  treat_missing_data  = "notBreaching"
+
+  dimensions = {
+    CacheClusterId = local.instances[count.index]
+  }
+
+  alarm_actions = var.alarm_actions[*].arn
+  ok_actions    = var.alarm_actions[*].arn
+}
+
+resource "aws_cloudwatch_metric_alarm" "memory" {
+  count = local.instance_count
+
+  alarm_name          = "${var.name}-${count.index}-database-memory-remaining"
+  alarm_description   = "${var.name}-${count.index} has less than ${local.memory_threshold_mb}MiB of memory remaining"
+  comparison_operator = "LessThanOrEqualToThreshold"
+  evaluation_periods  = "2"
+  metric_name         = "FreeableMemory"
+  namespace           = "AWS/ElastiCache"
+  period              = "60"
+  statistic           = "Average"
+  threshold           = local.memory_threshold_mb * 1024 * 1024
+  treat_missing_data  = "notBreaching"
+
+  dimensions = {
+    CacheClusterId = local.instances[count.index]
+  }
+
+  alarm_actions = var.alarm_actions[*].arn
+  ok_actions    = var.alarm_actions[*].arn
+}
+
+resource "aws_cloudwatch_metric_alarm" "check_cpu_balance" {
+  count = data.aws_ec2_instance_type.instance_attributes.burstable_performance_supported == true ? local.instance_count : 0
+
+  alarm_name          = "${var.name}-${count.index}-elasticache-low-cpu-credit"
+  alarm_description   = "Insufficient CPU credits for ${var.name}-${count.index}"
+  comparison_operator = "LessThanOrEqualToThreshold"
+  evaluation_periods  = "2"
+  threshold           = "0"
+  treat_missing_data  = "notBreaching"
+
+  alarm_actions = var.alarm_actions[*].arn
+  ok_actions    = var.alarm_actions[*].arn
+
+  metric_query {
+    id          = "e1"
+    expression  = "m1 - m2 - (m3 * 12)"
+    label       = "Available CPU Credits"
+    return_data = "true"
+  }
+
+  metric_query {
+    id = "m1"
+
+    metric {
+      metric_name = "CPUCreditBalance"
+      namespace   = "AWS/ElastiCache"
+      period      = "120"
+      stat        = "Average"
+      unit        = "Count"
+
+      dimensions = {
+        CacheClusterId = local.instances[count.index]
+      }
+    }
+  }
+
+  metric_query {
+    id = "m2"
+
+    metric {
+      metric_name = "CPUSurplusCreditBalance"
+      namespace   = "AWS/ElastiCache"
+      period      = "120"
+      stat        = "Average"
+      unit        = "Count"
+
+      dimensions = {
+        CacheClusterId = local.instances[count.index]
+      }
+    }
+  }
+
+  metric_query {
+    id = "m3"
+
+    metric {
+      metric_name = "CPUCreditUsage"
+      namespace   = "AWS/ElastiCache"
+      period      = "120"
+      stat        = "Average"
+      unit        = "Count"
+
+      dimensions = {
+        CacheClusterId = local.instances[count.index]
+      }
+    }
+  }
+}
+
+# The node type is inherited from the global datastore primary, so read it back
+# off the created replication group rather than requiring it as an input.
+data "aws_ec2_instance_type" "instance_attributes" {
+  instance_type = local.instance_size
+}
+
+locals {
+  instance_count            = var.replica_count + 1
+  instance_size             = replace(aws_elasticache_replication_group.this.node_type, "cache.", "")
+  instances                 = sort(aws_elasticache_replication_group.this.member_clusters)
+  owned_security_group_ids  = module.server_security_group[*].id
+  replica_enabled           = var.replica_count > 0
+  shared_security_group_ids = var.server_security_group_ids
+
+  memory_threshold_mb = data.aws_ec2_instance_type.instance_attributes.memory_size
+
+  server_security_group_ids = concat(
+    local.owned_security_group_ids,
+    local.shared_security_group_ids
+  )
+}
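One point worth checking in the file above: `node_type` is not configured on the replication group, so on the first plan its value is likely unknown until apply. The `aws_ec2_instance_type` data source would then be deferred, and the `count` on `check_cpu_balance` depends on it. Terraform requires `count` to be known at plan time, so a fresh apply may fail with an "Invalid count argument" error. A minimal sketch of one possible workaround, assuming a hypothetical `node_type` input (not part of this PR) that mirrors the primary's node type and is used only for alarm sizing:

```hcl
# Hypothetical input, not part of this PR. It is NOT passed to the
# replication group (the provider rejects node_type on secondaries);
# it only makes the alarm thresholds known at plan time.
variable "node_type" {
  description = "Node type of the global datastore primary, e.g. cache.t4g.micro (alarm sizing only)"
  type        = string
}

data "aws_ec2_instance_type" "instance_attributes" {
  # trimprefix drops the leading "cache." so the value is a valid EC2 type.
  instance_type = trimprefix(var.node_type, "cache.")
}
```

The trade-off is one more required input that has to be kept in sync with the primary by hand. The approach in the PR avoids that input at the cost of the plan-time dependency described above.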
@@ -0,0 +1,19 @@
+output "client_security_group_id" {
+  description = "ID of the security group created for clients"
+  value       = join("", module.client_security_group[*].id)
+}
+
+output "instance" {
+  description = "ElastiCache Redis replication group"
+  value       = aws_elasticache_replication_group.this
+}
+
+output "id" {
+  description = "ID of the created replication group"
+  value       = aws_elasticache_replication_group.this.replication_group_id
+}
+
+output "server_security_group_id" {
+  description = "ID of the security group created for the server"
+  value       = join("", module.server_security_group[*].id)
+}