home / skills / giuseppe-trisciuoglio / developer-kit / aws-cloudformation-ecs
/plugins/developer-kit-aws/skills/aws-cloudformation/aws-cloudformation-ecs
This skill provides AWS CloudFormation patterns for ECS clusters, services, task definitions, and blue/green deployments to simplify infrastructure as code.
npx playbooks add skill giuseppe-trisciuoglio/developer-kit --skill aws-cloudformation-ecsReview the files below or copy the command above to add this skill to your agents.
---
name: aws-cloudformation-ecs
description: Provides AWS CloudFormation patterns for ECS clusters, services, and task definitions. Use when creating ECS infrastructure with CloudFormation, configuring container definitions, scaling policies, service discovery, load balancing integration, and implementing template structure with Parameters, Outputs, Mappings, Conditions, cross-stack references, and blue/green deployments with CodeDeploy.
category: aws
tags: [aws, cloudformation, ecs, containers, docker, orchestration, infrastructure, iaac]
version: 1.1.0
allowed-tools: Read, Write, Bash
---
# AWS CloudFormation ECS
## Overview
Create production-ready container infrastructure using AWS CloudFormation templates. This skill covers ECS clusters, services, task definitions, container configurations, scaling, service discovery, load balancing, and blue/green deployments with CodeDeploy.
## When to Use
Use this skill when:
- Creating new ECS clusters with CloudFormation
- Defining task definitions for container workloads
- Configuring ECS services with deployment strategies
- Integrating ECS with Application Load Balancer
- Implementing auto scaling for ECS services
- Configuring service discovery with Cloud Map
- Implementing blue/green deployments with CodeDeploy
- Organizing templates with Parameters, Outputs, Mappings, Conditions
- Implementing cross-stack references with export/import
- Using Transform for macro and reuse
## Instructions
Follow these steps to create ECS infrastructure with CloudFormation:
1. **Define Cluster**: Create ECS cluster with container insights
2. **Configure Task Definition**: Set up container image, resources, and environment
3. **Create Service**: Configure desired count and deployment strategy
4. **Set Up ALB Integration**: Configure target groups and listeners
5. **Implement Service Discovery**: Use Cloud Map for service naming
6. **Configure Auto Scaling**: Set up target tracking or step scaling
7. **Add Security**: Configure IAM roles and security groups
8. **Implement Blue/Green**: Use CodeDeploy for zero-downtime deployments
For complete examples, see the [EXAMPLES.md](references/examples.md) file.
## Examples
The following examples demonstrate common ECS patterns:
### Example 1: ECS Cluster
```yaml
ECSCluster:
Type: AWS::ECS::Cluster
Properties:
ClusterName: !Sub "${AWS::StackName}-cluster"
ClusterSettings:
- Name: containerInsights
Value: enabled
```
### Example 2: Task Definition
```yaml
TaskDefinition:
Type: AWS::ECS::TaskDefinition
Properties:
Family: web-app
Cpu: "512"
Memory: "1024"
NetworkMode: awsvpc
RequiresCompatibilities:
- EC2
- FARGATE
ExecutionRoleArn: !Ref TaskExecutionRole
ContainerDefinitions:
- Name: web
Image: nginx:latest
PortMappings:
- ContainerPort: 80
LogConfiguration:
LogDriver: awslogs
Options:
awslogs-group: !Ref LogGroup
awslogs-region: !Ref AWS::Region
```
### Example 3: ECS Service with ALB
```yaml
EcsService:
Type: AWS::ECS::Service
Properties:
ServiceName: !Sub "${AWS::StackName}-service"
Cluster: !Ref ECSCluster
TaskDefinition: !Ref TaskDefinition
DesiredCount: 2
LaunchType: FARGATE
DeploymentConfiguration:
MaximumPercent: 200
MinimumHealthyPercent: 50
NetworkConfiguration:
AwsvpcConfiguration:
AssignPublicIp: DISABLED
SecurityGroups:
- !Ref ServiceSecurityGroup
Subnets: !Ref PrivateSubnets
LoadBalancers:
- ContainerName: web
ContainerPort: 80
TargetGroupArn: !Ref TargetGroup
```
For complete production-ready examples, see [EXAMPLES.md](references/examples.md).
## Template Structure
### Base Template with Standard Format
```yaml
AWSTemplateFormatVersion: 2010-09-09
Description: ECS cluster with service and load balancer
Metadata:
AWS::CloudFormation::Interface:
ParameterGroups:
- Label:
default: Cluster Configuration
Parameters:
- ClusterName
- InstanceType
- DesiredCapacity
- Label:
default: Container Configuration
Parameters:
- ContainerName
- ContainerImage
- ContainerPort
Parameters:
ClusterName:
Type: String
Default: production-ecs-cluster
Description: Name of the ECS cluster
InstanceType:
Type: String
Default: t3.medium
AllowedValues:
- t3.small
- t3.medium
- t3.large
- m5.large
DesiredCapacity:
Type: Number
Default: 2
MinValue: 1
MaxValue: 20
Description: Initial number of EC2 instances
ContainerName:
Type: String
Default: web-app
Description: Name of the container
ContainerImage:
Type: String
Default: nginx:latest
Description: Docker image for the container
ContainerPort:
Type: Number
Default: 80
Description: Port the container listens on
Mappings:
EnvironmentConfig:
dev:
InstanceType: t3.small
DesiredCapacity: 1
ContainerMemoryHardLimit: 256
ContainerMemorySoftLimit: 128
staging:
InstanceType: t3.medium
DesiredCapacity: 2
ContainerMemoryHardLimit: 512
ContainerMemorySoftLimit: 256
production:
InstanceType: t3.large
DesiredCapacity: 3
ContainerMemoryHardLimit: 1024
ContainerMemorySoftLimit: 512
Conditions:
IsProduction: !Equals [!Ref Environment, production]
UseSpotInstances: !Equals [!Ref SpotInstances, true]
Transform:
- AWS::Serverless-2016-10-31
Resources:
# ECS Cluster
ECSCluster:
Type: AWS::ECS::Cluster
Properties:
ClusterName: !Ref ClusterName
ClusterSettings:
- Name: containerInsights
Value: enabled
# EC2 Instance Configuration
InstanceCapacity:
Type: AWS::AutoScaling::LaunchConfiguration
Properties:
ImageId: !Ref AmiId
InstanceType: !Ref InstanceType
IamInstanceProfile: !Ref EcsInstanceProfile
SecurityGroups:
- !Ref EcsSecurityGroup
UserData:
Fn::Base64: !Sub |
#!/bin/bash
echo "ECS_CLUSTER=${ECSCluster.Name}" >> /etc/ecs/ecs.config
echo "ECS_BACKEND_HOST=${ECSBackendHost}" >> /etc/ecs/ecs.config
BlockDeviceMappings:
- DeviceName: /dev/xvda
Ebs:
VolumeSize: 30
VolumeType: gp3
AutoScalingGroup:
Type: AWS::AutoScaling::AutoScalingGroup
Properties:
AutoScalingGroupName: !Sub "${ClusterName}-asg"
LaunchConfigurationName: !Ref InstanceCapacity
MinSize: !Ref DesiredCapacity
MaxSize: !Ref MaxCapacity
DesiredCapacity: !Ref DesiredCapacity
VPCZoneIdentifier: !Ref SubnetIds
Tags:
- Key: Name
Value: !Sub "${ClusterName}-instance"
PropagateAtLaunch: true
- Key: Environment
Value: !Ref Environment
PropagateAtLaunch: true
Outputs:
ClusterName:
Description: Name of the ECS cluster
Value: !Ref ECSCluster
Export:
Name: !Sub "${AWS::StackName}-ClusterName"
ClusterArn:
Description: ARN of the ECS cluster
Value: !GetAtt ECSCluster.Arn
Export:
Name: !Sub "${AWS::StackName}-ClusterArn"
```
## Parameters Best Practices
### AWS-Specific Parameter Types
```yaml
Parameters:
# AWS-specific types for validation
VpcId:
Type: AWS::EC2::VPC::Id
Description: VPC where ECS cluster will be created
SubnetIds:
Type: List<AWS::EC2::Subnet::Id>
Description: Subnets for ECS instances
SecurityGroupIds:
Type: List<AWS::EC2::SecurityGroup::Id>
Description: Security groups for ECS service
ClusterArn:
Type: AWS::ECS::Cluster::Arn
Description: ARN of existing ECS cluster
TaskDefinitionArn:
Type: AWS::ECS::TaskDefinition::Arn
Description: ARN of ECS task definition
ServiceArn:
Type: AWS::ECS::Service::Arn
Description: ARN of ECS service
LoadBalancerArn:
Type: AWS::ElasticLoadBalancingV2::LoadBalancer::Arn
Description: ARN of Application Load Balancer
TargetGroupArn:
Type: AWS::ElasticLoadBalancingV2::TargetGroup::Arn
Description: ARN of Target Group
```
### Parameter Constraints
```yaml
Parameters:
ContainerName:
Type: String
Default: web-app
Description: Container name
ConstraintDescription: Must be 1-256 characters, alphanumeric and hyphens
MinLength: 1
MaxLength: 256
AllowedPattern: "[a-zA-Z0-9-]+"
DesiredCount:
Type: Number
Default: 2
Description: Desired number of tasks
MinValue: 0
MaxValue: 1000
ConstraintDescription: Must be between 0 and 1000
Cpu:
Type: String
Default: 256
Description: CPU units for container
AllowedValues:
- 128
- 256
- 512
- 1024
- 2048
- 4096
ConstraintDescription: Must be a valid CPU value
Memory:
Type: Number
Default: 512
Description: Memory limit in MiB
MinValue: 4
MaxValue: 15000
ConstraintDescription: Must be between 4 and 15000 MiB
```
### SSM Parameter References
```yaml
Parameters:
ContainerImage:
Type: AWS::SSM::Parameter::Value<String>
Default: /ecs/app/container-image
Description: Container image from SSM Parameter Store
DatabaseConnectionString:
Type: AWS::SSM::Parameter::Value<SecureString>
Default: /ecs/app/database/connection
Description: Database connection from SSM
```
## Outputs and Cross-Stack References
### Export/Import Patterns
```yaml
# Stack A - Network Stack
AWSTemplateFormatVersion: 2010-09-09
Description: Network infrastructure stack for ECS
Resources:
VPC:
Type: AWS::EC2::VPC
Properties:
CidrBlock: 10.0.0.0/16
EnableDnsHostnames: true
EnableDnsSupport: true
Tags:
- Key: Name
Value: !Sub "${AWS::StackName}-vpc"
PublicSubnets:
Type: AWS::EC2::Subnet
Properties:
VpcId: !Ref VPC
AvailabilityZone: !Select [0, !GetAZs ""]
CidrBlock: 10.0.1.0/24
MapPublicIpOnLaunch: true
PrivateSubnets:
Type: AWS::EC2::Subnet
Properties:
VpcId: !Ref VPC
AvailabilityZone: !Select [0, !GetAZs ""]
CidrBlock: 10.0.2.0/24
Outputs:
VpcId:
Description: VPC ID
Value: !Ref VPC
Export:
Name: !Sub "${AWS::StackName}-VpcId"
PublicSubnetIds:
Description: Public subnet IDs
Value: !Join [",", [!Ref PublicSubnet1, !Ref PublicSubnet2]]
Export:
Name: !Sub "${AWS::StackName}-PublicSubnetIds"
PrivateSubnetIds:
Description: Private subnet IDs
Value: !Join [",", [!Ref PrivateSubnet1, !Ref PrivateSubnet2]]
Export:
Name: !Sub "${AWS::StackName}-PrivateSubnetIds"
EcsSecurityGroupId:
Description: Security group ID for ECS
Value: !Ref EcsSecurityGroup
Export:
Name: !Sub "${AWS::StackName}-EcsSecurityGroupId"
```
```yaml
# Stack B - ECS Stack (imports from Stack A)
AWSTemplateFormatVersion: 2010-09-09
Description: ECS application stack
Parameters:
NetworkStackName:
Type: String
Default: network-stack
Description: Name of the network stack
Resources:
ECSService:
Type: AWS::ECS::Service
Properties:
ServiceName: !Sub "${AWS::StackName}-service"
Cluster: !ImportValue
!Sub "${NetworkStackName}-ClusterArn"
TaskDefinition: !Ref TaskDefinition
DesiredCount: 2
LaunchType: EC2
NetworkConfiguration:
AwsvpcConfiguration:
AssignPublicIp: DISABLED
SecurityGroups:
- !ImportValue
!Sub "${NetworkStackName}-EcsSecurityGroupId"
Subnets:
- !Select [0, !Split [",", !ImportValue !Sub "${NetworkStackName}-PrivateSubnetIds"]]
- !Select [1, !Split [",", !ImportValue !Sub "${NetworkStackName}-PrivateSubnetIds"]]
```
### Nested Stacks for Modularity
```yaml
AWSTemplateFormatVersion: 2010-09-09
Description: Main stack with nested ECS stacks
Resources:
# Nested stack for cluster
ClusterStack:
Type: AWS::CloudFormation::Stack
Properties:
TemplateURL: https://s3.amazonaws.com/bucket/ecs-cluster.yaml
TimeoutInMinutes: 30
Parameters:
ClusterName: !Ref ClusterName
Environment: !Ref Environment
# Nested stack for services
ServicesStack:
Type: AWS::CloudFormation::Stack
Properties:
TemplateURL: https://s3.amazonaws.com/bucket/ecs-services.yaml
TimeoutInMinutes: 30
Parameters:
ClusterArn: !GetAtt ClusterStack.Outputs.ClusterArn
Environment: !Ref Environment
# Nested stack for load balancer
LoadBalancerStack:
Type: AWS::CloudFormation::Stack
Properties:
TemplateURL: https://s3.amazonaws.com/bucket/ecs-alb.yaml
TimeoutInMinutes: 30
Parameters:
VpcId: !Ref VpcId
Subnets: !Ref Subnets
```
## ECS Task Definitions
### Basic Task Definition
```yaml
AWSTemplateFormatVersion: 2010-09-09
Description: ECS task definition
Resources:
TaskDefinition:
Type: AWS::ECS::TaskDefinition
Properties:
Family: web-app-task
Cpu: "512"
Memory: "1024"
NetworkMode: awsvpc
RequiresCompatibilities:
- EC2
- FARGATE
ExecutionRoleArn: !Ref TaskExecutionRole
TaskRoleArn: !Ref TaskRole
ContainerDefinitions:
- Name: web-app
Image: !Ref ContainerImage
Cpu: 256
Memory: 512
PortMappings:
- ContainerPort: !Ref ContainerPort
Protocol: tcp
Environment:
- Name: ENVIRONMENT
Value: !Ref Environment
- Name: LOG_LEVEL
Value: INFO
Secrets:
- Name: DATABASE_URL
ValueFrom: !Ref DatabaseSecretArn
LogConfiguration:
LogDriver: awslogs
Options:
awslogs-group: !Ref LogGroup
awslogs-region: !Ref AWS::Region
awslogs-stream-prefix: ecs
HealthCheck:
Command:
- CMD-SHELL
- curl -f http://localhost:8080/health || exit 1
Interval: 30
Timeout: 5
Retries: 3
StartPeriod: 60
TaskExecutionRole:
Type: AWS::IAM::Role
Properties:
RoleName: !Sub "${AWS::StackName}-task-execution-role"
AssumeRolePolicyDocument:
Version: "2012-10-17"
Statement:
- Effect: Allow
Principal:
Service: ecs-tasks.amazonaws.com
Action: sts:AssumeRole
ManagedPolicyArns:
- arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy
Policies:
- PolicyName: EcsSecretsPolicy
PolicyDocument:
Version: "2012-10-17"
Statement:
- Effect: Allow
Action:
- secretsmanager:GetSecretValue
Resource: !Ref DatabaseSecretArn
- PolicyName: EcsLogsPolicy
PolicyDocument:
Version: "2012-10-17"
Statement:
- Effect: Allow
Action:
- logs:CreateLogStream
- logs:PutLogEvents
Resource: !GetAtt LogGroup.Arn
LogGroup:
Type: AWS::Logs::LogGroup
Properties:
LogGroupName: !Sub "/ecs/${AWS::StackName}"
RetentionInDays: 30
```
### Multi-Container Task Definition
```yaml
Resources:
TaskDefinition:
Type: AWS::ECS::TaskDefinition
Properties:
Family: multi-container-task
Cpu: "1024"
Memory: "2048"
NetworkMode: awsvpc
ContainerDefinitions:
- Name: web
Image: nginx:latest
Cpu: 256
Memory: 512
PortMappings:
- ContainerPort: 80
Protocol: tcp
DependsOn:
- ContainerName: app
Condition: HEALTHY
Environment:
- Name: BACKEND_URL
Value: !Sub "http://localhost:${AppPort}"
LogConfiguration:
LogDriver: awslogs
Options:
awslogs-group: !Ref LogGroup
awslogs-region: !Ref AWS::Region
awslogs-stream-prefix: web
- Name: app
Image: !Ref AppImage
Cpu: 512
Memory: 1024
PortMappings:
- ContainerPort: !Ref AppPort
Protocol: tcp
Environment:
- Name: DATABASE_URL
ValueFrom: !Ref DatabaseSecretArn
- Name: REDIS_URL
ValueFrom: !Ref RedisSecretArn
LogConfiguration:
LogDriver: awslogs
Options:
awslogs-group: !Ref LogGroup
awslogs-region: !Ref AWS::Region
awslogs-stream-prefix: app
- Name: redis
Image: redis:alpine
Cpu: 128
Memory: 256
PortMappings:
- ContainerPort: 6379
Protocol: tcp
HealthCheck:
Command:
- CMD-SHELL
- redis-cli ping | grep -q PONG
Interval: 10
Timeout: 5
Retries: 3
```
## ECS Services
### Service with Application Load Balancer
```yaml
AWSTemplateFormatVersion: 2010-09-09
Description: ECS service with ALB
Resources:
# Application Load Balancer
ApplicationLoadBalancer:
Type: AWS::ElasticLoadBalancingV2::LoadBalancer
Properties:
Name: !Sub "${AWS::StackName}-alb"
Scheme: internet-facing
SecurityGroups:
- !Ref AlbSecurityGroup
Subnets: !Ref PublicSubnets
Type: application
AlbSecurityGroup:
Type: AWS::EC2::SecurityGroup
Properties:
GroupName: !Sub "${AWS::StackName}-alb-sg"
GroupDescription: Security group for ALB
VpcId: !Ref VpcId
SecurityGroupIngress:
# HTTP from anywhere - public-facing ALB requires this
# For production, use AWS WAF or restrict to known CIDR ranges
- IpProtocol: tcp
FromPort: 80
ToPort: 80
CidrIp: 0.0.0.0/0
Description: HTTP for public web traffic
# HTTPS from anywhere - public-facing ALB requires this
# For production, use AWS WAF or restrict to known CIDR ranges
- IpProtocol: tcp
FromPort: 443
ToPort: 443
CidrIp: 0.0.0.0/0
Description: HTTPS for secure public web traffic
# Target Groups
BlueTargetGroup:
Type: AWS::ElasticLoadBalancingV2::TargetGroup
Properties:
Name: !Sub "${AWS::StackName}-blue-tg"
Port: 80
Protocol: HTTP
VpcId: !Ref VpcId
Matcher:
HttpCode: 200-499
HealthCheckPath: /health
HealthCheckIntervalSeconds: 30
HealthCheckTimeoutSeconds: 5
HealthyThresholdCount: 2
UnhealthyThresholdCount: 3
TargetType: ip
IpAddressType: ipv4
GreenTargetGroup:
Type: AWS::ElasticLoadBalancingV2::TargetGroup
Properties:
Name: !Sub "${AWS::StackName}-green-tg"
Port: 80
Protocol: HTTP
VpcId: !Ref VpcId
Matcher:
HttpCode: 200-499
HealthCheckPath: /health
HealthCheckIntervalSeconds: 30
HealthCheckTimeoutSeconds: 5
HealthyThresholdCount: 2
UnhealthyThresholdCount: 3
TargetType: ip
IpAddressType: ipv4
# Listener
AlbListener:
Type: AWS::ElasticLoadBalancingV2::Listener
Properties:
DefaultActions:
- Type: forward
ForwardConfig:
TargetGroupStickinessConfig:
Enabled: false
DurationSeconds: 3600
TargetGroups:
- TargetGroupArn: !Ref BlueTargetGroup
Weight: 100
LoadBalancerArn: !Ref ApplicationLoadBalancer
Port: 80
Protocol: HTTP
# ECS Service
EcsService:
Type: AWS::ECS::Service
DependsOn: AlbListener
Properties:
ServiceName: !Sub "${AWS::StackName}-service"
Cluster: !Ref ECSClusterArn
TaskDefinition: !Ref TaskDefinitionArn
DesiredCount: 2
LaunchType: FARGATE
DeploymentConfiguration:
MaximumPercent: 200
MinimumHealthyPercent: 50
DeploymentCircuitBreaker:
Enable: true
Rollback: true
NetworkConfiguration:
AwsvpcConfiguration:
AssignPublicIp: DISABLED
SecurityGroups:
- !Ref EcsSecurityGroup
Subnets: !Ref PrivateSubnets
LoadBalancers:
- ContainerName: web
ContainerPort: 80
TargetGroupArn: !Ref BlueTargetGroup
PropagateTags: SERVICE
ServiceRegistries:
- RegistryArn: !GetAtt ServiceDiscoveryService.Arn
EcsSecurityGroup:
Type: AWS::EC2::SecurityGroup
Properties:
GroupName: !Sub "${AWS::StackName}-ecs-sg"
GroupDescription: Security group for ECS service
VpcId: !Ref VpcId
SecurityGroupIngress:
- IpProtocol: tcp
FromPort: 80
ToPort: 80
SourceSecurityGroupId: !Ref AlbSecurityGroup
# Service Discovery
ServiceDiscoveryService:
Type: AWS::ServiceDiscovery::Service
Properties:
Name: web-app
DnsConfig:
NamespaceId: !Ref ServiceDiscoveryNamespace
DnsRecords:
- Type: A
TTL: 60
HealthCheckConfig:
Type: HTTP
ResourcePath: /health
ServiceDiscoveryNamespace:
Type: AWS::ServiceDiscovery::PrivateDnsNamespace
Properties:
Name: !Sub "${Environment}.internal"
VpcId: !Ref VpcId
```
## Auto Scaling
### Service Auto Scaling
```yaml
AWSTemplateFormatVersion: 2010-09-09
Description: ECS service with auto scaling
Resources:
# Scalable Target
ScalableTarget:
Type: AWS::ApplicationAutoScaling::ScalableTarget
Properties:
MaxCapacity: 10
MinCapacity: 2
ResourceId: !Sub "service/${ECSClusterName}/${ECSServiceName}"
RoleARN: !Ref AutoScalingRoleArn
ScalableDimension: ecs:service:DesiredCount
ServiceNamespace: ecs
# Scaling Policy - CPU Utilization
CpuScalingPolicy:
Type: AWS::ApplicationAutoScaling::ScalingPolicy
Properties:
PolicyName: !Sub "${AWS::StackName}-cpu-scaling"
PolicyType: TargetTrackingScaling
ScalingTargetId: !Ref ScalableTarget
TargetTrackingScalingPolicyConfiguration:
PredefinedMetricSpecification:
PredefinedMetricType: ECSServiceAverageCPUUtilization
TargetValue: 70
ScaleInCooldown: 300
ScaleOutCooldown: 60
# Scaling Policy - Memory Utilization
MemoryScalingPolicy:
Type: AWS::ApplicationAutoScaling::ScalingPolicy
Properties:
PolicyName: !Sub "${AWS::StackName}-memory-scaling"
PolicyType: TargetTrackingScaling
ScalingTargetId: !Ref ScalableTarget
TargetTrackingScalingPolicyConfiguration:
PredefinedMetricSpecification:
PredefinedMetricType: ECSServiceAverageMemoryUtilization
TargetValue: 80
ScaleInCooldown: 300
ScaleOutCooldown: 60
# Step Scaling Policy
RequestCountScalingPolicy:
Type: AWS::ApplicationAutoScaling::ScalingPolicy
Properties:
PolicyName: !Sub "${AWS::StackName}-request-scaling"
PolicyType: StepScaling
ScalingTargetId: !Ref ScalableTarget
StepScalingPolicyConfiguration:
AdjustmentType: ChangeInCapacity
Cooldown: 60
MetricAggregationType: Average
StepAdjustments:
- MetricIntervalLowerBound: 0
ScalingAdjustment: 1
- MetricIntervalLowerBound: 1000
ScalingAdjustment: 2
- MetricIntervalLowerBound: 5000
ScalingAdjustment: 4
# CloudWatch Alarm
HighCpuAlarm:
Type: AWS::CloudWatch::Alarm
Properties:
AlarmName: !Sub "${AWS::StackName}-high-cpu"
AlarmDescription: CPU utilization above threshold
MetricName: CPUUtilization
Namespace: AWS/ECS
Dimensions:
- Name: ClusterName
Value: !Ref ECSClusterName
- Name: ServiceName
Value: !Ref ECSServiceName
Statistic: Average
Period: 60
EvaluationPeriods: 2
Threshold: 80
ComparisonOperator: GreaterThanThreshold
AlarmActions:
- !Ref RequestCountScalingPolicy
AutoScalingRoleArn:
Type: AWS::IAM::Role
Properties:
RoleName: !Sub "${AWS::StackName}-autoscaling-role"
AssumeRolePolicyDocument:
Version: "2012-10-17"
Statement:
- Effect: Allow
Principal:
Service: application-autoscaling.amazonaws.com
Action: sts:AssumeRole
ManagedPolicyArns:
- arn:aws:iam::aws:policy/service-role/AmazonEC2ContainerServiceAutoscaleRole
```
## Blue/Green Deployments with CodeDeploy
```yaml
AWSTemplateFormatVersion: 2010-09-09
Transform: AWS::CodeDeployBlueGreen
Description: ECS blue/green deployment with CodeDeploy
Parameters:
Environment:
Type: String
Default: production
AllowedValues:
- staging
- production
Resources:
# ECS Cluster
ECSCluster:
Type: AWS::ECS::Cluster
Properties:
ClusterName: !Sub "${AWS::StackName}-cluster"
# Task Definition (Blue)
TaskDefinitionBlue:
Type: AWS::ECS::TaskDefinition
Properties:
Family: blue-task
Cpu: "512"
Memory: "1024"
NetworkMode: awsvpc
RequiresCompatibilities:
- FARGATE
ContainerDefinitions:
- Name: web
Image: !Ref BlueImage
Cpu: 256
Memory: 512
PortMappings:
- ContainerPort: 80
LogConfiguration:
LogDriver: awslogs
Options:
awslogs-group: !Ref LogGroup
awslogs-region: !Ref AWS::Region
awslogs-stream-prefix: blue
# Task Definition (Green)
TaskDefinitionGreen:
Type: AWS::ECS::TaskDefinition
Properties:
Family: green-task
Cpu: "512"
Memory: "1024"
NetworkMode: awsvpc
RequiresCompatibilities:
- FARGATE
ContainerDefinitions:
- Name: web
Image: !Ref GreenImage
Cpu: 256
Memory: 512
PortMappings:
- ContainerPort: 80
LogConfiguration:
LogDriver: awslogs
Options:
awslogs-group: !Ref LogGroup
awslogs-region: !Ref AWS::Region
awslogs-stream-prefix: green
# Task Set (Blue)
TaskSetBlue:
Type: AWS::ECS::TaskSet
Properties:
Cluster: !Ref ECSCluster
Service: !Ref ECSService
TaskDefinition: !Ref TaskDefinitionBlue
Scale:
Unit: PERCENT
Value: 100
# Task Set (Green)
TaskSetGreen:
Type: AWS::ECS::TaskSet
Properties:
Cluster: !Ref ECSCluster
Service: !Ref ECSService
TaskDefinition: !Ref TaskDefinitionGreen
Scale:
Unit: PERCENT
Value: 0
# ECS Service
ECSService:
Type: AWS::ECS::Service
Properties:
ServiceName: !Sub "${AWS::StackName}-service"
Cluster: !Ref ECSCluster
TaskDefinition: !Ref TaskDefinitionBlue
DesiredCount: 2
LaunchType: FARGATE
LoadBalancers:
- ContainerName: web
ContainerPort: 80
TargetGroupArn: !Ref BlueTargetGroup
# Target Groups
BlueTargetGroup:
Type: AWS::ElasticLoadBalancingV2::TargetGroup
Properties:
Name: !Sub "${AWS::StackName}-blue-tg"
Port: 80
Protocol: HTTP
VpcId: !Ref VpcId
HealthCheckPath: /health
TargetType: ip
GreenTargetGroup:
Type: AWS::ElasticLoadBalancingV2::TargetGroup
Properties:
Name: !Sub "${AWS::StackName}-green-tg"
Port: 80
Protocol: HTTP
VpcId: !Ref VpcId
HealthCheckPath: /health
TargetType: ip
# Listener
ProductionListener:
Type: AWS::ElasticLoadBalancingV2::Listener
Properties:
DefaultActions:
- Type: forward
TargetGroupArn: !Ref BlueTargetGroup
LoadBalancerArn: !Ref ApplicationLoadBalancer
Port: 80
Protocol: HTTP
TestListener:
Type: AWS::ElasticLoadBalancingV2::Listener
Properties:
DefaultActions:
- Type: forward
TargetGroupArn: !Ref BlueTargetGroup
LoadBalancerArn: !Ref ApplicationLoadBalancer
Port: 8080
Protocol: HTTP
ApplicationLoadBalancer:
Type: AWS::ElasticLoadBalancingV2::LoadBalancer
Properties:
Name: !Sub "${AWS::StackName}-alb"
Scheme: internet-facing
Subnets: !Ref PublicSubnets
SecurityGroups:
- !Ref AlbSecurityGroup
AlbSecurityGroup:
Type: AWS::EC2::SecurityGroup
Properties:
GroupName: !Sub "${AWS::StackName}-alb-sg"
GroupDescription: ALB security group
VpcId: !Ref VpcId
SecurityGroupIngress:
# HTTP from anywhere - public-facing ALB requires this
- IpProtocol: tcp
FromPort: 80
ToPort: 80
CidrIp: 0.0.0.0/0
Description: HTTP for public web traffic
# Test traffic from internal network only
- IpProtocol: tcp
FromPort: 8080
ToPort: 8080
CidrIp: 10.0.0.0/16
Description: Test traffic from internal VPC
LogGroup:
Type: AWS::Logs::LogGroup
Properties:
LogGroupName: !Sub "/ecs/${AWS::StackName}"
RetentionInDays: 30
Hooks:
BlueGreenHook:
Type: AWS::CodeDeploy::BlueGreen
Properties:
TrafficRoutingConfig:
Type: TimeBasedCanary
TimeBasedCanary:
StepPercentage: 15
BakeTimeMins: 5
AdditionalOptions:
TerminationWaitTimeInMinutes: 5
ServiceRole: !Ref CodeDeployRoleArn
Applications:
- Target:
Type: AWS::ECS::Service
LogicalID: ECSService
ECSAttributes:
TaskDefinitions:
- TaskDefinitionBlue
- TaskDefinitionGreen
TaskSets:
- TaskSetBlue
- TaskSetGreen
TrafficRouting:
ProdTrafficRoute:
Type: AWS::ElasticLoadBalancingV2::Listener
LogicalID: ProductionListener
TestTrafficRoute:
Type: AWS::ElasticLoadBalancingV2::Listener
LogicalID: TestListener
TargetGroups:
- BlueTargetGroup
- GreenTargetGroup
CodeDeployRoleArn:
Value: !Sub "arn:aws:iam::${AWS::AccountId}:role/service-role/AmazonCodeDeployRoleForECS"
```
## Conditions and Transforms
### Conditions for Environment-Specific Resources
```yaml
AWSTemplateFormatVersion: 2010-09-09
Description: ECS with conditional resources
Parameters:
Environment:
Type: String
Default: dev
AllowedValues:
- dev
- staging
- production
EnableServiceDiscovery:
Type: String
Default: true
AllowedValues:
- true
- false
EnableSpotInstances:
Type: String
Default: false
AllowedValues:
- true
- false
Conditions:
IsProduction: !Equals [!Ref Environment, production]
IsStaging: !Equals [!Ref Environment, staging]
UseServiceDiscovery: !Equals [!Ref EnableServiceDiscovery, true]
UseSpotInstances: !And
- !Equals [!Ref EnableSpotInstances, true]
- !Not [!Equals [!Ref Environment, production]]
Resources:
# Always created
ECSCluster:
Type: AWS::ECS::Cluster
Properties:
ClusterName: !Sub "${AWS::StackName}-${Environment}"
# Conditionally created service discovery
ServiceDiscoveryNamespace:
Type: AWS::ServiceDiscovery::PrivateDnsNamespace
Condition: UseServiceDiscovery
Properties:
Name: !Sub "${Environment}.internal"
VpcId: !Ref VpcId
# Conditionally created DLQ
DeadLetterQueue:
Type: AWS::SQS::Queue
Condition: IsProduction
Properties:
QueueName: !Sub "${AWS::StackName}-dlq"
```
### Transforms for Code Reuse
```yaml
AWSTemplateFormatVersion: 2010-09-09
Transform: AWS::Serverless-2016-10-31
Description: Using SAM Transform for ECS
Globals:
Function:
Timeout: 30
Runtime: python3.11
Tracing: Active
Resources:
TaskFunction:
Type: AWS::Serverless::Function
Properties:
FunctionName: !Sub "${AWS::StackName}-task-processor"
Handler: task_handler.handler
Policies:
- ECSFullAccess
Events:
TaskQueue:
Type: SQS
Properties:
Queue: !GetAtt TaskQueue.Arn
BatchSize: 10
TaskQueue:
Type: AWS::SQS::Queue
Properties:
QueueName: !Sub "${AWS::StackName}-tasks"
VisibilityTimeout: 360
```
## CloudFormation Best Practices
### Stack Policies
Stack Policies prevent accidental updates to critical infrastructure resources. Use them to protect ECS clusters, task definitions, and production services from unintended modifications.
```yaml
AWSTemplateFormatVersion: 2010-09-09
Description: ECS stack with protective policy
Resources:
ECSCluster:
Type: AWS::ECS::Cluster
Properties:
ClusterName: !Sub "${AWS::StackName}-cluster"
EcsService:
Type: AWS::ECS::Service
Properties:
ServiceName: !Sub "${AWS::StackName}-service"
Cluster: !Ref ECSCluster
TaskDefinition: !Ref TaskDefinition
DesiredCount: 2
TaskDefinition:
Type: AWS::ECS::TaskDefinition
Properties:
Family: web-app-task
Cpu: "512"
Memory: "1024"
NetworkMode: awsvpc
RequiresCompatibilities:
- FARGATE
ContainerDefinitions:
- Name: web
Image: !Ref ContainerImage
Cpu: 256
Memory: 512
PortMappings:
- ContainerPort: 80
# Stack Policy JSON (apply via AWS Console or CLI)
StackPolicy:
Statement:
# Allow all updates to task definitions (needed for deployments)
- Effect: Allow
Action: Update:Modify
Resource: "*"
# Prevent modifications to the ECS cluster
- Effect: Deny
Action:
- Update:Modify
- Update:Replace
- Delete
Resource: "*"
Condition:
StringEquals:
ResourceType:
- AWS::ECS::Cluster
# Prevent deletion of the ECS service in production
- Effect: Deny
Action: Delete
Resource: "*"
Condition:
StringEquals:
ResourceType:
- AWS::ECS::Service
StringEqualsIfExists:
Environment: production
```
Apply stack policy using CLI:
```bash
aws cloudformation set-stack-policy \
--stack-name my-ecs-stack \
--stack-policy-body file://stack-policy.json
```
### Termination Protection
Enable termination protection to prevent accidental deletion of production stacks. This is critical for ECS infrastructure that handles production workloads.
```bash
# Enable termination protection when creating a stack
aws cloudformation create-stack \
--stack-name production-ecs \
--template-body file://ecs-template.yaml \
--enable-termination-protection
# Enable termination protection on existing stack
aws cloudformation update-termination-protection \
--stack-name production-ecs \
--enable-termination-protection
# Disable termination protection (requires explicit confirmation)
aws cloudformation update-termination-protection \
--stack-name production-ecs \
--no-enable-termination-protection
```
In templates, add to resources that should be preserved:
```yaml
Resources:
ProductionECSService:
Type: AWS::ECS::Service
DeletionPolicy: Retain
UpdateReplacePolicy: Retain
Properties:
# ... service configuration
```
### Drift Detection
CloudFormation drift detection identifies when infrastructure has been modified outside of CloudFormation. Regular drift checks ensure your ECS infrastructure remains consistent with your templates.
```bash
# Detect drift on a stack
aws cloudformation detect-stack-drift \
--stack-name my-ecs-stack
# Get drift detection status
aws cloudformation describe-stack-drift-detection-status \
--stack-drift-detection-id <detection-id>
# Get stack resources with drift status
aws cloudformation describe-stack-resource-drifts \
--stack-name my-ecs-stack
# Check specific resource drift
aws cloudformation detect-stack-resource-drift \
--stack-name my-ecs-stack \
--logical-resource-id EcsService
```
Drift status values:
- **IN_SYNC**: Resource matches template
- **MODIFIED**: Resource has been changed outside CloudFormation
- **DELETED**: Resource exists in template but not in AWS
- **NOT_CHECKED**: Resource not included in drift detection
Automated drift detection schedule:
```yaml
# Use AWS Config rules for continuous drift monitoring
Resources:
CloudFormationStackDriftDetectionConfigRule:
Type: AWS::Config::ConfigRule
Properties:
ConfigRuleName: cf-drift-detection
Description: Detect CloudFormation stack drift
Source:
Owner: AWS
SourceIdentifier: CFN_STACK_DRIFT_DETECTION_CHECK
Scope:
ComplianceResourceTypes:
- AWS::CloudFormation::Stack
```
### Change Sets
Change Sets provide a preview of stack changes before execution. Always use change sets for production deployments to review and validate modifications.
```bash
# 1. Create a change set (dry-run)
aws cloudformation create-change-set \
--stack-name production-ecs \
--template-body file://ecs-template.yaml \
--change-set-name production-ecs-changeset \
--capabilities CAPABILITY_IAM \
--parameters ParameterKey=Environment,ParameterValue=production
# 2. Wait for change set creation
aws cloudformation wait change-set-create-complete \
--stack-name production-ecs \
--change-set-name production-ecs-changeset
# 3. View change set (review changes)
aws cloudformation describe-change-set \
--stack-name production-ecs \
--change-set-name production-ecs-changeset \
--output table
# 4. Execute change set (if changes are correct)
aws cloudformation execute-change-set \
--stack-name production-ecs \
--change-set-name production-ecs-changeset
# 5. Delete change set (if not executing)
aws cloudformation delete-change-set \
--stack-name production-ecs \
--change-set-name production-ecs-changeset
```
Change Set for nested stacks:
```bash
aws cloudformation create-change-set \
--stack-name parent-stack \
--template-body file://parent-template.yaml \
--change-set-name parent-changeset \
--nested-stack-resolution TEMPLATE
```
### Change Set Template Review
When reviewing change sets, examine:
1. **Resource modifications**: Changes to ECS services may cause service disruptions
2. **Deletions**: Ensure no critical resources are marked for deletion
3. **IAM changes**: New or modified roles require manual approval
4. **Parameter changes**: Verify parameter overrides are correct
5. **Replace operations**: Resources marked for replacement may cause downtime
## Security Best Practices
### Security
- Use IAM roles with minimum required permissions (Task Execution Role, Task Role)
- Encrypt container images with ECR
- Use Secrets Manager for sensitive data (passwords, API keys)
- Configure security groups with restrictive rules
- Use private subnets for ECS services
- Enable Container Insights for monitoring
- Implement task execution role with specific permissions
### Security Group Best Practices
For public-facing ALBs, HTTP/HTTPS from 0.0.0.0/0 is often necessary. However, enhance security with:
```yaml
AlbSecurityGroup:
Type: AWS::EC2::SecurityGroup
Properties:
GroupName: !Sub "${AWS::StackName}-alb-sg"
GroupDescription: ALB security group - restrict in production
VpcId: !Ref VpcId
SecurityGroupIngress:
# HTTP from anywhere - required for public ALB
- IpProtocol: tcp
FromPort: 80
ToPort: 80
CidrIp: 0.0.0.0/0
Description: HTTP for public web traffic
# HTTPS from anywhere - required for public ALB
- IpProtocol: tcp
FromPort: 443
ToPort: 443
CidrIp: 0.0.0.0/0
Description: HTTPS for secure public web traffic
# Security enhancements for production:
# 1. Use AWS WAF to filter malicious traffic
# 2. Place ALB behind CloudFront for additional protection
# 3. Use AWS Shield for DDoS protection
# 4. Implement rate limiting
# 5. Use AWS Network Firewall for advanced filtering
```
### Performance
- Choose CPU and memory based on container profiling
- Use right-sizing for EC2 instances or Fargate
- Implement auto scaling based on metrics (CPU, Memory, Request count)
- Optimize container startup with appropriate health checks
- Use task placement strategies for optimization
- Consider Spot instances for non-critical workloads
### Monitoring
- Enable Container Insights for detailed metrics
- Configure CloudWatch alarms for CPU, memory, and error rates
- Implement structured logging with awslogs driver
- Use X-Ray for distributed tracing
- Configure task-level monitoring
### Deployment
- Use blue/green deployments with CodeDeploy for production
- Implement deployment circuit breaker
- Use change sets before deployment
- Organize stacks by lifecycle and ownership
- Test task definitions locally before deployment
## Best Practices
### Container Configuration
- **Use Specific Image Tags**: Avoid `latest` tag; pin specific versions for reproducibility
- **Limit Container Resources**: Set memory and CPU limits to prevent runaway containers
- **Health Check Configuration**: Configure appropriate health check intervals and timeouts
- **Log Configuration**: Use awslogs driver with appropriate retention and log groups
### Task Definition Design
- **One Process Per Container**: Follow single responsibility principle per container
- **Use Sidecar Patterns**: Deploy log forwarders and monitoring as sidecar containers
- **Essential Containers**: Mark only one container as essential for task health
- **Task Roles**: Use IAM roles for tasks instead of passing credentials
### Service Configuration
- **Minimum Healthy Percent**: Set appropriate values for zero-downtime deployments
- **Maximum Percent**: Control how many extra tasks can run during updates
- **Placement Strategies**: Use spread strategies across AZs for high availability
- **Task Networking**: Use awsvpc network mode for ENI-level networking control
### Deployment Strategies
- **Blue/Green Deployments**: Use CodeDeploy for production blue/green deployments
- **Rolling Updates**: Configure appropriate deployment parameters for rolling updates
- **Task Definition Versions**: Always create new versions instead of updating existing
- **Environment Variables**: Use SSM Parameter Store or Secrets Manager for sensitive data
### Performance Optimization
- **Right-Size Resources**: Profile containers to determine optimal CPU/memory allocation
- **Fargate vs EC2**: Choose Fargate for simplified operations; EC2 for cost optimization at scale
- **Spot Instances**: Use Spot capacity providers for fault-tolerant workloads
- **Task Placement**: Use custom task placement strategies for optimization
### Security Hardening
- **Least Privilege IAM**: Grant minimum required permissions in task IAM roles
- **Network Isolation**: Use security groups to restrict inter-task communication
- **Secrets Management**: Never store secrets in environment variables or task definitions
- **Image Scanning**: Enable ECR image scanning and deploy only scanned images
- **Runtime Security**: Enable ECS Exec only when needed; remove access after debugging
## Constraints and Warnings
### Resource Limits
- **Task Definition Size**: Maximum task definition size is 1 KB (CloudFormation) for certain configurations
- **Container Limits**: Maximum 10 containers per task definition
- **Memory Limits**: Memory limits must be set for all containers in a task
- **CPU Limits**: CPU units must be set for all containers in Fargate
### Operational Constraints
- **Service Update Limits**: Services can only update to new task definitions; cannot modify existing
- **ENI Limits**: awsvpc mode requires ENIs; each task consumes ENIs from subnet pool
- **Task Start Time**: Cold starts can take 30-60 seconds depending on image size
- **Scaling Delays**: Auto scaling takes time to provision new tasks; not instant
### Security Constraints
- **IAM Role Limits**: Task IAM roles must be assumable by ecs-tasks.amazonaws.com
- **Network Mode**: awsvpc mode required for Fargate; cannot use bridge or host mode
- **Security Group Rules**: Security groups attached to tasks control all network traffic
- **Secrets Rotation**: Rotating secrets requires task redeployment to pick up new values
### Cost Considerations
- **Fargate Pricing**: Fargate charges per vCPU and memory per hour; minimum charges apply
- **Data Transfer**: Data transfer between tasks in different AZs incurs cross-AZ costs
- **ECR Storage**: Container images stored in ECR incur monthly storage costs
- **Monitoring Costs**: Container Insights and detailed monitoring add costs
### Deployment Constraints
- **Blue/Green Requirements**: Blue/green deployments require production and listener configuration
- **CodeDeploy Integration**: CodeDeploy requires appropriate IAM permissions and configuration
- **Rollback Limitations**: Rollbacks require manual intervention in some failure scenarios
- **Task Drift**: Manual changes to running tasks are not reflected in CloudFormation
### Availability Constraints
- **Fargate Spot**: Fargate Spot can interrupt tasks with 2-minute notice
- **Multi-AZ**: Services should be spread across multiple AZs for high availability
- **Health Check Failures**: Failing health checks can cause rapid task replacement
- **Service Discovery**: Service discovery has additional costs and latency
## Related Resources
- [ECS Documentation](https://docs.aws.amazon.com/ecs/)
- [ECS Task Definitions](https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task_definitions.html)
- [ECS Services](https://docs.aws.amazon.com/AmazonECS/latest/developerguide/ecs_services.html)
- [AWS CloudFormation User Guide](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/)
- [ECS Blue/Green Deployments](https://docs.aws.amazon.com/codedeploy/latest/userguide/deployment-steps-ecs.html)
- [CloudFormation Stack Policies](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/protect-stack-resources.html)
- [CloudFormation Drift Detection](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/detect-drift-stack.html)
- [CloudFormation Change Sets](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/using-cfn-updating-stacks-changesets.html)
## Additional Files
For complete details on resources and their properties, see:
- [REFERENCE.md](references/reference.md) - Detailed reference guide for all CloudFormation resources
- [EXAMPLES.md](references/examples.md) - Complete production-ready examples
This skill provides production-ready AWS CloudFormation patterns for provisioning ECS clusters, services, and task definitions. It focuses on container configuration, ALB integration, scaling, service discovery, and blue/green deployments with CodeDeploy. Use it to structure templates with Parameters, Mappings, Conditions, Outputs, cross-stack references, and nested stacks.
The skill supplies reusable CloudFormation constructs and example templates that define ECS clusters, EC2/Fargate task definitions, and services wired to Application Load Balancers and Cloud Map. It includes patterns for IAM roles, log configuration, auto scaling policies, target groups, ALB listeners, and CodeDeploy hooks for blue/green deployments. Templates show parameter validation, SSM references, export/import for cross-stack references, and nested stacks for modular deployments.
Can I use these templates for both Fargate and EC2 launch types?
Yes. The patterns include examples for Fargate and EC2 by adjusting RequiresCompatibilities, NetworkMode, and launch configuration resources.
How do I manage secrets and container images securely?
Reference container images and secrets from SSM Parameter Store or AWS Secrets Manager using AWS::SSM::Parameter::Value and Secrets Manager ARNs in task Definitions to avoid embedding secrets in templates.