Containers in the Cloud: ECS vs EKS
Why Containers in the Cloud
Locally you run docker compose up and everything works. In production you have to decide:
- How do you run containers across multiple servers?
- How do you scale as load grows?
- How do you deploy updates without downtime?
- How do you recover from failures?
Container orchestrators solve all of these problems.
┌─────────────────────────────────────────────────────────┐
│ Container Orchestration │
│ │
│ ┌─────────┐ ┌─────────────┐ ┌──────────────────┐ │
│ │ ECS │ │ EKS │ │ App Runner │ │
│ │ Fargate │ │ (Kubernetes)│ │ (Zero config) │ │
│ │ │ │ │ │ │ │
│ │ AWS- │ │ Standard │ │ Push image → │ │
│ │ native │ │ K8s API │ │ get URL │ │
│ │ simple │ │ portable │ │ │ │
│ └─────────┘ └─────────────┘ └──────────────────┘ │
│ │
│ Complexity: Low ◄──────────────────────────► High │
│ Control: Low ◄──────────────────────────► Full │
└─────────────────────────────────────────────────────────┘
Platform Comparison
| Criterion | ECS Fargate | EKS | App Runner |
|---|---|---|---|
| Setup complexity | Medium | High | Minimal |
| Server management | None | Depends (Fargate/EC2) | None |
| Kubernetes knowledge | Not required | Required | Not required |
| Auto-scaling | Target tracking | HPA/VPA/KEDA | Built-in |
| Networking | VPC, ALB, Service Discovery | K8s Services, Ingress | Automatic |
| CI/CD | CodePipeline, GitHub Actions | ArgoCD, Flux, Helm | Auto-deploy from ECR |
| Multi-cloud | No | Yes (K8s is portable) | No |
| Control plane cost | Free | ~$73/month | Free |
| Compute cost | vCPU + memory per task | vCPU + memory per node | vCPU + memory + requests |
| Best for | Most projects | Complex microservice meshes | Simple APIs, MVPs |
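The compute-cost rows can be made concrete with quick arithmetic. Below is a sketch using published us-east-1 on-demand Fargate rates; treat the per-hour prices as assumptions and check the current pricing page before budgeting:

```python
# Rough monthly cost of Fargate tasks (us-east-1 on-demand; rates are assumptions)
VCPU_PER_HOUR = 0.04048    # USD per vCPU-hour
GB_PER_HOUR = 0.004445     # USD per GB of memory per hour
HOURS_PER_MONTH = 730

def fargate_monthly(vcpu: float, memory_gb: float, tasks: int = 1) -> float:
    """Approximate monthly compute cost for `tasks` always-on Fargate tasks."""
    per_task = (vcpu * VCPU_PER_HOUR + memory_gb * GB_PER_HOUR) * HOURS_PER_MONTH
    return round(per_task * tasks, 2)

# A 1 vCPU / 2 GB task (the prod API sizing used later in this article):
print(fargate_monthly(1, 2))     # roughly $36/month per task
print(fargate_monthly(1, 2, 3))  # roughly $108/month for a desired count of 3
```

For comparison, three such tasks cost about as much as the EKS control plane fee alone, before any worker nodes.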
ECS Fargate in Detail
Core Concepts
┌──────────────────────────────────────────┐
│ ECS Cluster │
│ │
│ ┌────────────────────────────────────┐ │
│ │ Service: API │ │
│ │ Desired count: 3 │ │
│ │ ┌────────┐ ┌────────┐ ┌────────┐ │ │
│ │ │ Task 1 │ │ Task 2 │ │ Task 3 │ │ │
│ │ │ nginx │ │ nginx │ │ nginx │ │ │
│ │ │ php │ │ php │ │ php │ │ │
│ │ └────────┘ └────────┘ └────────┘ │ │
│ └────────────────────────────────────┘ │
│ │
│ ┌────────────────────────────────────┐ │
│ │ Service: Worker │ │
│ │ Desired count: 2 │ │
│ │ ┌────────┐ ┌────────┐ │ │
│ │ │ Task 1 │ │ Task 2 │ │ │
│ │ │ worker │ │ worker │ │ │
│ │ └────────┘ └────────┘ │ │
│ └────────────────────────────────────┘ │
└──────────────────────────────────────────┘
- Cluster -- a logical group of services
- Task Definition -- a container specification (image, CPU, memory, environment, ports)
- Task -- a running instance of a Task Definition (analogous to docker run)
- Service -- manages N tasks, maintains the desired count, and performs rolling updates
Task Definition for a PHP Application
# ECR Repository for Docker images
resource "aws_ecr_repository" "app" {
name = "${var.project}-api"
image_tag_mutability = "IMMUTABLE"
image_scanning_configuration {
scan_on_push = true
}
encryption_configuration {
encryption_type = "AES256"
}
}
# ECS Task Definition: PHP API (nginx + php-fpm sidecar)
resource "aws_ecs_task_definition" "api" {
family = "${var.project}-api"
requires_compatibilities = ["FARGATE"]
network_mode = "awsvpc"
cpu = var.environment == "prod" ? 1024 : 256
memory = var.environment == "prod" ? 2048 : 512
execution_role_arn = aws_iam_role.ecs_execution.arn
task_role_arn = aws_iam_role.ecs_task.arn
container_definitions = jsonencode([
{
name = "nginx"
image = "${aws_ecr_repository.app.repository_url}:nginx-${var.image_tag}"
essential = true
cpu = var.environment == "prod" ? 256 : 64
memory = var.environment == "prod" ? 512 : 128
portMappings = [{
containerPort = 80
protocol = "tcp"
}]
dependsOn = [{
containerName = "php"
condition = "HEALTHY"
}]
logConfiguration = {
logDriver = "awslogs"
options = {
"awslogs-group" = aws_cloudwatch_log_group.api.name
"awslogs-region" = var.region
"awslogs-stream-prefix" = "nginx"
}
}
},
{
name = "php"
image = "${aws_ecr_repository.app.repository_url}:php-${var.image_tag}"
essential = true
cpu = var.environment == "prod" ? 768 : 192
memory = var.environment == "prod" ? 1536 : 384
environment = [
{ name = "APP_ENV", value = var.environment },
{ name = "DATABASE_URL", value = "postgresql://${var.db_user}:${var.db_pass}@${aws_db_instance.main.address}:5432/${var.db_name}" }, # better: move credentials into the secrets block below
{ name = "REDIS_URL", value = "rediss://${aws_elasticache_replication_group.main.primary_endpoint_address}:6379" },
{ name = "MESSENGER_TRANSPORT_DSN", value = "sqs://sqs.${var.region}.amazonaws.com/${data.aws_caller_identity.current.account_id}/${var.project}-messages" },
]
secrets = [
{ name = "APP_SECRET", valueFrom = "${aws_secretsmanager_secret.app.arn}:APP_SECRET::" },
{ name = "JWT_PASSPHRASE", valueFrom = "${aws_secretsmanager_secret.app.arn}:JWT_PASSPHRASE::" },
]
healthCheck = {
command = ["CMD-SHELL", "php-fpm-healthcheck || exit 1"]
interval = 15
timeout = 5
retries = 3
startPeriod = 30
}
logConfiguration = {
logDriver = "awslogs"
options = {
"awslogs-group" = aws_cloudwatch_log_group.api.name
"awslogs-region" = var.region
"awslogs-stream-prefix" = "php"
}
}
}
])
tags = {
Name = "${var.project}-api"
Environment = var.environment
}
}
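The task definition above references a CloudWatch log group and two IAM roles that are defined elsewhere. A minimal sketch of the execution role and log group follows; the retention period and role name are assumptions, and the task role (omitted) needs whatever permissions the application itself uses (SQS, S3, etc.):

```hcl
# Log group the awslogs driver writes to
resource "aws_cloudwatch_log_group" "api" {
  name              = "/ecs/${var.project}-api"
  retention_in_days = 30
}

# Execution role: lets ECS pull images from ECR, write logs, and fetch secrets
resource "aws_iam_role" "ecs_execution" {
  name = "${var.project}-ecs-execution"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Action    = "sts:AssumeRole"
      Principal = { Service = "ecs-tasks.amazonaws.com" }
    }]
  })
}

resource "aws_iam_role_policy_attachment" "ecs_execution" {
  role       = aws_iam_role.ecs_execution.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy"
}
```

Because the containers pull secrets from Secrets Manager, the execution role also needs an inline policy allowing secretsmanager:GetSecretValue on that secret's ARN; the managed policy alone does not grant it.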
ECS Service with ALB and Auto-Scaling
# ECS Service
resource "aws_ecs_service" "api" {
name = "${var.project}-api"
cluster = aws_ecs_cluster.main.id
task_definition = aws_ecs_task_definition.api.arn
desired_count = var.environment == "prod" ? 3 : 1
launch_type = "FARGATE"
deployment_maximum_percent = 200
deployment_minimum_healthy_percent = 100
deployment_circuit_breaker {
enable = true
rollback = true
}
network_configuration {
subnets = aws_subnet.private[*].id
security_groups = [aws_security_group.app.id]
assign_public_ip = false
}
load_balancer {
target_group_arn = aws_lb_target_group.api.arn
container_name = "nginx"
container_port = 80
}
lifecycle {
ignore_changes = [desired_count] # Managed by auto-scaling
}
tags = {
Name = "${var.project}-api"
Environment = var.environment
}
}
# Auto-Scaling: target tracking on CPU
resource "aws_appautoscaling_target" "api" {
max_capacity = var.environment == "prod" ? 20 : 3
min_capacity = var.environment == "prod" ? 3 : 1
resource_id = "service/${aws_ecs_cluster.main.name}/${aws_ecs_service.api.name}"
scalable_dimension = "ecs:service:DesiredCount"
service_namespace = "ecs"
}
# Scale on CPU utilization
resource "aws_appautoscaling_policy" "api_cpu" {
name = "${var.project}-api-cpu-scaling"
policy_type = "TargetTrackingScaling"
resource_id = aws_appautoscaling_target.api.resource_id
scalable_dimension = aws_appautoscaling_target.api.scalable_dimension
service_namespace = aws_appautoscaling_target.api.service_namespace
target_tracking_scaling_policy_configuration {
predefined_metric_specification {
predefined_metric_type = "ECSServiceAverageCPUUtilization"
}
target_value = 70.0
scale_in_cooldown = 300
scale_out_cooldown = 60
}
}
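Target tracking adjusts the desired count roughly proportionally: desired_new = ceil(desired_current × metric_actual / metric_target), clamped to the scalable target's bounds. This sketch shows only the steady-state arithmetic and ignores alarm evaluation and cooldowns:

```python
import math

def target_tracking_desired(current: int, actual: float, target: float,
                            min_cap: int = 3, max_cap: int = 20) -> int:
    """Approximate desired count a target-tracking policy converges toward,
    clamped to the scalable target's min/max capacity."""
    desired = math.ceil(current * actual / target)
    return max(min_cap, min(max_cap, desired))

# 3 tasks at 90% average CPU against a 70% target -> scale out to 4
print(target_tracking_desired(3, 90, 70))
# 10 tasks at 20% CPU -> scale in toward the min_capacity floor of 3
print(target_tracking_desired(10, 20, 70))
```

The asymmetric cooldowns above (60 s out, 300 s in) exist precisely because scale-out should be fast and scale-in cautious.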
# Scale on request count per target
resource "aws_appautoscaling_policy" "api_requests" {
name = "${var.project}-api-request-scaling"
policy_type = "TargetTrackingScaling"
resource_id = aws_appautoscaling_target.api.resource_id
scalable_dimension = aws_appautoscaling_target.api.scalable_dimension
service_namespace = aws_appautoscaling_target.api.service_namespace
target_tracking_scaling_policy_configuration {
predefined_metric_specification {
predefined_metric_type = "ALBRequestCountPerTarget"
resource_label = "${aws_lb.main.arn_suffix}/${aws_lb_target_group.api.arn_suffix}"
}
target_value = 1000.0
scale_in_cooldown = 300
scale_out_cooldown = 60
}
}
Worker Service (for queues)
# Worker Task Definition (single container, no nginx)
resource "aws_ecs_task_definition" "worker" {
family = "${var.project}-worker"
requires_compatibilities = ["FARGATE"]
network_mode = "awsvpc"
cpu = 512
memory = 1024
execution_role_arn = aws_iam_role.ecs_execution.arn
task_role_arn = aws_iam_role.ecs_task.arn
container_definitions = jsonencode([{
name = "worker"
image = "${aws_ecr_repository.app.repository_url}:php-${var.image_tag}"
essential = true
command = ["php", "bin/console", "messenger:consume", "async", "--time-limit=3600", "--memory-limit=512M"]
environment = [
{ name = "APP_ENV", value = var.environment },
{ name = "DATABASE_URL", value = "postgresql://${var.db_user}:${var.db_pass}@${aws_db_instance.main.address}:5432/${var.db_name}" },
{ name = "MESSENGER_TRANSPORT_DSN", value = "sqs://sqs.${var.region}.amazonaws.com/${data.aws_caller_identity.current.account_id}/${var.project}-messages" },
]
logConfiguration = {
logDriver = "awslogs"
options = {
"awslogs-group" = aws_cloudwatch_log_group.worker.name
"awslogs-region" = var.region
"awslogs-stream-prefix" = "worker"
}
}
}])
}
# Worker service (scales based on SQS queue depth)
resource "aws_ecs_service" "worker" {
name = "${var.project}-worker"
cluster = aws_ecs_cluster.main.id
task_definition = aws_ecs_task_definition.worker.arn
desired_count = var.environment == "prod" ? 2 : 1
launch_type = "FARGATE"
network_configuration {
subnets = aws_subnet.private[*].id
security_groups = [aws_security_group.app.id]
assign_public_ip = false
}
lifecycle {
ignore_changes = [desired_count]
}
}
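The SQS scaling policy below refers to aws_appautoscaling_target.worker, which has to be registered first. A sketch mirroring the API target; the capacity bounds are assumptions:

```hcl
resource "aws_appautoscaling_target" "worker" {
  max_capacity       = var.environment == "prod" ? 10 : 2
  min_capacity       = var.environment == "prod" ? 2 : 1
  resource_id        = "service/${aws_ecs_cluster.main.name}/${aws_ecs_service.worker.name}"
  scalable_dimension = "ecs:service:DesiredCount"
  service_namespace  = "ecs"
}
```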
# Scale worker based on SQS queue depth
resource "aws_appautoscaling_policy" "worker_sqs" {
name = "${var.project}-worker-sqs-scaling"
policy_type = "TargetTrackingScaling"
resource_id = aws_appautoscaling_target.worker.resource_id
scalable_dimension = aws_appautoscaling_target.worker.scalable_dimension
service_namespace = aws_appautoscaling_target.worker.service_namespace
target_tracking_scaling_policy_configuration {
customized_metric_specification {
metric_name = "ApproximateNumberOfMessagesVisible"
namespace = "AWS/SQS"
statistic = "Average"
dimensions {
name = "QueueName"
value = "${var.project}-messages"
}
}
target_value = 10.0 # keep the queue at ~10 visible messages on average
scale_in_cooldown = 300
scale_out_cooldown = 60
}
}
CI/CD for ECS (GitHub Actions)
# .github/workflows/deploy.yml
name: Deploy to ECS
on:
push:
branches: [main]
env:
AWS_REGION: us-east-1
ECR_REPOSITORY: myapp-api
ECS_CLUSTER: myapp-prod
ECS_SERVICE: myapp-api
jobs:
deploy:
runs-on: ubuntu-latest
permissions:
id-token: write
contents: read
steps:
- uses: actions/checkout@v4
- name: Configure AWS credentials
uses: aws-actions/configure-aws-credentials@v4
with:
role-to-assume: arn:aws:iam::123456789:role/github-deploy
aws-region: ${{ env.AWS_REGION }}
- name: Login to ECR
id: ecr
uses: aws-actions/amazon-ecr-login@v2
- name: Build and push images
env:
REGISTRY: ${{ steps.ecr.outputs.registry }}
IMAGE_TAG: ${{ github.sha }}
run: |
# Build PHP image
docker build -f api/Dockerfile --target prod \
-t $REGISTRY/$ECR_REPOSITORY:php-$IMAGE_TAG api/
docker push $REGISTRY/$ECR_REPOSITORY:php-$IMAGE_TAG
# Build Nginx image
docker build -f api/docker/nginx/Dockerfile \
-t $REGISTRY/$ECR_REPOSITORY:nginx-$IMAGE_TAG api/
docker push $REGISTRY/$ECR_REPOSITORY:nginx-$IMAGE_TAG
- name: Update ECS task definition
id: task-def
uses: aws-actions/amazon-ecs-render-task-definition@v1
with:
task-definition: ecs/task-definition.json
container-name: php
image: ${{ steps.ecr.outputs.registry }}/${{ env.ECR_REPOSITORY }}:php-${{ github.sha }}
- name: Deploy to ECS
uses: aws-actions/amazon-ecs-deploy-task-definition@v2
with:
task-definition: ${{ steps.task-def.outputs.task-definition }}
service: ${{ env.ECS_SERVICE }}
cluster: ${{ env.ECS_CLUSTER }}
wait-for-service-stability: true
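Note that amazon-ecs-render-task-definition swaps a single container's image per invocation, and this task definition has two containers. The nginx image needs a second render step chained off the first; a sketch (the step id is an assumption):

```yaml
      - name: Update nginx image in task definition
        id: task-def-nginx
        uses: aws-actions/amazon-ecs-render-task-definition@v1
        with:
          # Chain off the previously rendered file so both images are updated
          task-definition: ${{ steps.task-def.outputs.task-definition }}
          container-name: nginx
          image: ${{ steps.ecr.outputs.registry }}/${{ env.ECR_REPOSITORY }}:nginx-${{ github.sha }}
```

The deploy step should then consume steps.task-def-nginx.outputs.task-definition instead of the first render's output.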
EKS: When You Need Kubernetes
When to Choose EKS
- The team already knows Kubernetes
- You need multi-cloud portability
- Complex network policies (NetworkPolicy)
- Custom operators (CRDs)
- Service mesh (Istio, Linkerd)
- Advanced deployments (canary releases with Flagger, progressive delivery)
Example: a Deployment for PHP
# k8s/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: api
labels:
app: api
spec:
replicas: 3
selector:
matchLabels:
app: api
template:
metadata:
labels:
app: api
spec:
containers:
- name: nginx
image: 123456789.dkr.ecr.us-east-1.amazonaws.com/myapp:nginx-latest
ports:
- containerPort: 80
resources:
requests:
cpu: "100m"
memory: "128Mi"
limits:
cpu: "250m"
memory: "256Mi"
- name: php
image: 123456789.dkr.ecr.us-east-1.amazonaws.com/myapp:php-latest
resources:
requests:
cpu: "250m"
memory: "512Mi"
limits:
cpu: "1000m"
memory: "1Gi"
env:
- name: APP_ENV
value: "prod"
- name: DATABASE_URL
valueFrom:
secretKeyRef:
name: app-secrets
key: database-url
readinessProbe:
httpGet:
path: /health
port: 80
initialDelaySeconds: 10
periodSeconds: 5
livenessProbe:
httpGet:
path: /health
port: 80
initialDelaySeconds: 30
periodSeconds: 10
---
# HPA for auto-scaling
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: api-hpa
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: api
minReplicas: 3
maxReplicas: 20
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 70
- type: Resource
resource:
name: memory
target:
type: Utilization
averageUtilization: 80
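The Deployment reads DATABASE_URL from a Secret named app-secrets, which must exist in the same namespace. A minimal sketch with a placeholder value; in practice, prefer syncing from AWS Secrets Manager via External Secrets Operator or the Secrets Store CSI driver rather than committing values:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: app-secrets
type: Opaque
stringData:
  database-url: "postgresql://user:CHANGE_ME@db-host:5432/app"  # placeholder only
```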
App Runner: For Simple Cases
# App Runner — simplest deployment
resource "aws_apprunner_service" "api" {
service_name = "${var.project}-api"
source_configuration {
image_repository {
image_identifier = "${aws_ecr_repository.app.repository_url}:latest" # auto-deploy on :latest needs MUTABLE tags; the ECR repo above is IMMUTABLE, so pin a versioned tag instead
image_repository_type = "ECR"
image_configuration {
port = "80"
runtime_environment_variables = {
APP_ENV = var.environment
DATABASE_URL = "postgresql://${var.db_user}:${var.db_pass}@${aws_db_instance.main.address}:5432/${var.db_name}"
}
}
}
auto_deployments_enabled = true
authentication_configuration {
access_role_arn = aws_iam_role.apprunner_ecr.arn
}
}
instance_configuration {
cpu = "1024"
memory = "2048"
}
auto_scaling_configuration_arn = aws_apprunner_auto_scaling_configuration_version.api.arn
network_configuration {
egress_configuration {
egress_type = "VPC"
vpc_connector_arn = aws_apprunner_vpc_connector.main.arn
}
}
tags = {
Name = "${var.project}-api"
Environment = var.environment
}
}
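The service references an auto-scaling configuration that is defined separately. A minimal sketch; the concurrency threshold and size bounds are assumptions:

```hcl
resource "aws_apprunner_auto_scaling_configuration_version" "api" {
  auto_scaling_configuration_name = "${var.project}-api"
  min_size        = 1
  max_size        = 10
  max_concurrency = 100 # concurrent requests per instance before scaling out
}
```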
Push an image → automatic deployment → you get a URL. No clusters, no task definitions, no load balancers.
Decision Matrix
Starting a new project?
└── Need a quick start or an MVP?
│ └── Yes → App Runner
│ └── No
│ └── Does the team already know Kubernetes?
│ └── Yes → EKS (portability + ecosystem)
│ └── No → ECS Fargate (default choice)
│
└── Migrating from Docker Compose?
└── <5 services, simple networking → ECS Fargate
└── >5 services, service mesh → EKS
└── 1 service, just need a URL → App Runner
Recommendation: start with ECS Fargate. It covers 80% of use cases. Migrate to EKS only when you need portability or advanced Kubernetes features.