CostObserver

Your ECS to EKS Migration Saved $103K. Did You Audit the Attack Surface It Created?

A junior infrastructure engineer recently documented an ECS to EKS migration that saved $103,500 in annual AWS costs. A 40% reduction. Zero downtime. Bin-packing, Spot instances, HPA, Cluster Autoscaler. The guide is thorough, honest about what breaks, and genuinely useful.

It also ends with WAF attachment as the last step. Almost an afterthought.

That ordering tells you everything about how most teams think about container migrations. Cost first. Security somewhere near the end. The problem is that every decision that reduced the bill also changed the security surface. Those two things happened simultaneously. Auditing one without the other is incomplete.

This post is not a critique of that migration. It is the security and cost context the guide did not cover.

The New IAM Surface Nobody Scoped Down

An EKS cluster requires at minimum two IAM roles: a cluster role for the control plane and a node role for the EC2 instances. The node role needs three managed policies attached: AmazonEKSWorkerNodePolicy, AmazonEKS_CNI_Policy, and AmazonEC2ContainerRegistryReadOnly.

Those are broad policies. AmazonEKSWorkerNodePolicy lets nodes describe EC2 resources across the account (instances, subnets, security groups, VPCs, volumes) and describe the EKS cluster. AmazonEC2ContainerRegistryReadOnly gives nodes read access to every ECR repository in the account, not just the ones the workload needs.

Most teams attach these policies and move on. The migration works. The pods run. The cost drops.

But the node role is now attached to every EC2 instance in every node group. If a pod on a Spot node is compromised and can reach the instance metadata service, the attacker can obtain the node’s instance profile credentials. Those credentials can describe EC2 resources, list and pull images from every ECR repository in the account, and authenticate to the EKS control plane as a node. The blast radius of a single compromised pod is larger than it was in ECS, where task roles were scoped per service.

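The fix is not complicated. IRSA (IAM Roles for Service Accounts) lets you attach IAM roles directly to Kubernetes service accounts rather than to the node. Each workload gets only the permissions it needs. The node role stays minimal. IRSA on its own does not stop a pod from calling the instance metadata service, so pair it with restricted metadata access (for example, requiring IMDSv2 with a hop limit of 1). With both in place, a compromised pod cannot use the node’s identity to enumerate the rest of the account.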

This is a one-time setup per workload. It does not affect cost. It significantly reduces what an attacker can reach from a single compromised pod.
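To make the scoping concrete, here is a minimal sketch of the trust policy an IRSA role typically carries. The account ID, OIDC issuer, namespace, and service account name are placeholders, not values from the migration guide, and the helper name irsa_trust_policy is hypothetical.

```python
import json


def irsa_trust_policy(account_id: str, oidc_provider: str,
                      namespace: str, service_account: str) -> dict:
    """Build an IAM trust policy that only one service account can assume.

    oidc_provider is the cluster's OIDC issuer without the https:// prefix.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {
                "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{oidc_provider}"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringEquals": {
                    # Pin the role to exactly one namespace and service account.
                    f"{oidc_provider}:sub":
                        f"system:serviceaccount:{namespace}:{service_account}",
                    f"{oidc_provider}:aud": "sts.amazonaws.com",
                }
            },
        }],
    }


if __name__ == "__main__":
    # Placeholder values for illustration only.
    policy = irsa_trust_policy(
        "111122223333",
        "oidc.eks.us-east-2.amazonaws.com/id/EXAMPLE",
        "payments",
        "payments-api",
    )
    print(json.dumps(policy, indent=2))
```

The sub condition is what keeps the role per-workload: a pod running under any other service account, even on the same node, fails the AssumeRoleWithWebIdentity check.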

Spot Instances and the Anomaly Detection Problem

Spot instances are one of the biggest cost levers in an EKS migration. They are often around 70% cheaper than On-Demand for fault-tolerant workloads, although the discount varies by instance type and Availability Zone. The savings are real and the Kubernetes interruption handling is genuinely good.

But Spot instances create constant node churn. Nodes are terminated and replaced on AWS’s schedule, not yours. Every replacement launches a new EC2 instance with a new instance ID, a new private IP, and a new set of network connections.

This matters for security baselines. Amazon GuardDuty builds behavioural baselines for your EC2 instances over time: unusual API calls, unexpected network destinations, anomalous DNS queries. When a node is replaced every few hours, GuardDuty has little stable history to compare against. Findings that would be flagged on a long-running instance can get lost in the noise of normal Spot replacement activity.

AWS Cost Anomaly Detection has the same problem. Your EC2 cost baseline now includes constant Spot replacement events. A compromised node launching additional instances looks similar to normal Cluster Autoscaler activity. The cost signal that would have been obvious on a static ECS cluster is now buried in legitimate autoscaling noise.

The mitigation is to scope your anomaly detection monitors more tightly after the migration. Create separate monitors for On-Demand and Spot spend, for example by monitoring a cost category or cost allocation tag that distinguishes the two. A spike in On-Demand EC2 costs in a cluster that runs primarily Spot is a signal worth investigating. A spike in Spot costs alone is more likely to be autoscaling. Separating the two gives you a cleaner signal.
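The split itself is easy to prototype against exported billing data. The sketch below assumes line items arrive as (day, usage type, cost) rows and that EC2 usage types contain "SpotUsage" or "BoxUsage". The sample rows, the 2x threshold, and the function names are all illustrative assumptions, not how AWS Cost Anomaly Detection works internally.

```python
from collections import defaultdict


def purchase_option(usage_type: str) -> str:
    """Classify an EC2 usage type string as spot, on_demand, or other."""
    if "SpotUsage" in usage_type:
        return "spot"
    if "BoxUsage" in usage_type:
        return "on_demand"
    return "other"


def daily_totals(line_items):
    """Sum cost per (day, purchase option) from (day, usage_type, cost) rows."""
    totals = defaultdict(float)
    for day, usage_type, cost in line_items:
        totals[(day, purchase_option(usage_type))] += cost
    return totals


def spikes(totals, option, threshold=2.0):
    """Return days where spend for `option` exceeds threshold x the mean of prior days."""
    days = sorted(d for d, o in totals if o == option)
    flagged = []
    for i, day in enumerate(days):
        if i == 0:
            continue  # no history yet for the first day
        baseline = sum(totals[(d, option)] for d in days[:i]) / i
        if baseline > 0 and totals[(day, option)] > threshold * baseline:
            flagged.append(day)
    return flagged


if __name__ == "__main__":
    # Hypothetical billing rows; ISO dates sort chronologically as strings.
    items = [
        ("2024-05-01", "USE2-SpotUsage:m5.large", 40.0),
        ("2024-05-01", "USE2-BoxUsage:m5.large", 10.0),
        ("2024-05-02", "USE2-SpotUsage:m5.large", 55.0),
        ("2024-05-02", "USE2-BoxUsage:m5.large", 11.0),
        ("2024-05-03", "USE2-SpotUsage:m5.large", 60.0),
        ("2024-05-03", "USE2-BoxUsage:m5.xlarge", 48.0),
    ]
    totals = daily_totals(items)
    print("on-demand spikes:", spikes(totals, "on_demand"))
    print("spot spikes:", spikes(totals, "spot"))
```

In the sample data, Spot spend drifts upward the way autoscaling would, while the On-Demand jump on the third day stands out only because it is measured against its own baseline.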

Bin-Packing Changes Your Blast Radius

Bin-packing is the core cost win in EKS. Instead of reserving EC2 capacity per service, multiple workloads share nodes with proper namespace isolation. The savings come from higher utilisation. Fewer nodes, more workloads per node.

The security implication is that namespace isolation in Kubernetes is not the same as network isolation between EC2 instances. Namespaces are a logical boundary. They control resource visibility and RBAC scope. They do not prevent a compromised container from attempting to reach the node’s metadata endpoint, probe other pods on the same node, or exploit a kernel vulnerability to escape the container boundary.

In an ECS setup that reserves capacity per service, each service runs on its own EC2 instances or Fargate tasks with its own network interface. The blast radius of a compromised container is bounded by the instance. In a bin-packed EKS cluster, multiple services share a node. A container escape on one workload potentially exposes every other workload on the same node.

This does not mean bin-packing is wrong. It means the security controls need to match the architecture. Kubernetes Network Policies restrict pod-to-pod communication at the network layer. Without them, every pod in the cluster can reach every other pod by default. The namespace boundary does not enforce network separation. Network Policies only take effect when the cluster's network plugin enforces them, so confirm that enforcement is enabled before relying on them.

Network Policies are free to implement. They do not affect cost. They are the control that makes bin-packing defensible from a security perspective.
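A common starting point is a default-deny policy per namespace, followed by narrow allow rules. The sketch below builds two such manifests as plain dictionaries and prints them as JSON, which kubectl apply accepts. The namespace name is a placeholder and the function names are hypothetical.

```python
import json


def default_deny(namespace: str) -> dict:
    """NetworkPolicy that selects every pod in the namespace and allows no traffic."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-all", "namespace": namespace},
        "spec": {
            "podSelector": {},  # empty selector = all pods in the namespace
            "policyTypes": ["Ingress", "Egress"],
        },
    }


def allow_same_namespace(namespace: str) -> dict:
    """NetworkPolicy that re-allows ingress only from pods in the same namespace."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "allow-same-namespace", "namespace": namespace},
        "spec": {
            "podSelector": {},
            "policyTypes": ["Ingress"],
            # A podSelector with no namespaceSelector matches pods
            # in the policy's own namespace only.
            "ingress": [{"from": [{"podSelector": {}}]}],
        },
    }


if __name__ == "__main__":
    # "payments" is a placeholder namespace.
    for manifest in (default_deny("payments"), allow_same_namespace("payments")):
        print(json.dumps(manifest, indent=2))
```

Note that denying all egress also blocks DNS lookups, so a real rollout usually adds an explicit egress rule for the cluster DNS service before the default-deny policy is applied.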

Secrets in Terraform State

The migration guide uses a clean pattern for secrets: SSM Parameter Store fetched by Terraform, written into Kubernetes Secrets, injected into pods via env_from. The flow is logical and works well operationally.

The problem is Terraform state. When Terraform reads SSM parameters and writes them into Kubernetes Secrets, those values appear in the Terraform state file in plaintext. If your state is stored in S3, the bucket policy and access controls on that bucket are now part of your secrets management security posture.

A misconfigured S3 bucket policy on your Terraform state bucket is a credential exposure risk. Every database password, API key, and service credential that flows through Terraform into Kubernetes is readable by anyone with access to the state file.

The AWS Well-Architected Framework Security Pillar recommends using the Secrets Store CSI Driver with the AWS Secrets and Configuration Provider (ASCP) to mount secrets directly into pods from Secrets Manager or SSM Parameter Store, without writing them into Terraform state or Kubernetes Secrets objects. The secrets never touch the state file. The blast radius of a state file exposure is significantly reduced.

This is a migration step that has no cost impact and a meaningful security impact. Most teams skip it because the Terraform approach works and the risk is not visible until something goes wrong.
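One way to make the risk visible is to audit the state file for the resource types that carry secret values. The sketch below assumes the version 4 JSON state layout (a top-level resources list whose instances hold attributes) and checks only three resource types. The type list and the function name are assumptions to adapt, and the script reports key names, never values.

```python
import json

# Resource types assumed to carry secret material in this pattern.
SENSITIVE_TYPES = {"kubernetes_secret", "kubernetes_secret_v1", "aws_ssm_parameter"}


def plaintext_secret_fields(state: dict) -> list:
    """List (resource address, attribute, key) for secret values stored in state."""
    findings = []
    for res in state.get("resources", []):
        if res.get("type") not in SENSITIVE_TYPES:
            continue
        prefix = "data." if res.get("mode") == "data" else ""
        address = f"{prefix}{res['type']}.{res['name']}"
        for inst in res.get("instances", []):
            attrs = inst.get("attributes") or {}
            # Kubernetes Secret resources keep their payload in a "data" map.
            for key in (attrs.get("data") or {}):
                findings.append((address, "data", key))
            # SSM parameter lookups keep the decrypted value in "value".
            if attrs.get("value"):
                findings.append((address, "value", None))
    return findings


if __name__ == "__main__":
    # Hypothetical state fragment; real state would come from the backend.
    sample = {
        "version": 4,
        "resources": [
            {"mode": "data", "type": "aws_ssm_parameter", "name": "db_password",
             "instances": [{"attributes": {"value": "not-a-real-password"}}]},
            {"mode": "managed", "type": "kubernetes_secret", "name": "app",
             "instances": [{"attributes": {"data": {"DB_PASSWORD": "not-a-real-password"}}}]},
        ],
    }
    for finding in plaintext_secret_fields(sample):
        print(finding)
```

To run it against real state, load the JSON produced by terraform state pull instead of the sample dictionary, and treat that output with the same care as the secrets themselves.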

The Cost Signal After Migration

After an ECS to EKS migration, your cost baseline looks completely different. New line items appear, starting with EKS cluster hours ($0.10/hour per cluster). Data transfer patterns change as pods communicate across nodes, and NAT Gateway costs shift as workloads move between subnets.
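The control plane fee is the easiest of these to pin down. A quick sketch of the arithmetic, using the $0.10/hour rate quoted above and the $103,500 saving from the guide; the cluster count is an assumption you should replace with your own.

```python
HOURS_PER_YEAR = 24 * 365


def control_plane_cost(clusters: int, hourly_rate: float = 0.10) -> float:
    """Annual EKS control plane cost in USD for a number of clusters."""
    return round(clusters * hourly_rate * HOURS_PER_YEAR, 2)


def share_of_saving(cost: float, annual_saving: float = 103_500.0) -> float:
    """Cost as a percentage of the annual saving reported in the guide."""
    return round(100 * cost / annual_saving, 2)


if __name__ == "__main__":
    # One cluster is an assumption; many teams also run staging and dev clusters.
    yearly = control_plane_cost(clusters=1)
    print(f"control plane: ${yearly:.2f}/year, "
          f"{share_of_saving(yearly)}% of the saving")
```

A single cluster comes to about $876 a year, under 1% of the reported saving, which is why the data transfer and NAT Gateway shifts usually deserve more of the reconciliation effort.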

Most teams spend the first month after migration reconciling the new cost structure. What they are not doing is establishing a new security baseline for the new architecture.

The two activities should happen together. When you build your post-migration cost allocation model, tag your node groups, namespaces, and workloads with the same security context tags your SecOps team uses, such as data-classification, compliance-scope, and security-posture. A cost anomaly on a node group running PCI-scoped workloads is a different investigation than a cost anomaly on a node group running internal tooling.
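Once the tags exist, they can drive triage directly. The sketch below routes a cost anomaly based on the tag keys named above. The tag values ("pci", "restricted", "confidential"), the queue names, and the CostAnomaly type are hypothetical and would need to match your own tagging scheme.

```python
from dataclasses import dataclass, field


@dataclass
class CostAnomaly:
    resource: str
    impact_usd: float
    tags: dict = field(default_factory=dict)


def triage(anomaly: CostAnomaly) -> str:
    """Pick an investigation queue from the security context tags on the resource."""
    if not anomaly.tags:
        # Untagged spend has unknown sensitivity, so it gets a human look.
        return "security-review"
    scope = anomaly.tags.get("compliance-scope", "").lower()
    classification = anomaly.tags.get("data-classification", "").lower()
    if scope == "pci" or classification in {"restricted", "confidential"}:
        return "security-incident"
    return "finops-review"


if __name__ == "__main__":
    # Hypothetical anomalies on two node groups.
    anomalies = [
        CostAnomaly("nodegroup/payments", 420.0,
                    {"compliance-scope": "pci", "data-classification": "restricted"}),
        CostAnomaly("nodegroup/internal-tools", 380.0,
                    {"compliance-scope": "none", "data-classification": "internal"}),
    ]
    for a in anomalies:
        print(a.resource, "->", triage(a))
```

The dollar impact is deliberately ignored in the routing: two anomalies of similar size end up in different queues purely because of what runs on the node group.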

Fact-check: Does this mean EKS is less secure than ECS?

No. EKS with proper controls can be more secure than ECS with default settings. The point is that the migration creates a new security surface that needs to be audited with the same rigour as the cost model. The $103K saving is real. The new IAM surface, the Spot baseline problem, the bin-packing blast radius, and the Terraform state exposure are also real. Measuring one without the other is how security debt accumulates quietly while the cost dashboard looks green.
