How a Leaked AWS Key Burned $10K in 48 Hours (And Why Cost Explorer Missed It)
The first sign of a compromised AWS credential is almost never a security alert. It is a line item in your billing console that nobody routes to the security team.

This is a pattern, not a one-off. The details below are a composite drawn from real incidents; the mechanics are real.
A startup running a standard AWS setup: EC2 workloads in ap-southeast-1, S3 for storage, a handful of IAM users with programmatic access. Monthly bill: predictable, around $3,000.
Then a developer commits a config file to a public GitHub repository. The file contains an AWS access key. An automated scanner finds it within minutes. By the time the developer notices and revokes the key, 48 hours have passed.
The bill for those 48 hours: over $10,000.
What the Attacker Did
The compromised key had EC2 permissions. Broad ones, because it was originally created for a deployment script and nobody had scoped it down since.
Within the first hour, the attacker ran RunInstances calls in three regions the account had never used: us-east-1, eu-west-1, and ap-northeast-1. GPU instances. The kind used for cryptomining.
The activity was visible in CloudTrail immediately. But nobody was watching CloudTrail in real time. GuardDuty was enabled, but the findings were routed to a security email alias that nobody checked over the weekend.
The cost signal appeared first. By Saturday morning, the account had already spent $4,000 in EC2 charges across regions the team did not operate in. By Sunday evening, it was over $10,000.
Why Cost Explorer Missed It
Cost Explorer did not miss the cost. It showed every dollar. What it missed was the cause.
The EC2 charges appeared under the same service line as legitimate workloads. Without filtering by region and cross-referencing against deployment history, the spike looked like a scaling event or a runaway autoscaler. The kind of thing a FinOps review would catch on Monday morning.
By Monday morning, the damage was done.
The gap is not in the tooling. Cost Explorer, CloudTrail, and GuardDuty all had the data. The gap is in the workflow. Nobody had built a process that connected a cost spike in an unfamiliar region to a security investigation. The two signals lived in separate tools, reviewed by separate teams, on separate schedules.
The Signal That Should Have Triggered an Alert
Three signals were present within the first two hours of the incident. Any one of them, routed correctly, would have shortened the window significantly.
Signal 1: Regional spend anomaly. EC2 charges in us-east-1 from an account that runs exclusively in ap-southeast-1. AWS Cost Anomaly Detection, configured with a monitor scoped to EC2 and a percentage-based threshold, would have flagged this within hours.
Signal 2: CloudTrail RunInstances from an unfamiliar IP. The attacker’s IP was not in the account’s normal access range. A CloudWatch alarm on RunInstances calls from IPs outside a known CIDR range is a one-time setup that takes under 30 minutes.
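The core of that alarm is a single check: is the caller's source IP inside any of the ranges the team normally deploys from? A minimal sketch of that check, using Python's standard `ipaddress` module (the CIDR ranges shown are illustrative placeholders, not real infrastructure):

```python
import ipaddress

# Hypothetical allowlist: the CIDR ranges the team normally deploys from.
KNOWN_CIDRS = [
    ipaddress.ip_network(c)
    for c in ("203.0.113.0/24", "198.51.100.0/24")
]

def is_unfamiliar(source_ip: str) -> bool:
    """True if the caller's IP falls outside every known range."""
    addr = ipaddress.ip_address(source_ip)
    return not any(addr in net for net in KNOWN_CIDRS)
```

In AWS this logic lives in a CloudWatch metric filter or an EventBridge rule over CloudTrail events, but the predicate is the same: `RunInstances` plus an IP that fails this check is worth a page.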
Signal 3: GuardDuty finding. GuardDuty generated an UnauthorizedAccess:IAMUser/InstanceCredentialExfiltration finding within the first hour. It sat unread in an email alias until Monday.
The Verizon 2025 Data Breach Investigations Report found that stolen credentials were involved in 88% of basic web application attacks. Credential exposure is hard to eliminate completely. The 48-hour window is not.
What Shortens the Window
1. Percentage-based anomaly detection, not fixed-dollar alerts
A fixed-dollar alert set at $500 above baseline does not catch a $10,000 weekend incident fast enough. A percentage-based alert in AWS Cost Anomaly Detection that fires when EC2 spend increases 50% above the trailing 7-day average will catch it within hours, not days.
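The threshold logic is simple enough to state precisely. A sketch of the percentage-over-trailing-average check, assuming you have a list of daily spend figures (the function name and data shape are illustrative, not an AWS API):

```python
from statistics import mean

def spend_anomaly(daily_spend: list[float], threshold_pct: float = 50.0) -> bool:
    """Flag the most recent day's spend if it exceeds the trailing
    7-day average by more than threshold_pct percent.

    daily_spend is ordered oldest to newest; the last entry is today.
    """
    *history, today = daily_spend
    baseline = mean(history[-7:])
    return today > baseline * (1 + threshold_pct / 100)
```

A week of flat $100 days followed by a $400 cryptomining day trips the check; a $120 day does not. This is the shape of alert that catches a weekend incident, because it keys on rate of change rather than an absolute monthly figure.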
2. IAM keys scoped to the minimum required permissions
The key that was compromised had ec2:RunInstances across all regions. If it had been scoped to ap-southeast-1 only, the attacker could not have launched instances elsewhere. AWS IAM best practices recommend using condition keys to restrict API calls by region and resource. Most teams know this. Most teams do not do it for deployment keys because it adds friction.
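A region restriction is one condition block in the key's policy. A sketch of what that scoping could look like, built as a Python dict for readability (the Sid is invented; `aws:RequestedRegion` is the real global condition key):

```python
import json

# Deployment-key policy that allows ec2:RunInstances only when the
# requested region is ap-southeast-1. Every other region falls through
# to IAM's implicit deny, so the attacker's us-east-1 launches would
# have been rejected.
region_scoped_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "RunInstancesHomeRegionOnly",
            "Effect": "Allow",
            "Action": "ec2:RunInstances",
            "Resource": "*",
            "Condition": {
                "StringEquals": {"aws:RequestedRegion": "ap-southeast-1"}
            },
        }
    ],
}

print(json.dumps(region_scoped_policy, indent=2))
```

The friction argument cuts the other way here: a region condition adds one block to an existing policy and never needs to change unless the team expands regions.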
3. GuardDuty findings routed to an active channel
A GuardDuty finding sitting in an email alias is not a security control. Route high-severity findings to SNS, then to Slack or PagerDuty. The finding was generated. It just was not seen.
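In AWS this routing is typically an EventBridge rule matching GuardDuty findings above a severity cutoff, targeting an SNS topic. The filter itself reduces to a small predicate; the sketch below mirrors that rule locally (GuardDuty scores high-severity findings 7.0 and above; the field names follow the EventBridge event envelope):

```python
# Only findings at or above high severity should page anyone.
HIGH_SEVERITY = 7.0

def should_page(event: dict) -> bool:
    """Mirror of an EventBridge rule: GuardDuty source, severity >= 7."""
    return (
        event.get("source") == "aws.guardduty"
        and event.get("detail", {}).get("severity", 0) >= HIGH_SEVERITY
    )
```

The point of the cutoff is sustainability: a channel that pages on every low-severity finding gets muted, which recreates the unread-alias problem under a different name.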
4. git-secrets or equivalent pre-commit hooks
AWS git-secrets scans commits for AWS credentials before they are pushed. It does not prevent all leaks, but it catches the most common one: a developer committing a config file without realising it contains a key.
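git-secrets ships its own pattern set; the core of the common case is a regex over the staged content. A minimal stand-in that catches the most frequent leak, an AWS access key ID (key IDs start with a known four-letter prefix, AKIA for long-lived user keys, ASIA for temporary ones, followed by 16 uppercase alphanumerics; the example key below is AWS's documented placeholder):

```python
import re

# Matches access-key-shaped strings: AKIA/ASIA prefix + 16 chars.
ACCESS_KEY_RE = re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b")

def find_leaked_keys(text: str) -> list[str]:
    """Return any access-key-shaped strings in a staged diff or file."""
    return ACCESS_KEY_RE.findall(text)
```

Wired into a pre-commit hook, a non-empty result blocks the commit. It will not catch secrets that do not match a known shape, which is why it complements, rather than replaces, short key lifetimes and scoped permissions.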
The Deeper Problem
The $10,000 bill is the visible damage. The less visible damage is the 48-hour window during which an attacker had valid AWS credentials and broad EC2 permissions.
What else did they do during that window? Did they enumerate S3 buckets? Did they read any objects? Did they create IAM users or access keys for persistence? CloudTrail has the answers, but most teams only look at CloudTrail after they know there was an incident. By then, the attacker has had 48 hours to cover tracks.
The cost signal was the earliest indicator. It appeared before the security team was aware of the incident. That is the SecFinOps pattern: the financial signal and the security signal are the same event, but they are routed to different teams who never compare notes.
According to the AWS Security Blog guidance on programmatic access, long-lived IAM access keys are one of the highest-risk credential types in an AWS account. The recommendation is to use IAM roles wherever possible and rotate any long-lived keys on a short cycle. The deployment script that created this incident had been using the same key for 14 months.
What a Proper Response Looks Like
When a cost spike appears in an unfamiliar region, the first question is not “which team misconfigured something?” It is “does this correlate with any CloudTrail activity from an unfamiliar principal or IP?”
That question takes three minutes to answer if the workflow exists. It takes three days if it does not.
The workflow:
- Cost Anomaly Detection fires on the regional EC2 spike
- Alert routes to the on-call engineer via SNS
- Engineer pulls CloudTrail events for the same time window, filtered by region
- RunInstances calls from an unfamiliar IP surface immediately
- Key is revoked, instances are terminated, GuardDuty findings are reviewed for scope of access
The total window from first alert to key revocation: under two hours. Not 48.
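The triage step in that workflow is a filter, not an investigation. Once events are pulled (for example via CloudTrail's LookupEvents API, or an Athena query over the trail bucket), the question "is anything launching instances outside our home region?" reduces to a few lines. The field names below follow the CloudTrail record format; the home-region set is this scenario's:

```python
# Regions the team actually operates in.
HOME_REGIONS = {"ap-southeast-1"}

def suspicious_launches(events: list[dict]) -> list[dict]:
    """Keep RunInstances calls made outside the team's home regions."""
    return [
        e for e in events
        if e.get("eventName") == "RunInstances"
        and e.get("awsRegion") not in HOME_REGIONS
    ]
```

Run against the incident's time window, this surfaces the us-east-1, eu-west-1, and ap-northeast-1 launches immediately, along with the source IPs and the access key ID that made them, which is exactly what the on-call engineer needs to decide to revoke.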
Start your free CostObserver beta — read-only access, no credit card, connects in minutes.
CostObserver