How to Set Up Cloud Cost Alerts Before Your AWS Bill Explodes — Nubex

You deployed a database dump to prod and forgot to terminate the instance afterward.

48 hours later: $8,000 in EC2 charges for a $200/month workload.

You catch it because your credit card alerts you to the charge. But you should have caught it because a cloud cost alert fired.

Most DevOps teams don't have real cloud cost alerts. They have the default AWS billing notifications, which tell you "Your bill this month will be ~$40K" (useless if you expected $30K).

Real alerts tell you today if something is wrong, not at the end of the month when it's too late.

Here's how to set this up properly.

The Problem With Default AWS Alerts

AWS CloudWatch billing alarms (the built-in thing) are:

Slow to update (4-6 hour lag)
Threshold-based, not anomaly-based (you have to know what "normal" is)
Not actionable (alert says "you spent $5K" — what do I do?)
Not integrated with your workflow (you get an email, then what?)

So most teams just don't set them up and find out about cost overruns at the end of the month.

What Good Cloud Cost Alerts Look Like

A real cloud cost alert system has three layers:

Layer 1: Budget Thresholds

If you spend more than X per day / per week / per month, alert immediately.

Example: "If daily spend exceeds $100 (2x normal), page the on-call engineer."

Layer 2: Anomaly Detection

If today's spend is unusual compared to the last 30 days, alert even if it's below budget.

Example: "If daily spend rises 50% above the 30-day average, investigate."

Layer 3: Service-Level Alerts

If a specific service (RDS, DynamoDB, EC2) has unusual cost, alert.

Example: "If RDS daily cost exceeds $50, page DBA."
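
To make the three layers concrete, here is a minimal sketch that runs all three checks over one day's numbers. The function name, the default limits (taken from the examples above), and the input shapes are illustrative assumptions, not part of any AWS or Nubex API.

```python
from statistics import mean


def check_layers(history, today_total, today_by_service,
                 daily_budget=100.0, anomaly_ratio=1.5, service_limits=None):
    """Return the alert messages triggered by today's spend.

    history: list of previous daily totals (most recent last).
    today_by_service: dict of service name -> today's cost.
    """
    # Hypothetical default: the RDS example limit from the text.
    service_limits = service_limits or {"RDS": 50.0}
    alerts = []

    # Layer 1: fixed budget threshold.
    if today_total > daily_budget:
        alerts.append(f"budget: daily spend ${today_total:,.2f} is over ${daily_budget:,.2f}")

    # Layer 2: anomaly relative to the trailing 30-day average.
    baseline = mean(history[-30:])
    if today_total > baseline * anomaly_ratio:
        alerts.append(f"anomaly: daily spend ${today_total:,.2f} vs. average ${baseline:,.2f}")

    # Layer 3: per-service limits.
    for service, limit in service_limits.items():
        cost = today_by_service.get(service, 0.0)
        if cost > limit:
            alerts.append(f"service: {service} cost ${cost:,.2f} is over ${limit:,.2f}")

    return alerts
```

Note that a day can trip the anomaly and service layers while staying under the budget threshold, which is why the layers are checked independently.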

Nubex provides all three. But let's start with what you can do right now in AWS + a spreadsheet.

Step 1: Set Up AWS Budget Alerts

In AWS Console, go to Billing → Budgets → Create Budget.

Create a budget for your total monthly spend:

1. Set the threshold: If you normally spend $30K/month, set budget to $35K (about a 17% buffer)

2. Set email alerts: Alert at 50%, 75%, 90%, 100%, 110% of budget

3. Recipients: Whoever owns the cloud spend for your team

This gives you early warning: an alert at 50% of budget leaves time to investigate before you hit 100%.
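
If you would rather script this than click through the console, the same budget can be sketched as a request for the boto3 `budgets` client's `create_budget` call. The helper name, budget name, account ID, and email address below are placeholders; check the field names against the current AWS Budgets API reference before relying on them.

```python
def build_budget_request(account_id, monthly_limit_usd, emails,
                         thresholds=(50, 75, 90, 100, 110)):
    """Build keyword arguments for budgets.create_budget (hypothetical helper)."""
    budget = {
        "BudgetName": "total-monthly-spend",  # placeholder name
        "BudgetLimit": {"Amount": str(monthly_limit_usd), "Unit": "USD"},
        "TimeUnit": "MONTHLY",
        "BudgetType": "COST",
    }
    # One notification per threshold, each emailing every recipient.
    notifications = [
        {
            "Notification": {
                "NotificationType": "ACTUAL",
                "ComparisonOperator": "GREATER_THAN",
                "Threshold": float(pct),
                "ThresholdType": "PERCENTAGE",
            },
            "Subscribers": [
                {"SubscriptionType": "EMAIL", "Address": email} for email in emails
            ],
        }
        for pct in thresholds
    ]
    return {
        "AccountId": account_id,
        "Budget": budget,
        "NotificationsWithSubscribers": notifications,
    }


request = build_budget_request("123456789012", 35000, ["owner@example.com"])
# With credentials configured, this would be sent as:
#   boto3.client("budgets").create_budget(**request)
```

Building the request as plain data first makes it easy to review or keep in version control before anything touches your account.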

But this only works if:

You know what "normal" is
You're paying attention to emails
You have time to investigate during the month

Most teams fail at one of these.

Step 2: Set Up Per-Service Alerts

Most cost spikes come from one or two services, typically RDS, DynamoDB, or EC2.

Set up a separate budget for your top 3 cost drivers:

For RDS:

Normal monthly cost: $2,000
Budget: $2,500
Alert at 75%

For EC2:

Normal monthly cost: $15,000
Budget: $17,500
Alert at 75%

For S3:

Normal monthly cost: $500
Budget: $750
Alert at 75%

When you get an alert that RDS is at 75% of its budget, you know to look at RDS specifically. That narrows the investigation from "what service cost $5K?" to "RDS has already spent $1,875 with part of the month still to go, against a normal $2,000 for the whole month. Why?"

Step 3: Set Up Anomaly Detection

AWS has a built-in anomaly detection feature. Go to Billing → Cost Anomaly Detection.

This monitors your daily spend and alerts you if it differs significantly from your normal spend pattern.

You set:

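Threshold: A dollar or percentage impact, for example 10% or 20% above expected spend
Frequency: Individual alerts as anomalies are detected, or a daily or weekly summary
Recipients: Email or an SNS topic, which can forward to Slack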

This catches weird stuff that budget alerts miss.

Example: You normally spend $1,000/day. Anomaly detection sees $1,500 on Wednesday. It alerts you, even though you're still under budget for the month.

You investigate Wednesday's spike, find out someone ran a big batch job. You adjust the schedule. Problem solved before it escalates.
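
For teams that manage this in code, the alert settings above map roughly onto the `AnomalySubscription` argument of the boto3 Cost Explorer client's `create_anomaly_subscription` call. The sketch below only builds that dictionary. The helper name, subscription name, and ARN are placeholders, and both the field names and the frequency rule reflect my reading of the API, so verify them against the current reference.

```python
def build_anomaly_subscription(monitor_arn, address, subscriber_type="EMAIL",
                               frequency="DAILY", min_impact_pct=20):
    """Build the AnomalySubscription dict (hypothetical helper)."""
    # Assumption: individual (IMMEDIATE) alerts are delivered through SNS,
    # while daily and weekly summaries go to email.
    if frequency == "IMMEDIATE" and subscriber_type != "SNS":
        raise ValueError("IMMEDIATE alerts need an SNS subscriber")
    return {
        "SubscriptionName": "daily-spend-anomalies",  # placeholder name
        "MonitorArnList": [monitor_arn],
        "Subscribers": [{"Type": subscriber_type, "Address": address}],
        "Frequency": frequency,
        # Only alert when the anomaly's impact is at least min_impact_pct percent.
        "ThresholdExpression": {
            "Dimensions": {
                "Key": "ANOMALY_TOTAL_IMPACT_PERCENTAGE",
                "MatchOptions": ["GREATER_THAN_OR_EQUAL"],
                "Values": [str(min_impact_pct)],
            }
        },
    }


subscription = build_anomaly_subscription(
    "arn:aws:ce::123456789012:anomalymonitor/example",  # placeholder ARN
    "owner@example.com",
)
# With credentials configured, this would be sent as:
#   boto3.client("ce").create_anomaly_subscription(AnomalySubscription=subscription)
```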

Step 4: Export Billing Data and Track Trends

Every week, export your billing data:

1. AWS Console → Cost Explorer

2. Export last 30 days of daily costs to CSV

3. Paste into a spreadsheet

4. Add a moving average line

This is your baseline. You can see:

What's "normal" spend
What times of month cost more
What services are trending up vs. down

When you get an alert, you compare it to this baseline. "RDS cost is $5K today instead of the normal $70/day. That's about 70x. Definitely investigate."
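
If the spreadsheet step gets tedious, the moving average is easy to compute in a few lines. This sketch assumes a simplified CSV with `date` and `cost` columns. A real Cost Explorer export is laid out differently, so treat the column names as placeholders.

```python
import csv
import io


def load_daily_costs(csv_text, date_col="date", cost_col="cost"):
    """Parse CSV text into a list of (date string, cost) tuples."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(row[date_col], float(row[cost_col])) for row in reader]


def moving_average(values, window=7):
    """Trailing moving average; early points average over the days available."""
    averages = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        averages.append(sum(chunk) / len(chunk))
    return averages


sample = "date,cost\n2024-01-01,100\n2024-01-02,200\n2024-01-03,600\n"
costs = [cost for _, cost in load_daily_costs(sample)]
print(moving_average(costs, window=2))  # prints [100.0, 150.0, 400.0]
```

A trailing window (rather than a centered one) is used so the most recent day always has a value to compare against.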

Step 5: Integrate Alerts Into Your Workflow

Emails get lost. Slack notifications get buried.

Integrate alerts into your incident workflow:

If an alert fires:

Create a ticket in Jira
Post in #incidents Slack channel
Page the on-call engineer
Run a cost query to identify the problem

Nubex can automate this. But you can also do it with an SNS topic and a simple AWS Lambda function.

Whenever an anomaly is detected:

Invoke Lambda
Lambda queries Cost Explorer for the past 24 hours
Lambda breaks down cost by service
Lambda posts to Slack with the breakdown
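
The query and breakdown steps can be sketched as two small functions around the boto3 Cost Explorer client's `get_cost_and_usage` call. The function names are hypothetical, and the response shape in the parser follows my understanding of the API, so confirm it against a real response.

```python
from datetime import date, timedelta


def build_cost_query(today):
    """Keyword arguments for ce.get_cost_and_usage covering the previous day."""
    start = today - timedelta(days=1)
    return {
        # The End date is exclusive, so this covers exactly one day.
        "TimePeriod": {"Start": start.isoformat(), "End": today.isoformat()},
        "Granularity": "DAILY",
        "Metrics": ["UnblendedCost"],
        "GroupBy": [{"Type": "DIMENSION", "Key": "SERVICE"}],
    }


def cost_by_service(response):
    """Flatten a get_cost_and_usage response into {service: cost}."""
    totals = {}
    for result in response["ResultsByTime"]:
        for group in result["Groups"]:
            service = group["Keys"][0]
            amount = float(group["Metrics"]["UnblendedCost"]["Amount"])
            totals[service] = totals.get(service, 0.0) + amount
    return totals


# Inside the Lambda handler, with credentials configured:
#   response = boto3.client("ce").get_cost_and_usage(**build_cost_query(date.today()))
#   breakdown = cost_by_service(response)
```

Keep in mind that Cost Explorer data itself can lag, so "the past 24 hours" may be incomplete when the Lambda runs.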

Example Slack message:

```
🚨 Cost Anomaly Detected

Daily spend: $1,500 (vs. normal $1,000)

Top cost drivers:
DynamoDB: +$300 (scan-heavy queries)
EC2: +$150 (extra instance started?)
RDS: +$50 (normal)

Investigation: Check DynamoDB queries and EC2 instances
```

Now the on-call engineer has actionable info immediately.
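
A message like that is mostly string formatting once you have a per-service breakdown for today and for a normal day. Here is a minimal sketch. The function name and the sample numbers are illustrative, and the parenthetical hints in the example message would still need a human or extra logic to fill in.

```python
def format_cost_alert(today_by_service, normal_by_service, top_n=3):
    """Build the alert text from today's and normal per-service costs."""
    today_total = sum(today_by_service.values())
    normal_total = sum(normal_by_service.values())
    # Change versus normal for each service seen today.
    deltas = {
        service: cost - normal_by_service.get(service, 0.0)
        for service, cost in today_by_service.items()
    }
    top = sorted(deltas.items(), key=lambda item: item[1], reverse=True)[:top_n]

    lines = [
        "Cost Anomaly Detected",
        f"Daily spend: ${today_total:,.0f} (vs. normal ${normal_total:,.0f})",
        "Top cost drivers:",
    ]
    for service, delta in top:
        sign = "+" if delta >= 0 else "-"
        lines.append(f"{service}: {sign}${abs(delta):,.0f}")
    return "\n".join(lines)


normal = {"DynamoDB": 200.0, "EC2": 500.0, "RDS": 300.0}
today = {"DynamoDB": 500.0, "EC2": 650.0, "RDS": 350.0}
print(format_cost_alert(today, normal))
```

To post it, send the text as the `text` field of a JSON payload to a Slack incoming webhook URL.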

Step 6: Create an Escalation Policy

Not all cost spikes are emergencies.

Create a policy:

| Daily Spend | Action |
|-------------|--------|
| < $1,200 (normal) | No alert |
| $1,200-1,500 | Warning in Slack |
| $1,500-2,000 | Page on-call, post in #incidents |
| $2,000-3,000 | Page on-call + team lead |
| > $3,000 | Page everyone, call CTO |

This prevents alert fatigue. Not every anomaly needs everyone paged.
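
A policy like this is simple to encode so the alerting Lambda can pick the action automatically. The sketch below uses the example tiers from the table. Where the table's ranges touch (for example, exactly $1,500), it assumes the higher tier applies, except that the top tier only starts above $3,000.

```python
def escalation_action(daily_spend):
    """Map a daily spend figure to the action from the escalation table."""
    if daily_spend > 3000:
        return "Page everyone, call CTO"
    if daily_spend >= 2000:
        return "Page on-call + team lead"
    if daily_spend >= 1500:
        return "Page on-call, post in #incidents"
    if daily_spend >= 1200:
        return "Warning in Slack"
    return "No alert"


for spend in (1000, 1350, 1800, 2500, 3500):
    print(f"${spend:,}: {escalation_action(spend)}")
```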

Specific AWS Services: What to Watch

EC2

Alert if daily spend exceeds 2x normal
Most common cause: forgot to terminate instance
Check: running instances list, see anything new?

RDS

Alert if daily spend exceeds 1.5x normal
Most common cause: query performance degradation (scanning whole table)
Check: CloudWatch metrics, DB query logs

DynamoDB

Alert if daily spend exceeds 2x normal
Most common cause: burst traffic or inefficient query
Check: consumed capacity metrics

Data Transfer / S3

Alert if daily spend exceeds 1.5x normal
Most common cause: running data pipeline without limits
Check: S3 access logs, data pipeline status

Lambda

Alert if daily invocations exceed baseline
Most common cause: runaway cron job or stuck loop
Check: CloudWatch Logs
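
The cost-based rules above can be kept in one lookup table and checked together. In this sketch the multipliers come from the list above, the S3 entry stands in for "Data Transfer / S3", and Lambda is left out because its rule is based on invocations rather than spend. The names are hypothetical.

```python
# Alert when today's cost exceeds normal daily cost times this multiplier.
SERVICE_MULTIPLIERS = {"EC2": 2.0, "RDS": 1.5, "DynamoDB": 2.0, "S3": 1.5}


def services_over_threshold(today, normal, multipliers=None):
    """Return the services whose cost today exceeds their multiplier of normal."""
    multipliers = multipliers or SERVICE_MULTIPLIERS
    flagged = []
    for service, multiplier in multipliers.items():
        baseline = normal.get(service)
        # Skip services with no known baseline rather than guessing one.
        if baseline and today.get(service, 0.0) > baseline * multiplier:
            flagged.append(service)
    return flagged


normal_daily = {"EC2": 500.0, "RDS": 70.0, "DynamoDB": 40.0, "S3": 20.0}
today_daily = {"EC2": 1100.0, "RDS": 90.0, "DynamoDB": 45.0, "S3": 35.0}
print(services_over_threshold(today_daily, normal_daily))  # prints ['EC2', 'S3']
```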

Real-World Setup: The 15-Minute Version

If you want to get this done right now:

1. 5 min: Create AWS Budget for total monthly spend, set alert at 90%

2. 5 min: Enable AWS Cost Anomaly Detection, set at a 20% threshold

3. 5 min: Tell your team where to find cost alerts and who investigates

This takes 15 minutes and catches 80% of cost problems before they blow up.

The other 20% requires per-service budgets and workflow integration, which takes more time but gives you better visibility.

The Payoff

Every dollar you don't waste is a dollar saved.

A well-tuned alert system catches cost problems in hours instead of weeks.

Cloud spend wasted when you catch the problem within hours: roughly $0

DevOps time spent investigating: $200+

Catching problems fast saves money and stress.

Get full cost visibility and smart alerts with Nubex →

---

You've been surprised by your AWS bill. Let's not do that again. Set up alerts this week.