Cost-Optimized Compute

RI vs Savings Plans: the apply order, and why it matters

Three levers cut compute spend: commit through Reserved Instances (RIs) and Savings Plans, both pre-purchased discounts on steady usage; run interruption-tolerant work on Spot; and right-size with Compute Optimizer. The whole game is matching a workload pattern to its cheapest lever. Start with commitments: AWS billing applies discounts in a fixed sequence each hour, and that order changes which mechanism is the right one to buy. The figure below traces this hourly apply order, from Reserved Instances down to On-Demand.

Apply order, every billing hour:

Reserved Instances are applied first, to any usage that matches the RI's instance type (or its family, for size-flexible regional RIs), region, AZ (if zonal), tenancy, and OS.
Savings Plans are applied next, to any remaining eligible usage, in the order that maximizes customer savings (highest-discount usage first).
On-Demand rates apply to whatever is left.

Why this matters for buying decisions:

If RIs already fully cover your baseline (e.g., 10 reserved m5.xlarge), a Compute Savings Plan overlapping that usage gains nothing. RIs consume it first, and the Savings Plan sits idle.

Buy in the inverse order: layer Savings Plans on un-RI-covered usage, then add RIs only when a specific instance type's utilization is reliably steady over 1-3 years.
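The apply order above can be sketched as a toy model of a single billing hour. This is an illustration only, not how the AWS billing engine is implemented: it assumes one instance type, and the function name, the rates, and the $/hr commitment are all hypothetical values chosen to keep the arithmetic exact.

```python
def apply_hour(running, ri_count, sp_commit, sp_rate, od_rate):
    """Toy model of one billing hour for a single instance type.

    running   -- instance-hours used this hour
    ri_count  -- matching Reserved Instances owned
    sp_commit -- Savings Plan commitment in $/hr (hypothetical)
    sp_rate   -- discounted Savings Plan $/instance-hour (hypothetical)
    od_rate   -- On-Demand $/instance-hour (hypothetical)
    """
    # 1. RIs consume matching usage first.
    ri_covered = min(running, ri_count)
    remaining = running - ri_covered

    # 2. The Savings Plan covers what is left, up to its commitment.
    sp_capacity = sp_commit / sp_rate
    sp_covered = min(remaining, sp_capacity)
    sp_idle = sp_commit - sp_covered * sp_rate  # committed dollars paid but unused

    # 3. On-Demand prices the rest.
    od_hours = remaining - sp_covered
    return {
        "ri_covered": ri_covered,
        "sp_covered": sp_covered,
        "sp_idle_dollars": sp_idle,
        "on_demand_cost": od_hours * od_rate,
    }


# Baseline fully covered by RIs: the whole $1.00/hr commitment sits idle.
overlap = apply_hour(running=10, ri_count=10, sp_commit=1.0, sp_rate=0.125, od_rate=0.25)

# Same commitment layered on usage the RIs do not reach: nothing is wasted.
layered = apply_hour(running=20, ri_count=10, sp_commit=1.0, sp_rate=0.125, od_rate=0.25)
```

In the first call the Savings Plan covers zero hours because the RIs have already consumed all ten; in the second it covers eight of the ten uncovered hours and On-Demand prices the last two.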

Practical pattern:

Have a steady baseline that's family-locked? Standard or Convertible RIs (Standard locks the family for the deepest discount, up to 72%; Convertible can be exchanged for another family, OS, or tenancy).
Have a steady baseline that spans EC2 + Fargate + Lambda? Compute Savings Plan (one commitment, three services).
Have a steady baseline that's EC2-only but instance-family-stable with size flexibility? EC2 Instance Savings Plan (up to 72%, family-locked but size-flexible; m5.large/xl/2xl all count toward the same $/hr commitment).

Further reading: How Savings Plans apply^[3].

Hourly discount apply order: RIs consume matching usage first, Savings Plans cover the eligible remainder, On-Demand prices what's left.

Spot interruption handling and Spot Fleet allocation strategies

You run a batch-transcoding fleet whose jobs checkpoint and can restart on a fresh instance, and you want to stop paying On-Demand rates for interruptible work like this. Spot Instances offer up to 90% off On-Demand, but only if the application can absorb the 2-minute termination notice and the variable-capacity reality.

Interruption signals:

2-minute termination notice (written to the instance metadata service at http://169.254.169.254/latest/meta-data/spot/instance-action). Applications should drain in-flight requests, checkpoint state, and deregister from load balancers within this window.
Rebalance recommendation, issued earlier than the termination notice when Spot interruption probability becomes elevated. Acts as a soft warning so applications can proactively migrate work before the 2-minute hard notice arrives. The figure below traces this signal order, from the earlier rebalance recommendation to the 2-minute termination notice and final reclamation.
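A minimal sketch of checking both signals from inside an instance, using only the Python standard library. It assumes IMDSv2 (token-based) access and that the rebalance recommendation is exposed at the `events/recommendations/rebalance` metadata path; neither detail is stated above, so verify both against the EC2 documentation. The function names and the injectable `fetch` parameter are hypothetical, added so the logic can be exercised off-instance.

```python
import json
import urllib.error
import urllib.request

IMDS = "http://169.254.169.254/latest"


def imds_get(path, timeout=1.0):
    """Fetch a metadata path via IMDSv2; return the body, or None on 404."""
    token_req = urllib.request.Request(
        f"{IMDS}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": "60"},
    )
    with urllib.request.urlopen(token_req, timeout=timeout) as resp:
        token = resp.read().decode()

    req = urllib.request.Request(
        f"{IMDS}/meta-data/{path}",
        headers={"X-aws-ec2-metadata-token": token},
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.read().decode()
    except urllib.error.HTTPError as err:
        if err.code == 404:  # no signal has been issued yet
            return None
        raise


def spot_signal(fetch=imds_get):
    """Return (state, details): 'interrupt' outranks 'rebalance' outranks 'ok'."""
    action = fetch("spot/instance-action")
    if action is not None:
        return "interrupt", json.loads(action)  # hard notice: drain and checkpoint now
    rebalance = fetch("events/recommendations/rebalance")
    if rebalance is not None:
        return "rebalance", json.loads(rebalance)  # soft warning: start migrating work
    return "ok", None
```

A worker would typically call `spot_signal()` every few seconds and begin draining as soon as the state leaves `"ok"`; the polling interval is a design choice, not something the source prescribes.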

Spot Fleet allocation strategies:

price-capacity-optimized: a hybrid balancing cost and interruption risk; AWS's recommended strategy for most Spot workloads (the API/CLI default is still lowest-price, so override it) and the right starting point for long-running fleets.
capacity-optimized: launches instances from the pool with the lowest predicted interruption rate, when uptime stability matters more than raw $/hr.
capacity-optimized-prioritized: same, but respects an instance-type priority list. Useful when ML/HPC workloads have a preferred instance type for performance reasons.
lowest-price: launches from the cheapest pool. Best for short-lived workloads where interruption recovery is fast.
diversified: spreads across all configured pools. Maximizes resilience but may not pick the cheapest pool.

Pairing with Mixed Instances Policy: Auto Scaling groups can specify multiple instance types and a Spot/On-Demand split. OnDemandBaseCapacity defines a minimum guaranteed On-Demand floor; OnDemandPercentageAboveBaseCapacity controls the split for capacity above that floor.
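The two settings can be sketched as follows. The first function builds a dictionary in the shape Auto Scaling expects for a `MixedInstancesPolicy`; the second estimates the resulting On-Demand/Spot split. The helper names and the launch template name are hypothetical, and rounding the On-Demand share up at fractional boundaries is an assumption rather than documented behavior.

```python
import math


def mixed_instances_policy(template_name, instance_types, base, pct_above_base, spot_strategy):
    """Build a MixedInstancesPolicy dict for an Auto Scaling group."""
    return {
        "LaunchTemplate": {
            "LaunchTemplateSpecification": {
                "LaunchTemplateName": template_name,  # hypothetical template
                "Version": "$Latest",
            },
            "Overrides": [{"InstanceType": t} for t in instance_types],
        },
        "InstancesDistribution": {
            "OnDemandBaseCapacity": base,
            "OnDemandPercentageAboveBaseCapacity": pct_above_base,
            "SpotAllocationStrategy": spot_strategy,
        },
    }


def split_capacity(desired, base, pct_above_base):
    """Estimate (on_demand, spot) counts for a desired capacity."""
    floor = min(base, desired)                    # guaranteed On-Demand floor
    above = desired - floor                       # capacity above the floor
    od_above = math.ceil(above * pct_above_base / 100)  # rounding up is an assumption
    on_demand = floor + od_above
    return on_demand, desired - on_demand


policy = mixed_instances_policy(
    "web-tier", ["m5.xlarge", "m5a.xlarge"], base=4, pct_above_base=50,
    spot_strategy="price-capacity-optimized",
)
print(split_capacity(20, 4, 50))  # (12, 8): 4 floor + half of the other 16
print(split_capacity(20, 0, 0))   # (0, 20): everything on Spot
```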

Further reading: EC2 Spot Fleet allocation strategies^[10].

Spot interruption signal order: the rebalance recommendation arrives before the 2-minute termination notice. The notice is only the final drain window.

Compute Optimizer: when it generates recommendations, when it can't

You've already bought Savings Plans and shifted batch work to Spot, yet one m5.2xlarge has sat at 15% CPU all week: money leaking from an oversized instance that no pricing model can recover. Compute Optimizer analyzes utilization metrics to recommend right-sizing for EC2, Auto Scaling groups, EBS volumes, Lambda, ECS on Fargate, and RDS databases (MySQL and PostgreSQL), among others. Right-sizing is the lever that pays out under any pricing model.

Minimum data requirements (these matter for the exam):

EC2 / Auto Scaling: 30 hours of metrics within the past 14 days; the more history, the better the confidence rating.
Lambda: ≥50 invocations across at least 14 days. Functions invoked fewer than 50 times in that window receive no recommendation.
EBS volumes: 30 consecutive hours of metrics.
ECS on Fargate: 24 hours of metrics over the past 14 days.

If you see an exam question where Compute Optimizer 'cannot generate' or 'has insufficient data', it's almost always one of these thresholds.
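As a memory aid, the thresholds listed above can be written as a small lookup. This is only a restatement of those numbers in code; the function and resource labels are hypothetical and nothing here calls the Compute Optimizer API.

```python
# Minimum data per resource type, as listed above: (field checked, minimum value).
THRESHOLDS = {
    "ec2": ("metric_hours", 30),           # 30 hours within the past 14 days
    "auto_scaling": ("metric_hours", 30),
    "ebs": ("metric_hours", 30),           # 30 consecutive hours
    "fargate": ("metric_hours", 24),       # 24 hours over the past 14 days
    "lambda": ("invocations", 50),         # at least 50 invocations in 14 days
}


def has_enough_data(resource, metric_hours=0, invocations=0):
    """Return True if the resource clears the minimum-data threshold."""
    field, minimum = THRESHOLDS[resource]
    observed = metric_hours if field == "metric_hours" else invocations
    return observed >= minimum


print(has_enough_data("lambda", invocations=49))   # False: one invocation short
print(has_enough_data("fargate", metric_hours=24)) # True
```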

What it recommends:

Over-provisioned: downsize to a smaller instance/family or reduce memory allocation.
Under-provisioned: upsize, or switch to a higher-performance family.
Optimized: no change recommended.
Inferred workload type (EC2): Compute Optimizer detects whether a workload looks CPU-bound, memory-bound, etc., and may recommend a different family (e.g., m5 → c5 for CPU-heavy patterns, m5 → r5 for memory-heavy).

Enhanced infrastructure metrics: the paid opt-in that extends the lookback^[14] from the 14-day default to up to 93 days, so monthly or quarterly peaks feed the recommendation, not a quiet trough. Memory visibility is separate: the CloudWatch Agent supplies memory-pressure metrics that the default CPU-only signals miss.

Further reading: AWS Compute Optimizer requirements^[9].

Worked example: applying every lever to a 50-instance fleet

One fleet, every lever. This matching is exactly what the exam probes. The decision below pairs each workload class with its matched purchase lever. The figure below captures that same matching at a glance.

Scenario: A SaaS company runs 50 m5.xlarge EC2 instances On-Demand in us-east-1 to serve a stateless web tier. Monthly bill: ~$5,800. Workload characteristics: 30 instances are needed 24/7 (baseline traffic), 20 fluctuate with traffic spikes 2-3 hours per day, and a separate set of 10 batch workers process queued jobs nightly.

Apply every lever:

Baseline 30 instances (predictable, 24/7). Buy 30× 1-year No-Upfront Standard RIs for m5.xlarge in us-east-1. Discount: ~40% → saves ~$1,400/month.
Spiky 20 instances (variable). Auto Scaling group with Mixed Instances Policy: OnDemandBaseCapacity=0, OnDemandPercentageAboveBaseCapacity=0, allocation strategy capacity-optimized (stability-first; price-capacity-optimized also fits). Allow m5.xlarge, m5a.xlarge, m4.xlarge, m6i.xlarge as Spot pools. Discount: ~70% on the Spot portion → saves ~$800/month at typical 50% utilization.
Nightly batch workers (fault-tolerant by definition). Pure Spot Fleet with price-capacity-optimized strategy. Discount: ~80% → saves ~$300/month.
Run Compute Optimizer on the entire fleet after 14 days of metrics. Likely recommends downsizing some over-provisioned instances to m5.large. Estimated savings: another 15-20%.

Total monthly savings: ~$2,500-2,900, or roughly 43-50% off the original $5,800, with zero code changes.
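The arithmetic behind that total can be checked in a few lines. The sketch sums only the three dollar estimates quoted in the steps above; the right-sizing step is left out because it is quoted as a percentage, not a dollar figure. The dictionary labels are shorthand introduced here.

```python
monthly_bill = 5800  # original On-Demand bill from the scenario

# Per-lever monthly savings as quoted in the steps above.
savings = {
    "baseline on Standard RIs": 1400,
    "spiky tier on Spot": 800,
    "nightly batch on Spot": 300,
}

subtotal = sum(savings.values())
low_share = subtotal / monthly_bill   # three levers alone
high_share = 2900 / monthly_bill      # upper end of the quoted range

print(f"${subtotal:,} saved, {low_share:.0%} of the bill")  # $2,500 saved, 43% of the bill
print(f"upper bound: {high_share:.0%}")                     # upper bound: 50%
```

The three levers alone account for the low end of the range; the gap up to $2,900 is whatever right-sizing adds on top.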

Memorize the matching, not the percentages.

Match the workload to the lever: a 24/7 baseline buys Standard RIs, spiky demand runs Auto Scaling on Spot, fault-tolerant batch runs a Spot Fleet, then Compute Optimizer right-sizes.
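The matching can also be written as a short decision function. This is a simplified sketch of the rules stated in this section, not an official AWS decision procedure; the function name, parameters, and the order in which the checks run are choices made for the illustration.

```python
def pick_lever(steady, interruption_tolerant=False, services=("ec2",),
               family_stable=False, needs_size_flex=False):
    """Map workload traits to the purchase lever this section pairs them with."""
    if interruption_tolerant:
        return "Spot"                        # fault-tolerant batch, spiky stateless tiers
    if not steady:
        return "On-Demand"                   # unpredictable and not interruptible
    if set(services) - {"ec2"}:
        return "Compute Savings Plan"        # baseline spans Fargate and/or Lambda
    if family_stable and needs_size_flex:
        return "EC2 Instance Savings Plan"   # family-locked but size-flexible
    if family_stable:
        return "Standard RI"                 # family-locked, deepest discount
    return "Convertible RI"                  # steady but expected to change family


print(pick_lever(steady=True, family_stable=True))                       # Standard RI
print(pick_lever(steady=True, services=("ec2", "fargate", "lambda")))    # Compute Savings Plan
print(pick_lever(steady=False, interruption_tolerant=True))              # Spot
```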

EC2 pricing mechanisms compared

Mechanism	Discount vs OD	Commitment	Flexibility	Interruption risk	Best for
Standard RI	Up to 72%	1 or 3 yr, instance family + region	Can sell on RI Marketplace	None	Stable, family-locked workloads
Convertible RI	Up to 66%	1 or 3 yr	Exchange family / OS / tenancy	None	Stable but evolving workloads
Compute Savings Plans	Up to 66%	1 or 3 yr, $/hr commitment	Any family, any region, EC2 + Fargate + Lambda	None	Multi-service compute mix
EC2 Instance Savings Plans	Up to 72%	1 or 3 yr, $/hr commitment	Family + region locked, size-flexible	None	Type-stable EC2 with size flexibility
Spot	Up to 90%	None	Any instance type, any region	2-min termination notice	Fault-tolerant batch, CI, stateless web
On-Demand	0%	None	Full	None	Spiky, unpredictable, short-lived

Decision tree

Cheat sheet

Sharp facts the exam loves — give these one last read before exam day.

Commit to reserved capacity when usage stays steady most of the year

When EC2, Fargate, or Lambda runs steadily for more than ~70% of a 1- or 3-year window, commit to capacity instead of paying on-demand: the reservation discount runs the whole term and reaches up to ~72% off on-demand at a 3-year all-upfront commitment, while on-demand carries no commitment but the highest per-hour rate. Use on-demand only for short-lived or unpredictable work where you'd rather pay the premium than over-commit. Reserved Instances tie the discount to a specific instance configuration, whereas Savings Plans commit you to a dollars-per-hour spend that applies across families and flexes as your fleet changes.

Trap Treating RIs and Savings Plans as interchangeable: RIs lock to a family and region, SPs trade some discount for cross-family flexibility, and an SP that overlaps your RIs just sits idle behind them.

Run interruption-tolerant work on EC2 Spot

EC2 Spot runs on spare capacity at up to 90% off On-Demand but can be reclaimed on a 2-minute notice, so it fits stateless, batch, big-data, and CI-fleet jobs where an interruption is survivable, and nothing that can't tolerate being reclaimed mid-task, which belongs on on-demand or reserved capacity. To minimize interruptions, set the price-capacity-optimized allocation strategy (AWS's recommended choice; you must set it explicitly because the CLI/API default is still lowest-price), which draws from pools that are both cheap and deep.

Trap Running stateful workloads on Spot without checkpointing: a reclaim mid-task loses in-flight state, so persist or checkpoint first.

Reach for Compute Optimizer when you need right-sizing recommendations

Compute Optimizer answers "is this resource the right size?" rather than "should I commit?": it analyzes actual utilization to recommend a smaller instance or a different family for EC2, EBS, Lambda, and ECS on Fargate. It needs enough data to judge (a Lambda function must see at least 50 invocations in 14 days to qualify) and deliberately stays out of the purchasing decision, leaving RI and Savings Plans recommendations to Trusted Advisor and Cost Explorer.

Trap Expecting Compute Optimizer to recommend RIs or Savings Plans: those purchase recommendations come from Trusted Advisor or Cost Explorer instead.

RIs apply before Savings Plans in the billing engine

The billing engine applies discounts in a fixed order each hour: RIs first against matching family/region/AZ/OS usage, then Savings Plans against the remaining eligible usage (EC2 Instance SPs apply before the broader Compute SPs, and within that, highest-discount-percentage first), then on-demand for whatever is left. That order is why a Savings Plan overlapping your RIs earns nothing: the RIs have already consumed those hours. Buy SPs to cover usage your RIs don't reach, and add RIs only for instance types whose utilization is reliably steady.

Trap Buying a fresh Compute SP that overlaps usage your RIs already cover: the RIs consume those hours first and the new SP sits idle.

Climb the flexibility ladder: Standard RI to Convertible RI to Compute SP

Flexibility forms a ladder (Standard RI to Convertible RI to Compute SP) that trades a little discount for room to change. A Standard RI is locked to its family and can only be sold on the RI Marketplace, not swapped; a Convertible RI can be exchanged for a different family, OS, or tenancy without selling; and a Compute SP is the most flexible, covering EC2, Fargate, and Lambda across any family and region, at a slightly lower maximum discount than a deep RI commitment. Pick the lowest rung that still tolerates how much your workload will shift.

Trap Assuming a Standard RI can be exchanged like a Convertible: it can only be sold on the Marketplace, not swapped to another family.

Use price-capacity-optimized for long-running Spot fleets, not lowest-price

On a long-running Spot fleet the allocation strategy decides how often you get interrupted, so it matters more than shaving the last cent off the hourly rate. lowest-price^[10] minimizes cost but draws from the shallowest pools, which are reclaimed first; capacity-optimized launches from the deepest, lowest-interruption pools. price-capacity-optimized (the strategy AWS recommends, and the console default for new fleets/ASGs, though the CLI/API default is still lowest-price unless you override it) balances both, and capacity-optimized-prioritized honors your priority list when HPC or ML jobs need a specific instance order.

Trap Choosing lowest-price for a long-running Spot fleet: the cheapest pools are reclaimed first, so you trade a small saving for far more interruptions.

A low-traffic Lambda gets no Compute Optimizer recommendation

A quiet Lambda below 50 invocations in 14 days^[9] gets no Compute Optimizer recommendation at all, because the service needs enough recent activity to size a function and there simply isn't enough signal. When the exam says 'Compute Optimizer cannot generate a recommendation' for a quiet function, this lookback threshold is the root cause, not anything you've misconfigured.

Reach for Graviton when 'reduce cost' meets an unrestricted architecture

AWS Graviton (ARM64) instances^[11] are AWS-designed processors delivering up to 40% better price-performance than comparable x86, so when a question says 'reduce cost' and doesn't pin the architecture, Graviton is the move. The only catch is that the workload must run on ARM, which most managed services already handle (Graviton is supported across RDS, Aurora, ElastiCache, Lambda, and Fargate), so an unrestricted, managed workload has nothing holding it on x86.

Trap Reaching for Spot or Reserved Instances when the question only says 'reduce cost' on an unrestricted workload: those need interruption tolerance or a commitment, while Graviton just lowers the rate.

Run fault-tolerant containers on Fargate Spot

Fargate Spot^[12] runs tasks on spare capacity at a deep discount but can reclaim them on the same 2-minute notice as EC2 Spot, so it fits fault-tolerant containerized work like CI builds, batch jobs, and dev/test. For anything that must stay up, keep it on regular FARGATE; the standard pattern is a mixed capacity provider holding a FARGATE baseline for steady load and bursting the interruption-tolerant overflow onto FARGATE_SPOT.

Trap Putting always-on production tasks entirely on Fargate Spot: keep the steady baseline on FARGATE and use FARGATE_SPOT only for interruption-tolerant burst.
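The mixed capacity provider pattern can be sketched as a strategy list plus a rough estimate of where tasks land. The `capacityProvider`, `base`, and `weight` keys follow the ECS capacity provider strategy shape; the specific numbers, the helper name, and the way the remainder is rounded are assumptions for illustration, and real ECS placement may differ at fractional boundaries.

```python
# FARGATE holds a baseline of 2 tasks; the rest splits 1:3 in favor of FARGATE_SPOT.
strategy = [
    {"capacityProvider": "FARGATE", "base": 2, "weight": 1},
    {"capacityProvider": "FARGATE_SPOT", "weight": 3},
]


def place_tasks(desired, strategy):
    """Estimate task counts per capacity provider for a desired task count."""
    counts = {s["capacityProvider"]: 0 for s in strategy}
    remaining = desired

    # The base is satisfied first, on the provider that declares it.
    for s in strategy:
        take = min(s.get("base", 0), remaining)
        counts[s["capacityProvider"]] += take
        remaining -= take

    # The remainder is split in proportion to the weights.
    total_weight = sum(s["weight"] for s in strategy)
    assigned = 0
    for s in strategy:
        share = remaining * s["weight"] // total_weight
        counts[s["capacityProvider"]] += share
        assigned += share

    # Any rounding leftover goes to the first provider (an assumption).
    counts[strategy[0]["capacityProvider"]] += remaining - assigned
    return counts


print(place_tasks(10, strategy))  # {'FARGATE': 4, 'FARGATE_SPOT': 6}
```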

Get RI / SP / right-sizing tips for free from Trusted Advisor

Trusted Advisor^[13] surfaces cost-optimization advice (an 'underutilized EC2 instances' check for right-sizing, plus 'RI optimization' and 'Savings Plans recommendations' once it has ~30 days of usage to learn from), and a core subset of those checks is free. Only that subset comes with Basic support; the full cost-optimization check set unlocks with a Business or Enterprise plan.

Trap Assuming every Trusted Advisor check is free: only a core subset is, and the full cost-optimization check set requires a Business or Enterprise support plan.

Replace at-risk Spot before the 2-minute notice with Capacity Rebalancing

Capacity Rebalancing lets an Auto Scaling group act on EC2's rebalance recommendation signal (which arrives ahead of the hard 2-minute interruption notice), launching a replacement proactively while the at-risk instance is still healthy. Waiting on the 2-minute notice alone leaves no headroom because capacity may already be gone. Pair rebalancing with lifecycle hooks so in-flight requests finish draining before the old instance terminates.

Trap Relying only on the 2-minute notice for graceful drain: by then capacity may already be gone, and the earlier rebalance recommendation is what gives you headroom.

Use Enhanced Infrastructure Metrics for cyclical monthly or quarterly workloads

Enhanced Infrastructure Metrics is the paid Compute Optimizer add-on that extends the lookback to up to 93 days, capturing a full cyclical period and sizing for the peak. It is the right choice whenever billing or processing follows a monthly or quarterly rhythm. The default 14-day lookback of CloudWatch data can otherwise sample only a quiet trough of a workload that spikes monthly or quarterly and then recommend an instance that's too small.

Trap Trusting a default-lookback recommendation for a cyclical workload: 14 days can sample only a trough and recommend a too-small instance.
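A small simulation shows why the lookback length matters. The utilization series is synthetic: a 15% baseline with a 90% spike roughly once a month, on days chosen so that none falls inside the final two weeks. All values and function names are invented for the illustration.

```python
def daily_cpu(days=93, baseline=15, peak=90, spike_days=(10, 40, 70)):
    """Synthetic daily peak CPU (%): flat baseline with a monthly spike."""
    return [peak if d in spike_days else baseline for d in range(days)]


def observed_peak(series, lookback_days):
    """Highest utilization visible within the most recent lookback window."""
    return max(series[-lookback_days:])


series = daily_cpu()
default_view = observed_peak(series, 14)   # only the quiet trough is visible
enhanced_view = observed_peak(series, 93)  # the monthly spikes are captured

print(default_view, enhanced_view)  # 15 90
```

Sizing to the 14-day view would treat the instance as running at 15%, while the 93-day view sees the 90% peaks the instance actually has to absorb.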

Configure org-wide Compute Optimizer settings from the management account

Recommendation preferences set in the management account (approved instance families, required CPU headroom, lookback window) propagate to every member account in an AWS Organization, so you tune the policy once instead of per account. Note the coverage edges: Compute Optimizer does not rightsize Spot Instances at all, but it does cover RDS for MySQL and PostgreSQL (with Performance Insights enabled) alongside EC2, Lambda, EBS, and ECS.

Trap Expecting Spot Instance rightsizing from Compute Optimizer: it produces no recommendations for Spot, so optimize those via allocation strategy and instance flexibility instead.

Use a Zonal RI, not Regional, when you need guaranteed capacity in a specific AZ

A Zonal Reserved Instance, scoped to one Availability Zone, gives both the billing discount and a capacity reservation matching the instance attributes, so instances still launch even during a peak-demand crunch in that AZ: choose it whenever guaranteed capacity in a known AZ is the requirement. A Regional Reserved Instance, by contrast, applies its billing discount flexibly across all AZs in the region but reserves no capacity, so it saves money without guaranteeing a launch slot, making it the better default when guaranteed capacity isn't required.

Trap Assuming a Regional RI guarantees a launch slot: it only discounts billing, so you need a Zonal RI (or On-Demand Capacity Reservation) to reserve actual capacity.
