Avg. Waste: 15–32% of cloud spend
Right-Size Savings: 30–60% per instance
Spot Discount: 60–91% vs on-demand
Storage Tiering: 30–50% on infrequent data
// Figure A05: Cost Tagging Workflow, 8 steps: schema → tag enforcement → optimization
01 The FinOps Framework — Crawl / Walk / Run [ AWS · Azure · GCP ]

The FinOps Foundation defines three maturity stages. Each maps to specific tooling and process changes. The goal at every stage is accountability: knowing who spends what, and why.

Stage | Focus | Key Actions | Typical Time
Crawl | Visibility | Cost baseline, tagging schema, idle resource scan | 0–30 days
Walk | Optimization | RI/Savings Plan purchase, right-sizing, budget alerts | 30–90 days
Run | Automation | Predictive modeling, auto-scaling policies, showback | 90+ days
02 Right-Sizing Compute [ HIGH ROI ]

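If average CPU utilization stays below 40% over a 30-day baseline, the instance is probably oversized; confirm against memory, network, and disk utilization before resizing. Industry estimates suggest that 60–70% of cloud instances are provisioned at roughly 2× the capacity they need.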

// Figure A06: FinOps Glossary Hub, key terms for cloud cost optimization
Utilization Signal | Threshold | Action
CPU avg (30d) | < 40% | Downsize one size tier
Memory avg (30d) | < 50% | Review instance family
Network throughput | < 20% of peak | Evaluate a smaller instance size (network bandwidth scales with instance size)
Disk I/O | < 30% avg | Switch to lower-tier volume
  • Export CPU + memory metrics from CloudWatch / Azure Monitor / GCP Monitoring
  • Identify instances with < 40% avg CPU over 30 days
  • Test at target size before terminating original instance
  • Make one change at a time to attribute performance correctly
  • Set a monitoring alert on the new instance before closing the old one
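The threshold checks in this section can be sketched as a small filter over exported metrics. This is a minimal illustration, not a provider API: the metric keys (`cpu_avg`, `memory_avg`, `network_peak_share`, `disk_io_avg`) and the function name are hypothetical, and the input is assumed to be 30-day averages in percent that you have already exported from your monitoring service.

```python
# Thresholds from the utilization table in this section (percent).
# The dictionary keys are hypothetical names for exported metrics.
THRESHOLDS = {
    "cpu_avg": (40.0, "Downsize one size tier"),
    "memory_avg": (50.0, "Review instance family"),
    "network_peak_share": (20.0, "Evaluate lower network capacity"),
    "disk_io_avg": (30.0, "Switch to lower-tier volume"),
}


def rightsizing_actions(metrics: dict[str, float]) -> list[str]:
    """Return one suggested action per signal that is below its threshold.

    Signals missing from `metrics` are skipped rather than treated as zero.
    """
    actions = []
    for signal, (threshold, action) in THRESHOLDS.items():
        value = metrics.get(signal)
        if value is not None and value < threshold:
            actions.append(
                f"{signal}={value:.0f}% (< {threshold:.0f}%): {action}"
            )
    return actions


# Example: CPU is under the 40% threshold, memory is not under 50%.
for line in rightsizing_actions({"cpu_avg": 22.0, "memory_avg": 61.0}):
    print(line)
```

A flagged signal is only a candidate for review; the checklist above (test at target size, change one thing at a time) still applies before anything is resized.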
03 Reserved Instances vs. Savings Plans [ COMMITMENT REQUIRED ]

Feature | Savings Plans | Reserved Instances
Flexibility | High: any instance family, OS, AZ | Low: specific instance type + AZ
Discount | Up to 72% vs on-demand | Up to 75% vs on-demand
Commitment | 1-yr or 3-yr USD amount | 1-yr or 3-yr specific instance
Best for | Baseline predictable workload | Stable critical production

Recommendation: Start with a Compute Savings Plan for baseline workload. Layer specific RIs for your most stable, highest-utilization instances.
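Whether a commitment pays off depends on how much of the committed capacity is actually used. The sketch below works through that arithmetic under a simplifying assumption: the commitment is billed for every hour of the term at the discounted rate, while on-demand is billed only for hours used. The function names and the example figures are illustrative, not taken from any provider's pricing API.

```python
def break_even_utilization(discount: float) -> float:
    """Fraction of the term a workload must run for a commitment to match
    on-demand cost, given a discount expressed as a fraction (0.72 = 72%)."""
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must be in [0, 1)")
    return 1.0 - discount


def commitment_savings(on_demand_rate: float, discount: float,
                       utilization: float, term_hours: float) -> float:
    """Savings (positive) or loss (negative) of committing vs. on-demand.

    on-demand cost = rate * hours actually used
    committed cost = discounted rate * every hour in the term
    """
    on_demand_cost = on_demand_rate * term_hours * utilization
    committed_cost = on_demand_rate * (1.0 - discount) * term_hours
    return on_demand_cost - committed_cost


# With a 72% discount the commitment breaks even at about 28% utilization.
print(f"{break_even_utilization(0.72):.2f}")
# A workload running 50% of the time saves money; one running 20% loses it.
print(f"{commitment_savings(1.0, 0.72, 0.5, 100):.2f}")
print(f"{commitment_savings(1.0, 0.72, 0.2, 100):.2f}")
```

This is why the recommendation targets baseline and highest-utilization workloads: the further utilization sits above the break-even point, the larger the realized discount.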
04 Spot Instances for Fault-Tolerant Workloads [ BATCH / CI-CD ]

Provider | Max Discount | Use Case
AWS EC2 Spot | 60–91% off | Batch processing, CI/CD agents
Azure Spot | Up to 90% off | Data pipelines, rendering
GCP Spot / Preemptible | Up to 91% off | Non-production workloads

Do NOT use Spot for databases, persistent APIs, or any workload requiring consistent uptime: the provider can reclaim Spot capacity at short notice.
05 Storage Tiering [ LIFECYCLE POLICIES ]

Tier | Access | Cost | Retrieval Delay
Hot / Standard | Real-time | Baseline | None
Cool / Infrequent | Monthly | 40–60% lower storage cost | None (retrieval fee applies)
Archive / Cold | Quarterly | 70–80% lower storage cost | 12–48 hours
Glacier / Deep Archive | Annual | 95% lower storage cost | Hours to days

Implement lifecycle policies to auto-transition objects as they age:
  • Hot → Cool (90d)
  • Cool → Glacier (1yr)
  • Glacier → Deep Archive (3yr)
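As one possible rendering of these transitions on AWS, the sketch below builds an S3 lifecycle configuration in the shape accepted by boto3's `put_bucket_lifecycle_configuration`. Several things here are assumptions rather than part of the original guidance: the mapping of Cool, Glacier, and Deep Archive to the S3 storage classes `STANDARD_IA`, `GLACIER`, and `DEEP_ARCHIVE`; the reading of 90d / 1yr / 3yr as object age since creation; and the rule ID and bucket name, which are placeholders.

```python
import json

# (object age in days since creation, target S3 storage class)
# Assumed mapping: Cool -> STANDARD_IA, Glacier -> GLACIER,
# Deep Archive -> DEEP_ARCHIVE.
TRANSITIONS = [
    (90, "STANDARD_IA"),
    (365, "GLACIER"),
    (1095, "DEEP_ARCHIVE"),
]


def build_lifecycle_config(prefix: str = "") -> dict:
    """Build a lifecycle configuration with one rule covering `prefix`.

    An empty prefix applies the rule to every object in the bucket.
    """
    return {
        "Rules": [
            {
                "ID": "tiering-by-age",  # placeholder rule name
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                "Transitions": [
                    {"Days": days, "StorageClass": storage_class}
                    for days, storage_class in TRANSITIONS
                ],
            }
        ]
    }


config = build_lifecycle_config()
print(json.dumps(config, indent=2))

# To apply it with boto3 (not executed here; "my-bucket" is a placeholder):
#   import boto3
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="my-bucket", LifecycleConfiguration=config)
```

Azure Blob Storage and GCP Cloud Storage have their own lifecycle rule formats; the same ages would need to be expressed in those schemas.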

06 Tagging Strategy [ COST ALLOCATION ]

Required Tag | Example | Purpose
Environment | production | Isolate prod vs. dev spend
Owner / Team | platform-eng | Chargeback by team
CostCenter | CC-12345 | Finance allocation
Project | payments-api | Per-project visibility
ServiceName | postgres-main | Resource identification

Enforcement: Use SCPs (AWS) or the equivalent policy mechanism on your provider to block resource creation when mandatory tags are missing. Audit weekly with:
    aws resourcegroupstaggingapi get-resources
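The weekly audit can be sketched as a check over the JSON that `aws resourcegroupstaggingapi get-resources` returns, which lists resources under `ResourceTagMappingList` with a `ResourceARN` and a list of `Key`/`Value` tags. The required keys below follow the tagging table, with the assumption that "Owner / Team" is stored under the key `Owner`; the sample ARNs and the function names are hypothetical.

```python
# Required tag keys from the tagging table. "Owner" is an assumed key name
# for the "Owner / Team" row.
REQUIRED_TAGS = ("Environment", "Owner", "CostCenter", "Project", "ServiceName")


def missing_tags(tags: list[dict]) -> list[str]:
    """Return required keys that are absent or have an empty value."""
    present = {t["Key"] for t in tags if t.get("Value")}
    return [key for key in REQUIRED_TAGS if key not in present]


def audit(response: dict) -> dict[str, list[str]]:
    """Map each non-compliant resource ARN to its missing tag keys."""
    report = {}
    for item in response.get("ResourceTagMappingList", []):
        missing = missing_tags(item.get("Tags", []))
        if missing:
            report[item["ResourceARN"]] = missing
    return report


# Hand-written sample in the shape of a get-resources response.
SAMPLE_RESPONSE = {
    "ResourceTagMappingList": [
        {
            "ResourceARN": "arn:aws:ec2:us-east-1:123456789012:instance/i-0aaa",
            "Tags": [
                {"Key": "Environment", "Value": "production"},
                {"Key": "Owner", "Value": "platform-eng"},
                {"Key": "CostCenter", "Value": "CC-12345"},
                {"Key": "Project", "Value": "payments-api"},
                {"Key": "ServiceName", "Value": "postgres-main"},
            ],
        },
        {
            "ResourceARN": "arn:aws:ec2:us-east-1:123456789012:instance/i-0bbb",
            "Tags": [
                {"Key": "Environment", "Value": "dev"},
                {"Key": "Owner", "Value": "platform-eng"},
                {"Key": "ServiceName", "Value": "scratch"},
            ],
        },
    ]
}

for arn, missing in audit(SAMPLE_RESPONSE).items():
    print(arn, "missing:", ", ".join(missing))
```

In practice the input would come from saving the CLI output as JSON and loading it with `json.load`. Resources that have never been tagged may not appear in this API's results, so a missing-tag audit based on it is a lower bound.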
// Figure A10: Waste Detection Checklist, 15 points