
    AWS Cost Optimization: Reduce Your Cloud Bill Without Sacrificing Performance

    Tarek Cheikh

    Founder & AWS Cloud Architect


    AWS billing grows silently. Idle instances, unattached volumes, oversized databases, and missing Savings Plans compound month after month. The tools to find and fix these problems are built into AWS -- Cost Explorer, Compute Optimizer, Budgets, and the CLI. The issue is that most teams never run them.

    This article covers the full cost optimization workflow: finding waste with Cost Explorer and CLI audits, right-sizing with Compute Optimizer, committing to Savings Plans, using Spot Instances, optimizing storage with S3 lifecycle policies and gp3 migration, scheduling dev/test environments, and setting up budget alerts.

    Cost Explorer: Where Is the Money Going?

    # Get monthly cost breakdown by service (last 3 months)
    aws ce get-cost-and-usage \
        --time-period Start=2025-03-01,End=2025-06-01 \
        --granularity MONTHLY \
        --metrics BlendedCost \
        --group-by Type=DIMENSION,Key=SERVICE \
        --query 'ResultsByTime[].Groups[?to_number(Metrics.BlendedCost.Amount) > `100`].{Service:Keys[0],Cost:Metrics.BlendedCost.Amount}'
    
    # Get daily cost for the current month
    aws ce get-cost-and-usage \
        --time-period Start=2025-06-01,End=2025-06-16 \
        --granularity DAILY \
        --metrics UnblendedCost
    
    # Cost breakdown by instance type (find the expensive ones)
    aws ce get-cost-and-usage \
        --time-period Start=2025-05-01,End=2025-06-01 \
        --granularity MONTHLY \
        --metrics UnblendedCost \
        --group-by Type=DIMENSION,Key=INSTANCE_TYPE \
        --filter '{"Dimensions":{"Key":"SERVICE","Values":["Amazon Elastic Compute Cloud - Compute"]}}'
    
    # Cost by tag (requires cost allocation tags to be activated)
    aws ce get-cost-and-usage \
        --time-period Start=2025-05-01,End=2025-06-01 \
        --granularity MONTHLY \
        --metrics UnblendedCost \
        --group-by Type=TAG,Key=Environment
    
    # Enable a cost allocation tag
    aws ce update-cost-allocation-tags-status \
        --cost-allocation-tags-status '[
            {"TagKey":"Environment","Status":"Active"},
            {"TagKey":"Team","Status":"Active"},
            {"TagKey":"Project","Status":"Active"}
        ]'
    
    # Tags take 24 hours to appear in Cost Explorer after activation.

    Finding Waste: Idle and Unused Resources

    Idle EC2 Instances

    # Find running instances with average CPU below 5% over the last 7 days
    # Step 1: List all running instances
    aws ec2 describe-instances \
        --filters Name=instance-state-name,Values=running \
        --query 'Reservations[].Instances[].{Id:InstanceId,Type:InstanceType,LaunchTime:LaunchTime}' \
        --output table
    
    # Step 2: Check CPU for a specific instance (last 7 days)
    aws cloudwatch get-metric-statistics \
        --namespace AWS/EC2 \
        --metric-name CPUUtilization \
        --dimensions Name=InstanceId,Value=i-0123456789abcdef0 \
        --start-time 2025-06-09T00:00:00Z \
        --end-time 2025-06-16T00:00:00Z \
        --period 86400 \
        --statistics Average \
        --query 'Datapoints[].{Day:Timestamp,AvgCPU:Average}' \
        --output table
    
    # If average CPU is consistently below 5%, the instance is likely idle.
    # Options: stop it, downsize it, or move the workload to Lambda/Fargate.
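    The two steps above can be glued into a single audit loop. A sketch, assuming a Linux shell with GNU date and configured AWS credentials; `is_idle` is an illustrative helper, and the 5% threshold is the heuristic from this article, not an AWS default:

```shell
# Audit sketch: flag running instances whose 7-day average CPU is below 5%.
is_idle() {
    # exit 0 when the average CPU (a decimal string) is below the 5% heuristic
    awk -v c="$1" 'BEGIN { exit (c + 0 < 5.0) ? 0 : 1 }'
}

for id in $(aws ec2 describe-instances \
        --filters Name=instance-state-name,Values=running \
        --query 'Reservations[].Instances[].InstanceId' --output text); do
    avg=$(aws cloudwatch get-metric-statistics \
        --namespace AWS/EC2 --metric-name CPUUtilization \
        --dimensions Name=InstanceId,Value="$id" \
        --start-time "$(date -u -d '7 days ago' +%Y-%m-%dT%H:%M:%SZ)" \
        --end-time "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
        --period 604800 --statistics Average \
        --query 'Datapoints[0].Average' --output text)
    [ "$avg" = "None" ] && continue   # no datapoints (e.g., just launched)
    if is_idle "$avg"; then
        echo "IDLE: $id (7-day avg CPU ${avg}%)"
    fi
done
```

    Treat the output as a shortlist for review, not an automatic stop list: some instances legitimately idle at low CPU (e.g., memory-bound caches).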

    Unattached EBS Volumes

    # Find EBS volumes not attached to any instance
    aws ec2 describe-volumes \
        --filters Name=status,Values=available \
        --query 'Volumes[].{VolumeId:VolumeId,Size:Size,Type:VolumeType,Created:CreateTime}' \
        --output table
    
    # Unattached volumes cost money every month:
    #   gp3: $0.08/GB/month
    #   gp2: $0.10/GB/month
    #   io2: $0.125/GB/month + IOPS charges
    #
    # A forgotten 500 GB gp2 volume costs $50/month ($600/year) for nothing.
    
    # Delete after confirming the data is not needed
    # (create a snapshot first if unsure)
    aws ec2 create-snapshot --volume-id vol-0123456789abcdef0 --description "Backup before delete"
    aws ec2 delete-volume --volume-id vol-0123456789abcdef0
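    To put a dollar figure on a volume list, a small helper using the per-GB rates quoted above (us-east-1 list prices; `ebs_monthly_cost` is an illustrative name, not an AWS tool):

```shell
# Estimate the monthly storage cost of an EBS volume from its size and type.
# Rates are the us-east-1 figures above; io2 IOPS charges are not modeled.
ebs_monthly_cost() {
    size_gb=$1; vol_type=$2
    case $vol_type in
        gp3) rate=0.08  ;;
        gp2) rate=0.10  ;;
        io2) rate=0.125 ;;
        *)   echo "unknown type: $vol_type" >&2; return 1 ;;
    esac
    # awk handles the floating-point math POSIX sh lacks
    awk -v s="$size_gb" -v r="$rate" 'BEGIN { printf "%.2f\n", s * r }'
}

ebs_monthly_cost 500 gp2   # the forgotten 500 GB gp2 volume: 50.00/month
```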

    Old EBS Snapshots

    # Find snapshots older than 90 days
    aws ec2 describe-snapshots \
        --owner-ids self \
        --query 'Snapshots[?StartTime<`2025-03-01`].{Id:SnapshotId,Size:VolumeSize,Created:StartTime}' \
        --output table
    
    # Snapshot pricing: $0.05/GB/month
    # 1 TB of old snapshots = $50/month = $600/year
    #
    # Review and delete snapshots that are no longer needed:
    aws ec2 delete-snapshot --snapshot-id snap-0123456789abcdef0

    Elastic IPs Not Attached

    # Unused Elastic IPs cost $0.005/hour ($3.65/month each)
    aws ec2 describe-addresses \
        --query 'Addresses[?AssociationId==null].{IP:PublicIp,AllocationId:AllocationId}' \
        --output table
    
    # Release unused Elastic IPs
    aws ec2 release-address --allocation-id eipalloc-0123456789abcdef0

    Right-Sizing with Compute Optimizer

    # AWS Compute Optimizer analyzes CloudWatch metrics (CPU, memory, network)
    # over the last 14 days and recommends optimal instance types.
    
    # Enable Compute Optimizer (one-time, account-level)
    aws compute-optimizer update-enrollment-status --status Active
    
    # Get EC2 instance recommendations
    aws compute-optimizer get-ec2-instance-recommendations \
        --query 'instanceRecommendations[].{
            Instance:instanceArn,
            Current:currentInstanceType,
            Finding:finding,
            Recommended:recommendationOptions[0].instanceType,
            EstimatedSavings:recommendationOptions[0].savingsOpportunity.estimatedMonthlySavings.value
        }' \
        --output table
    
    # Findings:
    #   OVER_PROVISIONED -- instance is too large, downsize it
    #   UNDER_PROVISIONED -- instance is too small, upsize it
    #   OPTIMIZED -- instance is correctly sized
    
    # Compute Optimizer also covers:
    aws compute-optimizer get-auto-scaling-group-recommendations
    aws compute-optimizer get-ebs-volume-recommendations
    aws compute-optimizer get-lambda-function-recommendations
    aws compute-optimizer get-ecs-service-recommendations
    
    # EBS volume recommendations often suggest gp2 -> gp3 migration.
    # Lambda recommendations suggest optimal memory configuration.
    # ECS recommendations suggest right-sized CPU/memory for Fargate tasks.

    Savings Plans

    # Savings Plans offer up to 72% savings in exchange for a 1-year or 3-year
    # hourly spend commitment. They replace Reserved Instances for most use cases.
    #
    # Two types:
    #
    # Compute Savings Plans:
    #   - Apply to EC2, Fargate, and Lambda across all regions and instance families
    #   - Most flexible: change instance type, region, OS, or tenancy anytime
    #   - Up to 66% savings
    #
    # EC2 Instance Savings Plans:
    #   - Apply to a specific instance family in a specific region (e.g., m5 in us-east-1)
    #   - Less flexible but deeper discounts
    #   - Up to 72% savings
    #
    # Payment options (higher upfront = deeper discount):
    #   No Upfront:      pay monthly, lowest commitment
    #   Partial Upfront:  pay ~50% upfront, rest monthly
    #   All Upfront:      pay 100% upfront, deepest discount
    
    # Get Savings Plans recommendations from Cost Explorer
    aws ce get-savings-plans-purchase-recommendation \
        --savings-plans-type COMPUTE_SP \
        --term-in-years ONE_YEAR \
        --payment-option NO_UPFRONT \
        --lookback-period-in-days SIXTY_DAYS
    
    # View current Savings Plans utilization
    aws ce get-savings-plans-utilization \
        --time-period Start=2025-05-01,End=2025-06-01
    
    # Target 80-90% Savings Plans utilization. Below 80% means you are
    # over-committed and paying for unused capacity. Above 95% means you
    # have on-demand spend that could be covered by additional plans.
    
    # Purchase a Savings Plan (example: $10/hour Compute SP, 1 year, no upfront)
    # List available offerings first: aws savingsplans describe-savings-plans-offerings
    aws savingsplans create-savings-plan \
        --savings-plan-offering-id offering-id-from-describe \
        --commitment 10.0 \
        --purchase-time 2025-06-16T00:00:00Z
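    The utilization guidance above can be encoded as a quick triage helper. The 80% and 95% thresholds are this article's heuristics, and `sp_utilization_advice` is an illustrative name:

```shell
# Classify a Savings Plans utilization percentage per the guidance above.
sp_utilization_advice() {
    awk -v u="$1" 'BEGIN {
        u = u + 0
        if (u < 80)      print "over-committed: reduce next commitment"
        else if (u > 95) print "consider additional plans for on-demand spend"
        else             print "healthy"
    }'
}

sp_utilization_advice 72
sp_utilization_advice 88
```

    Feed it the `Utilization.UtilizationPercentage` value from `get-savings-plans-utilization` output.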

    Spot Instances

    # Spot Instances use unused EC2 capacity at up to 90% discount.
    # AWS can reclaim them with a 2-minute warning.
    #
    # Good for: batch processing, CI/CD, data analysis, stateless web servers
    # Not for: databases, stateful workloads, single-instance applications
    
    # Check spot pricing history
    aws ec2 describe-spot-price-history \
        --instance-types m5.xlarge c5.xlarge \
        --product-descriptions "Linux/UNIX" \
        --start-time 2025-06-15T00:00:00Z \
        --query 'SpotPriceHistory[].{Type:InstanceType,AZ:AvailabilityZone,Price:SpotPrice}' \
        --output table
    
    # Check Spot placement score (likelihood of getting capacity)
    aws ec2 get-spot-placement-scores \
        --target-capacity 10 \
        --target-capacity-unit-type units \
        --instance-requirements-with-metadata '{
            "InstanceRequirements": {
                "VCpuCount": {"Min": 2, "Max": 8},
                "MemoryMiB": {"Min": 8192, "Max": 32768}
            }
        }' \
        --region-names us-east-1 us-west-2
    
    # Mixed Instances ASG (combine On-Demand + Spot for reliability)
    # In the ASG launch template, specify:
    #   OnDemandBaseCapacity: 2          (always keep 2 on-demand)
    #   OnDemandPercentageAboveBase: 20  (20% on-demand, 80% spot above base)
    #   SpotAllocationStrategy: capacity-optimized  (fewest interruptions)
    #
    # Example: ASG with 10 instances
    #   2 on-demand (base) + 2 on-demand (20% of the remaining 8, rounded up) + 6 spot
    #   Result: 4 on-demand + 6 spot = ~50% cost reduction with high availability
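    The split math above can be sketched as a helper, assuming the on-demand share of capacity above base rounds up to a whole instance (`asg_split` is an illustrative name):

```shell
# Compute the on-demand / spot split for a mixed-instances ASG.
# Assumes the on-demand percentage above base rounds up to a whole instance.
asg_split() {
    desired=$1; base=$2; pct_above=$3
    above=$((desired - base))
    od_above=$(( (above * pct_above + 99) / 100 ))   # ceiling division
    echo "on-demand=$((base + od_above)) spot=$((above - od_above))"
}

asg_split 10 2 20   # the example above: 4 on-demand, 6 spot
```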

    S3 Storage Optimization

    # S3 storage classes and approximate pricing (us-east-1, per GB/month):
    #
    # Standard:            $0.023    (frequent access)
    # Intelligent-Tiering: $0.023    (auto-moves between tiers, $0.0025 monitoring fee per 1K objects)
    # Standard-IA:         $0.0125   (infrequent access; 30-day min storage duration, 128 KB min object size)
    # One Zone-IA:         $0.010    (single AZ, infrequent access)
    # Glacier Instant:     $0.004    (archive, millisecond retrieval)
    # Glacier Flexible:    $0.0036   (archive, minutes to hours retrieval)
    # Glacier Deep Archive: $0.00099 (long-term archive, 12-48 hour retrieval)
    #
    # Moving 1 TB from Standard to Glacier Deep Archive:
    #   $23.00/month --> $1.01/month = 96% savings
    
    # Enable S3 Intelligent-Tiering (automatic tier management)
    aws s3api put-bucket-intelligent-tiering-configuration \
        --bucket my-data-bucket \
        --id EntireBucket \
        --intelligent-tiering-configuration '{
            "Id": "EntireBucket",
            "Status": "Enabled",
            "Tierings": [
                {"AccessTier": "ARCHIVE_ACCESS", "Days": 90},
                {"AccessTier": "DEEP_ARCHIVE_ACCESS", "Days": 180}
            ]
        }'
    
    # Intelligent-Tiering automatically moves objects:
    #   Frequent Access (default) --> Infrequent Access (after 30 days no access)
    #   --> Archive Access (after 90 days, if configured)
    #   --> Deep Archive Access (after 180 days, if configured)
    # No retrieval fees. Objects move back to Frequent on access.
    
    # S3 lifecycle policy (rule-based transitions and expiration)
    aws s3api put-bucket-lifecycle-configuration \
        --bucket my-logs-bucket \
        --lifecycle-configuration '{
            "Rules": [
                {
                    "ID": "ArchiveOldLogs",
                    "Status": "Enabled",
                    "Filter": {"Prefix": "logs/"},
                    "Transitions": [
                        {"Days": 30, "StorageClass": "STANDARD_IA"},
                        {"Days": 90, "StorageClass": "GLACIER_IR"},
                        {"Days": 365, "StorageClass": "DEEP_ARCHIVE"}
                    ],
                    "Expiration": {"Days": 2555}
                },
                {
                    "ID": "CleanupIncompleteUploads",
                    "Status": "Enabled",
                    "Filter": {"Prefix": ""},
                    "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 7}
                }
            ]
        }'
    
    # The second rule cleans up incomplete multipart uploads.
    # These are invisible in the console but accumulate storage charges.
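    As a sanity check on the storage-class table above, a one-liner computes the savings of any transition from the two per-GB-month rates (`s3_savings_pct` is an illustrative name; retrieval and request fees are not modeled):

```shell
# Percentage saved when moving data between two S3 storage classes,
# given their per-GB-month rates.
s3_savings_pct() {
    awk -v from="$1" -v to="$2" 'BEGIN { printf "%.0f\n", (from - to) / from * 100 }'
}

s3_savings_pct 0.023 0.00099   # Standard -> Deep Archive: ~96
s3_savings_pct 0.023 0.0125    # Standard -> Standard-IA: ~46
```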

    EBS: Migrate gp2 to gp3

    # gp3 is 20% cheaper than gp2 at baseline and includes 3,000 IOPS
    # and 125 MB/s throughput for free (gp2 performance scales with size).
    #
    # gp2: $0.10/GB/month, IOPS = 3 * size (min 100, max 16,000)
    # gp3: $0.08/GB/month, 3,000 IOPS + 125 MB/s included
    #      Additional IOPS: $0.005 per IOPS/month (up to 16,000)
    #      Additional throughput: $0.040 per MB/s/month (up to 1,000 MB/s)
    #
    # A 100 GB volume:
    #   gp2: $10.00/month, 300 IOPS
    #   gp3: $8.00/month, 3,000 IOPS (10x more IOPS, 20% cheaper)
    
    # Find all gp2 volumes
    aws ec2 describe-volumes \
        --filters Name=volume-type,Values=gp2 \
        --query 'Volumes[].{Id:VolumeId,Size:Size,State:State,IOPS:Iops}' \
        --output table
    
    # Migrate a volume from gp2 to gp3 (no downtime, no detach needed)
    aws ec2 modify-volume \
        --volume-id vol-0123456789abcdef0 \
        --volume-type gp3
    
    # If the gp2 volume had more than 3,000 IOPS (size > 1 TB),
    # provision additional IOPS on gp3 to match:
    aws ec2 modify-volume \
        --volume-id vol-0123456789abcdef0 \
        --volume-type gp3 \
        --iops 6000 \
        --throughput 250
    
    # Volume modification takes minutes to hours. Monitor progress:
    aws ec2 describe-volumes-modifications \
        --volume-ids vol-0123456789abcdef0 \
        --query 'VolumesModifications[].{Status:ModificationState,Progress:Progress}'
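    To estimate what a gp3 volume with provisioned extras will cost before migrating, a helper using the us-east-1 rates quoted above (`gp3_monthly_cost` is an illustrative name, not an AWS tool):

```shell
# Monthly gp3 cost: storage, plus IOPS above 3,000 and throughput above 125 MB/s.
# Rates are the us-east-1 figures quoted above.
gp3_monthly_cost() {
    size_gb=$1; iops=$2; tput_mbs=$3
    awk -v s="$size_gb" -v i="$iops" -v t="$tput_mbs" 'BEGIN {
        extra_iops = (i > 3000) ? i - 3000 : 0
        extra_tput = (t > 125)  ? t - 125  : 0
        printf "%.2f\n", s * 0.08 + extra_iops * 0.005 + extra_tput * 0.04
    }'
}

gp3_monthly_cost 100 3000 125    # the 100 GB example above: 8.00
gp3_monthly_cost 2000 6000 250   # a 2 TB volume with the extras above: 180.00
```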

    Scheduling Dev/Test Environments

    # Dev/test environments running 24/7 waste 65% of their cost.
    # Running only during business hours (12h/day, 5 days/week) = 35% of 24/7 cost.
    #
    # AWS Instance Scheduler is a CloudFormation solution that starts and stops
    # EC2 and RDS instances on a schedule based on tags.
    #
    # Quick approach: tag instances and use EventBridge + Lambda
    
    # Tag instances for scheduling
    aws ec2 create-tags \
        --resources i-0123456789abcdef0 \
        --tags Key=Schedule,Value=office-hours Key=Environment,Value=development
    
    # Create an EventBridge rule to stop instances at 7 PM UTC weekdays
    aws events put-rule \
        --name stop-dev-instances \
        --schedule-expression "cron(0 19 ? * MON-FRI *)" \
        --state ENABLED
    
    # Create an EventBridge rule to start instances at 7 AM UTC weekdays
    aws events put-rule \
        --name start-dev-instances \
        --schedule-expression "cron(0 7 ? * MON-FRI *)" \
        --state ENABLED
    
    # The Lambda target filters by tag and calls ec2:StopInstances / ec2:StartInstances.
    # Also schedule RDS instances:
    aws rds stop-db-instance --db-instance-identifier dev-database
    aws rds start-db-instance --db-instance-identifier dev-database
    # RDS instances auto-restart after 7 days if stopped manually.
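    The 35% figure above comes from weekly hours: 12 hours x 5 days is 60 of the week's 168 hours, or about 35.7% of 24/7 cost. A quick check (`schedule_cost_pct` is an illustrative name):

```shell
# Fraction of a 168-hour week an instance runs, as a percent of 24/7 cost.
schedule_cost_pct() {
    awk -v h="$1" -v d="$2" 'BEGIN { printf "%.1f\n", h * d / 168 * 100 }'
}

schedule_cost_pct 12 5   # office hours: 35.7
schedule_cost_pct 24 7   # always on: 100.0
```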

    AWS Budgets and Alerts

    # Set a monthly budget with alerts at 80% and 100%
    aws budgets create-budget \
        --account-id 123456789012 \
        --budget '{
            "BudgetName": "Monthly-AWS-Budget",
            "BudgetLimit": {"Amount": "5000", "Unit": "USD"},
            "BudgetType": "COST",
            "TimeUnit": "MONTHLY",
            "CostTypes": {
                "IncludeCredit": false,
                "IncludeRefund": false,
                "IncludeTax": true,
                "IncludeSupport": true
            }
        }' \
        --notifications-with-subscribers '[
            {
                "Notification": {
                    "NotificationType": "ACTUAL",
                    "ComparisonOperator": "GREATER_THAN",
                    "Threshold": 80,
                    "ThresholdType": "PERCENTAGE"
                },
                "Subscribers": [
                    {"SubscriptionType": "EMAIL", "Address": "team@company.com"}
                ]
            },
            {
                "Notification": {
                    "NotificationType": "FORECASTED",
                    "ComparisonOperator": "GREATER_THAN",
                    "Threshold": 100,
                    "ThresholdType": "PERCENTAGE"
                },
                "Subscribers": [
                    {"SubscriptionType": "EMAIL", "Address": "team@company.com"},
                    {"SubscriptionType": "SNS", "Address": "arn:aws:sns:us-east-1:123456789012:budget-alerts"}
                ]
            }
        ]'
    
    # The first alert fires when actual spend reaches 80% of budget.
    # The second alert fires when FORECASTED spend exceeds 100% of budget,
    # giving you time to act before the month ends.
    #
    # AWS Budgets: first 2 budgets are free, then $0.02/day per budget.
    
    # Budget actions: automatically restrict IAM or stop resources when over budget
    aws budgets create-budget-action \
        --account-id 123456789012 \
        --budget-name Monthly-AWS-Budget \
        --notification-type ACTUAL \
        --action-type APPLY_IAM_POLICY \
        --action-threshold '{
            "ActionThresholdValue": 100,
            "ActionThresholdType": "PERCENTAGE"
        }' \
        --definition '{
            "IamActionDefinition": {
                "PolicyArn": "arn:aws:iam::123456789012:policy/DenyEC2Launch",
                "Roles": ["DeveloperRole"]
            }
        }' \
        --execution-role-arn arn:aws:iam::123456789012:role/BudgetActionRole \
        --approval-model AUTOMATIC \
        --subscribers '[{"SubscriptionType":"EMAIL","Address":"admin@company.com"}]'

    Best Practices

    Quick Wins (Week 1)

    • Set up AWS Budgets with alerts at 80% actual and 100% forecasted. Without alerts, cost overruns go unnoticed until the monthly bill arrives.
    • Delete unattached EBS volumes, release unused Elastic IPs, and remove old snapshots. These resources accumulate silently.
    • Abort incomplete multipart uploads in S3 with a lifecycle rule. These are invisible in the console but incur storage charges.
    • Enable Cost Explorer and activate cost allocation tags (Environment, Team, Project). Tag-based cost visibility is the foundation of cost management.

    Compute (Week 2)

    • Enable Compute Optimizer and review right-sizing recommendations. Over-provisioned instances are the single largest source of waste.
    • Purchase Compute Savings Plans for predictable baseline workloads. Start with a commitment that covers 70-80% of your steady-state usage.
    • Use Spot Instances with mixed-instance Auto Scaling Groups for fault-tolerant workloads. Set OnDemandBaseCapacity for your minimum reliable capacity and fill the rest with Spot.
    • Schedule dev/test environments to run only during business hours. Running 12 hours/day, 5 days/week costs 35% of 24/7.

    Storage (Week 3)

    • Migrate all gp2 volumes to gp3. It is 20% cheaper, and every gp2 volume under 1 TB gains baseline IOPS (a 100 GB gp2 volume has 300 IOPS; gp3 includes 3,000). The migration requires no downtime.
    • Enable S3 Intelligent-Tiering for buckets with unpredictable access patterns. It automatically moves objects between tiers with no retrieval fees.
    • Set S3 lifecycle policies to transition old data to Glacier or Deep Archive. Logs, backups, and compliance archives rarely need Standard storage after 30 days.
    • Review S3 Storage Lens for bucket-level cost and usage insights across your organization.

    Ongoing

    • Review Cost Explorer weekly. Look for unexpected cost spikes, growing services, and on-demand spend that should be covered by Savings Plans.
    • Monitor Savings Plans utilization. Target 80-90%. Below 80% means over-commitment; above 95% means additional plans could save more.
    • Use AWS Cost Anomaly Detection to automatically flag unusual spending. It uses machine learning to identify cost spikes and sends alerts to SNS or email.
    • Tag every resource at creation. Untagged resources are invisible to cost attribution and make optimization decisions harder.

    Go Deeper: The State of AWS Security 2026

    This article is just the start. Get the full picture with our free whitepaper - 8 chapters covering IAM, S3, VPC, monitoring, agentic AI security, compliance, and a prioritized action plan with 50+ CLI commands.
