How We Reduced a Client's Cloud Server Bill by 60% in 30 Days

Muhammad Aliwardana · January 20, 2026 · 7 min read

The Problem

In Q4 2025, a mid-sized Indonesian e-commerce client came to us with a growing infrastructure bill. Their cloud spend had ballooned to $4,200/month because the team provisioned servers reactively: always scaling up, never scaling right.

What We Found in the First Week

After getting monitoring access, the picture was clear:

  • CPU utilization averaged 12% across 8 application servers. They were paying for capacity they never used.
  • 3 servers had no traffic for 45+ days — forgotten staging environments still running in production accounts.
  • Database queries were hitting an unindexed orders table with 2.3 million rows — every checkout page load triggered a full table scan.
  • Static assets (images, CSS, JS) were being served from the application server instead of a CDN.
  • Backups were running during peak hours (11am-2pm), causing I/O spikes on the primary database.
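The first two findings lend themselves to an automated check. The following is a minimal Python sketch, not the tooling used in this audit: the `ServerStats` record, its field names, and the 20% CPU threshold are assumptions made for illustration. Only the 45-day idle window comes from the findings above.

```python
from dataclasses import dataclass


@dataclass
class ServerStats:
    """Hypothetical per-server summary exported from a monitoring system."""
    name: str
    avg_cpu_percent: float        # mean CPU utilization over the review window
    days_since_last_request: int  # days since the server last received traffic


def classify(server: ServerStats,
             idle_days: int = 45,
             low_cpu_percent: float = 20.0) -> str:
    """Label a server as 'idle', 'underutilized', or 'ok'.

    idle_days mirrors the 45-day window from the audit; low_cpu_percent
    is an assumed threshold, not a figure from the engagement.
    """
    if server.days_since_last_request >= idle_days:
        return "idle"
    if server.avg_cpu_percent < low_cpu_percent:
        return "underutilized"
    return "ok"


def audit(fleet: list[ServerStats]) -> dict[str, list[str]]:
    """Group server names by classification."""
    report: dict[str, list[str]] = {"idle": [], "underutilized": [], "ok": []}
    for server in fleet:
        report[classify(server)].append(server.name)
    return report
```

Servers labeled `idle` are candidates for shutdown after a manual check; `underutilized` ones are candidates for right-sizing.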

The 30-Day Optimization Plan

Week 1: Quick Wins

Shut down the 3 idle servers for an immediate saving of $840/month. Moved static assets to Cloudflare R2 behind the CDN, which reduced bandwidth costs by 70% and cut page load time by 1.4 seconds.

Week 2: Right-sizing

Using load-testing data, replaced the 8 over-provisioned servers with 4 properly sized ones. Configured horizontal auto-scaling: the cluster now scales from 2 to 6 nodes based on actual traffic patterns.
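To illustrate how a 2-to-6-node scaling decision can work, here is a minimal Python sketch of a proportional scaling rule, the same shape of formula the Kubernetes Horizontal Pod Autoscaler documents. The post does not say which autoscaler or metric the client uses, so the function name `desired_nodes` and the 70% CPU target are assumptions. Only the 2 and 6 node bounds come from the setup described above.

```python
import math


def desired_nodes(current_nodes: int,
                  current_cpu_percent: float,
                  target_cpu_percent: float = 70.0,  # assumed target, not from the post
                  min_nodes: int = 2,
                  max_nodes: int = 6) -> int:
    """Return a node count that should bring average CPU near the target.

    Scales the current node count by observed/target utilization,
    rounds up, then clamps the result to [min_nodes, max_nodes].
    """
    if current_nodes < 1:
        raise ValueError("current_nodes must be at least 1")
    if target_cpu_percent <= 0:
        raise ValueError("target_cpu_percent must be positive")
    raw = math.ceil(current_nodes * current_cpu_percent / target_cpu_percent)
    return max(min_nodes, min(max_nodes, raw))
```

With 4 nodes at 90% average CPU, `desired_nodes(4, 90.0)` returns 6. At the 12% utilization seen before the optimization, `desired_nodes(4, 12.0)` returns the floor of 2.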

Week 3: Database Optimization

Added a composite index on the orders table:

```sql
CREATE INDEX idx_orders_user_status_created
ON orders (user_id, status, created_at DESC);
```

Checkout page load time dropped from 4.2s to 0.8s. Moved backup jobs out of the 11am-2pm peak window to 2am.
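The effect of the composite index on the orders table can be checked by comparing query plans before and after creating it. The sketch below uses SQLite only because it ships with Python. The post does not name the client's database engine, the table columns beyond those in the index are assumed, and the lookup query is hypothetical, since the real checkout query is not shown. Plan output differs between engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL,
        status TEXT NOT NULL,
        created_at TEXT NOT NULL
    )
""")

# Hypothetical checkout-style lookup; the real query is not shown in the post.
QUERY = """
    SELECT id FROM orders
    WHERE user_id = ? AND status = ?
    ORDER BY created_at DESC
    LIMIT 10
"""


def query_plan(connection: sqlite3.Connection) -> str:
    """Return SQLite's plan description for QUERY as a single string."""
    rows = connection.execute(
        "EXPLAIN QUERY PLAN " + QUERY, (42, "pending")
    ).fetchall()
    # The fourth column of each row holds the human-readable plan step.
    return " | ".join(row[3] for row in rows)


plan_before = query_plan(conn)

# Same index definition as in the post.
conn.execute("""
    CREATE INDEX idx_orders_user_status_created
    ON orders (user_id, status, created_at DESC)
""")

plan_after = query_plan(conn)

print("before:", plan_before)
print("after: ", plan_after)
```

Before the index exists, the plan reports a scan of the whole table. Afterwards it reports a search using `idx_orders_user_status_created`. The column order matters: the equality filters on `user_id` and `status` come first, and `created_at DESC` lets the engine return rows already sorted.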

Week 4: Monitoring and Alerting

Deployed Prometheus + Grafana dashboards. Set up cost anomaly alerts: any daily spend more than 20% above baseline triggers an immediate notification.
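The alert rule itself is simple to express. The following Python sketch assumes the baseline is the mean of recent daily spend. The post does not say how the baseline is computed, and the function name `is_cost_anomaly` is hypothetical. Only the 20% threshold comes from the setup described above.

```python
from statistics import mean


def is_cost_anomaly(today_spend: float,
                    recent_daily_spend: list[float],
                    threshold: float = 0.20) -> bool:
    """Return True when today's spend exceeds the baseline by more than threshold.

    The baseline is the mean of recent_daily_spend, which is an assumption
    made for this sketch rather than the method used in the engagement.
    """
    if not recent_daily_spend:
        raise ValueError("need at least one day of history to form a baseline")
    baseline = mean(recent_daily_spend)
    return today_spend > baseline * (1 + threshold)
```

As a rough illustration, $1,680/month over a 30-day month is about $56/day, so with a flat baseline the alert would fire once a day's spend passes roughly $67.20.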

The Result

| Metric | Before | After |
|--------|--------|-------|
| Monthly spend | $4,200 | $1,680 |
| Avg CPU utilization | 12% | 68% |
| Checkout load time | 4.2s | 0.8s |
| Server count | 11 | 4 (auto-scaling) |
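The headline percentage follows directly from the spend figures in the table. As a quick check, this short Python sketch uses only numbers stated in this post; the variable names are ours.

```python
before_monthly = 4200      # USD per month, from the results table
after_monthly = 1680       # USD per month, from the results table
idle_server_saving = 840   # USD per month, from the Week 1 shutdown

monthly_saving = before_monthly - after_monthly
reduction_percent = 100 * monthly_saving / before_monthly
idle_share_percent = 100 * idle_server_saving / monthly_saving

print(f"Monthly saving: ${monthly_saving:,}")
print(f"Reduction: {reduction_percent:.0f}%")
print(f"Share of saving from idle servers: {idle_share_percent:.0f}%")
```

The saving works out to $2,520/month, a 60% reduction, with about a third of it coming from the three idle servers alone.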

60% cost reduction. Better performance. Smaller attack surface.

This is exactly what our Proactive Management tier includes — systematic infrastructure review, not just keeping the lights on.
