Ops & Infrastructure Engineer

Cost-Aware & Scalable
Cloud Infrastructure

The Bridge between Code & Cloud. I design systems that scale automatically with traffic but remain cost-efficient during downtime.

Scalable Systems
FinOps Mindset
Terraform Expert

Featured In-Depth Projects

Cloud-Native QR Code Platform
AWS EKS, LocalStack, Terraform, ArgoCD, Prometheus, Trivy
Enterprise Standard

Cloud-Native QR Code Platform: DevSecOps Implementation

The Challenge: Developing cloud-native applications often incurs high costs due to cloud dependencies, and manual deployments to Kubernetes often lead to configuration drift.
The Solution: I architected a "Local-First" development environment and a GitOps-based delivery pipeline to ensure security, consistency, and zero-cost testing.

Zero-Cost Infrastructure Testing: Engineered a local AWS environment using LocalStack and Docker Compose, allowing the team to validate Terraform configurations without spending real AWS credits.

GitOps Delivery Model: Implemented ArgoCD to synchronize the Kubernetes cluster state with the Git repository, enabling automated self-healing and separating CI from CD.
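The reconciliation idea behind this model can be sketched in a few lines. This is not ArgoCD's implementation, only an illustration of the loop it automates; `desired` and `live` are hypothetical dictionaries standing in for the manifests in Git and the objects running in the cluster.

```python
def diff_state(desired: dict, live: dict) -> dict:
    """Return the actions needed to make `live` match `desired`."""
    actions = {"create": [], "update": [], "delete": []}
    for name, spec in desired.items():
        if name not in live:
            actions["create"].append(name)
        elif live[name] != spec:
            actions["update"].append(name)
    for name in live:
        if name not in desired:
            actions["delete"].append(name)
    return actions


def reconcile(desired: dict, live: dict) -> dict:
    """Apply the diff so the live state converges on what Git declares."""
    actions = diff_state(desired, live)
    for name in actions["create"] + actions["update"]:
        live[name] = desired[name]
    for name in actions["delete"]:
        del live[name]
    return actions


# Hypothetical drift: someone scaled the API by hand and left a debug pod behind.
desired = {"api": {"replicas": 3}, "worker": {"replicas": 2}}
live = {"api": {"replicas": 5}, "debug-pod": {"replicas": 1}}
actions = reconcile(desired, live)
```

After one pass the manual change is reverted and the stray object is gone, which is the "self-healing" behaviour described above.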

Security First: Replaced static AWS keys with IRSA (IAM Roles for Service Accounts) and integrated Trivy to scan container images.

Impact

  • Reduced development infrastructure costs by 100% during the testing phase.
  • Achieved zero configuration drift between Staging and Production.

Madabank - High-Availability Banking
VPS, Docker Compose, Nginx, Go, Prometheus, Grafana, Loki

Madabank - High-Availability Banking: Infrastructure Migration

The Challenge: The initial deployment on AWS ECS incurred monthly costs that were prohibitive for the startup phase, and the team lacked visibility into application-level errors.
The Solution: I led the migration to a high-performance VPS architecture and built a comprehensive observability suite to monitor not just server health, but business logic.

Cost-Efficient Migration: Containerized the Go microservices using Docker Compose and orchestrated them on a VPS with Nginx as a reverse proxy, slashing infrastructure bills by ~60%.

Business-Centric Observability: Deployed the PLG Stack (Prometheus, Loki, Grafana). Created dashboards monitoring "Transaction Success Rates" and "Fraud Attempts".

Intelligent Alerting: Configured Alertmanager to trigger Slack notifications only for critical anomalies, reducing alert fatigue.
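The filtering rule is simple enough to show directly. In Alertmanager this is expressed declaratively through route matchers; the Python below is only a sketch of the same decision, and the alert names and severity labels are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical severity scale, lowest to highest.
SEVERITY_ORDER = {"info": 0, "warning": 1, "critical": 2}


@dataclass(frozen=True)
class Alert:
    name: str
    severity: str  # one of the keys in SEVERITY_ORDER


def should_notify(alert: Alert, min_severity: str = "critical") -> bool:
    """Page a human only when the alert meets the minimum severity."""
    return SEVERITY_ORDER[alert.severity] >= SEVERITY_ORDER[min_severity]


alerts = [
    Alert("HighLatency", "warning"),
    Alert("TransactionFailureSpike", "critical"),
    Alert("DiskUsage70Percent", "info"),
]
to_slack = [a.name for a in alerts if should_notify(a)]
```

Lower-severity alerts still exist and can be reviewed on a dashboard; they just do not interrupt anyone.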

Impact

  • Cut infrastructure spend by ~60% while maintaining 99.9% uptime.
  • Reduced Mean Time to Resolution (MTTR) for bugs by pinpointing logs instantly via Loki.

Automated Mobile Release Pipeline
GitHub Actions, Fastlane, Xcodegen, Ruby

Automated Mobile Release Pipeline: iOS CI/CD

The Challenge: Manual iOS deployments were prone to human error (wrong provisioning profiles) and took valuable developer time (2+ hours/week) to upload to TestFlight.
The Solution: I orchestrated a fully automated pipeline that treats mobile releases like server deployments: codified, consistent, and automatic.

Infrastructure as Code for Project Files: Utilized Xcodegen to generate .xcodeproj files on the fly, eliminating endless merge conflicts in the project.pbxproj file.

Automated Signing & Delivery: Configured Fastlane Match to handle code signing certificates securely and automated the upload process to TestFlight.

Strict Quality Gates: The pipeline automatically rejects any commit that fails SwiftLint or Unit Tests.

Impact

  • 0% Crash Rate in production due to enforced testing gates.
  • Saved ~20 engineering hours per month by automating the release process.

Portfolio Site - Edge Deployment
Next.js, Vercel, GitHub Actions

Portfolio Site - Edge Deployment: Modern Web Workflow

The Challenge: Needed a highly performant, globally available personal website with zero downtime during updates.
The Solution: Implemented a modern CI/CD workflow leveraging Edge Computing.

Automated Verification: GitHub Actions run ESLint and Unit Tests before allowing a merge to the main branch.

Atomic Deployments: Utilized Vercel for immutable deployments—every commit creates a unique preview URL, and the main branch updates instantly with zero downtime.

DDoS Protection: Configured edge-level rate limiting so abusive traffic spikes are throttled before they reach the application.
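Rate limiting of this kind is commonly built on a token bucket. The sketch below shows that general technique, not the edge provider's actual mechanism; the capacity and refill rate are hypothetical values, and time is passed in explicitly so the behaviour is easy to follow.

```python
class TokenBucket:
    """Allow short bursts up to `capacity`, then a steady `refill_per_second`."""

    def __init__(self, capacity: int, refill_per_second: float) -> None:
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.tokens = float(capacity)
        self.last_refill = 0.0

    def allow(self, now: float) -> bool:
        # Top up the bucket for the time elapsed since the last request.
        elapsed = max(0.0, now - self.last_refill)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.last_refill = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# Hypothetical limit: bursts of 3 requests, refilled at 1 request per second.
bucket = TokenBucket(capacity=3, refill_per_second=1.0)
burst = [bucket.allow(now=0.0) for _ in range(5)]  # five requests at once
after_wait = bucket.allow(now=2.0)                 # one request two seconds later
```

The first three requests in the burst pass, the next two are rejected, and the client is served again once the bucket has refilled.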

Impact

  • Achieved perfect Lighthouse scores (100/100) and sub-second global load times.

How I Think: Architectural Decisions

Why LocalStack over AWS for Dev?

To eliminate billing anxiety and enable offline development. It allows engineers to fail fast and iterate without waiting for cloud provisioning.

Why VPS over ECS?

At our scale, the management overhead of ECS outweighed the benefits. Docker Compose provided sufficient orchestration at roughly 40% of the cost.

Why Fastlane?

Because relying on Xcode's "Archive" button is not reproducible. CI/CD requires CLI tools, not GUI interactions, to ensure consistent builds every time.

Reproducible Infrastructure with Terraform

I don't do "ClickOps". Every piece of infrastructure—from VPC networks to Database instances—is defined as code. This ensures consistency across staging and production environments and enables rapid disaster recovery.

Modules, State Locking, Multi-Cloud

Additional Case Studies

Case Study #1

Cost Optimization for Low-Frequency Tasks

"A client needs a script that generates a PDF report once every day at 8:00 AM."

The Problem

Traditional Approach: Spinning up a dedicated EC2 instance (VM) running 24/7 with a cron job.

DevOps Solution

Utilize Serverless Functions (AWS Lambda / Google Cloud Functions) triggered on a schedule by EventBridge or Cloud Scheduler.

Architecture Rationale

An EC2 instance bills for idle time (nearly 24 hours/day for this job). Serverless functions use a "Pay-as-you-go" model and cost close to zero here, since the function runs for only seconds each day.
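The arithmetic behind this comparison can be sketched as follows. Every rate here is a hypothetical placeholder rather than actual provider pricing, and per-request fees and free tiers are left out to keep the comparison readable.

```python
HOURS_PER_MONTH = 730  # average number of hours in a month


def always_on_monthly_cost(hourly_rate: float) -> float:
    """A VM is billed for every hour, whether or not the job is running."""
    return hourly_rate * HOURS_PER_MONTH


def serverless_monthly_cost(
    runs_per_month: int,
    seconds_per_run: float,
    memory_gb: float,
    rate_per_gb_second: float,
) -> float:
    """A function is billed only for the compute time it actually uses."""
    return runs_per_month * seconds_per_run * memory_gb * rate_per_gb_second


# Hypothetical inputs: one 20-second report per day on a 0.5 GB function.
vm_cost = always_on_monthly_cost(hourly_rate=0.01)
fn_cost = serverless_monthly_cost(
    runs_per_month=30,
    seconds_per_run=20,
    memory_gb=0.5,
    rate_per_gb_second=0.00002,
)
```

With these placeholder numbers the function uses 300 GB-seconds a month, which is a small fraction of the cost of keeping a VM running around the clock.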

Case Study #2

Handling Flash Sale Traffic

"An e-commerce app expects a 100x spike in traffic during a 'Harbolnas' (Double Date) flash sale."

The Problem

A fixed number of servers will either crash under load or waste money if over-provisioned beforehand.

DevOps Solution

Implement Horizontal Pod Autoscaling (HPA) on Kubernetes combined with Cluster Autoscaler. Additionally, offload static assets to a CDN (Cloudflare/CloudFront).

Architecture Rationale

HPA automatically adds replicas based on CPU/Memory usage, while the CDN reduces the load on the origin server by caching content at the edge.
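The scaling decision follows the formula given in the Kubernetes documentation: desired replicas = ceil(current replicas × current metric / target metric). A minimal sketch, assuming hypothetical utilization figures and replica bounds, and ignoring the tolerance band and stabilization windows the real controller applies:

```python
import math


def desired_replicas(
    current_replicas: int,
    current_utilization: float,
    target_utilization: float,
    min_replicas: int = 1,
    max_replicas: int = 10,
) -> int:
    """Scale replica count in proportion to how far the metric is from target."""
    raw = math.ceil(current_replicas * current_utilization / target_utilization)
    # Clamp to the configured bounds, as the autoscaler does.
    return max(min_replicas, min(max_replicas, raw))


# Hypothetical flash-sale moment: 4 pods at 90% CPU against a 60% target.
scale_up = desired_replicas(4, current_utilization=90, target_utilization=60)
# After the sale: 4 pods at 30% CPU against the same target.
scale_down = desired_replicas(4, current_utilization=30, target_utilization=60)
```

When the new pods no longer fit on existing nodes, the Cluster Autoscaler is what adds nodes for them to land on.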

Case Study #3

Solving "It Works on My Machine"

"Developers complain that code runs perfectly on their MacBook but crashes on the Staging server due to different library versions."

DevOps Solution

Implement Containerization (Docker) for development and production environments, ensuring parity. Use Infrastructure as Code (Terraform) to provision identical infrastructure.

Architecture Rationale

Docker ensures the application runtime (OS, dependencies) is immutable and identical across all environments, eliminating environment drift.

Case Study #4

Managing Secrets & Credentials Securely

"A developer accidentally commits a .env file containing AWS Access Keys and Database Passwords to a public GitHub repository."

DevOps Solution

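Revoke and rotate the exposed credentials immediately, since anything pushed to a public repository must be treated as compromised, then purge the file from Git history. Implement a centralized Secret Manager (AWS Secrets Manager / HashiCorp Vault). Inject secrets into containers only at runtime via environment variables.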

Architecture Rationale

Hardcoded secrets are a massive security risk. Centralized management allows for rotation, audit logging, and granular access control.
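On the application side, runtime injection means the code reads secrets from its environment and refuses to start without them. A minimal sketch; `DB_PASSWORD` and `MissingSecretError` are hypothetical names, and the optional `environ` argument exists only so the function can be exercised without touching the real environment.

```python
import os
from typing import Mapping, Optional


class MissingSecretError(RuntimeError):
    """Raised at startup when a required secret was not injected."""


def load_secret(name: str, environ: Optional[Mapping[str, str]] = None) -> str:
    """Read a secret injected at runtime; never fall back to a hardcoded value."""
    env = os.environ if environ is None else environ
    value = env.get(name)
    if not value:
        # Fail fast and name the variable, but never log the value itself.
        raise MissingSecretError(f"required secret {name!r} is not set")
    return value


# Hypothetical usage with an injected environment.
injected = {"DB_PASSWORD": "example-only"}
db_password = load_secret("DB_PASSWORD", injected)
```

Failing at startup makes a missing or mis-scoped secret visible during deployment instead of at the first database call.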

Case Study #5

Reducing Slow CI/CD Build Times

"The deployment pipeline takes 45 minutes to finish, causing developers to wait too long to see their changes."

DevOps Solution

Implement Dependency Caching (e.g., caching node_modules) and utilize Docker Layer Caching. Parallelize independent test jobs.

Architecture Rationale

Most build time is wasted re-downloading dependencies. Caching and parallelism can often reduce build time from 45 mins to under 10 mins.
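Dependency caching works because the cache key is derived from the lockfile: unchanged dependencies produce the same key and a cache hit, and any change produces a new key. A sketch of that idea, with a hypothetical `deps` prefix and made-up lockfile contents:

```python
import hashlib


def cache_key(prefix: str, lockfile_bytes: bytes) -> str:
    """Build a cache key that changes only when the lockfile changes."""
    digest = hashlib.sha256(lockfile_bytes).hexdigest()[:16]
    return f"{prefix}-{digest}"


# Hypothetical lockfile contents before and after a dependency bump.
lock_v1 = b'{"left-pad": "1.3.0"}'
lock_v2 = b'{"left-pad": "1.3.1"}'

key_first_build = cache_key("deps", lock_v1)
key_rebuild = cache_key("deps", lock_v1)       # same lockfile, cache hit
key_after_bump = cache_key("deps", lock_v2)    # changed lockfile, cache miss
```

Docker layer caching relies on the same principle: a layer is reused as long as the inputs that produced it are unchanged.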

Case Study #6

Database Performance for Read-Heavy Apps

"A news portal application is slow because thousands of users are reading articles simultaneously, putting high load on the database."

DevOps Solution

Implement Read Replicas for the database. Direct read queries (SELECT, typically behind GET requests) to the replicas and send write queries (INSERT/UPDATE/DELETE, typically behind POST/PUT requests) only to the Primary DB. Add a Redis layer for caching.

Architecture Rationale

Separating Read and Write concerns prevents the primary database from being overwhelmed, while Redis serves hot data from memory (typically sub-millisecond latency).
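Both halves of this design fit in a short sketch: a router that sends reads to replicas in turn, and a cache-aside lookup that touches the database only on a miss. The host names are hypothetical, and a plain dictionary stands in for Redis.

```python
import itertools
from typing import Callable, Dict, List


class QueryRouter:
    """Send SELECTs to replicas in round-robin order, everything else to primary."""

    def __init__(self, primary: str, replicas: List[str]) -> None:
        self.primary = primary
        self._replicas = itertools.cycle(replicas)

    def route(self, sql: str) -> str:
        parts = sql.split()
        verb = parts[0].upper() if parts else ""
        return next(self._replicas) if verb == "SELECT" else self.primary


def get_article(article_id: int, cache: Dict[int, dict],
                load_from_db: Callable[[int], dict]) -> dict:
    """Cache-aside: serve from the cache, fall back to the database on a miss."""
    if article_id in cache:
        return cache[article_id]
    article = load_from_db(article_id)
    cache[article_id] = article
    return article


# Hypothetical hosts and queries.
router = QueryRouter("primary-db", ["replica-1", "replica-2"])
targets = [
    router.route("SELECT * FROM articles"),
    router.route("UPDATE articles SET views = views + 1"),
    router.route("select title FROM articles"),
]
```

Replicas lag slightly behind the primary, so a read that must see a write made a moment ago is usually routed to the primary instead.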

Case Study #7

Zero-Downtime Deployment

"Users experience errors or 'Service Unavailable' pages every time the team deploys a new version of the backend."

DevOps Solution

Adopt Blue/Green Deployment or Rolling Updates with Kubernetes.

Architecture Rationale

In Blue/Green, the new version (Green) is deployed alongside the old (Blue). Traffic is switched only after Green is healthy. If issues arise, switching back to Blue is instant.
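The switch itself is a guarded swap. The sketch below models it with a hypothetical `BlueGreenRouter`; in practice the swap is a load balancer or Service selector change, and `is_healthy` would be a real health check.

```python
from typing import Callable


class BlueGreenRouter:
    """Track which environment receives traffic and which one is on standby."""

    def __init__(self, live: str, idle: str) -> None:
        self.live = live
        self.idle = idle

    def promote(self, is_healthy: Callable[[str], bool]) -> bool:
        """Switch traffic to the idle environment only if it passes its check."""
        if not is_healthy(self.idle):
            return False
        self.live, self.idle = self.idle, self.live
        return True

    def rollback(self) -> None:
        """Point traffic back at the previous environment, which is still running."""
        self.live, self.idle = self.idle, self.live


router = BlueGreenRouter(live="blue", idle="green")
blocked = router.promote(lambda env: False)   # Green fails its check
live_after_block = router.live
switched = router.promote(lambda env: True)   # Green is healthy
live_after_switch = router.live
router.rollback()
live_after_rollback = router.live
```

Users never see a half-deployed version because traffic moves only between two complete environments.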

Case Study #8

Disaster Recovery & High Availability

"The main data center in Jakarta (Region A) goes down due to a flood. The entire banking app goes offline."

DevOps Solution

Architect a Multi-AZ (Availability Zone) setup where resources are spread across physically separate data centers. For critical data, enable Cross-Region Replication.

Architecture Rationale

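If one Zone fails, the Load Balancer automatically redirects traffic to the healthy instances in another Zone, ensuring business continuity. If the entire region is lost, the cross-region replicas provide the data needed to restore service in a second region.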

Case Study #9

Debugging in Microservices (Observability)

"A user transaction fails, but the error spans across 5 different microservices, making it impossible to find the root cause by looking at individual server logs."

DevOps Solution

Implement Distributed Tracing (Jaeger / OpenTelemetry) and Centralized Logging (ELK Stack / Loki). Assign a unique TraceID to every request.

Architecture Rationale

A TraceID allows engineers to visualize the entire journey of a request across all services to pinpoint exactly where the latency or error occurred.
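The mechanism can be sketched inside a single process. Here `contextvars` carries the ID across function calls; between real services it would travel in a request header instead. The service names and messages are hypothetical, and a list stands in for the centralized log store.

```python
import contextvars
import uuid
from typing import List, Optional

trace_id_var = contextvars.ContextVar("trace_id", default="-")
log_lines: List[str] = []  # stand-in for a centralized log store


def log(service: str, message: str) -> None:
    """Stamp every log line with the current request's trace ID."""
    log_lines.append(f"trace={trace_id_var.get()} service={service} msg={message}")


def payment_service() -> None:
    log("payment", "charge declined")


def order_service() -> None:
    log("order", "creating order")
    payment_service()


def handle_request(incoming_trace_id: Optional[str] = None) -> str:
    """Reuse the caller's trace ID if present, otherwise start a new trace."""
    trace_id = incoming_trace_id or uuid.uuid4().hex
    token = trace_id_var.set(trace_id)
    try:
        order_service()
    finally:
        trace_id_var.reset(token)
    return trace_id


handled = handle_request("abc123")
```

Searching the log store for `trace=abc123` now returns the full path of that one request, including the service where it failed.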

Case Study #10

Storage Optimization

"An app allows users to upload profile photos. Storing them on the web server's disk (Block Storage) is becoming expensive and hard to back up."

DevOps Solution

Offload static files to Object Storage (AWS S3 / GCS). Implement Lifecycle Policies to move old, rarely accessed logs/files to cheaper storage classes (e.g., S3 Glacier).

Architecture Rationale

Object storage scales with practically no capacity planning and is much cheaper than Block Storage (EBS) for unstructured data. Lifecycle policies automate cost savings for "cold" data.
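A lifecycle policy is essentially an age-based rule. In S3 it is declared as configuration on the bucket; the Python below only illustrates the decision it encodes. The 30- and 90-day thresholds and the tier labels are hypothetical.

```python
from datetime import date


def storage_tier(
    created: date,
    today: date,
    warm_after_days: int = 30,
    cold_after_days: int = 90,
) -> str:
    """Pick a storage tier from the object's age since creation."""
    age_days = (today - created).days
    if age_days >= cold_after_days:
        return "archive"
    if age_days >= warm_after_days:
        return "infrequent-access"
    return "standard"


today = date(2024, 6, 1)
recent_upload = storage_tier(date(2024, 5, 25), today)   # 7 days old
older_upload = storage_tier(date(2024, 4, 15), today)    # 47 days old
stale_upload = storage_tier(date(2024, 1, 1), today)     # 152 days old
```

Archive tiers trade retrieval speed for price, so they suit data that is kept for compliance or history rather than served to users.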