Open to full-time & consulting roles

Building resilient cloud systems that stay calm in production.

I work on mission-critical infrastructure where uptime, change safety, and delivery speed matter most. My focus is operational clarity: automate the routine, instrument the unknowns, and keep teams confident.

Wafiy Firdaus working on cloud infrastructure
Years in production: 1 yr 7 mos
AWS certs earned: 2
Incidents responded to: 30+
Automations shipped: 50+

My Journey

Why I chase reliability.

Every system failure I've seen taught me something about clarity, preparation, and calm under pressure.

1. The Start

I began in support and systems administration, spending late nights troubleshooting production issues with cold coffee and a terminal. I learned that the best engineers aren't the ones who fix fast — they're the ones who make failures rare and recoverable.

2. The Pivot

Moving into cloud engineering, I kept seeing the same painful patterns: undocumented infrastructure, fragile deployments, and teams afraid to change production. So I committed to infrastructure as code, observability, and SRE culture — not as buzzwords, but as survival skills.

3. The Mission

Today I help teams build systems they can trust. Whether it's a zero-downtime migration, a CI/CD overhaul, or a cost optimization project, my goal is the same: calm, predictable operations — and a team that sleeps well at night.

Case Studies

Infrastructure projects framed by product impact.

These are the kinds of problems I enjoy solving: operational drag, release friction, and risky migrations.

Featured Project

GitLab AI MCP

A high-performance, containerized Model Context Protocol (MCP) server for integrating with self-hosted GitLab, featuring local AI triage and token-efficient data bundling.

LLM context reduction: -80%
Local inference: 100%

Outcome

Privacy-first CI/CD debugging and an 80% reduction in LLM context usage through response bundling.

GitLab MCP Python Docker AI
View on GitHub

Pipeline Throughput Upgrade

Our release pipelines were getting slower as the team grew. I refined build caching and runner orchestration so developers stopped waiting around for feedback.

Build time: -45%
Queue wait: -60%

Outcome

Shorter feedback loops and fewer deployment bottlenecks.

GitLab BuildKit Terraform

Internal project — details available on request

Operational Reporting Automation

I used serverless AWS primitives to automate recurring reports that were eating up hours of manual work every week. Less copy-paste, more time for actual engineering.

Hours saved: 12/wk
Error rate: -90%

Outcome

Less manual reporting and more repeatable cloud operations.

Lambda EventBridge Python

Internal project — details available on request

Zero-Downtime Linux Migration

Moving 120+ servers from CentOS to AlmaLinux without breaking payment-facing services was one of the trickiest projects I have led. Staged rollouts and service validation were our safety net.

Downtime: 0 min
Servers migrated: 120+

Outcome

Modernized the server estate without disrupting payment-facing services.

Linux Migration Compliance

Internal project — details available on request
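
For readers curious what "staged rollouts and service validation" can look like in practice, here is a minimal Python sketch. It is an illustration only, not the tooling used on this project: the function name, the batch size, and the migrate and validate callables are all hypothetical placeholders. The idea is to migrate a small batch, check every host in it, and stop before touching the next batch if anything fails.

```python
from typing import Callable, List, Tuple


def staged_rollout(
    hosts: List[str],
    migrate: Callable[[str], None],   # hypothetical: performs the OS migration on one host
    validate: Callable[[str], bool],  # hypothetical: returns True if services on the host are healthy
    batch_size: int = 10,             # assumed default; tune to your blast-radius tolerance
) -> Tuple[List[str], List[str]]:
    """Migrate hosts in batches and halt at the first batch with a failed check.

    Returns (healthy, failed): hosts that were migrated and passed validation,
    and hosts from the halting batch that did not pass.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    healthy: List[str] = []
    for start in range(0, len(hosts), batch_size):
        batch = hosts[start:start + batch_size]

        # Migrate the whole batch before validating it.
        for host in batch:
            migrate(host)

        # Validate each migrated host exactly once.
        failed = [host for host in batch if not validate(host)]
        healthy.extend(host for host in batch if host not in failed)

        # Stop here so later batches are never touched after a failure.
        if failed:
            return healthy, failed

    return healthy, []
```

Keeping batches small limits how many servers can be in a bad state at once, and halting on the first failure leaves the remaining hosts on the old, known-good OS while the problem is investigated.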

Tools I Use

Tech stack I actually run in production.

No resume padding: just the platforms, languages, and tools I touch day to day.

Cloud & Infrastructure

  • AWS (Production)
  • Terraform (Production)
  • Docker (Production)
  • Kubernetes (Learning)

DevOps & CI/CD

  • GitLab CI/CD (Production)
  • GitHub Actions (Production)
  • BuildKit (Production)

Languages & Runtimes

  • Python (Production)
  • Go (Building)
  • Bash (Production)
  • PowerShell (Production)

Observability & Ops

  • CloudWatch (Production)
  • New Relic (Production)
  • Prometheus (Production)
  • Grafana (Production)

Credentials

Certifications & Credentials

Formal validation of the skills I apply every day in production environments.

☁️

AWS Certified Solutions Architect

Associate (SAA-C03)

Earned

Validated skills in VPC design, high availability, cost-optimized architectures, and secure workload design.

☁️

AWS Certified Cloud Practitioner

Foundational

Earned

Covers core AWS services, security fundamentals, billing, and architectural best practices.

Blog

Notes from the field.

Writing about infrastructure migration, pipeline efficiency, and site reliability patterns — the things I learn and re-learn in production.

GitLab MCP Python Docker AI

Architecting a High-Performance MCP Server for Enterprise AI-Driven DevOps

How I engineered a custom Model Context Protocol (MCP) server to bridge the gap between LLMs and enterprise GitLab infrastructure.

Apr 7, 2026

AWS ECS Terraform GitLab CI

Modernizing CI/CD for Mission-Critical Systems with GitLab and ECS Fargate

How we optimized Docker builds using BuildKit and designed auto-scaling runners for mission-critical deployments.

Mar 25, 2026

Linux SRE Migration High-Availability

The Zero-Downtime Blueprint: CentOS to AlmaLinux Migration

A deep dive into our strategy for migrating mission-critical merchant portals with zero user impact.

Mar 10, 2026

Serverless AWS Lambda Python FinOps

Serverless FinOps: Automated Cost Reporting on AWS

Using AWS Lambda, EventBridge, and SES to build a custom, real-time cost monitoring and reporting system.

Feb 15, 2026

Get in Touch

Let's work together.

Whether it's a full project, a quick consultation, or just a question about AWS — I'm happy to connect.

Connect on LinkedIn

I usually reply within 24–48 hours.