Building resilient cloud systems that stay calm in production.
I work on mission-critical infrastructure where uptime, change safety, and delivery speed matter most. My focus is operational clarity: automate the routine, instrument the unknowns, and keep teams confident.
My Journey
Why I chase reliability.
Every system failure I've seen has taught me something about clarity, preparation, and calm under pressure.
The Start
I began in support and systems administration, spending late nights troubleshooting production issues with cold coffee and a terminal. I learned that the best engineers aren't the ones who fix things fastest; they're the ones who make failures rare and recoverable.
The Pivot
Moving into cloud engineering, I kept seeing the same painful patterns: undocumented infrastructure, fragile deployments, and teams afraid to change production. So I committed to infrastructure as code, observability, and SRE culture — not as buzzwords, but as survival skills.
The Mission
Today I help teams build systems they can trust. Whether it's a zero-downtime migration, a CI/CD overhaul, or a cost optimization project, my goal is the same: calm, predictable operations — and a team that sleeps well at night.
Case Studies
Infrastructure projects framed by product impact.
These are the kinds of problems I enjoy solving: operational drag, release friction, and risky migrations.
Featured Project
GitLab AI MCP
A high-performance, containerized Model Context Protocol (MCP) server for integrating with self-hosted GitLab, featuring local AI triage and token-efficient data bundling.
Outcome
Privacy-first CI/CD debugging and an 80% reduction in LLM context usage through intelligent response bundling.
Pipeline Throughput Upgrade
Our release pipelines were getting slower as the team grew. I refined build caching and runner orchestration so developers stopped waiting around for feedback.
Outcome
Shorter feedback loops and fewer deployment bottlenecks.
Internal project — details available on request
Operational Reporting Automation
I used serverless AWS primitives to automate recurring reports that were eating up hours of manual work every week. Less copy-paste, more time for actual engineering.
Outcome
Less manual reporting and more repeatable cloud operations.
Internal project — details available on request
Zero-Downtime Linux Migration
Moving 120+ servers from CentOS to AlmaLinux without breaking payment-facing services was one of the trickiest projects I have led. Staged rollouts and service validation were our safety net.
Outcome
Modernized estate without disrupting payment-facing services.
Internal project — details available on request
Tools I Use
Tech stack I actually run in production.
No resume padding, just the platforms, languages, and tools I touch day to day.
Cloud & Infrastructure
- AWS (Production)
- Terraform (Production)
- Docker (Production)
- Kubernetes (Learning)
DevOps & CI/CD
- GitLab CI/CD (Production)
- GitHub Actions (Production)
- BuildKit (Production)
Languages & Runtimes
- Python (Production)
- Go (Building)
- Bash (Production)
- PowerShell (Production)
Observability & Ops
- CloudWatch (Production)
- New Relic (Production)
- Prometheus (Production)
- Grafana (Production)
Credentials
Certifications & Credentials
Formal validation of the skills I apply every day in production environments.
AWS Certified Solutions Architect
Associate (SAA-C03)
Earned. Validates skills in VPC design, high availability, cost-optimized architectures, and secure workload design.
AWS Certified Cloud Practitioner
Foundational
Earned. Core AWS services, security fundamentals, billing, and architectural best practices.
Blog
Notes from the field.
Writing about infrastructure migration, pipeline efficiency, and site reliability patterns — the things I learn and re-learn in production.
Architecting a High-Performance MCP Server for Enterprise AI-Driven DevOps
How I engineered a custom Model Context Protocol (MCP) server to bridge the gap between LLMs and enterprise GitLab infrastructure.
Modernizing CI/CD for Mission-Critical Systems with GitLab and ECS Fargate
How we optimized Docker builds using BuildKit and designed auto-scaling runners for mission-critical deployments.
The Zero-Downtime Blueprint: CentOS to AlmaLinux Migration
A deep dive into our strategy for migrating mission-critical merchant portals with zero user impact.
Serverless FinOps: Automated Cost Reporting on AWS
Leveraging AWS Lambda, EventBridge, and SES to build a custom, real-time cost monitoring and reporting system.
Get in Touch
Let's work together.
Whether it's a full project, a quick consultation, or just a question about AWS — I'm happy to connect.
I usually reply within 24–48 hours.