Deploying on AWS EKS with Terraform, Jenkins & Kubernetes


This is a technical walkthrough of an end-to-end DevOps project where I provisioned AWS infrastructure using Terraform, set up a Jenkins CI/CD pipeline, containerized a full-stack application, and deployed it to an EKS Kubernetes cluster — all on t3.micro instances under free-tier constraints.

GitHub Repository: github.com/imdibr/k8s_e2e


Architecture Overview

The system has five layers:

| Layer | Technology | Details |
| --- | --- | --- |
| IaC | Terraform | 3 modules (vpc, eks, jenkins), 41 managed resources |
| Networking | Custom VPC | 10.0.0.0/16 with 2 public + 2 private subnets, IGW, NAT |
| CI/CD | Jenkins on EC2 | t3.micro, Ubuntu 22.04, IAM Instance Profile |
| Orchestration | Amazon EKS | Kubernetes 1.32, 2x t3.micro worker nodes |
| Containers | Docker + ECR | Frontend (nginx:alpine) + Backend (node:18-alpine) |

The traffic flow: User → ALB (internet-facing) → Ingress (path-based routing) → ClusterIP Services → Pods

[Figure: Architecture diagram]

Phase 1 — VPC & Networking (Terraform)

Everything starts with the network. I built a custom VPC with proper subnet isolation:

VPC: 10.0.0.0/16
├── Public Subnet 1:  10.0.0.0/24  (ap-south-1a) → Jenkins, ALB, NAT Gateway
├── Public Subnet 2:  10.0.1.0/24  (ap-south-1b) → ALB multi-AZ
├── Private Subnet 1: 10.0.10.0/24 (ap-south-1a) → EKS Worker Node 1
└── Private Subnet 2: 10.0.11.0/24 (ap-south-1b) → EKS Worker Node 2

The Terraform VPC module handles all of this — IGW for public subnets, NAT Gateway with an Elastic IP for private subnet outbound traffic, and separate route tables for each.
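As a quick sanity check on the layout above, here is a minimal Python sketch that confirms every subnet sits inside the VPC CIDR and that none overlap. The subnet names are labels I made up for illustration; the CIDRs are the ones listed above. This is not part of the Terraform module.

```python
import ipaddress

# CIDRs copied from the layout above; the dictionary keys are illustrative labels.
VPC = ipaddress.ip_network("10.0.0.0/16")
SUBNETS = {
    "public-1": ipaddress.ip_network("10.0.0.0/24"),
    "public-2": ipaddress.ip_network("10.0.1.0/24"),
    "private-1": ipaddress.ip_network("10.0.10.0/24"),
    "private-2": ipaddress.ip_network("10.0.11.0/24"),
}


def validate_layout(vpc, subnets):
    """Return a list of problems; an empty list means the layout is consistent."""
    problems = []
    names = list(subnets)
    for name in names:
        if not subnets[name].subnet_of(vpc):
            problems.append(f"{name} is outside the VPC")
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if subnets[a].overlaps(subnets[b]):
                problems.append(f"{a} overlaps {b}")
    return problems


def usable_ips(subnet):
    """AWS reserves 5 addresses per subnet (the first four and the last)."""
    return subnet.num_addresses - 5


print(validate_layout(VPC, SUBNETS))
print(usable_ips(SUBNETS["private-1"]))
```

Each /24 leaves 251 usable addresses, so on this cluster the binding limit is the per-instance ENI quota rather than subnet size.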

Security Groups

Security groups follow the principle of least privilege:

  • Jenkins SG: SSH and port 8080 locked to the operator's IP only, detected dynamically using data.http.myip in Terraform. No 0.0.0.0/0.

  • EKS Cluster SG: Allows all traffic within the VPC CIDR (10.0.0.0/16) for cluster-node communication.

  • ALB SG: Port 80 from 0.0.0.0/0 — this is the only internet-facing entry point.
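To make the "no 0.0.0.0/0 except the ALB" rule concrete, here is a small Python sketch. It turns a detected public IP into a single-host CIDR (roughly what the Terraform lookup feeds into the Jenkins SG) and audits a rule set for world-open sources. The rule dictionaries and the address 203.0.113.7 (a documentation-only IP) are hypothetical stand-ins, not the project's actual Terraform.

```python
import ipaddress

WORLD = "0.0.0.0/0"


def operator_cidr(detected_ip):
    """Convert a detected public IPv4 address into a single-host /32 CIDR."""
    # strip() handles the trailing newline that IP-echo services often return.
    return f"{ipaddress.IPv4Address(detected_ip.strip())}/32"


def world_open_rules(rules):
    """Return (group, port) pairs whose ingress source is the whole internet."""
    return [(r["group"], r["port"]) for r in rules if r["source"] == WORLD]


# Hypothetical rule set mirroring the three security groups described above.
me = operator_cidr("203.0.113.7\n")
rules = [
    {"group": "jenkins", "port": 22, "source": me},
    {"group": "jenkins", "port": 8080, "source": me},
    {"group": "eks-cluster", "port": "all", "source": "10.0.0.0/16"},
    {"group": "alb", "port": 80, "source": WORLD},
]

print(world_open_rules(rules))
```

If the audit returns anything other than the ALB on port 80, a rule has drifted from the intended policy.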

IAM Roles

Four IAM roles, zero static credentials:

| Role | Purpose |
| --- | --- |
| jenkins-ec2-role | Instance Profile: ECR push, EKS access |
| eks-cluster-role | EKS control plane |
| eks-node-role | Worker nodes: CNI, ECR pull |
| alb-controller-role | IRSA: ALB Controller assumes this via OIDC |

[Figure: Jenkins pipeline on the EC2 instance]

Phase 2 — EKS Cluster Setup

Cluster Configuration

The EKS cluster (devops-intern-cluster) runs Kubernetes 1.32 with worker nodes in private subnets only. Public + private endpoint access is enabled so both external kubectl and internal node communication work.

The node group uses t3.micro instances (free tier). This was a deliberate constraint that introduced real challenges — more on that later.

Cluster Add-ons (all provisioned via Terraform)

| Add-on | Method | Purpose |
| --- | --- | --- |
| CoreDNS | EKS managed add-on | Cluster DNS |
| kube-proxy | EKS managed add-on | Service networking |
| VPC CNI | EKS managed add-on | Pod networking with VPC IPs |
| AWS Load Balancer Controller | Helm chart (Terraform) | Creates ALB from Ingress |
| EBS CSI Driver | EKS managed add-on | PVC with EBS volumes |
| Metrics Server | Helm chart (Terraform) | HPA CPU metrics |

OIDC & IRSA

The ALB Controller needs AWS API access to create load balancers. Instead of hardcoding credentials, I set up an OIDC provider for EKS and used IAM Roles for Service Accounts (IRSA) — the controller's Kubernetes ServiceAccount is annotated with an IAM role ARN, and it assumes that role via web identity federation.
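The core of IRSA is the role's trust policy, which only lets a specific ServiceAccount assume the role through the cluster's OIDC provider. Here is a minimal Python sketch of what such a policy document looks like. The account ID, issuer URL, and ServiceAccount name below are placeholders I chose for illustration (the name follows the Helm chart's usual default); the real values come from Terraform outputs in the repository.

```python
import json


def irsa_trust_policy(account_id, oidc_issuer, namespace, service_account):
    """Build an IAM trust policy that lets one ServiceAccount assume a role."""
    # The OIDC provider is identified by the issuer URL without its scheme.
    scheme = "https://"
    provider = oidc_issuer[len(scheme):] if oidc_issuer.startswith(scheme) else oidc_issuer
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {
                "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{provider}"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {"StringEquals": {
                # Pin the role to exactly one ServiceAccount in one namespace.
                f"{provider}:sub": f"system:serviceaccount:{namespace}:{service_account}",
                f"{provider}:aud": "sts.amazonaws.com",
            }},
        }],
    }


# Placeholder values, not the project's real account or issuer.
policy = irsa_trust_policy(
    account_id="111122223333",
    oidc_issuer="https://oidc.eks.ap-south-1.amazonaws.com/id/EXAMPLE",
    namespace="kube-system",
    service_account="aws-load-balancer-controller",
)
print(json.dumps(policy, indent=2))
```

On the Kubernetes side, the ServiceAccount carries the `eks.amazonaws.com/role-arn` annotation pointing at the role that holds this trust policy.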

[Figures: listing all the nodes; listing pods in the namespace]

Phase 3 — Jenkins CI/CD Pipeline

Jenkins Setup

Jenkins runs on a t3.micro EC2 instance provisioned by Terraform. The user_data script installs Java 17, Jenkins (.war file), Docker, kubectl, and AWS CLI v2, and configures a 2GB swap file (because t3.micro has only 1GB RAM). The JVM heap is capped with -Xmx256m.

The Jenkinsfile — 7 Stages

Here's the pipeline skeleton, with each stage body summarized as a comment:

pipeline {
  agent any

  environment {
    AWS_REGION      = "ap-south-1"
    AWS_ACCOUNT_ID  = "984285320367"
    ECR_FRONTEND    = "${AWS_ACCOUNT_ID}.dkr.ecr.${AWS_REGION}.amazonaws.com/frontend"
    ECR_BACKEND     = "${AWS_ACCOUNT_ID}.dkr.ecr.${AWS_REGION}.amazonaws.com/backend"
    CLUSTER_NAME    = "devops-intern-cluster"
    NAMESPACE       = "devops"
    IMAGE_TAG       = "${BUILD_NUMBER}"
  }

  stages {
    stage('Checkout')         { /* Clone from GitHub using PAT */ }
    stage('Build & Test')     { /* Smoke tests */ }
    stage('Docker Build')     { /* Build both images with BUILD_NUMBER tag */ }
    stage('Push to ECR')      { /* aws ecr get-login-password, push :tag + :latest */ }
    stage('Deploy to EKS')    { /* kubectl apply individual manifests, set image */ }
    stage('Verify')           { /* kubectl rollout status --timeout=300s */ }
    stage('Notify')           { /* Print deployment summary */ }
  }
}

Key decisions:

  • No static AWS credentials: Jenkins uses an IAM Instance Profile, so aws ecr get-login-password picks up the role's temporary credentials without any stored keys.

  • Individual manifest applies — not kubectl apply -f k8s/ (which would deploy everything including PostgreSQL, eating up pod slots we don't have).

  • kubectl set image after apply — ensures the deployment picks up the new ECR image tag.
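The deploy logic described above (apply selected manifests, pin the image tag, then wait for the rollout) can be sketched in Python as a function that builds the kubectl argument lists. The manifest paths, deployment names, container names, and account ID here are hypothetical; the real pipeline runs equivalent shell steps from the Jenkinsfile.

```python
def ecr_image(account_id, region, repo, tag):
    """Compose a fully qualified ECR image reference."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repo}:{tag}"


def deploy_commands(manifests, namespace, images):
    """Return kubectl invocations for the deploy and verify stages.

    images maps deployment name -> (container name, image reference).
    """
    # Apply only the listed manifests, never the whole directory.
    cmds = [["kubectl", "apply", "-n", namespace, "-f", m] for m in manifests]
    for deployment, (container, image) in images.items():
        cmds.append(["kubectl", "set", "image", f"deployment/{deployment}",
                     f"{container}={image}", "-n", namespace])
    for deployment in images:
        cmds.append(["kubectl", "rollout", "status", f"deployment/{deployment}",
                     "-n", namespace, "--timeout=300s"])
    return cmds


# Hypothetical inputs: placeholder account ID, build number 42.
backend = ecr_image("111122223333", "ap-south-1", "backend", "42")
cmds = deploy_commands(
    manifests=["k8s/backend-deployment.yaml", "k8s/backend-service.yaml"],
    namespace="devops",
    images={"backend": ("backend", backend)},
)
for c in cmds:
    print(" ".join(c))
```

Listing manifests explicitly is what keeps PostgreSQL out of the automated deploy; anything not named in the list is simply never applied.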

[Figure: Jenkins pipeline stages]

Phase 4 — Kubernetes Deployment

Everything lives in the devops namespace.

Deployments

| Setting | Backend | Frontend |
| --- | --- | --- |
| Image | ECR backend:BUILD_NUMBER | ECR frontend:BUILD_NUMBER |
| Replicas | 1 | 1 |
| Port | 3000 | 80 |
| CPU | 20m–50m | 10m–30m |
| Memory | 32Mi–64Mi | 16Mi–48Mi |
| Liveness | HTTP GET /health | HTTP GET / |
| Readiness | HTTP GET /health | HTTP GET / |
| Strategy | RollingUpdate (maxSurge:0) | RollingUpdate (maxSurge:0) |

maxSurge: 0 is critical on t3.micro — we can't afford an extra pod during rollout.
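To see why maxSurge matters here, this small Python sketch computes the pod-count bounds of a rolling update from absolute maxSurge and maxUnavailable values. The post does not state maxUnavailable, so the value of 1 below is my assumption; percentages are not handled.

```python
def rollout_bounds(replicas, max_surge, max_unavailable):
    """Return peak pod count and minimum available pods during a rolling update.

    Only absolute integer values are modeled, not percentages.
    """
    if max_surge == 0 and max_unavailable == 0:
        # Kubernetes rejects this combination: the rollout could never progress.
        raise ValueError("maxSurge and maxUnavailable cannot both be 0")
    return {
        "peak_pods": replicas + max_surge,
        "min_available": max(replicas - max_unavailable, 0),
    }


# maxSurge: 0 with an assumed maxUnavailable: 1 on a single-replica deployment.
print(rollout_bounds(replicas=1, max_surge=0, max_unavailable=1))
# For comparison, surging by one pod first.
print(rollout_bounds(replicas=1, max_surge=1, max_unavailable=0))
```

The trade-off is visible in the numbers: with a single replica and no surge, the pod count never exceeds 1, but availability can drop to 0 for a moment while the old pod is replaced.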

Services & Ingress

Both services are ClusterIP (internal only). The AWS Load Balancer Controller creates an internet-facing Application Load Balancer from the Ingress, with path-based routing:

  • / → frontend-service

  • /api → backend-service

Target type is ip (direct pod IP targeting, not NodePort).
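The routing decision can be modeled in a few lines of Python. This sketch assumes both paths use Kubernetes Prefix matching (the post does not state the pathType), where a prefix matches whole path segments and the longest matching prefix wins. It simulates the behavior; it is not how the ALB is configured.

```python
# Path rules from the Ingress described above.
ROUTES = [("/api", "backend-service"), ("/", "frontend-service")]


def route(path, routes=ROUTES):
    """Pick the service whose prefix matches, preferring the longest prefix."""
    for prefix, service in sorted(routes, key=lambda r: len(r[0]), reverse=True):
        base = prefix.rstrip("/")
        # Match on whole segments: /api matches /api and /api/x, not /apiary.
        if path == base or path.startswith(base + "/"):
            return service
    return None


for p in ["/", "/index.html", "/api", "/api/users", "/apiary"]:
    print(p, "->", route(p))
```

Segment-wise matching is why a path like /apiary falls through to the frontend rather than the backend.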

HPA

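The backend deployment has an HPA: min 1, max 2 replicas, scaling at 60% CPU utilization. In practice on t3.micro there is rarely room for a second replica, and with Metrics Server scaled to 0 (see the pod capacity section below) the HPA has no CPU metrics to act on, but the configuration is in place.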
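For reference, the scaling decision follows the formula from the Kubernetes HPA documentation: desired = ceil(current replicas × current utilization / target utilization), clamped to the min/max bounds. A minimal Python sketch with this HPA's settings (the controller's tolerance band and stabilization windows are deliberately left out):

```python
import math


def desired_replicas(current, current_util, target_util, min_r, max_r):
    """Core HPA formula, clamped to the configured replica bounds."""
    raw = math.ceil(current * current_util / target_util)
    return max(min_r, min(max_r, raw))


# Settings from the backend HPA: target 60% CPU, 1 to 2 replicas.
print(desired_replicas(current=1, current_util=90, target_util=60, min_r=1, max_r=2))
print(desired_replicas(current=1, current_util=30, target_util=60, min_r=1, max_r=2))
print(desired_replicas(current=1, current_util=200, target_util=60, min_r=1, max_r=2))
```

Utilization is measured against the pod's CPU request, so with small requests even modest load can push the percentage past the target.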

Persistent Volume

PostgreSQL 15 runs as a StatefulSet with a 1Gi gp2 PVC provisioned by the EBS CSI Driver. It was excluded from the automated pipeline to save pod capacity, but the PVC was demonstrated in the Bound state.

[Figures: checking services and ReplicaSets; checking the Ingress; checking the PVC Bound state; checking the HPA]

The t3.micro Pod Capacity Problem

This deserves its own section because it was the biggest challenge of the project.

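With the VPC CNI (without prefix delegation), a node's pod limit is ENIs × (IPs per ENI − 1) + 2: each ENI's primary IP belongs to the node itself, and the +2 covers host-network pods such as aws-node and kube-proxy. A t3.micro supports 2 ENIs with 2 IPv4 addresses each, so that is 2 × (2 − 1) + 2 = 4 pod slots per node. With 2 nodes, that's 8 total slots. Here's how they were allocated: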

| Pod | Namespace | Notes |
| --- | --- | --- |
| aws-node (x2) | kube-system | VPC CNI DaemonSet, mandatory |
| kube-proxy (x2) | kube-system | iptables DaemonSet, mandatory |
| coredns | kube-system | DNS, mandatory |
| ALB Controller | kube-system | Creates ALB, mandatory |
| backend | devops | Application |
| frontend | devops | Application |

8/8 slots. Zero spare capacity. This meant:

  • PostgreSQL had to be excluded from the pipeline

  • Metrics Server was scaled to 0 replicas

  • maxSurge: 0 on rolling updates (can't create a new pod before killing the old one)

  • Any misconfiguration that spawned an extra pod would cascade into Pending states and ALB 503s
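The slot budget can be reproduced with the max-pods formula EKS publishes for the VPC CNI without prefix delegation: ENIs × (IPs per ENI − 1) + 2. A short Python sketch using the pod counts from the table above (the dictionary keys are just labels):

```python
def max_pods(enis, ips_per_eni):
    """EKS max-pods formula for the VPC CNI without prefix delegation."""
    return enis * (ips_per_eni - 1) + 2


# t3.micro: 2 ENIs, 2 IPv4 addresses per ENI.
per_node = max_pods(enis=2, ips_per_eni=2)
cluster_slots = 2 * per_node  # two worker nodes

# Pod counts taken from the allocation table.
pods = {
    "aws-node": 2,
    "kube-proxy": 2,
    "coredns": 1,
    "alb-controller": 1,
    "backend": 1,
    "frontend": 1,
}
used = sum(pods.values())

print(f"{used}/{cluster_slots} slots used, {cluster_slots - used} spare")
# For comparison, t3.medium has 3 ENIs with 6 IPv4 addresses each.
print("t3.medium per node:", max_pods(enis=3, ips_per_eni=6))
```

The same formula gives 17 pods per node on t3.medium, which is why the problem largely disappears one or two instance sizes up.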


Issues I Hit (and Fixed)

| Issue | What Happened | Fix |
| --- | --- | --- |
| ALB Controller CrashLoopBackOff | IRSA not configured | Created OIDC provider + IAM role with trust policy |
| ALB not created from Ingress | Subnets missing discovery tags | Added kubernetes.io/role/elb = 1 tag |
| Jenkins kubectl access denied | Missing EKS policies on IAM role | Added EKSClusterPolicy + WorkerNodePolicy |
| Pod Pending → ALB 503 | t3.micro ENI limit exhausted | Excluded PostgreSQL, scaled metrics-server to 0 |
| Jenkins unreachable after restart | Public IP changed, SG had old IP | terraform apply refreshes data.http.myip |
| Jenkins OOM | Default JVM heap too large for 1GB | -Xmx256m + 2GB swap |

Terraform Module Structure

infra/
├── main.tf              # Module calls + ALB Controller IRSA + Helm releases
├── provider.tf          # AWS provider, Helm provider, Kubernetes provider
└── modules/
    ├── vpc/
    │   ├── main.tf      # VPC, subnets, IGW, NAT, route tables
    │   └── outputs.tf   # vpc_id, subnet IDs
    ├── eks/
    │   ├── main.tf      # Cluster, node group, add-ons, OIDC
    │   ├── outputs.tf   # Cluster endpoint, OIDC ARN
    │   └── variables.tf
    └── jenkins/
        ├── main.tf      # EC2, SG, IAM role, EC2 instance profile
        └── variables.tf

41 resources total when you run terraform state list. Every single one provisioned and destroyed cleanly.


Security Summary

  • Zero static AWS credentials in the entire project

  • Jenkins SG restricted to operator IP (dynamically detected)

  • Workers in private subnets only

  • ALB Controller uses IRSA (OIDC federation)

  • Kubernetes Secrets for sensitive config

  • No 0.0.0.0/0 on anything except ALB port 80


Wrapping Up

This project covers VPC networking, EKS cluster setup, Jenkins CI/CD, Docker containerization, ECR, Kubernetes deployments with health checks and autoscaling, ALB ingress, persistent storage, IAM security — all glued together with Terraform.

The t3.micro constraint made it significantly harder than it would be on t3.medium, but it also forced me to understand exactly what every pod and every resource was doing. You can't waste a single pod slot when you only have 8.

Full project code: github.com/imdibr/k8s_e2e

I've also written a companion post about what I actually learned from building this — the non-technical takeaways: My Two Cents After Deploying to EKS