Kubernetes Dev Cluster — Vagrant

A local three-node Kubernetes cluster running on Ubuntu 22.04 (Jammy) VMs, provisioned with Vagrant and VirtualBox. Version 2 adds a full optional addon layer: DNS caching, a Web UI, resource monitoring, log aggregation, ingress, TLS, and a private container registry.

┌──────────────────────────────────────────────────────────────┐
│                        Host Machine                          │
│                                                              │
│  ┌───────────────┐   ┌──────────────┐   ┌──────────────┐    │
│  │  k8s-control  │   │ k8s-worker-1 │   │ k8s-worker-2 │    │
│  │ 192.168.56.10 │   │192.168.56.11 │   │192.168.56.12 │    │
│  │  2 CPU / 2 GB │   │ 2 CPU / 2 GB │   │ 2 CPU / 2 GB │    │
│  └──────┬────────┘   └──────┬───────┘   └──────┬───────┘    │
│         └──────────────────┴──────────────────┘             │
│                      Private Network                         │
└──────────────────────────────────────────────────────────────┘

The scripts used in this demo are in the Playroom Security GitHub repository.


What's New in v2

| Category            | Addon                                 | Default |
|---------------------|---------------------------------------|---------|
| DNS                 | NodeLocal DNSCache                    | ✅ On   |
| Web UI              | Kubernetes Dashboard                  | ✅ On   |
| Resource monitoring | Metrics Server                        | ✅ On   |
| Resource monitoring | Prometheus + Grafana                  | ⬜ Off  |
| Cluster logging     | Loki + Promtail                       | ⬜ Off  |
| Ingress             | ingress-nginx                         | ✅ On   |
| TLS                 | cert-manager + self-signed issuer     | ⬜ Off  |
| Registry            | In-cluster Docker Registry            | ⬜ Off  |
| Network             | Cilium support added                  |         |

All addons are toggled in settings.yaml — no script editing required.


Prerequisites

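| Requirement                | Minimum | Notes                                 |
|----------------------------|---------|---------------------------------------|
| Vagrant                    | ≥ 2.3   |                                       |
| VirtualBox                 | ≥ 7.0   |                                       |
| vagrant-reload plugin      | any     | vagrant plugin install vagrant-reload |
| Free RAM (default addons)  | ≥ 8 GB  | 3 × 2 GB + host overhead              |
| Free RAM (with Prometheus) | ≥ 12 GB | Increase worker memory to 3072 MB     |
| Free disk                  | ≥ 30 GB | ~10 GB per VM                         |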
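
The tool versions above can be checked before the first vagrant up. The following is a minimal sketch, not part of the project's scripts: version_ge and check_tool are hypothetical helper names, and the comparison only handles purely numeric dotted versions.

```shell
# version_ge A B: succeed when dotted version A is >= B (numeric fields only).
# Hypothetical helper for illustration; not part of install.sh.
version_ge() {
  a=$1 b=$2
  while [ -n "$a" ] || [ -n "$b" ]; do
    # Take the leading field of each version; a missing field counts as 0.
    fa=${a%%.*}; fb=${b%%.*}
    [ -n "$fa" ] || fa=0
    [ -n "$fb" ] || fb=0
    if [ "$fa" -gt "$fb" ]; then return 0; fi
    if [ "$fa" -lt "$fb" ]; then return 1; fi
    # Drop the field that was just compared.
    case $a in *.*) a=${a#*.} ;; *) a= ;; esac
    case $b in *.*) b=${b#*.} ;; *) b= ;; esac
  done
  return 0
}

# check_tool NAME MINIMUM FOUND: report whether the found version is high enough.
check_tool() {
  if version_ge "$3" "$2"; then
    echo "ok: $1 $3 (need >= $2)"
  else
    echo "too old: $1 $3 (need >= $2)"
    return 1
  fi
}

# Possible usage on the host (the version-string parsing is an assumption
# about the tools' output format):
#   check_tool Vagrant 2.3 "$(vagrant --version | awk '{print $2}')"
#   check_tool VirtualBox 7.0 "$(VBoxManage --version | sed 's/[^0-9.].*//')"
```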

Project Structure

k8s-dev/
├── settings.yaml        # All configuration and addon toggles
├── Vagrantfile          # VM definitions, passes addon env vars to install.sh
├── install.sh           # Node bootstrap + addon installer
└── README.md

# Generated at runtime (do not commit):
├── join.sh              # kubeadm join token (written by control-plane)
├── admin.conf           # kubeconfig for local kubectl access
└── dashboard-token.txt  # Dashboard bearer token (if addon enabled)

Configuration — settings.yaml

Kubernetes & nodes

kubernetes:
  version: "1.29"
  pod_cidr: "10.244.0.0/16"
  cni: "flannel"           # flannel | calico | cilium

nodes:
  control_plane:
    ip: "192.168.56.10"
    cpu: 2
    memory: 2048
  workers:
    - name: "k8s-worker-1"
      ip: "192.168.56.11"
      cpu: 2
      memory: 2048
    - name: "k8s-worker-2"
      ip: "192.168.56.12"
      cpu: 2
      memory: 2048

CNI options

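| CNI     | cni value | pod_cidr       | Notes                    |
|---------|-----------|----------------|--------------------------|
| Flannel | flannel   | 10.244.0.0/16  | Simple, default          |
| Calico  | calico    | 192.168.0.0/16 | NetworkPolicy support    |
| Cilium  | cilium    | 10.244.0.0/16  | eBPF-based, feature-rich |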

Addons

Every addon has an enabled flag. Set it to true or false, then rebuild the cluster.

addons:
  nodelocal_dns:        { enabled: true }
  metrics_server:       { enabled: true,  version: "v0.7.1" }
  dashboard:            { enabled: true,  version: "v2.7.0" }
  ingress_nginx:        { enabled: true,  version: "v1.10.1",
                          http_nodeport: 30080, https_nodeport: 30443 }
  prometheus_grafana:   { enabled: false, grafana_password: "admin" }
  loki:                 { enabled: false }   # requires prometheus_grafana
  cert_manager:         { enabled: false, version: "v1.14.5" }
  registry:             { enabled: false, nodeport: 32000 }

Loki requires prometheus_grafana.enabled: true — it adds a datasource to the Grafana instance deployed by that addon.
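
The Loki dependency can be checked before a rebuild. The sketch below is a hypothetical helper, not something the project ships: it assumes the one-line flow style shown above and greps for the flag rather than parsing YAML properly.

```shell
# addon_enabled NAME FILE: succeed when the addon's line contains "enabled: true".
# Assumes each addon sits on one line as in the snippet above; this is a grep,
# not a general YAML parser.
addon_enabled() {
  grep -E "^[[:space:]]*$1:" "$2" | grep -q 'enabled:[[:space:]]*true'
}

# check_loki_dependency FILE: fail when loki is on but prometheus_grafana is off.
check_loki_dependency() {
  if addon_enabled loki "$1" && ! addon_enabled prometheus_grafana "$1"; then
    echo "loki requires prometheus_grafana.enabled: true" >&2
    return 1
  fi
}

# Possible usage from the project directory:
#   check_loki_dependency settings.yaml && vagrant up
```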


Quick Start

1. Install the Vagrant plugin

vagrant plugin install vagrant-reload

2. Review settings.yaml

Enable or disable addons to match your available RAM. The defaults (DNS cache, Metrics Server, Dashboard, ingress-nginx) work comfortably within 8 GB.

3. Start the cluster

# Recommended: control-plane first so join.sh exists before workers start
vagrant up k8s-control
vagrant up k8s-worker-1 k8s-worker-2

# Or all at once (Vagrant will handle ordering)
vagrant up

First run takes 15–25 minutes depending on internet speed and enabled addons.

4. Verify

export KUBECONFIG=$(pwd)/admin.conf

kubectl get nodes
kubectl get pods -A

Expected nodes output (allow ~90 s for Ready):

NAME           STATUS   ROLES           AGE   VERSION
k8s-control    Ready    control-plane   6m    v1.29.x
k8s-worker-1   Ready    <none>          4m    v1.29.x
k8s-worker-2   Ready    <none>          4m    v1.29.x
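
Instead of rechecking by hand, the wait for Ready can be scripted with kubectl wait. A minimal sketch: the wait_for_nodes name and the 180 s default are assumptions, chosen to leave headroom over the ~90 s noted above.

```shell
# wait_for_nodes [TIMEOUT]: block until every node reports Ready,
# or fail once the timeout (default 180s, an assumed value) expires.
wait_for_nodes() {
  kubectl wait --for=condition=Ready nodes --all --timeout="${1:-180s}"
}

# Possible usage:
#   export KUBECONFIG=$(pwd)/admin.conf
#   wait_for_nodes && kubectl get nodes
```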

Addon Reference

NodeLocal DNSCache

Runs a DNS caching agent on every node at 169.254.20.10, reducing latency and CoreDNS load.

# Verify DaemonSet is running
kubectl get ds node-local-dns -n kube-system
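
To confirm the cache actually answers queries, a throwaway pod can resolve a name directly against 169.254.20.10. This is a sketch under assumptions: the pod name dns-test and the busybox:1.36 image are illustrative choices, and the function name is hypothetical.

```shell
# dns_cache_lookup [NAME]: resolve NAME against the node-local cache address
# from a temporary pod that is removed when the command finishes.
# Pod name and image are assumptions for illustration.
dns_cache_lookup() {
  kubectl run dns-test --rm -i --restart=Never --image=busybox:1.36 -- \
    nslookup "${1:-kubernetes.default.svc.cluster.local}" 169.254.20.10
}

# Possible usage:
#   dns_cache_lookup
#   dns_cache_lookup kube-dns.kube-system.svc.cluster.local
```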

Metrics Server

Provides the CPU and memory metrics that kubectl top and the HorizontalPodAutoscaler rely on.

kubectl top nodes
kubectl top pods -A
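
With Metrics Server running, a HorizontalPodAutoscaler can be attached to a Deployment. A hedged sketch: the function name, the 50% CPU target, and the 1 to 5 replica range are illustrative assumptions, and the Deployment's containers need CPU requests set for the percentage to be computed. The --cpu-percent flag is the form used by kubectl releases around 1.29.

```shell
# autoscale_on_cpu DEPLOYMENT: create an HPA that targets 50% average CPU
# utilisation and keeps between 1 and 5 replicas (assumed example values).
autoscale_on_cpu() {
  kubectl autoscale deployment "$1" --cpu-percent=50 --min=1 --max=5
}

# Possible usage, with "my-app" as a hypothetical Deployment name:
#   autoscale_on_cpu my-app
#   kubectl get hpa my-app
```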

Kubernetes Dashboard

Browser-based cluster management UI. The bearer token is written to dashboard-token.txt in your project directory.

# Get the NodePort
kubectl get svc kubernetes-dashboard -n kubernetes-dashboard

# Get the token (also saved to ./dashboard-token.txt)
cat dashboard-token.txt

Open https://192.168.56.10:<nodeport> in your browser, accept the self-signed certificate warning, and paste the token.
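
The NodePort lookup and URL can be combined into one step. A small sketch, assuming the Dashboard Service exposes its HTTPS port as the first entry in spec.ports and that the control-plane IP is the default 192.168.56.10; dashboard_url is a hypothetical helper name.

```shell
# dashboard_url: print the Dashboard URL built from the Service's first NodePort.
# Assumes the default control-plane IP from settings.yaml.
dashboard_url() {
  port=$(kubectl get svc kubernetes-dashboard -n kubernetes-dashboard \
    -o jsonpath='{.spec.ports[0].nodePort}') || return 1
  echo "https://192.168.56.10:${port}"
}

# Possible usage:
#   dashboard_url
```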

Dashboard Token Renewal

To issue a new Dashboard token when the current one expires:

# 720h (30 days) is the maximum duration
kubectl -n kubernetes-dashboard create token admin-user --duration=720h


ingress-nginx

A single entry point for HTTP/HTTPS traffic into the cluster.

# Verify controller is running
kubectl get pods -n ingress-nginx

# Example Ingress resource
cat <<EOF | kubectl apply -f -
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  ingressClassName: nginx
  rules:
  - host: my-app.local
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: my-app-svc
            port:
              number: 80
EOF

Add 192.168.56.10 my-app.local to your host /etc/hosts, then access via http://my-app.local:30080.
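
The Ingress above routes to a Service named my-app-svc, which has to exist before requests succeed. The sketch below creates a stand-in backend and tests the route by sending the Host header directly, so it also works before /etc/hosts is edited. The function names, the my-app Deployment name, and the stock nginx image are assumptions for illustration.

```shell
# create_demo_backend: deploy a stock nginx and expose it as my-app-svc on port 80.
# Deployment name and image are illustrative assumptions.
create_demo_backend() {
  kubectl create deployment my-app --image=nginx &&
  kubectl expose deployment my-app --name=my-app-svc --port=80
}

# ingress_request [PATH]: call the Ingress through the HTTP NodePort (30080),
# setting the Host header so no /etc/hosts entry is required.
ingress_request() {
  curl -sS -H "Host: my-app.local" "http://192.168.56.10:30080${1:-/}"
}

# Possible usage:
#   create_demo_backend
#   ingress_request /
```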


cert-manager

Automates TLS certificate issuance. A selfsigned-issuer ClusterIssuer is created automatically.

# Request a certificate
cat <<EOF | kubectl apply -f -
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: my-cert
  namespace: default
spec:
  secretName: my-cert-tls
  issuerRef:
    name: selfsigned-issuer
    kind: ClusterIssuer
  dnsNames:
  - my-app.local
EOF

kubectl get certificate my-cert
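
Once the Certificate is Ready, the issued certificate lives in the my-cert-tls Secret and can be inspected with openssl. A minimal sketch: show_cert is a hypothetical helper, the 60 s timeout is an assumed value, and base64 -d assumes a GNU-style base64 on the host.

```shell
# show_cert: wait for my-cert to become Ready, then print the validity dates
# of the certificate stored under tls.crt in the my-cert-tls Secret.
show_cert() {
  kubectl wait --for=condition=Ready certificate/my-cert -n default --timeout=60s &&
  kubectl get secret my-cert-tls -n default -o jsonpath='{.data.tls\.crt}' |
    base64 -d | openssl x509 -noout -dates
}

# Possible usage:
#   show_cert
```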

Prometheus + Grafana

Full metrics stack deployed via kube-prometheus-stack. Pre-built dashboards for nodes, pods, and workloads are included out of the box.

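| Service      | URL                        | Credentials                |
|--------------|----------------------------|----------------------------|
| Grafana      | http://192.168.56.10:30300 | admin / <grafana_password> |
| Prometheus   | http://192.168.56.10:30090 |                            |
| Alertmanager | http://192.168.56.10:30093 |                            |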

Requires workers to have at least 3072 MB RAM. Update settings.yaml before enabling.

# Check all monitoring pods are running
kubectl get pods -n monitoring

Loki + Promtail

Lightweight log aggregation. Promtail ships logs from all pods to Loki; query them in Grafana under Explore → Loki.

# Verify Promtail DaemonSet
kubectl get ds promtail -n monitoring

# Quick log query via Grafana Explore:
# {namespace="default"} |= "error"

Requires prometheus_grafana.enabled: true.


Container Registry

An in-cluster Docker Registry exposed as a NodePort. Useful for pushing locally built images without Docker Hub.

# Push an image (from your host)
docker tag myapp 192.168.56.10:32000/myapp:latest
docker push 192.168.56.10:32000/myapp:latest

# Pull from inside the cluster
# image: 192.168.56.10:32000/myapp:latest

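containerd on every node is pre-configured to allow insecure access to this registry. The Docker daemon on your host needs the same allowance before docker push will work: list 192.168.56.10:32000 under insecure-registries in its daemon.json and restart Docker.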
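
To confirm a push landed, the registry's HTTP API can be queried from the host. The sketch below assumes the default IP and NodePort (192.168.56.10:32000) and plain HTTP; registry_ref and list_registry_repos are hypothetical helper names.

```shell
# registry_ref NAME [TAG]: print the image reference for the in-cluster registry.
registry_ref() {
  echo "192.168.56.10:32000/$1:${2:-latest}"
}

# list_registry_repos: list repositories via the Registry HTTP API v2 catalog.
list_registry_repos() {
  curl -sS "http://192.168.56.10:32000/v2/_catalog"
}

# Possible usage, with "myapp" as in the example above:
#   docker tag myapp "$(registry_ref myapp)"
#   docker push "$(registry_ref myapp)"
#   list_registry_repos
```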


Daily Usage

SSH into a node

vagrant ssh k8s-control
vagrant ssh k8s-worker-1
vagrant ssh k8s-worker-2

kubectl from your host

export KUBECONFIG=$(pwd)/admin.conf
# Or add to ~/.bashrc / ~/.zshrc to make it permanent

Stop and restart (preserves state)

vagrant halt
vagrant up

Reprovision a single node

vagrant provision k8s-control

Destroy and rebuild from scratch

vagrant destroy -f
vagrant up

Resource Guide

Use this table to plan your settings.yaml memory values.

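| Addons enabled                                         | Recommended worker RAM |
|--------------------------------------------------------|------------------------|
| Defaults only (DNS, Metrics, Dashboard, ingress-nginx) | 2048 MB                |
| + cert-manager or Registry                             | 2048 MB                |
| + Prometheus + Grafana                                 | 3072 MB                |
| + Prometheus + Grafana + Loki                          | 4096 MB                |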
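
The worker values translate into a total VM memory figure as control-plane memory plus worker memory times the number of workers. The sketch below works through that arithmetic for the default two-worker layout; total_vm_ram_mb is a hypothetical helper, and host overhead comes on top of the result.

```shell
# total_vm_ram_mb CONTROL_MB WORKER_MB [WORKER_COUNT]: memory assigned to all VMs
# in MB. WORKER_COUNT defaults to 2, matching the default cluster layout.
total_vm_ram_mb() {
  echo $(( $1 + $2 * ${3:-2} ))
}

total_vm_ram_mb 2048 2048   # defaults only: prints 6144
total_vm_ram_mb 2048 3072   # with Prometheus + Grafana: prints 8192
total_vm_ram_mb 2048 4096   # with Prometheus + Grafana + Loki: prints 10240
```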

Troubleshooting

Nodes stuck in NotReady

CNI or CoreDNS may still be initialising. Wait 90 seconds and recheck:

kubectl get nodes
kubectl get pods -n kube-system

Addon pod in CrashLoopBackOff

kubectl describe pod <pod-name> -n <namespace>
kubectl logs <pod-name> -n <namespace> --previous

join.sh not found on worker

Provision the control-plane first:

vagrant provision k8s-control
vagrant provision k8s-worker-1 k8s-worker-2

Dashboard shows no metrics

Ensure Metrics Server is enabled and running:

kubectl get pods -n kube-system | grep metrics-server
kubectl top nodes

VirtualBox host-only network conflict

Change node IPs in settings.yaml to an unused subnet (e.g. 192.168.100.x) and rebuild:

vagrant destroy -f && vagrant up
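
Before picking a new subnet, it can help to see which addresses VirtualBox host-only interfaces already use. A sketch under assumptions: it relies on VBoxManage list hostonlyifs printing IPAddress: lines, and hostonly_subnet_in_use is a hypothetical helper name.

```shell
# hostonly_subnet_in_use PREFIX: succeed when a VirtualBox host-only interface
# already has an address starting with PREFIX (for example "192.168.56.").
# Assumes the "IPAddress:  <addr>" line format of "VBoxManage list hostonlyifs".
hostonly_subnet_in_use() {
  VBoxManage list hostonlyifs | grep -E '^IPAddress:' | grep -qF " $1"
}

# Possible usage:
#   hostonly_subnet_in_use 192.168.100. && echo "192.168.100.x is already taken"
```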

Extending the Cluster

Add a worker node

Add an entry to nodes.workers in settings.yaml:

- name: "k8s-worker-3"
  ip: "192.168.56.13"
  cpu: 2
  memory: 2048

Then:

vagrant up k8s-worker-3

Upgrade Kubernetes

Update kubernetes.version in settings.yaml, destroy, and rebuild:

vagrant destroy -f
vagrant up