David Igou


  • Prometheus in Your Home

    All metrics of nodes and services in my home lab are scraped by Prometheus. Here are some challenges I ran into, and how I worked around them.

    Why use Prometheus in your Homelab

    I’m approaching ~15 nodes in my home, and Prometheus is a great way to get insight into my resource usage and performance. I have alerts configured for when things go down (or I break things), and I can get visibility into what other users are doing in my cluster. There are tons of useful exporters I consume: blackbox exporters that check if my site is up, a temperature exporter for my Raspberry Pis, and utilization metrics from my file servers.

    I keep some example configurations here.

    Trial #1: Cheap EC2s

    The shortest path to a running instance for me was a kubectl create -f bundle.yaml -n monitoring with the bundle from the official prometheus-operator. My worker nodes were originally t2.nanos, and I allocated storage via the ebs-csi driver. At first this appeared fine: scraping a couple of node-exporters and some small bundled metrics endpoints didn’t seem to cause issues.

    Eventually, I noticed the memory usage would slowly become too much for the instance size. The CPU spikes from saving blocks to the TSDB were also too much, and my node would become unresponsive. The operator would then reschedule the pod on another node, and the cycle would repeat as it slowly deathballed through my cluster. Upgrading to t2.micros did not fix this.

    Setting limits sort of helped, but there were gaps in metrics from Kubernetes killing the pod when it eventually hit said limit. Limiting scrape intervals and dedicating a node to Prometheus only delayed the inevitable. I also tried switching to k3OS to reduce the OS footprint, but to no avail. It became clear I wasn’t going to see success on an EC2.
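
    As a rough sketch of what "setting limits" can look like with the prometheus-operator: the Prometheus custom resource has a spec.resources field, and a merge patch can set a memory limit on it. The helper name memory_limit_patch, the object name k8s, and the 400Mi value are all assumptions for illustration, not the configuration used here.

```shell
# Build a JSON merge patch that sets a memory limit on a Prometheus object.
# memory_limit_patch is a hypothetical helper; the limit value is up to you.
memory_limit_patch() {
  printf '{"spec":{"resources":{"limits":{"memory":"%s"}}}}' "$1"
}

# Usage (assumes a Prometheus object named "k8s" in the "monitoring" namespace):
#   kubectl -n monitoring patch prometheus k8s --type merge \
#     -p "$(memory_limit_patch 400Mi)"
```

    Keep in mind the trade-off described above: once the pod hits the limit it gets killed, which shows up as gaps in the metrics.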

    Recently I was able to pinpoint some performance issues I experienced with cert-manager and the ebs-csi controllers. These projects are still in alpha, so things like that are to be expected. If you’re reading this in 2021, this might now be possible.

    Trial #2: Unused hardware in my office

    [Image: k3s_pis]

    When it became clear I needed more power than I was willing to pay AWS bills for, I built a VPN Gateway into my VPC, added my Raspberry Pis to the cluster, and created a new Prometheus object backed by NFS storage. I used images from this Prometheus ARM project.

    This worked for a bit, but I ran into more performance issues with NFS. Eventually the database got corrupted, and I saw more performance issues linked to this issue. The cause was related to the OS running in 32-bit mode.

    So I needed a 64-bit OS and better storage performance.

    Trial #3: This is getting ridiculous

    [Image: Science]

    So here we are at 4x the memory we started at on an aarch64 CPU. The Pi 4’s cooling issues are a real thing, so the sink isn’t entirely a joke.

    Out of the box, Raspbian does not boot a 64-bit kernel, so we will need to enable it.

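    # Run as root: update the firmware and kernel
    rpi-update
    # Tell the bootloader to start the 64-bit kernel on the next boot
    echo 'arm_64bit=1' >> /boot/config.txt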
    
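
    One caveat with the append above is that running it twice leaves duplicate lines in config.txt. A minimal sketch of an idempotent variant follows; the function name enable_64bit_kernel is hypothetical, and the path is a parameter so it can be tried on a copy of the file first.

```shell
# Append arm_64bit=1 to a Pi boot config only if the line is not already there.
# enable_64bit_kernel is a hypothetical helper name, not part of any tool.
enable_64bit_kernel() {
  config_file="${1:-/boot/config.txt}"
  # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string
  if ! grep -qxF 'arm_64bit=1' "$config_file"; then
    echo 'arm_64bit=1' >> "$config_file"
  fi
}

# Usage (as root, after rpi-update):
#   enable_64bit_kernel /boot/config.txt
```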

    Reboot and run uname -a:

    Linux pi-x 4.19.97-v8+ #1293 SMP PREEMPT Wed Jan 22 17:22:12 GMT 2020 aarch64 GNU/Linux
    
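
    If you want to script that check across several Pis, something like the following could work. It is only a sketch: is_arm64 is a hypothetical helper, and it assumes uname -m reports aarch64 on the 64-bit kernel, as in the output above.

```shell
# Return success if the given machine string names a 64-bit ARM kernel.
# is_arm64 is a hypothetical helper; pass it the output of `uname -m`.
is_arm64() {
  case "$1" in
    aarch64|arm64) return 0 ;;
    *) return 1 ;;
  esac
}

# Report what the current host is running.
if is_arm64 "$(uname -m)"; then
  echo "64-bit ARM kernel is running"
else
  echo "not a 64-bit ARM kernel: $(uname -m)"
fi
```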

    For storage, I used local storage via the fastest USB 3.0 flash drive I could find. I have a previous blog post on configuring a PV backed by it.

    Conclusions

    When working with constraints, innovation truly shines.