Taming AMD's Linux AI Stack: From Kernel Panics to 87% CPU Idle

Debugging • Linux Kernel • Computer Vision

How I fixed a fatal GPU deadlock in Frigate NVR by ripping out ROCm and routing AI inference through Linux gaming drivers.

The Short Version (For Everyone)
The Deep Dive (For the Engineers)
1. The Nightmare: D State and TTM Deadlocks
2. The Failed Attempts & ROCm Pitfalls
3. The Breakthrough: Vulkan and RADV
4. Building the Custom Detector
5. Final Results: ROCm vs CPU vs Vulkan

The Short Version (For Everyone)

I recently set up a smart security camera system called Frigate on my home server. Frigate is incredibly smart—it looks at camera feeds in real-time to detect people, cars, and animals. To do this without melting the server's main processor (CPU), it uses the graphics card (GPU).

My server has a brand new AMD Ryzen processor with built-in graphics (Radeon 760M). On paper, it's a beast. In reality? The server was completely crashing and freezing every 6 to 10 minutes. I had to pull the power plug to fix it.

Why was it crashing?

Imagine a busy intersection with traffic lights controlled by a highly complex, vendor-specific computer system built by AMD (called ROCm). The system was trying to route two massive fleets of trucks at exactly the same time: one fleet carrying video data, the other carrying AI math calculations. The traffic controller completely panicked, caused a massive pileup, and then the tow trucks (the system reset protocol) broke down on the way to the scene.

How did I fix it?

Instead of relying on AMD's dedicated AI traffic controller, I fired it. I found a different tool (the open-source ncnn) that routes the AI math through Vulkan. Vulkan is the exact same underlying technology that makes massive 3D video games run smoothly on Linux (like on the Steam Deck).

Because the gaming drivers are heavily tested by millions of players, they are rock solid. They handled the video and the AI math perfectly. My server went from crashing every 6 minutes (or, with the AI moved to the CPU as a stopgap, sitting pegged at 100% CPU) to running flawlessly with the CPU at 87% idle, sipping power.

The Deep Dive (For the Engineers)

1. The Nightmare: D State and TTM Deadlocks

The hardware: An AMD Ryzen 5 8600G (Phoenix1 architecture, RDNA3, gfx1103 APU). The software: Dockerized Frigate 0.17 utilizing ONNX Runtime.

Frigate utilizes the GPU for two distinct pipelines: VAAPI for hardware video decoding (4 camera streams), and ROCm/MIGraphX for ONNX Runtime object detection (YOLOv9).

Shortly after startup, the Frigate container would become completely unresponsive. Docker could not kill it (SIGKILL was ignored). A quick dive into the host system revealed the horror:

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root    337200  0.0  0.0      0     0 ?        D<   02:01   0:00 [kworker/u49:9+ttm]
root    337215  0.0  0.0      0     0 ?        D<   02:01   0:00 [kworker/u49:10+ttm]
... (16 workers stuck in D state)

Processes stuck in D (uninterruptible sleep) usually indicate severe I/O or kernel-level locks. Looking at dmesg confirmed it:

[37456.170181] amdgpu 0000:0f:00.0: GPU reset begin!. Source: 5

The Root Cause: Because this is an APU, system RAM is shared as VRAM. The VAAPI processes (media engine) and the MIGraphX processes (compute engine) were stepping on each other inside the TTM (Translation Table Maps) memory manager. This deadlock triggered a GPU reset (Source 5 = compute engine hang). However, on gfx1103, the amdgpu kernel reset sequence is known-buggy and never completed, resulting in a zombie GPU.
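
For anyone who wants to catch this state before the container wedges, the `ps` check can be scripted. The following is a minimal sketch, assuming a Linux host with `/proc` mounted; `parse_stat` and `find_d_state` are illustrative names, not part of Frigate or any amdgpu tooling.

```python
import os


def parse_stat(stat_text):
    """Parse the first fields of /proc/<pid>/stat: 'pid (comm) state ...'.

    The command name may itself contain spaces or parentheses, so the
    closing parenthesis is located from the right.
    """
    open_idx = stat_text.index("(")
    close_idx = stat_text.rindex(")")
    pid = int(stat_text[:open_idx])
    comm = stat_text[open_idx + 1:close_idx]
    state = stat_text[close_idx + 1:].split()[0]
    return pid, comm, state


def find_d_state(proc_root="/proc"):
    """Return (pid, comm) for every task in uninterruptible sleep (state D)."""
    stuck = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                pid, comm, state = parse_stat(handle.read())
        except (OSError, ValueError):
            continue  # the process exited while we were scanning
        if state == "D":
            stuck.append((pid, comm))
    return stuck
```

Calling `find_d_state()` on the host and filtering for names containing `ttm` would surface the same workers shown in the `ps` output. Note that the state field in `/proc` is a bare `D`; the `<` that `ps` appends is a priority flag.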

2. The Failed Attempts & ROCm Pitfalls

Before abandoning ROCm entirely, I tried standard mitigation strategies to stop the engines from fighting:

Attempt 1 (Software Decode + ROCm): I disabled VAAPI to dedicate the GPU strictly to MIGraphX inference. While ROCm achieved a blazing fast ~14ms inference speed, the GPU still hung after about 6 minutes. ROCm compute on consumer RDNA3 APUs on Linux is just fundamentally unstable, suffering from missing engine isolation.
Attempt 2 (VAAPI Decode + CPU Inference): I pushed YOLO detection to the CPU via standard ONNX Runtime and left VAAPI running on the GPU. Stability was achieved, but inference ballooned to ~77ms, and the load average hovered at a staggering 45 (94% to 100% CPU utilization, completely pegged). Unacceptable for a home lab meant to run other services concurrently.

3. The Breakthrough: Vulkan and RADV

While AMD's dedicated compute stack (ROCm) is deeply flawed for consumer APUs, their Linux gaming stack is incredible. Mesa's RADV Vulkan driver is bulletproof. If I could route AI inference through Vulkan instead of ROCm, I could bypass the broken amdgpu compute paths entirely.

Microsoft's ONNX Runtime does not ship a Vulkan execution provider in its pre-built Linux wheels. Getting Vulkan inference out of it would have meant a from-source build inside the Frigate container, which was an option, but a heavy one.

Instead, I pivoted to ncnn, Tencent's high-performance neural network inference framework optimized for mobile platforms. ncnn has native, highly-optimized Vulkan support. Because Python bindings for ncnn exist, I could write a custom detector plugin for Frigate.

4. Building the Custom Detector

I downloaded a pre-converted YOLOv5s .ncnn model and its parameter file. However, substituting ncnn for ONNX meant I lost Frigate's built-in ONNX post-processing. ncnn outputs raw logits; it doesn't apply sigmoid activations or grid decoding internally like the ONNX graphs do.

I wrote a custom Python class to hijack Frigate's ONNXDetector type, manually reshape the raw tensors, apply the sigmoid functions, map the anchor boxes, and run Non-Maximum Suppression (NMS). Here is a snippet of the crucial grid decoding logic that bridged the gap:

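import numpy as np

# Reshape YOLOv5 outputs: (255, H, W) -> (3, 85, H, W) -> (H*W*3, 85)
# Method of the custom detector class: self._anchors is expected to map
# each stride to a (3, 2) array of anchor widths and heights in pixels.
def decode_output(self, ncnn_mat, stride):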
    arr = np.array(ncnn_mat)  # (255, grid_h, grid_w)
    na, nc = 3, 80
    no = 5 + nc  # 85 (4 box + 1 obj + 80 class)
    grid_h, grid_w = arr.shape[1], arr.shape[2]
    
    # Reshape and permute
    arr = arr.reshape(na, no, grid_h, grid_w)
    arr = np.transpose(arr, (2, 3, 0, 1))  # (grid_h, grid_w, 3, 85)
    arr = arr.reshape(-1, no)  # (grid_h*grid_w*3, 85)
    
    # Apply sigmoid (ncnn outputs raw logits)
    arr = 1.0 / (1.0 + np.exp(-arr))
    
    # Generate anchor grids
    grid_y, grid_x = np.meshgrid(np.arange(grid_h), np.arange(grid_w), indexing='ij')
    grid = np.stack([grid_x, grid_y], axis=-1)
    grid = np.expand_dims(grid, axis=2)
    grid = np.tile(grid, (1, 1, na, 1)).reshape(-1, 2)
    
    # Extract bounding boxes
    xy = arr[:, :2]
    wh = arr[:, 2:4]
    
    # YOLOv5 Decode formula
    xy = (xy * 2.0 - 0.5 + grid) * stride
    anchors_tiled = np.tile(self._anchors[stride], (grid_h * grid_w, 1))
    wh = (wh * 2.0) ** 2 * anchors_tiled
    
    # ... Confidence thresholding and NMS follows ...
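
The elided last step is ordinary greedy NMS. Here is a minimal NumPy sketch of it, assuming the decoded `xy` values are box centres (the YOLOv5 convention) and that low-confidence rows have already been dropped. The names `xywh_to_xyxy` and `nms` and the 0.45 IoU threshold are illustrative choices, not the code from the detector plugin.

```python
import numpy as np


def xywh_to_xyxy(xy, wh):
    """Convert centre/size boxes to corner format (x1, y1, x2, y2)."""
    half = np.asarray(wh, dtype=float) / 2.0
    xy = np.asarray(xy, dtype=float)
    return np.concatenate([xy - half, xy + half], axis=1)


def nms(boxes, scores, iou_threshold=0.45):
    """Greedy NMS: keep the best box, drop overlapping ones, repeat."""
    boxes = np.asarray(boxes, dtype=float).reshape(-1, 4)
    scores = np.asarray(scores, dtype=float)
    x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    areas = (x2 - x1) * (y2 - y1)
    order = scores.argsort()[::-1]  # indices sorted by descending score
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        # Intersection of the best box with every remaining box
        inter_w = np.clip(np.minimum(x2[best], x2[rest]) - np.maximum(x1[best], x1[rest]), 0, None)
        inter_h = np.clip(np.minimum(y2[best], y2[rest]) - np.maximum(y1[best], y1[rest]), 0, None)
        inter = inter_w * inter_h
        iou = inter / (areas[best] + areas[rest] - inter + 1e-9)
        order = rest[iou <= iou_threshold]
    return keep
```

In a real detector this would typically run once per class, or on boxes offset by class index, so that a person and a car in the same spot do not suppress each other.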

5. Final Results: ROCm vs CPU vs Vulkan

With the custom image deployed, the transformation was instantaneous. By running the inference directly through Mesa's RADV gaming drivers, we kept the performance benefits of hardware acceleration while dodging the kernel panics entirely.

Metric	ROCm + VAAPI (The Original Goal)	CPU + VAAPI (The Fallback)	ncnn Vulkan + VAAPI (The Fix)
Inference Speed	~14ms 🚀	~77ms 🐌	~28ms ⚡
CPU Idle	~74%	0% (Completely Pegged)	87%
GPU Status	Deadlock / Hang	Video decode only	Decode + Inference (~39% Util)
System Stability	❌ Kernel Panic (6 min)	✅ Stable (But system unusable)	✅ 100% Stable (3+ hrs)

The Legacy Hardware Implication: This workaround isn't just for bleeding-edge RDNA3. Older APUs like the Ryzen 5 3500U (Vega 8) were never officially supported by ROCm. However, because they support Vulkan 1.2, the same ncnn Vulkan pipeline should bring hardware-accelerated AI to them as well.

The PR for this fallback mechanism is open on GitHub. If you're running Frigate on AMD hardware and pulling your hair out over amdgpu kernel crashes, bypass ROCm entirely. Vulkan is the way.

See Through Walls with a $9 Microcontroller

Edge AI & Sensors

Deploying WiFi DensePose on Kubernetes.

"How I deployed a real-time WiFi-based human sensing system on a homelab K3s cluster with a $9 ESP32-S3, live pose estimation, and RTSP camera fusion."

WiFi signals pass through walls. When a person moves — or even breathes — those signals scatter differently. What if you could read that scattering pattern and reconstruct what happened on the other side?

That's exactly what RuView does. Built on research from Carnegie Mellon's DensePose From WiFi paper, RuView is an open-source edge AI system that turns commodity WiFi signals into real-time human pose estimation, vital sign monitoring, and presence detection — all without a single pixel of video.

I took it a step further: deployed it on Kubernetes, wired up a live ESP32-S3 sensor, and fused the WiFi signal data with an RTSP camera feed for dual-modal pose estimation. Here's how.

The Hardware: $9 and a WiFi Router

The entire sensing hardware cost me $9:

1x ESP32-S3 ($9): a dual-core microcontroller with WiFi that exposes Channel State Information (CSI). CSI gives you per-subcarrier amplitude and phase data: 56+ data points per WiFi frame, 20 times per second. That's the raw material for sensing.

Standard consumer WiFi only gives you RSSI (a single signal strength number). CSI is like going from a thermometer to a thermal camera — instead of one number, you get a detailed map of how the signal is being affected by everything in the room.

I also had three ESP32-C3s sitting around, but those are single-core RISC-V chips that can't handle the DSP pipeline. The S3's dual-core Xtensa is required — one core captures CSI interrupts while the other runs signal processing.

The Software Stack

RuView's Rust sensing server processes the signal chain:

ESP32 CSI (UDP) → Hampel outlier rejection → SpotFi phase correction
    → Fresnel zone modeling → FFT vital sign extraction
    → AI backbone (RuVector attention networks)
    → 17 body keypoints + breathing rate + heart rate + presence

At 54,000 frames/sec throughput in Rust, this is fast enough to process live data from multiple sensors with headroom to spare. The server exposes a REST API, WebSocket stream, and a full browser UI.
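
To make the first stage of that chain concrete, here is a minimal Python sketch of a Hampel filter applied to one subcarrier's amplitude series. RuView does this in Rust; the window size, the 3-sigma threshold, and the function name are assumptions for illustration, not values taken from the project.

```python
import numpy as np


def hampel_filter(values, half_window=3, n_sigmas=3.0):
    """Replace outliers with the local median.

    A sample is an outlier when it deviates from the median of its
    sliding window by more than n_sigmas scaled MADs. With a window
    whose MAD is zero, any deviation at all counts as an outlier.
    """
    x = np.asarray(values, dtype=float)
    cleaned = x.copy()
    outliers = np.zeros(len(x), dtype=bool)
    scale = 1.4826  # makes the MAD comparable to a Gaussian std deviation
    for i in range(len(x)):
        lo = max(0, i - half_window)
        hi = min(len(x), i + half_window + 1)
        window = x[lo:hi]
        median = np.median(window)
        mad = np.median(np.abs(window - median))
        if abs(x[i] - median) > n_sigmas * scale * mad:
            cleaned[i] = median
            outliers[i] = True
    return cleaned, outliers
```

A median-based test is a natural fit for CSI data because a single corrupted frame cannot drag the reference value the way it would drag a mean.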

Flashing the ESP32-S3

The firmware ships as pre-built binaries in GitHub Releases. Flashing takes 30 seconds:

Bash

pip install esptool
python -m esptool --chip esp32s3 --port COM7 --baud 460800 \
  write_flash --flash_mode dio --flash_size 8MB \
  0x0 bootloader.bin \
  0x8000 partition-table.bin \
  0xd000 ota_data_initial.bin \
  0x10000 esp32-csi-node.bin

Then provision it with your WiFi credentials and the IP of your server:

Bash

python provision.py --port COM7 \
  --ssid "MyWiFi" --password "secret" \
  --target-ip 192.168.1.9

The ESP32 connects to WiFi and starts streaming CSI frames over UDP to port 5005. No internet needed after provisioning — everything stays local.
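
Before involving the sensing server, it can help to confirm that datagrams are actually arriving. The following is a minimal sketch using only the Python standard library. It does not decode the payload, since the CSI frame layout isn't described here, and the function names are illustrative.

```python
import socket


def open_listener(host="0.0.0.0", port=5005):
    """Bind a UDP socket on the port the ESP32 was provisioned to target."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def receive_datagrams(sock, count=5, timeout_s=5.0, max_size=4096):
    """Collect up to `count` datagrams as (sender_ip, payload) tuples.

    Returns early with whatever was received if nothing arrives
    within `timeout_s` seconds.
    """
    sock.settimeout(timeout_s)
    received = []
    try:
        while len(received) < count:
            payload, (sender_ip, _sender_port) = sock.recvfrom(max_size)
            received.append((sender_ip, payload))
    except socket.timeout:
        pass
    return received
```

Run it with the sensing server stopped, otherwise the port will already be in use. Something like `receive_datagrams(open_listener())` returning an empty list after five seconds points at provisioning or firewall trouble rather than at the server.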

Containerizing for Kubernetes

The project includes a multi-stage Dockerfile that compiles the Rust server and bundles the UI into a minimal Debian image:

Dockerfile

FROM rust:1.85-bookworm AS builder
WORKDIR /build
COPY rust-port/wifi-densepose-rs/ ./
RUN cargo build --release -p wifi-densepose-sensing-server \
    && strip target/release/sensing-server

FROM debian:bookworm-slim
COPY --from=builder /build/target/release/sensing-server /app/
COPY ui/ /app/ui/
EXPOSE 3000/tcp 3001/tcp 5005/udp
CMD ["/app/sensing-server", "--source", "auto", "--ui-path", "/app/ui", "--bind-addr", "0.0.0.0"]

Built and pushed to GHCR:

docker build -f docker/Dockerfile.rust -t ghcr.io/zektopic/ruview:k8s-poc .
docker push ghcr.io/zektopic/ruview:k8s-poc

The K8s Deployment

The deployment has one unusual requirement: UDP hostPort. The ESP32 sends raw CSI frames to a specific IP:port, so the pod needs to receive those packets directly on the host's network interface without kube-proxy NAT:

YAML

ports:
- containerPort: 5005
  protocol: UDP
  hostPort: 5005    # ESP32 sends directly to host

This means the pod must be pinned to a specific node (nodeName) — if it moves, the ESP32 would be sending to the wrong IP. For a homelab this is fine; for production you'd use a DaemonSet or a LoadBalancer with UDP support.

The HTTP API and WebSocket get standard NodePort services:

type: NodePort
ports:
- name: http
  port: 3000
  nodePort: 30900

An nginx reverse proxy ties it all together, handling WebSocket upgrades with proper timeout settings so the live data stream doesn't drop.

What It Looks Like

The Observatory UI is the star — a cinematic Three.js dashboard with five holographic panels:

Subcarrier Manifold: Live heatmap of all 56+ WiFi subcarriers, showing frequency effects.
Vital Signs Oracle: Breathing rate (6-30 BPM) and heart rate (40-120 BPM) from phase variations.
Presence Heatmap: Room-level signal field showing where people are located.
Phase Constellation: Complex-plane plot of CSI phase, revealing movement patterns.
Convergence Engine: Signal processing pipeline metrics and overall health.
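
The idea behind the Vital Signs Oracle can be sketched in a few lines: take a phase time series sampled at 20 Hz, and look for the strongest spectral peak inside the breathing or heart-rate band. This is a simplified illustration in Python under those assumptions, not RuView's Rust implementation, and `dominant_rate_bpm` is a hypothetical name.

```python
import numpy as np


def dominant_rate_bpm(signal, fs_hz, lo_bpm, hi_bpm):
    """Return the strongest periodic rate (per minute) within a band.

    The series is mean-centred and Hann-windowed before the FFT so the
    DC offset and edge discontinuities do not swamp the peak.
    """
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()
    spectrum = np.abs(np.fft.rfft(x * np.hanning(len(x))))
    freqs_bpm = np.fft.rfftfreq(len(x), d=1.0 / fs_hz) * 60.0
    band = (freqs_bpm >= lo_bpm) & (freqs_bpm <= hi_bpm)
    if not band.any():
        raise ValueError("series too short to resolve the requested band")
    # Mask everything outside the band, then pick the largest bin
    peak = int(np.argmax(np.where(band, spectrum, -1.0)))
    return float(freqs_bpm[peak])
```

With 60 seconds of data at 20 Hz the frequency resolution is one cycle per minute, so `dominant_rate_bpm(phase, 20, 6, 30)` and `dominant_rate_bpm(phase, 20, 40, 120)` would give breathing and heart-rate estimates in the ranges the panel displays. Real CSI phase needs unwrapping and detrending first, which this sketch leaves out.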

The Pose Fusion view goes further — it overlays WiFi-derived pose estimation onto a live camera feed. I connected my RTSP camera through Frigate's go2rtc, which already handles RTSP-to-HLS transcoding. The browser loads the HLS stream alongside the CSI data, and the fusion engine cross-correlates video motion with WiFi signal changes.

Real Data, Real Results

With the ESP32-S3 powered on and placed in my office, the system immediately detected:

Presence: true with confidence ~0.78
Motion level: present_moving → present_still → active
Person count: 1 (estimated from CSI subcarrier patterns)
64 subcarriers streaming at 20 Hz

All through the wall, with no camera in the room.

JSON Response

{
  "classification": {
    "confidence": 0.78,
    "motion_level": "present_moving",
    "presence": true
  },
  "estimated_persons": 1,
  "features": {
    "breathing_band_power": 34.27,
    "motion_band_power": 61.17,
    "spectral_power": 158.92
  }
}

Privacy by Design

This is the compelling part. There is no camera in the sensing loop. The ESP32 captures WiFi signal disturbances — amplitude and phase changes caused by human bodies scattering radio waves.

There are no images, no video frames, no biometric data stored. The "sensing" is fundamentally different from surveillance.

For applications like elderly care monitoring, hospital patient tracking, or smart building occupancy — where cameras raise serious privacy and regulatory concerns — WiFi sensing sidesteps the problem entirely.

What's Next

Multi-node mesh: Adding 3-6 ESP32-S3 nodes for full 360-degree room coverage with multistatic fusion.
ESP32-C6 + mmWave: Pairing the C6 with a Seeed MR60BHA2 60 GHz sensor for clinical-grade vital signs.
Edge WASM modules: 65 implemented edge intelligence modules run directly on the ESP32 as tiny WASM binaries (fall detection, sleep monitoring) with zero cloud dependency.
Training pipeline: Recording labeled CSI sessions to train the adaptive classifier for room-specific signal characteristics.

Try It Yourself

The fastest path to a working system:

# 1. Docker (simulated data, no hardware)
docker run -p 3000:3000 ghcr.io/zektopic/ruview:k8s-poc
# Open http://localhost:3000/ui/

# 2. With ESP32-S3 hardware (~$9)
# Flash firmware, provision WiFi, run server with --source auto

# 3. Full K8s deployment
# See the deployment guide for complete instructions

The entire system — firmware, server, UI, signal processing, neural networks — is open source under MIT. One $9 microcontroller and some WiFi signals. That's all it takes to give a room spatial awareness.

From Docker Compose to Kubernetes: Migrating a Real Homelab Stack

A practical account of migrating 15+ self-hosted services to K3s — including AMD GPU passthrough, WiFi camera routing, custom monitoring, and a feudal Japan dashboard.

Introduction

Most Kubernetes tutorials start with a todo app and end before things get complicated. This isn't that article.

This is the story of migrating a real homelab — 15+ production-grade services including AI camera detection with AMD ROCm GPU acceleration, Home Assistant with hardware integrations, Jellyfin with hardware video transcoding, and a custom Discord alerting pipeline — from Docker Compose to K3s running on a single machine.

Everything in this post happened on real hardware. Every problem described is a problem that actually happened. Every fix is the fix that actually worked.

The Stack Before Migration

The homelab ran on a single Ubuntu machine (manupa-hn-wx9x, 192.168.1.9) using Docker Compose. Services included:

Category	Services
Smart Home	Home Assistant, NanoMQ (MQTT broker)
Surveillance	Frigate (AMD ROCm GPU object detection), frigate-telegram bot
Media	Jellyfin
Monitoring	Netdata, Uptime Kuma
Tools	Stirling PDF, tldraw, FileBrowser, qBittorrent
Archiving	Archive Team Warrior
Privacy	Snowflake Proxy (Tor bridge)
AI	Open WebUI

Why migrate? The honest answer: Docker Compose works fine until you want to scale a single service independently, pin resource limits per container, get structured health alerting, or reproduce the entire stack from code in under 10 minutes. Kubernetes gives you all of that.

Why K3s

Full Kubernetes (kubeadm) adds significant operational overhead for a homelab. K3s is Rancher's lightweight distribution — a single binary, installs as a systemd service, and ships with:

Traefik — ingress controller (takes port 80)
local-path provisioner — dynamic PVC storage in /var/lib/rancher/k3s/storage/
CoreDNS — service discovery
Flannel — pod networking (VXLAN overlay, 10.42.0.0/16)
Built-in containerd — no separate Docker daemon needed

Installation:

curl -sfL https://get.k3s.io | sh -

One command. 30 seconds. Production-grade Kubernetes cluster on your desk.

Architecture

                    manupa-hn-wx9x (192.168.1.9)
                    ┌─────────────────────────────────────┐
                    │                                     │
                    │  homelab namespace                  │
                    │  ├── Frigate (GPU, hostNetwork)     │
                    │  ├── Jellyfin (GPU, hostNetwork)    │
                    │  ├── Home Assistant (hostNetwork)   │
                    │  ├── Netdata (DaemonSet)            │
                    │  ├── Uptime Kuma, FileBrowser       │
                    │  ├── qBittorrent, Stirling PDF      │
                    │  ├── tldraw, Archive Warrior        │
                    │  ├── Snowflake Proxy                │
                    │  ├── Open WebUI, Open Terminal      │
                    │  └── Homepage Dashboard             │
                    │                                     │
                    │  monitoring namespace               │
                    │  ├── Prometheus + Alertmanager      │
                    │  ├── Grafana                        │
                    │  ├── Node Exporter (DaemonSet)      │
                    │  └── Kube State Metrics             │
                    │                                     │
                    │  Still on Docker                    │
                    │  ├── frigate-telegram               │
                    │  └── NanoMQ (MQTT)                  │
                    └─────────────────────────────────────┘
                                      │
                             Nginx reverse proxy
                             (original ports → NodePorts)

All manifests managed with Kustomize (kubectl apply -k), stored in /home/manupa/Docker/k8s/.

Storage Strategy

The simplest approach for a single-node cluster with existing data: hostPath PersistentVolumes pointing directly at existing directories.

apiVersion: v1
kind: PersistentVolume
metadata:
  name: jellyfin-config-pv
spec:
  capacity:
    storage: 10Gi
  accessModes: [ReadWriteOnce]
  hostPath:
    path: /home/manupa/Docker/jellyfin/config

No data migration. No downtime. Existing files immediately available to pods. For new services with no existing data, the K3s local-path StorageClass handles dynamic provisioning automatically.

The trade-off: hostPath volumes are node-specific. When you add a second node, pods using hostPath must be pinned to the node where the data lives via nodeSelector. This is manageable but plan for shared storage (NFS or Longhorn) if you want true pod mobility later.

The Tricky Bits

1. AMD GPU Passthrough for Frigate (ROCm)

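Frigate's AMD ROCm GPU access requires three things in the pod spec: a privileged security context, the driver environment variables, and hostPath volumes for the /dev/kfd and /dev/dri device nodes: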

securityContext:
  privileged: true
env:
  - name: LIBVA_DRIVER_NAME
    value: "radeonsi"
  - name: HSA_ENABLE_SDMA
    value: "0"
volumes:
  - name: dev-kfd
    hostPath:
      path: /dev/kfd
  - name: dev-dri
    hostPath:
      path: /dev/dri

2. WiFi Hotspot vs. Flannel CIDR Conflict

K3s Flannel uses 10.42.0.0/16 by default. The WiFi hotspot on this machine had also auto-assigned itself an address in 10.42.x.x, so every pod lost network access once K3s was installed.

Fix: Change the hotspot subnet to something that doesn't conflict:

sudo nmcli connection modify Hotspot ipv4.addresses 10.50.0.1/24
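
This kind of clash is easy to check for before installing K3s. Below is a small sketch using Python's standard `ipaddress` module. The pod and service ranges are the K3s defaults; the hotspot address `10.42.0.1/24` in the usage note is an assumed example, since only "10.42.x.x" is known here.

```python
import ipaddress

# K3s default ranges: pods via Flannel, and ClusterIP services
K3S_DEFAULT_CIDRS = {
    "pods (Flannel)": "10.42.0.0/16",
    "services": "10.43.0.0/16",
}


def find_conflicts(interface_cidr, cluster_cidrs=K3S_DEFAULT_CIDRS):
    """Return the names of cluster ranges that overlap a host interface.

    `interface_cidr` is an address with prefix length, as shown by
    `ip addr`, for example "10.42.0.1/24".
    """
    local = ipaddress.ip_interface(interface_cidr).network
    return [
        name
        for name, cidr in cluster_cidrs.items()
        if local.overlaps(ipaddress.ip_network(cidr))
    ]
```

`find_conflicts("10.42.0.1/24")` reports the Flannel pod range, while the relocated hotspot at `10.50.0.1/24` comes back clean.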

3. Frigate Cameras on the Hotspot (10.50.0.x)

The cameras sit on the hotspot subnet, which the Frigate pod has to reach. Fix: hostNetwork: true on the Frigate pod. The pod then uses the host's network stack directly, which has a route to 10.50.0.x via the hotspot interface.

spec:
  hostNetwork: true
  dnsPolicy: ClusterFirstWithHostNet

4. /dev/shm Running Out (Frigate)

Frigate's shared memory usage hit 71% of the allocated 500Mi emptyDir limit. Fix: replace the size-limited emptyDir with a hostPath volume:

# After — disk-backed, no size limit
volumes:
  - name: dshm
    hostPath:
      path: /tmp/frigate-shm
      type: DirectoryOrCreate

Monitoring and Alerting

Stack

Node Exporter (DaemonSet) ──┐
Kube State Metrics          ├──▶ Prometheus ──▶ Alertmanager ──▶ Discord
K8s API / cAdvisor          ┘        │
                                     ▼
                                  Grafana

Discord Integration

receivers:
  - name: discord
    discord_configs:
      - webhook_url: 'https://discord.com/api/webhooks/...'
        title: >-
          {{ if eq .Status "firing" }}🔥{{ else }}✅{{ end }}
          [{{ .Status | toUpper }}] {{ .GroupLabels.alertname }}
        send_resolved: true

The Homepage Dashboard

Homepage (gethomepage/homepage) serves as the central entry point. The default dark theme was replaced with a custom feudal Japan / sumi-e (墨絵) aesthetic:

Background: deep ink-wash gradient with ambient radial glows in bamboo green and sakura rose
Cards: dark lacquer panels with barely-visible gold borders, 2px lift on hover
Group headers: antique gold, ultra-light weight, torii-bar underline
Typography: system serif stack (Hiragino Mincho ProN → Yu Mincho → Georgia)
Scrollbar: 3px thin, gold-tinted

Multi-Node Expansion

curl -sfL https://get.k3s.io | \
  K3S_URL=https://192.168.1.9:6443 \
  K3S_TOKEN=<token> \
  sh -

The node labelling pattern:

kubectl label node manupa-hn-wx9x role=primary gpu=amd
kubectl label node manupa role=worker

Lessons Learned

Plan your storage before you plan your services: If multi-node is in your future, set up NFS or Longhorn first.
hostNetwork is not a bad word: In a homelab, some services genuinely need it (mDNS/SSDP, WiFi hotspots).
GPU access in Kubernetes is not scary: Four YAML fields. It works the same as in Docker.
Alertmanager is better than dashboards for homelab ops: You want Discord to tell you when something breaks at 2am.
YAML sprawl is real but manageable: Kustomize keeps it organized.
K3s is genuinely production-grade: For a homelab, you give up almost nothing vs. full Kubernetes.

Closing

The migration took a weekend of focused work. The result is a homelab that is Documented, Version-controlled, Observable, Scalable, and Recoverable.

If you're running a homelab on Docker Compose and wondering whether Kubernetes is worth the learning curve — it is. Start with K3s. Start with one service. The rest follows naturally.

Stack: K3s v1.34.5 · Ubuntu 25.10 · AMD ROCm · Flannel CNI · Traefik · Kustomize
Services: Home Assistant · Frigate · Jellyfin · Prometheus · Grafana · Alertmanager · Netdata · Uptime Kuma · FileBrowser · Stirling PDF · tldraw · qBittorrent · Archive Warrior · Snowflake Proxy · Open WebUI · Open Terminal · Homepage

Revamping my Sensor Dashboard

New Release

Introducing SensorDash 2.0

A complete architectural overhaul designed for speed, scalability, and a superior user experience.

The Engine Room

Moving from SQLite to InfluxDB

The original version relied on SQLite, which lagged with continuous streams. We migrated to InfluxDB, a dedicated Time-Series Database (TSDB).

Blazing Fast Queries: Instant aggregation for historical data.
Automatic Data Lifecycle: Retention policies keep storage lightweight.
Scalability: Ingests thousands of data points per second.
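
To show what a sensor reading looks like once it leaves the relational model, here is a minimal sketch that formats one point in InfluxDB line protocol. The measurement, tag, and field names are hypothetical and not taken from SensorDash, and only float fields are handled to keep the example short.

```python
def _escape(text, chars):
    """Backslash-escape the characters line protocol treats as separators."""
    for ch in chars:
        text = text.replace(ch, "\\" + ch)
    return text


def to_line_protocol(measurement, tags, fields, timestamp_ns):
    """Format one point as: measurement,tag=value field=value timestamp."""
    if not fields:
        raise ValueError("a point needs at least one field")
    head = _escape(measurement, ", ")
    # Tags are sorted by key, which InfluxDB recommends for write performance
    tag_part = "".join(
        f",{_escape(key, ',= ')}={_escape(str(value), ',= ')}"
        for key, value in sorted(tags.items())
    )
    field_part = ",".join(
        f"{_escape(key, ',= ')}={float(value)!r}" for key, value in fields.items()
    )
    return f"{head}{tag_part} {field_part} {int(timestamp_ns)}"
```

Tags (such as which sensor or room) are indexed, while fields hold the measured values, which is what makes per-sensor aggregation over time cheap compared with scanning rows in SQLite.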

The Visuals

Embracing ShadCN UI

A powerful backend deserves a beautiful frontend. We rebuilt the UI using ShadCN UI (Tailwind) for a minimalist aesthetic.

Modern Aesthetics: Data is put front and center.
Interactive Widgets: Responsive, real-time monitoring.
Accessibility: Keyboard-friendly by default.
Dark Mode: A highly requested feature, now built-in.

Version Comparison

Feature	SensorDash 1.0 (Old)	SensorDash 2.0 (New)
Database	SQLite (Relational)	InfluxDB (Time-Series)
Query Speed	Slower on large datasets	Real-time / Instant
UI Library	Standard CSS/Bootstrap	ShadCN UI (Tailwind)
Scalability	Limited	High

What's Next?

With this robust foundation in place, I plan to introduce customizable alerts and multi-sensor overlay graphs in the next update.

Beyond the Microsoft Store: How to Build a Custom WSL Distribution from an Ubuntu 25.04 ISO

Installing a Custom Ubuntu WSL Distribution from an ISO

A comprehensive guide to extracting, packaging, and repairing a custom WSL image from scratch.

Introduction

This document provides a comprehensive guide on how to install a Windows Subsystem for Linux (WSL) distribution using a downloaded ISO file. The target distribution in this guide is named Ubuntu-25.04.

Installing a WSL distribution from an ISO is not a direct process. You cannot simply point WSL to the .iso file. Instead, the process involves extracting the core Linux filesystem from the ISO, packaging it into a compressed tarball (.tar.gz), importing this tarball into WSL, and performing post-installation fixes to ensure core utilities function correctly.

This guide details the entire journey, including initial failed attempts and the final successful methodology, providing both the necessary commands and the theory behind why certain steps were taken.

Part 1: Preparation and Initial Extraction Attempts

This part covers the initial setup and the challenges encountered during the extraction of the root filesystem. Understanding these failures is key to understanding the successful method.

Step 1: Creating Necessary Directories

Before starting, we need to create two folders on the Windows host system:

A temporary location to store the extracted filesystem and the final tarball.
A permanent location where WSL will store the virtual hard disk (.vhdx) for the new distribution.

The following commands were used to create these directories:

mkdir C:\temp\ubuntu-rootfs
mkdir C:\WSL\Ubuntu-25.04

Step 2: The Challenge of Filesystem Extraction

The core of the Linux OS inside the Ubuntu ISO is stored in a compressed file named filesystem.squashfs (or similar, in our case it was minimal.squashfs). The main challenge is extracting this file while preserving Linux-specific attributes like symbolic links and file permissions.

First Attempt: Using 7-Zip on Windows

The most straightforward approach is to use a tool like 7-Zip to extract the minimal.squashfs file.

Command Attempted:

"C:\Program Files\7-Zip\7z.exe" x "C:\...\minimal.squashfs" -o"C:\temp\ubuntu-rootfs"

Result: Failure.

Theory: This method failed because 7-Zip, when running on Windows as a standard user, does not have the necessary privileges to create symbolic links, which are fundamental to a Linux filesystem. It also cannot create special device files (e.g., /dev/null). This resulted in numerous "Cannot create symbolic link" errors.

Second Attempt: Using `unsquashfs` in WSL to a Windows Directory

A more advanced approach is to use the unsquashfs utility from within an existing WSL instance, which is designed to understand Linux filesystems. The initial idea was to extract directly to the temporary directory on the Windows C: drive.

Command Attempted:

wsl unsquashfs -f -d /mnt/c/temp/ubuntu-rootfs /mnt/c/.../minimal.squashfs

Result: Failure.

Theory: This attempt also failed, but for a more subtle reason. While unsquashfs in WSL can create symbolic links, the target filesystem was NTFS (the Windows C: drive, mounted at /mnt/c/). NTFS has limitations in how it handles the sheer number and complexity of Linux-style symbolic links, leading to a "Too many levels of symbolic links" error. The process also failed because it couldn't create character device files without root privileges.

Part 2: The Successful Method: Isolate and Repair

The successful strategy involved isolating the extraction and packaging process entirely within the WSL native (ext4) filesystem and then fixing the permissions as a post-installation step.

Step 3: Extraction Inside the WSL Filesystem

To overcome the filesystem limitations of NTFS, the extraction was performed inside the home directory of the default WSL user, which resides on a native ext4 filesystem.

Create a temporary directory inside WSL and copy the squashfs file into it:

wsl -- bash -c "mkdir -p ~/wsl-temp && cp /mnt/c/Users/ManupaWickramasinghe/Downloads/ubuntu-25.04-desktop-amd64/casper/minimal.squashfs ~/wsl-temp/"

Extract the filesystem without `sudo`: To avoid an interactive password prompt for sudo, the extraction was performed as a regular user. This required flags to bypass errors related to creating special files and attributes, which would be fixed later.

wsl -- bash -c "cd ~/wsl-temp && unsquashfs -f -no-xattrs -ignore-errors minimal.squashfs"

-f: Force overwrite of any existing files in the destination.
-no-xattrs: Prevents the tool from trying to write extended attributes, avoiding a class of permission errors.
-ignore-errors: Ignores errors related to creating special device files, allowing the extraction to complete.

Step 4: Packaging the Filesystem into a Tarball

With the filesystem successfully extracted to the squashfs-root directory inside ~/wsl-temp, it was then packaged into a compressed tarball.

wsl -- bash -c "cd ~/wsl-temp && tar -czvf ubuntu-25.04.tar.gz -C squashfs-root/ ."

-c: Create a new archive.
-z: Compress the archive with gzip.
-v: Verbose output (optional, shows files being added).
-f ubuntu-25.04.tar.gz: Specifies the output filename.
-C squashfs-root/: Changes to the squashfs-root directory before adding files. This is crucial to ensure the tarball does not contain an extra parent directory.
.: Specifies that all files in the current directory (now squashfs-root) should be added.

Step 5: Importing the Tarball into WSL

The final tarball was then moved from the WSL filesystem back to the Windows temporary directory and imported.

Move the tarball:

wsl -- bash -c "mv ~/wsl-temp/ubuntu-25.04.tar.gz /mnt/c/temp/"

Import the distribution:

wsl --import Ubuntu-25.04 C:\WSL\Ubuntu-25.04 C:\temp\ubuntu-25.04.tar.gz

This command registers Ubuntu-25.04 as a new WSL distribution, storing its virtual disk in the previously created C:\WSL\Ubuntu-25.04 directory.

Part 3: Post-Installation Configuration and Repair

The extraction method, while successful, left critical system files with incorrect ownership and permissions because it was run without sudo. This required a final repair phase.

Step 6: Repairing `sudo` and `passwd`

The sudo and passwd commands were not working due to incorrect file permissions. They were fixed by using wsl -u root to run commands as the root user from outside the distribution, bypassing the broken sudo.

Fix `sudo` executable and configuration files:
```
wsl -d Ubuntu-25.04 -u root -- bash -c "chown root:root /etc/sudo.conf && chown root:root /usr/bin/sudo && chmod 4755 /usr/bin/sudo"
```
Theory: /usr/bin/sudo must be owned by root and have the setuid bit set (the leading 4 in mode 4755). With that bit set, the program runs with its owner's privileges, here root, even when a normal user executes it.
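
To make the mode digits concrete, here is a small sketch that sets and inspects the setuid bit on a scratch file rather than on a real system binary. It assumes GNU `stat` (the `-c` format option), which is what Ubuntu ships; the variable names are illustrative.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Illustrative scratch file; nothing under /usr/bin is touched.
f="$(mktemp)"

chmod 0755 "$f"
before="$(stat -c '%a' "$f")"   # octal mode without the setuid bit

chmod 4755 "$f"
after="$(stat -c '%a' "$f")"    # octal mode with the setuid bit

echo "before=$before after=$after"

# The -u test is true when the setuid bit is set on the file.
if [ -u "$f" ]; then
  echo "setuid bit is set"
fi

rm -f "$f"
```

The same `stat -c '%a %U:%G' /usr/bin/sudo` call, run inside the distribution, shows whether the repair took effect.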

Fix `passwd` utility and shadow file:

wsl -d Ubuntu-25.04 -u root -- bash -c "chown root:root /usr/bin/passwd && chmod 4755 /usr/bin/passwd && chown root:shadow /etc/shadow && chmod 640 /etc/shadow"

Theory: Similar to sudo, passwd needs setuid to modify the protected /etc/shadow file, which stores user password hashes.

Fix the `/etc/sudoers` file:

wsl -d Ubuntu-25.04 -u root -- bash -c "chown root:root /etc/sudoers && chmod 0440 /etc/sudoers"
wsl -d Ubuntu-25.04 -u root -- bash -c "chown -R root:root /etc/sudoers.d && chmod 0755 /etc/sudoers.d"

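Theory: /etc/sudoers is the main configuration file for sudo. It must be owned by root and must not be writable by other users (mode 0440 is the conventional setting); sudo refuses to run if the ownership or permissions are wrong.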
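
After the repairs in this step, the expected ownership and modes can be checked in one pass. This is a hedged sketch rather than part of the original write-up: `check_perm` is a hypothetical helper, it relies on GNU `stat`, and the expected values are simply the ones set by the commands above.

```shell
#!/usr/bin/env bash
set -euo pipefail

# check_perm FILE MODE OWNER:GROUP
# Compares the file's octal mode and owner against the expected values.
check_perm() {
  local file="$1" want_mode="$2" want_owner="$3" have_mode have_owner
  have_mode="$(stat -c '%a' "$file")"
  have_owner="$(stat -c '%U:%G' "$file")"
  if [ "$have_mode" = "$want_mode" ] && [ "$have_owner" = "$want_owner" ]; then
    echo "ok       $file $have_mode $have_owner"
  else
    echo "MISMATCH $file $have_mode $have_owner (expected $want_mode $want_owner)"
    return 1
  fi
}

# Expected values taken from the repair commands in this step.
while read -r path mode owner; do
  # Skip paths that do not exist on the machine running the sketch.
  [ -e "$path" ] || continue
  check_perm "$path" "$mode" "$owner" || true
done <<'EOF'
/usr/bin/sudo 4755 root:root
/usr/bin/passwd 4755 root:root
/etc/shadow 640 root:shadow
/etc/sudoers 440 root:root
EOF
```

Run inside the new distribution, any `MISMATCH` line points at a file that still needs a `chown` or `chmod`.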

Step 7: Final User Configuration

With the system repaired, a user account was created and configured.

Create a new user and add to the `sudo` group:

wsl -d Ubuntu-25.04 -u root -- useradd -m -G sudo -s /bin/bash manupawick

Set the new user as the default for login:

wsl -d Ubuntu-25.04 -u root bash -c "echo -e '[user]\ndefault=manupawick' > /etc/wsl.conf"
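
For reference, the file this produces is a two-line INI fragment. The sketch below writes the same content with `printf`, which avoids depending on how a given shell's `echo` handles `-e`. The helper name `write_wsl_conf` is hypothetical, and by default it writes to a scratch file so it is safe to run anywhere; inside the distribution, as root, the target would be /etc/wsl.conf.

```shell
#!/usr/bin/env bash
set -euo pipefail

# write_wsl_conf USER [TARGET]
# TARGET defaults to a scratch file; pass /etc/wsl.conf when running
# as root inside the distribution.
write_wsl_conf() {
  local user="$1" target="${2:-$(mktemp)}"
  printf '[user]\ndefault=%s\n' "$user" > "$target"
  echo "$target"
}

conf="$(write_wsl_conf manupawick)"
cat "$conf"
```

The output is a `[user]` section header followed by `default=manupawick`, which WSL reads the next time the distribution starts.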

Set the user's password non-interactively:

wsl -d Ubuntu-25.04 -u root -- bash -c "echo 'manupawick:password12312' | chpasswd"

Finalize the installation by shutting down WSL to apply all changes:
```
wsl --shutdown
```

Conclusion

The installation was successful. The key takeaway is that extracting a Linux filesystem for WSL requires careful handling of file permissions and symbolic links. The most reliable method is to perform the extraction and packaging within a native Linux filesystem (like the one provided by WSL itself) and then perform targeted permission repairs as a post-installation step.
