Building a Reliable Windows + Linux Local AI Workstation
Spec a workstation-class box, then choose WSL2 + Docker over dual-boot or full VMs with a clear read of Type-1 vs. Type-2 virtualization, and bring a GPU-backed service stack up as containers: the foundation for the whole build.
Part 1 of a 9-part series: teaching CompTIA A+ (Core 1 / 220-1101 and Core 2 / 220-1102) through a real build, a private, local AI workstation/server for a small business.
The job: one box, two operating systems, no cloud
A small business wants its own AI assistant (language models, document analysis, the works) running on hardware it owns, in a room it controls. No data leaves the building. That last constraint is the whole reason the project exists, and it's the reason the cloud-shaped answer ("just rent some GPUs") isn't on the table.
So you're standing up the first box. And almost immediately you hit a fork that every homelabber and field tech eventually meets. The operator's tooling lives on Windows: that's where the admin sits, where the GPU vendor's driver tooling is most comfortable, and where the people who use this thing already work. But the service stack you're about to run (a database, a cache, a model server, an API) is happiest on Linux. Containers, systemd, the whole ecosystem assumes Linux.
You have three ways to make one machine do both:
- Dual-boot. Two operating systems on the same disk, pick one at power-on. Clean separation, but you can only run one at a time, which is useless when the Linux services need to be up while the operator works in Windows.
- Full VMs (a Type-2 hypervisor). Run a complete Linux virtual machine on top of Windows. Works, but you're paying the full tax of a second kernel, virtual disks, and a hard memory carve-out.
- WSL2 + Docker. Run a real Linux kernel alongside Windows through the Windows Subsystem for Linux, then run the service stack as containers inside it.
We picked door number three, and this article is about why, plus all the A+ hardware and OS knowledge you need to make the choice on purpose instead of by accident. By the end you'll have spec'd the box, understood the virtualization options well enough to defend the decision, and brought the base stack up with a repeatable script.
📘 Objectives covered (220-1101 / 220-1102)
> This article maps to the following CompTIA A+ exam objectives. If you're studying for the exam, these are your anchors; if you're here for the build, skim past: the box explains itself.
>
> Core 1 (220-1101)
> - 3.x: Hardware: selecting a CPU, RAM capacity, power supply (PSU) headroom, and case/cooling for a compute-heavy workstation.
> - 4.x: Virtualization & Cloud: purpose of virtual machines; **Type-1 vs. Type-2 hypervisors**; resource requirements; WSL2 as a worked virtualization case.
>
> Core 2 (220-1102)
> - 1.x: Operating Systems: Windows editions/features and the virtualization features they require; Linux installation and basic config; the difference between an OS install and an OS that's ready to do work.
>
> Concepts taught below: workstation component selection, Type-1 vs. Type-2 virtualization, WSL2 architecture, containers vs. VMs, and dual-OS install/config basics on both platforms.
Concepts: the hardware and the hypervisor
Picking the parts (1101 3.x)
A box that runs AI inference is not a normal office PC, and the A+ hardware domain gives you the vocabulary to size it correctly. Four things matter more than the rest:
CPU and PCIe lanes. The CPU isn't doing the heavy math (the GPU is), but the CPU feeds the GPU, and it does so over PCIe lanes. A GPU wants a full PCIe x16 slot. Consumer desktop CPUs expose a limited number of lanes (commonly enough for one full-speed GPU plus some NVMe); the moment you want a second GPU at full bandwidth, or a GPU and a fast NVMe array, you're looking at a workstation-class CPU (the Xeon-W / Threadripper-PRO tier) that exposes far more lanes. For the A+ exam and for real life: count your lanes before you count your cores.
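The lane-counting advice above can be sketched as simple arithmetic. This is an illustration only: the function names are hypothetical, and the per-device figures (16 lanes per GPU, 4 lanes per NVMe drive) are common assumptions rather than values read from any particular motherboard manual.

```shell
# Rough PCIe lane budget (illustrative assumptions: x16 per GPU, x4 per NVMe drive).

# lanes_needed <gpu_count> <nvme_count>
lanes_needed() {
  echo $(( $1 * 16 + $2 * 4 ))
}

# lane_check <cpu_lanes> <gpu_count> <nvme_count>
# Prints a verdict; returns 0 if the devices fit the CPU's lane count, 1 if not.
lane_check() {
  local cpu_lanes need
  cpu_lanes=$1
  need=$(lanes_needed "$2" "$3")
  if [ "$need" -le "$cpu_lanes" ]; then
    echo "OK: need $need of $cpu_lanes lanes"
  else
    echo "SHORT: need $need lanes, CPU provides $cpu_lanes"
    return 1
  fi
}
```

For a hypothetical 24-lane desktop CPU, `lane_check 24 1 1` fits, while `lane_check 24 2 2` comes up short. The real number to plug in is the lane count on the CPU's spec sheet, and chipset-attached slots complicate the picture further.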
RAM capacity. System RAM is separate from GPU VRAM, and you need plenty of both. The OS, the database, the cache, the container runtime, and the model server all want system memory; a serious build starts at 128 GB and scales toward 1–2 TB on the multi-GPU tower tier. Under-provisioning RAM shows up as swapping and stalls long before you run out of GPU.
PSU headroom (this is the one people get wrong). A single high-end GPU can pull ~600 W on its own. Two of them plus a workstation CPU is already a ~1.5 kW machine, and a four-GPU tower approaches ~3 kW. The power supply must be rated well above the steady-state draw to absorb transient spikes: a PSU running at its absolute ceiling browns out under load, and the symptom (random reboots under inference, not at idle) looks like a software bug for days. Spec the PSU for transient peak, not average. At that power level you're also often past what a standard wall circuit delivers; a multi-GPU tower may need a dedicated 240 V circuit, which is a real install-day fact, not a footnote.
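The PSU sizing rule above reduces to a few lines of integer arithmetic. The sketch below is a rough planning aid, not an electrical design: the function names are hypothetical, the CPU and "everything else" wattages and the headroom percentage are assumptions you supply, and the 80% continuous-load figure is a common rule of thumb for branch circuits. It also ignores PSU efficiency losses at the wall.

```shell
# psu_recommend_watts <gpu_count> <gpu_watts> <cpu_watts> <other_watts> <headroom_pct>
# Steady-state draw plus a percentage of headroom for transient spikes.
psu_recommend_watts() {
  local steady
  steady=$(( $1 * $2 + $3 + $4 ))
  echo $(( steady * (100 + $5) / 100 ))
}

# circuit_continuous_watts <volts> <amps>
# Continuous capacity of a wall circuit, assuming the 80% rule of thumb.
circuit_continuous_watts() {
  echo $(( $1 * $2 * 80 / 100 ))
}
```

With four 600 W GPUs, an assumed 350 W CPU, 150 W for everything else, and 30% headroom, `psu_recommend_watts 4 600 350 150 30` prints 3770. `circuit_continuous_watts 120 15` prints 1440, which shows why a tower at that level outgrows an ordinary 120 V / 15 A outlet.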
Case and cooling. GPUs at 600 W each dump that wattage into the room as heat. A 2-GPU desktop survives on air; a 4-GPU tower needs serious airflow or liquid cooling and gets loud; push past that and you're in rack/liquid-cooled, server-room territory. Thermal envelope is a legitimate reason to choose a smaller, quieter footprint even when a denser box is technically faster: for an office or lab, "fits in the cabinet and doesn't sound like a jet" can outweigh raw throughput.
Type-1 vs. Type-2, and where WSL2 fits (1101 4.x)
Virtualization is a guaranteed A+ topic, and the distinction the exam cares about is hypervisor type:
- A Type-1 (bare-metal) hypervisor runs directly on the hardware, with the host OS and guest OSes both sitting on top of it. ESXi, Hyper-V in its server role, and Proxmox are Type-1. This is the data-center / "this machine exists to host VMs" pattern.
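- A Type-2 (hosted) hypervisor runs as an application on top of an existing OS. VirtualBox and VMware Workstation are Type-2. This is the "I have a Windows machine and I also want a Linux VM" pattern, which is exactly our situation. (Client Hyper-V on a Windows desktop feels hosted, but architecturally it is still Type-1: once it is enabled, Windows itself runs as a privileged partition on top of the hypervisor.)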
WSL2 is the interesting case, and it's worth being precise because the exam and the marketing both blur it. WSL2 runs a real Linux kernel inside a lightweight, managed VM on top of Windows' virtualization platform (the same hardware-virtualization features Hyper-V uses). So under the hood it leans on Type-1-style hardware virtualization, but from the operator's chair it behaves like a Type-2 convenience: you're in Windows, you type wsl, and you're in Linux, shared filesystem, shared network, no separate VM to boot and babysit. For exam purposes, treat WSL2 as a virtualization feature of Windows that gives you a genuine Linux kernel without a full, manually-managed VM.
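One quick way to see which side of that line a shell is on is to look at the kernel release string. The sketch below classifies a string such as the output of `uname -r`. The function name is hypothetical, and the patterns are assumptions based on commonly observed values (WSL2 kernels typically end in `microsoft-standard-WSL2`, while WSL1 reports a `-Microsoft` suffix); a custom kernel could break them.

```shell
# classify_kernel <kernel-release-string>
# Prints "wsl2", "wsl1", or "native-or-other" based on naming conventions only.
classify_kernel() {
  case "$1" in
    *WSL2*|*microsoft-standard*) echo "wsl2" ;;
    *Microsoft*|*microsoft*)     echo "wsl1" ;;
    *)                           echo "native-or-other" ;;
  esac
}

# Typical use inside the Linux environment:
#   classify_kernel "$(uname -r)"
```

This is a convenience check, not proof; `wsl -l -v` on the Windows side remains the authoritative answer.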
Containers vs. VMs
Once you have Linux (via WSL2), you have a second choice for the services: plain processes, or containers. We use containers (Docker). The A+-level distinction:
- A VM virtualizes hardware: each VM carries its own full OS kernel. Heavy, isolated, slow to start.
- A container virtualizes the OS: containers share the host's kernel and package just the app and its dependencies. Light, fast, reproducible.
For a service stack (database, cache, model server, API) you want containers: one docker compose up brings the whole thing online identically every time, and tearing it down leaves no residue on the host. We're effectively nesting: Windows → WSL2 (a managed Linux VM) → Docker containers. That sounds like a lot of layers, but each one earns its place, and the operator only ever sees two: Windows for their tools, and one command to manage the stack.
Hands-on walkthrough: spec the box, then bring it up
Step 1: A generic, defensible hardware spec
Here's a starting workstation spec for a single-box build, with the A+ reasoning attached to each line. Treat it as a template, not a shopping list: scale to the models you actually intend to run.
| Component | Starting spec | Why (the A+ reasoning) |
|---|---|---|
| CPU | Workstation-class (high PCIe-lane count) | Feeds the GPU(s) over PCIe x16; lane count caps how many GPUs/NVMe you can run at full speed (1101 3.x) |
| GPU | 1× high-VRAM card to start (≥16 GB), tower chassis that accepts more | VRAM is the gating resource for model size; chassis headroom lets you add cards later |
| System RAM | 128 GB ECC, expandable | OS + DB + cache + model server all want host RAM, separate from VRAM |
| Boot/OS disk | NVMe SSD (PCIe) | Fast OS + container layer; covered in depth in Part 2 |
| PSU | Rated well above peak GPU + CPU draw | A GPU can pull ~600 W; size for transient peak, not average (random reboots under load = under-spec'd PSU) |
| Cooling | Air for 1–2 GPUs; plan airflow/liquid beyond | GPUs dump their wattage as heat; thermal envelope can decide the chassis |
A note on the GPU tier, because it drives everything downstream: a single card with ≥16 GB VRAM is the practical floor for serious local inference. Below ~8 GB you're limited to tiny models; a single high-end card runs most mid-size models comfortably; two or more cards with high combined VRAM unlock the largest models. We'll lean on this tiering again when we automate hardware detection below and throughout the series.
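The tiering above can be turned into a small classifier. The sketch below reads one VRAM figure in MiB per line, which is the shape produced by `nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits`. The function name, the tier labels, and especially the 48 GB threshold for "high combined VRAM" are assumptions made for illustration; the article only fixes the ~8 GB and 16 GB boundaries.

```shell
# gpu_tier: reads one VRAM size in MiB per line on stdin, prints a tier label.
gpu_tier() {
  local count total_mib total_gb mib
  count=0
  total_mib=0
  while read -r mib; do
    case "$mib" in
      ''|*[!0-9]*) continue ;;   # skip blank or non-numeric lines
    esac
    count=$(( count + 1 ))
    total_mib=$(( total_mib + mib ))
  done
  # Round to the nearest GB: cards report slightly under their nominal size.
  total_gb=$(( (total_mib + 512) / 1024 ))
  if [ "$count" -eq 0 ]; then
    echo "no GPU (CPU only)"
  elif [ "$total_gb" -lt 8 ]; then
    echo "tiny models only (${total_gb} GB)"
  elif [ "$total_gb" -lt 16 ]; then
    echo "below the 16 GB floor (${total_gb} GB)"
  elif [ "$count" -ge 2 ] && [ "$total_gb" -ge 48 ]; then   # 48 GB is an assumed cutoff
    echo "largest models (${count} GPUs, ${total_gb} GB)"
  else
    echo "mid-size models (${count} GPU(s), ${total_gb} GB)"
  fi
}

# Typical use on a machine with the NVIDIA driver installed:
#   nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits | gpu_tier
```

Rounding matters here: a nominal 16 GB card often reports a few MiB less, and plain integer division would push it into the tier below.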
Step 2: Enable virtualization in UEFI/BIOS
None of the virtualization works if the CPU's virtualization extensions are disabled in firmware, and on a fresh box they often are. Reboot into UEFI/BIOS and enable:
- Intel VT-x or AMD-V (CPU virtualization): sometimes labeled "SVM Mode" on AMD or just "Virtualization Technology."
- VT-d / AMD-Vi (IOMMU) if you'll pass hardware through to guests.
This is the single most common reason WSL2 or any hypervisor refuses to start, and the error message rarely says "go fix your BIOS." Check it first.
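If the box can boot native Linux (a live USB is enough), the CPU flags give a quick first read. The sketch below scans `/proc/cpuinfo`-style text for the `vmx` (Intel) or `svm` (AMD) flag; the function name is hypothetical. Treat the result with care: a missing flag means virtualization is not available to this OS, but on some kernels a present flag only proves the CPU supports the extension, not that firmware has it switched on. Inside a VM or WSL2 the flag may also be hidden. On Windows, Task Manager's Performance > CPU page reports "Virtualization: Enabled/Disabled" directly.

```shell
# virt_flag: reads /proc/cpuinfo-style text on stdin.
# Prints which hardware-virtualization flag is present; returns 1 if neither is.
virt_flag() {
  local flags
  flags=$(grep -E '^flags[[:space:]]*:' | head -n 1)
  case " $flags " in
    *" vmx "*) echo "vmx (Intel VT-x)" ;;
    *" svm "*) echo "svm (AMD-V)" ;;
    *)         echo "none"; return 1 ;;
  esac
}

# Typical use on native Linux:
#   virt_flag < /proc/cpuinfo
```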
Step 3: Install WSL2 and confirm the kernel version
On the Windows side, from an elevated PowerShell:
wsl --install
# reboot when prompted, then:
wsl --set-default-version 2
wsl --install -d Ubuntu-24.04
The --set-default-version 2 line matters. WSL has two generations: WSL1 translated Linux syscalls and had no real kernel; WSL2 ships an actual Linux kernel in a managed VM and is what Docker and the service stack require. A distro accidentally left at version 1 will fail in confusing ways. Verify:
wsl -l -v
# NAME STATE VERSION
# * Ubuntu-24.04 Running 2
That VERSION column reading 2 is your first verification checkpoint.
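That checkpoint can also be scripted. The sketch below parses `wsl -l -v` style output and fails if any listed distro is not at version 2. The function name is hypothetical, and it assumes the text has already been converted to UTF-8; when `wsl.exe` is called from a Linux shell its output is, as far as I know, UTF-16LE, so the usage line pipes it through `iconv` first.

```shell
# all_wsl2: reads `wsl -l -v` output (UTF-8) on stdin.
# Returns 0 only if at least one distro is listed and every one reports VERSION 2.
all_wsl2() {
  tr -d '\r' | awk '
    NR == 1 { next }     # skip the NAME / STATE / VERSION header
    NF == 0 { next }     # skip blank lines
    {
      seen = 1
      if ($NF != "2") { print "not WSL2: " $0; bad = 1 }
    }
    END { exit (bad || !seen) ? 1 : 0 }
  '
}

# Typical use from inside WSL (the encoding conversion is an assumption):
#   wsl.exe -l -v | iconv -f UTF-16LE -t UTF-8 | all_wsl2 && echo "all distros on WSL2"
```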
Step 4: The base-stack bring-up script
Inside the Linux environment, a bootstrap script does the unglamorous prep work before the application installer takes over. The real installer in this build is a few hundred lines of defensive Bash; here's the shape of it, genericized, showing the steps that matter for A+ understanding:
#!/usr/bin/env bash
set -euo pipefail
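# 1. Refuse to run on an unsupported OS; fail early with a clear message.
detect_distro() {
  . /etc/os-release
  case "$ID-$VERSION_ID" in
    ubuntu-22.04|ubuntu-24.04|debian-12) echo "OK: $PRETTY_NAME" ;;
    rhel-9*|rocky-9*|almalinux-9*)       echo "OK: $PRETTY_NAME" ;;
    *) echo "Unsupported distro: $ID $VERSION_ID" >&2; exit 3 ;;
  esac
}
detect_distro   # define it AND call it; a check that never runs protects nothing
# (The package steps below show the apt path for Debian/Ubuntu only; a RHEL-family host would use dnf instead.)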
# 2. Must be root — installing packages and configuring services needs it.
[[ $EUID -eq 0 ]] || { echo "Run with sudo." >&2; exit 4; }
# 3. Install OS prerequisites: container runtime, Python, TLS + diagnostics.
apt-get update -y
apt-get install -y \
ca-certificates curl gnupg git openssl \
python3 python3-venv \
  net-tools dnsutils iputils-ping netcat-openbsd   # netstat, nslookup/dig, ping, nc
# 4. Install Docker engine + the compose plugin (skip if already present).
if ! command -v docker >/dev/null 2>&1; then
  . /etc/os-release   # provides ID (ubuntu or debian) and VERSION_CODENAME
  install -m 0755 -d /etc/apt/keyrings
  curl -fsSL "https://download.docker.com/linux/${ID}/gpg" \
    -o /etc/apt/keyrings/docker.asc
  echo "deb [signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/${ID} ${VERSION_CODENAME} stable" \
    > /etc/apt/sources.list.d/docker.list
  apt-get update -y
  apt-get install -y docker-ce docker-ce-cli containerd.io docker-compose-plugin
  systemctl enable --now docker
fi
# 5. GPU preflight: don't continue silently on CPU if a GPU is expected.
if ! command -v nvidia-smi >/dev/null 2>&1; then
  echo "WARNING: no GPU driver detected; inference will run on CPU (~50× slower)." >&2
fi
Two things in there are worth calling out as real-world lessons baked into code, because they're the kind of thing the exam's troubleshooting domain is testing for:
- It installs diagnostic tools up front (net-tools, dnsutils, iputils-ping, netcat-openbsd). A minimal Linux image ships with almost none of these. When something breaks on site at 6 p.m., you do not want the first blocker to be "I can't even run ping." A few megabytes of cheap insurance.
- It warns, loudly, when a GPU is expected but absent. GPU-accelerated containers silently fall back to CPU if the host driver/runtime isn't there: roughly 50× slower, and "working" enough that nobody notices until the thing is unusably slow in production. Make the failure visible.
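The preflight in the script above only prints a warning. A stricter variant, sketched here under assumptions, stops the install when the operator has declared that a GPU is expected. The function name, the `expect` argument, and exit code 5 are all hypothetical; the probe command is a parameter so the logic can be exercised on a machine without a GPU.

```shell
# gpu_preflight <expect_gpu: 0|1> [probe_cmd]
# Returns 0 if the probe command exists, or if no GPU was expected.
# Returns 5 (an arbitrary code chosen for this sketch) if a GPU was expected but is missing.
gpu_preflight() {
  local expect probe
  expect=$1
  probe=${2:-nvidia-smi}
  if command -v "$probe" >/dev/null 2>&1; then
    echo "GPU tooling found: $probe"
    return 0
  fi
  if [ "$expect" = "1" ]; then
    echo "ERROR: GPU expected but '$probe' not found; refusing to continue on CPU." >&2
    return 5
  fi
  echo "WARNING: no GPU driver detected; inference will run on CPU." >&2
  return 0
}

# Typical use in an installer running under `set -e`:
#   gpu_preflight 1 || exit $?
```

Finding `nvidia-smi` on the PATH shows the driver tooling is installed, not that the container toolkit is; the in-container check in the verification section covers that second half.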
Step 5: The base services (containers)
The service stack comes up as containers via Compose. A trimmed docker-compose.yml for the foundation:
services:
  postgres:                      # episodic/relational storage
    image: postgres:15-alpine
    environment:
      POSTGRES_DB: appdb
      POSTGRES_USER: appuser
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
    ports: ["5432:5432"]
    volumes: ["postgres_data:/var/lib/postgresql/data"]
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U appuser -d appdb"]
      interval: 10s
      retries: 5
  redis:                         # working memory / cache / pub-sub
    image: redis:7-alpine
    command: ["redis-server", "--appendonly", "yes", "--requirepass", "${REDIS_PASSWORD}"]
    ports: ["6379:6379"]
    volumes: ["redis_data:/data"]   # persist the append-only file
    healthcheck:
      test: ["CMD-SHELL", "redis-cli -a \"${REDIS_PASSWORD}\" --no-auth-warning ping | grep -q PONG"]
      interval: 10s
      retries: 5
  inference-server:              # local LLM serving endpoint
    image: ollama/ollama:latest
    ports: ["11434:11434"]
    deploy:                      # pass the GPU through to the container
      resources:
        reservations:
          devices: [{ driver: nvidia, count: all, capabilities: [gpu] }]
    volumes: ["inference_data:/root/.ollama"]
volumes:
  postgres_data:
  redis_data:
  inference_data:
Note the deploy.resources.reservations.devices block on the inference server: that's how a container is granted the host's GPU. It only works if the host has both the GPU driver and the container toolkit installed; miss either and the container starts fine but runs on CPU. (That's the silent-fallback trap from Step 4; now you see where it bites.)
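The compose file reads `${POSTGRES_PASSWORD}` and `${REDIS_PASSWORD}` from the environment, and Compose picks them up automatically from a `.env` file next to `docker-compose.yml`. One way to create that file is sketched below. The function names are hypothetical, and the 24-byte secret length is an arbitrary choice; `openssl` is used when present (the bootstrap script installs it) with `/dev/urandom` as a fallback.

```shell
# gen_secret: prints 48 hex characters (24 random bytes).
gen_secret() {
  if command -v openssl >/dev/null 2>&1; then
    openssl rand -hex 24
  else
    head -c 24 /dev/urandom | od -An -tx1 | tr -d ' \n'
  fi
}

# write_env_file <path>
# Writes both secrets to <path>, readable by the owner only.
# Refuses to overwrite an existing file so a re-run cannot rotate secrets by accident.
write_env_file() {
  local path
  path=$1
  if [ -e "$path" ]; then
    echo "refusing to overwrite $path" >&2
    return 1
  fi
  (
    umask 077
    printf 'POSTGRES_PASSWORD=%s\nREDIS_PASSWORD=%s\n' "$(gen_secret)" "$(gen_secret)" > "$path"
  )
}

# Typical use, from the directory that holds docker-compose.yml:
#   write_env_file .env
```

The overwrite guard matters for Postgres in particular: the image applies `POSTGRES_PASSWORD` only when it first initializes an empty data volume, so a regenerated `.env` would no longer match the database.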
Verification: prove the box actually works
A build isn't done when the installer exits: it's done when you've confirmed each layer. Three checks, bottom to top:
1. The Linux VM is at version 2 (from Windows):
wsl -l -v # VERSION column must read 2
2. The base services are up and healthy (inside Linux):
docker ps
# CONTAINER ID IMAGE STATUS
# ... postgres:15-alpine Up 2 minutes (healthy)
# ... redis:7-alpine Up 2 minutes (healthy)
# ... ollama/ollama:latest Up 2 minutes
Look specifically for (healthy), not just "Up": a container can be running while its app inside is broken. The healthchecks in the compose file are what populate that status.
3. The GPU chain and the inference endpoint work end to end (cross-boundary smoke test). From the directory that holds docker-compose.yml, confirm the GPU is visible inside the container and the inference server answers on its published port:
docker compose exec inference-server nvidia-smi # GPU visible inside the container?
curl -s http://localhost:11434/api/tags # inference server answering?
If nvidia-smi prints your card from inside the container, the whole Windows → WSL2 → Docker → GPU chain is intact. With WSL2's default localhost forwarding, the same request from a Windows terminal (curl.exe) should also answer, which confirms the bridge from the other side. That's the moment the foundation is real.
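Checking for (healthy) by eye works once; in a script it helps to poll until the status arrives. The sketch below is a generic retry loop, with a hypothetical function name, that runs any command repeatedly until it prints the wanted string. The retry count and delay are assumptions to tune.

```shell
# wait_for_status <want> <retries> <delay_seconds> <command...>
# Runs <command...> up to <retries> times until its output equals <want>.
wait_for_status() {
  local want retries delay got i
  want=$1
  retries=$2
  delay=$3
  shift 3
  got=""
  i=1
  while [ "$i" -le "$retries" ]; do
    got=$("$@" 2>/dev/null || true)
    if [ "$got" = "$want" ]; then
      echo "ok after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$(( i + 1 ))
  done
  echo "timed out; last status: ${got:-unknown}" >&2
  return 1
}

# Typical use against the postgres service from the compose file:
#   wait_for_status healthy 30 2 \
#     docker inspect --format '{{.State.Health.Status}}' "$(docker compose ps -q postgres)"
```

Only services that define a healthcheck report a health status, so this applies to containers whose compose entry has a `healthcheck:` block.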
🎯 What the exam asks
> CompTIA frames this material in a few predictable ways. Know these cold:
>
> - Type-1 vs. Type-2 hypervisor is a near-guaranteed question. Type-1 = bare metal, runs on the hardware (ESXi, Proxmox, server-role Hyper-V). Type-2 = hosted, runs as an app on an OS (VirtualBox, VMware Workstation). If the scenario says "installed on top of an existing Windows desktop," the answer is Type-2. The exam also tests this distinction when comparing approaches: dual-boot requires you to reboot to switch operating systems (only one runs at a time), while virtualization lets both the host OS and guest(s) run concurrently.
> - Resource requirements for virtualization: the exam wants you to know a VM needs CPU virtualization support (VT-x / AMD-V) enabled in UEFI/BIOS, plus adequate RAM and disk allocated to the guest. "VM won't start" → check that virtualization is enabled in firmware first.
> - Windows editions & features: client Hyper-V is edition-dependent (Pro and up), while WSL2 also runs on Home; both need the right optional Windows features turned on. Expect a question that hinges on "which edition supports this feature."
> - VM vs. container may appear as a "best choice" scenario. Containers = lightweight, share the host kernel, fast to deploy, ideal for apps/services. VMs = full OS isolation, heavier. Match the tool to the requirement.
> - PSU sizing lives in the hardware domain: the exam tests that you size a power supply for the total draw with headroom, and that you recognize "random shutdowns/reboots under load" as a power symptom, not a software one.
Common pitfalls (most of these are from the real build)
- Under-spec'd PSU for the GPUs. The classic. The machine is rock-solid at idle and reboots randomly under inference load. Looks like a software crash; it's the PSU browning out on transient peaks. Size for peak + headroom.
- WSL2 memory ballooning. By default WSL2 will happily claim a large slice of host RAM and not give it back, starving the Windows side. Cap it explicitly in .wslconfig (memory=, processors=) so the operator's Windows tools don't get squeezed.
- Virtualization disabled in UEFI/BIOS. WSL2 / Hyper-V refuse to start and the error doesn't point at the cause. Always the first thing to check.
- Wrong WSL version. A distro silently left at WSL1 breaks Docker and the service stack in baffling ways. Run wsl -l -v and confirm the 2.
- Silent CPU fallback. GPU containers run on CPU (~50× slower) when the host is missing the GPU driver or container toolkit. The stack "works," just uselessly slowly. The installer's GPU preflight exists precisely to make this loud instead of silent.
- Mixing up Type-1 and Type-2 expectations. A concrete trap: standing up a Type-2 desktop hypervisor like VirtualBox or VMware Workstation and expecting the isolation/performance guarantees of a Type-1 bare-metal system like ESXi. Know which one you're running and what it can and can't promise.
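For the memory-ballooning pitfall, the cap lives in a `.wslconfig` file in the Windows user's profile folder, under a `[wsl2]` section. The sketch below just renders that file's contents; the function name is hypothetical, and the numbers in the usage line are placeholders to size against your own host, not recommendations.

```shell
# render_wslconfig <memory_gb> <processors>
# Prints a minimal .wslconfig that caps the WSL2 VM's RAM and CPU count.
render_wslconfig() {
  printf '[wsl2]\nmemory=%sGB\nprocessors=%s\n' "$1" "$2"
}

# Typical use from inside WSL (replace <you> with the Windows user name):
#   render_wslconfig 64 16 > /mnt/c/Users/<you>/.wslconfig
# Then, from Windows, restart WSL so the limits apply:
#   wsl --shutdown
```

Leave enough headroom on the Windows side for the operator's tools, and remember the cap covers everything inside WSL2, including every container.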
Recap + what's next
You spec'd a workstation with the A+ hardware domain doing the reasoning (PCIe lanes, RAM, PSU headroom, cooling), chose WSL2 + Docker over dual-boot or full VMs with a clear-eyed read of Type-1 vs. Type-2 virtualization, and brought a base service stack (database, cache, inference server) up as containers with the GPU passed through. You verified every layer instead of trusting the installer's exit code.
The foundation is live. But there's a problem hiding in that hardware table: we put the OS and the database on "an NVMe SSD" and waved past it. On a real build, where each piece of data lives is the difference between snappy and unusable, and getting it wrong is one of the most common performance complaints in self-hosted systems.
Next up: Part 2, "Storage That Survives: NVMe/SSD/HDD Tiers & RAID." The database is indexing at a crawl because it's on the wrong disk. We'll fix it with a tiered storage layout, talk through RAID levels and what they do (and the one thing RAID emphatically is not), and add a backup target, covering 1101 3.3 (storage) and 1101 5.3 (storage/RAID troubleshooting). See you there.
