complete i18n migration to /[locale]/ with EN+ES content

Full rewrite of the docs site under app/[locale]/ with next-intl
in localePrefix:"always" mode. Every page now exists at both
/en/<path> and /es/<path>; the root / shows a meta-refresh + JS
redirect to /<defaultLocale>/ so GitHub Pages serves something
on the apex URL.

Highlights:
- 107 doc pages migrated to file-per-page JSON namespaces under
  messages/en/ and messages/es/. Spanish content is fully
  translated (no copy-of-English placeholders).
- New documentation for the Active Suppressions section in the
  Settings tab and the per-event Dismiss dropdown in the Health
  Monitor modal.
- New screenshots: dismiss-duration-dropdown.png and an updated
  health-suppression-settings.png.
- Pagefind integrated for client-side search; index is built on
  every CI deploy (not committed).
- RSS feeds: per-locale at /<locale>/rss.xml plus root /rss.xml
  for backward compat.
- Removed the dead app/[locale]/guides/[slug]/ route — every
  guide now has its own static page and no markdown source
  remains.
- Fixed orphan link /guides/nvidia -> /guides/nvidia-manual in
  docs/hardware/nvidia-host.
- Removed obsolete components (footer2, calendar, drawer).

Verified locally with `npm ci && npm run build`: 2804 files in
out/, 231 pages indexed by pagefind, root redirect intact, both
locale roots and the new Active Suppressions docs render OK.
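
A minimal sketch of how the file count and the root redirect could be
re-checked after a build. The helper names are hypothetical, and the
check assumes the export lands in out/ and that the root page writes
its redirect as a meta tag with http-equiv="refresh".

```shell
#!/bin/sh
# Hypothetical helpers for re-checking a static export; names are illustrative.

# Print the number of regular files under a directory (default: out/).
count_files() {
  find "${1:-out}" -type f | wc -l | tr -d ' '
}

# Exit 0 if the given HTML file contains a meta-refresh tag.
# Assumes the attribute is written as http-equiv="refresh" (any case).
has_meta_refresh() {
  grep -qi 'http-equiv="refresh"' "$1"
}
```

Possible usage after `npm run build`: `count_files out` prints the
file total, and `has_meta_refresh out/index.html && echo "root redirect intact"`
confirms the apex page still redirects.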
MacRimi
2026-05-31 12:41:10 +02:00
parent 875910b4d7
commit 5ca3463bf6
649 changed files with 83958 additions and 11096 deletions


@@ -0,0 +1,120 @@
{
"meta": {
"title": "Add Coral TPU to LXC | ProxMenux Documentation",
"description": "Pass a Google Coral TPU (USB or M.2 / PCIe) into a Proxmox LXC container. ProxMenux writes the right dev / cgroup / mount entries for each variant, boots the container, and installs the Edge TPU runtime inside so apps like Frigate can use the accelerator."
},
"header": {
"title": "Add Coral TPU to LXC",
"description": "Share a Google Coral TPU (USB Accelerator or M.2 / Mini-PCIe) with a Proxmox LXC container. ProxMenux handles the LXC config and the inside-container Edge TPU runtime install. Coral is TPU-only: for GPU / iGPU sharing (Quick Sync, VA-API, NVENC) in the same container, run Add GPU to LXC separately.",
"section": "Hardware: GPUs and Coral-TPU"
},
"intro": {
"title": "What this does",
"body": "Writes the passthrough config into <code>/etc/pve/lxc/&lt;ctid&gt;.conf</code> — different entries depending on whether the host has a <strong>USB Accelerator</strong>, a <strong>M.2 / PCIe Coral</strong>, or both. Then it starts the container and installs Google's latest <code>libedgetpu</code> runtime inside it. iGPU / GPU passthrough is handled by a separate script (<lxcGpuLink>Add GPU to LXC</lxcGpuLink>) — this one focuses on the TPU."
},
"whenUse": {
"heading": "When to use this",
"body": "Typical use case: running <frigateLink>Frigate</frigateLink>, Agent DVR, Blue Iris + CodeProject.AI, or any other object-detection app inside an LXC and wanting the Coral TPU to do the ML inference instead of the CPU. With Coral, inference latency drops from ~100 ms to ~5 ms per frame and CPU load stays near zero."
},
"prereqs": {
"title": "Before you start",
"drivers": "<strong>Coral drivers already installed on the host</strong>. This script does not install them; it only configures passthrough to the container. Run <hostLink>Install Coral TPU on the Host</hostLink> first if you haven't.",
"driversCheck": "ls /dev/apex_* 2>/dev/null ; lsusb | grep -E '1a6e:089a|18d1:9302'",
"container": "<strong>An existing LXC container</strong>, ideally running a <strong>Debian / Ubuntu</strong>-based distro. The inside-container install uses <code>apt-get</code>; Alpine / Arch containers are not currently supported by this script.",
"downtime": "<strong>Be OK with a brief downtime</strong> of the container. The script stops it to apply config changes, then starts it back up to install drivers inside. No host reboot needed."
},
"hostPrep": {
"title": "Host must be prepared first",
"body": "If you run this script before installing Coral drivers on the host (the <code>gasket</code>/<code>apex</code> kernel module for M.2, the <code>libedgetpu</code> runtime for USB), the LXC config is still written but the container won't find the device at runtime. Order matters: <strong>host install → LXC passthrough → in-container app</strong>."
},
"running": {
"heading": "Running the script",
"body": "Open ProxMenux on the host, go to <strong>Hardware: GPUs and Coral-TPU → Add Coral TPU to LXC</strong>.",
"imageAlt": "Menu entry for 'Add Coral TPU to LXC' inside Hardware: GPUs and Coral-TPU"
},
"howRuns": {
"heading": "How the script runs",
"body": "One decision upfront (which container?), then the script handles USB and PCIe paths independently based on what it finds on the host. If both are present, both get passed."
},
"walkthrough": {
"heading": "Walking through the flow",
"pick": {
"title": "Pick the LXC container",
"body": "A dialog shows every LXC on the host (from <code>pct list</code>). Pick the one that should get the Coral. Running or stopped, it doesn't matter — the script handles both (stops it briefly to write config, then starts it back up to install drivers)."
},
"gpuHint": {
"title": "GPU passthrough suggestion (optional)",
"body": "If the host has a GPU (Intel iGPU, AMD, NVIDIA) and the chosen container does NOT have GPU passthrough configured, the script shows a one-time dialog suggesting you run <lxcGpuLink>Add GPU to LXC</lxcGpuLink> first. Coral is most often paired with hardware video decode (Quick Sync, VA-API, NVENC) for apps like Frigate. You can say yes (exits, run the GPU script, then come back) or no (continues with TPU-only setup)."
},
"usb": {
"title": "Write LXC config — USB path",
"body": "If a USB Accelerator is present, the script does two things: (1) writes a udev rule on the host so the device gets a stable name <code>/dev/coral</code> whatever USB port it's in, and (2) bind-mounts the <strong>whole</strong> <code>/dev/bus/usb</code> tree into the container.",
"whyTitle": "Why mount /dev/bus/usb instead of /dev/coral?",
"whyBody": "The USB device node path (e.g. <code>/dev/bus/usb/001/005</code>) changes when you replug the accelerator into a different port. Earlier versions of the script bind-mounted the <code>/dev/coral</code> symlink, which pointed at an old path and broke. Mounting the whole USB tree means the container sees whatever the current path is, so a replug just works."
},
"pcie": {
"title": "Write LXC config — M.2 / PCIe path",
"body1": "If a M.2 / PCIe Coral is present on the host and <code>/dev/apex_0</code> exists (the apex kernel module is loaded), the script uses the modern Proxmox <code>dev</code> API which handles cgroup2 permissions automatically for both privileged and unprivileged containers:",
"body2": "If the host hasn't booted yet with the apex module loaded (you just ran <hostLink>Install Coral on Host</hostLink> and haven't rebooted), <code>/dev/apex_0</code> doesn't exist yet. The script falls back to classic cgroup2 + bind mount with <code>create=file</code> so the entries are valid even when the device hasn't materialised:",
"rebootTitle": "Reboot the host first if you just installed Coral drivers",
"rebootBody": "The fallback cgroup2 + mount will be written, but the device only actually exists after a host reboot loads the <code>apex</code> module. If you haven't rebooted, reboot now before starting the container."
},
"drivers": {
"title": "Start the container + install Coral runtime inside",
"body": "Config changes in Proxmox LXC take effect on the next start — so the script starts the container, waits up to 15 seconds for <code>pct exec</code> to respond, then drops a bash script inside that:",
"items": [
"Runs <code>apt-get update</code>.",
"Installs the Coral repository prerequisites: <code>gnupg</code>, <code>curl</code>, <code>ca-certificates</code>.",
"Imports Google's Coral GPG key to <code>/etc/apt/keyrings/coral-edgetpu.gpg</code> (modern path, same as the host installer uses) and adds the <code>coral-edgetpu-stable</code> APT repository with <code>signed-by=</code>.",
"Installs the latest <code>libedgetpu1-std</code> (default). If you have a M.2 Coral, you'll be prompted to pick between <code>libedgetpu1-std</code> (standard) and <code>libedgetpu1-max</code> (max performance, runs hotter)."
],
"noIgpuTitle": "Why no iGPU drivers here?",
"noIgpuBody": "Earlier versions of this script also installed Intel <code>va-driver-all</code>, <code>intel-opencl-icd</code> and friends so the same container could do Quick Sync video decode alongside Coral inference. That doubled-up responsibility caused confusing failures when the user only wanted Coral. The iGPU side is now the exclusive job of <lxcGpuLink>Add GPU to LXC</lxcGpuLink> — run it first if you also want hardware video decode in the container.",
"debianTitle": "Debian / Ubuntu containers only",
"debianBody": "The in-container install uses <code>apt-get</code> directly. Alpine, Arch or RHEL-based containers are not currently supported — the install step will fail and leave the LXC with the passthrough config but no drivers inside. For those distros, install the Coral runtime manually following Google's <coralLink>official guide</coralLink> after the LXC config step."
},
"summary": {
"title": "Summary",
"body": "The script prints a checklist at the end summarising what was enabled (Coral USB, Coral M.2) with ✓ or ⚠ marks depending on whether the hardware was actually detected. The container stays running — you can jump straight into Frigate / CodeProject.AI / your app config."
}
},
"manual": {
"heading": "Manual equivalent",
"body": "If you want to see exactly what goes into the LXC config, or apply it by hand:",
"usbHeading": "USB Coral",
"pcieHeading": "M.2 / PCIe Coral",
"runtimeHeading": "Inside the container — Coral runtime"
},
"verification": {
"heading": "Verification",
"body": "Enter the container and check the Coral is visible:"
},
"troubleshoot": {
"heading": "Troubleshooting",
"apexTitle": "Container started but /dev/apex_0 missing inside",
"apexBody": "Host apex module isn't loaded. On the host: <code>lsmod | grep apex</code> — if empty, run <code>modprobe apex</code>, or reboot if you just installed Coral drivers. Once the host has <code>/dev/apex_0</code>, restart the container: <code>pct stop &lt;ctid&gt; &amp;&amp; pct start &lt;ctid&gt;</code>.",
"replugTitle": "USB Coral disappears after replug in a different port",
"replugBody": "This is exactly why the script mounts <code>/dev/bus/usb</code> instead of the <code>/dev/coral</code> symlink. If you're hitting this, check your LXC config has <code>lxc.mount.entry: /dev/bus/usb dev/bus/usb ...</code> and not a reference to <code>/dev/coral</code> directly. Old configs from earlier script versions may need updating — re-run the script on the same container and the config gets refreshed.",
"alpineTitle": "In-container install fails on an Alpine container",
"alpineBody": "The script uses <code>apt-get</code>, which Alpine doesn't have. The LXC passthrough config is still valid — just install the Coral runtime manually with <code>apk add</code> following Google's guide for Alpine, or use a Debian-based container if you don't need the smaller footprint.",
"frigateTitle": "Frigate says 'Coral EdgeTPU detected but not available'",
"frigateBody": "Almost always a permissions issue inside the container. Frigate runs as root by default; check the root user is in the <code>plugdev</code> group inside the container (for USB), and that the process can read <code>/dev/apex_0</code> (for M.2). <code>ls -l /dev/apex_0</code> from inside the container should show group <code>apex</code> — if not, add the GID alignment to <code>/etc/group</code> or switch the container to privileged mode.",
"logsTitle": "Check both host and container logs",
"logsBody": "On the host: <code>journalctl -u pvedaemon | grep -i coral</code>. Inside the container: check the app logs (Frigate: <code>/config/logs/</code>, CodeProject.AI: its own log directory). The classic error pattern is \"Coral detected, runtime loaded, but inference engine can't claim it\" — that's permissions 9 out of 10 times."
},
"related": {
"heading": "Related",
"items": [
{
"label": "Install Coral TPU (Host)",
"href": "/docs/hardware/install-coral-tpu-host",
"tail": " — required prerequisite for M.2 / PCIe Coral cards before passing into a CT."
},
{
"label": "Add GPU to LXC",
"href": "/docs/hardware/igpu-acceleration-lxc",
"tail": " — same pattern for GPUs (often paired with a Coral TPU in Frigate setups)."
}
]
}
}


@@ -0,0 +1,178 @@
{
"meta": {
"title": "Add GPU to VM (Passthrough) | ProxMenux Documentation",
"description": "Pass an Intel, AMD or NVIDIA GPU through to a Proxmox VM with near-native performance. ProxMenux handles host preparation (VFIO modules, driver blacklist, kernel cmdline), VM configuration (hostpci, audio function, IOMMU group siblings), vendor-specific workarounds (NVIDIA Code 43, AMD reset bug, ROM dump) and switch-mode conflicts with LXCs."
},
"header": {
"title": "Add GPU to VM (Passthrough)",
"description": "Give one of your GPUs to a virtual machine with near-native performance. ProxMenux detects Intel / AMD / NVIDIA, validates IOMMU, analyses the GPU's IOMMU group to pass every sibling device together, configures VFIO on the host, writes the right hostpci lines into the VM config, and applies vendor-specific fixes where needed.",
"section": "Hardware: GPUs and Coral-TPU"
},
"intro": {
"title": "What this does",
"body": "Everything the <pveLink>official Proxmox PCI passthrough wiki</pveLink> walks you through manually — IOMMU enablement, VFIO modules, driver blacklisting, vendor ID discovery, <code>hostpci</code> setup, ROM dumps on AMD, KVM hiding on NVIDIA — done in one run with sanity checks at every step. The script is also aware of the <em>other</em> things on your host: if the same GPU is already assigned to an LXC or another VM, it offers to migrate cleanly instead of silently breaking the existing setup."
},
"who": {
"heading": "Who is this for?",
"body": "You have a physical GPU in your Proxmox host and you want a <strong>virtual machine</strong> (Windows gaming, macOS, a headless GPU compute node, a VM-based media server) to use it directly. Passing a GPU to a VM is not the same as passing it to an LXC — VMs need the kernel to treat the GPU as a VFIO device (essentially \"the host won't touch it\"), which means the host cannot use that GPU for anything else while the VM is running. For <em>LXC</em> transcoding / compute, use <lxcLink>Add GPU to LXC</lxcLink> instead — it shares the GPU and does not need VFIO."
},
"prereqs": {
"title": "Before you start",
"gpu": "<strong>A supported GPU</strong> physically installed. The script detects Intel, AMD and NVIDIA via <code>lspci</code>.",
"gpuCheck": "lspci | grep -iE 'VGA|3D|Display'",
"iommu": "<strong>IOMMU virtualization</strong> available in BIOS/UEFI (Intel VT-d or AMD-Vi). If it's off at the firmware level, no amount of Linux config fixes it — you have to enable it in the BIOS first. The script detects this and offers to enable it on the OS side.",
"q35": "The target VM uses <strong>q35</strong> machine type. Older <code>i440fx</code> does not reliably support PCIe passthrough and the script will refuse to proceed.",
"q35Check": "qm config '<'vmid'>' | grep machine",
"moreGpus": "Preferably <strong>more than one GPU</strong> in the host, or console access on another output (IPMI, KVM-over-IP, serial). Once you pass the only GPU to a VM, the host console goes dark. With two NVIDIA GPUs you can pass one to a VM and keep the other on the host — the script handles this per-BDF (see <em>NVIDIA</em> in the vendor notes below).",
"nvidiaInstalled": "If you're on a Proxmox that already installed the NVIDIA driver via <nvidiaLink>NVIDIA Drivers on the Host</nvidiaLink>: the GPU you pass to the VM gets unbound from the host <code>nvidia</code> driver and rebound to <code>vfio-pci</code>. The <code>nvidia</code> module stays loaded so any <strong>other</strong> NVIDIA GPU you have on the host keeps working with <code>nvidia-smi</code>."
},
"pickOne": {
"title": "VM passthrough vs LXC sharing — pick one per GPU",
"body": "A GPU bound to <code>vfio-pci</code> for VM passthrough cannot simultaneously be used by the host or an LXC. If you have two GPUs, you can dedicate one to each path. If you have only one, choose:",
"vmItem": "<strong>VM route (this page):</strong> full hardware access, but exclusively for the VM that owns the GPU while it's running.",
"lxcItem": "<strong>LXC route (<lxcLink>Add GPU to LXC</lxcLink>):</strong> shared with the host and other containers, great for transcoding, no VFIO magic needed."
},
"running": {
"heading": "Running the installer",
"body": "Open ProxMenux on the host, go to <strong>Hardware: GPUs and Coral-TPU → Add GPU to VM</strong>.",
"imageAlt": "Menu entry for 'Add GPU to VM' inside Hardware: GPUs and Coral-TPU"
},
"howRuns": {
"heading": "How the script runs",
"body": "The flow has three phases with clear separation between \"collecting information and decisions\" and \"actually applying changes\". Until the final confirmation, nothing on your host or VM has been touched."
},
"walkthrough": {
"heading": "Walking through the flow",
"detect": {
"title": "Detect GPUs and check IOMMU",
"body": "The script lists every GPU it finds. If IOMMU isn't already enabled in the running kernel cmdline, you'll get a yes/no prompt to append <code>intel_iommu=on</code> (or <code>amd_iommu=on</code>) + <code>iommu=pt</code> to the right boot file — <code>/etc/kernel/cmdline</code> on ZFS (systemd-boot) or <code>/etc/default/grub</code> on LVM/ext4. If you accept and the kernel cmdline changes, the script flags that the reboot prompt at the end will be required.",
"tipTitle": "Already ran post-install?",
"tipBody": "If you previously enabled <postLink>VFIO IOMMU support</postLink> from the post-install scripts, IOMMU is already on and this step silently passes. Good.",
"imageAlt": "List of detected GPUs with vendor and PCI address"
},
"preflight": {
"title": "Pick a GPU and run pre-flight checks",
"intro": "Once you pick the GPU, the script runs a series of checks that can each block further progress:",
"items": [
"<strong>Not in SR-IOV.</strong> If the device is a Virtual Function or a Physical Function with active VFs, passthrough would clash with SR-IOV usage. Blocked.",
"<strong>Single-GPU warning.</strong> If this is the only GPU in the host, you get a scary dialog reminding you that after reboot the console goes dark — make sure you have SSH or web-UI access from another machine.",
"<strong>AMD reset method.</strong> AMD GPUs have a long history of not resetting cleanly between VM stops and starts. The script checks <code>/sys/bus/pci/devices/&lt;pci&gt;/reset_method</code>: if the card is an APU without FLR it <em>blocks</em> (practically unusable); a dedicated AMD card without FLR is also blocked; anything with an unknown reset mode warns but lets you continue with explicit override.",
"<strong>Not in D3cold.</strong> Some AMD cards report <code>D3cold</code> power state while idle, which makes them invisible during VM startup. Blocked until you wake the GPU.",
"<strong>IOMMU group analysis.</strong> Reads <code>/sys/kernel/iommu_groups/</code> to find every non-bridge device in the GPU's group. <em>All of them</em> will be passed to the VM together — if your motherboard groups the GPU with a network card, the network card goes too."
],
"audioIntro": "<strong>Audio companion.</strong> Two paths, depending on where the audio lives:",
"audioDgpu": "<strong>Discrete GPU (NVIDIA / AMD):</strong> HDMI audio sits on the same card as function <code>.1</code> of the GPU's PCI slot. Auto-included. This audio device was never used by the host, so no one loses anything.",
"audioIgpu": "<strong>Intel iGPU (or any GPU without a <code>.1</code> sibling):</strong> the HDMI / analog audio lives on the chipset at a different slot (<code>00:1f.3</code> typically). The script scans the host, lists every PCI audio controller with its current driver (<code>snd_hda_intel</code>, etc.), and asks you which one(s) to pass through. Default is <strong>none</strong> — you explicitly opt in."
},
"pickVm": {
"title": "Pick the target VM",
"body": "You're shown the list of VMs on the host and pick one. The script checks the VM is q35 — BIOS/i440fx machine types are refused because PCIe passthrough on them is unreliable. If you have a q35 VM with the GPU already assigned (partially or fully), the existing entry is reused instead of being duplicated.",
"imageAlt": "VM list with name, ID and status shown as a picker"
},
"switchMode": {
"title": "Switch mode — handling the GPU already being elsewhere",
"intro": "The script scans every VM config and every LXC config on the host looking for the GPU you picked. Three possible outcomes:",
"items": [
"<strong>GPU is free.</strong> Nothing to do, continue.",
"<strong>GPU is in a different VM.</strong> You're offered to remove it from that other VM before assigning it here. If you decline, the script aborts — two VMs can't share an exclusive VFIO assignment.",
"<strong>GPU is in an LXC (shared mode).</strong> You're offered to remove the LXC passthrough configuration (<code>lxc.cgroup2.devices.allow</code> + <code>lxc.mount.entry</code> lines). The LXC won't see the GPU anymore, but the VM will — this is the \"switch mode\" mechanic that gives this menu entry its secondary label."
],
"imageAlt": "Dialog offering to remove the GPU from an LXC before assigning it to the VM",
"smartTitle": "Audio siblings are cleaned up smartly too",
"smartBody": "If the source VM had extra audio devices attached alongside the GPU, the script removes <strong>only the ones that are now orphan</strong> — i.e. audio entries whose display sibling is also being removed. Audio tied to a different GPU that stays in the VM is kept untouched. This matters when you detach an Intel iGPU (which shares chipset audio) from a VM that also has a discrete NVIDIA / AMD card still passed through: the dGPU's HDMI audio (<code>02:00.1</code>) stays, the chipset audio (<code>00:1f.3</code>) leaves."
},
"audioPick": {
"title": "(If no .1 sibling) Pick which audio controllers to include",
"body": "Only happens for Intel iGPU and similar split-audio setups. A checklist shows every PCI audio controller on the host (excluding any already in the GPU's IOMMU group), labelled with its current driver. Select the ones you want — or none, if the VM doesn't need audio from the host hardware.",
"imageAlt": "Checklist dialog with every host PCI audio controller (BDF + driver) when the GPU has no .1 sibling audio",
"warnTitle": "Don't tick audio the host relies on",
"warnBody": "If the host currently uses an audio controller for anything — for example, the Proxmox shell beeping, or a VM you've already passed it to — ticking it here means the host (and any other VM sharing it) loses access after reboot. When in doubt, leave this empty and you can always re-run the script later to add audio if needed."
},
"summary": {
"title": "Review the confirmation summary",
"body": "A final dialog shows exactly what's about to change on the host and in the VM. This is the last off ramp — if anything looks wrong (an extra device in the IOMMU group you didn't expect, the wrong GPU, the wrong VM), cancel here and nothing has been touched yet.",
"imageAlt": "Summary dialog listing host changes (VFIO/blacklist files) and VM config changes (hostpci lines) before applying"
},
"hostApply": {
"title": "Host changes are applied",
"intro": "Phase 2 runs non-interactively. Host-side the script can touch:",
"items": [
"<code>/etc/modules</code> — adds <code>vfio</code>, <code>vfio_iommu_type1</code>, <code>vfio_pci</code> (plus <code>vfio_virqfd</code> on kernels &lt; 6.2).",
"<code>/etc/modprobe.d/vfio.conf</code> — for AMD / Intel, sets <code>options vfio-pci ids=&lt;vendor:device,...&gt; disable_vga=1</code> so VFIO claims the GPU early at boot. For NVIDIA the file only adds <code>softdep nvidia pre: vfio-pci</code> (plus <code>_drm</code>/<code>_modeset</code>/<code>_uvm</code>) — actual binding is per-BDF via the udev rule below. On AMD, also adds <code>softdep</code> lines forcing <code>vfio-pci</code> to load before <code>radeon</code> / <code>amdgpu</code>.",
"<code>/etc/modprobe.d/iommu_unsafe_interrupts.conf</code> and <code>kvm.conf</code> — sensible workarounds that most Windows / macOS VMs need (<code>allow_unsafe_interrupts=1</code>, <code>ignore_msrs=1</code>).",
"<code>/etc/modprobe.d/blacklist.conf</code> — blacklists the open-source companion drivers (<code>nouveau</code>, <code>amdgpu</code>, <code>radeon</code>, <code>i915</code>) that would otherwise grab the GPU before VFIO. The proprietary <code>nvidia</code> module is <strong>never blacklisted</strong> — it stays available for any OTHER NVIDIA GPU you keep on the host.",
"<code>/etc/udev/rules.d/10-proxmenux-vfio-bind.rules</code> + <code>/etc/proxmenux/vfio-bind.bdfs</code> — <strong>NVIDIA only</strong>. Per-BDF binding state. The udev rule applies <code>ATTR'{'driver_override'}'=\"vfio-pci\"</code> at the PCI ADD event for each tracked Bus:Device.Function, so only the GPU(s) you've explicitly passed go to VFIO. This is what makes multi-GPU NVIDIA work — your other NVIDIA cards keep their <code>nvidia</code> driver and stay usable on the host.",
"<strong>AMD only.</strong> Dumps the GPU ROM from sysfs (<code>/sys/bus/pci/.../rom</code>) or the ACPI VFCT table to <code>/usr/share/kvm/vbios_&lt;card&gt;.bin</code>. The VM references it via <code>romfile=</code> so cards that misreport their own VBIOS still initialise correctly.",
"<strong>NVIDIA only.</strong> Stops and disables host NVIDIA services that could probe / lock the GPU at boot (<code>nvidia-persistenced</code>, <code>nvidia-powerd</code>, <code>nvidia-fabricmanager</code>). The <code>nvidia</code> module itself is left loaded so other NVIDIA GPUs on the host keep working with <code>nvidia-smi</code>.",
"<code>update-initramfs -u -k all</code> — only runs if any of the above actually changed."
]
},
"vmApply": {
"title": "VM config is applied via qm set",
"body": "The VM config at <code>/etc/pve/qemu-server/&lt;vmid&gt;.conf</code> is updated via <code>qm set</code> (never by direct <code>sed</code>):",
"after1": "A <code>x-vga=1</code> flag is added for every vendor <strong>except</strong> Intel iGPU — Intel integrated GPUs don't have dedicated VRAM for a pre-boot console, so that flag causes hangs.",
"after2": "Additional <code>hostpciN</code> lines are appended if the GPU's IOMMU group contains other devices you need to pass together."
},
"reboot": {
"title": "Reboot if host config was touched",
"body": "If Phase 2 changed anything at the kernel-module / cmdline / blacklist level, you'll be prompted to reboot. Reboot is mandatory before starting the VM — otherwise the GPU is still held by the host driver and VFIO can't claim it.",
"imageAlt": "Summary screen showing what was changed, followed by a reboot prompt"
}
},
"vendors": {
"heading": "Vendor-specific notes",
"nvidiaHeading": "NVIDIA",
"nvidiaBody": "NVIDIA consumer drivers detect that they're running in a VM and refuse to initialise with the infamous <em>\"Code 43\"</em> error. ProxMenux's workaround: hide KVM from the guest (<code>hidden=1</code>), set a spoofed hypervisor vendor ID <code>NV43FIX</code> in the <code>args</code> line, and pass <code>kvm=off</code>. This has worked reliably on GeForce drivers for years. On datacenter / Tesla / Quadro cards this isn't needed — those drivers are licensed for virtualisation.",
"nvidiaMultiHeading": "Multi-GPU NVIDIA support",
"nvidiaMultiBody": "Hosts with two or more NVIDIA GPUs are first-class. You can pass one card to a VM and keep the other(s) on the host for <code>nvidia-smi</code>, LXC GPU sharing, or any host-side workload. ProxMenux binds VFIO <strong>per-BDF</strong> (Bus:Device.Function) via a udev rule rather than globally blacklisting the <code>nvidia</code> module — so each card's destination is independent of the others, even when both GPUs are the same model and share the same <code>vendor:device</code> ID. The host nvidia driver stays loaded; only the specific BDFs you select get redirected to <code>vfio-pci</code>.",
"amdHeading": "AMD",
"amdBody": "The \"AMD reset bug\" means some cards crash when the VM stops and can't re-initialise without a host reboot. ProxMenux pre-screens for this by reading the PCI reset method, but cannot fix it after the fact. If you hit it, the community fix is the <vendorResetLink>vendor-reset</vendorResetLink> kernel module. The script doesn't install it automatically — the module is a DKMS build you add yourself if you see reset failures. Also on Windows guests, the <em>RadeonResetBugFix</em> service is the common userspace workaround.",
"intelHeading": "Intel iGPU",
"intelBody": "Intel iGPU passthrough is flaky but possible on UHD 630+ generations with <sriovLink>i915-sriov-dkms</sriovLink> for SR-IOV. For a single \"give the iGPU to one VM\" case, the script binds it exactly like a dedicated GPU, but skips <code>x-vga=1</code> (iGPUs don't carry a pre-boot VBIOS). You'll lose host console output — plan accordingly."
},
"verification": {
"heading": "Verification"
},
"troubleshoot": {
"heading": "Troubleshooting",
"code43Title": "Code 43 in Windows (NVIDIA)",
"code43Body": "The KVM hiding args didn't apply. Check <code>qm config &lt;vmid&gt; | grep -E \"cpu|args\"</code> — you should see <code>hidden=1</code> and <code>hv_vendor_id=NV43FIX</code>. If missing, re-run the script and re-select the same VM.",
"amdResetTitle": "AMD GPU works once, fails on VM restart",
"amdResetBody": "The AMD reset bug. Solutions (in order): (1) reboot the host — GPU will be usable again for one more VM cycle; (2) install the <code>vendor-reset</code> DKMS module and add <code>softdep amdgpu pre: vendor-reset</code>; (3) inside Windows, install the <em>RadeonResetBugFix</em> service.",
"stuckBootTitle": "VM stuck booting / GPU not detected",
"stuckBootBody": "Confirm VFIO actually holds the GPU on boot: <code>lspci -nnk -d vendor:device</code> must show <code>Kernel driver in use: vfio-pci</code>. If it still shows the vendor driver, the blacklist didn't take effect — check <code>/etc/modprobe.d/blacklist.conf</code> and <code>dmesg | grep vfio</code>, and regenerate initramfs: <code>update-initramfs -u -k all</code> then reboot.",
"darkTitle": "Host console goes dark after reboot and I can't SSH in",
"darkBody": "You passed the primary GPU through before having alternate access. Boot into a recovery shell (rescue ISO, IPMI), remove the lines from the VM config (<code>/etc/pve/qemu-server/&lt;vmid&gt;.conf</code>), and remove the vfio options:",
"logTitle": "Check the install log",
"logBody": "Every run writes to <code>/tmp/add_gpu_vm.log</code>. Attach it when asking for help on GitHub."
},
"revert": {
"heading": "Reverting manually",
"intro": "There isn't a dedicated \"remove GPU from VM\" shortcut in ProxMenux today. To detach cleanly:"
},
"related": {
"heading": "Related",
"items": [
{
"label": "Install NVIDIA Drivers (Host)",
"href": "/docs/hardware/nvidia-host",
"tail": " — install drivers on the host first if you also want the GPU usable from there."
},
{
"label": "Add GPU to LXC",
"href": "/docs/hardware/igpu-acceleration-lxc",
"tail": " — alternative model: share the GPU with multiple containers instead of dedicating it to a VM."
},
{
"label": "Switch GPU Mode (VM ↔ LXC)",
"href": "/docs/hardware/switch-gpu-mode",
"tail": " — flip the same GPU between modes without re-doing all the wiring."
},
{
"label": "GPU Passthrough commands",
"href": "/docs/help-info/gpu-commands",
"tail": " — lspci, IOMMU verification, qm set hostpci reference."
}
]
}
}


@@ -0,0 +1,185 @@
{
"meta": {
"title": "Add GPU to LXC | ProxMenux Documentation",
"description": "Share an Intel, AMD or NVIDIA GPU with an LXC container for hardware-accelerated transcoding (Plex / Jellyfin / Frigate), OpenCL / CUDA workloads, and Mesa video acceleration. ProxMenux handles device nodes, GID alignment, and distro-specific driver install inside the container."
},
"header": {
"title": "Add GPU to LXC",
"description": "Share one or more GPUs with a Proxmox LXC container. The host keeps using the GPU normally — the container just gets access through device nodes. Works with Intel iGPUs (Quick Sync / VA-API), AMD cards (Mesa / ROCm), and NVIDIA (CUDA / NVENC).",
"section": "Hardware: GPUs and Coral-TPU"
},
"intro": {
"title": "What this does",
"body": "Adds <code>dev&lt;N&gt;</code> entries to the LXC config (<code>/etc/pve/lxc/&lt;ctid&gt;.conf</code>) so the container sees <code>/dev/dri/*</code>, <code>/dev/kfd</code> or <code>/dev/nvidia*</code> — whichever applies to your GPU. Then it boots the container, detects the distro inside, and installs the matching userspace drivers (Mesa, intel-media-driver, NVIDIA runtime…) so apps like Plex, Jellyfin or Frigate actually use the GPU for transcoding. GIDs (<code>video</code>, <code>render</code>) are aligned between host and container so permissions match."
},
"compare": {
"heading": "LXC sharing vs VM passthrough",
"intro": "LXC containers share the host kernel, so they can <em>share</em> the host's GPU without taking it over. That's a big difference from VMs: with <vmLink>VM passthrough</vmLink> the GPU is exclusive to one VM and the host can't use it. With LXC, multiple containers plus the host can all hit the same GPU at once.",
"headerFeature": "Feature",
"headerLxc": "LXC (this page)",
"headerVm": "VM",
"rows": [
{
"feature": "Host keeps using the GPU?",
"lxc": "Yes",
"vm": "No — exclusive to the VM"
},
{
"feature": "Multiple containers sharing one GPU?",
"lxc": "Yes",
"vm": "No"
},
{
"feature": "Requires IOMMU / VFIO on the host?",
"lxc": "No",
"vm": "Yes"
},
{
"feature": "Reboot required?",
"lxc": "Usually no (just restart the CT)",
"vm": "Yes, always"
},
{
"feature": "Supports running any OS?",
"lxc": "Only Linux (LXC is Linux-only)",
"vm": "Windows, macOS, any Linux"
}
]
},
"prereqs": {
"title": "Before you start",
"gpu": "<strong>A GPU on the host</strong> — Intel iGPU, AMD dGPU or APU, or an NVIDIA card. The script auto-detects all three via <code>lspci</code>.",
"gpuCheck": "lspci | grep -iE 'VGA|3D|Display'",
"vfio": "<strong>The GPU is NOT bound to vfio-pci.</strong> If the GPU is currently assigned to a VM via passthrough, it's invisible to the host kernel driver and the LXC can't use it. The script detects this and offers to run <switchLink>Switch GPU Mode</switchLink> for you.",
"nvidia": "<strong>For NVIDIA only:</strong> the NVIDIA host driver must already be installed — ProxMenux needs to match the container's userspace libs to the host version. If you haven't done it yet, run <nvidiaLink>Install NVIDIA Drivers on the Host</nvidiaLink> first.",
"nvidiaCheck": "nvidia-smi",
"container": "<strong>An existing LXC container.</strong> The script operates on a container you already created — it doesn't create one. The container should ideally be <strong>privileged</strong> (unprivileged works but needs UID/GID mapping which the script does not configure)."
},
"unpriv": {
"title": "Works on both privileged and unprivileged containers",
"body": "The script writes <code>dev&lt;N&gt;</code> entries to the LXC config and, on unprivileged containers, aligns the <code>video</code> and <code>render</code> GIDs between host and container so the GPU device nodes are reachable from inside without you having to hand-edit <code>lxc.idmap</code>."
},
"running": {
"heading": "Running the installer",
"body": "Open ProxMenux on the host, go to <strong>Hardware: GPUs and Coral-TPU → Add GPU to LXC</strong>.",
"imageAlt": "Menu entry for 'Add GPU to LXC' inside Hardware: GPUs and Coral-TPU"
},
"howRuns": {
"heading": "How the script runs",
"body": "Two phases: all the decisions upfront, then all the changes in one go. Nothing on your container is touched until you confirm."
},
"walkthrough": {
"heading": "Walking through the flow",
"detect": {
"title": "Detect host GPUs",
"body": "The script scans <code>lspci</code> for VGA / 3D / Display controllers matching Intel, AMD or NVIDIA. For NVIDIA it also verifies the <code>nvidia</code> kernel module is loaded and <code>nvidia-smi</code> works — the host driver version it reports will be used to pick the right <code>.run</code> installer for the container.",
"tipTitle": "NVIDIA not ready?",
"tipBody": "If NVIDIA is detected but the module isn't loaded, the script won't offer the NVIDIA path. Run <nvidiaLink>Install NVIDIA Drivers on the Host</nvidiaLink> first (and reboot), then come back."
},
"pickCt": {
"title": "Pick an LXC container",
"body": "You'll see a list of every LXC on the host with its ID and name. Pick the one that should get the GPU. The container can be running or stopped — the script handles both (stops it briefly during config, restarts it, and leaves it in its original state at the end).",
"imageAlt": "Dialog listing existing LXC containers to choose from"
},
"selectGpu": {
"title": "Select the GPU(s) to add",
"body": "If more than one GPU is present, you get a checklist. You can add multiple to the same container (e.g. an Intel iGPU for Quick Sync + an AMD dGPU for ROCm). If only one GPU is detected, it's auto-selected.",
"imageAlt": "Checklist showing detected GPUs (Intel / AMD / NVIDIA) with vendor and PCI address"
},
"preflight": {
"title": "Pre-flight checks",
"imageAlt": "Dialog offering to run Switch GPU Mode when the selected GPU is still bound to vfio-pci for VM passthrough",
"intro": "Three checks, any of which can block or redirect you:",
"items": [
"<strong>SR-IOV.</strong> If the selected GPU is a Virtual Function (VF) or a Physical Function with active VFs, LXC passthrough doesn't apply — the device is managed by the SR-IOV driver. Blocked.",
"<strong>Bound to vfio-pci.</strong> If the GPU is currently held by VFIO for VM passthrough, the host kernel can't create <code>/dev/dri/*</code> or <code>/dev/nvidia*</code> nodes for it. The script offers to run <switchLink>Switch GPU Mode</switchLink> which undoes the VFIO binding; you'll likely need a reboot before re-running Add GPU to LXC.",
"<strong>Already configured.</strong> If the container already has every dev node for the selected GPU, the script says so and exits cleanly. If it's partially configured, it continues with only the missing pieces."
]
},
"applyConfig": {
"title": "Apply the LXC config changes",
"body1": "The script stops the container, edits <code>/etc/pve/lxc/&lt;ctid&gt;.conf</code>, and adds <code>dev&lt;N&gt;</code> entries with the right GIDs for the selected GPUs. Using <code>dev:</code> entries (over the older <code>lxc.mount.entry</code> lines) is the modern Proxmox way — group permissions are set at config parse time instead of at mount time.",
"body2": "Example after Intel + NVIDIA on the same container:"
},
"installDrivers": {
"title": "Start the container and install drivers inside",
"body": "Once the config is written, the script starts the container, waits up to ~30 seconds for <code>pct exec</code> to respond, and then detects the container's distro from <code>/etc/os-release</code>. Based on that, it installs the right userspace packages.",
"headerDistro": "Distro",
"headerInt": "Intel / AMD",
"headerNvidia": "NVIDIA",
"rows": [
{
"distro": "Alpine",
"intel": "apk add mesa-va-gallium intel-media-driver libva-utils",
"nvidia": "apk add nvidia-utils"
},
{
"distro": "Arch / Manjaro",
"intel": "pacman -Sy intel-media-driver mesa libva-utils",
"nvidia": "pacman -Sy nvidia-utils"
}
],
"debianDistro": "Debian / Ubuntu / others",
"debianIntel": "apt-get install va-driver-all intel-opencl-icd vainfo",
"debianNvidia": "extract host <code>.run</code> → <code>pct push</code> → run with <code>--no-kernel-modules --no-dkms</code>",
"whyTitle": "Why the NVIDIA .run dance on Debian",
"whyBody": "Debian / Ubuntu don't ship NVIDIA packages with a version granular enough to match the host driver byte-for-byte. The userspace libs inside the container <strong>must match the kernel module version</strong> loaded on the host, or <code>nvidia-smi</code> fails with a version mismatch. ProxMenux solves this by using the exact same <code>.run</code> installer that was used for the host — extracted, tarred, pushed into the container with <code>pct push</code>, and run with <code>--no-kernel-modules --no-dkms</code> so only the userspace is touched."
},
"alignGids": {
"title": "Align GIDs and restore state",
"body1": "Device files on the host are owned by group <code>video</code> (GID 44) or <code>render</code> (GID 104). The container's distro may ship different GID numbers for those groups, which would make the GPU nodes unreachable from inside. The script rewrites <code>/etc/group</code> in the container so <code>video:44</code> and <code>render:104</code> match exactly.",
"body2": "Finally, it restores the container to its original state — if it was stopped when you started, it gets stopped again. If it was running, it stays running."
}
},
"vendors": {
"heading": "Vendor-specific notes",
"intelHeading": "Intel iGPU",
"intelBody": "Most common path — great for Plex / Jellyfin / Frigate hardware transcoding via <em>Quick Sync</em>. The container gets <code>/dev/dri/card0</code> (legacy) and <code>/dev/dri/renderD128</code> (modern render-only node — what apps actually use). No host-side changes needed; the <code>i915</code> driver on the host already created the nodes.",
"amdHeading": "AMD",
"amdBody": "Same DRI nodes as Intel for graphics / VA-API. If <code>/dev/kfd</code> exists on the host (AMD compute / ROCm kernel support), the script also adds it so containers can do OpenCL / ROCm workloads. Mesa VA drivers cover the video decode side.",
"nvidiaHeading": "NVIDIA",
"nvidiaBody": "Adds every <code>/dev/nvidia*</code> node the host exposes. The critical piece is <strong>driver-version matching</strong>: host module version and container userspace lib version must be identical, otherwise <code>nvidia-smi</code> inside the container fails. ProxMenux captures the host version at detection time and uses the same <code>.run</code> file to install the container userspace. For Debian containers the install bumps container memory to 2 GB temporarily (installer needs ~1.5 GB free to extract) and restores it afterwards.",
"updateTitle": "After you update the host NVIDIA driver, re-run this script",
"updateBody": "When you upgrade the NVIDIA driver on the host, the container's userspace libs stay on the old version and <code>nvidia-smi</code> inside the container breaks. ProxMenux's <nvidiaLink>NVIDIA host installer</nvidiaLink> detects containers with NVIDIA passthrough and offers to update them automatically — but if you skipped that prompt, just run Add GPU to LXC again on the same container and it'll refresh the userspace."
},
"verification": {
"heading": "Verification",
"body": "After the script finishes, log into the container and check the GPU is visible:"
},
"troubleshoot": {
"heading": "Troubleshooting",
"mismatchTitle": "nvidia-smi: Failed to initialize NVML: Driver/library version mismatch",
"mismatchBody": "Container userspace version ≠ host module version. Run Add GPU to LXC again on that container — the script extracts the current host <code>.run</code> and re-installs userspace matching.",
"denyTitle": "Permission denied on /dev/dri/renderD128 inside the container",
"denyBody": "Usually one of: (1) container is unprivileged without UID/GID mapping to host <code>render</code> group; (2) the user inside the container isn't in the <code>render</code> group. Fix: add the user to <code>render</code> inside the container (<code>usermod -aG render &lt;user&gt;</code>), or switch to privileged mode if the workload is trusted.",
"vainfoTitle": "vainfo says: VA-API version 1.xx; failed to initialize",
"vainfoBody": "The VA-API runtime is there but no suitable driver was installed. On Intel, install <code>intel-media-driver</code> (newer gens) or <code>i965-va-driver</code> (older gens). On AMD, <code>mesa-va-drivers</code>. Re-run the script if in doubt.",
"logTitle": "Install log",
"logBody": "Every run writes to <code>/tmp/add_gpu_lxc.log</code> on the host. Include it when asking for help on GitHub."
},
"related": {
"heading": "Related",
"items": [
{
"label": "Install NVIDIA Drivers (Host)",
"href": "/docs/hardware/nvidia-host",
"tail": " — required prerequisite for NVIDIA GPUs before passing them to a container."
},
{
"label": "Add GPU to VM (Passthrough)",
"href": "/docs/hardware/gpu-vm-passthrough",
"tail": " — alternative model when you need the GPU dedicated to a single VM."
},
{
"label": "Switch GPU Mode (VM ↔ LXC)",
"href": "/docs/hardware/switch-gpu-mode",
"tail": " — toggle the same GPU between LXC sharing and VM passthrough."
},
{
"label": "GPU Passthrough commands",
"href": "/docs/help-info/gpu-commands",
"tail": " — quick reference for related shell commands."
}
]
}
}


@@ -0,0 +1,191 @@
{
"meta": {
"title": "Install Coral TPU on the Host | ProxMenux Documentation",
"description": "Install Google Coral TPU drivers on a Proxmox VE host. ProxMenux auto-detects whether you have M.2 / Mini-PCIe or the USB Accelerator (or both) and runs only the install path that applies — gasket/apex kernel modules via DKMS for PCIe, libedgetpu1-std runtime for USB."
},
"header": {
"title": "Install Coral TPU on the Host",
"description": "Prepare the Proxmox host so a Coral TPU can later be passed to an LXC container. ProxMenux auto-detects whether you have a M.2 / Mini-PCIe card, a USB Accelerator, or both, and runs only the install path that applies — kernel-module build via DKMS for PCIe, Google's Edge TPU runtime for USB.",
"section": "Hardware: GPUs and Coral-TPU"
},
"intro": {
"title": "What this does",
"body": "The Coral installer is <strong>unified and hardware-aware</strong>: you run one menu option and it figures out what you actually have. M.2 / Mini-PCIe cards need kernel modules (<code>gasket</code> + <code>apex</code>) built with DKMS so they survive kernel upgrades. USB Accelerators just need the Google Edge TPU runtime (<code>libedgetpu1-std</code>) from Google's own APT repository. Both types plugged in? Both paths run in sequence."
},
"which": {
"heading": "Which Coral do I have?",
"body": "The two kinds of Coral devices you can plug into a Proxmox host look very different and the host-side install is very different too. The script handles both, but it's good to know what you have before running anything:",
"headerForm": "Form factor",
"headerDetect": "Detection",
"headerInstall": "Install type",
"headerReboot": "Reboot?",
"pcieForm": "M.2 / Mini-PCIe",
"pcieFormSub": "(dual-edge TPU on an M.2 card)",
"pcieDetect": "PCI vendor <code>1ac1</code>",
"pcieInstall": "Kernel module (gasket + apex) via DKMS",
"pcieReboot": "Yes",
"usbForm": "USB Accelerator",
"usbFormSub": "(small plastic dongle)",
"usbDetect": "USB <code>1a6e:089a</code> (unprogrammed) / <code>18d1:9302</code> (programmed)",
"usbInstall": "User-space runtime (<code>libedgetpu1-std</code>) from Google APT repo",
"usbReboot": "No"
},
"prereqs": {
"title": "Before you start",
"coral": "<strong>A Coral TPU plugged into the host</strong> (M.2, Mini-PCIe or USB). No Coral means no install — the script just tells you nothing was detected and exits.",
"coralCheck": "lspci -d 1ac1: ; lsusb | grep -E '1a6e:089a|18d1:9302'",
"internet": "<strong>Internet access</strong> on the host — for PCIe the DKMS path clones a GitHub repo, for USB it adds a Google APT repo and downloads the <code>libedgetpu1-std</code> package.",
"headers": "<strong>Kernel headers available via APT</strong>. On Proxmox this means <code>proxmox-headers-$(uname -r)</code> must be installable — needed only for the PCIe/M.2 path, but the script fetches it automatically so you don't have to.",
"reboot": "<strong>Be OK with a reboot</strong> if you have PCIe hardware. The kernel module is built and loaded at the end, but a fresh boot is the clean way to confirm it comes up on its own."
},
"hostPrepTip": {
"title": "This only prepares the host",
"body": "Installing Coral here makes the host <em>ready</em> — the TPU is visible to the host kernel and can be handed to an LXC. To actually <em>use</em> it from a container (Frigate, DeepStack, etc.), the next step is <lxcLink>Add Coral TPU to an LXC</lxcLink>."
},
"running": {
"heading": "Running the installer",
"body": "Open ProxMenux on the host, go to <strong>Hardware: GPUs and Coral-TPU → Install/Update Coral TPU on Host</strong>.",
"imageAlt": "Menu entry for 'Install/Update Coral TPU on Host' inside Hardware: GPUs and Coral-TPU"
},
"howRuns": {
"heading": "How the script runs",
"body": "Hardware is detected first, so the install plan is shaped to your actual setup. Nothing is installed until you confirm."
},
"walkthrough": {
"heading": "Walking through the flow",
"detect": {
"title": "Hardware detection",
"body": "The script reads <code>/sys/bus/pci/devices/*/vendor</code> looking for <code>0x1ac1</code> (Global Unichip Corp., the silicon vendor of the Coral TPU chips) and runs <code>lsusb</code> looking for the USB Accelerator IDs. The USB device has <strong>two</strong> IDs depending on whether its firmware has been loaded yet:",
"items": [
"<code>1a6e:089a</code> — Global Unichip, <em>unprogrammed</em> state (before the Edge TPU runtime talks to it the first time).",
"<code>18d1:9302</code> — Google, <em>programmed</em> state (after the runtime loads firmware onto it)."
],
"outro": "If neither is found, you get an informative dialog and the script exits cleanly — no partial changes on the host."
},
"prompt": {
"title": "Pre-install prompt",
"body": "Before touching anything, a single dialog summarises what was detected and what will be installed. You can cancel here without any side effects.",
"imageAlt": "Pre-install dialog showing detected Coral hardware (M.2/PCIe count + USB count) and the list of steps the installer will run"
},
"pcie": {
"title": "PCIe path — gasket + apex kernel modules via DKMS",
"body": "Only runs if the script found a PCIe / M.2 Coral. It's the heavier of the two paths because it's compiling a kernel module for your exact running kernel.",
"items": [
"<strong>Cleanup.</strong> If a previous <code>gasket-dkms</code> install left dpkg in a broken state (typical after a PVE 9 kernel upgrade where DKMS autoinstall failed silently), force-purge it.",
"<strong>Install build deps:</strong> <code>git</code>, <code>dkms</code>, <code>build-essential</code>, <code>proxmox-headers-$(uname -r)</code>."
],
"cloneIntro": "<strong>Clone the driver source.</strong> Prefers the <feranickLink>feranick/gasket-driver</feranickLink> community fork (actively maintained, kernel 6.12+ ready). Falls back to <googleLink>google/gasket-driver</googleLink> if feranick is unreachable, applying kernel-specific patches:",
"kernelPatches": [
"Kernel 6.5+ : <code>no_llseek</code> removed upstream → <code>noop_llseek</code> substitution",
"Kernel 6.13+ : <code>MODULE_IMPORT_NS(DMA_BUF)</code> requires string literal"
],
"afterItems": [
"<strong>Stage sources under <code>/usr/src/gasket-1.0/</code>,</strong> generate <code>dkms.conf</code>, register with <code>dkms add</code>.",
"<strong>Build + install:</strong> <code>dkms build</code> then <code>dkms install</code>. If anything fails, the last 50 lines of <code>make.log</code> print to the terminal so you see the real error — no hunting in log files.",
"<strong>Create the <code>apex</code> group + udev rules</strong> (<code>/etc/udev/rules.d/99-coral-apex.rules</code>) so <code>/dev/apex_*</code> nodes get the right group on the next boot.",
"<strong>Load the modules:</strong> <code>modprobe gasket</code> + <code>modprobe apex</code>."
],
"imageAlt": "DKMS build progress + module load output for the gasket/apex kernel modules"
},
"usb": {
"title": "USB path — Edge TPU runtime from Google",
"body": "Only runs if a USB Accelerator was detected. Much simpler:",
"items": [
"<strong>Add Google's GPG key</strong> to <code>/etc/apt/keyrings/coral-edgetpu.gpg</code>.",
"<strong>Add the APT repository</strong> to <code>/etc/apt/sources.list.d/coral-edgetpu.list</code> with <code>signed-by=</code> pointing at the keyring (modern, non-deprecated format).",
"<strong>Install <code>libedgetpu1-std</code></strong> — the standard-performance Edge TPU runtime.",
"<strong>Reload udev</strong> so the rules shipped with the package apply to the USB device without having to unplug/replug."
],
"stdTitle": "Why libedgetpu1-std and not libedgetpu1-max?",
"stdBody": "The <code>max</code> variant overclocks the Coral and runs hotter. Fine for desktop use with airflow; not recommended inside small NUC / Mini-PC builds or passively cooled Proxmox hosts. If you really want it, install by hand afterwards: <code>apt install libedgetpu1-max</code>."
},
"reboot": {
"title": "Reboot prompt (only if PCIe ran)",
"body": "If the PCIe path ran, the script offers a reboot. The modules were <code>modprobe</code>'d already so in theory you can skip the reboot — but a clean boot is the right way to verify the module comes up on its own and <code>/dev/apex_0</code> appears with group <code>apex</code>. For USB-only installs, no reboot is suggested (the runtime and udev rules are active immediately).",
"imageAlt": "Final summary + reboot prompt after a PCIe install"
}
},
"reinstallUninstall": {
"heading": "Reinstall or uninstall",
"intro": "Running the installer on a host where Coral is already installed (PCIe via <code>gasket-dkms</code>, USB via <code>libedgetpu1-std</code>/<code>libedgetpu1-max</code>, or both) no longer drops straight into another fresh install. Instead, ProxMenux detects the existing setup and shows an action menu so you can decide what to do.",
"imageAlt": "Coral action menu with two choices — Reinstall / update Coral drivers, or Uninstall Coral drivers and configuration — shown when the installer detects a previous Coral install on the host",
"imageCaption": "The action menu only appears when at least one Coral component is already installed (DKMS gasket entry, <code>libedgetpu1-*</code> package, or live <code>/dev/apex_*</code> device nodes).",
"reinstallHeading": "Reinstall / update",
"reinstallBody": "Continues with the normal install flow — useful after a kernel upgrade if the DKMS rebuild didn't happen automatically, or to lift the runtime to a newer <code>libedgetpu1-*</code> from the Google Coral apt repo. The previous DKMS state is cleaned up first so a half-installed package from a failed earlier attempt doesn't block the new build.",
"uninstallHeading": "Uninstall — what gets removed",
"uninstallIntro": "Confirms with a yes/no dialog before doing anything (LXC containers with apex passthrough will lose access to <code>/dev/apex_*</code> after the next reboot — the warning makes that clear). Then it performs a full, idempotent rollback:",
"uninstallItems": [
"Unloads the <code>apex</code> and <code>gasket</code> kernel modules.",
"Removes every DKMS-registered <code>gasket/&lt;version&gt;</code> entry so the modules don't come back on the next kernel install.",
"Purges the <code>gasket-dkms</code>, <code>libedgetpu1-std</code> and <code>libedgetpu1-max</code> apt packages, then runs <code>apt-get autoremove --purge</code>.",
"Deletes the udev rule <code>/etc/udev/rules.d/99-coral-apex.rules</code> ProxMenux wrote; restores the upstream <code>60-gasket-dkms.rules</code> group to its default if it's still present.",
"Removes the <code>apex</code> system group <strong>only if no users still belong to it</strong> — if you mapped a custom user into <code>apex</code> for an LXC passthrough, the group is left in place and a warning prints the current members.",
"Cleans the Google Coral apt repository entry and keyring (<code>/etc/apt/sources.list.d/coral-edgetpu.list</code> + <code>coral-edgetpu-archive-keyring.gpg</code>).",
"Prompts for a reboot at the end <strong>only if the PCIe path was installed</strong> — the cleanest way to flush the kernel modules. USB-only uninstalls don't need one."
],
"lxcWarnTitle": "LXC containers with apex passthrough",
"lxcWarnBody": "Uninstalling on the host invalidates the device path mapped into any LXC container configured for apex passthrough. Plan the operation during a maintenance window if Frigate / DeepStack / similar workloads depend on it."
},
"updates": {
"heading": "Update notifications",
"intro": "ProxMenux now tracks the installed Coral components in its managed-installs registry. Both variants are followed independently — a host with both M.2 and USB Coral devices gets two update streams, each with its own upstream source:",
"headerVariant": "Variant",
"headerTracked": "Tracked version",
"headerUpstream": "Upstream source",
"pcieVariant": "PCIe / M.2",
"pcieTracked": "<code>gasket-dkms</code> Debian version (or the DKMS-registered version if the package was force-removed)",
"pcieUpstream": "Latest tag of <code>feranick/gasket-driver</code> on GitHub (7-day cache, build-number comparison)",
"usbVariant": "USB",
"usbTracked": "<code>libedgetpu1-std</code> or <code>libedgetpu1-max</code> apt version",
"usbUpstream": "Local <code>apt-cache policy</code> candidate (Google Coral apt repo)",
"outro": "When a newer version is detected the Monitor fires a <code>coral_driver_update_available</code> notification on every enabled channel (Telegram, Discord, Gotify, ntfy, email, webhook). The notification points back at the same installer entry — pick <strong>Reinstall / update</strong> from the action menu above to apply it.",
"antiTitle": "Anti-cascade by design",
"antiBody": "One notification per variant, only when the upstream version actually changes — never on every 24h scan. If you ignore an update it doesn't re-ping you until a newer release lands.",
"rebootTitle": "Reboot is only needed for the PCIe path",
"rebootBody": "The gasket DKMS rebuild loads new kernel modules — that needs a reboot to be active. The USB runtime upgrade is a user-space library swap, no reboot required."
},
"manual": {
"heading": "Manual equivalent",
"intro": "If you want to know what happens under the hood, or redo an individual step by hand, the raw commands per path look like this.",
"pcieHeading": "PCIe / M.2 (gasket + apex via DKMS)",
"usbHeading": "USB (libedgetpu runtime)"
},
"verification": {
"heading": "Verification",
"pcieHeading": "PCIe / M.2",
"usbHeading": "USB"
},
"troubleshoot": {
"heading": "Troubleshooting",
"dkmsFailTitle": "DKMS build fails after a kernel upgrade",
"dkmsFailBody": "Most common cause on PVE 9: the running kernel bumped but <code>proxmox-headers-$(uname -r)</code> isn't installed for it yet. Check with <code>dpkg -l proxmox-headers-$(uname -r)</code>. Install the missing headers and re-run the script — ProxMenux's <code>cleanup_broken_gasket_dkms</code> step handles any leftover half-configured package state.",
"apexMissTitle": "/dev/apex_0 missing after reboot",
"apexMissBody": "The module isn't loaded. Try <code>modprobe apex</code> by hand. If that errors, check <code>dmesg | grep -iE \"apex|gasket\"</code> for the real failure — common culprits are kernel version mismatch (DKMS was built for a different kernel than the one you booted) or a firmware upgrade that disabled the PCIe slot the Coral is in.",
"lxcMissTitle": "Can see /dev/apex_0 but LXC can't",
"lxcMissBody": "Host is fine. Problem is passthrough config — see <lxcLink>Add Coral TPU to an LXC</lxcLink>.",
"usbUnreachTitle": "USB Accelerator detected but Frigate / TFLite can't reach it",
"usbUnreachBody": "Check the udev rules shipped with <code>libedgetpu1-std</code> took effect: <code>ls -l /dev/bus/usb/*/*</code>. The device should NOT be owned by root:root with mode 0600 — if it is, run <code>udevadm control --reload-rules &amp;&amp; udevadm trigger</code> on the host, unplug the USB Coral, wait 3 seconds and plug it back in.",
"logTitle": "Install log",
"logBody": "Every run writes to <code>/tmp/coral_install.log</code> on the host. If the DKMS build dies, the script also appends the last 50 lines of <code>/var/lib/dkms/gasket/1.0/build/make.log</code> to that log — attach it when asking for help on GitHub."
},
"related": {
"heading": "Related",
"items": [
{
"label": "Add Coral TPU to LXC",
"href": "/docs/hardware/coral-tpu-lxc",
"tail": " — pass the host Coral device into a container (Frigate, CodeProject.AI…)."
},
{
"label": "Install NVIDIA Drivers (Host)",
"href": "/docs/hardware/nvidia-host",
"tail": " — same idea for NVIDIA GPUs."
},
{
"label": "ProxMenux Monitor — Hardware tab",
"href": "/docs/monitor/dashboard/hardware",
"tail": " — the Coral modal that surfaces driver, modules, device nodes, runtime status and live temperature once the host install is done."
}
]
}
}


@@ -0,0 +1,204 @@
{
"meta": {
"title": "Install NVIDIA Drivers on the Host | ProxMenux Documentation",
"description": "Install and configure NVIDIA proprietary drivers on a Proxmox VE host using ProxMenux. Covers kernel compatibility, VFIO setup, persistence service, optional NVENC patch and automatic LXC propagation."
},
"header": {
"title": "Install NVIDIA Drivers on the Host",
"description": "Install the NVIDIA proprietary driver on a Proxmox VE host using ProxMenux. The installer handles kernel compatibility, nouveau blacklisting, VFIO configuration, persistence service and can propagate the driver to any LXC container that already has NVIDIA passthrough configured.",
"section": "Hardware: GPUs and Coral-TPU"
},
"intro": {
"title": "What this does",
"body": "ProxMenux automates the whole NVIDIA driver lifecycle on the host: detects your GPU, picks a driver version that is compatible with your running kernel, blacklists <code>nouveau</code>, downloads and runs the official NVIDIA <code>.run</code> installer with DKMS, installs the <code>nvidia-persistenced</code> service and udev rules, and offers to apply the optional NVENC patch. If you already have LXC containers with NVIDIA passthrough, it can update the userspace libraries inside them so their version matches the host."
},
"who": {
"heading": "Who is this for?",
"body": "If you have an NVIDIA GPU and you want to use it for hardware-accelerated transcoding in Plex, Jellyfin, Frigate, Immich, Stable Diffusion or any other app running <strong>inside an LXC container</strong>, you need to install the driver on the Proxmox <em>host</em> first. This page covers that host-side install. Passing the GPU to a <strong>virtual machine (VM)</strong> uses a different flow (VFIO passthrough) and is documented separately."
},
"prereqs": {
"title": "Before you start",
"gpu": "<strong>An NVIDIA GPU</strong> physically installed in the host. The script auto-detects it; AMD and Intel GPUs are not handled here.",
"gpuCheck": "lspci | grep -i nvidia",
"notVm": "The GPU <strong>is not currently assigned to a VM via VFIO passthrough</strong>. If it is, the script will refuse to install the host driver to avoid breaking the passthrough config.",
"internet": "Internet access on the host. The installer downloads the driver from <code>download.nvidia.com</code> and, optionally, clones <code>nvidia-persistenced</code> and <code>nvidia-patch</code> from GitHub.",
"space": "About <strong>2 GB of free space</strong> in <code>/opt/nvidia</code> (workdir) plus the RAM used during the install. A reboot is required at the end."
},
"vmWarn": {
"title": "GPU assigned to a VM? Stop here",
"body": "If the GPU is currently bound to <code>vfio-pci</code> (i.e. it is being passed through to a VM), installing the host driver can break the passthrough and destabilise the system. ProxMenux detects this and aborts. Remove the GPU from the VM passthrough configuration and reboot before running this script."
},
"running": {
"heading": "Running the installer",
"body": "Open ProxMenux on the host, go to <strong>Hardware Graphics → NVIDIA GPU Driver Installer</strong>. What you see depends on whether a driver is already present.",
"imageAlt": "Hardware Graphics menu with the NVIDIA GPU Driver Installer entry highlighted"
},
"howRuns": {
"heading": "How the script runs",
"body": "The installer goes through three phases with clear separation between \"collecting information and validating\" and \"actually touching the host\". Until the final confirmation, nothing has been changed."
},
"walkthrough": {
"detect": {
"title": "GPU detection and overview",
"body1": "The script scans the PCI bus and shows every NVIDIA video controller it finds, the current driver status (or <em>\"No NVIDIA driver installed\"</em>), and any LXC container that already has NVIDIA passthrough configured (driver version inside each one).",
"body2": "Review the overview carefully. If the detected GPU is not what you expect, or if a container's version already matches the host, you can cancel here without side effects.",
"imageAlt": "Pre-install overview showing detected GPUs, current driver status and LXC containers with NVIDIA passthrough"
},
"version": {
"title": "Choose the driver version",
"body1": "ProxMenux fetches the list of available drivers from NVIDIA and <strong>filters out versions that are not compatible with your running kernel</strong>. The <em>Latest available</em> option is almost always the right pick.",
"body2": "The compatibility matrix the script uses:",
"headerKernel": "Kernel",
"headerPve": "Typical PVE version",
"headerMin": "Minimum NVIDIA driver",
"rows": [
{
"kernel": "6.17+",
"pve": "Proxmox VE 9.x",
"minCode": "580.82.07",
"minTail": " or newer"
},
{
"kernel": "6.8 6.16",
"pve": "Proxmox VE 8.2+",
"minCode": "550.x",
"minTail": " or newer"
},
{
"kernel": "6.2 6.7",
"pve": "Proxmox VE 8.0 8.1",
"minCode": "535.x",
"minTail": " or newer"
},
{
"kernel": "5.15+",
"pve": "Proxmox VE 7.x (legacy)",
"minCode": "470.x",
"minTail": " or newer"
}
],
"whyTitle": "Why kernel matters",
"whyBody": "Kernel 6.17 introduced internal API changes that break older NVIDIA drivers. If you install a driver below the minimum for your kernel, DKMS will fail to build the module and the GPU will not be available after reboot. ProxMenux filters the list so you can't pick an incompatible version by accident.",
"imageAlt": "Driver version selector with kernel-compatible versions, Latest available on top"
},
"uninstall": {
"title": "Clean uninstall (only if reinstalling)",
"body": "If a driver is already present and you picked a different version, ProxMenux stops the NVIDIA services, unloads the kernel modules, removes DKMS entries and purges <code>nvidia-*</code> / <code>libnvidia-*</code> / <code>cuda-*</code> packages before touching the new installer. This avoids the classic mixed-version mess."
},
"prepare": {
"title": "Prepare the system",
"body": "Behind a single confirmation, the script:",
"items": [
"Installs <code>pve-headers-$(uname -r)</code> (or <code>proxmox-headers-$(uname -r)</code>), <code>build-essential</code> and <code>dkms</code>.",
"Creates <code>/etc/modprobe.d/nouveau-blacklist.conf</code> blacklisting <code>nouveau</code>, and tries to unload it immediately.",
"Writes <code>/etc/modules-load.d/nvidia-vfio.conf</code> with <code>vfio</code>, <code>vfio_pci</code>, <code>nvidia</code>, <code>nvidia_uvm</code> and related modules."
]
},
"download": {
"title": "Download and run the NVIDIA installer",
"body": "The installer downloads the <code>NVIDIA-Linux-x86_64-&lt;version&gt;.run</code> file into <code>/opt/nvidia</code>, validates it (size + executable signature, not just HTTP 200), then runs it with DKMS so the kernel module rebuilds automatically across kernel upgrades.",
"imageAlt": "Download progress followed by the NVIDIA installer running its DKMS build"
},
"persist": {
"title": "Persistence service and udev rules",
"body": "ProxMenux then installs <persistLink>nvidia-persistenced</persistLink> and writes udev rules at <code>/etc/udev/rules.d/70-nvidia.rules</code> so <code>/dev/nvidia*</code> device nodes appear reliably on boot. Without these, LXC passthrough can race on container startup and end up with a container that can't see the GPU."
},
"nvenc": {
"title": "Optional: apply the NVENC patch",
"body": "Consumer NVIDIA GPUs (GeForce line) limit the number of simultaneous NVENC encoding sessions. The community <patchLink>keylase/nvidia-patch</patchLink> removes that restriction. If you plan to use the GPU for Plex / Jellyfin / Frigate with many concurrent streams, answer <strong>Yes</strong> when prompted.",
"supportTitle": "Check patch support for your driver",
"supportBody": "The patch does not cover every driver version. Before relying on it in production, verify your version is listed in the <patchTableLink>patch table</patchTableLink>. If it isn't supported yet, pick a nearby older driver that is."
},
"propagate": {
"title": "Optional: propagate the driver to LXC containers",
"body1": "If the overview screen listed containers with NVIDIA passthrough, ProxMenux now offers to update the userspace libraries inside each one to match the host. Host kernel module and container userspace <strong>must be the exact same version</strong> — otherwise <code>nvidia-smi</code> inside the container will fail with a \"version mismatch\" error.",
"body2": "The update is distro-aware: <code>apk</code> for Alpine, <code>pacman</code> for Arch, and the same <code>.run</code> installer (with <code>--no-kernel-modules --no-dkms --no-install-compat32-libs</code>) for Debian/Ubuntu and other distros. It temporarily raises container RAM to 2 GB if lower, runs the install, then restores the original RAM setting.",
"imageAlt": "Prompt listing LXCs with NVIDIA passthrough and current driver version, with Yes/No to update them all"
},
"reboot": {
"title": "Reboot",
"body": "Finally, the script rebuilds <code>initramfs</code> for all kernels and offers to reboot. A reboot <strong>is required</strong>: the nouveau blacklist and the new kernel module only take effect after restart."
}
},
"reinstallUninstall": {
"heading": "Reinstall or uninstall",
"intro": "When the installer detects that a NVIDIA driver is already loaded (<code>nvidia-smi</code> returns a version), it doesn't silently re-install on top. Instead it shows an action menu so you can choose what to do.",
"imageAlt": "NVIDIA action menu offered when a driver is already installed — two choices: Reinstall / update driver, or Uninstall the NVIDIA driver completely",
"imageCaption": "The action menu only appears when an NVIDIA driver is currently active on the host.",
"reinstallHeading": "Reinstall / update",
"reinstallBody": "Continues with the normal install flow but, before downloading anything, runs a clean removal of the current driver (apt purge + DKMS entries dropped + leftover modules unloaded). This is the safe path to apply a newer driver version, switch branches when the kernel demands it, or recover from a half-broken state. The LXC propagation and NVENC patch prompts re-run at the end.",
"uninstallHeading": "Uninstall — what gets removed",
"uninstallIntro": "Confirms with a yes/no dialog first. Then performs a full, idempotent rollback:",
"uninstallItems": [
"Stops and disables <code>nvidia-persistenced</code>, unloads the kernel modules (<code>nvidia_uvm</code>, <code>nvidia_drm</code>, <code>nvidia_modeset</code>, <code>nvidia</code>) — any LXC container with NVIDIA passthrough will be cleanly cut off.",
"Runs <code>apt purge</code> on every NVIDIA package, removes the DKMS source tree and the <code>/opt/nvidia</code> .run installer cache.",
"Reverts the nouveau blacklist (<code>/etc/modprobe.d/nouveau-blacklist.conf</code>) and the modules-load config (<code>/etc/modules-load.d/nvidia-vfio.conf</code>) so nouveau can come back if you want generic graphics again.",
"Removes the udev rules (<code>/etc/udev/rules.d/70-nvidia.rules</code>) and the NVENC patch state file (if the keylase patch was applied earlier).",
"Rebuilds <code>initramfs</code> for all kernels and prompts for a reboot to finalise (the nouveau unblacklist only takes effect after restart)."
],
"lxcWarnTitle": "LXC containers with NVIDIA passthrough",
"lxcWarnBody": "Removing the host driver invalidates the device paths and CUDA libraries mapped into any LXC with NVIDIA passthrough. Plan the operation during a maintenance window if Frigate / Plex / Jellyfin / Ollama (or anything else) depends on it."
},
"updates": {
"heading": "Update notifications",
"body": "The installed NVIDIA driver is tracked in ProxMenux's managed-installs registry. On startup and every 24h the Monitor checks the upstream listing at <code>download.nvidia.com/XFree86/Linux-x86_64/</code> against the version <code>nvidia-smi</code> reports, and fires a notification when a newer compatible version is available.",
"kindsHeading": "Two kinds of update message",
"kindsItems": [
"<strong>Same-branch patch.</strong> A newer maintenance release in your current driver branch (e.g. installed 580.65.06 → available 580.105.08). Bug fixes and security patches without changing branch.",
"<strong>Branch upgrade required by kernel.</strong> If the host is on a kernel that no longer supports your current branch (e.g. you upgraded the host kernel to 6.17 while still on driver 570.x), the message says so explicitly and recommends the kernel's minimum compatible branch — same matrix the installer uses to filter the version menu."
],
"antiTitle": "Anti-cascade by design",
"antiBody": "One notification per distinct upstream version, never on every 24h scan. The branch-upgrade message in particular only fires once you actually need to switch — until then the same-branch tracker stays muted.",
"applyTitle": "Applying the update",
"applyBody": "The Monitor doesn't auto-apply driver updates — reinstalling the NVIDIA driver always needs a reboot. Open the same installer entry described above, pick <strong>Reinstall / update</strong>, and the new version is downloaded, the DKMS module rebuilt against the running kernel, and the reboot prompted at the end."
},
"verify": {
"heading": "Verifying the install",
"intro": "Once the host is back up, log in over SSH or the Proxmox shell and run:",
"after": "You should see your GPU listed, the driver version on the top border, and no processes yet (nothing is using the GPU at this point). Then check the persistence service:",
"imageAlt": "Output of nvidia-smi on the host showing the detected GPU and installed driver version"
},
"troubleshoot": {
"heading": "Troubleshooting",
"smiFailTitle": "`nvidia-smi` says 'NVIDIA-SMI has failed'",
"smiFailBody": "Almost always a <strong>nouveau</strong> still loaded or a <strong>kernel header mismatch</strong>. After reboot, run <code>lsmod | grep nouveau</code> — if it returns anything, the blacklist didn't take effect (check <code>/etc/modprobe.d/nouveau-blacklist.conf</code> exists and rebuild initramfs with <code>update-initramfs -u -k all</code>, then reboot). If nouveau is gone, check <code>dmesg | grep -i nvidia</code> — DKMS build errors usually mean your kernel headers don't match the running kernel; reinstall them with <code>apt install --reinstall pve-headers-$(uname -r)</code>.",
"lxcMissTitle": "LXC container can't see the GPU after host update",
"lxcMissBody": "The container's userspace libraries are stuck at the previous driver version. Either re-run the NVIDIA installer and accept the LXC propagation prompt, or install the same driver version manually inside the container with <code>--no-kernel-modules</code>.",
"logTitle": "Check the install log",
"logBody": "Every install writes to <code>/tmp/nvidia_install.log</code>. If something fails silently, that file has the full output (downloads, DKMS build, service installs). Attach it when reporting issues on GitHub."
},
"manualSteps": {
"heading": "Looking for the manual steps?",
"body": "The original community guide — installing everything by hand with <code>wget</code> and <code>./NVIDIA-Linux-...run</code> — is still available as a reference under <guideLink>Guides → NVIDIA</guideLink>. It's useful if you want to understand every command the ProxMenux installer runs, or if you're troubleshooting an unusual setup. For day-to-day installs, use ProxMenux — it's the path that keeps receiving fixes (kernel compatibility, LXC propagation, VFIO safety checks)."
},
"related": {
"heading": "Related",
"items": [
{
"label": "Add GPU to VM (Passthrough)",
"href": "/docs/hardware/gpu-vm-passthrough",
"tail": " — pass the NVIDIA GPU to a VM (different binding model from LXC)."
},
{
"label": "Add GPU to LXC",
"href": "/docs/hardware/igpu-acceleration-lxc",
"tail": " — share the NVIDIA GPU with one or more containers."
},
{
"label": "Switch GPU Mode (VM ↔ LXC)",
"href": "/docs/hardware/switch-gpu-mode",
"tail": " — toggle the same GPU between passthrough (VM) and shared (LXC) modes."
},
{
"label": "ProxMenux Monitor — Hardware tab",
"href": "/docs/monitor/dashboard/hardware",
"tail": " — the GPU modal that triggers this installer in one click, plus live monitoring once it's done."
},
{
"label": "GPU Passthrough commands",
"href": "/docs/help-info/gpu-commands",
"tail": " — lspci / dmesg / IOMMU / nvidia-smi reference."
}
]
}
}

@@ -0,0 +1,172 @@
{
"meta": {
"title": "Switch GPU Mode (VM ↔ LXC) | ProxMenux Documentation",
"description": "Move an already-assigned GPU between VM mode (vfio-pci) and LXC mode (native driver) on a Proxmox host. ProxMenux detects the current binding, prompts per-workload conflict policies, handles orphan audio cleanup, and rebuilds initramfs only when needed."
},
"header": {
"title": "Switch GPU Mode (VM ↔ LXC)",
"description": "Reassign a GPU that's already in use — flip it from VM passthrough to LXC sharing or the other way round. ProxMenux handles all the host-side binding changes (vfio.conf, driver blacklist, modules, initramfs) and offers a clean policy for every VM or LXC currently using the GPU, so the switch doesn't leave broken config behind.",
"section": "Hardware: GPUs and Coral-TPU"
},
"intro": {
"title": "What this does",
"body": "A GPU on a Proxmox host lives in one of two modes: bound to <code>vfio-pci</code> (reserved for a VM) or bound to its native driver <em>(i915 / amdgpu / nvidia)</em> so the host + LXCs can share it. <strong>Switch GPU Mode</strong> flips between those two without you having to hand-edit <code>vfio.conf</code>, manage blacklists, or remember which VM / LXC lines point to the card. It also warns you cleanly if a workload still references the GPU so you don't end up with a broken VM at boot."
},
"graphics": {
"lxcTitle": "Ready for LXC containers",
"lxcDesc": "Native driver active",
"vmTitle": "Ready for VM passthrough",
"vmDesc": "VFIO-PCI driver active"
},
"when": {
"heading": "When should I use this?",
"intro": "Use this script when a GPU is <strong>already assigned</strong> and you want to move it:",
"headerSituation": "Situation",
"headerUse": "Use this page?",
"rows": [
{
"situation": "GPU is free — never assigned. Want to give it to a VM.",
"useRich": "No — use <vmLink>Add GPU to VM</vmLink>"
},
{
"situation": "GPU is free — never assigned. Want to give it to an LXC.",
"useRich": "No — use <lxcLink>Add GPU to LXC</lxcLink>"
},
{
"situationRich": "GPU is in a VM via <code>vfio-pci</code>, I want to use it from an LXC instead.",
"useRich": "<strong>Yes — this page.</strong>"
},
{
"situation": "GPU is shared with an LXC, I want to dedicate it to a VM.",
"useRich": "<strong>Yes — this page.</strong>"
},
{
"situation": "I just want to completely unbind a GPU from everything.",
"use": "Yes — pick the LXC-mode target and then detach manually."
}
]
},
"prereqs": {
"title": "Before you start",
"assigned": "<strong>A GPU already assigned</strong> — either in a VM via VFIO or attached to at least one LXC. If you haven't assigned it yet, start from Add GPU to VM / LXC instead.",
"iommu": "<strong>IOMMU enabled on the host</strong> — only strictly required when switching <em>to</em> VM mode, but worth having on either way. The script warns if the kernel param is missing.",
"iommuCheck": "dmesg | grep -i 'IOMMU enabled' | head -1",
"reboot": "<strong>Be OK with a reboot.</strong> Switching GPU bindings at the kernel level means the host regenerates initramfs and you reboot to apply. The script prompts at the end.",
"knowList": "<strong>Know which VMs / LXCs are using the GPU.</strong> The script will find them and ask what to do with each, but it's faster if you already know the list."
},
"blocklist": {
"title": "Not all GPUs are safe to pass to a VM",
"body": "A small blocklist of GPU IDs is refused for VM mode due to known passthrough instability (e.g. Intel Arc A770 <code>8086:5a84</code> / <code>8086:5a85</code>). If the selected GPU matches, the script explains why and exits. Switching <em>to</em> LXC mode is always allowed."
},
"running": {
"heading": "Running the script",
"body": "Open ProxMenux on the host, go to <strong>Hardware: GPUs and Coral-TPU → Switch GPU Mode (VM ↔ LXC)</strong>.",
"imageAlt": "Menu entry for 'Switch GPU Mode (VM ↔ LXC)' inside Hardware: GPUs and Coral-TPU"
},
"howRuns": {
"heading": "How the script runs",
"body": "Two phases as usual: everything is collected and validated first, nothing is applied until you confirm at the end."
},
"walkthrough": {
"heading": "Walking through the flow",
"detect": {
"title": "Detect GPUs and their current binding",
"body": "The script scans every VGA / 3D / Display controller on the host and inspects <code>/sys/bus/pci/devices/*/driver</code> to find the current kernel driver. You'll see each GPU labelled with its name, PCI slot and current driver binding — so you can tell at a glance which mode it's in.",
"imageAlt": "GPU checklist showing each detected GPU with its current driver (vfio-pci / nvidia / amdgpu / i915) and PCI slot"
},
"pickGpu": {
"title": "Pick the GPU(s) to switch",
"body": "Single GPU → auto-selected. Multiple GPUs → checklist. You can tick several, but they must all be in the <em>same</em> current mode — otherwise the script can't pick a target mode for the batch and you get a \"mixed mode\" warning asking you to narrow the selection.",
"tipTitle": "Batching switches",
"tipBody": "Useful when you're rebuilding a host: \"All three NVIDIAs go to VM mode, then the iGPU goes back to LXC.\" Two runs, each with uniform target, much less friction than one-at-a-time."
},
"direction": {
"title": "Review the proposed direction",
"intro": "Based on the current mode, the script proposes the opposite as target:",
"items": [
"<strong>VM → LXC:</strong> unbind from <code>vfio-pci</code>, let the native driver (<code>nvidia</code>, <code>amdgpu</code>, <code>i915</code>) reclaim the card so LXCs can share it. On NVIDIA, the per-BDF entry is removed from <code>/etc/udev/rules.d/10-proxmenux-vfio-bind.rules</code> so the nvidia module reclaims the GPU after reboot.",
"<strong>LXC → VM:</strong> bind to <code>vfio-pci</code> so the card is free for VFIO passthrough to a single VM. On AMD / Intel this means blacklisting the native driver and setting <code>options vfio-pci ids=…</code>. On NVIDIA the <code>nvidia</code> module is <strong>not</strong> blacklisted — instead a per-BDF udev rule applies <code>driver_override=vfio-pci</code> only to the GPUs you select, so other NVIDIA GPUs on the host keep their <code>nvidia</code> driver."
],
"outro": "Confirm the direction or cancel."
},
"conflict": {
"title": "Conflict policy per affected workload",
"body": "The script scans every <code>/etc/pve/lxc/*.conf</code> and <code>/etc/pve/qemu-server/*.conf</code> looking for references to the GPU's PCI slot. For each affected workload you pick a policy:",
"headerPolicy": "Policy",
"headerEffect": "Effect",
"headerWhen": "When to pick",
"keepPolicy": "Keep config, disable onboot",
"keepEffect": "<code>pct set -onboot 0</code> (or <code>qm set</code>). GPU lines stay in the config.",
"keepWhen": "You plan to come back to this VM/LXC once the GPU is back in its original mode. Safe default.",
"removePolicy": "Remove GPU from config",
"removeEffect": "<code>hostpci</code> / <code>dev</code> lines for this GPU's slot are sed'd out.",
"removeWhen": "The VM/LXC will keep running without the GPU (CPU-only transcoding, etc.). Clean workflow.",
"imageAlt": "Dialog asking per-VM / per-LXC conflict policy when switching a GPU that's currently assigned"
},
"audio": {
"title": "Orphan audio cleanup (only when leaving VM mode)",
"body1": "dGPUs (NVIDIA / AMD) ship with an HDMI audio function at <code>.1</code> of the same slot, and sometimes extra audio controllers are attached alongside the GPU. When the GPU leaves the VM, those audio lines become orphans — the VM has <code>hostpci</code> entries pointing to audio devices that aren't going with the GPU.",
"body2": "The script discovers them (precise BDF match, no substring false-positives) and shows a checklist so you can remove them cleanly. It also cleans their vendor:device IDs from <code>/etc/modprobe.d/vfio.conf</code> — but only if no other VM still uses those audio IDs."
},
"apply": {
"title": "Apply host + workload changes",
"body": "Once you confirm, the script writes the host-side changes — <code>vfio.conf</code>, blacklist, modules, and (for NVIDIA) the per-BDF udev rule at <code>/etc/udev/rules.d/10-proxmenux-vfio-bind.rules</code> plus the BDF state at <code>/etc/proxmenux/vfio-bind.bdfs</code>. It also applies the chosen conflict policy to each affected VM/LXC. If the host config actually changed, it runs <code>update-initramfs -u -k all</code> — otherwise it skips that step."
},
"reboot": {
"title": "Reboot",
"body": "The new GPU binding only takes effect after a reboot. The script prompts you; you can reboot now or later, but don't start the target VM/LXC until the host has rebooted — otherwise the GPU is still held by the previous driver.",
"imageAlt": "Summary dialog listing what changed, followed by the reboot prompt"
}
},
"manual": {
"heading": "Manual equivalent",
"intro": "If you want to understand exactly what the script does (or troubleshoot one of the steps by hand), these are the raw operations for <strong>VM → LXC</strong> on an NVIDIA card with vendor:device <code>10de:2204</code>:",
"lxcToVm": "And for <strong>LXC → VM</strong>:",
"oneVmTitle": "Only one VM can use a given vfio-pci GPU at a time",
"oneVmBody": "Putting multiple <code>hostpci</code> entries with the same PCI slot in two VMs is valid config but only one of the VMs can start with the GPU — the second one will fail. The ProxMenux conflict policy step is exactly about avoiding this trap."
},
"verification": {
"heading": "Verification after reboot"
},
"troubleshoot": {
"heading": "Troubleshooting",
"stillVfioTitle": "GPU still shows vfio-pci after switching to LXC mode",
"stillVfioBody": "<code>update-initramfs</code> didn't run (or the reboot didn't actually happen). Check <code>lsmod | grep vfio</code> — if vfio-pci is loaded, rerun <code>update-initramfs -u -k all</code> and reboot. For AMD/Intel: verify <code>vfio.conf</code> no longer contains the GPU's vendor:device ID. For NVIDIA: verify the BDF is no longer in <code>/etc/proxmenux/vfio-bind.bdfs</code> and that <code>/etc/udev/rules.d/10-proxmenux-vfio-bind.rules</code> doesn't list it.",
"vmFailTitle": "A VM won't start after switching a GPU to LXC mode",
"vmFailBody": "The VM still has <code>hostpci</code> entries pointing to a GPU it can't claim. Run the script again and pick the <em>Remove GPU from config</em> policy, or clean the config by hand:",
"smiFailTitle": "nvidia-smi fails with 'Driver/library version mismatch' after going back to LXC",
"smiFailBody": "Host NVIDIA modules didn't reload cleanly. <code>modprobe -r nvidia</code> then <code>modprobe nvidia</code>. If that fails, reboot — a full reboot always clears residual state from the vfio binding.",
"logTitle": "Install log",
"logBody": "Every run writes to <code>/tmp/proxmenux_gpu_switch_mode.log</code> on the host. Attach it when asking for help on GitHub."
},
"related": {
"heading": "Related",
"items": [
{
"label": "Add GPU to VM (Passthrough)",
"href": "/docs/hardware/gpu-vm-passthrough",
"tail": " — first-time VM mode setup."
},
{
"label": "Add GPU to LXC",
"href": "/docs/hardware/igpu-acceleration-lxc",
"tail": " — first-time LXC mode setup."
},
{
"label": "Install NVIDIA Drivers (Host)",
"href": "/docs/hardware/nvidia-host",
"tail": " — required for LXC mode on NVIDIA GPUs."
},
{
"label": "ProxMenux Monitor — Hardware tab",
"href": "/docs/monitor/dashboard/hardware",
"tail": " — the Graphics Cards section where each GPU shows its current mode and exposes the inline switch control."
},
{
"label": "GPU Passthrough commands",
"href": "/docs/help-info/gpu-commands",
"tail": " — quick command reference."
}
]
}
}