Commit Graph

116 Commits

Author SHA1 Message Date
MacRimi
3629fe8848 Add beta 1.2.2.1 2026-06-05 17:12:23 +02:00
MacRimi
9677c5cb19 Health Monitor: reconcile stale disk warnings across reboots
When a host gets transient I/O events on a disk while smartctl is
momentarily unavailable (the canonical case: late in a noisy
shutdown), the disk-scan code records a `disk_<name>` WARNING tagged
"SMART: unavailable" exactly once and trusts the next scan to clear
it. That trust is misplaced: the clear path only fires when the
device shows up in the current dmesg window with zero events. After
a reboot, dmesg has no entries for that device, so the device is never
iterated, `resolve_error` is never called, and the dashboard stays
orange for a disk whose SMART now reports PASSED.

Caught on a lab host where `disk_nvme2n1` had been stuck as WARNING
for hours after a reboot. SMART was 100% healthy at the moment of
inspection (Critical Warning 0x00, 0 media errors, 100% spare). The
error's first_seen and last_seen were identical and pre-dated the
current boot, confirming a one-shot record that nothing had cleared.

Fix: add a `_reconcile_stale_disk_warnings()` pass at the top of
`_check_disks_optimized()`. For every active `disk_*` error
(skipping `disk_fs_*`, which is already reconciled separately):

  - device gone from /dev/   → resolve "Device no longer present"
  - device present + SMART PASSED → resolve "Transient I/O cleared,
    SMART now reports healthy"
  - device present + SMART UNKNOWN/FAILED → leave active so the
    main loop can re-classify on the next dmesg window

Acknowledged errors are left alone so the user's explicit dismiss
intent isn't overridden.
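
The decision table above can be sketched roughly as follows. This is
an illustration only, not the code in health_monitor.py: the
`ActiveError` record and the injected `device_exists`, `smart_status`
and `resolve_error` callables are assumptions made to keep the example
self-contained.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class ActiveError:
    """Hypothetical stand-in for a row in the persistence DB."""
    error_key: str              # e.g. "disk_nvme2n1" or "disk_fs_<name>"
    acknowledged: bool = False  # True once the user dismissed it


def reconcile_stale_disk_warnings(
    errors: Iterable[ActiveError],
    device_exists: Callable[[str], bool],       # assumed: checks /dev/<name>
    smart_status: Callable[[str], str],         # assumed: "PASSED"/"FAILED"/"UNKNOWN"
    resolve_error: Callable[[str, str], None],  # assumed: (error_key, reason)
) -> None:
    for err in errors:
        key = err.error_key
        # Only plain disk_* errors; disk_fs_* is reconciled separately.
        if not key.startswith("disk_") or key.startswith("disk_fs_"):
            continue
        # Respect an explicit user dismiss.
        if err.acknowledged:
            continue
        device = key[len("disk_"):]
        if not device_exists(device):
            resolve_error(key, "Device no longer present")
        elif smart_status(device) == "PASSED":
            resolve_error(
                key, "Transient I/O cleared, SMART now reports healthy"
            )
        # UNKNOWN/FAILED: leave active for the main loop to re-classify.
```

Passing the three helpers in as callables keeps the pass testable
without touching /dev/ or smartctl.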

Verified end-to-end: re-injected the original `disk_nvme2n1`
warning into the persistence DB on the lab host and waited one scan
cycle; the error was resolved automatically, with `resolved_at` set
and `resolution_reason = 'Transient I/O cleared, SMART now reports
healthy'`.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-01 22:54:14 +02:00
MacRimi
4bf49675d2 Update ProxMenux 1.2.1.4-beta 2026-05-30 21:54:32 +02:00
MacRimi
f2a40b993a Update AppImage 1.2.1.3 2026-05-22 18:47:30 +02:00
MacRimi
6eb1312c61 1.2.1.1-beta: notification + LXC + post-install fixes
- flask_notification_routes: PVE webhook X-Webhook-Secret written in
  standard base64 so PVE can decode it (GH #198)
- notification_channels: Gmail SMTP App Password handling; normalize
  tls_mode (None/empty → starttls), reject creds without host
  (false-positive sendmail delivery), surface "AUTH not advertised" hint
- notification_events: is_vzdump_active_on_host() reads /var/log/pve/
  tasks/active directly so backup_start fallback and vm_shutdown
  suppression survive a Monitor restart mid-backup
- notification_templates: extract --storage flag from vzdump log →
  "PBS-Cloud: vm/104/…" instead of generic "PBS:" prefix when multiple
  PBS endpoints exist
- health_monitor: pve_storage_capacity + zfs_pool_capacity respect
  per-item dismiss (don't keep category WARNING/CRITICAL after user
  dismisses); updates_check cache invalidated when /var/log/apt/
  history.log mtime advances
- lxc_mount_points: PVE volume size from subvol quota (df via
  /proc/<host_pid>/root/<target> + lxc.conf size=NNNG fallback);
  host_source_state detects "host detached" zombie binds; per-mount
  subprocess work parallelised via ThreadPoolExecutor so a CT with
  many bind mounts doesn't trip the Caddy 3s reverse-proxy timeout
- virtual-machines: "host detached" badge on bind mounts whose host
  source path disappeared
- auto/customizable_post_install: log2ram FUNC_VERSION 1.1 → 1.2; new
  log2ram-check.sh vacuums journal + truncates non-rotating logs
  (pveproxy/access.log, pveam.log) instead of only calling
  `log2ram write` (which leaves the tmpfs full); auto flow gains the
  missing SystemMaxUse in /etc/systemd/journald.conf
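
The lxc_mount_points parallelisation mentioned above might look
roughly like this. It is a minimal sketch, not the code from the
commit: `probe_mounts_parallel`, the `probe` callable and the
`max_workers` default are hypothetical names and values chosen for
illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def _safe_probe(probe: Callable[[str], dict], mount: str) -> dict:
    """Run one probe; report a failure instead of aborting the batch."""
    try:
        return probe(mount)
    except Exception as exc:  # one bad mount must not break the rest
        return {"error": str(exc)}


def probe_mounts_parallel(
    mounts: List[str],
    probe: Callable[[str], dict],  # assumed: wraps the per-mount subprocess work
    max_workers: int = 8,          # assumed default, not taken from the commit
) -> Dict[str, dict]:
    if not mounts:
        return {}
    workers = min(max_workers, len(mounts))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in input order, so zip() pairs correctly.
        results = list(pool.map(lambda m: _safe_probe(probe, m), mounts))
    return dict(zip(mounts, results))
```

Threads suit this case because each probe mostly waits on a
subprocess, so total latency approaches that of the slowest mount
rather than the sum over all mounts.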

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 00:06:49 +02:00
MacRimi
b4e8c5101a Update AppImage 2026-05-10 05:11:51 +02:00
MacRimi
911886b90c Update AppImage 2026-05-10 05:00:00 +02:00
MacRimi
2f919de9e3 update beta ProxMenux 1.2.1.1-beta 2026-05-09 18:59:59 +02:00
MacRimi
039e35f3c5 update health_monitor.py 2026-04-17 16:39:08 +02:00
MacRimi
baa2ff4fa9 update health_persistence.py 2026-04-17 10:38:39 +02:00
MacRimi
ee1204c566 update health_monitor.py 2026-04-16 19:10:47 +02:00
MacRimi
2b8caa924f update notification_events.py 2026-04-09 12:34:03 +02:00
MacRimi
adde2ce5b9 update health_persistence.py 2026-04-06 12:02:05 +02:00
MacRimi
95e876b37f update health_monitor.py 2026-04-05 12:17:42 +02:00
MacRimi
e7dc030304 Update health_monitor.py 2026-04-05 12:02:59 +02:00
MacRimi
4b01ba1d2f update health_monitor.py 2026-04-05 11:58:14 +02:00
MacRimi
e9851da12f update virtual-machines.tsx 2026-04-05 11:51:26 +02:00
MacRimi
e0e732dd2c update health_persistence.py 2026-04-04 01:31:37 +02:00
MacRimi
c2073a5db5 Update health_monitor.py 2026-04-01 15:24:47 +02:00
MacRimi
d62396717a update health_persistence.py 2026-04-01 12:03:54 +02:00
MacRimi
a734fa5566 Update health_monitor.py 2026-04-01 00:01:12 +02:00
MacRimi
2df55d2839 Update health_monitor.py 2026-03-31 23:14:48 +02:00
MacRimi
e00051caa7 update health_monitor.py 2026-03-31 23:00:00 +02:00
MacRimi
80afa789e7 Update notification service 2026-03-30 22:26:20 +02:00
MacRimi
c549737ad0 Update HealthMonitor 2026-03-30 20:52:25 +02:00
MacRimi
2fc5e2865d Update notification service 2026-03-30 19:55:19 +02:00
MacRimi
d628233982 Update notification service 2026-03-28 15:50:30 +01:00
MacRimi
0edc2cc3af Update notification service 2026-03-27 19:40:17 +01:00
MacRimi
6bb9313b95 Update notification service 2026-03-27 19:15:11 +01:00
MacRimi
839a20df97 Update notification service 2026-03-26 19:05:11 +01:00
MacRimi
8b6755d866 Update notification service 2026-03-25 22:43:42 +01:00
MacRimi
bcacd8b98e Update notification service 2026-03-24 17:48:52 +01:00
MacRimi
d2c8178772 Update notification service 2026-03-24 17:34:05 +01:00
MacRimi
d34cebc90d Update health_monitor.py 2026-03-23 20:25:27 +01:00
MacRimi
c7ef51a73c Update notification service 2026-03-23 20:14:25 +01:00
MacRimi
ab34fb08c1 Update health_monitor.py 2026-03-23 19:31:21 +01:00
MacRimi
4ac71381da Update health_monitor.py 2026-03-23 18:08:22 +01:00
MacRimi
04564bc9cf Update health_monitor.py 2026-03-22 14:57:46 +01:00
MacRimi
d33741a90d Update notification service 2026-03-22 14:20:47 +01:00
MacRimi
7838762a4e Update health_monitor.py 2026-03-21 23:19:41 +01:00
MacRimi
876194cdc8 update storage settings 2026-03-19 19:07:26 +01:00
MacRimi
69e0bfe89a update notification service 2026-03-18 17:48:02 +01:00
MacRimi
6aaaa910af Update health_monitor.py 2026-03-16 09:36:40 +01:00
MacRimi
785d58cb59 Update health monitor 2026-03-15 18:12:42 +01:00
MacRimi
af61d145da Update oci manager 2026-03-15 17:59:47 +01:00
MacRimi
9112bcc52f Update health_monitor.py 2026-03-15 10:54:37 +01:00
MacRimi
e534cffcf7 Update health_monitor.py 2026-03-15 10:41:34 +01:00
MacRimi
a184dcc38f Update health_monitor.py 2026-03-15 10:36:19 +01:00
MacRimi
e169200f40 Update health monitor 2026-03-15 10:03:35 +01:00
MacRimi
6d4006fd93 update oci manager 2026-03-12 22:13:56 +01:00