Add database support for measurements and historic prediction data. (#848)

The database supports backend selection, compression, incremental data load, automatic data saving to storage, automatic vacuum and compaction.

Make SQLite3 and LMDB database backends available.

Update tests for new interface conventions regarding data sequences, data containers, data providers. This includes the measurements provider and the prediction providers.

Add database documentation.

The fix includes several bug fixes that are not directly related to the database implementation but are necessary to keep EOS running properly and to test and document the changes.

* fix: config eos test setup
  Make the config_eos fixture generate a new instance of the config_eos singleton. Use correct env names to set up the data folder path.
* fix: startup with no config
  Make cache and measurements complain about missing data path configuration but do not bail out.
* fix: soc data preparation and usage for genetic optimization
  Search for soc measurements 48 hours around the optimization start time. Only clamp soc to maximum in battery device simulation.
* fix: dashboard bailout on zero value solution display
  Do not use zero values to calculate the chart values adjustment for display.
* fix: openapi generation script
  Make the script also replace data_folder_path and data_output_path to hide real (test) environment paths.
* feat: add make repeated task function
  make_repeated_task allows wrapping a function so that it is repeated cyclically.
* chore: removed index based data sequence access
  Index based data sequence access does not make sense as the sequence can be backed by the database. The sequence is now purely time series data.
* chore: refactor eos startup to avoid module import startup
  Avoid initialisation at module import, especially of the EOS configuration. Config mutation, singleton initialization, logging setup, argparse parsing, background task definitions depending on config and environment-dependent behavior are now done at function startup.
* chore: introduce retention manager
  A single long-running background task that owns the scheduling of all periodic server-maintenance jobs (cache cleanup, DB autosave, …)
* chore: canonicalize timezone name for UTC
  Timezone names that are semantically identical to UTC are canonicalized to UTC.
* chore: extend config file migration for default value handling
  Extend the config file migration to handle None or nonexistent values, which trigger default value generation in the new config file. Also adapt the test to handle this situation.
* chore: extend datetime util test cases
* chore: make version test check for untracked files
  Check for files that are not tracked by git. Version calculation will be wrong if these files are not committed.
* chore: bump pandas to 3.0.0
  Pandas 3.0 now performs inference on the appropriate resolution (a.k.a. unit) for the output dtype, which may become datetime64[us] (before it was ns). Numeric dtype detection is also stricter now, which requires a different detection for numerics.
* chore: bump pydantic-settings to 2.12.0
  pydantic-settings 2.12.0 behaves differently under pytest. The tests were adapted and a workaround was introduced. ConfigEOS was also adapted to allow fine-grained initialization control, so that certain settings such as file settings can be switched off during tests.
* chore: remove scikit-learn from dependencies
  scikit-learn is not strictly necessary as long as we have scipy.
* chore: add documentation mode guarding for sphinx autosummary
  Sphinx autosummary executes functions. Prevent exceptions in case of pure doc mode.
* chore: adapt docker-build CI workflow to stricter GitHub handling

Signed-off-by: Bobby Noelte <b0661n0e17e@gmail.com>
2026-02-23 17:36:19 +00:00 · 2026-02-22 14:12:42 +01:00
parent 5f66591d21
commit 6498c7dc32
92 changed files with 12710 additions and 2173 deletions
--- a/docs/_generated/config.md
+++ b/docs/_generated/config.md
@@ -4,6 +4,7 @@

 ../_generated/configadapter.md
 ../_generated/configcache.md
+../_generated/configdatabase.md
 ../_generated/configdevices.md
 ../_generated/configelecprice.md
 ../_generated/configems.md
--- a/docs/_generated/configadapter.md
+++ b/docs/_generated/configadapter.md
@@ -10,7 +10,7 @@
 | homeassistant | `EOS_ADAPTER__HOMEASSISTANT` | `HomeAssistantAdapterCommonSettings` | `rw` | `required` | Home Assistant adapter settings. |
 | nodered | `EOS_ADAPTER__NODERED` | `NodeREDAdapterCommonSettings` | `rw` | `required` | NodeRED adapter settings. |
 | provider | `EOS_ADAPTER__PROVIDER` | `Optional[list[str]]` | `rw` | `None` | List of adapter provider id(s) of provider(s) to be used. |
-| providers | | `list[str]` | `ro` | `N/A` | Available electricity price provider ids. |
+| providers | | `list[str]` | `ro` | `N/A` | Available adapter provider ids. |
 :::
 <!-- pyml enable line-length -->

@@ -33,10 +33,7 @@
               "pv_production_emr_entity_ids": null,
               "device_measurement_entity_ids": null,
               "device_instruction_entity_ids": null,
-               "solution_entity_ids": null,
-               "homeassistant_entity_ids": [],
-               "eos_solution_entity_ids": [],
-               "eos_device_instruction_entity_ids": []
+               "solution_entity_ids": null
           },
           "nodered": {
               "host": "127.0.0.1",
--- a/docs/_generated/configcache.md
+++ b/docs/_generated/configcache.md
@@ -7,7 +7,7 @@

 | Name | Environment Variable | Type | Read-Only | Default | Description |
 | ---- | -------------------- | ---- | --------- | ------- | ----------- |
-| cleanup_interval | `EOS_CACHE__CLEANUP_INTERVAL` | `float` | `rw` | `300` | Intervall in seconds for EOS file cache cleanup. |
+| cleanup_interval | `EOS_CACHE__CLEANUP_INTERVAL` | `float` | `rw` | `300.0` | Intervall in seconds for EOS file cache cleanup. |
 | subpath | `EOS_CACHE__SUBPATH` | `Optional[pathlib.Path]` | `rw` | `cache` | Sub-path for the EOS cache data directory. |
 :::
 <!-- pyml enable line-length -->
--- a/docs/_generated/configdatabase.md
+++ b/docs/_generated/configdatabase.md
@@ -0,0 +1,72 @@
+## Configuration model for database settings
+
+Attributes:
+    provider: Optional provider identifier (e.g. "LMDB").
+    max_records_in_memory: Maximum records kept in memory before auto-save.
+    auto_save: Whether to auto-save when threshold exceeded.
+    batch_size: Batch size for batch operations.
+
+<!-- pyml disable line-length -->
+:::{table} database
+:widths: 10 20 10 5 5 30
+:align: left
+
+| Name | Environment Variable | Type | Read-Only | Default | Description |
+| ---- | -------------------- | ---- | --------- | ------- | ----------- |
+| autosave_interval_sec | `EOS_DATABASE__AUTOSAVE_INTERVAL_SEC` | `Optional[int]` | `rw` | `10` | Automatic saving interval [seconds].
+Set to None to disable automatic saving. |
+| batch_size | `EOS_DATABASE__BATCH_SIZE` | `int` | `rw` | `100` | Number of records to process in batch operations. |
+| compaction_interval_sec | `EOS_DATABASE__COMPACTION_INTERVAL_SEC` | `Optional[int]` | `rw` | `604800` | Interval in between automatic tiered compaction runs [seconds].
+Compaction downsamples old records to reduce storage while retaining coverage. Set to None to disable automatic compaction. |
+| compression_level | `EOS_DATABASE__COMPRESSION_LEVEL` | `int` | `rw` | `9` | Compression level for database record data. |
+| initial_load_window_h | `EOS_DATABASE__INITIAL_LOAD_WINDOW_H` | `Optional[int]` | `rw` | `None` | Specifies the default duration of the initial load window when loading records from the database, in hours. If set to None, the full available range is loaded. The window is centered around the current time by default, unless a different center time is specified. Different database namespaces may define their own default windows. |
+| keep_duration_h | `EOS_DATABASE__KEEP_DURATION_H` | `Optional[int]` | `rw` | `None` | Default maximum duration records shall be kept in database [hours, none].
+None indicates forever. Database namespaces may have diverging definitions. |
+| provider | `EOS_DATABASE__PROVIDER` | `Optional[str]` | `rw` | `None` | Database provider id of provider to be used. |
+| providers | | `List[str]` | `ro` | `N/A` | Return available database provider ids. |
+:::
+<!-- pyml enable line-length -->
+
+<!-- pyml disable no-emphasis-as-heading -->
+**Example Input**
+<!-- pyml enable no-emphasis-as-heading -->
+
+<!-- pyml disable line-length -->
+```json
+   {
+       "database": {
+           "provider": "LMDB",
+           "compression_level": 0,
+           "initial_load_window_h": 48,
+           "keep_duration_h": 48,
+           "autosave_interval_sec": 5,
+           "compaction_interval_sec": 604800,
+           "batch_size": 100
+       }
+   }
+```
+<!-- pyml enable line-length -->
+
+<!-- pyml disable no-emphasis-as-heading -->
+**Example Output**
+<!-- pyml enable no-emphasis-as-heading -->
+
+<!-- pyml disable line-length -->
+```json
+   {
+       "database": {
+           "provider": "LMDB",
+           "compression_level": 0,
+           "initial_load_window_h": 48,
+           "keep_duration_h": 48,
+           "autosave_interval_sec": 5,
+           "compaction_interval_sec": 604800,
+           "batch_size": 100,
+           "providers": [
+               "LMDB",
+               "SQLite"
+           ]
+       }
+   }
+```
+<!-- pyml enable line-length -->
--- a/docs/_generated/configdevices.md
+++ b/docs/_generated/configdevices.md
@@ -50,19 +50,7 @@
                       1.0
                   ],
                   "min_soc_percentage": 0,
-                   "max_soc_percentage": 100,
-                   "measurement_key_soc_factor": "battery1-soc-factor",
-                   "measurement_key_power_l1_w": "battery1-power-l1-w",
-                   "measurement_key_power_l2_w": "battery1-power-l2-w",
-                   "measurement_key_power_l3_w": "battery1-power-l3-w",
-                   "measurement_key_power_3_phase_sym_w": "battery1-power-3-phase-sym-w",
-                   "measurement_keys": [
-                       "battery1-soc-factor",
-                       "battery1-power-l1-w",
-                       "battery1-power-l2-w",
-                       "battery1-power-l3-w",
-                       "battery1-power-3-phase-sym-w"
-                   ]
+                   "max_soc_percentage": 100
               }
           ],
           "max_batteries": 1,
@@ -89,19 +77,7 @@
                       1.0
                   ],
                   "min_soc_percentage": 0,
-                   "max_soc_percentage": 100,
-                   "measurement_key_soc_factor": "battery1-soc-factor",
-                   "measurement_key_power_l1_w": "battery1-power-l1-w",
-                   "measurement_key_power_l2_w": "battery1-power-l2-w",
-                   "measurement_key_power_l3_w": "battery1-power-l3-w",
-                   "measurement_key_power_3_phase_sym_w": "battery1-power-3-phase-sym-w",
-                   "measurement_keys": [
-                       "battery1-soc-factor",
-                       "battery1-power-l1-w",
-                       "battery1-power-l2-w",
-                       "battery1-power-l3-w",
-                       "battery1-power-3-phase-sym-w"
-                   ]
+                   "max_soc_percentage": 100
               }
           ],
           "max_electric_vehicles": 1,
--- a/docs/_generated/configems.md
+++ b/docs/_generated/configems.md
@@ -7,7 +7,7 @@

 | Name | Environment Variable | Type | Read-Only | Default | Description |
 | ---- | -------------------- | ---- | --------- | ------- | ----------- |
-| interval | `EOS_EMS__INTERVAL` | `Optional[float]` | `rw` | `None` | Intervall in seconds between EOS energy management runs. |
+| interval | `EOS_EMS__INTERVAL` | `float` | `rw` | `300.0` | Intervall between EOS energy management runs [seconds]. |
 | mode | `EOS_EMS__MODE` | `Optional[akkudoktoreos.core.emsettings.EnergyManagementMode]` | `rw` | `None` | Energy management mode [OPTIMIZATION | PREDICTION]. |
 | startup_delay | `EOS_EMS__STARTUP_DELAY` | `float` | `rw` | `5` | Startup delay in seconds for EOS energy management runs. |
 :::
--- a/docs/_generated/configexample.md
+++ b/docs/_generated/configexample.md
@@ -15,10 +15,7 @@
               "pv_production_emr_entity_ids": null,
               "device_measurement_entity_ids": null,
               "device_instruction_entity_ids": null,
-               "solution_entity_ids": null,
-               "homeassistant_entity_ids": [],
-               "eos_solution_entity_ids": [],
-               "eos_device_instruction_entity_ids": []
+               "solution_entity_ids": null
           },
           "nodered": {
               "host": "127.0.0.1",
@@ -29,6 +26,15 @@
           "subpath": "cache",
           "cleanup_interval": 300.0
       },
+       "database": {
+           "provider": "LMDB",
+           "compression_level": 0,
+           "initial_load_window_h": 48,
+           "keep_duration_h": 48,
+           "autosave_interval_sec": 5,
+           "compaction_interval_sec": 604800,
+           "batch_size": 100
+       },
       "devices": {
           "batteries": [
               {
@@ -53,19 +59,7 @@
                       1.0
                   ],
                   "min_soc_percentage": 0,
-                   "max_soc_percentage": 100,
-                   "measurement_key_soc_factor": "battery1-soc-factor",
-                   "measurement_key_power_l1_w": "battery1-power-l1-w",
-                   "measurement_key_power_l2_w": "battery1-power-l2-w",
-                   "measurement_key_power_l3_w": "battery1-power-l3-w",
-                   "measurement_key_power_3_phase_sym_w": "battery1-power-3-phase-sym-w",
-                   "measurement_keys": [
-                       "battery1-soc-factor",
-                       "battery1-power-l1-w",
-                       "battery1-power-l2-w",
-                       "battery1-power-l3-w",
-                       "battery1-power-3-phase-sym-w"
-                   ]
+                   "max_soc_percentage": 100
               }
           ],
           "max_batteries": 1,
@@ -92,19 +86,7 @@
                       1.0
                   ],
                   "min_soc_percentage": 0,
-                   "max_soc_percentage": 100,
-                   "measurement_key_soc_factor": "battery1-soc-factor",
-                   "measurement_key_power_l1_w": "battery1-power-l1-w",
-                   "measurement_key_power_l2_w": "battery1-power-l2-w",
-                   "measurement_key_power_l3_w": "battery1-power-l3-w",
-                   "measurement_key_power_3_phase_sym_w": "battery1-power-3-phase-sym-w",
-                   "measurement_keys": [
-                       "battery1-soc-factor",
-                       "battery1-power-l1-w",
-                       "battery1-power-l2-w",
-                       "battery1-power-l3-w",
-                       "battery1-power-3-phase-sym-w"
-                   ]
+                   "max_soc_percentage": 100
               }
           ],
           "max_electric_vehicles": 1,
@@ -138,8 +120,8 @@
           }
       },
       "general": {
-           "version": "0.2.0.dev84352035",
-           "data_folder_path": null,
+           "version": "0.2.0.dev58204789",
+           "data_folder_path": "/home/user/.local/share/net.akkudoktoreos.net",
           "data_output_subpath": "output",
           "latitude": 52.52,
           "longitude": 13.405
@@ -157,6 +139,7 @@
           "file_level": "TRACE"
       },
       "measurement": {
+           "historic_hours": 17520,
           "load_emr_keys": [
               "load0_emr"
           ],
--- a/docs/_generated/configgeneral.md
+++ b/docs/_generated/configgeneral.md
@@ -9,14 +9,14 @@
 | ---- | -------------------- | ---- | --------- | ------- | ----------- |
 | config_file_path | | `Optional[pathlib.Path]` | `ro` | `N/A` | Path to EOS configuration file. |
 | config_folder_path | | `Optional[pathlib.Path]` | `ro` | `N/A` | Path to EOS configuration directory. |
-| data_folder_path | `EOS_GENERAL__DATA_FOLDER_PATH` | `Optional[pathlib.Path]` | `rw` | `None` | Path to EOS data directory. |
+| data_folder_path | `EOS_GENERAL__DATA_FOLDER_PATH` | `Path` | `rw` | `required` | Path to EOS data folder. |
 | data_output_path | | `Optional[pathlib.Path]` | `ro` | `N/A` | Computed data_output_path based on data_folder_path. |
-| data_output_subpath | `EOS_GENERAL__DATA_OUTPUT_SUBPATH` | `Optional[pathlib.Path]` | `rw` | `output` | Sub-path for the EOS output data directory. |
-| home_assistant_addon | | `bool` | `ro` | `N/A` | EOS is running as home assistant add-on. |
+| data_output_subpath | `EOS_GENERAL__DATA_OUTPUT_SUBPATH` | `Optional[pathlib.Path]` | `rw` | `output` | Sub-path for the EOS output data folder. |
+| home_assistant_addon | `EOS_GENERAL__HOME_ASSISTANT_ADDON` | `bool` | `rw` | `required` | EOS is running as home assistant add-on. |
 | latitude | `EOS_GENERAL__LATITUDE` | `Optional[float]` | `rw` | `52.52` | Latitude in decimal degrees between -90 and 90. North is positive (ISO 19115) (°) |
 | longitude | `EOS_GENERAL__LONGITUDE` | `Optional[float]` | `rw` | `13.405` | Longitude in decimal degrees within -180 to 180 (°) |
 | timezone | | `Optional[str]` | `ro` | `N/A` | Computed timezone based on latitude and longitude. |
-| version | `EOS_GENERAL__VERSION` | `str` | `rw` | `0.2.0.dev84352035` | Configuration file version. Used to check compatibility. |
+| version | `EOS_GENERAL__VERSION` | `str` | `rw` | `0.2.0.dev58204789` | Configuration file version. Used to check compatibility. |
 :::
 <!-- pyml enable line-length -->

@@ -28,8 +28,8 @@
 ```json
   {
       "general": {
-           "version": "0.2.0.dev84352035",
-           "data_folder_path": null,
+           "version": "0.2.0.dev58204789",
+           "data_folder_path": "/home/user/.local/share/net.akkudoktoreos.net",
           "data_output_subpath": "output",
           "latitude": 52.52,
           "longitude": 13.405
@@ -46,16 +46,15 @@
 ```json
   {
       "general": {
-           "version": "0.2.0.dev84352035",
-           "data_folder_path": null,
+           "version": "0.2.0.dev58204789",
+           "data_folder_path": "/home/user/.local/share/net.akkudoktoreos.net",
           "data_output_subpath": "output",
           "latitude": 52.52,
           "longitude": 13.405,
           "timezone": "Europe/Berlin",
-           "data_output_path": null,
+           "data_output_path": "/home/user/.local/share/net.akkudoktoreos.net/output",
           "config_folder_path": "/home/user/.config/net.akkudoktoreos.net",
-           "config_file_path": "/home/user/.config/net.akkudoktoreos.net/EOS.config.json",
-           "home_assistant_addon": false
+           "config_file_path": "/home/user/.config/net.akkudoktoreos.net/EOS.config.json"
       }
   }
 ```
--- a/docs/_generated/configmeasurement.md
+++ b/docs/_generated/configmeasurement.md
@@ -9,6 +9,7 @@
 | ---- | -------------------- | ---- | --------- | ------- | ----------- |
 | grid_export_emr_keys | `EOS_MEASUREMENT__GRID_EXPORT_EMR_KEYS` | `Optional[list[str]]` | `rw` | `None` | The keys of the measurements that are energy meter readings of energy export to grid [kWh]. |
 | grid_import_emr_keys | `EOS_MEASUREMENT__GRID_IMPORT_EMR_KEYS` | `Optional[list[str]]` | `rw` | `None` | The keys of the measurements that are energy meter readings of energy import from grid [kWh]. |
+| historic_hours | `EOS_MEASUREMENT__HISTORIC_HOURS` | `Optional[int]` | `rw` | `17520` | Number of hours into the past for measurement data |
 | keys | | `list[str]` | `ro` | `N/A` | The keys of the measurements that can be stored. |
 | load_emr_keys | `EOS_MEASUREMENT__LOAD_EMR_KEYS` | `Optional[list[str]]` | `rw` | `None` | The keys of the measurements that are energy meter readings of a load [kWh]. |
 | pv_production_emr_keys | `EOS_MEASUREMENT__PV_PRODUCTION_EMR_KEYS` | `Optional[list[str]]` | `rw` | `None` | The keys of the measurements that are PV production energy meter readings [kWh]. |
@@ -23,6 +24,7 @@
 ```json
   {
       "measurement": {
+           "historic_hours": 17520,
           "load_emr_keys": [
               "load0_emr"
           ],
@@ -48,6 +50,7 @@
 ```json
   {
       "measurement": {
+           "historic_hours": 17520,
           "load_emr_keys": [
               "load0_emr"
           ],
--- a/docs/_generated/openapi.md
+++ b/docs/_generated/openapi.md
@@ -1,6 +1,6 @@
 # Akkudoktor-EOS

-**Version**: `v0.2.0.dev84352035`
+**Version**: `v0.2.0.dev58204789`

 <!-- pyml disable line-length -->
 **Description**: This project provides a comprehensive solution for simulating and optimizing an energy system based on renewable energy sources. With a focus on photovoltaic (PV) systems, battery storage (batteries), load management (consumer requirements), heat pumps, electric vehicles, and consideration of electricity price data, this system enables forecasting and optimization of energy flow and costs over a specified period.
@@ -338,6 +338,56 @@ Returns:

 ---

+## GET /v1/admin/database/stats
+
+<!-- pyml disable line-length -->
+**Links**: [local](http://localhost:8503/docs#/default/fastapi_admin_database_stats_get_v1_admin_database_stats_get), [eos](https://petstore3.swagger.io/?url=https://raw.githubusercontent.com/Akkudoktor-EOS/EOS/refs/heads/main/openapi.json#/default/fastapi_admin_database_stats_get_v1_admin_database_stats_get)
+<!-- pyml enable line-length -->
+
+Fastapi Admin Database Stats Get
+
+<!-- pyml disable line-length -->
+```python
+"""
+Get statistics from database.
+
+Returns:
+    data (dict): The database statistics
+"""
+```
+<!-- pyml enable line-length -->
+
+**Responses**:
+
+- **200**: Successful Response
+
+---
+
+## POST /v1/admin/database/vacuum
+
+<!-- pyml disable line-length -->
+**Links**: [local](http://localhost:8503/docs#/default/fastapi_admin_database_vacuum_post_v1_admin_database_vacuum_post), [eos](https://petstore3.swagger.io/?url=https://raw.githubusercontent.com/Akkudoktor-EOS/EOS/refs/heads/main/openapi.json#/default/fastapi_admin_database_vacuum_post_v1_admin_database_vacuum_post)
+<!-- pyml enable line-length -->
+
+Fastapi Admin Database Vacuum Post
+
+<!-- pyml disable line-length -->
+```python
+"""
+Remove old records from database.
+
+Returns:
+    data (dict): The database stats after removal of old records.
+"""
+```
+<!-- pyml enable line-length -->
+
+**Responses**:
+
+- **200**: Successful Response
+
+---
+
 ## POST /v1/admin/server/restart

 <!-- pyml disable line-length -->
--- a/docs/akkudoktoreos/database.md
+++ b/docs/akkudoktoreos/database.md
@@ -0,0 +1,599 @@
+% SPDX-License-Identifier: Apache-2.0
+(database-page)=
+
+# Database
+
+## Overview
+
+The EOS database system provides a flexible, pluggable persistence layer for time-series data
+records with automatic lazy loading, dirty tracking, and multi-backend support. The architecture
+separates the abstract database interface from concrete storage implementations, allowing seamless
+switching between LMDB and SQLite backends.
+
+## Architecture
+
+### Three-Layer Design
+
+**Abstract Interface Layer** (`DatabaseABC`)
+
+- Defines the contract for all database operations
+- Provides compression/decompression utilities
+- Backend-agnostic API
+
+**Backend Implementation Layer** (`DatabaseBackendABC`)
+
+- Concrete implementations: `LMDBDatabase`, `SQLiteDatabase`
+- Singleton pattern ensures single instance per backend
+- Thread-safe operations via internal locking
+
+**Record Protocol Layer** (`DatabaseRecordProtocolMixin`)
+
+- Manages in-memory record lifecycle
+- Implements lazy loading strategies
+- Handles dirty tracking and autosave
+
+## Configuration
+
+### Database Settings (`DatabaseCommonSettings`)
+
+```python
+provider: Optional[str] = None        # "LMDB" or "SQLite"
+compression_level: int = 9            # 0-9, gzip compression
+initial_load_window_h: Optional[int] = None  # Hours, None = full load
+keep_duration_h: Optional[int] = None        # Retention period
+autosave_interval_sec: Optional[int] = None  # Auto-flush interval
+compaction_interval_sec: Optional[int] = 604800  # Compaction interval
+batch_size: int = 100                 # Batch operation size
+```
+
+### User Configuration Guide
+
+This section explains what each setting does in practical terms and gives
+concrete recommendations for common deployment scenarios.
+
+#### `provider` — choosing a backend
+
+Set `provider` to `"LMDB"` or `"SQLite"`. Leave it `None` only during
+development or unit testing — with `None` set, nothing is persisted to disk and
+all data is lost on restart.
+
+**Use LMDB** for a long-running home server that records data continuously. It
+is significantly faster for high-frequency writes and range reads because it
+uses memory-mapped files. The trade-off is that it pre-allocates a large file
+on disk (default 10 GB) even when mostly empty.
+
+**Use SQLite** when disk space is constrained, for portable single-file
+deployments, or when you want to inspect or manipulate the database with
+standard SQL tools. SQLite is slightly slower for bulk writes but perfectly
+adequate for home energy data volumes.
+
+**Do not** switch backends while data exists in the old backend — records are
+not migrated automatically. If you need to switch, vacuum the old database
+first, export your data, then reconfigure.
+
+#### `compression_level` — storage size vs. CPU
+
+Values range from `0` (no compression) to `9` (maximum compression). The default of `9` is
+appropriate for most deployments: home energy time-series data compresses very well (often
+60–80 % reduction) and the CPU overhead is negligible on modern hardware.
+
+**Set to `0`** only if you are running on very constrained hardware (e.g. a single-core ARM
+board at full load) and storage space is not a concern.
+
+**Do not** change this setting after data has been written — the database stores each record
+with the compression level active at write time and auto-detects the format on read, so mixed
+levels are fine technically, but you will not reclaim space from already-written records until
+they are rewritten by compaction.
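
As a rough illustration of the size trade-off described above, the following sketch compresses a synthetic batch of JSON-encoded readings with Python's standard `gzip` module at levels 0 and 9. The record layout and the helper name `compressed_size` are assumptions made for this example; how EOS serializes and compresses records internally is not shown in this diff.

```python
import gzip
import json


def compressed_size(payload: bytes, level: int) -> int:
    """Return the size in bytes of payload after gzip compression at the given level."""
    return len(gzip.compress(payload, compresslevel=level))


# Hypothetical batch of readings; real EOS records may be laid out differently.
records = [
    {"date_time": f"2024-10-27T12:{minute:02d}:00Z", "load0_emr": 1000.0 + minute * 0.25}
    for minute in range(60)
]
payload = json.dumps(records).encode("utf-8")

size_raw = len(payload)
size_level0 = compressed_size(payload, 0)  # data is stored uncompressed, plus gzip framing
size_level9 = compressed_size(payload, 9)  # maximum compression

# Decompression does not need to know which level was used at write time.
restored = gzip.decompress(gzip.compress(payload, compresslevel=9))
```

Level 0 produces output slightly larger than the input because the gzip framing is added without any compression, which is why it only makes sense when CPU time is the bottleneck.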
+
+#### `initial_load_window_h` — startup memory usage
+
+Controls how much history is loaded into memory when the application first accesses a namespace.
+
+**Set a window** (e.g. `48`) on systems with limited RAM or large databases. Only the most
+recent 48 hours are loaded immediately; older data is fetched on demand if a query reaches
+outside that window.
+
+**Leave as `None`** (the default) on well-resourced systems or when you need guaranteed
+access to all history from the first query. Full load is simpler and avoids the small latency
+spike of incremental loads.
+
+**Do not** set this to a very small value (e.g. `1`) if your forecasting or reporting queries
+routinely look back further — every out-of-window query triggers a database read, and many
+small reads are slower than one full load.
+
+#### `keep_duration_h` — data retention
+
+Sets the age limit (in hours) for the vacuum operation. Records older than
+`max_timestamp - keep_duration_h` are permanently deleted when vacuum runs.
+
+**Set this** to match your actual analysis needs. If your forecast models only look back 7 days,
+keeping 14 days (`336`) gives a comfortable safety margin without accumulating indefinitely.
+
+**Leave as `None`** only if you have a strong archival requirement and understand that the
+database will grow without bound. Even with compaction reducing resolution, old data is not
+deleted unless vacuum runs with a retention limit.
+
+**Do not** set `keep_duration_h` shorter than the oldest data your forecast or reporting
+queries ever request — vacuum is permanent and irreversible.
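
The retention rule above (`max_timestamp - keep_duration_h`) can be sketched as follows. This is a minimal illustration using only the standard library; the function names `vacuum_cutoff` and `records_to_delete` are hypothetical and do not refer to the EOS API.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def vacuum_cutoff(max_timestamp: datetime, keep_duration_h: Optional[int]) -> Optional[datetime]:
    """Return the oldest timestamp that is still kept, or None if nothing is deleted."""
    if keep_duration_h is None:
        return None  # None means: keep forever
    return max_timestamp - timedelta(hours=keep_duration_h)


def records_to_delete(timestamps: list, keep_duration_h: Optional[int]) -> list:
    """Select the timestamps a vacuum run would remove (strictly older than the cutoff)."""
    if not timestamps:
        return []
    cutoff = vacuum_cutoff(max(timestamps), keep_duration_h)
    if cutoff is None:
        return []
    return [ts for ts in timestamps if ts < cutoff]


newest = datetime(2024, 10, 27, 12, 0, tzinfo=timezone.utc)
timestamps = [newest - timedelta(hours=h) for h in (0, 24, 335, 336, 337)]

# With 14 days (336 h) of retention only the record that is 337 h old is removed.
deleted = records_to_delete(timestamps, keep_duration_h=336)
```

Note that the cutoff is relative to the newest stored record rather than to the wall clock, so a database that stops receiving data does not lose history merely because time passes.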
+
+#### `autosave_interval_sec` — write durability
+
+Controls how often dirty (modified) records are flushed to disk automatically, in seconds.
+
+**Set to a low value** (e.g. `10`–`30`) on a system that could lose power unexpectedly,
+such as a Raspberry Pi without a UPS. A power cut between autosaves loses that window of data.
+
+**Set to a higher value** (e.g. `300`) on stable systems to reduce write amplification. Each
+autosave is a full flush of all dirty records, so frequent saves on large dirty sets are
+more expensive.
+
+**Leave as `None`** only if you call `db_save_records()` manually at appropriate points in
+your application code. With `None`, data written since the last manual save is lost on crash.
+
+#### `compaction_interval_sec` — automatic tiered downsampling
+
+Controls how often the compaction maintenance job runs, in seconds. The default is
+604 800 (one week). Set to `None` to disable automatic compaction entirely.
+
+Compaction applies a tiered downsampling policy to old records:
+
+- Records older than **2 hours** are downsampled to **15-minute** resolution
+- Records older than **14 days** are downsampled to **1-hour** resolution
+
+This reduces storage and speeds up range queries on historical data while preserving full
+resolution for recent data where it matters most. Each tier is processed incrementally —
+only the window since the last compaction run is examined, so weekly runs are fast regardless
+of total history length.
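
The tiered policy above can be sketched in a few lines. This is an illustration under stated assumptions, not the EOS implementation: the names `TIERS`, `target_resolution` and `downsample` are hypothetical, buckets are aligned to the Unix epoch, and each bucket is reduced to its mean value.

```python
from datetime import datetime, timedelta, timezone
from statistics import mean
from typing import Optional

# (minimum age, target resolution), checked from the oldest tier to the newest.
TIERS = [
    (timedelta(days=14), timedelta(hours=1)),
    (timedelta(hours=2), timedelta(minutes=15)),
]


def target_resolution(age: timedelta) -> Optional[timedelta]:
    """Return the resolution a record of the given age is downsampled to, if any."""
    for min_age, resolution in TIERS:
        if age > min_age:
            return resolution
    return None  # recent data keeps its full resolution


def downsample(samples: list, resolution: timedelta) -> list:
    """Average (timestamp, value) samples into epoch-aligned buckets of the given width."""
    step = int(resolution.total_seconds())
    buckets: dict = {}
    for ts, value in samples:
        bucket_start = int(ts.timestamp()) // step * step
        buckets.setdefault(bucket_start, []).append(value)
    return [
        (datetime.fromtimestamp(start, tz=timezone.utc), mean(values))
        for start, values in sorted(buckets.items())
    ]


base = datetime(2024, 10, 27, 12, 0, tzinfo=timezone.utc)
samples = [(base + timedelta(minutes=5 * i), 10.0 * (i + 1)) for i in range(5)]
compacted = downsample(samples, timedelta(minutes=15))
```

Five 5-minute samples collapse into two 15-minute records here. Because a bucket that already holds a single averaged record averages to itself, running the same step again leaves the result unchanged, which matches the idempotence described in the text.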
+
+**Leave at the default weekly interval** for most deployments. Compaction is idempotent and
+cheap when run frequently on small new windows.
+
+**Set to a shorter interval** (e.g. `86400`, daily) if your device records at very high
+frequency (sub-minute) and disk space is a concern.
+
+**Set to `None`** only if you have a custom retention policy and manage downsampling manually,
+or if you store data that must not be averaged (e.g. raw event logs where mean resampling
+would be meaningless).
+
+**Do not** set the interval shorter than `autosave_interval_sec` — compaction reads from the
+backend and a record that has not been saved yet will not be visible to it.
+
+**Interaction with vacuum:** compaction and vacuum are complementary. Compaction reduces
+resolution of old data; vacuum deletes it entirely past `keep_duration_h`. The recommended
+pipeline is: compaction runs first (weekly), then vacuum runs immediately after. This means
+vacuum always operates on already-downsampled data, which is faster and produces cleaner
+storage boundaries.
+
+### Recommended Configurations by Scenario
+
+#### Home server, typical (Raspberry Pi 4, SSD)
+
+```python
+provider = "LMDB"
+compression_level = 9
+initial_load_window_h = 48
+keep_duration_h = 720          # 30 days
+autosave_interval_sec = 30
+compaction_interval_sec = 604800  # weekly
+```
+
+#### Home server, low storage (Raspberry Pi Zero, SD card)
+
+```python
+provider = "SQLite"
+compression_level = 9
+initial_load_window_h = 24
+keep_duration_h = 168          # 7 days
+autosave_interval_sec = 60
+compaction_interval_sec = 86400   # daily — reclaim space faster
+```
+
+#### Development / testing
+
+```python
+provider = "SQLite"            # or None for fully in-memory
+compression_level = 0          # faster without compression overhead
+initial_load_window_h = None   # always load everything
+keep_duration_h = None         # never vacuum automatically
+autosave_interval_sec = None   # manual saves only
+compaction_interval_sec = None # disable compaction
+```
+
+#### High-frequency recording (sub-minute intervals)
+
+```python
+provider = "LMDB"
+compression_level = 9
+initial_load_window_h = 24
+keep_duration_h = 336          # 14 days
+autosave_interval_sec = 10
+compaction_interval_sec = 86400   # daily — essential at high frequency
+```
+
+## Storage Backends
+
+### LMDB Backend
+
+**Characteristics:**
+
+- Memory-mapped file database
+- Native namespace support via DBIs (Database Instances)
+- High-performance reads with MVCC
+- Configurable map size (default: 10 GB)
+
+**Configuration:**
+
+```python
+map_size: int = 10 * 1024 * 1024 * 1024  # 10 GB
+writemap: bool = True                     # Performance optimization
+map_async: bool = True                    # Performance optimization
+max_dbs: int = 128                        # Maximum namespaces
+```
+
+**File Structure:**
+
+```text
+data_folder_path/
+└── db/
+    └── lmdbdatabase/
+        ├── data.mdb
+        └── lock.mdb
+```
+
+### SQLite Backend
+
+**Characteristics:**
+
+- Single-file relational database
+- Namespace emulation via `namespace` column
+- ACID transactions with autocommit mode
+- Cross-platform compatibility
+
+**Schema:**
+
+```sql
+CREATE TABLE records (
+    namespace TEXT NOT NULL DEFAULT '',
+    key BLOB NOT NULL,
+    value BLOB NOT NULL,
+    PRIMARY KEY (namespace, key)
+);
+
+CREATE TABLE metadata (
+    namespace TEXT PRIMARY KEY,
+    value BLOB
+);
+```
+
+**File Structure:**
+
+```text
+data_folder_path/
+└── db/
+    └── sqlitedatabase/
+        └── data.db
+```
+
+## Timestamp System
+
+### DatabaseTimestamp
+
+All records are indexed by UTC timestamps in sortable ISO 8601 format:
+
+```python
+DatabaseTimestamp.from_datetime(dt)  # dt: DateTime -> "20241027T123456[Z]"
+```
+
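+**Properties:**
+
+- Always stored in UTC (a timezone-aware input is required)
+- Lexicographically sortable, so string order equals chronological order
+- Round-trip conversion to/from `pendulum.DateTime`
+- Second-level precision (sub-second information is not part of the key)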
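+
+As a minimal sketch of why such a key sorts correctly, the snippet below builds a fixed-width
+UTC string with the standard library only. The helper `to_sortable_utc` is hypothetical and is
+not the EOS `DatabaseTimestamp` implementation; it merely illustrates the properties listed
+above.
+
+```python
+from datetime import datetime, timedelta, timezone
+
+
+def to_sortable_utc(dt: datetime) -> str:
+    """Hypothetical helper: format an aware datetime as a fixed-width UTC key."""
+    if dt.tzinfo is None:
+        raise ValueError("timezone-aware datetime required")
+    # Convert to UTC and drop sub-second precision.
+    return dt.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%S") + "Z"
+
+
+utc_plus_2 = timezone(timedelta(hours=2))
+a = to_sortable_utc(datetime(2024, 10, 27, 14, 34, 56, tzinfo=utc_plus_2))
+b = to_sortable_utc(datetime(2024, 10, 27, 12, 35, 0, tzinfo=timezone.utc))
+
+print(a)      # 20241027T123456Z
+print(a < b)  # True: string order matches chronological order
+```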
+
+### Unbounded Sentinels
+
+```python
+UNBOUND_START  # Smaller than any timestamp
+UNBOUND_END    # Greater than any timestamp
+```
+
+Used for open-ended range queries without special-casing `None`.
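+
+One way such sentinels can be realised is with objects that define their own ordering. The
+sketch below is an assumption for illustration only: the class `Unbound`, the names `START`
+and `END`, and the helper `in_range` are hypothetical and do not reflect how EOS defines
+`UNBOUND_START` and `UNBOUND_END`.
+
+```python
+from functools import total_ordering
+
+
+@total_ordering
+class Unbound:
+    """Hypothetical sentinel ordered below (start) or above (end) every key."""
+
+    def __init__(self, is_end: bool) -> None:
+        self.is_end = is_end
+
+    def __eq__(self, other: object) -> bool:
+        return isinstance(other, Unbound) and other.is_end == self.is_end
+
+    def __lt__(self, other: object) -> bool:
+        # The start sentinel is below everything except itself;
+        # the end sentinel is below nothing.
+        return not self.is_end and self != other
+
+
+START = Unbound(is_end=False)
+END = Unbound(is_end=True)
+
+
+def in_range(key: str, start=START, end=END) -> bool:
+    """Half-open range check [start, end) that needs no None handling."""
+    return start <= key < end
+
+
+print(in_range("20241027T123456Z"))                            # True
+print(in_range("20241027T123456Z", end="20240101T000000Z"))    # False
+```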
+
+## Lazy Loading Strategy
+
+### Three-Phase Loading
+
+The system uses a progressive loading model to minimize memory footprint:
+
+#### **Phase 0: NONE**
+
+- No records loaded
+- The first query triggers one of:
+  - Initial window load (if `initial_load_window_h` is configured)
+  - Full database load (if `initial_load_window_h = None`)
+  - Targeted range load (if an explicit range is requested)
+
+#### **Phase 1: INITIAL**
+
+- Partial time window loaded
+- `_db_loaded_range` tracks coverage: `[start_timestamp, end_timestamp)`
+- Out-of-window queries trigger incremental expansion:
+  - Left expansion: load records before current window
+  - Right expansion: load records after current window
+- Unbounded queries escalate to FULL
+
+#### **Phase 2: FULL**
+
+- All database records in memory
+- No further database access needed
+- `_db_loaded_range` spans entire dataset
+
+### Boundary Extension
+
+When loading a range `[start, end)`, the system automatically extends the boundaries to include:
+
+- **First record before** `start` (for interpolation/context)
+- **First record at or after** `end` (for closing boundary)
+
+This prevents additional database lookups during nearest-neighbor searches.
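+
+The boundary rule can be sketched with `bisect` over a sorted key list. This is an
+illustration under simplifying assumptions (plain strings as keys, a hypothetical helper
+`extended_bounds`), not the EOS loading code.
+
+```python
+from bisect import bisect_left
+
+
+def extended_bounds(sorted_keys: list[str], start: str, end: str) -> tuple[int, int]:
+    """Return slice indices covering [start, end) plus one neighbour on each side."""
+    lo = bisect_left(sorted_keys, start)  # first key >= start
+    hi = bisect_left(sorted_keys, end)    # first key >= end
+    lo = max(lo - 1, 0)                   # include the first record before start
+    hi = min(hi + 1, len(sorted_keys))    # include the first record at or after end
+    return lo, hi
+
+
+keys = ["08:00", "09:00", "10:00", "11:00", "12:00"]
+lo, hi = extended_bounds(keys, "09:30", "11:00")
+print(keys[lo:hi])  # ['09:00', '10:00', '11:00']
+```
+
+Only `10:00` lies inside the requested range; `09:00` and `11:00` are the extra neighbours
+that make a later nearest-neighbour search possible without another lookup.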
+
+## Namespace Support
+
+Namespaces provide logical isolation within a single database instance:
+
+```python
+# LMDB: uses native DBIs
+db.save_records(records, namespace="measurement")
+
+# SQLite: uses the namespace column; the equivalent SQL filter is
+#   SELECT * FROM records WHERE namespace = 'measurement'
+```
+
+**Default Namespace:**
+
+- Can be set when opening the database, e.g. `open(namespace="default")`
+- Operations with `namespace=None` use the default
+- Each record class typically defines its own namespace via `db_namespace()`
+
+## Record Lifecycle
+
+### Insertion
+
+```python
+db_insert_record(record, mark_dirty=True)
+```
+
+1. Normalize `record.date_time` to UTC `DatabaseTimestamp`
+2. Ensure timestamp range is loaded (lazy load if needed)
+3. Check for duplicates (raises `ValueError` if a record with this timestamp already exists)
+4. Insert into sorted position in memory
+5. Update index: `_db_record_index[timestamp] = record`
+6. Mark dirty if `mark_dirty=True`
+
+### Retrieval
+
+```python
+db_get_record(target_timestamp, time_window=None)
+```
+
+**Search Strategies:**
+
+| `time_window` | Behavior |
+|---|---|
+| `None` | Exact match only |
+| `UNBOUND_WINDOW` | Nearest record (unlimited search) |
+| `Duration` | Nearest within symmetric window |
+
+**Memory-First:** Checks in-memory index before querying database.
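+
+The three search strategies can be sketched over a sorted list of timestamps. The function
+`nearest` is hypothetical, and it assumes the window is the maximum allowed distance on either
+side of the target; EOS may interpret the window differently. Passing `timedelta.max` plays the
+role of an unlimited search here.
+
+```python
+from bisect import bisect_left
+from datetime import datetime, timedelta
+from typing import Optional
+
+
+def nearest(
+    sorted_times: list[datetime],
+    target: datetime,
+    window: Optional[timedelta] = None,
+) -> Optional[datetime]:
+    """Exact match if window is None, else the nearest entry within +/- window."""
+    i = bisect_left(sorted_times, target)
+    if window is None:
+        found = i < len(sorted_times) and sorted_times[i] == target
+        return target if found else None
+    # Only the neighbours directly before and after the target can be nearest.
+    candidates = sorted_times[max(i - 1, 0): i + 1]
+    if not candidates:
+        return None
+    best = min(candidates, key=lambda t: abs(t - target))
+    return best if abs(best - target) <= window else None
+
+
+times = [datetime(2024, 1, 1, hour) for hour in (0, 1, 2)]
+query = datetime(2024, 1, 1, 1, 20)
+print(nearest(times, query))                         # None (no exact match)
+print(nearest(times, query, timedelta(minutes=30)))  # 2024-01-01 01:00:00
+print(nearest(times, query, timedelta.max))          # 2024-01-01 01:00:00
+```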
+
+### Deletion
+
+```python
+db_delete_records(start_timestamp, end_timestamp)
+```
+
+1. Ensure range is fully loaded
+2. Remove from memory: `records`, `_db_sorted_timestamps`, `_db_record_index`
+3. Add to `_db_deleted_timestamps` (tombstone)
+4. Discard from dirty sets (cancel pending writes)
+5. Physical deletion deferred until `db_save_records()`
+
+## Dirty Tracking
+
+The system maintains three dirty sets to optimize writes:
+
+```python
+_db_dirty_timestamps: set[DatabaseTimestamp]    # Modified records
+_db_new_timestamps: set[DatabaseTimestamp]      # Newly inserted
+_db_deleted_timestamps: set[DatabaseTimestamp]  # Pending deletes
+```
+
+**Write Strategy:**
+
+1. **Saves first:** Insert/update all dirty records
+2. **Deletes last:** Remove tombstoned records
+3. **Clear tracking sets:** Reset dirty state
+
+**Autosave:** Triggered periodically if `autosave_interval_sec` is configured.
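+
+The write strategy can be modelled with a small in-memory class. This is a simplified sketch:
+`TrackedStore` is hypothetical, a plain dict stands in for the LMDB/SQLite backend, and new
+and modified records share a single dirty set for brevity.
+
+```python
+class TrackedStore:
+    """Hypothetical model of dirty tracking in front of a backend dict."""
+
+    def __init__(self) -> None:
+        self.backend: dict[str, bytes] = {}  # stands in for LMDB/SQLite
+        self.records: dict[str, bytes] = {}  # in-memory records
+        self.dirty: set[str] = set()
+        self.deleted: set[str] = set()
+
+    def put(self, key: str, value: bytes) -> None:
+        self.records[key] = value
+        self.dirty.add(key)
+        self.deleted.discard(key)  # re-insert cancels a pending delete
+
+    def delete(self, key: str) -> None:
+        self.records.pop(key, None)
+        self.dirty.discard(key)    # cancel the pending write
+        self.deleted.add(key)      # tombstone until the next save
+
+    def save(self) -> int:
+        """Flush saves first, then deletes; return the number of backend changes."""
+        changes = 0
+        for key in sorted(self.dirty):
+            self.backend[key] = self.records[key]
+            changes += 1
+        for key in sorted(self.deleted):
+            if self.backend.pop(key, None) is not None:
+                changes += 1
+        self.dirty.clear()
+        self.deleted.clear()
+        return changes
+
+
+store = TrackedStore()
+store.put("a", b"1")
+store.put("b", b"2")
+store.delete("b")    # "b" never reaches the backend
+print(store.save())  # 1
+```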
+
+## Compression
+
+Optional gzip compression reduces storage footprint:
+
+```python
+import gzip
+import pickle
+
+# Serialize
+data = pickle.dumps(record.model_dump())
+if compression_level > 0:
+    data = gzip.compress(data, compresslevel=compression_level)
+
+# Deserialize (auto-detect)
+if data[:2] == b'\x1f\x8b':  # gzip magic bytes
+    data = gzip.decompress(data)
+record_data = pickle.loads(data)
+```
+
+**Compression is transparent:** Application code never handles compressed data directly.
+
+## Metadata
+
+Each namespace can store arbitrary metadata (version, creation time, provider):
+
+```python
+_db_metadata = {
+    "version": 1,
+    "created": "2024-01-01T00:00:00Z",
+    "provider_id": "LMDB",
+    "compression": True,
+    "backend": "LMDBDatabase"
+}
+```
+
+Metadata is stored separately from the records under the reserved key `__metadata__`.
+
+## Compaction
+
+Compaction reduces storage by downsampling old records to a lower time resolution. Unlike
+vacuum — which deletes records outright — compaction preserves the full time span of the
+data while replacing many fine-grained records with fewer coarse-grained averages.
+
+### Tiered Downsampling Policy
+
+The default policy has two tiers, applied coarsest-first:
+
+| Age threshold | Target resolution | Effect |
+|---|---|---|
+| Older than 14 days | 1 hour | 15-min records → 1 per hour (75 % reduction) |
+| Older than 2 hours | 15 minutes | 1-min records → 1 per 15 min (93 % reduction) |
+
+Records within the most recent 2 hours are never touched.
+
+### How Compaction Works
+
+Each tier is processed incrementally using a stored cutoff timestamp per tier. On each run,
+only the window `[last_cutoff, new_cutoff)` is examined — records already compacted in a
+previous run are never re-processed. This makes weekly runs fast even on years of history.
+
+For each writable numeric field, records in the window are mean-resampled at the target
+interval using time interpolation. The original records are deleted and the downsampled
+records are written back. A **sparse-data guard** skips any window where the existing record
+count is already at or below the resampled bucket count, preventing compaction from
+accidentally *increasing* record count for data that is already coarse or irregular.
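+
+A reduced sketch of one compaction pass, including the sparse-data guard, might look as
+follows. It is deliberately simplified: it uses plain bucket means instead of time
+interpolation, integer epoch seconds as timestamps, and a hypothetical function name
+`compact_window`.
+
+```python
+from collections import defaultdict
+from statistics import fmean
+
+Point = tuple[int, float]  # (epoch seconds, value)
+
+
+def compact_window(points: list[Point], interval: int) -> list[Point]:
+    """Mean-resample points into fixed buckets of `interval` seconds."""
+    if not points:
+        return points
+    buckets: dict[int, list[float]] = defaultdict(list)
+    for ts, value in points:
+        buckets[ts - ts % interval].append(value)  # floor to bucket start
+    # Sparse-data guard: skip if resampling would not reduce the record count.
+    span_buckets = (max(buckets) - min(buckets)) // interval + 1
+    if len(points) <= span_buckets:
+        return points
+    return [(start, fmean(values)) for start, values in sorted(buckets.items())]
+
+
+minute_data = [(minute * 60, float(minute)) for minute in range(30)]
+print(compact_window(minute_data, 900))  # [(0, 7.0), (900, 22.0)]
+
+sparse = [(0, 1.0), (3600, 2.0)]
+print(compact_window(sparse, 900))       # unchanged: [(0, 1.0), (3600, 2.0)]
+```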
+
+### Customising the Policy per Namespace
+
+Individual data providers can override `db_compact_tiers()` to use a different policy:
+
+```python
+class PriceDataProvider(DataProvider):
+    def db_compact_tiers(self):
+        # Price data is already at 15-min resolution from the source.
+        # Skip the first tier; only compact to hourly after 2 weeks.
+        return [(to_duration("14 days"), to_duration("1 hour"))]
+```
+
+Return an empty list to disable compaction for a specific namespace entirely:
+
+```python
+class EventLogProvider(DataProvider):
+    def db_compact_tiers(self):
+        return []  # Raw events must not be averaged
+```
+
+### Manual Invocation
+
+```python
+# Compact all providers in the container
+data_container.db_compact()
+
+# Compact a single provider
+provider.db_compact()
+
+# Use a one-off policy without changing the instance default
+provider.db_compact(compact_tiers=[
+    (to_duration("7 days"), to_duration("1 hour"))
+])
+```
+
+### Interaction with Vacuum
+
+Compaction and vacuum are complementary and should always run in this order:
+
+```text
+compact → vacuum
+```
+
+Compact first so that vacuum operates on already-downsampled records. This produces cleaner
+retention boundaries and ensures the vacuum cutoff falls on hour-aligned timestamps rather
+than arbitrary sub-minute ones. Running them in reverse order (vacuum then compact) wastes
+work: vacuum may delete records that compaction would have downsampled and kept.
+
+The `RetentionManager` registers both jobs and ensures compaction always runs before vacuum
+within the same maintenance window.
+
+## Vacuum Operation
+
+Remove old records to reclaim space:
+
+```python
+db_vacuum(keep_hours=48)        # Keep last 48 hours
+db_vacuum(keep_timestamp=cutoff) # Keep from cutoff onward
+```
+
+**Strategy:**
+
+- Computes the cutoff as `max_timestamp - keep_hours`
+- Deletes all records before the cutoff
+- Immediately persists the changes via `db_save_records()`
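+
+The cutoff computation can be illustrated on a plain dict. The function `vacuum` below is a
+hypothetical stand-in, and it assumes that `max_timestamp` refers to the newest stored
+timestamp rather than the current wall-clock time.
+
+```python
+from datetime import datetime, timedelta, timezone
+
+
+def vacuum(records: dict[datetime, float], keep_hours: float) -> dict[datetime, float]:
+    """Keep only records at or after (newest timestamp - keep_hours)."""
+    if not records:
+        return records
+    cutoff = max(records) - timedelta(hours=keep_hours)
+    return {ts: value for ts, value in records.items() if ts >= cutoff}
+
+
+base = datetime(2024, 1, 1, tzinfo=timezone.utc)
+hourly = {base + timedelta(hours=h): float(h) for h in range(100)}
+kept = vacuum(hourly, keep_hours=48)
+print(len(kept))  # 49: hours 51 through 99 inclusive
+```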
+
+## Thread Safety
+
+- **LMDB:** Internal lock protects write transactions; reads are lock-free via MVCC
+- **SQLite:** Lock guards all operations (autocommit mode eliminates transaction deadlocks)
+- **Record Protocol:** No internal locking (assumes single-threaded access per instance)
+
+## Performance Characteristics
+
+| Operation | LMDB | SQLite |
+|---|---|---|
+| Sequential read | Excellent (mmap) | Good (indexed) |
+| Random read | Excellent (mmap) | Good (B-tree) |
+| Bulk write | Excellent (single txn) | Good (batch insert) |
+| Range query | Excellent (cursor) | Good (indexed scan) |
+| Disk usage | Moderate (pre-allocated) | Compact (auto-grow) |
+| Concurrency | High (MVCC readers) | Low (write serialization) |
+
+**Recommendation:** Use LMDB for high-frequency time-series workloads;
+SQLite for portability and simpler deployment.
+
+## Example Usage
+
+```python
+# Configuration
+config.database.provider = "LMDB"
+config.database.compression_level = 9
+config.database.initial_load_window_h = 24  # Load last 24h initially
+config.database.keep_duration_h = 720       # Retain 30 days
+config.database.compaction_interval_sec = 604800  # Compact weekly
+
+# Access (automatic singleton initialization)
+class MeasurementData(DatabaseRecordProtocolMixin):
+    records: list[MeasurementRecord] = []
+
+    def db_namespace(self) -> str:
+        return "measurement"
+
+# Operations
+measurement = MeasurementData()
+
+# Lazy load on first access
+record = measurement.db_get_record(
+    DatabaseTimestamp.from_datetime(now),
+    time_window=Duration(hours=1)
+)
+
+# Insert new record
+measurement.db_insert_record(new_record)
+
+# Automatic save (if autosave configured) or manual
+measurement.db_save_records()
+
+# Maintenance pipeline (normally handled by RetentionManager)
+measurement.db_compact()    # downsample old records first
+measurement.db_vacuum(keep_hours=720)  # then delete beyond retention
+```
--- a/docs/conf.py
+++ b/docs/conf.py
@@ -18,7 +18,7 @@ from akkudoktoreos.core.version import __version__
 # https://www.sphinx-doc.org/en/master/usage/configuration.html#project-information

 project = "Akkudoktor EOS"
-copyright = "2025, Andreas Schmitz"
+copyright = "2025..2026, Andreas Schmitz"
 author = "Andreas Schmitz"
 release = __version__

--- a/docs/index.md
+++ b/docs/index.md
@@ -50,6 +50,7 @@ akkudoktoreos/prediction.md
 akkudoktoreos/measurement.md
 akkudoktoreos/integration.md
 akkudoktoreos/logging.md
+akkudoktoreos/database.md
 akkudoktoreos/adapter.md
 akkudoktoreos/serverapi.md
 akkudoktoreos/api.rst