Fix2 config and predictions revamp. (#281)

measurement:

- Add new measurement class to hold real world measurements.
- Handles load meter readings, grid import and export meter readings.
- Aggregates load meter readings (aka measurements) to total load.
- Can import measurements from files, pandas datetime series,
    pandas datetime dataframes, simple datetime arrays and
    programmatically.
- May be expanded to other measurement values.
- Should be used for load prediction adaptations by real world
    measurements.
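
A minimal sketch of the aggregation idea, assuming every load meter already
provides one value per interval. The helper `aggregate_total_load` and the
meter names `load0`/`load1` are hypothetical and are not the measurement
module's API.

```python
from typing import Optional


def aggregate_total_load(
    readings: dict[str, list[Optional[float]]],
) -> list[Optional[float]]:
    """Sum per-meter readings interval by interval.

    Hypothetical helper: missing readings (None) are skipped, and an
    interval without any reading yields None.
    """
    lengths = {len(values) for values in readings.values()}
    if len(lengths) > 1:
        raise ValueError(f"Meters cover different numbers of intervals: {sorted(lengths)}")
    total: list[Optional[float]] = []
    for interval_values in zip(*readings.values()):
        present = [v for v in interval_values if v is not None]
        total.append(sum(present) if present else None)
    return total


load_total = aggregate_total_load(
    {
        "load0": [100.0, 150.0, None],
        "load1": [50.0, None, None],
    }
)
print(load_total)  # [150.0, 150.0, None]
```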

core/coreabc:

- Add mixin class to access measurements

core/pydantic:

- Add pydantic models for pandas datetime series and dataframes.
- Add pydantic models for simple datetime array

core/dataabc:

- Provide DataImport mixin class for generic import handling.
    Imports from JSON string and files. Imports from pandas datetime dataframes
    and simple datetime arrays. Signature of import method changed to
    allow import datetimes to be given programmatically and by data content.
- Use pydantic models for datetime series, dataframes, arrays
- Validate generic imports by pydantic models
- Provide new attributes min_datetime and max_datetime for DataSequence.
- Add parameter dropna to drop NaN/None values when creating lists, pandas series
    or numpy array from DataSequence.
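
The effect of a `dropna` flag can be sketched as follows. This is an
illustration only: `to_list` is a hypothetical stand-alone function, not the
DataSequence method, and it assumes the values are floats or None.

```python
import math
from typing import Optional


def to_list(values: list[Optional[float]], dropna: bool = False) -> list[Optional[float]]:
    """Return a copy of values, optionally without None and NaN entries."""
    if not dropna:
        return list(values)
    return [v for v in values if v is not None and not math.isnan(v)]


raw = [1.0, None, float("nan"), 2.0]
print(to_list(raw, dropna=True))  # [1.0, 2.0]
```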

config/config:

- Add common settings for the measurement module.

predictions/elecpriceakkudoktor:

- Use mean values of last 7 days to fill prediction values not provided by
    akkudoktor.net (only provides 24 values).

prediction/loadabc:

- Extend the generic prediction keys by 'load_mean_adjusted' for load predictions
    that adjust the predicted total load by measured load values.

prediction/loadakkudoktor:

- Extend the Akkudoktor load prediction by load adjustment using measured load
    values.

prediction/load_aggregator:

- Module removed. Load aggregation is now handled by the measurement module.

prediction/load_corrector:

- Module removed. Load correction (aka. adjustment of load prediction by
    measured load energy) is handled by the LoadAkkudoktor prediction and
    the generic 'load_mean_adjusted' prediction key.

prediction/load_forecast:

- Module removed. Functionality now completely handled by the LoadAkkudoktor
    prediction.

utils/cacheutil:

- Use pydantic.
- Fix potential bug in ttl (time to live) duration handling.

utils/datetimeutil:

- Added missing handling of pendulum.DateTime and pendulum.Duration instances
    as input. Previously these were handled as datetime.datetime and
    datetime.timedelta.

utils/visualize:

- Move main to generate_example_report() for better testing support.

server/server:

- Added new configuration option server_fastapi_startup_server_fasthtml
  to make startup of FastHTML server by FastAPI server conditional.

server/fastapi_server:

- Add APIs for measurements
- Improve APIs to provide or take pandas datetime series and
    datetime dataframes controlled by pydantic model.
- Improve APIs to provide or take simple datetime data arrays
    controlled by pydantic model.
- Move fastAPI server API to v1 for new APIs.
- Update pre v1 endpoints to use new prediction and measurement capabilities.
- Only start FastHTML server if 'server_fastapi_startup_server_fasthtml'
    config option is set.

tests:

- Adapt import tests to changed import method signature
- Adapt server test to use the v1 API
- Extend the dataabc test to test for array generation from data
    with several data interval scenarios.
- Extend the datetimeutil test to also test for correct handling
    of to_datetime() providing now().
- Adapt LoadAkkudoktor test for new adjustment calculation.
- Adapt visualization test to use example report function instead of visualize.py
    run as process.
- Removed test_load_aggregator. Functionality is now tested in test_measurement.
- Added tests for measurement module

docs:

- Remove sphinxcontrib-openapi as it prevents build of documentation.
    "site-packages/sphinxcontrib/openapi/openapi31.py", line 305, in _get_type_from_schema
    for t in schema["anyOf"]: KeyError: 'anyOf'"

Signed-off-by: Bobby Noelte <b0661n0e17e@gmail.com>
Bobby Noelte
2024-12-29 18:42:49 +01:00
committed by GitHub
parent 2a8e11d7dc
commit 830af85fca
38 changed files with 3671 additions and 948 deletions

@@ -8,13 +8,15 @@ format, enabling consistent access to forecasted and historical electricity pric
from typing import Any, List, Optional, Union
import numpy as np
import requests
from pydantic import ValidationError
from numpydantic import NDArray, Shape
from pydantic import Field, ValidationError
from akkudoktoreos.core.pydantic import PydanticBaseModel
from akkudoktoreos.prediction.elecpriceabc import ElecPriceDataRecord, ElecPriceProvider
from akkudoktoreos.utils.cacheutil import cache_in_file
from akkudoktoreos.utils.datetimeutil import compare_datetimes, to_datetime
from akkudoktoreos.utils.cacheutil import CacheFileStore, cache_in_file
from akkudoktoreos.utils.datetimeutil import compare_datetimes, to_datetime, to_duration
from akkudoktoreos.utils.logutil import get_logger
logger = get_logger(__name__)
@@ -63,6 +65,20 @@ class ElecPriceAkkudoktor(ElecPriceProvider):
_update_data(): Processes and updates forecast data from Akkudoktor in ElecPriceDataRecord format.
"""
elecprice_8days: NDArray[Shape["24, 8"], float] = Field(
default=np.full((24, 8), np.nan),
description="Hourly electricity prices for the last 7 days and today (€/KWh). "
"A NumPy array of 24 elements, each representing the hourly prices "
"of the last 7 days (index 0..6, Monday..Sunday) and today (index 7).",
)
elecprice_8days_weights_day_of_week: NDArray[Shape["7, 8"], float] = Field(
default=np.full((7, 8), np.nan),
description="Daily electricity price weights for the last 7 days and today. "
"A NumPy array of 7 elements (Monday..Sunday), each representing "
"the daily price weights of the last 7 days (index 0..6, Monday..Sunday) "
"and today (index 7).",
)
@classmethod
def provider_id(cls) -> str:
"""Return the unique identifier for the Akkudoktor provider."""
@@ -84,6 +100,50 @@ class ElecPriceAkkudoktor(ElecPriceProvider):
raise ValueError(error_msg)
return akkudoktor_data
def _calculate_weighted_mean(self, day_of_week: int, hour: int) -> float:
"""Calculate the weighted mean price for given day_of_week and hour.
Args:
day_of_week (int): The day of week to calculate the mean for (0=Monday..6).
hour (int): The hour to calculate the mean for (0..23).
Returns:
price_weighted_mean (float): Weighted mean price for given day_of_week and hour.
"""
if np.isnan(self.elecprice_8days_weights_day_of_week[0][0]):
# Weights not initialized - do now
# Priority of day: 1=most .. 7=least
priority_of_day = np.array(
# Available Prediction days /
# M,Tu,We,Th,Fr,Sa,Su,Today/ Forecast day_of_week
[
[1, 2, 3, 4, 5, 6, 7, 1], # Monday
[3, 1, 2, 4, 5, 6, 7, 1], # Tuesday
[4, 2, 1, 3, 5, 6, 7, 1], # Wednesday
[5, 4, 2, 1, 3, 6, 7, 1], # Thursday
[5, 4, 3, 2, 1, 6, 7, 1], # Friday
[7, 6, 5, 4, 2, 1, 3, 1], # Saturday
[7, 6, 5, 4, 3, 2, 1, 1], # Sunday
]
)
# Convert priorities to weights: each priority step halves the weight (1, 1/2, 1/4, ...)
self.elecprice_8days_weights_day_of_week = 2 / (2**priority_of_day)
# Compute the weighted mean for day_of_week and hour
prices_of_hour = self.elecprice_8days[hour]
if np.isnan(prices_of_hour).all():
# No prediction prices available for this hour - use mean value of all prices
price_weighted_mean = np.nanmean(self.elecprice_8days)
else:
weights = self.elecprice_8days_weights_day_of_week[day_of_week]
prices_of_hour_masked: NDArray[Shape["24"]] = np.ma.MaskedArray(
prices_of_hour, mask=np.isnan(prices_of_hour)
)
price_weighted_mean = np.ma.average(prices_of_hour_masked, weights=weights)
return float(price_weighted_mean)
@cache_in_file(with_ttl="1 hour")
def _request_forecast(self) -> AkkudoktorElecPrice:
"""Fetch electricity price forecast data from Akkudoktor API.
@@ -98,13 +158,13 @@ class ElecPriceAkkudoktor(ElecPriceProvider):
ValueError: If the API response does not include expected `electricity price` data.
"""
source = "https://api.akkudoktor.net"
date = to_datetime(self.start_datetime, as_string="Y-M-D")
# Try to take data from 7 days back for prediction - usually only some hours back are available
date = to_datetime(self.start_datetime - to_duration("7 days"), as_string="Y-M-D")
last_date = to_datetime(self.end_datetime, as_string="Y-M-D")
response = requests.get(
f"{source}/prices?date={date}&last_date={last_date}&tz={self.config.timezone}"
)
url = f"{source}/prices?date={date}&last_date={last_date}&tz={self.config.timezone}"
response = requests.get(url)
logger.debug(f"Response from {url}: {response}")
response.raise_for_status() # Raise an error for bad responses
logger.debug(f"Response from {source}: {response}")
akkudoktor_data = self._validate_data(response.content)
# We are working on fresh data (no cache), report update time
self.update_datetime = to_datetime(in_timezone=self.config.timezone)
@@ -131,38 +191,66 @@ class ElecPriceAkkudoktor(ElecPriceProvider):
f"but only {values_len} data sets are given in forecast data."
)
previous_price = akkudoktor_data.values[0].marketpriceEurocentPerKWh
# Get cached 8day values
elecprice_cache_file = CacheFileStore().get(key="ElecPriceAkkudoktor8dayCache")
if elecprice_cache_file is None:
# Cache does not exist - create it
elecprice_cache_file = CacheFileStore().create(
key="ElecPriceAkkudoktor8dayCache",
until_datetime=to_datetime("infinity"),
suffix=".npy",
)
np.save(elecprice_cache_file, self.elecprice_8days)
elecprice_cache_file.seek(0)
self.elecprice_8days = np.load(elecprice_cache_file)
for i in range(values_len):
original_datetime = akkudoktor_data.values[i].start
dt = to_datetime(original_datetime, in_timezone=self.config.timezone)
akkudoktor_value = akkudoktor_data.values[i]
if compare_datetimes(dt, self.start_datetime).le:
if compare_datetimes(dt, self.start_datetime).lt:
# forecast data is too old
previous_price = akkudoktor_data.values[i].marketpriceEurocentPerKWh
self.elecprice_8days[dt.hour, dt.day_of_week] = (
akkudoktor_value.marketpriceEurocentPerKWh
)
continue
self.elecprice_8days[dt.hour, 7] = akkudoktor_value.marketpriceEurocentPerKWh
record = ElecPriceDataRecord(
date_time=dt,
elecprice_marketprice=akkudoktor_data.values[i].marketpriceEurocentPerKWh,
elecprice_marketprice=akkudoktor_value.marketpriceEurocentPerKWh,
)
self.append(record)
# Update 8day cache
elecprice_cache_file.seek(0)
np.save(elecprice_cache_file, self.elecprice_8days)
# Check for new/ valid forecast data
if len(self) == 0:
# Got no valid forecast data
return
# Assure price starts at start_time
if compare_datetimes(self[0].date_time, self.start_datetime).gt:
while compare_datetimes(self[0].date_time, self.start_datetime).gt:
# Repeat the mean on the 8 day array to cover the missing hours
dt = self[0].date_time.subtract(hours=1) # type: ignore
value = self._calculate_weighted_mean(dt.day_of_week, dt.hour)
record = ElecPriceDataRecord(
date_time=self.start_datetime,
elecprice_marketprice=previous_price,
date_time=dt,
elecprice_marketprice=value,
)
self.insert(0, record)
# Assure price ends at end_time
if compare_datetimes(self[-1].date_time, self.end_datetime).lt:
while compare_datetimes(self[-1].date_time, self.end_datetime).lt:
# Repeat the mean on the 8 day array to cover the missing hours
dt = self[-1].date_time.add(hours=1) # type: ignore
value = self._calculate_weighted_mean(dt.day_of_week, dt.hour)
record = ElecPriceDataRecord(
date_time=self.end_datetime,
elecprice_marketprice=self[-1].elecprice_marketprice,
date_time=dt,
elecprice_marketprice=value,
)
self.append(record)
# If some of the hourly values are missing, they will be interpolated when using
# `key_to_array`.
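
The weighting used by `_calculate_weighted_mean` above can be illustrated
without NumPy. The sketch below is not the provider's code: `weighted_mean`
is a hypothetical helper, and only the Monday row of the priority table is
copied from the diff. It assumes missing prices are stored as NaN, as in the
8-day array.

```python
import math

# Priority of each stored day (Mon..Sun, today) when forecasting a Monday.
# Copied from the first row of the priority table (1 = most relevant).
PRIORITY_MONDAY = [1, 2, 3, 4, 5, 6, 7, 1]


def weighted_mean(prices: list[float], priorities: list[int]) -> float:
    """Weighted mean of prices that ignores NaN entries.

    Weights follow the 2 / 2**priority rule, so each priority step halves
    the influence of that day.
    """
    weights = [2 / (2**p) for p in priorities]
    pairs = [(x, w) for x, w in zip(prices, weights) if not math.isnan(x)]
    if not pairs:
        # No price known for this hour at all
        return math.nan
    return sum(x * w for x, w in pairs) / sum(w for _, w in pairs)


nan = math.nan
# Only last Monday (index 0) and today (index 7) have a price for this hour.
prices_of_hour = [10.0, nan, nan, nan, nan, nan, nan, 20.0]
print(weighted_mean(prices_of_hour, PRIORITY_MONDAY))  # 15.0
```

Both available days have priority 1 and therefore weight 1, so the result is
their plain average. A day with priority 2 would count half as much.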

@@ -1,37 +0,0 @@
from collections import defaultdict
from collections.abc import Sequence
class LoadAggregator:
def __init__(self, prediction_hours: int = 24) -> None:
"""Initializes the LoadAggregator object with the number of prediction hours.
:param prediction_hours: Number of hours to predict (default: 24)
"""
self.loads: defaultdict[str, list[float]] = defaultdict(
list
) # Dictionary to hold load arrays for different sources
self.prediction_hours: int = prediction_hours
def add_load(self, name: str, last_array: Sequence[float]) -> None:
"""Adds a load array for a specific source. Accepts a Sequence of floats.
:param name: Name of the load source (e.g., "Household", "Heat Pump").
:param last_array: Sequence of loads, where each entry corresponds to an hour.
:raises ValueError: If the length of last_array doesn't match the prediction hours.
"""
# Check length of the array without converting
if len(last_array) != self.prediction_hours:
raise ValueError(f"Total load inconsistent lengths in arrays: {name} {len(last_array)}")
self.loads[name] = list(last_array)
def calculate_total_load(self) -> list[float]:
"""Calculates the total load for each hour by summing up the loads from all sources.
:return: A list representing the total load for each hour.
Returns an empty list if no loads have been added.
"""
# Optimize the summation using a single loop with zip
total_load = [sum(hourly_loads) for hourly_loads in zip(*self.loads.values())]
return total_load

@@ -1,202 +0,0 @@
from datetime import datetime
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn.metrics import mean_squared_error, r2_score
from akkudoktoreos.prediction.load_forecast import LoadForecast
class LoadPredictionAdjuster:
def __init__(
self, measured_data: pd.DataFrame, predicted_data: pd.DataFrame, load_forecast: LoadForecast
):
self.measured_data = measured_data
self.predicted_data = predicted_data
self.load_forecast = load_forecast
self.merged_data = self._merge_data()
def _remove_outliers(self, data: pd.DataFrame, threshold: int = 2) -> pd.DataFrame:
# Calculate the Z-Score of the 'Last' data
data["Z-Score"] = np.abs((data["Last"] - data["Last"].mean()) / data["Last"].std())
# Filter the data based on the threshold
filtered_data = data[data["Z-Score"] < threshold]
return filtered_data.drop(columns=["Z-Score"])
def _merge_data(self) -> pd.DataFrame:
# Convert the time column in both DataFrames to datetime
self.predicted_data["time"] = pd.to_datetime(self.predicted_data["time"])
self.measured_data["time"] = pd.to_datetime(self.measured_data["time"])
# Ensure both time columns have the same timezone
if self.measured_data["time"].dt.tz is None:
self.measured_data["time"] = self.measured_data["time"].dt.tz_localize("UTC")
self.predicted_data["time"] = (
self.predicted_data["time"].dt.tz_localize("UTC").dt.tz_convert("Europe/Berlin")
)
self.measured_data["time"] = self.measured_data["time"].dt.tz_convert("Europe/Berlin")
# Optionally: Remove timezone information if only working locally
self.predicted_data["time"] = self.predicted_data["time"].dt.tz_localize(None)
self.measured_data["time"] = self.measured_data["time"].dt.tz_localize(None)
# Now you can perform the merge
merged_data = pd.merge(self.measured_data, self.predicted_data, on="time", how="inner")
print(merged_data)
merged_data["Hour"] = merged_data["time"].dt.hour
merged_data["DayOfWeek"] = merged_data["time"].dt.dayofweek
return merged_data
def calculate_weighted_mean(
self, train_period_weeks: int = 9, test_period_weeks: int = 1
) -> None:
self.merged_data = self._remove_outliers(self.merged_data)
train_end_date = self.merged_data["time"].max() - pd.Timedelta(weeks=test_period_weeks)
train_start_date = train_end_date - pd.Timedelta(weeks=train_period_weeks)
test_start_date = train_end_date + pd.Timedelta(hours=1)
test_end_date = (
test_start_date + pd.Timedelta(weeks=test_period_weeks) - pd.Timedelta(hours=1)
)
self.train_data = self.merged_data[
(self.merged_data["time"] >= train_start_date)
& (self.merged_data["time"] <= train_end_date)
]
self.test_data = self.merged_data[
(self.merged_data["time"] >= test_start_date)
& (self.merged_data["time"] <= test_end_date)
]
self.train_data["Difference"] = self.train_data["Last"] - self.train_data["Last Pred"]
weekdays_train_data = self.train_data[self.train_data["DayOfWeek"] < 5]
weekends_train_data = self.train_data[self.train_data["DayOfWeek"] >= 5]
self.weekday_diff = (
weekdays_train_data.groupby("Hour").apply(self._weighted_mean_diff).dropna()
)
self.weekend_diff = (
weekends_train_data.groupby("Hour").apply(self._weighted_mean_diff).dropna()
)
def _weighted_mean_diff(self, data: pd.DataFrame) -> float:
train_end_date = self.train_data["time"].max()
weights = 1 / (train_end_date - data["time"]).dt.days.replace(0, np.nan)
weighted_mean = (data["Difference"] * weights).sum() / weights.sum()
return weighted_mean
def adjust_predictions(self) -> None:
self.train_data["Adjusted Pred"] = self.train_data.apply(self._adjust_row, axis=1)
self.test_data["Adjusted Pred"] = self.test_data.apply(self._adjust_row, axis=1)
def _adjust_row(self, row: pd.Series) -> pd.Series:
if row["DayOfWeek"] < 5:
return row["Last Pred"] + self.weekday_diff.get(row["Hour"], 0)
else:
return row["Last Pred"] + self.weekend_diff.get(row["Hour"], 0)
def plot_results(self) -> None:
self._plot_data(self.train_data, "Training")
self._plot_data(self.test_data, "Testing")
def _plot_data(self, data: pd.DataFrame, data_type: str) -> None:
plt.figure(figsize=(14, 7))
plt.plot(data["time"], data["Last"], label=f"Actual Last - {data_type}", color="blue")
plt.plot(
data["time"],
data["Last Pred"],
label=f"Predicted Last - {data_type}",
color="red",
linestyle="--",
)
plt.plot(
data["time"],
data["Adjusted Pred"],
label=f"Adjusted Predicted Last - {data_type}",
color="green",
linestyle=":",
)
plt.xlabel("Time")
plt.ylabel("Load")
plt.title(f"Actual vs Predicted vs Adjusted Predicted Load ({data_type} Data)")
plt.legend()
plt.grid(True)
plt.show()
def evaluate_model(self) -> None:
mse = mean_squared_error(self.test_data["Last"], self.test_data["Adjusted Pred"])
r2 = r2_score(self.test_data["Last"], self.test_data["Adjusted Pred"])
print(f"Mean Squared Error: {mse}")
print(f"R-squared: {r2}")
def predict_next_hours(self, hours_ahead: int) -> pd.DataFrame:
last_date = self.merged_data["time"].max()
future_dates = [last_date + pd.Timedelta(hours=i) for i in range(1, hours_ahead + 1)]
future_df = pd.DataFrame({"time": future_dates})
future_df["Hour"] = future_df["time"].dt.hour
future_df["DayOfWeek"] = future_df["time"].dt.dayofweek
future_df["Last Pred"] = future_df["time"].apply(self._forecast_next_hours)
future_df["Adjusted Pred"] = future_df.apply(self._adjust_row, axis=1)
return future_df
def _forecast_next_hours(self, timestamp: datetime) -> float:
date_str = timestamp.strftime("%Y-%m-%d")
hour = timestamp.hour
daily_forecast = self.load_forecast.get_daily_stats(date_str)
return daily_forecast[0][hour] if hour < len(daily_forecast[0]) else np.nan
# if __name__ == '__main__':
# estimator = LastEstimator()
# start_date = "2024-06-01"
# end_date = "2024-08-01"
# last_df = estimator.get_last(start_date, end_date)
# selected_columns = last_df[['timestamp', 'Last']]
# selected_columns['time'] = pd.to_datetime(selected_columns['timestamp']).dt.floor('H')
# selected_columns['Last'] = pd.to_numeric(selected_columns['Last'], errors='coerce')
# # Drop rows with NaN values
# cleaned_data = selected_columns.dropna()
# print(cleaned_data)
# # Create an instance of LoadForecast
# lf = LoadForecast(filepath=r'.\load_profiles.npz', year_energy=6000*1000)
# # Initialize an empty DataFrame to hold the forecast data
# forecast_list = []
# # Loop through each day in the date range
# for single_date in pd.date_range(cleaned_data['time'].min().date(), cleaned_data['time'].max().date()):
# date_str = single_date.strftime('%Y-%m-%d')
# daily_forecast = lf.get_daily_stats(date_str)
# mean_values = daily_forecast[0] # Extract the mean values
# hours = [single_date + pd.Timedelta(hours=i) for i in range(24)]
# daily_forecast_df = pd.DataFrame({'time': hours, 'Last Pred': mean_values})
# forecast_list.append(daily_forecast_df)
# # Concatenate all daily forecasts into a single DataFrame
# forecast_df = pd.concat(forecast_list, ignore_index=True)
# # Create an instance of the LoadPredictionAdjuster class
# adjuster = LoadPredictionAdjuster(cleaned_data, forecast_df, lf)
# # Calculate the weighted mean differences
# adjuster.calculate_weighted_mean()
# # Adjust the predictions
# adjuster.adjust_predictions()
# # Plot the results
# adjuster.plot_results()
# # Evaluate the model
# adjuster.evaluate_model()
# # Predict the next x hours
# future_predictions = adjuster.predict_next_hours(48)
# print(future_predictions)

@@ -1,99 +0,0 @@
from datetime import datetime
from pathlib import Path
import numpy as np
# Load the .npz file when the application starts
class LoadForecast:
def __init__(self, filepath: str | Path, year_energy: float):
self.filepath = filepath
self.year_energy = year_energy
self.load_data()
def get_daily_stats(self, date_str: str) -> np.ndarray:
"""Returns the 24-hour profile with mean and standard deviation for a given date.
:param date_str: Date as a string in the format "YYYY-MM-DD"
:return: An array with shape (2, 24), contains means and standard deviations
"""
# Convert the date string into a datetime object
date = self._convert_to_datetime(date_str)
# Calculate the day of the year (1 to 365)
day_of_year = date.timetuple().tm_yday
# Extract the 24-hour profile for the given date
daily_stats = self.data_year_energy[day_of_year - 1] # -1 because indexing starts at 0
return daily_stats
def get_hourly_stats(self, date_str: str, hour: int) -> np.ndarray:
"""Returns the mean and standard deviation for a specific hour of a given date.
:param date_str: Date as a string in the format "YYYY-MM-DD"
:param hour: Specific hour (0 to 23)
:return: An array with shape (2,), contains mean and standard deviation for the specified hour
"""
# Convert the date string into a datetime object
date = self._convert_to_datetime(date_str)
# Calculate the day of the year (1 to 365)
day_of_year = date.timetuple().tm_yday
# Extract mean and standard deviation for the given hour
hourly_stats = self.data_year_energy[day_of_year - 1, :, hour] # Access the specific hour
return hourly_stats
def get_stats_for_date_range(self, start_date_str: str, end_date_str: str) -> np.ndarray:
"""Returns the means and standard deviations for a date range.
:param start_date_str: Start date as a string in the format "YYYY-MM-DD"
:param end_date_str: End date as a string in the format "YYYY-MM-DD"
:return: An array with aggregated data for the date range
"""
start_date = self._convert_to_datetime(start_date_str)
end_date = self._convert_to_datetime(end_date_str)
start_day_of_year = start_date.timetuple().tm_yday
end_day_of_year = end_date.timetuple().tm_yday
# Note that in leap years, the day of the year may need adjustment
stats_for_range = self.data_year_energy[
start_day_of_year:end_day_of_year
] # -1 because indexing starts at 0
stats_for_range = stats_for_range.swapaxes(1, 0)
stats_for_range = stats_for_range.reshape(stats_for_range.shape[0], -1)
return stats_for_range
def load_data(self) -> None:
"""Loads data from the specified file."""
try:
data = np.load(self.filepath)
self.data = np.array(list(zip(data["yearly_profiles"], data["yearly_profiles_std"])))
self.data_year_energy = self.data * self.year_energy
# pprint(self.data_year_energy)
except FileNotFoundError:
print(f"Error: File {self.filepath} not found.")
except Exception as e:
print(f"An error occurred while loading data: {e}")
def get_price_data(self) -> None:
"""Returns price data (currently not implemented)."""
raise NotImplementedError
# return self.price_data
def _convert_to_datetime(self, date_str: str) -> datetime:
"""Converts a date string to a datetime object."""
return datetime.strptime(date_str, "%Y-%m-%d")
# Example usage of the class
if __name__ == "__main__":
filepath = r"..\data\load_profiles.npz" # Adjust the path to the .npz file
lf = LoadForecast(filepath=filepath, year_energy=2000)
specific_date_prices = lf.get_daily_stats("2024-02-16") # Adjust date as needed
specific_hour_stats = lf.get_hourly_stats("2024-02-16", 12) # Adjust date and hour as needed
print(specific_hour_stats)

@@ -18,8 +18,14 @@ logger = get_logger(__name__)
class LoadDataRecord(PredictionRecord):
"""Represents a load data record containing various load attributes at a specific datetime."""
load_mean: Optional[float] = Field(default=None, description="Load mean value (W)")
load_std: Optional[float] = Field(default=None, description="Load standard deviation (W)")
load_mean: Optional[float] = Field(default=None, description="Predicted load mean value (W)")
load_std: Optional[float] = Field(
default=None, description="Predicted load standard deviation (W)"
)
load_mean_adjusted: Optional[float] = Field(
default=None, description="Predicted load mean value adjusted by load measurement (W)"
)
class LoadProvider(PredictionProvider):

@@ -8,7 +8,7 @@ from pydantic import Field
from akkudoktoreos.config.configabc import SettingsBaseModel
from akkudoktoreos.prediction.loadabc import LoadProvider
from akkudoktoreos.utils.datetimeutil import to_datetime, to_duration
from akkudoktoreos.utils.datetimeutil import compare_datetimes, to_datetime, to_duration
from akkudoktoreos.utils.logutil import get_logger
logger = get_logger(__name__)
@@ -30,6 +30,58 @@ class LoadAkkudoktor(LoadProvider):
"""Return the unique identifier for the LoadAkkudoktor provider."""
return "LoadAkkudoktor"
def _calculate_adjustment(self, data_year_energy: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
"""Calculate weekday and weekend adjustment from total load measurement data.
Returns:
weekday_adjust (np.ndarray): hourly adjustment for Monday to Friday.
weekend_adjust (np.ndarray): hourly adjustment for Saturday and Sunday.
"""
weekday_adjust = np.zeros(24)
weekday_adjust_weight = np.zeros(24)
weekend_adjust = np.zeros(24)
weekend_adjust_weight = np.zeros(24)
if self.measurement.max_datetime is None:
# No measurements - return 0 adjustment
return (weekday_adjust, weekend_adjust)
# compare predictions with real measurement - try to use last 7 days
compare_start = self.measurement.max_datetime - to_duration("7 days")
if compare_datetimes(compare_start, self.measurement.min_datetime).lt:
# Not enough measurements for 7 days - use what is available
compare_start = self.measurement.min_datetime
compare_end = self.measurement.max_datetime
compare_interval = to_duration("1 hour")
load_total_array = self.measurement.load_total(
start_datetime=compare_start,
end_datetime=compare_end,
interval=compare_interval,
)
compare_dt = compare_start
for i in range(len(load_total_array)):
load_total = load_total_array[i]
# Extract mean (index 0) and standard deviation (index 1) for the given day and hour
# Day indexing starts at 0, -1 because of that
hourly_stats = data_year_energy[compare_dt.day_of_year - 1, :, compare_dt.hour]
weight = 1 / ((compare_end - compare_dt).days + 1)
if compare_dt.day_of_week < 5:
weekday_adjust[compare_dt.hour] += (load_total - hourly_stats[0]) * weight
weekday_adjust_weight[compare_dt.hour] += weight
else:
weekend_adjust[compare_dt.hour] += (load_total - hourly_stats[0]) * weight
weekend_adjust_weight[compare_dt.hour] += weight
compare_dt += compare_interval
# Calculate mean
for i in range(24):
if weekday_adjust_weight[i] > 0:
weekday_adjust[i] = weekday_adjust[i] / weekday_adjust_weight[i]
if weekend_adjust_weight[i] > 0:
weekend_adjust[i] = weekend_adjust[i] / weekend_adjust_weight[i]
return (weekday_adjust, weekend_adjust)
def load_data(self) -> np.ndarray:
"""Loads data from the Akkudoktor load file."""
load_file = Path(__file__).parent.parent.joinpath("data/load_profiles.npz")
@@ -54,13 +106,24 @@ class LoadAkkudoktor(LoadProvider):
def _update_data(self, force_update: Optional[bool] = False) -> None:
"""Adds the load means and standard deviations."""
data_year_energy = self.load_data()
weekday_adjust, weekend_adjust = self._calculate_adjustment(data_year_energy)
date = self.start_datetime
for i in range(self.config.prediction_hours):
# Extract mean and standard deviation for the given day and hour
# Extract mean (index 0) and standard deviation (index 1) for the given day and hour
# Day indexing starts at 0, -1 because of that
hourly_stats = data_year_energy[date.day_of_year - 1, :, date.hour]
self.update_value(date, "load_mean", hourly_stats[0])
self.update_value(date, "load_std", hourly_stats[1])
if date.day_of_week < 5:
# Monday to Friday (0..4)
self.update_value(
date, "load_mean_adjusted", hourly_stats[0] + weekday_adjust[date.hour]
)
else:
# Saturday, Sunday (5, 6)
self.update_value(
date, "load_mean_adjusted", hourly_stats[0] + weekend_adjust[date.hour]
)
date += to_duration("1 hour")
# We are working on fresh data (no cache), report update time
self.update_datetime = to_datetime(in_timezone=self.config.timezone)
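
The adjustment calculated in `_calculate_adjustment` above is a weighted mean
of the difference between measured and predicted load, where recent samples
count more (weight `1 / (days_ago + 1)`). A simplified sketch, assuming the
samples are already paired hour by hour. The `Sample` type and
`calculate_adjustment` function are hypothetical and use plain lists instead
of NumPy arrays.

```python
from typing import NamedTuple


class Sample(NamedTuple):
    days_ago: int      # whole days between this sample and the latest measurement
    day_of_week: int   # 0=Monday..6=Sunday
    hour: int          # 0..23
    measured: float    # measured total load (W)
    predicted: float   # predicted load mean (W)


def calculate_adjustment(samples: list[Sample]) -> tuple[list[float], list[float]]:
    """Return hourly (weekday, weekend) adjustments as weighted mean differences."""
    weekday, weekday_weight = [0.0] * 24, [0.0] * 24
    weekend, weekend_weight = [0.0] * 24, [0.0] * 24
    for s in samples:
        weight = 1 / (s.days_ago + 1)
        if s.day_of_week < 5:
            adjust, weights = weekday, weekday_weight
        else:
            adjust, weights = weekend, weekend_weight
        adjust[s.hour] += (s.measured - s.predicted) * weight
        weights[s.hour] += weight
    # Normalize by the accumulated weights. Hours without samples stay at 0.
    for hour in range(24):
        if weekday_weight[hour] > 0:
            weekday[hour] /= weekday_weight[hour]
        if weekend_weight[hour] > 0:
            weekend[hour] /= weekend_weight[hour]
    return weekday, weekend


weekday_adjust, weekend_adjust = calculate_adjustment(
    [
        Sample(days_ago=0, day_of_week=2, hour=12, measured=600.0, predicted=500.0),
        Sample(days_ago=1, day_of_week=1, hour=12, measured=500.0, predicted=500.0),
        Sample(days_ago=2, day_of_week=6, hour=12, measured=450.0, predicted=500.0),
    ]
)
print(round(weekday_adjust[12], 2), round(weekend_adjust[12], 2))
```

The adjusted prediction for an hour is then the predicted mean plus the
weekday or weekend adjustment for that hour, as done in `_update_data`.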

@@ -13,6 +13,7 @@ from typing import List, Optional
from pendulum import DateTime
from pydantic import Field, computed_field
from akkudoktoreos.core.coreabc import MeasurementMixin
from akkudoktoreos.core.dataabc import (
DataBase,
DataContainer,
@@ -27,10 +28,11 @@ from akkudoktoreos.utils.logutil import get_logger
logger = get_logger(__name__)
class PredictionBase(DataBase):
class PredictionBase(DataBase, MeasurementMixin):
"""Base class for handling prediction data.
Enables access to EOS configuration data (attribute `config`).
Enables access to EOS configuration data (attribute `config`) and EOS measurement data
(attribute `measurement`).
"""
pass

@@ -70,7 +70,7 @@ class PVForecastCommonSettings(SettingsBaseModel):
pvforecast0_inverter_paco: Optional[int] = Field(
default=None, description="AC power rating of the inverter. [W]"
)
pvforecast0_modules_per_string: Optional[str] = Field(
pvforecast0_modules_per_string: Optional[int] = Field(
default=None, description="Number of the PV modules of the strings of this plane."
)
pvforecast0_strings_per_inverter: Optional[str] = Field(
@@ -124,7 +124,7 @@ class PVForecastCommonSettings(SettingsBaseModel):
pvforecast1_inverter_paco: Optional[int] = Field(
default=None, description="AC power rating of the inverter. [W]"
)
pvforecast1_modules_per_string: Optional[str] = Field(
pvforecast1_modules_per_string: Optional[int] = Field(
default=None, description="Number of the PV modules of the strings of this plane."
)
pvforecast1_strings_per_inverter: Optional[str] = Field(
@@ -178,7 +178,7 @@ class PVForecastCommonSettings(SettingsBaseModel):
pvforecast2_inverter_paco: Optional[int] = Field(
default=None, description="AC power rating of the inverter. [W]"
)
pvforecast2_modules_per_string: Optional[str] = Field(
pvforecast2_modules_per_string: Optional[int] = Field(
default=None, description="Number of the PV modules of the strings of this plane."
)
pvforecast2_strings_per_inverter: Optional[str] = Field(
@@ -232,7 +232,7 @@ class PVForecastCommonSettings(SettingsBaseModel):
pvforecast3_inverter_paco: Optional[int] = Field(
default=None, description="AC power rating of the inverter. [W]"
)
pvforecast3_modules_per_string: Optional[str] = Field(
pvforecast3_modules_per_string: Optional[int] = Field(
default=None, description="Number of the PV modules of the strings of this plane."
)
pvforecast3_strings_per_inverter: Optional[str] = Field(
@@ -286,7 +286,7 @@ class PVForecastCommonSettings(SettingsBaseModel):
pvforecast4_inverter_paco: Optional[int] = Field(
default=None, description="AC power rating of the inverter. [W]"
)
pvforecast4_modules_per_string: Optional[str] = Field(
pvforecast4_modules_per_string: Optional[int] = Field(
default=None, description="Number of the PV modules of the strings of this plane."
)
pvforecast4_strings_per_inverter: Optional[str] = Field(
@@ -340,7 +340,7 @@ class PVForecastCommonSettings(SettingsBaseModel):
pvforecast5_inverter_paco: Optional[int] = Field(
default=None, description="AC power rating of the inverter. [W]"
)
pvforecast5_modules_per_string: Optional[str] = Field(
pvforecast5_modules_per_string: Optional[int] = Field(
default=None, description="Number of the PV modules of the strings of this plane."
)
pvforecast5_strings_per_inverter: Optional[str] = Field(