app.ml_utils module

Machine learning utilities for models and training-log queries.

This module provides small, focused utilities shared by the training and visualization layers: a minimal PyTorch regression network, typed structures for training results, and helpers that retrieve and format the latest and historical training scores from the database. It keeps shared ML logic isolated from the FastAPI views and the standalone training job.

Notes

Primary role: define lightweight ML helpers and expose read-only queries for app.models.TrainingLog suitable for dashboards and APIs.
Key dependencies: a reachable database via app.database.SessionLocal and an optional writable shared volume at /data for model artifacts.
Invariants: the database schema for training_logs must be present.

Examples

>>> # Fetch latest scores grouped by horizon (requires DB)
>>> from app.ml_utils import get_latest_training_logs
>>> latest = get_latest_training_logs()
>>> isinstance(latest, dict)
True

>>> # Create a tiny regression net (no DB required)
>>> import torch
>>> from app.ml_utils import SimpleRegressionNet
>>> net = SimpleRegressionNet(input_dim=4)
>>> y = net(torch.randn(2, 4))
>>> y.shape
torch.Size([2, 1])

class app.ml_utils.SimpleRegressionNet(*args: Any, **kwargs: Any)

Bases: Module

A minimal fully connected network for regression tasks.

Parameters:

input_dim (int): Number of input features; must be a positive integer.

Examples

>>> import torch
>>> net = SimpleRegressionNet(input_dim=3)
>>> out = net(torch.randn(2, 3))
>>> out.shape
torch.Size([2, 1])

forward(x: torch.Tensor) → torch.Tensor

Compute predictions for a batch of inputs.

Parameters:

x (torch.Tensor): Input feature tensor with shape (batch_size, input_dim).

Returns:

torch.Tensor: Output tensor with shape (batch_size, 1).
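
The shape contract above, (batch_size, input_dim) in and (batch_size, 1) out, can be illustrated with a small stand-in network. This is only a sketch: the class name SketchRegressionNet, the single hidden layer, its width hidden_dim=16, and the ReLU activation are assumptions made for illustration, since the page does not describe the actual layers of SimpleRegressionNet.

```python
import torch
from torch import nn


class SketchRegressionNet(nn.Module):
    """Hypothetical stand-in; the layer layout is an assumption, not the documented one."""

    def __init__(self, input_dim: int, hidden_dim: int = 16) -> None:
        super().__init__()
        if not isinstance(input_dim, int) or input_dim <= 0:
            raise ValueError(f"input_dim must be a positive integer, but was {input_dim!r}")
        # Two linear layers with a non-linearity in between; the final layer
        # has one output unit so each sample yields a single regression value.
        self.layers = nn.Sequential(
            nn.Linear(input_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch_size, input_dim) -> returns (batch_size, 1)
        return self.layers(x)


net = SketchRegressionNet(input_dim=3)
out = net(torch.randn(2, 3))
```

Whatever the real layer sizes are, the final linear layer with a single output unit is what fixes the trailing dimension of the result to 1.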

class app.ml_utils.TrainingLogDetails

Bases: TypedDict

Structured details for a single training log entry.

This typed mapping captures the essential fields used by the UI and reporting layers when presenting the most recent score per horizon.

Attributes:

timestamp (datetime | None): Completion time of the training run in UTC.
sklearn_score (float): R^2 score of the Scikit-learn model for this run.
pytorch_score (float): R^2 score of the PyTorch model for this run.
data_count (int): Number of samples used for training and evaluation.
coord_latitude (float | None): Coordinate latitude associated with the run, if any.
coord_longitude (float | None): Coordinate longitude associated with the run, if any.
horizon_label (str | None): Human-friendly horizon label (e.g., "5min", "1h"), if set.
horizon_display_name (str): Preformatted string suitable for charts/legends.

coord_latitude: float | None

coord_longitude: float | None

data_count: int

horizon_display_name: str

horizon_label: str | None

pytorch_score: float

sklearn_score: float

timestamp: datetime | None
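
As a rough illustration of how a mapping with these fields might be declared and consumed, the sketch below redeclares the documented attributes under a hypothetical name, TrainingLogDetailsSketch. The sample values, the "1 hour" display name, and the describe helper are invented for the example and do not come from the module.

```python
from datetime import datetime, timezone
from typing import Optional, TypedDict


class TrainingLogDetailsSketch(TypedDict):
    """Hypothetical mirror of the documented fields, for illustration only."""

    timestamp: Optional[datetime]
    sklearn_score: float
    pytorch_score: float
    data_count: int
    coord_latitude: Optional[float]
    coord_longitude: Optional[float]
    horizon_label: Optional[str]
    horizon_display_name: str


def describe(entry: TrainingLogDetailsSketch) -> str:
    # timestamp may be None, so guard before formatting it.
    ts = entry["timestamp"]
    when = ts.isoformat() if ts is not None else "unknown time"
    return (
        f"{entry['horizon_display_name']}: "
        f"sklearn R^2={entry['sklearn_score']:.2f}, "
        f"pytorch R^2={entry['pytorch_score']:.2f} "
        f"({entry['data_count']} samples, {when})"
    )


# Invented sample values.
entry: TrainingLogDetailsSketch = {
    "timestamp": datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
    "sklearn_score": 0.81,
    "pytorch_score": 0.78,
    "data_count": 1200,
    "coord_latitude": None,
    "coord_longitude": None,
    "horizon_label": "1h",
    "horizon_display_name": "1 hour",
}
summary = describe(entry)
```

Because three of the fields are optional, consumers should check them for None before formatting or plotting, as describe does for the timestamp.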

app.ml_utils.assert_positive_input_dim(input_dim: int) → None

Validate that input_dim is a positive integer.

Parameters:

input_dim (int): The number of input features expected by the model. Must be > 0.

Raises:

ValueError: If input_dim is not a positive integer.

Examples

>>> assert_positive_input_dim(4)
>>> assert_positive_input_dim(0)
Traceback (most recent call last):
...
ValueError: input_dim must be a positive integer, but was 0 (type: <class 'int'>).
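
A check with this contract could look roughly like the sketch below. The function name check_positive_input_dim is hypothetical, and rejecting bool values is an assumption (bool is a subclass of int in Python); the message format is copied from the traceback shown in the example.

```python
def check_positive_input_dim(input_dim: int) -> None:
    """Hypothetical re-creation of the documented validation behaviour."""
    # Assumption: True/False are rejected even though bool subclasses int.
    is_int = isinstance(input_dim, int) and not isinstance(input_dim, bool)
    if not is_int or input_dim <= 0:
        raise ValueError(
            f"input_dim must be a positive integer, but was {input_dim!r} "
            f"(type: {type(input_dim)})."
        )


def error_message_for(value: object) -> str:
    # Helper for the demo: return the error text, or "" when the value is accepted.
    try:
        check_positive_input_dim(value)  # type: ignore[arg-type]
    except ValueError as exc:
        return str(exc)
    return ""


msg_zero = error_message_for(0)
msg_ok = error_message_for(4)
```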

app.ml_utils.get_historical_scores() → Dict[str, Dict[str, Any]]

Fetch historical scores grouped by horizon.

Returns time-ordered scores for every distinct horizon key found in the database. Database errors are logged and an empty mapping is returned on failure to keep callers resilient.

Returns:

dict[str, dict[str, Any]]: Mapping from horizon key to a dictionary with keys "timestamps", "sklearn_scores", "pytorch_scores", and "display_name".

Notes

All exceptions are caught and logged; on any error this function returns an empty dictionary.
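
To make the returned shape concrete, the sketch below groups a few in-memory rows into the documented structure. It assumes plain tuples of (horizon key, timestamp, sklearn score, pytorch score, display name) in place of database rows; the helper name group_scores_by_horizon and the sample data are hypothetical, and the real function may build its result differently.

```python
from datetime import datetime
from typing import Any, Dict, Iterable, Tuple

# Assumed row layout: (horizon_key, timestamp, sklearn_score, pytorch_score, display_name)
Row = Tuple[str, datetime, float, float, str]


def group_scores_by_horizon(rows: Iterable[Row]) -> Dict[str, Dict[str, Any]]:
    grouped: Dict[str, Dict[str, Any]] = {}
    # Sort by timestamp first so every per-horizon list comes out time-ordered.
    for key, ts, sk_score, pt_score, display in sorted(rows, key=lambda r: r[1]):
        bucket = grouped.setdefault(
            key,
            {
                "timestamps": [],
                "sklearn_scores": [],
                "pytorch_scores": [],
                "display_name": display,
            },
        )
        bucket["timestamps"].append(ts)
        bucket["sklearn_scores"].append(sk_score)
        bucket["pytorch_scores"].append(pt_score)
    return grouped


# Invented sample rows, deliberately out of order.
rows = [
    ("1h", datetime(2024, 1, 2), 0.8, 0.7, "1 hour"),
    ("1h", datetime(2024, 1, 1), 0.6, 0.5, "1 hour"),
    ("5min", datetime(2024, 1, 1), 0.9, 0.85, "5 minutes"),
]
history = group_scores_by_horizon(rows)
```

In this sketch the three lists under each horizon are index-aligned, so position i in "timestamps" corresponds to position i in both score lists, which is the layout a line chart typically needs.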

app.ml_utils.get_latest_training_logs() → Dict[str, TrainingLogDetails]

Fetch the latest training log per horizon.

Iterates over distinct horizon keys in training_logs and returns the most recent entry for each. Database errors are logged and an empty mapping is returned on failure to keep callers resilient.

Returns:

dict[str, TrainingLogDetails]: Mapping from horizon key to latest log details.

Notes

All exceptions are caught and logged; on any error this function returns an empty dictionary.
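
The log-and-return-empty behaviour described in the note can be sketched as a small wrapper. The name query_or_empty and the fetch callables are hypothetical stand-ins for the database access; this shows the general pattern under those assumptions, not the module's actual code.

```python
import logging
from typing import Any, Callable, Dict

logger = logging.getLogger(__name__)


def query_or_empty(fetch: Callable[[], Dict[str, Any]]) -> Dict[str, Any]:
    """Run fetch(); on any exception, log it with a traceback and return {}."""
    try:
        return fetch()
    except Exception:
        # logger.exception records the message together with the traceback.
        logger.exception("Failed to fetch training logs")
        return {}


def failing_fetch() -> Dict[str, Any]:
    # Simulates an unreachable database.
    raise RuntimeError("database unavailable")


ok_result = query_or_empty(lambda: {"1h": {"sklearn_score": 0.8}})
failed_result = query_or_empty(failing_fetch)
```

A consequence of this design is that callers cannot tell "no logs yet" apart from "query failed" by the return value alone; the log output is the only place the failure is visible.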
