app.ml_utils module
Machine learning utilities for models and training-log queries.
This module provides small, focused utilities that are used across the training and visualization layers: a minimal PyTorch regression network, typed structures for training results, and helpers to retrieve and format the latest and historical training scores from the database. It exists to keep shared ML-centric logic isolated from the FastAPI views and the standalone training job.
See Also
- app.ml_train
Standalone Slurm-executed training pipeline that writes logs.
- app.slurm_job_trigger
Dispatches training jobs into the Slurm cluster.
- app.database
Engine and session factories (SessionLocal).
- app.models.TrainingLog
ORM model consumed by the query helpers here.
Notes
- Primary role: define lightweight ML helpers and expose read-only queries for app.models.TrainingLog suitable for dashboards and APIs.
- Key dependencies: a reachable database via app.database.SessionLocal and an optional writable shared volume at /data for model artifacts.
- Invariants: the database schema for training_logs must be present.
Examples
>>> # Fetch latest scores grouped by horizon (requires DB)
>>> from app.ml_utils import get_latest_training_logs
>>> latest = get_latest_training_logs()
>>> isinstance(latest, dict)
True
>>> # Create a tiny regression net (no DB required)
>>> import torch
>>> from app.ml_utils import SimpleRegressionNet
>>> net = SimpleRegressionNet(input_dim=4)
>>> y = net(torch.randn(2, 4))
>>> y.shape
torch.Size([2, 1])
- class app.ml_utils.SimpleRegressionNet(*args: Any, **kwargs: Any)
Bases: Module
A minimal fully connected network for regression tasks.
- Parameters:
- input_dim : int
Number of input features; must be a positive integer.
Examples
>>> import torch
>>> net = SimpleRegressionNet(input_dim=3)
>>> out = net(torch.randn(2, 3))
>>> out.shape
torch.Size([2, 1])
- forward(x: torch.Tensor) -> torch.Tensor
Compute predictions for a batch of inputs.
- Parameters:
- x : torch.Tensor
Input feature tensor with shape (batch_size, input_dim).
- Returns:
torch.Tensor
Output tensor with shape (batch_size, 1).
- class app.ml_utils.TrainingLogDetails
Bases: TypedDict
Structured details for a single training log entry.
This typed mapping captures the essential fields used by the UI and reporting layers when presenting the most recent score per horizon.
- Attributes:
- timestamp : datetime | None
Completion time of the training run in UTC.
- sklearn_score : float
R^2 score of the scikit-learn model for this run.
- pytorch_score : float
R^2 score of the PyTorch model for this run.
- data_count : int
Number of samples used for training and evaluation.
- coord_latitude : float | None
Coordinate latitude associated with the run, if any.
- coord_longitude : float | None
Coordinate longitude associated with the run, if any.
- horizon_label : str | None
Human-friendly horizon label (e.g., "5min", "1h"), if set.
- horizon_display_name : str
Preformatted string suitable for charts/legends.
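The attribute list above maps directly onto a TypedDict declaration. The following is a minimal sketch that mirrors the documented field names and types; it is not the module's actual source, and the sample values in example_entry are invented purely for illustration.

```python
from datetime import datetime, timezone
from typing import Optional, TypedDict


class TrainingLogDetails(TypedDict):
    """Illustrative mirror of the documented fields (not the module's source)."""

    timestamp: Optional[datetime]
    sklearn_score: float
    pytorch_score: float
    data_count: int
    coord_latitude: Optional[float]
    coord_longitude: Optional[float]
    horizon_label: Optional[str]
    horizon_display_name: str


# Invented sample values; a real entry would come from the training_logs table.
example_entry: TrainingLogDetails = {
    "timestamp": datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
    "sklearn_score": 0.91,
    "pytorch_score": 0.88,
    "data_count": 1200,
    "coord_latitude": None,
    "coord_longitude": None,
    "horizon_label": "1h",
    "horizon_display_name": "1h horizon",
}
```

Because a TypedDict is a plain dict at runtime, an entry like this serializes and indexes like any other mapping; the type information only helps static checkers.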
- app.ml_utils.assert_positive_input_dim(input_dim: int) -> None
Validate that input_dim is a positive integer.
- Parameters:
- input_dim : int
The number of input features expected by the model. Must be > 0.
- Raises:
ValueError
If input_dim is not a positive integer.
Examples
>>> assert_positive_input_dim(4)
>>> assert_positive_input_dim(0)
Traceback (most recent call last):
    ...
ValueError: input_dim must be a positive integer, but was 0 (type: <class 'int'>).
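A check with the documented behavior might look like the sketch below. The name check_positive_input_dim is hypothetical, chosen to avoid implying this is the module's implementation, and rejecting bool values is an assumption the documentation does not state; only the error message format is taken from the example above.

```python
def check_positive_input_dim(input_dim: int) -> None:
    """Hypothetical stand-in for the documented validator."""
    # Assumption: bool is rejected explicitly, because True/False are ints in Python.
    is_int = isinstance(input_dim, int) and not isinstance(input_dim, bool)
    if not is_int or input_dim <= 0:
        raise ValueError(
            f"input_dim must be a positive integer, but was {input_dim} "
            f"(type: {type(input_dim)})."
        )
```

Validating before the layers are built means a bad dimension fails with a readable message rather than an error from inside the tensor library.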
- app.ml_utils.get_historical_scores() -> Dict[str, Dict[str, Any]]
Fetch historical scores grouped by horizon.
Returns time-ordered scores for every distinct horizon key found in the database. Database errors are logged and an empty mapping is returned on failure to keep callers resilient.
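- Returns:
dict[str, dict[str, Any]]
Mapping from horizon key to the time-ordered scores recorded for that horizon.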
Notes
All exceptions are caught and logged; on any error this function returns an empty dictionary.
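The "log and return an empty mapping" behavior described above can be sketched without a database. In this sketch group_scores_by_horizon, the fetch_rows callable, the (horizon, timestamp, score) row shape, and the "timestamps"/"scores" keys are all assumptions made for illustration; the real function reads from the database through app.database.SessionLocal, and its inner dictionary layout is not documented here.

```python
import logging
from collections import defaultdict
from typing import Any, Callable, Dict, Iterable, Tuple

logger = logging.getLogger(__name__)

# Hypothetical row shape: (horizon key, ISO-8601 timestamp, score).
Row = Tuple[str, str, float]


def group_scores_by_horizon(
    fetch_rows: Callable[[], Iterable[Row]],
) -> Dict[str, Dict[str, Any]]:
    """Group rows by horizon in time order; return {} on any failure."""
    try:
        grouped: Dict[str, Dict[str, Any]] = defaultdict(
            lambda: {"timestamps": [], "scores": []}
        )
        # ISO-8601 strings in one format sort chronologically as plain text.
        for horizon, ts, score in sorted(fetch_rows(), key=lambda row: row[1]):
            grouped[horizon]["timestamps"].append(ts)
            grouped[horizon]["scores"].append(score)
        return dict(grouped)
    except Exception:
        # Mirror the documented contract: log the error, hand back an empty mapping.
        logger.exception("Failed to fetch historical scores")
        return {}
```

One consequence of this contract is that callers cannot tell "no data yet" from "the query failed" by the return value alone; the log is the only place the difference shows up.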
- app.ml_utils.get_latest_training_logs() -> Dict[str, TrainingLogDetails]
Fetch the latest training log per horizon.
Iterates over distinct horizon keys in training_logs and returns the most recent entry for each. Database errors are logged and an empty mapping is returned on failure to keep callers resilient.
- Returns:
dict[str, TrainingLogDetails]
Mapping from horizon key to latest log details.
Notes
All exceptions are caught and logged; on any error this function returns an empty dictionary.
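The "distinct keys, then most recent entry per key" lookup can be illustrated with the standard-library sqlite3 module in place of the project's SQLAlchemy session. The function name latest_per_horizon and the column names horizon, ts, and score are assumptions made for this sketch; the actual training_logs schema is defined by app.models.TrainingLog and is not shown in this page.

```python
import sqlite3
from typing import Dict, Tuple


def latest_per_horizon(conn: sqlite3.Connection) -> Dict[str, Tuple[str, float]]:
    """Return the newest (timestamp, score) pair for each distinct horizon key."""
    keys = [row[0] for row in conn.execute("SELECT DISTINCT horizon FROM training_logs")]
    latest: Dict[str, Tuple[str, float]] = {}
    for key in keys:
        row = conn.execute(
            "SELECT ts, score FROM training_logs "
            "WHERE horizon = ? ORDER BY ts DESC LIMIT 1",
            (key,),
        ).fetchone()
        if row is not None:
            latest[key] = (row[0], row[1])
    return latest


# In-memory table with invented rows, standing in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE training_logs (horizon TEXT, ts TEXT, score REAL)")
conn.executemany(
    "INSERT INTO training_logs VALUES (?, ?, ?)",
    [
        ("1h", "2024-01-01T00:00:00", 0.7),
        ("1h", "2024-01-02T00:00:00", 0.8),
        ("5min", "2024-01-01T00:00:00", 0.5),
    ],
)
result = latest_per_horizon(conn)
```

This issues one query per horizon key, which is simple and fine for a handful of horizons; a single grouped query would be the usual alternative if the number of keys grew large.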