nrelWattileExt 0.3.0 - Axon funcs

v0.3.0

toWattileModels

toWattileModels(points)

Read all wattileModel records associated with points.

toWattilePoints

toWattilePoints(wattileModels)

Read all wattilePoint points associated with wattileModels via wattileModelRef.

wattileExportTrainingData

wattileExportTrainingData(predictors, targets, span, dir, opts: {})

For the specified predictors, targets, and time span, export historical data to directory dir for training Wattile models. predictors and targets are any inputs supported by toRecList(); span is any input supported by toSpan(). The exported data are consistent with the training data input format required by Wattile.

Details

Writes the following files within dir:

<<Name>> Predictors <<Date Range>>.csv: Time series file(s) containing predictor (input) data in CSV format
<<Name>> Targets <<Date Range>>.csv: Time series file(s) containing target (output) data in CSV format
<<Name>> Config.json: Data set configuration metadata in JSON format

In the filenames, <<Name>> is the data set name (which defaults to the name of dir; see Options) and <<Date Range>> is the date range spanned by each file (with format dependent on the export options selected).

Each CSV file contains a Timestamp column followed by one or more numeric value columns corresponding to either the predictor or the target points. Timestamps are exported in ISO 8601 standard format, including time zone offset and time zone name. Only numeric data are supported.
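For readers consuming the exported CSV files outside SkySpark, the following Python sketch shows one way to parse a timestamp that carries both an offset and a time zone name. It assumes the zone name follows the offset after a single space (as in Haystack's date-time text form); the sample value and the helper name parse_export_ts are hypothetical and are not part of the extension.

```python
from datetime import datetime

def parse_export_ts(text):
    """Split 'ISO-8601-with-offset ZoneName' into (aware datetime, zone name)."""
    # Assumption: the zone name, if present, is separated by one space.
    iso_part, _, zone_name = text.strip().partition(" ")
    return datetime.fromisoformat(iso_part), (zone_name or None)

# Hypothetical sample value
ts, zone = parse_export_ts("2021-01-01T00:00:00-07:00 Denver")
```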

The JSON configuration file consists of a single top-level dictionary containing five nested dictionaries:

dates: The data set start and end dates (timestamps)
predictors: A list of predictor variables, including the following metadata for each, as available:
- id: Unique identifier
- dis: Display name; populated via dis()
- column: Column name in CSV file(s)
- unit: Unit
- maxVal: Maximum valid value
- minVal: Minimum valid value
- defVal: Default value; substituted automatically for out-of-range data
targets: A list of target variables, including at minimum the column field and following the same conventions as predictors
files: A list of all CSV files in the data set, each as a dictionary with the following metadata:
- filename: Name of the associated CSV file
- contentType: Either "predictors" or "targets"
- start: Start of the time range spanned by the file
- end: End of the time range spanned by the file
export_options: The opts arguments used for the export (provided for repeatability)

The JSON configuration file uses Haystack Json encoding (Version 3 by default; see Options). Unicode characters are not escaped.
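As an illustration of how the files entry might be consumed downstream, here is a minimal Python sketch that groups CSV filenames by contentType. It assumes filename and contentType appear as plain JSON strings; Haystack JSON may encode other value types (numbers, timestamps, refs) in type-specific forms that this sketch does not decode. The sample config is hypothetical and heavily abbreviated, and the filenames in it are invented for the example.

```python
import json

def files_by_content_type(config_text):
    """Group CSV filenames from a data set config by their contentType."""
    config = json.loads(config_text)
    grouped = {"predictors": [], "targets": []}
    for entry in config["files"]:
        grouped[entry["contentType"]].append(entry["filename"])
    return grouped

# Hypothetical config; only the keys documented above are used.
sample = json.dumps({
    "dates": {},
    "predictors": [],
    "targets": [],
    "files": [
        {"filename": "Demo Predictors 2021.csv", "contentType": "predictors"},
        {"filename": "Demo Targets 2021.csv", "contentType": "targets"},
    ],
    "export_options": {},
})
grouped = files_by_content_type(sample)
```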

Options

Export behavior can be modified by control options passed via opts:

appendUnits: Boolean; if true then units will be appended to column names (Default = false)
clean: Boolean; see wattileReadHis() (Default = false)
batch: Integer; see wattileReadHis() (Default = none)
defVal: haystack::Number or haystack::NA; see wattileReadHis() (Default = none)
interpolate: Boolean; see wattileReadHis() (Default = false)
interval: haystack::Number; see wattileReadHis() (Default = none)
jsonVersion: sys::Str; either v3 or v4; see Json (Default = v3)
name: sys::Str; data set name to use in output files (Default = name of dir)
preview: Boolean; if true then return the data set configuration JSON without writing any files (Default = false)
removeNA: Boolean; see wattileReadHis() (Default = false)
rollup: Boolean; see wattileReadHis() (Default = false)
splitBy: sys::Str; One of "day", "week", "month", or "year" indicating the interval at which to split the data files (Default = no splitting)
timeout: haystack::Number; see wattileReadHis() (Default = none)
warn: Boolean; see wattileReadHis() (Default = true)

Boolean options may also be passed as markers, with x equivalent to x:true and -x equivalent to x:false.

Pre-Processing

Options clean, defVal, interpolate, interval, and rollup control data pre-processing prior to export. See wattileReadHis for each option's effect.

Note: interpolation is performed independently for each exported CSV file. As a result, the target and predictor CSV files for the same date range are not guaranteed to share the same set of timestamps.

Tips

If you are trying to read history from remote projects via Arcbeam and keep seeing an error that XQuery did not return the correct number of points, experiment with increasing timeout and/or decreasing batch. See wattileReadHis for details.

wattileImportModels

wattileImportModels(dirs, opts: {})

Imports proxy records for Wattile model(s) located in dirs and optionally commits them to the database. dirs may be a single uri, a list of uris, a list of anything that can be parsed to a uri, or a grid with a uri column.

Each imported model is a dict with tags:

wattileModel
wattilePredictors: haystack::Grid of predictor metadata (see documentation)
wattileTargetRef: Optional haystack::Ref to the prediction target (if available from predictors_target_config.json; see documentation)
wattileVersion: Optional Wattile version string (if available from metadata.json; see documentation)
uri: Path to the model directory
unit: Optional unit for the prediction output

Does not populate optional tags dis, tz, or wattileReadOpts (as this information is not currently included with Wattile models). Returns a list of model records.

Options

Import behavior can be modified by control options passed via opts:

checked: Boolean; throw an error for invalid models (Default = true)
commit: Boolean; if true commits the model records to the database (Default = false)
conflict: Action on conflicts; one of "skip", "overwrite", "duplicate", or "error" (Default = "error")
warn: Boolean; if true log warnings; if false suppress them (Default = true)

Boolean options may also be passed as markers, with x equivalent to x:true and -x equivalent to x:false.

Conflicts

Conflicts with existing wattileModel records (matched by uri) are handled based on the conflict option:

skip: Skip import
overwrite: Merge imported metadata onto existing record, overwriting it
duplicate: Create a new wattileModel record with duplicate uri
error: Throw error
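The four conflict actions can be summarized with a small Python sketch. This is an illustration of the rules listed above, not the extension's Axon implementation; the function name resolve_conflict and the use of plain dicts for records are assumptions made for the example.

```python
def resolve_conflict(existing, imported, conflict="error"):
    """Return the list of records to write for one imported model.

    existing: the record already in the database with the same uri, or None
    imported: the freshly imported model metadata
    """
    if existing is None:
        return [imported]                  # no conflict: import as-is
    if conflict == "skip":
        return []                          # leave the existing record untouched
    if conflict == "overwrite":
        return [{**existing, **imported}]  # imported tags win over existing ones
    if conflict == "duplicate":
        return [imported]                  # new record sharing the same uri
    if conflict == "error":
        raise ValueError(f"wattileModel already exists for uri {imported['uri']}")
    raise ValueError(f"unknown conflict action: {conflict}")
```

In the "overwrite" case, tags present only on the existing record (for example its id) survive the merge in this sketch, while tags present on both take the imported value.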

Model Validation

This function performs some basic validation on imported models, including checking directory integrity and verifying the existence of the model's predictor and target points within the SkySpark cluster. Behavior on encountering an invalid model depends on the checked option:

If checked is true, throws an error when an invalid model is encountered
If checked is false, skips invalid models...
- If warn is true, with a warning
- If warn is false, silently

wattilePythonEventCleanup

wattilePythonEventCleanup(session, model)

Clean up Wattile prediction event logs to prevent accumulation. Not intended to be called directly; instead use wattilePythonTask().

wattilePythonInit

wattilePythonInit(session)

Initialize Wattile Python session. Not intended to be called directly; instead use wattilePythonTask().

wattilePythonModelPredict

wattilePythonModelPredict(session, model, span)

Execute a Wattile model prediction. Not intended to be called directly; instead use wattilePythonTask().

wattilePythonModelSetup

wattilePythonModelSetup(session, model)

Configure a Wattile model for use with SkySpark. Not intended to be called directly; instead use wattilePythonTask().

wattilePythonTask

wattilePythonTask(msg, image: "wattile")

Task function to handle Wattile Python Docker container interactions. Intended to be run within a dedicated persistent task by passing action messages:

msg must be a dictionary with an action key (Str) and any action-specific inputs as key-value pairs.
image specifies the Docker image by name (Default = "wattile")

Available actions are "init", "setup", "predict", and "cleanup", documented below. Each action uses a dedicated helper function.

For task record configuration, see the extension documentation.

Init

Initialize a Wattile Python session. This action is only required once and is performed automatically when the session starts.

action: "init"

Calls wattilePythonInit().

Setup

Prepare a Wattile model for prediction. Must be run for each model before taking the "predict" action.

action: "setup"
model: A Wattile model record

Calls wattilePythonModelSetup().

Predict

Executes a prediction for a Wattile model.

action: "predict"
model: A Wattile model record
span: Time span for prediction

Returns a tidy grid with columns:

timestamp: Nominal prediction timestamp (time corresponding to horizon = 0)
horizon: Time horizon of predicted value
quantile: Quantile for predicted value
pred_ts: Actual prediction timestamp; equals timestamp + horizon
pred_val: Predicted value

Each row of the grid represents the value of a unique ["timestamp", "quantile", "horizon"] tuple from the underlying Wattile output XArray.
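To make the tidy layout concrete, the Python sketch below flattens a small three-dimensional array into one row per (timestamp, horizon, quantile) tuple and derives pred_ts as timestamp + horizon. The dimension order of the input (timestamp, then horizon, then quantile) and all sample values are assumptions for illustration; the actual conversion happens inside the extension.

```python
from datetime import datetime, timedelta

def to_tidy_rows(timestamps, horizons, quantiles, values):
    """Flatten values[t][h][q] into one row per (timestamp, horizon, quantile)."""
    rows = []
    for t, ts in enumerate(timestamps):
        for h, horizon in enumerate(horizons):
            for q, quantile in enumerate(quantiles):
                rows.append({
                    "timestamp": ts,
                    "horizon": horizon,
                    "quantile": quantile,
                    "pred_ts": ts + horizon,  # actual prediction timestamp
                    "pred_val": values[t][h][q],
                })
    return rows

# Hypothetical output: 1 timestamp x 2 horizons x 3 quantiles = 6 rows
rows = to_tidy_rows(
    [datetime(2021, 1, 1)],
    [timedelta(0), timedelta(minutes=15)],
    [0.05, 0.5, 0.95],
    [[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]],
)
```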

The Wattile model record(s) may have a wattileReadOpts tag which provides model-specific options to use when calling wattileReadHis in the context of model prediction.

Calls wattilePythonModelPredict().

Cleanup

Clean up prediction events and log files for a Wattile model.

action: "cleanup"
model: A Wattile model record

Calls wattilePythonEventCleanup().
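The message shapes for the four actions can be summarized as a lookup of required keys. The Python sketch below only illustrates those shapes; the name check_msg is hypothetical, and whether the task function validates messages this way is not stated in this documentation.

```python
# Keys required in msg (besides "action") for each documented action
REQUIRED_KEYS = {
    "init": set(),
    "setup": {"model"},
    "predict": {"model", "span"},
    "cleanup": {"model"},
}

def check_msg(msg):
    """Return the action name if msg has the documented shape, else raise."""
    action = msg.get("action")
    if action not in REQUIRED_KEYS:
        raise ValueError(f"unknown action: {action!r}")
    missing = REQUIRED_KEYS[action] - set(msg)
    if missing:
        raise ValueError(f"action {action!r} is missing keys: {sorted(missing)}")
    return action
```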

wattileReadHis

wattileReadHis(points, span, opts: {})

For the specified points and time span, read and pre-process historical data for use with a Wattile model. points is any input supported by toRecList(); span is any input supported by toSpan(). Returns a history grid.

Options

Data pre-processing is controlled by the following options, passed via opts:

clean: Boolean; if true then data will be range-cleaned prior to export by removing values outside the range defined by each point's minVal and maxVal tags (Default = false)
batch: Integer; optional batch size for XQuery; see below (Default = none)
defVal: haystack::Number or haystack::NA; governs range cleaning behavior; see below (Default = none)
interpolate: Boolean; if true then data will be interpolated (Default = false)
interval: haystack::Number; optional interval at which to interpolate and/or roll up the data (Default = none)
removeNA: Boolean; if true then NA values will be removed (Default = false)
rollup: Boolean; if true then a history rollup will be applied (Default = false)
timeout: haystack::Number; optional timeout for XQuery; see below (Default = none)
warn: Boolean; if true log warnings; if false suppress them (Default = true)

Boolean options may also be passed as markers, with x equivalent to x:true and -x equivalent to x:false. If rollup is true, then interval must also be specified.

Range Cleaning

If clean is true, values outside the range [minVal, maxVal], as defined by each point's minVal and maxVal tags, are removed from the data set. (Missing minVal and maxVal tags imply permissible minimum and maximum values of negative and positive infinity, respectively.)

If the defVal option is provided, values removed are replaced with defVal; otherwise, they are replaced with Null. Optionally, defVal may be passed as a tag on each point; defVal point tags override the global option on a per-point basis.

Range cleaning is performed prior to NA removal, interpolation, and/or rollup.
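The range-cleaning rules can be sketched in Python as follows. This is an illustration under simplifying assumptions, not the extension's implementation: values are plain numbers or None, a point is a dict of tags, and the string "NA" stands in for haystack::NA.

```python
import math

NA = "NA"  # stand-in for haystack::NA in this sketch

def range_clean(values, point, def_val=None):
    """Replace values outside [minVal, maxVal] with defVal (or None)."""
    lo = point.get("minVal", -math.inf)  # missing tag: no lower bound
    hi = point.get("maxVal", math.inf)   # missing tag: no upper bound
    fill = point.get("defVal", def_val)  # point tag overrides the global option
    return [v if v is None or lo <= v <= hi else fill for v in values]

# Hypothetical point with a valid range of 0 to 100
cleaned = range_clean([5, -1, 120, 50], {"minVal": 0, "maxVal": 100})
```

With no defVal, out-of-range values become None (Null); passing def_val=NA or tagging the point with defVal substitutes that value instead.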

NA Removal

If removeNA is true, NA values will be removed from the data set, including any NA values generated from defVal during range cleaning.

NA removal is performed prior to interpolation and/or rollup.

Interpolation

If interpolate is true, the entire history grid is interpolated via hisInterpolate(), following SkySpark's normal interpolation rules. If the interval option is also specified, then additional interpolation is performed selectively to ensure each interval of data contains at least one value for each input point. Interpolation is performed prior to rollup.

Rollup

If rollup is true, data are rolled up at the specified interval using hisRollupAuto().

Handling Large History Reads

Large or computationally-intensive history reads from remote projects can cause unexpected XQuery behavior, such as silent timeouts or keep-alive resets of the Arcbeam websocket. When this happens, you may see an error message like this:

Expected X points; XQuery returned Y.

Two options can help:

timeout is passed through to xq(); increase it to allow each XQuery more time to complete
batch sets the maximum number of points read per XQuery; decrease it to spread out the history reads over more XQueries

Note: smaller batch size reduces the chance of individual XQuery failures but increases both the total number of XQueries and the total function execution time.
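The trade-off can be seen in a short Python sketch of batching. It assumes that omitting batch means all points are read in a single query, which this documentation does not state explicitly; split_batches is a hypothetical helper, not part of the extension.

```python
def split_batches(points, batch=None):
    """Split points into consecutive groups of at most `batch` items."""
    points = list(points)
    if batch is None:
        return [points]  # assumption: no batch size means a single query
    if batch < 1:
        raise ValueError("batch must be a positive integer")
    return [points[i:i + batch] for i in range(0, len(points), batch)]

# 10 hypothetical points: batch=4 needs 3 queries, batch=2 needs 5
queries_4 = len(split_batches(range(10), 4))
queries_2 = len(split_batches(range(10), 2))
```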

wattileRecDisWithId

wattileRecDisWithId(rec)

Returns a record's display name and id as a sys::Str using the standard format: Display Name (@id).

wattileResolveRec

wattileResolveRec(ref, checked: true)

Return the record Dict for ref. If the record does not exist anywhere in the cluster, throw an error or return null based on the checked flag.

wattileResolveRef

wattileResolveRef(refStr, checked: true)

Resolve refStr into a valid haystack::Ref. If the record does not exist anywhere in the cluster, throw an error or return null based on the checked flag.

Handles any of the following:

With or without leading @
With or without trailing description
Absolute or relative
(Relative refs) with or without "r:" prefix

wattileSyncHis

wattileSyncHis(points, task, span: null, opts: {})

Syncs Wattile prediction history for points and span, using the specified Wattile Python task.

points may be anything accepted by toRecIdList.
span may be anything accepted by toSpan, or Null to sync all history after each point's hisEnd.

Each point must define valid tags wattileModelRef and wattileQuantile. If run within a task, will report task progress.

Sync

The default sync behavior is:

Execute a prediction call to each Wattile model for the specified span.
Extract the horizon = 0 prediction data for each point based on its wattileQuantile.
Drop any predictions with timestamps prior to each point's hisEnd.
Write the new prediction history persistently to each point.

This behavior can be modified via opts; see below.
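The extraction and filtering steps of the default sync can be sketched in Python. Rows are assumed to be dicts with the tidy prediction grid columns (timestamp, horizon, quantile, pred_ts, pred_val); the function name and sample data are hypothetical, and keeping a prediction whose timestamp equals hisEnd exactly is an assumption about the boundary.

```python
from datetime import datetime, timedelta

def select_sync_rows(rows, quantile, his_end=None):
    """Keep horizon = 0 rows for one quantile, dropping rows before his_end."""
    zero = timedelta(0)
    return [
        (r["pred_ts"], r["pred_val"])
        for r in rows
        if r["horizon"] == zero
        and r["quantile"] == quantile
        and (his_end is None or r["pred_ts"] >= his_end)
    ]

t0 = datetime(2021, 1, 1, 0, 0)
t1 = datetime(2021, 1, 1, 0, 15)
step = timedelta(minutes=15)
rows = [
    {"timestamp": t0, "horizon": timedelta(0), "quantile": 0.5, "pred_ts": t0, "pred_val": 10.0},
    {"timestamp": t0, "horizon": step, "quantile": 0.5, "pred_ts": t0 + step, "pred_val": 11.0},
    {"timestamp": t1, "horizon": timedelta(0), "quantile": 0.5, "pred_ts": t1, "pred_val": 12.0},
    {"timestamp": t1, "horizon": timedelta(0), "quantile": 0.95, "pred_ts": t1, "pred_val": 20.0},
]
new_history = select_sync_rows(rows, 0.5, his_end=t1)
```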

Options

Sync behavior can be modified by control options passed via opts:

delay: Number; delay from the present for syncing predictions (Default = 0s)
forecast: Boolean; also write forecast data (Default = false)
forecastOnly: Boolean; only write forecast data (Default = false)
hotPeriod: Number; optional hot period for syncing predictions (Default = None)
limit: Number; limits the length of time span to sync when span is Null (Default = None)
overwrite: Boolean; allows existing history to be overwritten (ignores hisEnd) (Default = false)
progress: Boolean; report task progress via taskProgress (Default = true)

Boolean options may also be passed as markers, with x equivalent to x:true and -x equivalent to x:false. If forecastOnly = true, the forecast option is ignored. To avoid warning spam in the logs, the overwrite and hotPeriod options also set the hisWrite noWarn flag.

Model-specific sync options may also be defined via the wattileSyncOpts tag on a wattileModel record. Model-specific sync options apply to all points associated with that model and override values provided in opts.

Forecasts

If either the forecast or the forecastOnly option is true, then forecasts (data for horizon > 0 in Wattile results) are also written to each point. Only the forecast from the most recent Wattile prediction in the results (most recent value of the timestamp column) is written. Forecasts are always written transiently by setting the hisWrite forecast flag.
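A Python sketch of the forecast selection rule follows: of all predictions in the results, only the one with the most recent timestamp contributes, and only its horizon > 0 rows. Rows are assumed to be dicts with the tidy prediction grid columns; the function name and sample data are hypothetical.

```python
from datetime import datetime, timedelta

def select_forecast_rows(rows, quantile):
    """Return (pred_ts, pred_val) pairs from the latest prediction's forecast."""
    matching = [r for r in rows if r["quantile"] == quantile]
    if not matching:
        return []
    latest = max(r["timestamp"] for r in matching)
    forecast = [
        (r["pred_ts"], r["pred_val"])
        for r in matching
        if r["timestamp"] == latest and r["horizon"] > timedelta(0)
    ]
    return sorted(forecast)

t0 = datetime(2021, 1, 1, 0, 0)
t1 = datetime(2021, 1, 1, 0, 15)
m15, m30 = timedelta(minutes=15), timedelta(minutes=30)
rows = [
    {"timestamp": t0, "horizon": m15, "quantile": 0.5, "pred_ts": t0 + m15, "pred_val": 11.0},
    {"timestamp": t1, "horizon": timedelta(0), "quantile": 0.5, "pred_ts": t1, "pred_val": 12.0},
    {"timestamp": t1, "horizon": m30, "quantile": 0.5, "pred_ts": t1 + m30, "pred_val": 14.0},
    {"timestamp": t1, "horizon": m15, "quantile": 0.5, "pred_ts": t1 + m15, "pred_val": 13.0},
    {"timestamp": t1, "horizon": m15, "quantile": 0.95, "pred_ts": t1 + m15, "pred_val": 21.0},
]
forecast = select_forecast_rows(rows, 0.5)
```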

Calculated Span

If span is Null, then the time span to sync is calculated for each set of points grouped by wattileModelRef. For each group:

Start of span: Equals the earliest hisEnd among points in the group or the beginning of the hot period, whichever is earlier
End of span: Equals the output of now minus delay or the start of span plus limit (when specified), whichever is earlier

Note that any points without existing history (hisEnd equal to Null) are ignored when using a calculated span. When this function is called from a task or job to keep predictions up-to-date, best practice is to also specify limit to avoid extremely large sync batches.
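The span calculation for one group of points can be sketched in Python as below. This illustrates the rules as documented here and is not the extension's implementation; the function name is hypothetical, and the hot period is taken to begin at now minus hotPeriod.

```python
from datetime import datetime, timedelta

def calculated_span(his_ends, now, delay=timedelta(0), hot_period=None, limit=None):
    """Return (start, end) for one wattileModelRef group, or None if no history."""
    ends = [e for e in his_ends if e is not None]  # points without history are ignored
    if not ends:
        return None
    start = min(ends)
    if hot_period is not None:
        start = min(start, now - hot_period)       # whichever is earlier
    end = now - delay
    if limit is not None:
        end = min(end, start + limit)              # whichever is earlier
    return start, end

now = datetime(2021, 1, 10)
his_ends = [datetime(2021, 1, 8), None, datetime(2021, 1, 9)]
span = calculated_span(his_ends, now, delay=timedelta(hours=1))
```

With these hypothetical inputs the span runs from the earliest hisEnd (January 8) to one hour before now; adding limit=timedelta(days=1) would shorten the end to January 9.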

Sync Delay

The delay option delays syncing of predictions in "real time" to ensure complete input data, specified as a duration measuring backwards in time from the present (as returned by now). The minimum recommended delay is the amount of time needed for predictor data to stabilize, e.g. for predictor point histories to sync and be written to disk. This helps ensure that the predictions are computed from complete and valid data.

If span is specified (non-Null), the delay option is ignored.

Hot Period

The hotPeriod option functions similarly to the hot period for rules: during the hot period the synced predictions are continuously refreshed. This is accomplished by clearing point history within the hot period immediately prior to writing the new predictions. Like delay, hotPeriod is specified as a duration measuring backwards in time from the present.

Setting hotPeriod greater than limit will prevent predictions from being fully synced through the present. If span is specified (non-Null), the hotPeriod option is ignored.

wattileViewPredictionHistory

wattileViewPredictionHistory(model, span, target: null, opts: {})

Visualize Wattile model prediction history for span. If available, history from the model's target point is also shown. The target point is resolved via the model's wattileTargetRef or can be specified by the user via target.

Options

Appearance can be modified by control options passed via opts:

interval: Time interval for hisRollupAuto() (Default = none)
lineColor: Target history line color (Default = "#34495e")
lineWidth: Target history line (stroke) width (Default = 1.5)
predictionGradient: List of colors used to create the prediction shading (Default = OrRd from Color Brewer)
predictionWidth: Prediction history line (stroke) width (Default = 0.5)

The first color in predictionGradient corresponds to the 0th percentile prediction and the last color corresponds to the 50th percentile (median) prediction. The gradient is interpolated in reverse for the 50th to 100th percentile.

wattileViewPredictionQuantiles

wattileViewPredictionQuantiles(model, span, target: null)

Visualize the goodness of fit of a Wattile model during span. The target is resolved via the model's wattileTargetRef or can be specified by the user via target.

The output of this function is a grid intended for viewing as a scatterplot:

Each plot point (q, r) represents one quantile predicted by the Wattile model
The X-axis value q is the expected (predicted) quantile
The Y-axis value r is the actual (observed) quantile

The result is similar to the quantile-quantile plot (QQ plot) used to compare probability distributions.

Details

To generate the plot, the function:

Reads the target's history and the model prediction history for span
Interpolates the history grid
For each predicted quantile, calculates the fraction of target history values that are less than or equal to the prediction history values; this is the observed quantile
Records the predicted (q) vs. observed (r) quantile as an output point on the plot

Only the quantiles associated with the available prediction points (via the wattileQuantile tag) are included in the analysis.
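The observed-quantile calculation for a single predicted quantile can be sketched in Python. It assumes the target and prediction series are already aligned on common timestamps (the role of the interpolation step); the function name and sample values are hypothetical.

```python
def observed_quantile(target_values, predicted_values):
    """Fraction of target values at or below the paired predicted values."""
    pairs = list(zip(target_values, predicted_values))
    if not pairs:
        raise ValueError("no overlapping data to compare")
    hits = sum(1 for actual, predicted in pairs if actual <= predicted)
    return hits / len(pairs)

# Hypothetical median (q = 0.5) prediction compared with four target samples
q = 0.5
r = observed_quantile([1.0, 2.0, 3.0, 4.0], [2.0, 2.0, 2.0, 2.0])
point = (q, r)
```

Here two of the four target values are at or below the prediction, so r equals q and the point falls on the diagonal.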

Interpretation

For each plot point (q, r):

If r = q, the model accurately predicted the target values for quantile q within span
If r > q, the model overpredicted the target values for quantile q within span
If r < q, the model underpredicted the target values for quantile q within span

Ideally, all points on the plot should lie on a 45° diagonal line (slope 1, intercept 0). This line is plotted in red for reference.

Caution! This diagnostic plot only shows whether, overall, there is a bias in the predictions for specific quantiles. It does not provide information about how well the prediction trends match the target as it changes with time or with respect to the predictor variables.

Published by NREL