Read all wattilePoint points associated with wattileModels via wattileModelRef.
For the specified predictors, targets, and time span, export historical data to directory dir for training Wattile models. predictors and targets are any inputs supported by toRecList(); span is any input supported by toSpan(). The exported data are consistent with the training data input format required by Wattile.
Writes the following files within dir:
<<Name>> Predictors <<Date Range>>.csv: Time series file(s) containing predictor (input) data in CSV format
<<Name>> Targets <<Date Range>>.csv: Time series file(s) containing target (output) data in CSV format
<<Name>> Config.json: Data set configuration metadata in JSON format
In the filenames, <<Name>> is the data set name (which defaults to the name of dir; see Options) and <<Date Range>> is the date range spanned by each file (with format dependent on the export options selected).
Each CSV file contains a Timestamp column followed by one or more numeric value columns corresponding to either the predictor or the target points. Timestamps are exported in ISO 8601 standard format, including time zone offset and time zone name. Only numeric data are supported.
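As a rough illustration of consuming such a file outside SkySpark, the sketch below reads a CSV with a leading Timestamp column and numeric value columns. It assumes each timestamp is laid out as an ISO 8601 date-time with offset, then a space, then the time zone name, and that missing values appear as empty cells. That layout, the sample column names, and the helper names `parse_timestamp` and `read_export` are assumptions for illustration, not part of the extension.

```python
import csv
import io
from datetime import datetime


def parse_timestamp(text):
    """Split '<ISO 8601 with offset> <tz name>' into (datetime, tz name).

    The 'offset, space, name' layout is an assumption about the export.
    """
    iso_part, _, tz_name = text.partition(" ")
    return datetime.fromisoformat(iso_part), tz_name or None


def read_export(stream):
    """Read one exported CSV: a Timestamp column plus numeric columns."""
    reader = csv.reader(stream)
    header = next(reader)
    rows = []
    for record in reader:
        ts, tz_name = parse_timestamp(record[0])
        # Treat empty cells as missing values (assumed convention).
        values = [float(v) if v else None for v in record[1:]]
        rows.append((ts, tz_name, values))
    return header, rows


# Hypothetical sample data; real files are named and shaped by the export.
sample = io.StringIO(
    "Timestamp,Outdoor Temp,Humidity\n"
    "2021-01-01T00:00:00-07:00 Denver,1.5,40\n"
    "2021-01-01T00:15:00-07:00 Denver,,41\n"
)
header, rows = read_export(sample)
```

Keeping the time zone name alongside the parsed offset preserves both pieces of information the export provides.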
The JSON configuration file consists of a single top-level dictionary with five keys:
dates: The data set start and end dates (timestamps)
predictors: A list of predictor variables, including the following metadata for each, as available:
  id: Unique identifier
  dis: Display name; populated via dis()
  column: Column name in CSV file(s)
  unit: Unit
  maxVal: Maximum valid value
  minVal: Minimum valid value
  defVal: Default value; substituted automatically for out-of-range data
targets: A list of target variables, including at minimum the column field and following the same conventions as predictors
files: A list of all CSV files in the data set, each as a dictionary with the following metadata:
  filename: Name of the associated CSV file
  contentType: Either "predictors" or "targets"
  start: Start of the time range spanned by the file
  end: End of the time range spanned by the file
export_options: The opts arguments used for the export (provided for repeatability)
The JSON configuration file uses Haystack Json encoding (Version 3 by default; see Options). Unicode characters are not escaped.
Export behavior can be modified by control options passed via opts:
appendUnits: Boolean; if true then units will be appended to column names (Default = false)
clean: Boolean; see wattileReadHis() (Default = false)
batch: Integer; see wattileReadHis() (Default = none)
defVal: haystack::Number or haystack::NA; see wattileReadHis() (Default = none)
interpolate: Boolean; see wattileReadHis() (Default = false)
interval: haystack::Number; see wattileReadHis() (Default = none)
jsonVersion: sys::Str; either v3 or v4; see Json (Default = v3)
name: sys::Str; data set name to use in output files (Default = name of dir)
preview: Boolean; if true then return the data set configuration JSON without writing any files (Default = false)
removeNA: Boolean; see wattileReadHis() (Default = false)
rollup: Boolean; see wattileReadHis() (Default = false)
splitBy: sys::Str; one of "day", "week", "month", or "year" indicating the interval at which to split the data files (Default = no splitting)
timeout: haystack::Number; see wattileReadHis() (Default = none)
warn: Boolean; see wattileReadHis() (Default = true)
Boolean options may also be passed as markers, with x equivalent to x:true and -x equivalent to x:false.
Options clean, defVal, interpolate, interval, and rollup control data pre-processing prior to export. See wattileReadHis for each option's effect.
Note: interpolation is performed independently for each exported CSV file, so the target and predictor CSV files for the same date range are not guaranteed to share the same set of timestamps.
If history reads fail or time out during export, try increasing timeout and/or decreasing batch. See wattileReadHis for details.
Imports proxy records for Wattile model(s) located in dirs and optionally commits them to the database. dirs may be a single uri, a list of uris, a list of anything that can be parsed to a uri, or a grid with a uri column.
Each imported model is a dict with tags:
predictors_target_config.json; see documentation)
metadata.json; see documentation)
Does not populate optional tags dis, tz, or wattileReadOpts (as this information is not currently included with Wattile models). Returns a list of model records.
Import behavior can be modified by control options passed via opts:
checked: Boolean; throw an error for invalid models (Default = true)
commit: Boolean; if true commits the model records to the database (Default = false)
conflict: Action on conflicts; one of "skip", "overwrite", "duplicate", or "error" (Default = "error")
warn: Boolean; if true log warnings; if false suppress them (Default = true)
Boolean options may also be passed as markers, with x equivalent to x:true and -x equivalent to x:false.
Conflicts with existing wattileModel records (matched by uri) are handled based on the conflict option:
This function performs some basic validation on imported models, including checking directory integrity and verifying the existence of the model's predictor and target points within the SkySpark cluster. Behavior on encountering an invalid model depends on the checked option:
If checked is true, throws an error when an invalid model is encountered
If checked is false, skips invalid models...
  If warn is true, with a warning
  If warn is false, silently
Clean up Wattile prediction event logs to prevent accumulation. Not intended to be called directly; instead use wattilePythonTask().
Initialize Wattile Python session. Not intended to be called directly; instead use wattilePythonTask().
Execute a Wattile model prediction. Not intended to be called directly; instead use wattilePythonTask().
Configure a Wattile model for use with SkySpark. Not intended to be called directly; instead use wattilePythonTask().
Task function to handle Wattile Python Docker container interactions. Intended to be run within a dedicated persistent task by passing action messages:
msg must be a dictionary with an action key (Str) and any action-specific inputs as key-value pairs.
image specifies the Docker image by name (Default = "wattile")
Available actions are "init", "setup", "predict", and "cleanup", documented below. Each action uses a dedicated helper function.
For task record configuration, see the extension documentation.
Initialize a Wattile Python session. This action is only required once and is performed automatically when the session starts.
action: "init"
Calls wattilePythonInit().
Prepare a Wattile model for prediction. Must be run for each model before taking the "predict" action.
action: "setup"
model: A Wattile model record
Calls wattilePythonModelSetup().
Executes a prediction for a Wattile model.
action: "predict"
model: A Wattile model record
span: Time span for prediction
Returns a tidy grid with columns:
Each row of the grid represents the value of a unique ["timestamp", "quantile", "horizon"] tuple from the underlying Wattile output XArray.
The Wattile model record(s) may have a wattileReadOpts tag which provides model-specific options to use when calling wattileReadHis in the context of model prediction.
Calls wattilePythonModelPredict().
Clean up prediction events and log files for a Wattile model.
action: "cleanup"
model: A Wattile model record
For the specified points and time span, read and pre-process historical data for use with a Wattile model. points is any input supported by toRecList(); span is any input supported by toSpan(). Returns a history grid.
Data pre-processing is controlled by the following options, passed via opts:
clean: Boolean; if true then data will be range-cleaned prior to export by removing values outside the range defined by each point's minVal and maxVal tags (Default = false)
batch: Integer; optional batch size for XQuery; see below (Default = none)
defVal: haystack::Number or haystack::NA; governs range cleaning behavior; see below (Default = none)
interpolate: Boolean; if true then data will be interpolated (Default = false)
interval: haystack::Number; optional interval at which to interpolate and/or roll up the data (Default = none)
removeNA: Boolean; if true then NA values will be removed (Default = false)
rollup: Boolean; if true then a history rollup will be applied (Default = false)
timeout: haystack::Number; optional timeout for XQuery; see below (Default = none)
warn: Boolean; if true log warnings; if false suppress them (Default = true)
Boolean options may also be passed as markers, with x equivalent to x:true and -x equivalent to x:false. If rollup is true, then interval must also be specified.
If clean is true, values outside the range [minVal, maxVal], as defined by each point's minVal and maxVal tags, are removed from the data set. (Missing minVal and maxVal tags imply permissible minimum and maximum values of negative and positive infinity, respectively.)
If the defVal option is provided, values removed are replaced with defVal; otherwise, they are replaced with Null. Optionally, defVal may be passed as a tag on each point; defVal point tags override the global option on a per-point basis.
Range cleaning is performed prior to NA removal, interpolation, and/or rollup.
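The range-cleaning rules above can be sketched in a few lines. This is an illustration of the described behavior, not the extension's implementation: `range_clean` is a hypothetical helper, plain dictionaries stand in for point tags and opts, Python's None stands in for Null, and treating the range bounds as inclusive is an assumption.

```python
import math


def range_clean(values, point_tags, opts):
    """Replace out-of-range values for one point (illustrative sketch)."""
    # Missing minVal/maxVal tags imply an unbounded range.
    lo = point_tags.get("minVal", -math.inf)
    hi = point_tags.get("maxVal", math.inf)
    # A defVal tag on the point overrides the global defVal option;
    # with neither present, removed values become None (Null).
    def_val = point_tags.get("defVal", opts.get("defVal"))
    return [v if v is None or lo <= v <= hi else def_val for v in values]


tags = {"minVal": 0, "maxVal": 100}
cleaned_null = range_clean([5, -3, 120, 50], tags, {})
cleaned_global = range_clean([5, -3, 120, 50], tags, {"defVal": 0})
cleaned_tagged = range_clean([5, -3, 120, 50], dict(tags, defVal=1), {"defVal": 0})
```

Here the second and third values fall outside [0, 100], so they are replaced by None, by the global default, or by the point's own default, depending on which is defined.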
If removeNA is true, NA values will be removed from the data set, including any NA values generated from defVal during range cleaning.
NA removal is performed prior to interpolation and/or rollup.
If interpolate is true, the entire history grid is interpolated via hisInterpolate(), following SkySpark's normal interpolation rules. If the interval option is also specified, then additional interpolation is performed selectively to ensure each interval of data contains at least one value for each input point. Interpolation is performed prior to rollup.
If the rollup option is provided, data are rolled up at the specified interval using hisRollupAuto().
Large or computationally-intensive history reads from remote projects can cause unexpected XQuery behavior, such as silent timeouts or keep-alive resets of the Arcbeam websocket. When this happens, you may see an error message like this:
Expected X points; XQuery returned Y.
Two options can help:
timeout is passed through to xq(); increase it to allow each XQuery more time to complete
batch sets the maximum number of points read per XQuery; decrease it to spread out the history reads over more XQueries
Note: smaller batch size reduces the chance of individual XQuery failures but increases both the total number of XQueries and the total function execution time.
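The trade-off in that note follows from how batching partitions the point list. The sketch below is only a model of the idea: `make_batches` is a hypothetical helper, and the assumption that an unset batch means "read all points in one query" is inferred from the option's default of none.

```python
def make_batches(points, batch=None):
    """Split points into groups of at most `batch` items (illustrative)."""
    points = list(points)
    if batch is None:
        # Assumed meaning of "Default = none": a single query for all points.
        return [points]
    if batch < 1:
        raise ValueError("batch must be a positive integer")
    return [points[i:i + batch] for i in range(0, len(points), batch)]


point_ids = [f"p{i}" for i in range(10)]  # hypothetical point identifiers
one_query = make_batches(point_ids)
small_batches = make_batches(point_ids, batch=4)
batch_sizes = [len(group) for group in small_batches]
```

With ten points and a batch of 4, three queries are issued instead of one: each is lighter, but there are more round trips in total.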
Return the record Dict for ref. If the record does not exist anywhere in the cluster, throw an error or return null based on the checked flag.
Resolve refStr into a valid haystack::Ref. If the record does not exist anywhere in the cluster, throw an error or return null based on the checked flag.
Handles any of the following:
@
"r:" prefix
Syncs Wattile prediction history for points and span, using the specified Wattile Python task.
points may be anything accepted by toRecIdList.
span may be anything accepted by toSpan, or Null to sync all history after each point's hisEnd.
Each point must define valid tags wattileModelRef and wattileQuantile. If run within a task, will report task progress.
The default sync behavior is:
span.
horizon = 0 prediction data for each point based on its wattileQuantile.
hisEnd.
This behavior can be modified via opts; see below.
Sync behavior can be modified by control options passed via opts:
delay: Number; delay from the present for syncing predictions (Default = 0s)
forecast: Boolean; also write forecast data (Default = false)
forecastOnly: Boolean; only write forecast data (Default = false)
hotPeriod: Number; optional hot period for syncing predictions (Default = None)
limit: Number; limits the length of time span to sync when span is Null (Default = None)
overwrite: Boolean; allows existing history to be overwritten (ignores hisEnd) (Default = false)
progress: Boolean; report task progress via taskProgress (Default = true)
Boolean options may also be passed as markers, with x equivalent to x:true and -x equivalent to x:false. If forecastOnly = true, the forecast option is ignored. To avoid warning spam in the logs, the overwrite and hotPeriod options also set the hisWrite noWarn flag.
Model-specific sync options may also be defined via the wattileSyncOpts tag on a wattileModel record. Model-specific sync options apply to all points associated with that model and override values provided in opts.
If either the forecast or the forecastOnly option is true, then forecasts (data for horizon > 0 in Wattile results) are also written to each point. Only the forecast from the most recent Wattile prediction in the results (most recent value of the timestamp column) is written. Forecasts are always written transiently by setting the hisWrite forecast flag.
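The forecast selection rule can be illustrated on a tidy result set. In this sketch each row is a dictionary keyed by timestamp, quantile, and horizon, plus a value; the `value` key, the integer horizons, and the helper name `select_forecast` are assumptions for illustration, not the extension's actual data model.

```python
from datetime import datetime


def select_forecast(rows, quantile):
    """Keep horizon > 0 rows from the most recent prediction only."""
    matching = [r for r in rows if r["quantile"] == quantile]
    if not matching:
        return []
    # "Most recent prediction" = latest value of the timestamp column.
    latest = max(r["timestamp"] for r in matching)
    return [r for r in matching
            if r["timestamp"] == latest and r["horizon"] > 0]


t1 = datetime(2021, 1, 1, 0, 0)
t2 = datetime(2021, 1, 1, 1, 0)
rows = [
    {"timestamp": t1, "quantile": 0.5, "horizon": 0, "value": 10.0},
    {"timestamp": t1, "quantile": 0.5, "horizon": 1, "value": 11.0},
    {"timestamp": t2, "quantile": 0.5, "horizon": 0, "value": 12.0},
    {"timestamp": t2, "quantile": 0.5, "horizon": 1, "value": 13.0},
    {"timestamp": t2, "quantile": 0.5, "horizon": 2, "value": 14.0},
    {"timestamp": t2, "quantile": 0.9, "horizon": 1, "value": 20.0},
]
forecast = select_forecast(rows, 0.5)
forecast_values = [r["value"] for r in forecast]
```

The horizon 1 row from the earlier prediction is dropped because a newer prediction exists, and horizon 0 rows are excluded because they are not forecasts.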
If span is Null, then the time span to sync is calculated for each set of points grouped by wattileModelRef. For each group:
hisEnd among points in the group or the beginning of the hot period, whichever is earlier
delay or the start of span plus limit (when specified), whichever is earlier
Note that any points without existing history (hisEnd equal to Null) are ignored when using a calculated span. When this function is called from a task or job to keep predictions up-to-date, best practice is to also specify limit to avoid extremely large sync batches.
The delay option delays syncing of predictions in "real time" to ensure complete input data, specified as a duration measuring backwards in time from the present (as returned by now). The minimum recommended delay is the amount of time needed for predictor data to stabilize, e.g. for predictor point histories to sync and be written to disk. This helps ensure that the predictions are computed from complete and valid data.
If span is specified (non-Null), the delay option is ignored.
The hotPeriod option functions similarly to the hot period for rules: during the hot period the synced predictions are continuously refreshed. This is accomplished by clearing point history within the hot period immediately prior to writing the new predictions. Like delay, hotPeriod is specified as a duration measuring backwards in time from the present.
Setting hotPeriod greater than limit will prevent predictions from being fully synced through the present. If span is specified (non-Null), the hotPeriod option is ignored.
Visualize Wattile model prediction history for span. If available, history from the model's target point is also shown. The target point is resolved via the model's wattileTargetRef or can be specified by the user via target.
Appearance can be modified by control options passed via opts:
interval: Time interval for hisRollupAuto() (Default = none)
lineColor: Target history line color (Default = "#34495e")
lineWidth: Target history line (stroke) width (Default = 1.5)
predictionGradient: List of colors used to create the prediction shading (Default = OrRd from Color Brewer)
predictionWidth: Prediction history line (stroke) width (Default = 0.5)
The first color in predictionGradient corresponds to the 0th percentile prediction and the last color corresponds to the 50th percentile (median) prediction. The gradient is interpolated in reverse for the 50th to 100th percentile.
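One way to picture the mirrored gradient is as a position along the color list that rises from 0 at the 0th percentile to 1 at the median and falls back to 0 at the 100th percentile. The sketch below encodes that mapping; the helper names and placeholder color labels are hypothetical, and the nearest-stop lookup is a simplification standing in for true color interpolation.

```python
def gradient_position(q):
    """Map quantile q in [0, 1] to a gradient position in [0, 1].

    0 -> first color, 0.5 -> last color, then mirrored back toward
    the first color as q approaches 1.
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("quantile must be between 0 and 1")
    return 1.0 - abs(2.0 * q - 1.0)


def pick_color(gradient, q):
    """Pick the nearest gradient stop (no blending between stops)."""
    index = round(gradient_position(q) * (len(gradient) - 1))
    return gradient[index]


# Placeholder labels, not the actual default palette.
gradient = ["c0", "c1", "c2", "c3", "c4"]
colors = [pick_color(gradient, q) for q in (0.0, 0.25, 0.5, 0.75, 1.0)]
```

Symmetric quantiles such as 0.25 and 0.75 land on the same color, which is what makes the shading read as bands around the median.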
Visualize the goodness of fit of a Wattile model during span. The target is resolved via the model's wattileTargetRef or can be specified by the user via target.
The output of this function is a grid intended for viewing as a scatterplot:
Each point (q, r) represents one quantile predicted by the Wattile model
q is the expected (predicted) quantile
r is the actual (observed) quantile
The result is similar to the quantile-quantile plot (QQ plot) used to compare probability distributions.
To generate the plot, the function:
span
q) vs. predicted (r) quantile as an output point on the plot.
Only the quantiles associated with the available prediction points (via the wattileQuantile tag) are included in the analysis.
For each plot point (q, r):
If r = q, the model accurately predicted the target values for quantile q within span
If r > q, the model overpredicted the target values for quantile q within span
If r < q, the model underpredicted the target values for quantile q within span
Ideally, all points on the plot should lie on a 45° diagonal line (slope 1, intercept 0). This line is plotted in red for reference.
Caution! This diagnostic plot only shows whether, overall, the predictions for specific quantiles are biased. It does not show how well the prediction trends track the target as it changes with time or with respect to the predictor variables.
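To make the (q, r) interpretation concrete, the sketch below estimates the observed quantile r as the fraction of target values that fall at or below the corresponding quantile-q prediction. This is one plausible reading of "actual (observed) quantile" that is consistent with the over/underprediction rules above; the helper names, the use of <= rather than <, and the assumption that the two series are already aligned by timestamp are all illustrative choices, not the extension's documented method.

```python
def observed_quantile(actual, predicted):
    """Fraction of actual values at or below the paired prediction."""
    pairs = [(a, p) for a, p in zip(actual, predicted)
             if a is not None and p is not None]
    if not pairs:
        return None
    below = sum(1 for a, p in pairs if a <= p)
    return below / len(pairs)


def qq_points(actual, predictions_by_quantile):
    """Build (q, r) plot points from per-quantile prediction series."""
    return [(q, observed_quantile(actual, preds))
            for q, preds in sorted(predictions_by_quantile.items())]


actual = [1.0, 2.0, 3.0, 4.0]
predictions = {
    0.5: [2.0, 2.0, 2.0, 2.0],      # half of the actuals fall at or below
    0.75: [10.0, 10.0, 10.0, 10.0],  # every actual falls below
}
points = qq_points(actual, predictions)
```

In this toy data the 0.5 quantile sits on the diagonal (r = q), while the 0.75 quantile gives r = 1.0 > q, the overprediction case.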