nrelUtilityExt 2.1.0 - Missing Axon utility functions

nrelUtilityExt

Registered StackHub users may elect to receive email notifications whenever a new package version is released.

There are 3 watchers.

v2.1.0

Overview

The NREL Utility extension provides a variety of missing utility functions for the Axon programming language. Most functions are overridable. See the nrelUtilityExt Github repository for important release notes and a change log. Please report issues here.

Set Operations

The union, intersect, and setDiff functions provide set operations for lists and dicts modeled on the equivalent functions in the R programming language. They complement the core function merge.

SQL-Style Grid Joins

The NREL Utility extension includes a set of functions that implement SQL-style joins for grids, modeled closely on the join functions included in the dplyr package for R. Each join function has the general form *Join(a, b, by, opts), in which a and b are grids to join, by specifies the column(s) to use to join the grids, and opts is a dict of control options.

Join Types

Six types of joins are available:

innerJoin: Returns all rows from a where there are matching values in b and all columns from a and b
leftJoin: Returns all rows from a and all columns from a and b
rightJoin: Returns all rows from b and all columns from a and b
fullJoin: Returns all rows and columns from both a and b
semiJoin: Returns all rows from a where there are matching values in b, keeping columns from a only
antiJoin: Returns all rows from a where there are not matching values in b, keeping columns from a only

Parameters

The *Join functions accept the following parameters:

a, b: Grids to join
by: Columns to join by. If by is null (the default), then *Join will perform a natural join, using columns with names common to both a and b. by may also be:
- A string specifying a single common column name
- A list of strings specifying multiple common column names
- A dict in the form {x:y, ...} where each key name x is a column name in a and each key value y (which must be a string) is a corresponding column name in b
opts: A dict of control options; see Options

Options

The *Join functions support the following options:

keep: String specifying how to handle conflicting values in non-joined duplicate columns in a and b:
- "a": Values from a overwrite values from b
- "b": Values from b overwrite values from a
- "both": Keeps values from both a and b, disambiguated by suffix
- "neither" or "drop": Drops keys with conflicting values (replaces with null)
- "na": Replaces conflicting values with haystack::NA
The default is "both". This option has no effect for semiJoin and antiJoin.
suffix: When keep == "both", specifies suffixes to use to disambiguate non-joined duplicate columns in a and b. May be:
- A list of length 2, containing strings corresponding to suffixes for columns in a and b
- A dict with key names a and b, corresponding to suffixes for columns in a and b
Suffixes are appended using an underscore as a separator. The default suffixes are "a" and "b". This option has no effect for semiJoin and antiJoin.

Notes

If by is not null, there is a potential for non-joined duplicate columns between grids a and b. In the joined grid, some rows may have conflicting values in these columns, that is, values for the same key (column) that differ between a and b. The keep option governs what happens when there is such a conflict. If instead either a or b is simply missing the duplicate key, then there is no conflict: the value from the other grid is included in the output.
Output column order is always all columns of a followed by any new columns from b. Output row order depends on the type of join:
- Left, inner, and full joins search and return rows of a first followed by rows of b
- Right joins search and return rows of b first followed by rows of a
Duplicated rows are discarded from the output.
Merging grid and column meta is not currently supported.

Examples

Consider two grids a and b:

a:

name age hometown state
---- --- -------- -----
Bob   63 Chicago  IL
Jane  41 New York NY
Nico  25 Portland ME

b:

hometown state population
-------- ----- ----------
New York NY     8,175,133
Denver   CO       600,158
Portland OR       583,776
Portland ME        66,093

For these example grids, natural joins will use "hometown" and "state" as the by columns.

Regular Joins

innerJoin(a, b):

name age hometown state population
---- --- -------- ----- ----------
Jane  41 New York NY     8,175,133
Nico  25 Portland ME        66,093

leftJoin(a, b):

name age hometown state population
---- --- -------- ----- ----------
Bob   63 Chicago  IL
Jane  41 New York NY     8,175,133
Nico  25 Portland ME        66,093

rightJoin(a, b):

name age hometown state population
---- --- -------- ----- ----------
Jane  41 New York NY     8,175,133
         Denver   CO       600,158
         Portland OR       583,776
Nico  25 Portland ME        66,093

fullJoin(a, b):

name age hometown state population
---- --- -------- ----- ----------
Bob   63 Chicago  IL
Jane  41 New York NY     8,175,133
Nico  25 Portland ME        66,093
         Denver   CO       600,158
         Portland OR       583,776

Partial Joins

semiJoin(a, b):

name age hometown state
---- --- -------- -----
Jane  41 New York NY   
Nico  25 Portland ME

antiJoin(a, b):

name age hometown state
---- --- -------- -----
Bob   63 Chicago  IL

Modifying Join Columns

Things get more interesting if we omit "state" from the join columns.

innerJoin(a, b, "hometown"):

name age hometown state_a state_b population
---- --- -------- ------- ------- ----------
Jane  41 New York NY      NY       8,175,133
Nico  25 Portland ME      OR         583,776
Nico  25 Portland ME      ME          66,093

By default, keep == "both"

innerJoin(a, b, "hometown", {suffix:["person","place"]}):

name age hometown state_person state_place population
---- --- -------- ------------ ----------- ----------
Jane  41 New York NY           NY           8,175,133
Nico  25 Portland ME           OR             583,776
Nico  25 Portland ME           ME              66,093

innerJoin(a, b, "hometown", {keep:"a"}):

name age hometown state population
---- --- -------- ----- ----------
Jane  41 New York NY     8,175,133
Nico  25 Portland ME       583,776
Nico  25 Portland ME        66,093

In practice, this particular case isn't very useful because the state-population match is now wrong.

Parsing Functions

Additional text parsing functions to complement the core parse* functions:

These now use regex and/or wrap ioReadZinc, so they should be safe to use on arbitrary input text.

Date Range Functions

Functions that return DateSpans corresponding to fiscal years:

Functions that return lists of standard time intervals (days, weeks, etc.) that cover a user-specified time span:

Function Import and Export

Functions to facilitate moving functions among projects:

importFunctions imports functions from a uri, record list, or text into the current project
exportFunctions exports functions from the current project to file

Options are available to filter, sort, and add or remove tags to imported and exported functions.

Other Useful Stuff

A brief sampling of other useful functions:

nmbe and cvrmse calculate the normalized mean bias error (NBME) and coefficient of variation of the root-mean-square error (CV[RMSE]) per ASHRAE Guideline 14
hasAny and hasAll are extensions of has to multiple keys
isNA complements isNull, isNaN, and the various other core is* funtions
removeNA, removeNaN, removeNull, and removeVal remove NAs, NaNs, nulls, or specific values from a list or dict
tomorrow complements today and yesterday

For a complete list, see the function listing.

Published by NREL

Packages by NREL

Free packages

v1.0.1

nrelCsvExt

Import and export functions for CSV data

by NREL

v0.3.0

nrelWattileExt

Interface extension for the Wattile Python package

by NREL

v1.4.3

psychrometricsExt

Psychrometric Axon functions

by NREL

Package details

Version	2.1.0
License	BSD-3-Clause-Clear
Build date	2 years ago on 14th Nov 2022
File name	`nrelUtilityExt.pod`
File size	`15.05 kB`
MD5	`8b40b466b302a6a99a8f81c4c89ad6c6`
SHA1	`367e0eac4c648eec9a8d44fa08190838edf041e0`
Published by NREL Download now Also available via SkyArc Install Manager