bfitRExt 1.5.3

BuildingFit R Extension

Overview

bfitRExt queries the statistical computing environment R from within the Folio workspace using SkySpark data. It passes historical SkySpark data to R and runs arbitrary R queries against that data. The response from R is converted into Axon data formats so that it is easy to traverse. Because any R query can be passed, R packages can be loaded and external data sources queried, which extends what the extension can do.

Knowledge of the R programming language syntax is required. Returning graphics from R is not currently supported.

R is a widely used, free software language and environment for statistical computing and graphics. Read more about it here.

Setup

R Setup

To use the R extension, you must be running a local R server alongside SkySpark.

You can install R by downloading it from one of the mirrors listed here. R is also available in the software repositories of many Linux distributions.

Once R is installed, you should install the "Rserve" package. More information on this package is available here. To install this package, you can run install.packages("Rserve") from the R command-line.

To start the R server, follow the instructions here. These are duplicated below.

  • Open the R command-line.
  • Load the Rserve package by executing library(Rserve).
  • Then start the server by running the command Rserve().
  • Text should be printed to the console indicating that Rserve is running.
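
The steps above can be sketched as a short R session. This is a minimal sketch, assuming Rserve is installed from CRAN; the helper name start_rserve is hypothetical and not part of bfitRExt, and the "--no-save" argument is an optional choice rather than something this extension requires.

```r
# Hypothetical helper (not part of bfitRExt): install Rserve if needed, then start it.
start_rserve <- function() {
  # Install the package only when it is missing.
  if (!requireNamespace("Rserve", quietly = TRUE)) {
    install.packages("Rserve")
  }
  # Start the server with Rserve's defaults. "--no-save" (assumption: optional)
  # stops the spawned R process from saving its workspace on exit.
  Rserve::Rserve(args = "--no-save")
}

# Run this from the R command-line to start the server:
# start_rserve()
```

The call is left commented out so that loading the helper does not start a server by accident.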

If the Rserve window is in the way or tends to be closed by other users, Rserve can be run as a daemon. It then runs as a background process, visible in the task manager, without the R command window having to stay open. On Windows, the easiest way to do this is to run Rserve from the Rgui application rather than from the R command-line. There are many other ways to do this on both Windows and Linux, including setting up a systemd service. Please contact BuildingFit if you'd like additional information.

SkySpark Setup

In order for SkySpark to communicate with R, you must add the Java dependencies to SkySpark. To do this, download the REngine.jar and RserveEngine.jar files, available here. These must be put into the /lib/java/ext directory of the SkySpark installation.

Additionally, enable this pod via the Settings app on the project of interest.

Test The Connection

If the steps listed above have been followed, SkySpark should be able to connect to the R server. To recap, these are the requirements:

  • Rserve is running locally.
  • REngine.jar and RserveEngine.jar files are in the /lib/java/ext directory of the SkySpark installation.
  • bfitRExt is enabled.

To test the connection, run rConnectionTest() in Folio. If successful, it will return the version of R being used.
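
As a cross-check, the version string can also be read directly in the R console and compared with what rConnectionTest() returns. This is a sketch only; the exact format that rConnectionTest() returns is not specified here, so the two may differ in layout.

```r
# Read the version string of the R installation this session runs on.
r_version <- R.version.string
print(r_version)
```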

Usage

Limitations

To use this extension, you must be familiar with the R language. There are plenty of great resources online if you want to learn it.

The rQuery function returns the values of the R objects directly, which is not necessarily what would be printed in an R console. This makes data extraction much easier, because the data is stored in objects rather than in a string the user has to parse, but in some cases it requires some familiarity with the R objects being returned. If you would like to see what the R console would print for a given query, use rQueryOutput rather than rQuery.
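
The difference between an object's values and its printed form can be seen in plain R. The sketch below is an analogy only, under the assumption that rQuery corresponds roughly to the object itself and rQueryOutput to the console text; it does not show how the extension is implemented, and the sample values are made up.

```r
# Made-up sample values standing in for a history column.
v0 <- c(10.5, 12.0, 11.5, 13.0)

# The object itself: a named numeric vector whose entries can be read by name.
s <- summary(v0)
mean_value <- s[["Mean"]]

# The console text: character lines that would need parsing to extract a number.
printed <- capture.output(print(s))

print(mean_value)
print(printed)
```

Reading s[["Mean"]] needs no string handling, whereas the captured text is formatted for people rather than for further computation.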

Security

The Rserve documentation lists some security considerations at the bottom of its Configuration section here. This extension only supports running Rserve locally to the SkySpark installation, so the main point is: DO NOT RUN Rserve AS ROOT!

Rserve runs with the permissions of whichever user started the process, and this level of permission is exposed to SkySpark through the Folio command-line. Since root users have full permission over the system, avoid giving SkySpark and its users this level of control over the server. Furthermore, only allow trusted users access to the Folio App. Consider creating a separate user account on the server specifically for running Rserve, or using security software like SELinux, to further limit what the Rserve process can access.
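
A simple guard in the R session that starts Rserve can help catch this mistake. This is a minimal sketch, assuming a Unix-like server where the administrative account is named root; the helper name running_as_root is hypothetical, and the check does not replace proper account and permission setup.

```r
# Hypothetical helper: TRUE when the current R session runs as root.
running_as_root <- function() {
  info <- Sys.info()
  # Sys.info() can be NULL on platforms where it is not implemented.
  if (is.null(info)) return(FALSE)
  isTRUE(info[["effective_user"]] == "root")
}

# Report the result; only start Rserve when the status is "ok".
status <- if (running_as_root()) {
  "unsafe: running as root"
} else {
  "ok: not running as root"
}
print(status)
```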

Examples

  • Find the average of an outside air temperature point across last week:
    • read(point and outside and air and temp).hisRead(lastWeek()).rQuery("mean(data\$v0)")
    • Compare this to averaging the column within SkySpark; the result should be the same:
      • read(point and outside and air and temp).hisRead(lastWeek()).foldCol("v0",avg)
  • Find the coefficients of a regression between outside air temperature and mixed air temperature for last week:
    • [read(point and mixed and air and temp),read(point and outside and air and temp)].hisRead(lastWeek()).rQuery("lm(v0~v1, data=data)\$coefficients")
  • See the traditional model summary output of R in Str format:
    • [read(point and mixed and air and temp),read(point and outside and air and temp)].hisRead(lastWeek()).rQueryOutput("summary(lm(v0~v1, data=data))")
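
The R side of these queries can be tried in a plain R session before running them through Folio. The sketch below assumes, as the examples above do, that the history data reaches R as a data frame named data with value columns v0 and v1; the sample values are made up. The \$ in the Axon strings appears to be an escaped $, so R itself would see data$v0.

```r
# Made-up stand-in for the data frame the extension passes to R.
# Assumption: v0 holds the first point's values and v1 the second's.
data <- data.frame(
  v0 = c(60.1, 61.4, 63.0, 64.2, 65.9),  # e.g. mixed air temperature
  v1 = c(50.0, 52.5, 55.0, 57.5, 60.0)   # e.g. outside air temperature
)

# Average of one column, as in the first example.
avg_v0 <- mean(data$v0)

# Regression coefficients, as in the second example.
coefs <- lm(v0 ~ v1, data = data)$coefficients

# Console-style model summary, as in the third example.
summary_text <- capture.output(print(summary(lm(v0 ~ v1, data = data))))

print(avg_v0)
print(coefs)
cat(summary_text, sep = "\n")
```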

Licensing

This pod is licensed for purchase through StackHub. The purchased license is good for one project for one year. To run on multiple projects, multiple licenses must be purchased.
