bfitRExt 1.6.3

BuildingFit R Extension

Overview

bfitRExt lets you query the statistical computing environment R from within Folio, using SkySpark data. Historical SkySpark data can be passed to R, and arbitrary R queries can be run against it. The response from R is converted into Axon data types so that it is easy to traverse. Because any R query can be passed, R packages can be loaded and external data sources queried, which extends what the extension can do.

Knowledge of the R programming language syntax is required. Returning graphics from R is not currently supported.

R is a widely used, free software language and environment for statistical computing and graphics. Read more about it here.

Setup

R Setup

To use the R extension, you must be running a local R server alongside SkySpark.

You can install R by downloading it from one of the mirrors listed here. R is also available in the software repositories of many Linux distributions.

Once R is installed, you should install the "Rserve" package. More information on this package is available here. To install this package, you can run install.packages("Rserve") from the R command-line.

To start the R server, follow the instructions here. These are duplicated below.

  • Open the R command-line.
  • Load the Rserve package by executing library(Rserve).
  • Then start the server by running the command Rserve().
  • Text should be printed to the console indicating that Rserve is running.
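
The steps above can be collected into a short R script. This is only a minimal sketch: start_rserve is a hypothetical helper name (it is not part of Rserve or bfitRExt), and it assumes the default Rserve settings are acceptable.

```r
# Hypothetical helper: installs Rserve if it is missing, then starts the server.
start_rserve <- function() {
  # Install the Rserve package only when it is not already available.
  if (!requireNamespace("Rserve", quietly = TRUE)) {
    install.packages("Rserve")
  }
  library(Rserve)  # load the package
  Rserve()         # start the server with its default settings
}

# Run this from the R command-line to start the server:
# start_rserve()
```

Wrapping the commands in a function keeps the script safe to source; nothing is installed or started until start_rserve() is called.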

If the Rserve window gets in the way, or other users keep closing it, Rserve can be run as a daemon. It then runs as a background process, visible in the task manager, without the R command window having to stay open. The easiest way to do this on Windows is to run Rserve from the Rgui application rather than from the R command-line. There are many other ways to do this on both Windows and Linux, including setting up a systemd service. Please contact BuildingFit if you'd like additional information.

SkySpark Setup

In order for SkySpark to communicate with R, you must add the Java dependencies to SkySpark. To do this, download the REngine.jar and RserveEngine.jar files, available here. These must be put into the /lib/java/ext directory of the SkySpark installation.

Additionally, enable this pod via the Settings app on the project of interest.

Test The Connection

If the steps listed above have been followed, SkySpark should be able to connect to the R server. To recap, these are the requirements:

  • Rserve is running locally.
  • REngine.jar and RserveEngine.jar files are in the /lib/java/ext directory of the SkySpark installation.
  • bfitRExt is enabled.

To test the connection, run rConnectionTest() in Folio. If successful, it will return the version of R being used.

Considerations

Limitations

To use this extension, you must be familiar with the R language. There are plenty of great resources online if you want to learn it.

This extension cannot return visual graphs from R at this time.

By default, the rqExecute function returns the values of the R objects themselves, not what would be printed in an R console. This makes data extraction much easier, because the data arrives as objects rather than as a string that has to be parsed, but in some cases it requires familiarity with the R objects being returned. To see what the R console would print for a given query, pass true for the strOutput parameter of rqExecute.
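
The difference between an object's values and its console output can be seen in plain R. The sketch below only illustrates the R side of that distinction; how the extension converts either form into Axon is not shown here, and the variable names are illustrative.

```r
# A small R object: the summary of a numeric vector.
s <- summary(c(2, 4, 6, 8))

# As an object, individual values can be read by name.
mean_value <- s[["Mean"]]

# As console output, the same information is only lines of text.
printed <- capture.output(print(s))
```

Here mean_value is a number that can be used directly, while printed is a character vector that would have to be parsed to recover the same value.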

Security

The Rserve documentation lists several security considerations at the bottom of its "Configuration" section, here. This extension only supports running Rserve on the same machine as the SkySpark installation, so the main point is: DO NOT RUN Rserve AS ROOT!

Rserve runs with the permissions of whichever user started the process, and those permissions are exposed to SkySpark users through the Folio command-line. Because root has full control over the system, never start the Rserve process while logged in as root. Consider the following measures to further tighten security:

  • Make use of SkySpark's user app access permissions. Limit Folio app access to only those users who need it.
  • Create a user on the server specifically for running Rserve. Limit this user's file permissions drastically or completely.
  • Investigate security software like SELinux to further limit what Rserve can see.
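
As one extra safeguard on Unix-like systems, a start-up script could refuse to continue when the R session belongs to root. This is a sketch under assumptions: is_root_user and check_not_root are hypothetical helpers, not part of Rserve or bfitRExt, and the check relies on the account name reported by Sys.info().

```r
# Hypothetical helper: TRUE when the given account name is root.
is_root_user <- function(user = Sys.info()[["user"]]) {
  identical(user, "root")
}

# Hypothetical guard: stops with an error for root, otherwise returns TRUE.
check_not_root <- function(user = Sys.info()[["user"]]) {
  if (is_root_user(user)) {
    stop("Refusing to start Rserve as root; use a restricted account.")
  }
  invisible(TRUE)
}

# Call the guard before starting the server:
# check_not_root()
```

This does not replace the measures listed above; it only catches the most obvious mistake before Rserve is started.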

Examples

These are just a few simple examples to display the general usage and the overall capabilities of the extension. The code can be run in the Folio app.

Evaluate R commands

Perform 3+5 in R and return the result:

rq().rqEval("3 + 5").rqExecute()

Get the version of R being run:

rq().rqEval("R.version.string").rqExecute()

Pass SkySpark data

Pass a list of [1,2,3,4,5] into R and return it to Folio:

rq().rqPassData([1,2,3,4,5].toGrid()).rqEval("data").rqExecute()

Note that the input data in the previous example is named data by default. Name it number.list instead:

rq().rqPassData([1,2,3,4,5].toGrid(), "number.list").rqEval("number.list").rqExecute()

Input two vectors and add them together in R, returning the result to Folio:

rq().rqPassData([1,2,3,4,5].toGrid(),"d1").rqPassData([2,4,6,8,10].toGrid(), "d2").rqEval("d1+d2").rqExecute()

Pass a history grid into R and return it to Folio:

rq().rqPassData(read(weatherPoint).hisRead(yesterday())).rqEval("data").rqExecute()

Pass a history grid into R and return the v0 column to Folio:

rq().rqPassData(read(weatherPoint).hisRead(yesterday())).rqEval("data\$v0").rqExecute()

Simple average

Find the average of an outside air temperature point across last week (this should give the same result as doing foldCol("v0",avg) in Axon):

rq().rqPassData(read(point and outside and air and temp).hisRead(lastWeek())).rqEval("mean(data\$v0)").rqExecute()
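
Before sending an expression through rqEval, it can help to try it in a local R session against stand-in data. The sketch below assumes that a single-point history arrives in R as a data frame with a ts column and a v0 column; that layout is inferred from the data\$v0 usage above rather than confirmed, and the values are made up.

```r
# Stand-in for a one-point history grid; the layout (ts plus v0) is an assumption.
data <- data.frame(
  ts = as.POSIXct("2024-01-01 00:00:00", tz = "UTC") + 3600 * 0:3,
  v0 = c(10, 12, 14, 16)
)

# The same expression that the example passes to rqEval.
avg <- mean(data$v0)
```

Inside a plain R session the dollar sign needs no backslash; the escape is only required within the Axon string.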

Linear Regression

Find the coefficients of a linear regression of mixed air temperature (v0) on outside air temperature (v1) for last week:

rq().rqPassData([read(point and mixed and air and temp),read(point and outside and air and temp)].hisRead(lastWeek())).rqEval("lm(v0~v1, data=data)\$coefficients").rqExecute()

Return R's usual printed model summary as a Str by passing true for strOutput:

rq().rqPassData([read(point and mixed and air and temp),read(point and outside and air and temp)].hisRead(lastWeek())).rqEval("summary(lm(v0~v1, data=data))").rqExecute(true)
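
To see what the coefficients object looks like, the regression expression can be run locally on stand-in data. In this sketch the data frame and its exact linear relationship are invented for illustration; only the lm expression is taken from the example above.

```r
# Stand-in data: v0 depends exactly linearly on v1 (v0 = 2 + 3 * v1).
v1 <- 1:10
data <- data.frame(v0 = 2 + 3 * v1, v1 = v1)

# Same expression as in the example; the result is a named numeric vector.
coefs <- lm(v0 ~ v1, data = data)$coefficients

# Individual coefficients can be read by name.
intercept <- coefs[["(Intercept)"]]
slope <- coefs[["v1"]]
```

The names "(Intercept)" and "v1" are the ones R assigns for this formula.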

Random Distribution Sampling

Get 5 random numbers from a Gaussian distribution with a mean of 0 and a standard deviation of 1:

rq().rqEval("rnorm(5,0,1)").rqExecute()

Clustering

Perform 2-center K-means clustering on a weather history and return the result to Folio:

rq().rqPassData(read(weatherPoint).hisRead(yesterday())).rqEval("kmeans(data\$v0, 2)").rqExecute()
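
The object returned by kmeans has several components, so it can be worth checking locally which ones are needed. The following sketch uses made-up values in place of the weather history; the seed is set only to make the random starting centers repeatable.

```r
set.seed(1)  # kmeans picks random starting centers

# Stand-in values forming two well-separated groups.
x <- c(1, 2, 3, 101, 102, 103)

# Same call shape as in the example: 2-center K-means on a numeric vector.
fit <- kmeans(x, 2)

# Components of the result: one center per cluster, one label per input value.
centers <- fit$centers
clusters <- fit$cluster
```

If only the centers are of interest, an expression such as kmeans(data\$v0, 2)\$centers could be passed to rqEval instead of the whole object.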

Exponential Smoothing & Prediction

Given a month of weather data, predict the next two days using Holt-Winters filtering:

rq().rqPassData(read(weatherPoint).hisRead(lastMonth()).hisRollup(avg, 1hr)).rqEval("hw <- HoltWinters(ts(data\$v0, start=c(1,1), freq=24), seasonal=\"mult\")").rqEval("predict(hw, n.ahead=48)").rqExecute(false)
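
The same two-step pattern (fit, then predict) can be tried locally in R. As a stand-in for the hourly weather data, this sketch uses AirPassengers, a monthly series that ships with R, so the seasonal period is 12 rather than 24; forecasting 24 steps ahead therefore covers two seasonal cycles, just as 48 hours covers two days in the example above.

```r
# AirPassengers is a monthly series shipped with R, used here as stand-in data.
hw <- HoltWinters(AirPassengers, seasonal = "mult")

# Forecast two full seasonal cycles ahead (2 x 12 months).
fc <- predict(hw, n.ahead = 24)
```

Multiplicative seasonality requires strictly positive values, which is worth keeping in mind for temperature data that can reach zero or below.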

Licensing

This pod is licensed for purchase through StackHub. The purchased license is good for one project for one year. To run on multiple projects, multiple licenses must be purchased.

Buy bfitRExt for $575.00...
