Fig 1 – interactive Leaflet choropleth of Census ACS household income $60k-$75k using Microsoft Open R
For ad hoc spatial work it has usually been best to stick to a desktop application, such as one of the big-dollar Arc___ variations or, better yet, something open source like QGIS. In the early days these generally consisted of modular C/C++ functions threaded together with an all-purpose scripting language. If you wanted to get a little closer to the geo engine, knowledge of a scripting language (PHP, Tcl, Python, or Ruby) helped to script modular toolkits like GDAL/OGR, OSSIM, GEOS, or GMT. This all works fine, except for learning and relearning often arcane syntax while repeatedly discovering and reading data documentation on various public resources from Census, USGS, NOAA, NASA, JPL … you get the idea.
R changes things in the geospatial world. The R project originated as a modular statistics and graphics toolkit. Unless you happen to be a true math prodigy, statistics are best visualized graphically. With powerful graphics libraries, R has evolved into a useful platform for ad hoc spatial analysis.
Coupled with an IDE such as RStudio, or the new Microsoft R Tools for Visual Studio, R wraps a large stable of component libraries into a script interpreter environment, ideal for “one-off” analysis. Although learning arcane syntax is still a prerequisite, there is at least a universal environment with a really large contributor community. You can think of it as an open source replacement for Tableau or Power BI, but without the proprietary limitations.
# only a few lines of script
library(networkD3)
data(MisLinks, MisNodes)
forceNetwork(Links = MisLinks, Nodes = MisNodes, Source = "source",
             Target = "target", Value = "value", NodeID = "name",
             Group = "group", opacity = 0.4)
Community contributions are found in CRAN, the Comprehensive R Archive Network for the R programming language. A search of CRAN or MRAN (Microsoft R Application Network) for the term “spatial” yields a list of 145 R libraries.
Example: dygraphs R library for creating interactive charts.
# only a few lines of script
library(dygraphs)
dygraph(nhtemp, main = "New Haven Temperatures") %>%
  dyRangeSelector(dateWindow = c("1920-01-01", "1960-01-01"))
Here are just a few samples of CRAN libraries useful for spatial analysis:
library(ggmap) # simple mapping and more
library(raster) # defining extents and raster processing
brick() # raster function: raster cube objects useful for multispectral operations
stack() # raster function: multilayer raster manipulation
library(sp) # working with spatial objects
library(leaflet) # interactive web mapping using Leaflet
library(rgeos) # R GEOS wrapper
library(tigris) # downloading Census TIGER spatial geography
library(FedData) # downloading federal data NED, NHD, SSURGO, GHCN
library(acs) # tabular census data (American Community Survey) ACS, SF1, SF3
library(UScensus2010) # spatial and demographic Census 2010 data county/tract/blkgrp/blk
library(RColorBrewer) # color palettes for thematic mapping
For example, tigris is a useful library for reading US Census TIGER files. With just a couple of lines of R script you can zoom around a polygonal plot of US Census urban areas. The tigris library handles all the details of obtaining the TIGER polygons and loading them into local memory. The leaflet library handles creating the polygons and displaying them over a default Leaflet tile map.
library(tigris)   # TIGER data
library(leaflet)  # interactive mapping

ua <- urban_areas(cb = TRUE)
ua %>% leaflet() %>% addTiles() %>% addPolygons(popup = ~NAME10)
• Microsoft R Open
• RTVS R Tools for Visual Studio
• Microsoft R Server
• SQL Server R Services
• MRAN Microsoft R Application Network
• Microsoft Azure R Server
• Microsoft R Server for Hadoop
Apparently Data Science is a growth industry and Microsoft has an interest in providing useful tools beyond Power BI.
Microsoft R Open
• Intel-enhanced math library (Intel MKL)
• multi-core support
RTVS R Tools for Visual Studio
RTVS turns Visual Studio into an R IDE through the Data Science R settings. Users of Visual Studio will find all the familiar debug stepping, variable explorer, and IntelliSense editing they already use for other development languages.
Microsoft R Server
A licensed enterprise R service that scales past in-memory data limits by processing data as parallel, chunked streams.
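The idea of chunked processing can be illustrated in plain R. The following is only a sketch of the general technique, not Microsoft R Server's API: `chunked_mean` is a hypothetical helper, and the toy file of numbers stands in for a dataset too large to load at once.

```r
# Hypothetical helper: compute a mean by reading a fixed number of
# lines at a time, so the full dataset never sits in memory.
chunked_mean <- function(con, chunk_size = 1000) {
  total <- 0
  n <- 0
  repeat {
    lines <- readLines(con, n = chunk_size)
    if (length(lines) == 0) break      # connection exhausted
    values <- as.numeric(lines)
    total <- total + sum(values)       # keep only running totals
    n <- n + length(values)
  }
  total / n
}

# Toy data: one value per line in a temporary file
tmp <- tempfile(fileext = ".txt")
writeLines(as.character(1:10000), tmp)

con <- file(tmp, open = "r")
result <- chunked_mean(con, chunk_size = 500)
close(con)
unlink(tmp)

result
```

Only the running sum and count survive between chunks, which is why memory use stays flat as the input grows.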
SQL Server 2016 R Services
A T-SQL interface to R, with the database sitting next to the R code on the same server.
sp_execute_external_script – R code embedding
The stored procedure receives inputs, passes them to the external R runtime, and returns the R results.
Invoke the stored procedure to run R code from T-SQL.
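As a rough sketch of what the embedded R script looks like from the R side, assuming the procedure's default variable names `InputDataSet` and `OutputDataSet`: the input data frame below is built locally as a stand-in for rows that T-SQL would supply through `@input_data_1`, and its column names are hypothetical.

```r
# Stand-in for the rows SQL Server would pass in via @input_data_1
InputDataSet <- data.frame(
  tract = c("000100", "000200", "000300"),
  total = c(1200, 800, 1500),
  income60kTo75k = c(120, 100, 90),
  stringsAsFactors = FALSE
)

# Body of the embedded script: read InputDataSet, assign OutputDataSet.
# SQL Server would return OutputDataSet to the caller as a result set.
OutputDataSet <- data.frame(
  tract = InputDataSet$tract,
  percent = 100 * InputDataSet$income60kTo75k / InputDataSet$total,
  stringsAsFactors = FALSE
)

OutputDataSet
```

Inside SQL Server only the second half would be passed as the `@script` text; the first half exists here so the sketch runs in an ordinary R session.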
MRAN Microsoft R Application Network
Example R Leaflet demographic script (ref Fig 1 above):
library(tigris)   # TIGER data
library(acs)      # ACS data
library(stringr)  # to pad fips codes
library(dplyr)    # data manipulation
library(leaflet)  # interactive mapping

# Colorado Front Range counties
counties <- c(1, 5, 13, 31, 35, 39, 41, 59)
tracts <- tracts(state = 'CO', county = counties, cb = TRUE)

api.key.install(key = "<insert your own Census.gov api key here>")
geo <- geo.make(state = c("CO"), county = counties, tract = "*")
income <- acs.fetch(endyear = 2012, span = 5, geography = geo,
                    table.number = "B19001", col.names = "pretty")

# inspect the returned object and its column names
names(attributes(income))
attr(income, "acs.colnames")
##  "Household Income: Total:"
##  "Household Income: $60,000 to $74,999"

# build an 11-character GEOID (state + county + tract) next to the estimates
income_df <- data.frame(
  paste0(str_pad(income@geography$state, 2, "left", pad = "0"),
         str_pad(income@geography$county, 3, "left", pad = "0"),
         str_pad(income@geography$tract, 6, "left", pad = "0")),
  income@estimate[, c("Household Income: Total:",
                      "Household Income: $60,000 to $74,999")],
  stringsAsFactors = FALSE)

income_df <- select(income_df, 1:3)
rownames(income_df) <- 1:nrow(income_df)
names(income_df) <- c("GEOID", "total", "income60kTo75k")
income_df$percent <- 100 * (income_df$income60kTo75k / income_df$total)

# join the tabular ACS data to the TIGER tract polygons
income_merged <- geo_join(tracts, income_df, "GEOID", "GEOID")

popup <- paste0("GEOID: ", income_merged$GEOID, "<br>",
                "Percent of Households $60k-$75k: ",
                round(income_merged$percent, 2))
pal <- colorNumeric(palette = "YlGnBu", domain = income_merged$percent)

incomemap <- leaflet() %>%
  addProviderTiles("CartoDB.Positron") %>%
  addPolygons(data = income_merged,
              fillColor = ~pal(percent),
              color = "#b2aeae", # you need to use hex colors
              fillOpacity = 0.7,
              weight = 1,
              smoothFactor = 0.2,
              popup = popup) %>%
  addLegend(pal = pal,
            values = income_merged$percent,
            position = "bottomright",
            title = "Percent of Households<br>$60k-$75k",
            labFormat = labelFormat(suffix = "%"))
incomemap
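The join in the script depends on the ACS rows and the TIGER tracts sharing an identically formatted GEOID. Here is a small offline sketch of that padding step, using base R `sprintf` in place of `str_pad`; the FIPS components and household counts are made-up values for illustration only.

```r
# Hypothetical FIPS components as returned in numeric form
state  <- c(8L, 8L, 8L)
county <- c(1L, 31L, 59L)
tract  <- c(100L, 4405L, 980000L)

# Zero-pad to 2 + 3 + 6 digits and concatenate into an 11-character GEOID
geoid <- sprintf("%02d%03d%06d", state, county, tract)

# Made-up household counts, only to show the percent calculation
toy_df <- data.frame(GEOID = geoid,
                     total = c(1000, 2000, 500),
                     income60kTo75k = c(100, 300, 25),
                     stringsAsFactors = FALSE)
toy_df$percent <- 100 * toy_df$income60kTo75k / toy_df$total

toy_df
```

Without the leading zeros, Colorado's state code 8 and county code 1 would produce a shorter string that never matches the tract GEOID on the TIGER side, and the join would return empty values.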
Hillshade example using public SRTM 90 data:
library(raster)   # elevation download and terrain functions
library(leaflet)  # interactive mapping

alt <- getData('alt', country = 'ITA')
slope <- terrain(alt, opt = 'slope')
aspect <- terrain(alt, opt = 'aspect')
hill <- hillShade(slope, aspect, 40, 270)  # sun elevation 40, direction 270

leaflet() %>%
  addProviderTiles("CartoDB.Positron") %>%
  addRasterImage(hill, colors = grey(0:100 / 100), opacity = 0.6)
Fig 7 – RTVS (R Tools for Visual Studio 2015) Leaflet hillshading image
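To see what the two numeric arguments in the hillshade call control, here is a sketch of the textbook illumination formula in base R. It assumes slope and aspect are in radians, with aspect measured clockwise from north. `hillshade_value` is a hypothetical helper, and I have not verified that the raster package computes its result in exactly this form.

```r
# Hypothetical helper: textbook hillshade illumination for one cell.
# slope, aspect in radians; angle = sun elevation in degrees;
# direction = sun azimuth in degrees.
hillshade_value <- function(slope, aspect, angle = 45, direction = 0) {
  zenith  <- (90 - angle) * pi / 180
  azimuth <- direction * pi / 180
  cos(slope) * cos(zenith) +
    sin(slope) * sin(zenith) * cos(azimuth - aspect)
}

# Sun at 40 degrees elevation, shining from the west (270 degrees)
flat   <- hillshade_value(slope = 0,      aspect = 0,          angle = 40, direction = 270)
facing <- hillshade_value(slope = pi / 6, aspect = 3 * pi / 2, angle = 40, direction = 270)
away   <- hillshade_value(slope = pi / 6, aspect = pi / 2,     angle = 40, direction = 270)

c(flat = flat, facing = facing, away = away)
```

A 30 degree slope facing the sun comes out brighter than flat ground, and the same slope facing away comes out darker, which is the contrast that gives the map its relief.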
R provides lots of interesting modules that help with spatial analytics. The script engine makes it easy to perform ad hoc visualization and publish the results online. However, limitations in performance and spatial extent make it more of a competitor to desktop GIS products or the newer commercial data visualizers like Tableau or Power BI. For public-facing web applications with generalized extents, a three-tier design using SQL + server code + web UI still makes the most sense.
The advent of Microsoft R Server and SQL Server R Services adds scaling performance that makes R solutions more competitive with the venerable three-tier approach. It will be interesting to see how developers make use of SQL Server R Services. As a method of adding raster functionality to SQL Server, R via sp_execute_external_script overlaps somewhat with PostGIS Raster. Exploring SQL Server 2016 R Services must await a future post.
Example: threejs R library with world flight data