IoT – Internet of Things
(das Ding an sich)

“Noumenon the thing-in-itself (das Ding an sich) as opposed to phenomenon—the thing as it appears to an observer.”

Fig 1 – Microsoft Azure IoT Suite offers complete working demonstration solutions

Heidegger Thinginess

Is Heidegger serious or just funnin’ us when his discursive rambling winds past the abolition of all distances, wanders around thinginess, and leads us to “some-thing” from “no-thing”?

“The failure of nearness to materialize in consequence of the abolition of all distances has brought the distanceless to dominance. In the default of nearness the thing remains annihilated as a thing in our sense. But when and in what way do things exist as things? This is the question we raise in the midst of the dominance of the distanceless.”

“The emptiness, the void, is what does the vessel’s holding. The empty space, this nothing of the jug, is what the jug is as the holding vessel.”

“The jug’s essential nature, its presencing, so experienced and thought of in these terms, is what we call thing.”

Heidegger, “The Thing,” from Poetry, Language, Thought, 1971
Translated by Albert Hofstadter

So class, we may conclude that spatial attributes are not the essence of the thing. However, IoT does not concern itself with das Ding an sich, but with the mechanism of appearance, or how “noumenon” communicates “phenomenon” within the internet. Therefore, we must suppose IoT remains Kantian in spite of Heidegger’s prolix lecturing. And spatial attributes do still exist.

No? … Really?
Phew I was worried about my job for a minute!
(Actually I always wanted to drag Heidegger into a post on maps.)

IoT Things
Of course, IoT just wants “things”, “stuff”, “devices” to have a part in the cloud just like the rest of us. Dualism, Monism who cares? It’s all about messages. Which is where Microsoft Azure IoT comes in.

Fig 2 – Azure and IoT Dominic Betts

For Microsoft, IoT is an opportunity to provide infrastructure at a couple of levels with the central piece the Azure IoT Hub:

Message Creation

Devices and sensors are just small computers, for which Microsoft introduced Windows IoT Core. This is a scaled-down Windows OS for devices like the Raspberry Pi, offered freely to feed the IoT Hub. The Maker community can now use Windows and Visual Studio Express to wire up GPIO and send telemetry messages via Bluetooth or WiFi. At $49, Microsoft’s Raspberry Pi 3 Starter Kit offers the latest single board computer with Windows IoT Core embedded on a MicroSD card for experimenters. It should make hardware playtime easier for anyone in the Microsoft community.

Useful site: Connect your device to Azure IoT Hub

The ultimate device is still your smart phone. With the release of Xamarin in Visual Studio 2015 Update 2, native mobile app development across Android, iOS, and Windows Phone is much easier.

Message Pipeline

Azure IoT Hub is the key piece of technology. IoT Hub is infrastructure for handling messages across a wide array of devices and software, and it scales to enterprise dimensions. Security, monitoring, and device management are built in. The value proposition is easy to see if you’ve ever dealt with fleet management or SCADA networks. Instead of writing services on multiple VMs to catch TCP packets and sort them to various storage and events, it’s easy to sign up for an Azure IoT Hub and let Azure worry about reliability, scaling, and security.

Fig 3 – Azure IoT Hub with Stream Analytics - Getting Started with the Internet of Things (IoT)

Note that Machine Learning is part of the platform diagram. Satya Nadella’s Build 2016 keynote emphasized “the intelligent cloud” and of course R Project plays a role in predictive intelligence, so we can begin to see Microsoft marshalling services and tools for the next generation of cloud AI.

Thinking of ubiquitous sensors naturally (or unnaturally depending on your pre-disposition regarding the depravity of man and machine), brings to mind primitive organism possibilities as well as shades of Hal. Also noteworthy, “IoT Message Queues can be bidirectional,” so the order of Things and Humans can easily be reversed. Perhaps Microsoft’s embrace of artificial intelligence will cycle it back to the preeminent “seat of evil corporate empire” currently occupied by Google.

Azure IoT Hub deployment

Fig 4 – Azure Portal IoT Hub deployment

Once the Azure IoT Hub is deployed, the next step is to add a Stream Analytics job to the pipeline. These are jobs for processing telemetry streams into sinks such as SQL storage or visualizations. A Stream Analytics job connects an input to an output with a processing query filter in between.

Fig 5 – Azure IoT Hub Stream Analytic Job = Input + Query + Output

Fig 6 – Azure IoT Hub Stream Analytic Job Input from the deployed message stream IoT Hub or Blob

Fig 7 – Azure IoT Hub Stream Analytic Job Output to several options including Azure SQL Server

Finally, the query connecting Input to Output:

Fig 8 – Stream Analytics Query – sample queries
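Stream Analytics queries use a SQL-like dialect. A minimal pass-through job, sketched here with hypothetical input and output alias names, simply selects the telemetry fields from the input into the output:

```sql
SELECT
    deviceId, windSpeed, latitude, longitude
INTO
    [sql-output]    -- output alias defined on the job
FROM
    [iothub-input]  -- input alias defined on the job
```

A WHERE clause between FROM and the end of the statement can filter the stream, for example keeping only readings above a wind speed threshold.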

Message store or visualization

As seen above the Azure IoT Hub offers several ways to store or visualize data streams.
This tutorial includes simple test and simulated device code:

Fig 9 – some test code for sending simulated messages to an Azure IoT Hub

using System;
using System.Text;
using System.Threading.Tasks;
using Microsoft.Azure.Devices.Client;
using Newtonsoft.Json;

namespace SimulatedDevice
{
    class Program
    {
        static DeviceClient deviceClient;
        static string iotHubUri = "<name of your IoT Hub>";
        static string deviceKey = "<your device key>";

        static void Main(string[] args)
        {
            Console.WriteLine("Simulated device\n");
            deviceClient = DeviceClient.Create(iotHubUri,
                new DeviceAuthenticationWithRegistrySymmetricKey("myFirstDevice", deviceKey));

            SendDeviceToCloudMessagesAsync();
            Console.ReadLine();
        }

        private static async void SendDeviceToCloudMessagesAsync()
        {
            double avgWindSpeed = 10; // m/s
            Random rand = new Random();
            double latitude = 39.008208;
            double longitude = -104.797239;

            while (true)
            {
                // random wind speed within +/- 2 m/s of the average
                double currentWindSpeed = avgWindSpeed + rand.NextDouble() * 4 - 2;
                var telemetryDataPoint = new
                {
                    deviceId = "myFirstDevice",
                    windSpeed = currentWindSpeed,
                    latitude = latitude,
                    longitude = longitude
                };
                var messageString = JsonConvert.SerializeObject(telemetryDataPoint);
                var message = new Message(Encoding.ASCII.GetBytes(messageString));

                await deviceClient.SendEventAsync(message);
                Console.WriteLine("{0} > Sending message: {1}", DateTime.Now, messageString);

                await Task.Delay(1000); // throttle to one message per second
            }
        }
    }
}

Fig 10 – Resulting simulated device test records inserted into the gpsTest table by Stream Analytics Job

A much more involved example of an IoT and Mobile App is furnished by Microsoft: My Driving
Microsoft’s complete solution is available on GitHub with details.


Microsoft is forging ahead with Azure, offering numerous infrastructure options that make IoT a real possibility for small and medium businesses. Collecting data from diverse devices is getting easier with the addition of Windows IoT Core, VS2015 Xamarin, and Azure IoT Hubs with Stream Analytics jobs. Fleet management services will never be the same.

Spatial data still plays a big part in telemetry, since every stationary sensor involves a location and every mobile device a GPS stream. Ubiquitous sensor networks imply the need for spatial sorting and visualization, at least while humans are still in the loop. Remove the human and Heidegger’s “abolition of all distances” reappears, but then, sadly, you and I disappear.

The R Project for Maps

Fig 1 – interactive Leaflet choropleth of Census ACS household income $60k-$75k using Microsoft Open R

The mainstay of web mapping applications for the last couple of decades has been three tiers: Model – SQL, View – web UI, and Controller – server code. There are many variations on this theme: models residing in image tile pyramids, SQL Server, PostGIS, or Oracle; controller server code in Java, C#, or PHP. The visible action is on the viewer side. HTML5, with ever expanding JavaScript libraries like jQuery, Bootstrap, and Angular.js, makes life interesting, while Node.js is pushing JavaScript upstream to the controller.

For building end user applications it helps to know all three tiers and have at least one tool in each. With the right tools you can eventually accomplish just about anything spatially interesting. Emphasis is on the word “eventually.” SQL <=> C# <=> html5/JavaScript is very powerful, but extravagant for “one off” analytical work.

For ad hoc spatial work it was usually best to stick to a desktop application such as one of the big dollar Arc___ variations, or better yet something open source like QGIS. In the early days these generally consisted of modular C/C++ functions threaded together with an all-purpose scripting language. If you wanted to get a little closer to the geo engine, knowledge of a scripting language (PHP, Tcl, Python, or Ruby) helped to script modular toolkits like GDAL/OGR, OSSIM, GEOS, or GMT. This all works fine, except for learning and relearning often arcane syntax, while repeatedly discovering and reading data documentation on various public resources from Census, USGS, NOAA, NASA, JPL … you get the idea.

R changes things in the geospatial world. The R project originated as a modular statistics and graphics toolkit. Unless you happen to be a true math prodigy, statistics are best visualized graphically. With powerful graphics libraries, R has evolved into a useful platform for ad hoc spatial analysis.

Coupled with an IDE such as RStudio, or the new Microsoft R Tools for Visual Studio, R wraps a large stable of component libraries into a script interpreter environment, ideal for “one off” analysis. Although learning arcane syntax is still a prerequisite, there is at least a universal environment with a really large contributor community. You can think of it as open source replacement for Tableau or Power BI but without proprietary limitations.

Example: networkD3 R library for creating D3 JavaScript network graphs.

library(networkD3)  # D3 network graphs

# only a few lines of script
data(MisLinks, MisNodes)
forceNetwork(Links = MisLinks, Nodes = MisNodes, Source = "source",
             Target = "target", Value = "value", NodeID = "name",
             Group = "group", opacity = 0.4)

Community contributions are found in CRAN, the Comprehensive R Archive Network for the R programming language. A search of CRAN or MRAN (Microsoft R Application Network) for the term “spatial” yields a list of 145 R libraries.

Example: dygraph R library for creating interactive charts.

library(dygraphs)  # interactive charts

# only a few lines of script
dygraph(nhtemp, main = "New Haven Temperatures") %>%
  dyRangeSelector(dateWindow = c("1920-01-01", "1960-01-01"))

Here are just a few samples of CRAN libraries useful for spatial analysis:

library(rgdal)  # reading spatial files with gdal
library(ggmap)  # simple mapping and more
library(raster)  # defining extents and raster processing
  # brick: raster cube objects useful for multispectral operations
  # stack: multilayer raster manipulation
library(sp)  # working with spatial objects
library(leaflet)  # interactive web mapping using Leaflet
library(rgeos)  # R GEOS wrapper
library(tigris)  # downloading geography spatial census tiger
library(FedData)  # downloading federal data NED, NHD, SSURGO, GHCN
library(acs)  # tabular census data (American Community Survey) ACS, SF1, SF3
library(UScensus2010)  # spatial and demographic Census 2010 data county/tract/blkgrp/blk
library(RColorBrewer)  # color palettes for thematic mapping

For example, tigris is a useful library for reading US Census TIGER files. With just a couple of lines of R scripting you can zoom around a polygonal plot of US Census urban areas. library(tigris) handles all the details of obtaining the TIGER polygons and loading them into local memory, while library(leaflet) handles creating the polygons and displaying them over a default Leaflet tile map.

ua <- urban_areas(cb = TRUE)
ua %>% leaflet() %>% addTiles() %>% addPolygons(popup = ~NAME10)

Fig 2 – RStudio Interactive Leaflet plot of Census TIGER urban area extracted with tigris

Fig 3 – RStudio script of Census ACS tract household income percentage for $60k-$75K

These samples follow examples found in Zev Ross’s blog posts which contain a wealth of scripts using R for spatial analytics, including these posts on using FedData and rgdal.

Microsoft R

Microsoft recently entered the R world with several enhanced R tools, including:
Microsoft R Open
RTVS R Tools for Visual Studio
Microsoft R Server
SQL Server R Services
MRAN Microsoft R Application Network
Microsoft Azure R Server
Microsoft R Server for Hadoop

Apparently Data Science is a growth industry and Microsoft has an interest in providing useful tools beyond Power BI.

Microsoft R Open

The free Microsoft build of the R script engine, with a couple of enhancements:
• Intel-enhanced math libraries
• multi-core support
• multithreading

Fig 4 – slide from Derek Norton webinar on R Server showing relative performance boost with enhanced Microsoft R Open

RTVS – R Tools for Visual Studio
A Visual Studio IDE for R using the Data Science R settings. Users of Visual Studio will find all the familiar debug stepping, variable explorer, and IntelliSense editing they use with other development languages.

Microsoft R Server
A licensed enterprise R server that scales past in-memory data limitations by using parallel chunked data streams.

Fig 5 – slide from Derek Norton webinar showing R Server scale enhancements

SQL Server 2016 R Services

SQL Server R Services – a data ETL and visualization tool inside SQL Server.
A T-SQL R interface puts the database next to R code on the same server.

sp_execute_external_script embeds R code in T-SQL: invoked as a stored procedure, it receives inputs, passes them to an external R runtime, and returns the R results to the query.
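A minimal illustrative call, sketched with a trivial identity script (the column name is a hypothetical placeholder):

```sql
EXEC sp_execute_external_script
    @language = N'R',
    @script = N'OutputDataSet <- InputDataSet;',
    @input_data_1 = N'SELECT 1 AS testValue'
    WITH RESULT SETS ((testValue int));
```

The R code in @script sees the @input_data_1 query results as InputDataSet and hands OutputDataSet back to T-SQL as the result set.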

MRAN – Microsoft R Application Network

Fixed-date snapshots of CRAN let shared R code point at compatible library versions, giving checkpoint reproducibility.

Fig 6 – RTVS R Tools for Visual Studio 2015 Leaflet demographic script

Example R Leaflet demographic script (ref Fig 1 above):

library(tigris)  # TIGER data
library(acs)     # ACS data
library(stringr) # to pad fips codes
library(dplyr)   # data manipulation
library(leaflet) # interactive mapping

# Colorado Front Range counties
counties <- c(1, 5, 13, 31, 35, 39, 41, 59)
tracts <- tracts(state = 'CO', county = counties, cb = TRUE)

api.key.install(key = "<insert your own api key here>")
geo <- geo.make(state = c("CO"),
                county = counties, tract = "*")

income <- acs.fetch(endyear = 2012, span = 5, geography = geo,
                  table.number = "B19001", col.names = "pretty")
attr(income, "acs.colnames")
##  [1] "Household Income: Total:"
## [12] "Household Income: $60,000 to $74,999"  

income_df <- data.frame(paste0(str_pad(income@geography$state, 2, "left", pad = "0"),
                               str_pad(income@geography$county, 3, "left", pad = "0"),
                               str_pad(income@geography$tract, 6, "left", pad = "0")),
                        income@estimate[, c("Household Income: Total:",
                                           "Household Income: $60,000 to $74,999")],
                        stringsAsFactors = FALSE)

income_df <- select(income_df, 1:3)
rownames(income_df) <- 1:nrow(income_df)
names(income_df) <- c("GEOID", "total", "income60kTo75k")
income_df$percent <- 100 * (income_df$income60kTo75k / income_df$total)

income_merged <- geo_join(tracts, income_df, "GEOID", "GEOID")

popup <- paste0("GEOID: ", income_merged$GEOID, "<br>", "Percent of Households $60k-$75k: ", round(income_merged$percent, 2))
pal <- colorNumeric(
  palette = "YlGnBu",
  domain = income_merged$percent)

incomemap <- leaflet() %>%
  addProviderTiles("CartoDB.Positron") %>%
  addPolygons(data = income_merged,
              fillColor = ~pal(percent),
              color = "#b2aeae",  # you need to use hex colors
              fillOpacity = 0.7,
              weight = 1,
              smoothFactor = 0.2,
              popup = popup) %>%
  addLegend(pal = pal,
            values = income_merged$percent,
            position = "bottomright",
            title = "Percent of Households<br>$60k-$75k",
            labFormat = labelFormat(suffix = "%"))

incomemap  # print the htmlwidget to render the map


Hillshade example using public SRTM 90 data:

    library(raster)   # getData, terrain, hillShade
    library(leaflet)

    alt <- getData('alt', country = 'ITA')
    slope <- terrain(alt, opt = 'slope')
    aspect <- terrain(alt, opt = 'aspect')
    hill <- hillShade(slope, aspect, 40, 270)

    leaflet() %>% addProviderTiles("CartoDB.Positron") %>%
      addRasterImage(hill, colors = grey(0:100 / 100), opacity = 0.6)

Fig 7 – RTVS R Tools for Visual Studio 2015 Leaflet hillshading image


R provides lots of interesting modules that help with spatial analytics. The script engine makes it easy to perform ad hoc visualization and publish the results online. However, there are limitations in performance and extents that make it more of a competitor to desktop GIS products or the newer commercial data visualizers like Tableau or Power BI. For public facing web applications with generalized extents, three tier performance using SQL + server code + web UI still makes the most sense.

The advent of Microsoft R Server and SQL Server R Services adds scaling performance to make R solutions more competitive with the venerable three tier approach. It will be interesting to see how developers make use of SQL Server R Services. As a method of adding raster functionality to SQL Server, R sp_execute_external_script overlaps somewhat with PostGIS Raster. Exploring SQL Server 2016 R Services must await a future post.

Example: threejs R library with world flight data

.NET Rocks!
Road Trip Tracking

Fig 0 - Tracking a RoadTrip


Microsoft is on the move this fall. Win8 is the big news, but Visual Studio 2012, .NET 4.5, a revamped Azure, WP8, Office 2013, and even a first foray into consumer hardware, Surface tablets (not tables), all see daylight this fall.

The .NET Rocks duo, Carl Franklin and Richard Campbell, are also on the move. Carl and Richard head out this week for a whirlwind tour through 36 states in 72 days, or roughly 1728 hours. The DNR Road Trip Tracking application keeps tabs on the .NET Rocks RV with Tweet encouragement for the road weary travelers. All are welcome to follow the DNR RV online and add Tweet comments at #dnrRoadTrip. The app gives event information and up to the minute updates of time to next show with tweets along the route. It even gives turn by turn directions for those inclined to catch the .NET Rocks RV and follow along in the real world – .NET Rocks stalking.

Technical Overview

Project Outline

Fig 1 – .Net Rocks Road Trip Tracking app project outline


SQL Azure is the key resource for the DNR tracking app. GPS feeds are pushed at 1-minute intervals from a commercial Airlink GPS unit to a Windows service listening for UDP packets. This Feed Service turns incoming UDP packets into Feed records stored in SQL Azure with a geography point location and datetime stamp.
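A feed service of this shape can be sketched in a few lines. The packet layout, field names, and port are hypothetical stand-ins (the real Airlink payload differs), and the production service writes to SQL Azure rather than a list:

```python
import socket
from datetime import datetime, timezone

def parse_gps_packet(data):
    # Hypothetical ASCII payload layout: "deviceId,latitude,longitude".
    device_id, lat, lon = data.decode("ascii").split(",")
    return {"deviceId": device_id,
            "lat": float(lat),
            "lon": float(lon),
            "stamp": datetime.now(timezone.utc)}  # datetime stamp for the Feed record

def listen(port=9494, count=1):
    # Minimal blocking UDP listener; each datagram becomes one Feed record.
    records = []
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind(("", port))
        for _ in range(count):
            data, _addr = sock.recvfrom(1024)
            records.append(parse_gps_packet(data))
    return records
```

The real service would also attach the geography point via SQL Server's geography type when inserting the record.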

On the same system, a Twitter Query service checks for Tweets on a 30-second interval using the Twitter REST API. Tweets are also turned into Feed records in SQL Azure. However, the geography point locations for Tweets are pulled from the latest GPS record so they are associated in time with the location of the GPS unit in the DNR RV.


Fig 2 – Windows 8 IE10 browser showing stops and routes

On the front end, an Azure WebRole provides the UI and a WCF service for communicating with the SQL Azure Feed data. In order to handle the widest spectrum of devices, this UI leverages jQuery Mobile sitting on top of HTML5 (see jQuery Mobile Supported Platforms).

Inside the app divs (jQuery Mobile <div data-role=”page”..> ) are maps leveraging Bing Maps Ajax v7 API. The UI client also accesses Bing Geocode and Bing Route services through a proxy.

Fig 3 – IE9 on a laptop with GPS dots and Tweet icons along the way

Some more details:


Since there are several thousand points for each ‘event to event’ route, these are stored in SQL Azure as geography LineStrings. Using SQL Server spatial functions, the routes can be simplified at query time for improved performance in both network latency and map rendering. SQL Azure’s geography Reduce(factor) function is a thinning algorithm that reduces the number of vertices of geography features while maintaining shape integrity. In this app the reduce factor is tied to zoomlevels of the map, thinning the number of points returned to the UI.
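The zoom-to-factor mapping isn't spelled out in the app, but the idea can be sketched like this (the base tolerance, the halving scheme, and the table and column names in the illustrative SQL text are all hypothetical):

```python
def reduce_factor(zoomlevel, base=4000.0):
    # Coarser zooms tolerate more simplification; halve the Reduce()
    # tolerance each time the user zooms in one level.
    return base / (2 ** zoomlevel)

def route_query(zoomlevel):
    # Illustrative T-SQL text applying SQL Azure's geography Reduce()
    # at query time, so fewer vertices cross the wire at low zooms.
    return ("SELECT RouteId, Route.Reduce({0}) AS Route "
            "FROM Routes").format(reduce_factor(zoomlevel))
```

The exact tolerances would be tuned by eye: large enough to thin aggressively at continental zooms, small enough that routes still hug the roads when zoomed in.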

The map viewport is used as a bounding box STIntersects filter so only the visible routes are returned. Obviously Carl and Richard may find reasons to take a different route, so the GPS breadcrumbs may wander from the Bing generated routes.


Fig 4 – WebMatrix iPad simulator

The Twitter REST API queries are simple search-by-hashtag queries.

To avoid returning lots of duplicate Tweets, the search is limited by the last since_id in the Feed table. There are some caveats to REST Twitter searches:
“Search is focused on relevance and not completeness. This means that some
Tweets and users may be missing from search results.”
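The since_id dedup logic can be sketched as follows (the `search` callable and the tweet dicts are hypothetical stand-ins for the Twitter REST client and the Feed table):

```python
def poll_tweets(search, feed):
    # Only ask for tweets newer than the last id already stored in the
    # Feed table, so repeated polls don't return duplicates.
    since_id = max((t["id"] for t in feed), default=0)
    fresh = search("#dnrRoadTrip", since_id)
    feed.extend(fresh)
    return fresh
```

Because tweet ids increase over time, carrying the high-water mark forward is enough; no per-tweet comparison against the whole Feed table is needed.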

Fig 5 - webmatrix WP7 emulator

Fig 6 - iPhone simulator

GPS points

GPS points are generated every 60 seconds while the RV GPS unit is powered on. When the vehicle is off, and power is scarce, the unit still sends a packet once every 4 hours. Carl and Richard will be driving a lot of hours, and there will be lots of points generated over the next 72 days, roughly 1728 hours. Assuming 25% driving time over the duration, that is 1728/4 = 432 hours at 60 points per hour, or as many as 25,920 GPS locations. Even Bing Maps Ajax v7 will choke trying to render this many locations.

In order to keep things more reasonable, there is another thinning algorithm used in the GPS query service, again tied to zoomlevel. At lower zoom levels, points are thinned using a type of decimation – every 20th, 10th, 5th point, etc., is kept depending on the zoomlevel. In addition, only points required by the viewport bounding box are returned. Once the map is zoomed to higher resolution (zoom > 11), all of the points are returned.
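The decimation step can be sketched like this (the zoom-to-stride table is hypothetical; the app only specifies every 20th, 10th, 5th point at lower zooms and all points above zoom 11):

```python
def decimation_stride(zoomlevel):
    # Hypothetical stride table: keep every Nth point at low zooms,
    # every point once zoomed past level 11.
    if zoomlevel > 11:
        return 1
    if zoomlevel > 8:
        return 5
    if zoomlevel > 5:
        return 10
    return 20

def thin_points(points, zoomlevel):
    # Simple decimation: slice with the stride for this zoom level.
    stride = decimation_stride(zoomlevel)
    return points[::stride]
```

In the service this thinning happens after the viewport bounding-box filter, so the stride only applies to points the user could actually see.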

GPS map locations include a rollover infobox with time and detected speed at the location. We can all check up on Carl’s driving (moving: 86.99mph) and keep track of coffee stops (moving: 0.0 mph).

Bing Routes

Routing services are provided from the user’s position to the latest GPS location and to Stop venues selected on the map or from the Stop list. In addition to the route map, a turn by turn directions list is provided as a page option. The Geolocation API is used to identify a user’s location for these routing requests. Geolocation is an opt-in API, so users need to allow location sharing to have their location automatically available. If allowed, getCurrentPosition returns a latitude and longitude, which is run through the Bing reverse geocoding service to get an address used as the ‘from’ field for routing requests.

Fig 7 - Stop Detail with Maps.Themes.BingTheme()

Fig 8 - Bing Route Denver to Seattle Stop

Fig 9 - Bing Turn by Turn directions

jQuery Mobile

jQuery Mobile is a JavaScript library for abstracting away some of the complexity of supporting a large number of devices. WP7, Win8 tablets, iPads, iPhones, and Android devices are multiplying, while traditional laptop and desktop systems have a number of browser choices and versions. jQuery Mobile is not perfect, but it is a great help in a project that had to be ready in about ten days from start to finish.

One interesting feature of jQuery Mobile is the page transition effect. These are based on CSS3 and are not supported by all browsers. It adds a little pizazz to see slide, flip, and pop effects for page transitions.

jQuery Mobile apps do not have access to native device sensors such as the accelerometer, compass, gyrometer, etc., so jQuery Mobile webapps will not have the full range of capabilities found in custom apps for Win8, WP7, iOS, and Android. However, just one web UI for all is an enticing benefit, and deployment is an ordinary webapp rather than a series of more complex app store submissions. This approach allows easy updates over the course of the tour.

Fig 10 – Microsoft Way on an Android AVD emulator

Collecting some locations always leads to the possibility of heatmaps. These are value gradients which are helpful for analyzing geospatial data.

Fig 11 – Tweet heatmap along tour route from Seattle to Wyoming

Maybe it’s pretty obvious where Tweets are concentrated, but how about locations of app users who share their location? Australia is hot, India is not. Guess who listens to .NET Rocks! Or, at least, who’s less cautious about sharing location with the Geolocation API. User heatmaps bring to mind some intriguing possibilities for monitoring use, markets, and promotion effectiveness.

Fig 12 - GeoLocation users world wide


On the Road Again

Silverlight Video Pyramids

Fig 1 – Silverlight Video Tile Pyramid

Microsoft’s DeepZoom technology capitalizes on tile pyramids for MultiScaleImage elements. It is an impressive technology and is the foundation of Bing Maps Silverlight Control Navigation. I have wondered for some time why the DeepZoom researchers haven’t extended this concept a little. One possible extension that has intriguing possibilities is a MultiScaleVideo element.

The idea seems feasible, breaking each frame into a DeepZoom type pyramid and then refashioning as a pyramid of video codecs. Being impatient, I decided to take an afternoon and try out some proof of concept experiments. Rather than do a frame by frame tiling, I thought I’d see how a pyramid of WMV files could be synchronized as a Grid of MediaElements:

<Grid x:Name="VideoTiles" Background="{StaticResource OnTerraBackgroundBrush}"
      Width="426" Height="240"
      HorizontalAlignment="Center" VerticalAlignment="Center">
  <Grid.ColumnDefinitions>
    <ColumnDefinition/>
    <ColumnDefinition/>
  </Grid.ColumnDefinitions>
  <Grid.RowDefinitions>
    <RowDefinition/>
    <RowDefinition/>
  </Grid.RowDefinitions>
  <MediaElement x:Name="v00" Source="" Grid.Column="0" Grid.Row="0" />
  <MediaElement x:Name="v10" Source="" Grid.Column="1" Grid.Row="0" />
  <MediaElement x:Name="v11" Source="" Grid.Column="1" Grid.Row="1" />
  <MediaElement x:Name="v01" Source="" Grid.Column="0" Grid.Row="1" />
</Grid>

Ideally to try out a video tile pyramid I would want something like 4096×4096 since it divides nicely into 256 like the Bing Maps pyramid. However, Codecs are all over the place, and tend to cluster on 4:3 or 16:9 aspect ratios. Red 4K at 4520×2540 is the highest resolution out there, but I didn’t see any way to work with that format in Silverlight. The best resolution sample clip I could find that would work in Silverlight was the WMV HD 1440×1080 Content Showcase. Since I like the Sting background music, I decided on “The Living Sea” IMAX sample.

Not enough resolution to get too far, but I am just looking at multi-tile synching for now, and two levels will do. I ended up using Expression Encoder 3 to take the original resolution and clip it into smaller subsets.

Zoom Level 1:

00 10
01 11

Zoom Level 2:

0000 0010 1000 1010
0001 0011 1001 1011
0100 0110 1100 1110
0101 0111 1101 1111

I encoded Zoom Level 1 as 4 tiles at 640×480 and Zoom Level 2 as 16 tiles at 320×240. I then took all these tiles and dropped them into my Azure CDN video container. Again, this is not a streaming server, but I hoped it would be adequate to at least try this in a limited time frame. Now that I have the video pyramid with two zoom levels, I can start trying out some ideas.
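The naming scheme above is a simple quadkey: each child tile appends its quadrant's two-digit code to its parent's name, which is how the zoom-in handler later builds the four new WMV URIs. A small sketch (the CDN base URL is a hypothetical placeholder, since the post elides the real container address):

```python
def child_tiles(quad):
    # Children of tile "00" are "0000", "0010", "0011", "0001",
    # matching the Zoom Level 2 layout above.
    return [quad + suffix for suffix in ("00", "10", "11", "01")]

def tile_uri(base, quad):
    # Hypothetical CDN container address plus the quadkey file name.
    return "{0}/{1}.wmv".format(base, quad)
```

With only two levels the scheme is trivial, but it extends naturally: an n-level pyramid just keeps appending two-digit quadrant codes, exactly like a Bing Maps quadkey.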

Fig 2 – Silverlight Video Tile Pyramid Zoom Level 1

Fig 3 – Silverlight Video Tile Pyramid ZoomLevel 2

First, it is fairly difficult to keep the Grid lines from showing in the layout. Moving around at different sizes can change the border, but there is generally a slight line visible, which can be seen in Fig 2. Even though you don't see the lines in Fig 3, it too is made up of four tiles. This is set up just like a normal tile pyramid, with four tiles under each upper tile in a kind of quad-tree arrangement; in this case a very simple one with just the two levels.

I tied some events to the MediaElements. The main pyramid events are tied to MouseWheel events:

void Video_MouseWheel(object sender, MouseWheelEventArgs e)
{
    int delta = e.Delta;
    if (delta < 0)
    {
        //zoom out
        if (e.OriginalSource.GetType() == typeof(MediaElement))
        {
            VideoCnt = 0;
            MediaElement me = e.OriginalSource as MediaElement;
            currentPosition = me.Position;
            v00.Source = new Uri("");
            v10.Source = new Uri("");
            v11.Source = new Uri("");
            v01.Source = new Uri("");
        }
    }
    else if (delta > 0)
    {
        //zoom in
        if (e.OriginalSource.GetType() == typeof(MediaElement))
        {
            if (VideoZoomLevel <= maxVideoZoom)
            {
                VideoCnt = 0;
                MediaElement me = e.OriginalSource as MediaElement;
                currentPosition = me.Position;
                string quad = me.Source.LocalPath.Substring(0, me.Source.LocalPath.IndexOf(".wmv"));

                v00.Source = new Uri("" + quad + "00.wmv");
                v10.Source = new Uri("" + quad + "10.wmv");
                v11.Source = new Uri("" + quad + "11.wmv");
                v01.Source = new Uri("" + quad + "01.wmv");
                VideoZoomLevel = maxVideoZoom;
            }
        }
    }
}
I'm just checking the MouseWheel delta to determine whether to zoom in or out. Looking at the original source, I determine which quad the mouse is over and then create the new URIs for the new zoom level. This is not terribly sophisticated. Not surprisingly, the buffering is the killer. There are MediaOpened and Loaded events which I attempted to use; however, there were quite a few problems synchronizing the four tiles.

If you can patiently wait for the buffering, it does work after a fashion. Eventually the wmv files are in local cache, which helps. However, the whole affair is fragile and erratic.

I didn't attempt to go any further with panning across Zoom Level 2. Buffering was the biggest problem. I'm not sure how much further I could get by moving to a streaming media server or by monitoring BufferingProgress with a timer thread.

The experiment may have been a failure, but the concept is nonetheless interesting. Perhaps some day a sophisticated codec will have such things built in.

The high altitude perspective

One aspect which makes MultiScaleVideo interesting is just its additional dimension of interactivity. As film moves inexorably to streaming internet, there is more opportunity for viewer participation. In a pyramid approach focus is in the viewer’s hand. The remote becomes a focus tool that moves in and out of magnification levels as well as panning across the video 2D surface.

In the business world this makes interfaces to continuous data collections even more useful. As in the video feed experiment of the previous post, interfaces can scan at low zoom levels and then zoom in for detailed examination of items of interest. Streetside photos in the example Streetside path navigation already hint at this, using the run navigation to animate a short photo stream while also providing zoom and pan DeepZoom capability.

One of the potential pluses, from a distributor's point of view, is repeat viewer engagement. Since the viewer is in control, any viewing is potentially unique, which discourages the typical view-and-discard pattern common with film videos today. This adds value to potential advertisement revenue.

The film producer also has some incentive with a whole new viewer axis to play with. Now focus and peripheral vision take on another dimension, and focal point clues can create more interaction or in some cases deceptive side trails in a plot. Easter eggs in films provide an avid fan base with even more reason to view a film repeatedly.

Finally, small form factor handheld viewers such as iPhone and Android enabled cell phones can benefit from some form of streaming that allows user focus navigation. The screen in these cases is small enough to warrant some navigation facility. Perhaps IMAX or even Red 4K on handhelds is unreasonable, but allowing navigation certainly makes even the more common HD codecs more usable. A video pyramid of streaming sources could make a compelling difference in the handheld video market.


MultiScaleVideo is a way to enhance user interaction in a 2D video. It doesn't approach the game-level interaction of true 3D scene graphs, but it does add another axis of interest. My primitive exercise was not successful. I am hoping that Microsoft Labs will someday make this feasible and add another type of navigation to the arsenal. Of course, you can imagine the ensuing remote controller wars if DeepZoom Video ever becomes commonplace.

One more thing, check out the cool scaling animation attached to the expander button, courtesy of Silverlight Toolkit Nov drop.

Azure Video and the Silverlight Path

Fig 1 – Video Synched to Map Route

My last project experimented with synching Streetside with a Route Path. There are many other continuous asset collections that can benefit from this approach. Tom Churchill develops very sophisticated software for video camera augmentation, Churchill Navigation. He managed to take some time out of a busy schedule to do a simple drive video for me to experiment with.

In this scenario a mobile video camera is used along with a GPS to produce both a video stream and a simultaneous stream of GPS NMEA records. NMEA GPRMC records include a timestamp, latitude, and longitude along with a lot of other information, which I simply discarded in this project.

First the GPS data file was converted into an xml file. I could then use some existing xml deserializer code to pull the positions into a LocationCollection. These were then used in Bing Maps Silverlight Control to produce a route path MapPolyline. In this case I didn’t get fancy and just put the xml in the project as an embedded resource. Certainly it would be easy enough to use a GPS track table from SQL Server, but I kept it simple.

NMEA GPRMC Record Detail:

 1   050756       time 05:07:56 UTC
 2   A            status (A = active, V = void)
 3   4000.8812    latitude (ddmm.mmmm)
 4   N            North
 5   10516.7323   longitude (dddmm.mmmm)
 6   W            West
 7   20.2         ground speed in knots
 8   344.8        track angle, degrees true
 9   211109       date 11/21/2009
10   10.2         magnetic variation
11   E            East
12   D*0E         checksum
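The ddmm.mmmm coordinate packing is the only mildly tricky part of parsing these records: degrees and minutes are concatenated, so 4000.8812,N is 40° plus 0.8812 minutes of arc... er, 0.8812 minutes appended to 0 minutes, i.e. 40° 00.8812'. A quick illustrative sketch in Python (the project itself converted the file to xml and used C# deserialization; this function name is mine):

```python
def nmea_to_degrees(value, hemisphere):
    """Convert an NMEA ddmm.mmmm (or dddmm.mmmm for longitude)
    coordinate to signed decimal degrees."""
    degrees = int(value // 100)          # leading digits are whole degrees
    minutes = value - degrees * 100      # remainder is decimal minutes
    result = degrees + minutes / 60.0
    if hemisphere in ("S", "W"):         # south and west are negative
        result = -result
    return result

lat = nmea_to_degrees(4000.8812, "N")    # ~40.01469
lon = nmea_to_degrees(10516.7323, "W")   # ~-105.27887
```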

XML resulting from above NMEA record

<?xml version="1.0" encoding="utf-16"?>
<ArrayOfLocationData xmlns:xsi=""
    <Description>Boulder GPS</Description>

Once the route path MapPolyline is available, I can add a vehicle icon similar to the last Streetside project. The icon events are used in the same way to start an icon drag. Mouse moves are handled in the Map to calculate the nearest point on the path and move the icon, constrained to the route. The Mouse Button Up event is handled to synch with the video stream. Basically, the user drags a vehicle along the route, and when the icon is dropped the video moves to that point in the video timeline.

Video is a major focus of Silverlight. Microsoft Expression Encoder 3 has a whole raft of codecs specific to Silverlight. It also includes a dozen or so templates for Silverlight players. These players are all ready to snap into a project and include all the audio volume, video timeline, play-stop-pause, and other controls found in any media player. The styling, however, is different with each template, which makes life endurable for the aesthetically minded. I am not, so the generic gray works fine for my purposes. When faced with fashion or style issues my motto has always been "Nobody will notice," much to the chagrin of my kids.

Expression Encoder 3 Video Player Templates

  • Archetype
  • BlackGlass
  • Chrome
  • Clean
  • CorporateSilver
  • Expression
  • FrostedGallery
  • GoldenAudio
  • Graphing
  • Jukebox
  • Popup
  • QuikSilver
  • Reflection
  • SL3AudioOnly
  • SL3Gallery
  • SL3Standard

At the source end I needed a reliable video to plug into the player template. I had really wanted to try out the Silverlight Streaming Service, which was offered free for prototype testing. However, this service is being closed down and I unfortunately missed out on that chance.

Tim Heuer’s prolific blog has a nice introduction to an alternative.

As it turns out I was under the mistaken impression that “Silverlight Streaming” was “streaming” video. I guess there was an unfortunate naming choice which you can read about in Tim’s blog post.

As Tim explains, Azure is providing a new Content Delivery Network CTP. This is not streaming, but it is optimized for rapid delivery. CDN is akin to Amazon's CloudFront. Both are edge cache services that boast low latency and high data transfer speeds. Amazon's CloudFront is still Beta, and Microsoft Azure CDN is the equivalent in Microsoft terminology, CTP or Community Technology Preview. I would not be much surprised to see a streaming media service as part of Azure in the future.

Like Cloud Front, Azure CDN is a promoted service from existing Blob storage. This means that using an Azure storage account I can create a Blob storage container, then enable CDN, and upload data just like any other blob storage. Enabling CDN can take awhile. The notice indicated 60 minutes, which I assume is spent allocating resources and getting the edge caching scripts in place.

I now needed to upload Tom's video, encoded as 640×480 WMV, to the blob storage account with CDN enabled. The last time I tried this there wasn't much Azure upload software available. However, now there are lots of options:

Cloud Berry Explorer and Cloud Storage Studio were my favorites, but there are lots of codeplex open source projects as well:

  • Azure Storage Explorer
  • Factonomy Azure Utility (Azure Storage Utility)
  • Azure Blob Storage Client

I encountered one problem, however, with all of the above. Once my media file exceeded 64MB, which is just about 5 min of video at the encoding I chose, my file uploads consistently failed. It is unclear whether the problem was at my end or in the upload software. I know there is a 64MB limit for simple blob uploads, but most uploads would use block mode rather than simple mode. Block mode goes all the way up to 50GB in the current CTP, which is a very long video (roughly 60 hours at this encoding).
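To get a feel for what block mode means in practice, here is a back-of-the-envelope sketch in Python using the 4MB max block size from the published limits (illustrative only; the function name and block size default are my assumptions, not an Azure API):

```python
def blocks_needed(file_bytes, block_bytes=4 * 1024 * 1024):
    """Blocks required for a block-mode upload at the 4MB max block size."""
    return -(-file_bytes // block_bytes)  # ceiling division

print(blocks_needed(70 * 1024 * 1024))   # 18 blocks for a 70MB video
```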

When I get time I'll return to the PowerShell approach and manually upload my 70MB video sample as blocks. In the meantime I used Expression Encoder to clip the video down to a 4:30 min clip for testing purposes.

Here are the published Azure Storage limits once it is released:

200GB for block blobs (64KB min, 4MB max block size)
64MB is the limit for a single blob before you need to use blocks
1TB for page blobs

Fig 2 – Video Synched to Map Route

Now there's a video in the player. The next task is to make a two-way connection between the route path from the GPS records and the video timeline. I used a 1 sec timer tick to check for changes in the video timeline. This time is then used to loop through the route nodes defined by GPS positions until reaching a time delta larger than or equal to the current video time. At that point the car icon position is updated. This positions the icon within a 1-3 second radius of accuracy. It would be possible to refine this using a segment-percentage approach and get down to the 1 sec timer tick radius of accuracy, but I'm not sure increased accuracy is helpful here.
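The forward direction, video time to icon position, boils down to a linear walk over the timestamped nodes. Sketched in Python for clarity (the project code is C#; the function and tuple shape here are my own illustration):

```python
def icon_position(video_time, nodes):
    """nodes: (seconds, lat, lon) tuples from the GPS track, sorted by time.
    Walk the nodes until one reaches the current video time, as the
    1-second timer tick does; accuracy is bounded by the node spacing."""
    for t, lat, lon in nodes:
        if t >= video_time:
            return (lat, lon)
    return nodes[-1][1:]  # past the last node: park at the end of the route

track = [(0, 40.000, -105.000), (2, 40.001, -105.001), (4, 40.002, -105.002)]
icon_position(3, track)  # snaps to the node at t=4
```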

The reverse direction uses the current icon position to keep track of the current segment. Since the currentSegment also keeps track of endpoint delta times, it is used to set the video player position with the MouseUp event. Now route path connects to video position and as video is changed the icon location of the route path is also updated. We have a two way synch mode between the map route and the video timeline.

  private void MainMap_MouseMove(object sender, MouseEventArgs e)
  {
    Point p = e.GetPosition(MainMap);
    Location LL = MainMap.ViewportPointToLocation(p);
    LLText.Text = String.Format("{0,10:0.000000},{1,11:0.000000}", LL.Latitude, LL.Longitude);
    if (cardown)
    {
      currentSegment = FindNearestSeg(LL, gpsLocations);
      MapLayer.SetPosition(car, currentSegment.nearestPt);
    }
  }

FindNearestSeg is the same as the previous blog post except I’ve added time properties at each endpoint. These can be used to calculate video time position when needed in the Mouse Up Event.
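The core of a FindNearestSeg-style search is projecting the mouse location onto each candidate segment and keeping the closest. Here is the projection step sketched in Python as a planar approximation (illustrative only; the real code works on Bing Maps Locations in C#):

```python
def nearest_point_on_segment(p, a, b):
    """Project point p onto segment a-b, clamping to the endpoints.
    Planar approximation: fine for short GPS segments."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len2 = dx * dx + dy * dy
    if seg_len2 == 0:
        return a                          # degenerate segment
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len2
    t = max(0.0, min(1.0, t))             # clamp to the segment
    return (ax + t * dx, ay + t * dy)

nearest_point_on_segment((0.5, 1.0), (0.0, 0.0), (1.0, 0.0))  # (0.5, 0.0)
```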

  private void car_MouseLeftButtonUp(object sender, MouseButtonEventArgs e)
  {
    if (cardown)
    {
      cardown = false;
      VideoBoulder1.Position = currentSegment.t1;
    }
  }

Silverlight UIs for continuous data

This is the second example of using Silverlight controls to synch a route path with a continuous data collection stream. In the first case it was synched with Streetside, and in this case with a GPS-tagged video. This type of UI could be useful for a number of scenarios.

There is currently some interest in various combinations of mobile video and LiDAR collections. Here are some examples:

  • Obviously both Google and Microsoft are busy collecting streetside views for ever expanding coverage.
  • Utility corridors, transmission, fiber, and pipelines, are interested in mobile and flight collections for construction, as-built management, impingement detection, and regulatory compliance.
    Here are a couple representative examples:
      Baker Mobile
      Mobile Asset Collection MAC Vehicle
  • Railroads have a similar interest
      Lynx Mobile
  • DOTs are investigating mobile video LiDAR collection as well
      Iowa DOT Research


Mobile asset collection is a growing industry. Traditional imagery, video, and now LiDAR components collected in stream mode are becoming more common. Silverlight's dual media and mapping controls make UIs for managing and interfacing with these types of continuous assets not only possible in a web environment, but actually fun to create.

Hauling Out the Big RAM

Amazon released a handful of new stuff.

“Make that a Quadruple Extra Large with room for a Planet OSM”

Fig 1 – Big Foot Memory

1. New Price for EC2 instances

            US                          EU
            Linux    Windows  SQL      Linux    Windows  SQL
m1.small    $0.085   $0.12             $0.095   $0.13
m1.large    $0.34    $0.48    $1.08    $0.38    $0.52    $1.12
m1.xlarge   $0.68    $0.96    $1.56    $0.76    $1.04    $1.64
c1.medium   $0.17    $0.29             $0.19    $0.31
c1.xlarge   $0.68    $1.16    $2.36    $0.76    $1.24    $2.44

Notice the small instance, now $0.12/hr, matches Azure pricing:

Compute = $0.12 / hour

This is not really apples to apples, since an Amazon instance is a virtual machine while Azure charges per deployed application. A virtual instance can have multiple service/web apps deployed.

2. Amazon announces a Relational Database Service RDS
Based on MySQL 5.1, this doesn't appear to add a whole lot, since you could always start an instance with any database you wanted. MySQL isn't exactly known for geospatial support, even though it has some spatial capabilities. You can see a small comparison of PostGIS vs MySQL by Paul Ramsey. I don't know if this comparison is still valid, but I haven't seen much use of MySQL for spatial backends.

This is similar to Azure SQL Server, which is also a convenience deployment that lets you run SQL Server as an Azure service without all the headaches of administration and maintenance. Neither of these options is cloud-scaled; they are still single-instance versions, not cross-partition capable. The SQL Azure CTP has an upper limit of 10GB, as in hard drive, not RAM.

3. Amazon adds New high memory instances

  • High-Memory Double Extra Large Instance 34.2 GB of memory, 13 EC2 Compute Units (4 virtual cores with 3.25 EC2 Compute Units each), 850 GB of instance storage, 64-bit platform $1.20-$1.44/hr
  • High-Memory Quadruple Extra Large Instance 68.4 GB of memory, 26 EC2 Compute Units (8 virtual cores with 3.25 EC2 Compute Units each), 1690 GB of instance storage, 64-bit platform $2.40-$2.88/hr

These are new virtual instance AMIs that scale up, as opposed to scaling out. Scaled-out options use clusters of instances in Grid Computing/Hadoop types of architectures. There is nothing to prohibit using clusters of scaled-up instances in a hybridized architecture, other than cost. However, the premise of Hadoop arrays is "divide and conquer," so it makes less sense to have massive nodes in the array. Since scaling out involves moving the problem to a whole new parallel programming paradigm, with all of its consequent complexity, it also means owning the code. In contrast, scaling up is generally very simple: you don't have to own the code or even recompile, just install on more capable hardware.

Returning to the Amazon RDS, Amazon has presumably taken an optimized, compiled route and offers prepackaged MySQL 5.1 instances ready to use:

  • db.m1.small (1.7 GB of RAM, $0.11 per hour).
  • db.m1.large (7.5 GB of RAM, $0.44 per hour)
  • db.m1.xlarge (15 GB of RAM, $0.88 per hour).
  • db.m2.2xlarge (34 GB of RAM, $1.55 per hour).
  • db.m2.4xlarge (68 GB of RAM, $3.10 per hour).

Of course, the higher spatial functionality of PostgreSQL/PostGIS can be installed on any of these high-memory instances as well; it is just not done by Amazon. The important thing to note is that memory approaches 70GB per instance! What does one do with all that memory?

Here is one use:

“Google query results are now served in under an astonishingly fast 200ms, down from 1000ms in the olden days. The vast majority of this great performance improvement is due to holding indexes completely in memory. Thousands of machines process each query in order to make search results appear nearly instantaneously.”
Google Fellow Jeff Dean keynote speech at WSDM 2009.

Having very large memory footprints makes sense for increasing performance on a DB application. Even fairly large data tables can reside entirely in memory for optimum performance. Whether a database makes use of the best optimized compiler for Amazon’s 64bit instances would need to be explored. Open source options like PostgreSQL/PostGIS would let you play with compiling in your choice of compilers, but perhaps not successfully.

Todd Hoff has some insightful analysis in his post, “Are Cloud-Based Memory Architectures the Next Big Thing?”

Here is Todd Hoff’s point about having your DB run inside of RAM – remember that 68Gb Quadruple Extra Large memory:

“Why are Memory Based Architectures so attractive? Compared to disk, RAM is a high bandwidth and low latency storage medium. Depending on who you ask the bandwidth of RAM is 5 GB/s. The bandwidth of disk is about 100 MB/s. RAM bandwidth is many hundreds of times faster. RAM wins. Modern hard drives have latencies under 13 milliseconds. When many applications are queued for disk reads latencies can easily be in the many second range. Memory latency is in the 5 nanosecond range. Memory latency is 2,000 times faster. RAM wins again.”

Wow! Can that be right? "Memory latency is 2,000 times faster."

(Hmm… 13 milliseconds = 13,000,000 nanoseconds,
so 13,000,000 ns / 5 ns = 2,600,000x? And 5 GB/s / 100 MB/s = 50x? Am I doing the math right?)
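Running the arithmetic in the quote, the quoted numbers don't support "2,000 times"; a trivial check:

```python
# Sanity-check the ratios implied by the quote above.
disk_latency_s = 13e-3   # 13 milliseconds
ram_latency_s = 5e-9     # 5 nanoseconds
latency_ratio = disk_latency_s / ram_latency_s   # 2,600,000x, not 2,000x

disk_bw = 100e6          # 100 MB/s
ram_bw = 5e9             # 5 GB/s
bandwidth_ratio = ram_bw / disk_bw               # 50x, not "many hundreds"
```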

The real question, of course, is what actual benchmarks will reveal. Presumably optimized disk caching narrows the gap between disk storage and RAM. Which brings up the problem of configuring a database to use large RAM pools. PostgreSQL has a variety of configuration settings, but to date RDBMS software doesn't really have a configuration switch that simply caches the whole enchilada.

Here is some discussion of MySQL front-ending the database with In-Memory-Data-Grid (IMDG).

Here is an article on a PostgreSQL configuration to use a RAM disk.

Here is a walk through on configuring PostgreSQL caching and some PostgreSQL doc pages.

Tuning for large memory is not exactly straightforward. There is no “one size fits all.” You can quickly get into Managing Kernel Resources. The two most important parameters are:

  • shared_buffers
  • sort_mem
“As a start for tuning, use 25% of RAM for cache size, and 2-4% for sort size. Increase if no swapping, and decrease to prevent swapping. Of course, if the frequently accessed tables already fit in the cache, continuing to increase the cache size no longer dramatically improves performance.”

OK, given this rough guideline on a Quadruple Extra Large Instance 68Gb:

  • shared_buffers = 17Gb (25%)
  • sort_mem = 2.72Gb (4%)

This still leaves plenty of room, 48.28GB, to avoid the dreaded swap page-in by the OS. Let's assume a more normal 8GB for the OS; we still have 40GB to play with. Looking at sort types in detail may make adding some more sort_mem helpful, maybe a bump to 5GB. That still leaves an additional 38GB to drop into shared_buffers, for a grand total of 55GB. Of course, you have to have a pretty hefty set of spatial tables to use up this kind of space.
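The arithmetic above is worth writing down, since it is the whole tuning exercise in miniature (numbers only; the actual settings would go in postgresql.conf):

```python
# Rough PostgreSQL sizing for the 68GB Quadruple Extra Large,
# using the 25% cache / 4% sort rule of thumb quoted above.
ram = 68.0
shared_buffers = 0.25 * ram                   # 17.0 GB
sort_mem = 0.04 * ram                         # 2.72 GB
headroom = ram - shared_buffers - sort_mem    # 48.28 GB before swapping

# Push further: reserve 8GB for the OS, bump sort_mem to 5GB,
# and hand everything left over to shared_buffers.
os_reserve, sort_mem = 8.0, 5.0
shared_buffers = ram - os_reserve - sort_mem  # 55.0 GB
```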

Here is a list of PostgreSQL limitations. As you can see it is technically possible to run out of even 68Gb.


Maximum Database Size Unlimited
Maximum Table Size 32 TB
Maximum Row Size 1.6 TB
Maximum Field Size 1 GB
Maximum Rows per Table Unlimited
Maximum Columns per Table 250 – 1600 depending on column types
Maximum Indexes per Table Unlimited

Naturally the Obe duo has a useful posting on determining PostGIS sizes: Determining size of database, schema, tables, and geometry

To get some perspective on size: an Open Street Map dump of the whole world fits into a 90GB EBS Amazon Public Data Set configured for PostGIS with pg_createcluster. Looks like this just happened a couple of weeks ago. Although 90GB is just a little out of reach for even a Quadruple Extra Large, I gather the current size of planet osm is still in the 60GB range, so you might just fit it into 55GB RAM. It would be a tad tight. Well, maybe an Octuple Extra Large 136GB instance is not too far off. Of course, who knows how big Planet OSM will ultimately end up being.

Another point to notice is the 8 virtual cores in a Quadruple Extra Large Instance. Unfortunately:
“PostgreSQL uses a multi-process model, meaning each database connection has its own Unix process. Because of this, all multi-cpu operating systems can spread multiple database connections among the available CPUs. However, if only a single database connection is active, it can only use one CPU. PostgreSQL does not use multi-threading to allow a single process to use multiple CPUs.”

Running a single-connection query apparently won't benefit from a multi-CPU virtual system, even though multiple connection pools will definitely benefit from the additional CPUs.

I look forward to someone actually running benchmarks since that would be the genuine reality check.


Scaling up is the least complex way to boost performance on a lagging application. The Cloud offers lots of choices suitable to a range of budgets and problems. If you want to optimize personnel and adopt a decoupled SOA architecture, you’ll want to look at Azure + SQL Azure. If you want the adventure of large scale research problems, you’ll want to look at instance arrays and Hadoop clusters available in Amazon AWS.

However, if you just want a quick fix, maybe not 2,000x but at least some x, better take a look at Big RAM. If you do, please let us know the benchmarks!

Azure and GeoWebCache tile pyramids

Fig 1 – Azure Blob Storage tile pyramid for citylimits

Azure Overview

Shared resources continue to grow as essential building blocks of modern life, key to connecting communities and businesses of all types and sizes. As a result a product like SharePoint is a very hot item in the enterprise world. You can possibly view Azure as a very big, very public, SharePoint platform that is still being constructed. Microsoft and 3rd party services will eventually populate the service bus of this Cloud version with lots and lots of service hooks. In the meantime, even early stage Azure with Web Hosting, Blob storage, and Azure SQL Server makes for some interesting experimental R&D.

Azure is similar to Amazon’s AWS cloud services, and Azure’s pricing follows Amazon’s lead with the familiar “pay as you go, buy what you use” model. Azure offers web services, storage, and queues, but instead of giving access to an actual virtual instance, Azure provides services maintained in the Microsoft Cloud infrastructure. Blob storage, Azure SQL Server, and IIS allow developers to host web applications and data in the Azure Cloud, but only with the provided services. The virtual machine is entirely hidden inside Microsoft’s Cloud.

The folks at Microsoft are probably well aware that most development scenarios have some basic web application and storage component, but don’t really need all the capabilities, and headaches, offered by controlling their own server. In return for giving up some freedom you get the security of automatic replication, scalability, and maintenance along with the API tools to connect into the services. In essence this is a Microsoft only Cloud since no other services can be installed. Unfortunately, as a GIS developer this makes Azure a bit less useful. After all, Microsoft doesn’t yet offer GIS APIs, OGC compliant service platforms, or translation tools. On the other hand, high availability with automatic replication and scalability for little effort are nice features for lots of GIS scenarios.

The current Azure CTP lets developers experiment for free with these minor restrictions:

  • Total compute usage: 2000 VM hours
  • Cloud storage capacity: 50GB
  • Total storage bandwidth: 20GB/day

To keep things simple, since this is my first introduction to Azure, I looked at just using Blob Storage to host a tile pyramid. The Silverlight MapControl CTP makes it very easy to add tile sources as layers so my project is simply to create a tile pyramid and store this in Azure Blob storage where I can access it from a Silverlight MapControl.

In order to create a tile pyramid, I also decided to dig into the GeoWebCache standalone 1.2 beta. This is beta software and offers some new undocumented features. It is also my first attempt at using GeoWebCache standalone; generally I just use the version conveniently built into Geoserver. However, since I was only building a tile pyramid rather than serving it, the standalone version made more sense. GeoWebCache also provides caching for public WMS services. In cases where a useful WMS is available but not very efficient, it would be nice to cache tiles for at least the subsets useful to my applications.

Azure Blob Storage

Azure CTP has three main components:

  1. Windows Azure – includes the storage services for blobs, queues, and cloud tables as well as hosting web applications
  2. SQL Azure – SQL Server in the Cloud
  3. .NET Services – Service Bus, Access Control Service, Work Flow …

There are lots of walk throughs for getting started in Azure. It all boils down to getting the credentials to use the service.

Once a CTP project is available the next step is to create a “Storage Account” which will be used to store the tile pyramid directory. From your account page you can also create a “Hosted Service” within your Windows Azure project. This is where web applications are deployed. If you want to use “SQL Azure” you must request a second SQL Azure token and create a SQL Service. The .NET Service doesn’t require a token for a subscription as long as you have a Windows Live account.

After creating a Windows Azure storage account you will get three endpoints and a couple of keys.


Primary Access Key: ************************************
Secondary Access Key: *********************************

Now we can start using our brand new Azure storage account. But to make life much simpler first download the following:

The Azure SDK includes some sample code . . . HelloWorld, HelloFabric, etc. to get started using the REST interface. I reviewed some of the samples and started down the path of creating the necessary REST calls for recursively loading a tile pyramid from my local system into an Azure blob storage nomenclature. I was just getting started when I happened to take a look at the CloudDrive sample. This saved me a lot of time and trouble.

CloudDrive lets you treat the Azure service as a drive inside PowerShell. The venerable MSDOS cd, dir, mkdir, copy, del etc commands are all ready to go. Wince, I know, I know, MSDOS? I’m sure, if not now, then soon there will be dozens of tools to do the same thing with nice drag and drop UIs. But this works and I’m old enough to actually remember DOS commands.

First, using the elevated Windows Azure SDK command prompt you can compile and run the CloudDrive with a couple of commands:


Now open Windows PowerShell and execute the MountDrive.ps1 script. This allows you to treat the local Azure service as a drive mount and start copying files into storage blobs.

Fig 2 – Azure sample CloudDrive PowerShell

Creating a connection to the real production Azure service simply means making a copy of MountDrive.ps1 and changing credentials and endpoint to the ones obtained previously.

function MountDrive {
 Param (
  $Account = "sampleaccount",
  $Key = "***************************************",
  $ServiceUrl = "",
  $DriveName = "Blob",
  $ProviderName = "BlobDrive")

 # Power Shell Snapin setup
 add-pssnapin CloudDriveSnapin -ErrorAction SilentlyContinue

 # Create the credentials
 $password = ConvertTo-SecureString -AsPlainText -Force $Key
 $cred = New-Object -TypeName Management.Automation.PSCredential -ArgumentList $Account, $password

 # Mount storage service as a drive
 new-psdrive -psprovider $ProviderName -root $ServiceUrl -name $DriveName -cred $cred -scope global
}

MountDrive -ServiceUrl "" -DriveName "Blob" -ProviderName "BlobDrive"

The new-item command lets you create a new container, with the -Public flag ensuring that files will be accessible publicly. Then the Blob: drive copy-cd command will copy files and subdirectories from the local file system to Azure Blob storage. For example:

PS Blob:\> new-item imagecontainer -Public
Parent: CloudDriveSnapin\BlobDrive::http:\\\devstoreaccount1

Type Size LastWriteTimeUtc Name
---- ---- ---------------- ----
Container 10/16/2009 9:02:22 PM imagecontainer

PS Blob:\> dir

Parent: CloudDriveSnapin\BlobDrive::http:\\\

Type Size LastWriteTimeUtc Name
---- ---- ---------------- ----
Container 10/16/2009 9:02:22 PM imagecontainer
Container 10/8/2009 9:22:22 PM northmetro
Container 10/8/2009 5:54:16 PM storagesamplecontainer
Container 10/8/2009 7:32:16 PM testcontainer

PS Blob:\> copy-cd c:\temp\image001.png imagecontainer\test.png
PS Blob:\> dir imagecontainer

Parent: CloudDriveSnapin\BlobDrive::http:\\\imagecontainer

Type Size LastWriteTimeUtc Name
---- ---- ---------------- ----
Blob 1674374 10/16/2009 9:02:57 PM test.png

Because imagecontainer is public the test.png image can be accessed in the browser from the local development storage with:
or if the image was similarly loaded in a production Azure storage account:

It is worth noting that Azure storage consists of endpoints, containers, and blobs. There are some further subtleties for large blobs, such as blocks and blocklists, as well as metadata, but there is not really anything like a subdirectory. Subdirectories are emulated using slashes in the blob name.
i.e. northmetro/citylimits/BingMercator_12/006_019/000851_002543.png is a container, "northmetro", followed by a blob name,
The browser can show this image using the local development storage:

Changing to production Azure means substituting a valid production endpoint for the development storage address, like this:
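The specific endpoints are elided above, but as a general pattern the same container/blob path simply moves from the standard development storage address to the production blob host (account name here is a placeholder):

```
# Local development storage (default dev storage blob endpoint)
http://127.0.0.1:10000/devstoreaccount1/imagecontainer/test.png

# Production Azure Blob storage
http://<youraccount>.blob.core.windows.net/imagecontainer/test.png
```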

With CloudDrive getting my tile pyramid into the cloud is straightforward and it saved writing custom code.

The tile pyramid – Geowebcache 1.2 beta

Geowebcache is written in Java and synchronizes very well with the GeoServer OGC service engine. The new 1.2 beta version is available as a .war that is loaded into the webapp directory of Tomcat. It is a fairly simple matter to configure Geowebcache to create a tile pyramid of a particular Geoserver WMS layer. (Unfortunately, it took me almost two days to work out a conflict with an existing Geoserver gwc.) The two main files for configuration are:

C:\Program Files\Apache Software Foundation\Tomcat 6.0\webapps\
C:\Program Files\Apache Software Foundation\Tomcat 6.0\webapps\

geowebcache-servlet.xml customizes the service bean parameters and geowebcache.xml provides setup parameters for tile pyramids of layers. Leaving the geowebcache-servlet.xml at default will work fine when no other Geoserver or geowebcache is around. It can get more complicated if you have several that need to be kept separate. More configuration info.

Here is an example geowebcache.xml that uses some of the newer gridSet definition capabilities. It took me a long while to find the schema for geowebcache.xml:
The documentation is still thin for this beta release project.

<?xml version="1.0" encoding="utf-8"?>
<gwcConfiguration xmlns:xsi=""

After editing the configuration files, building the pyramid is a matter of pointing your browser at the local webapp and seeding the tiles down to the level you choose with the gridSet you want. The GoogleMapsCompatible gridSet is built into geowebcache and the BingMercator is a custom gridSet that I’ve added with extent limits defined.

This can take a few hours or days, depending on the extent and zoom levels you need. Once completed, I use the CloudDrive PowerShell drive to copy all of the tiles into Azure Blob storage:

PS Blob:\> copy-cd C:\Program Files\Apache Software Foundation\Tomcat 6.0\temp\geowebcache\citylimits

This also takes some time: the resulting 243,648 files total about 1 GB.

Silverlight MapControl

The final piece in the project is adding the MapControl viewer layer. First I add a new tile source layer to the Map control in MainPage.xaml:

      Grid.Column="0" Grid.Row="1" Grid.RowSpan="1" Padding="5"
       <!-- Azure tile source -->
       <m:MapTileLayer x:Name="citylimitsAzureLayer" Opacity="0.5" Visibility="Collapsed">

The tile naming scheme is described here:
The important point is:

“Most filesystems use btree’s to store the files in directories, so layername/projection_z/[x/(2^(z/2))]_[y/(2^(z/2))]/x_y.extension seems reasonable, since it works sort of like a quadtree. The idea is that half the precision is in the directory name, the full precision in the filename to make it easy to locate problematic tiles. This will also make cache purges a lot faster for specific regions, since fewer directories have to be traversed and unlinked. “
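The quoted scheme can be checked against the blob name shown earlier in the post. This standalone sketch (TilePath and ZeroPad are hypothetical helpers, not part of geowebcache's actual API) computes the subdirectory and filename for a tile whose x and y are already in geowebcache's row order:

```csharp
using System;

public class TilePathDemo
{
    // Pad an integral value with leading zeros to 'order' digits
    public static string ZeroPad(long number, int order)
    {
        return number.ToString("D" + order);
    }

    // Build the Geowebcache 1.2 tile path: half the coordinate precision goes
    // into the subdirectory name, full precision into the filename.
    public static string TilePath(string gridSet, int zoomLevel, long x, long y)
    {
        long half = 2 << (zoomLevel / 2);                       // tiles per subdirectory cell
        int digits = half > 10 ? (int)Math.Log10(half) + 1 : 1; // digits for subdirectory names
        string halfsubdir = ZeroPad(x / half, digits) + "_" + ZeroPad(y / half, digits);
        string img = ZeroPad(x, 2 * digits) + "_" + ZeroPad(y, 2 * digits);
        return gridSet + "_" + ZeroPad(zoomLevel, 2) + "/" + halfsubdir + "/" + img + ".png";
    }

    static void Main()
    {
        // Reproduces the blob name quoted earlier in the post
        Console.WriteLine(TilePath("BingMercator", 12, 851, 2543));
        // prints BingMercator_12/006_019/000851_002543.png
    }
}
```

At zoom 12 there are 4096 tiles per axis and 128 tiles per subdirectory cell, so tile (851, 2543) lands in cell (6, 19), exactly matching the northmetro blob name above.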

An ordinary tile source class looks just like this:

  public class CityLimitsTileSource : Microsoft.VirtualEarth.MapControl.TileSource
  {
      public CityLimitsTileSource() : base(App.Current.Host.InitParams["src"] +
          /* gwc/service/gmaps URL template elided */ )
      {
      }

      public override Uri GetUri(int x, int y, int zoomLevel)
      {
          return new Uri(String.Format(this.UriFormat, x, y, zoomLevel));
      }
  }

However, now I need to reproduce the tile name as it is in the Azure storage container rather than letting gwc/service/gmaps mediate the nomenclature for me. This took a little digging. The two files I needed to look at turned out to be:

GMapsConverter works because Bing Maps follows the same upper-left origin convention and spherical Mercator projection as Google Maps. Here is the final approach, using the naming system in Geowebcache 1.2.

public class CityLimitsAzureTileSource : Microsoft.VirtualEarth.MapControl.TileSource
{
  public CityLimitsAzureTileSource()
    : base(App.Current.Host.InitParams["azure"] + "citylimits/GoogleMapsCompatible_{0}/{1}/{2}.png")
  {
  }

  public override Uri GetUri(int x, int y, int zoomLevel)
  {
    /*
     * From geowebcache:
     * must convert zoom, x, y into the tile pyramid subdirectory structure used by geowebcache
     */
    int extent = (int)Math.Pow(2, zoomLevel);
    if (x < 0 || x > extent - 1)
    {
      MessageBox.Show("The X coordinate is not sane: " + x);
    }
    if (y < 0 || y > extent - 1)
    {
      MessageBox.Show("The Y coordinate is not sane: " + y);
    }
    // xPos and yPos correspond to the top left hand corner,
    // so flip the row to geowebcache's bottom-up order
    y = extent - y - 1;
    long shift = zoomLevel / 2;
    long half = 2 << (int)shift;
    int digits = 1;
    if (half > 10)
    {
      digits = (int)(Math.Log10(half)) + 1;
    }
    long halfx = x / half;
    long halfy = y / half;
    string halfsubdir = zeroPadder(halfx, digits) + "_" + zeroPadder(halfy, digits);
    string img = zeroPadder(x, 2 * digits) + "_" + zeroPadder(y, 2 * digits);
    string zoom = zeroPadder(zoomLevel, 2);

    return new Uri(String.Format(this.UriFormat, zoom, halfsubdir, img));
  }

  /*
   * From geowebcache:
   * a way to pad numbers with leading zeros, since I don't know a fast
   * way of doing this in Java.
   * @param number
   * @param order
   * @return the zero-padded number as a string
   */
  public static String zeroPadder(long number, int order)
  {
    int numberOrder = 1;

    if (number > 9)
    {
      if (number > 11)
      {
        numberOrder = (int)Math.Ceiling(Math.Log10(number) - 0.001);
      }
      else
      {
        numberOrder = 2;
      }
    }

    int diffOrder = order - numberOrder;

    if (diffOrder > 0)
    {
      //System.out.println("number: " + number + " order: " + order + " diff: " + diffOrder);
      StringBuilder padding = new StringBuilder(diffOrder);

      while (diffOrder > 0)
      {
        padding.Append("0");
        diffOrder--;
      }
      return padding.ToString() + string.Format("{0}", number);
    }
    else
    {
      return string.Format("{0}", number);
    }
  }
}
I didn’t attempt to change the zeroPadder. Doubtless there is a simple C# String.Format that would replace the zeroPadder from Geowebcache.
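For what it's worth, .NET's "D" standard numeric format specifier pads integral values with leading zeros, so the whole helper collapses to a one-liner (a sketch; ZeroPad is a hypothetical replacement name):

```csharp
using System;

public class PadDemo
{
    // Equivalent of geowebcache's zeroPadder: pad to at least 'order' digits
    public static string ZeroPad(long number, int order)
    {
        return number.ToString("D" + order);
    }

    static void Main()
    {
        Console.WriteLine(ZeroPad(6, 3));    // prints 006
        Console.WriteLine(ZeroPad(851, 6));  // prints 000851
    }
}
```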

This works and provides access to tile png images stored in Azure blob storage, as you can see from the sample demo.


Tile pyramids enhance user experience, matching the performance users have come to expect in Bing, Google, Yahoo, and OSM. It is resource intensive to make tile pyramids of large worldwide extent and deep zoom levels. In fact, it is not something most services can or need to provide except for limited areas. Tile pyramids in the Cloud require relatively static layers with infrequent updates.

Although using Azure this way is possible and provides performance, scalability, and reliability, I’m not sure it always makes sense. The costs are difficult to predict for a high-volume site, since they are based on bandwidth usage as well as storage. Tile pyramid performance is a wonderful thing, but it chews up a ton of storage, much of which is seldom if ever used, and you pay storage fees on all of it.

For a stable low- to medium-volume application, it makes more sense to host a tile pyramid on your own server. For high-volume sites where reliability is the deciding factor, moving to Cloud storage services may be the right thing. This is especially true where traffic patterns swing wildly or grow rapidly and robust scaling is an ongoing battle.

The Azure CTP is of course not as mature as AWS, but it has the edge in the developer community, and like many Microsoft technologies it has staying power to spare. Leveraging its developer community makes sense for Microsoft, and with easy-to-use tools built into Visual Studio I can see Azure growing quickly. In time it will just be part of the development fabric, with most Visual Studio deployment choices seamlessly migrating out to the Azure Cloud.

Azure release is slated for Nov 2009.