Connecting the Data Dots – Hybrid Architecture

Web mapping has generally been a 3 tier proposition. In a typical small to medium scale scenario there will almost always be a data source residing in a spatial database, with some type of service in the middle relaying queries from a browser client UI to the database and back.

I’ve worked with all kinds of permutations of 3 tier web mapping, some easier to use than others. However, a few months ago I sat down with all the Microsoft offerings and worked out an example using SQL Server + WCF + Bing Maps Silverlight Control. Notice that tools for all three tiers are available from the same vendor and can be worked on in the same IDE, Microsoft Visual Studio. I have to admit that it is really nice to have integrated tools across the whole span. The Microsoft option has only been possible in the last year or so with the introduction of SQL Server 2008 and the Bing Maps Silverlight Control.

The resulting project is available on CodePlex: dataconnector
and you can play with a version online here: DataConnectorUI

When working on the project, I started out in WCF with some reservations. SQL Spatial was similar to much of my earlier work with PostGIS, while the Bing Maps Silverlight Control and XAML echoed work I’d done a decade back with SVG, just more so. However, for the middle tier I had generally used something from the OGC world such as GeoServer, so putting together my own middle tier service using WCF was largely experimental. WCF turned out to be less daunting than I had anticipated. It also afforded an opportunity to try out a few different approaches for transferring spatial query results to the UI client.

There is more complete information on the project here:
http://dataconnector.codeplex.com/documentation

After all the experimental approaches, my conclusion is that even with the powerful CLR performance of the Bing Maps Silverlight Control, most scenarios still call for a hybrid approach: raster tile pyramids for performance with large extent or dense data resources, and vectors for better user interaction at lower levels in the pyramid.

Tile Pyramids don’t have to be static and DataConnector has examples of both static and dynamic tile sources. The static example is a little different from other approaches I’ve used, such as GeoWebCache, since it drops tiles into SQL Server as they are created rather than using a file system pyramid. I imagine that a straight static file system source could be a bit faster, but it is nice to have indexed quadkey access and all data residing in a single repository. This was actually a better choice when deploying to Azure, since I didn’t have to work out a blob storage option for the tiles in addition to the already available SQL Azure.
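The indexed quadkey access mentioned above relies on the Bing Maps tile system, where each tile's X, Y, and zoom level collapse into a single string key. Here is a minimal Python sketch of that conversion, following the published Bing Maps quadkey scheme. The function names are my own, and this is not the DataConnector code itself, which is C#.

```python
def tile_xy_to_quadkey(tile_x: int, tile_y: int, zoom: int) -> str:
    """Interleave the bits of tile X and Y into a base-4 quadkey string."""
    digits = []
    for level in range(zoom, 0, -1):
        mask = 1 << (level - 1)
        digit = 0
        if tile_x & mask:
            digit += 1
        if tile_y & mask:
            digit += 2
        digits.append(str(digit))
    return "".join(digits)


def quadkey_to_tile_xy(quadkey: str) -> tuple[int, int, int]:
    """Inverse conversion: returns (tile_x, tile_y, zoom)."""
    tile_x = tile_y = 0
    zoom = len(quadkey)
    for position, char in enumerate(quadkey):
        mask = 1 << (zoom - position - 1)
        if char == "1":
            tile_x |= mask
        elif char == "2":
            tile_y |= mask
        elif char == "3":
            tile_x |= mask
            tile_y |= mask
        elif char != "0":
            raise ValueError(f"invalid quadkey digit: {char!r}")
    return tile_x, tile_y, zoom


# Tile (3, 5) at zoom level 3 has quadkey "213".
print(tile_xy_to_quadkey(3, 5, 3))
```

Because a parent tile's quadkey is a prefix of its children's quadkeys, a single indexed string column is enough to look up one tile or a whole branch of the pyramid.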

Hybrid web mapping:

Here are some of the tradeoffs involved between vector and raster tiles.

Hybrid architectures switch between the two depending on the client’s position in the resource pyramid. Here is my analysis of feature count ranges and optimal architecture:

1. Low – For vector poly feature counts < 300 features per viewport, I’d use vector queries from DB. Taking advantage of the SQL Reduce function makes it possible to drop node counts for polygons and polylines for lower zoom levels. Points are more efficient and up to 3000-5000 points per viewport are still possible.

2. Medium – For zoomlevels with counts > 300 per viewport, I’d use a dynamic tile builder at the top of the pyramid. I’m not sure what the upper limit is on performance here, since I’ve only run it on fairly small tables of about 2000 records. Eventually dynamic tile building affects performance (at the server, not the client).

3. High – For zoomlevels with high feature counts and large poly node counts, I’d start with pre-seeded static tiles at low zoomlevels at the top of the pyramid, perhaps dynamic tiles in the middle of the pyramid, and vectors at the bottom.

4. Very High – For very high feature counts at low zoomlevels near the top of the pyramid I’d just turn off the layer. There probably isn’t much reason to show very dense resources until the user moves in to an area of interest. For dense point sources a heat map raster overview would be best at the top of the pyramid. At middle levels I’d use a caching tile builder with vectors again at higher zoom levels at the bottom of the pyramid.
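The four ranges above can be read as a simple decision function keyed on the number of features in the current viewport. The Python sketch below is only an illustration. The 300 polygon and 3000 point limits come from the list, but `dynamic_limit` and `hide_limit` are placeholder values I made up, since I haven't measured where dynamic tiles stop being practical.

```python
from enum import Enum


class Source(Enum):
    VECTOR = "vector query from DB"
    DYNAMIC_TILES = "dynamic tile builder"
    STATIC_TILES = "pre-seeded static tiles"
    HIDDEN = "layer turned off"


POLY_VECTOR_LIMIT = 300    # polygon/polyline features per viewport (from the list)
POINT_VECTOR_LIMIT = 3000  # low end of the 3000-5000 point range (from the list)


def choose_source(feature_count: int,
                  is_point_layer: bool = False,
                  dynamic_limit: int = 20_000,     # hypothetical, not measured
                  hide_limit: int = 1_000_000) -> Source:  # hypothetical
    """Pick a data source for the viewport from its feature count."""
    vector_limit = POINT_VECTOR_LIMIT if is_point_layer else POLY_VECTOR_LIMIT
    if feature_count < vector_limit:
        return Source.VECTOR
    if feature_count < dynamic_limit:
        return Source.DYNAMIC_TILES
    if feature_count < hide_limit:
        return Source.STATIC_TILES
    return Source.HIDDEN


for count in (120, 1_200, 50_000, 5_000_000):
    print(count, "->", choose_source(count).value)
```

In practice the count per viewport would come from a cheap server-side estimate or a fixed zoom level table rather than a live query on every pan.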

Here is a graphic view of some hybrid architectural options:
DataConnector screen shot

DataConnector screen shot

Data Distribution Geographically:
Another consideration is the distribution of data in the resource. Homogeneous geographic data density works best with hard zoom level switches. In other words, the switch from vector to raster tile can be coded to zoomlevel regardless of where the client has panned in the extent of the data. This is simple to implement.

However, where data is relatively heterogeneous geographically, it might be nice to arrange switching according to density. An example might be parcel data densities that vary across urban and rural areas. Instead of simple zoom levels, the switch between tile and vector is based on density calculations. Having a heat map overview available, for example, could provide a quick viewport density calculation based on a pixel sum of the heat map intersecting with the user’s viewport. This density calculation would be used for the switch rather than a simpler zoom level switch. This way rural areas of interest can gain the benefit of vectors higher in the pyramid than would be useful in urban areas.
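A rough Python sketch of that density switch follows. It assumes the heat map is available as a grid whose cell values approximate feature counts and that the viewport has already been converted to heat map pixel coordinates. Both are assumptions for illustration, as are the function names.

```python
def viewport_density(heat_map, left, top, right, bottom):
    """Sum heat map cells inside the viewport rectangle (right/bottom exclusive)."""
    left, top = max(left, 0), max(top, 0)
    right, bottom = max(right, 0), max(bottom, 0)
    return sum(sum(row[left:right]) for row in heat_map[top:bottom])


def use_vectors(heat_map, viewport, max_features=300):
    """True when the estimated feature count is low enough for vectors."""
    return viewport_density(heat_map, *viewport) < max_features


# Toy 4x4 heat map: dense "urban" cells upper left, sparse "rural" cells on the right.
heat_map = [
    [200, 150, 2, 1],
    [180, 160, 3, 0],
    [5,   4,   1, 0],
    [2,   1,   0, 0],
]

print(use_vectors(heat_map, (0, 0, 2, 2)))  # urban viewport
print(use_vectors(heat_map, (2, 0, 4, 4)))  # rural viewport
```

With the same zoom level and viewport size, the urban view stays on tiles while the rural view switches to vectors, which is the behavior a fixed zoom level switch cannot give.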

DataConnector screen shot

Point Layers:
Points have a slightly different twist. For one thing, too many points clutter a map, while their simplicity means that more point vectors can be rendered before affecting UI performance. Heat Maps are a great way to show density at higher levels in the pyramid. Heat Maps can be a dynamic tile source or a more generalized caching tile pyramid. In a point layer scenario, at some level there is a switch from Heat Map to Cluster Icons, and then to individual Pushpins. Using power scaling at pushpin levels allows higher density pins to show higher in the pyramid without as much clutter. Power scaling hooks the icon size for the pin to zoomlevel. Experiments showed the icon max limit for the Bing Maps Silverlight Control at 3000-5000 per viewport.
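One plausible reading of power scaling is an icon size that shrinks by a constant factor for each zoom level above the deepest one, clamped so pins never vanish. The Python sketch below shows that idea only. The base, sizes, and maximum zoom level are illustrative assumptions, not values taken from DataConnector.

```python
def pin_size(zoom: int,
             max_zoom: int = 19,       # assumed deepest zoom level
             full_size: float = 32.0,  # assumed icon size in pixels at max_zoom
             min_size: float = 4.0,    # assumed floor so pins stay visible
             base: float = 1.25) -> float:  # assumed shrink factor per level
    """Scale the pushpin icon by a power of the zoom level difference."""
    size = full_size * base ** (zoom - max_zoom)
    return max(min_size, min(full_size, size))


for zoom in (19, 15, 10, 1):
    print(zoom, round(pin_size(zoom), 1))
```

Smaller icons higher in the pyramid overlap less, so individual pins can stay on for a few more levels before the switch to cluster icons or a heat map.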

DataConnector screen shot

Some Caveats:
Tile pyramids are of course most efficient when the data is relatively static. With highly dynamic data, tiles can be built on the fly but with consequent loss of performance as well as loading on the server that affects scaling. In an intermediate situation with data that changes slowly, static tiles are still an option using a pre-seeding batch process run at some scheduled interval.

Batch tile loading also has limitations for very dense resources that require tiling down deep in a pyramid where the number of tiles grows very large. Seeding all levels of a deep pyramid requires some time, perhaps too much time. However, in a hybrid case this should rarely happen since the bottom levels of the pyramid are handled dynamically as vectors.
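To put numbers on how fast a deep pyramid grows: in the Bing Maps tile scheme each zoom level has four times as many tiles as the level above it, so level z holds 4^z tiles for full world coverage. The short Python sketch below adds these up. A real data extent covers only a fraction of the world, so treat the results as upper bounds.

```python
def tiles_at_level(zoom: int) -> int:
    """Tiles needed to cover the whole world at one zoom level (level 1 = 2x2)."""
    return 4 ** zoom


def tiles_through_level(max_zoom: int, min_zoom: int = 1) -> int:
    """Total tiles for every level from min_zoom through max_zoom."""
    return sum(tiles_at_level(z) for z in range(min_zoom, max_zoom + 1))


for depth in (5, 10, 15, 18):
    print(f"levels 1-{depth}: {tiles_through_level(depth):,} tiles")
```

Each extra level roughly quadruples the seeding job, which is why handing the bottom levels over to vectors keeps the batch process manageable.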

It is also worth noting that with Silverlight the client hardware affects performance. Silverlight has an advantage for web distribution: it distributes the CPU load out to the clients, harnessing all their resources. However, an ancient client with old hardware will not match the performance of newer machines.

Conclusion:

A hybrid, zoom level based approach is the best general architecture for Silverlight web maps where large data sets are involved. Using your own service provides more options in architecting a web map solution.

Microsoft’s WCF is a nice framework especially since it leverages the same C# language, IDE, and debugging capabilities across all tiers of a solution. Microsoft is the only vendor out there with the breadth to give developers an integrated solution across the whole spectrum from UI graphics, to customizable services, to SQL spatial tables.

Then throw in global map and imagery resources from Bing, with Geocode, Routing, and Search services, along with the Azure cloud platform for scalability and it’s no wonder Microsoft is a hit with the enterprise. Microsoft’s full range single vendor package is obviously a powerful incentive in the enterprise world. Although relatively new to the game, Microsoft’s full court offense puts some pressure on traditional GIS vendors, especially in the web distribution side of the equation.