Media Playing down the LiDAR path

Fig 1 – Showing a sequence of LiDAR Profiles as a video clip

Video codecs harnessed to the Silverlight MediaPlayer make a useful compression technique for large image sets. The video, in essence, is a pointer into a large library of image frames. Data collected in a framewise spatial sense can leverage this technique. One example where this can be useful is in Mobile Asset Collection (MAC) UIs. MAC corridors can create large collections of image assets in short order.

Another potential for this approach is to make use of conventional LiDAR point cloud profiles. In this project I wanted to step through a route, collecting profiles for later viewing. The video approach seemed feasible after watching cached profiles spin through a view panel connected to a MouseWheel event in another project. With a little bit of effort I was able to take a set of stepwise profiles, turn them into a video, and then connect the resulting video to a Silverlight Map Control client hooked up to a Silverlight MediaPlayer. This involved three steps:

1. Create a set of frames

Here is a code snippet used to follow a simple track down Colfax Ave in Denver, west to east. I used this to repeatedly grab LidarServer WMS GetProfileData requests and save the resulting .png images to a subdirectory. The parameters were set to sweep at a 250 ft offset either side of the track with a 10 ft depth and a 1 ft step interval. The result after a couple of hours was 19,164 .png profiles at 300px × 500px.

The code basically starts down the supplied path and calculates the sweep profile endpoints at each step using the Perpendicular function. This sweep line supplies the extent parameters for a WMS GetProfileData request.

private void CreateImages(object sender, RoutedEventArgs e)
{
    string colorization = "Elevation";
    string filter = "All";
    string graticule = "True";
    string drapeline = "False";
    double profileWidth = 300;
    double profileHeight = 500;

    double dx = 0;
    double dy = 0;
    double len = 0;
    double t = 0;
    string profileUrl = null;
    WebClient client = new WebClient();

    step = 57.2957795 * (step * 0.3048 / 6378137.0); // ft step to approx decimal degrees
    Point startpt = new Point();
    Point endpt = new Point();
    string[] lines = points.Text.Split('\r');
    startpt.X = double.Parse(lines[0].Split(',')[0]);
    startpt.Y = double.Parse(lines[0].Split(',')[1]);

    endpt.X = double.Parse(lines[1].Split(',')[0]);
    endpt.Y = double.Parse(lines[1].Split(',')[1]);

    dx = endpt.X - startpt.X;
    dy = endpt.Y - startpt.Y;
    len = Math.Sqrt(dx * dx + dy * dy);

    Line direction = new Line();
    direction.X1 = startpt.X;
    direction.Y1 = startpt.Y;
    width *= 0.3048; // sweep width ft to meters
    int cnt = 0;
    t = step / len;

    while (t <= 1)
    {
        direction.X2 = startpt.X + dx * t;
        direction.Y2 = startpt.Y + dy * t;

        Point p0 = Perpendicular(direction, width / 2);
        Point p1 = Perpendicular(direction, -width / 2);

        p0 = Mercator(p0.X, p0.Y);
        p1 = Mercator(p1.X, p1.Y);

        profileUrl = "" + // base GetProfileData url
                         "&LEFT_XY=" + p0.X + "%2C" + p0.Y + "&RIGHT_XY=" + p1.X + "%2C" + p1.Y +
                         "&DEPTH=" + depth + "&SHOW_DRAPELINE=" + drapeline + "&SHOW_GRATICULE=" + graticule +
                         "&COLORIZATION=" + colorization + "&FILTER=" + filter +
                         "&WIDTH=" + profileWidth + "&HEIGHT=" + profileHeight;

        byte[] bytes = client.DownloadData(new Uri(profileUrl));
        using (FileStream fs = File.Create(String.Format(workingDir + "img{0:00000}.png", cnt++)))
        using (BinaryWriter bw = new BinaryWriter(fs))
        {
            bw.Write(bytes);
        }

        direction.X1 = direction.X2;
        direction.Y1 = direction.Y2;
        t += step / len;
    }
}

private Point Perpendicular(Line ctrline, double dist)
{
    Point pt = new Point();
    Point p1 = Mercator(ctrline.X1, ctrline.Y1);
    Point p2 = Mercator(ctrline.X2, ctrline.Y2);

    double dx = p2.X - p1.X;
    double dy = p2.Y - p1.Y;
    double len = Math.Sqrt(dx * dx + dy * dy);
    double e = dist * (dx / len);
    double f = dist * (dy / len);

    pt.X = p1.X - f;
    pt.Y = p1.Y + e;
    pt = InverseMercator(pt.X, pt.Y);
    return pt;
}
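The offset math in Perpendicular is just a 90° rotation of the normalized direction vector. Here is a minimal Python check of the same arithmetic in projected meters (the Mercator/InverseMercator round trip is omitted, and the function name is mine):

```python
import math

def perpendicular(p1, p2, dist):
    """Offset a point `dist` units perpendicular to the segment p1->p2,
    mirroring the arithmetic in the C# Perpendicular function."""
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    length = math.hypot(dx, dy)
    e = dist * (dx / length)
    f = dist * (dy / length)
    # (e, f) rotated 90 degrees: (x, y) -> (-y, x)
    return (p1[0] - f, p1[1] + e)

# 250 ft (76.2 m) sweep endpoints around a due-east segment land due
# north and south of the segment start
left = perpendicular((0.0, 0.0), (100.0, 0.0), 76.2)    # (0.0, 76.2)
right = perpendicular((0.0, 0.0), (100.0, 0.0), -76.2)  # (0.0, -76.2)
```

Positive and negative `dist` values give the two sweep-line endpoints either side of the track, exactly as the while loop above uses them.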

2. Merge the .png frames into an AVI

Here is a helpful C# AviFile library wrapper. Even though it is a little old, the functions I wanted in this wrapper worked just fine. The following WPF project simply takes a set of .png files and adds them one at a time to an avi clip. Since I chose the (Full) uncompressed option, I had to break my files into smaller sets to keep from running into the 4GB limit on my 32bit system. In the end I had 7 avi clips to cover the 19,164 png frames.

Fig 2 – Create an AVI clip from png frames

using System.Drawing;
using System.Windows;
using AviFile;

namespace CreateAVI
{
    public partial class Window1 : Window
    {
        public Window1()
        {
            InitializeComponent();
        }

        private void btnWrite_Click(object sender, RoutedEventArgs e)
        {
            int startframe = int.Parse(start.Text);
            int frameInterval = int.Parse(interval.Text);
            double frameRate = double.Parse(fps.Text);
            int endframe = 0;

            string currentDirName = inputDir.Text;
            string[] files = System.IO.Directory.GetFiles(currentDirName, "*.png");
            if (files.Length > (startframe + frameInterval)) endframe = startframe + frameInterval;
            else endframe = files.Length;
            Bitmap bmp = (Bitmap)System.Drawing.Image.FromFile(files[startframe]);
            AviManager aviManager = new AviManager(@currentDirName + outputFile.Text, false);
            VideoStream aviStream = aviManager.AddVideoStream(true, frameRate, bmp);

            Bitmap bitmap;
            for (int n = startframe + 1; n < endframe; n++)
            {
                if (files[n].Trim().Length > 0)
                {
                    bitmap = (Bitmap)Bitmap.FromFile(files[n]);
                    aviStream.AddFrame(bitmap);
                    bitmap.Dispose();
                }
            }
            aviManager.Close();
        }
    }
}
Next I used Microsoft Expression Encoder 3 to encode the set of avi files into a Silverlight-optimized VC-1 Broadband variable bitrate wmv output, which expects a broadband connection for an average 1632 Kbps download. The whole path sweep takes about 12.5 minutes to view and 53.5MB of disk space. I used a 25fps frame rate when building the avi files. Since the sweep step is 1 ft, this works out to about 17 mph down my route.
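The playback figures can be checked with a couple of lines of Python (the math lands nearer 12.8 minutes than 12.5, but in the same ballpark):

```python
frames = 19164        # one profile per 1 ft step down the corridor
fps = 25              # frame rate used when building the avi files

minutes = frames / fps / 60      # ~12.8 minutes of video
mph = fps * 1 * 3600 / 5280      # 25 ft/s converted to ~17 mph
```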

3. Add the wmv in a MediaPlayer and connect to Silverlight Map Control.


I used a similar approach for connecting a route path to a video as described in "Azure Video and the Silverlight Path". Expression Encoder 3 comes with a set of Silverlight MediaPlayer templates. I used the simple "SL3Standard" template in this case, but you can get fancier if you want.

Looking in the Expression Templates subdirectory "C:\Program Files\Microsoft Expression\Encoder 3\Templates\en", select the ExpressionMediaPlayer.MediaPlayer template you would like to use. All of the templates start with a generic.xaml MediaPlayer template. Add the .\Source\MediaPlayer\Themes\generic.xaml to your project. Then look through this xaml for <Style TargetType="local:MediaPlayer">. Once a key name is added, this plain generic style can be referenced by MediaPlayer in the MainPage.xaml:
<Style x:Key="MediaPlayerStyle" TargetType="local:MediaPlayer">

  Width="432" Height="720"
  Style="{StaticResource MediaPlayerStyle}"

It is a bit more involved to add one of the more fancy templates. It requires creating another ResourceDictionary xaml file, adding the styling from the template Page.xaml and then adding both the generic and the new template as merged dictionaries:

  <ResourceDictionary Source="generic.xaml"/>
  <ResourceDictionary Source="BlackGlass.xaml"/>

Removing unnecessary controls, like volume controls, mute button, and misc controls, involves finding the control in the ResourceDictionary xaml and changing Visibility to Collapsed.

Loading Route to Map

The list of node points at each GetProfileData frame was captured in a text file in the first step. This file is added as an embedded resource that can be loaded at initialization. Since there are 19,164 nodes, the MapPolyline is reduced to every 25th node, resulting in a more manageable 766 node MapPolyline. The full node list is still kept in a routeLocations Collection. Having the full node list available helps to sync with the video timer. The video is encoded at 25fps, so I can relate any video time to a node index.

private List<Location> routeLocations = new List<Location>();

private void LoadRoute()
{
    MapPolyline rte = new MapPolyline();
    rte.Name = "route";
    rte.Stroke = new SolidColorBrush(Colors.Blue);
    rte.StrokeThickness = 10;
    rte.Opacity = 0.5;
    rte.Locations = new LocationCollection();

    Stream strm = Assembly.GetExecutingAssembly().GetManifestResourceStream("OnTerra_MACCorridor.corridor.direction.txt");

    string line;
    int cnt = 0;
    using (StreamReader reader = new StreamReader(strm))
    {
        while ((line = reader.ReadLine()) != null)
        {
            string[] values = line.Split(',');
            Location loc = new Location(double.Parse(values[0]), double.Parse(values[1]));
            routeLocations.Add(loc); // keep the full node list for video sync
            if ((cnt++) % 25 == 0) rte.Locations.Add(loc); // add node every second of video
        }
    }
}

A Sweep MapPolyline is also added to the map with an event handler for MouseLeftButtonDown. The corresponding MouseMove and MouseLeftButtonUp events are added to the Map Control, which sets up a user drag capability. Every MouseMove event calls a FindNearestPoint(LL, routeLocations) function which returns a Location and updates the current routeIndex. This way the user sweep drag movements are locked to the route and the index is available to get the node point at the closest frame. This routeIndex is used to update the sweep profile end points to the new locations.

Synchronizing MediaPlayer and Sweep location

From the video perspective a DispatcherTimer polls the video MediaPlayer time position every 500ms. The MediaPlayer time position returned in seconds is multiplied by the frame rate of 25fps giving the routeLocations node index, which is used to update the sweep MapPolyline locations.

In reverse, a user drags and drops the sweep line at some point along the route. The MouseMove keeps the current routeIndex updated so that the mouse up event can change the sweep locations to its new location on the route. Also in this MouseLeftButtonUp event handler the video position is updated by dividing the routeIndex by the frame rate:
VideoFile.Position = routeIndex / frameRate;
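The two-way mapping between video time and route node is just a multiply or divide by the frame rate. A minimal Python sketch (function names are mine, not the project's):

```python
FPS = 25  # video frame rate; one route node per frame

def time_to_index(position_seconds, node_count):
    """Timer direction: MediaPlayer position -> routeLocations index."""
    return min(int(position_seconds * FPS), node_count - 1)

def index_to_time(route_index):
    """Drag direction: routeLocations index -> MediaPlayer position."""
    return route_index / FPS

# five minutes into the video corresponds to node 7500
# (7,500 ft down the route at a 1 ft step)
idx = time_to_index(300, 19164)   # 7500
pos = index_to_time(idx)          # 300.0 seconds
```

The clamp to `node_count - 1` guards the end of the clip, where the polled position can run slightly past the last frame.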


Since Silverlight handles media as well as maps it’s possible to take advantage of video codecs as a sort of compression technique. In this example, all of the large number of frame images collected from a LiDAR point cloud are readily available in a web mapping interface. Connecting the video timer with a node collection makes it relatively easy to keep map position and video synchronized. The video is a pointer into the large library of LiDAR profile image frames.

From a mapping perspective, this can be thought of as a raster organizational pattern, similar in a sense to tile pyramids. In this case, however, a single time axis is the pointer into a large image set, while with tile pyramids three axis pointers access the image library with zoom, x, and y. In either case visually interacting with a large image library enhances the human interface. My previous unsuccessful experiment with video pyramids attempted to combine a serial time axis with the three tile pyramid axes. I still believe this will be a reality sometime.

Of course there are other universes than our earth’s surface. It seems reasonable to think dynamic visualizer approaches could be extended to other large domains. Perhaps Bioinformatics could make use of tile pyramids and video codecs to explore Genomic or Proteomic topologies. It would be an interesting investigation.

Bioinformatics is a whole other world. While we are playing around in “mirror land” these guys are doing the “mirror us.”

Fig 1 – MediaPlayer using BlackGlass template

Codename “Dallas” Data Subscription

Fig 1 – Data.Gov subscription source from Microsoft Dallas

What is Dallas? More info here Dallas Blog

“Dallas is Microsoft’s Information Service, built on and part of the Windows Azure platform to provide developers and information workers with friction-free access to premium content through clean, consistent APIs as well as single-click BI/Reporting capabilities; an information marketplace allowing content providers to reach developers of ALL sizes (and developers of all sizes to gain access to content previously out of reach due to pricing, licensing, format, etc.)”

I guess I fall into the information worker category and although “friction-free” may not be quite the same as FOSS maybe it’s near enough to do some experiments. In order to make use of this data service you need to have a Windows Live ID with Microsoft. The signup page also asks for an invitation code which you can obtain via email. Once the sign-in completes you will be presented with a Key which is used for access to any of the data subscription services at this secure endpoint:

Here is a screen shot showing some of the free trial subscriptions that are part of my subscription catalog. This is all pretty new and most of the data sets listed in the catalog still indicate "Coming Soon." The subscriptions interesting to me are the ones with a geographic component. There are none yet with actual latitude, longitude, but in the case of the Data.Gov crime data there is at least a city and state attribution.

Fig 2 – Dallas subscriptions

Here is the preview page showing the default table view. You select the desired filter attributes and then click preview to show a table-based view. There is also a copy of the url used to access the data on the left. Other view options include "atom 1.0", "raw", and direct import to Excel Pivot.

Fig 3 – Dallas DATA.Gov subscription preview – Crime 2006,2007 USA

There are two approaches for consuming data.

1. The easiest is the url parameter service approach: a GET url with a query string ending in $format=atom10

This isn’t the full picture because you also need to include your account key and a unique user ID in the http header. These are not sent in the url but in the header, which means using a specialized tool or coding an Http request.

	WebRequest request = WebRequest.Create(url);
	request.Headers.Add("$accountKey", accountKey);
	request.Headers.Add("$uniqueUserID", uniqueUserId);

	// Get the response
	HttpWebResponse response = (HttpWebResponse)request.GetResponse();

The response in this case is in Atom 1.0 format as indicated in the format request parameter of the url.

<feed xmlns=""
  <title type="text">Data.Gov - U.S. Offenses Known to Law Enforcement</title>
  <rights type="text">2009 U.S. Government</rights>
  <link rel="self" title="Data.Gov - U.S. Offenses Known to Law Enforcement"
href="$format=atom10" />
    <title type="text">Colorado / Alamosa in 2007</title>
    <link rel="self" href="$format=atom10
&$page=1&$itemsperpage=1" />
    <content type="application/xml">
        <d:State m:type="Edm.String">Colorado</d:State>
        <d:City m:type="Edm.String">Alamosa</d:City>
        <d:Year m:type="Edm.Int32">2007</d:Year>
        <d:Population m:type="Edm.Int32">8714</d:Population>
        <d:Violentcrime m:type="Edm.Int32">57</d:Violentcrime>
        <d:MurderAndNonEgligentManslaughter m:type="Edm.Int32">1</d:MurderAndNonEgligentManslaughter>
        <d:ForcibleRape m:type="Edm.Int32">11</d:ForcibleRape>
        <d:Robbery m:type="Edm.Int32">16</d:Robbery>
        <d:AggravatedAssault m:type="Edm.Int32">29</d:AggravatedAssault>
        <d:PropertyCrime m:type="Edm.Int32">565</d:PropertyCrime>
        <d:Burglary m:type="Edm.Int32">79</d:Burglary>
        <d:LarcenyTheft m:type="Edm.Int32">475</d:LarcenyTheft>
        <d:MotorVehicleTheft m:type="Edm.Int32">11</d:MotorVehicleTheft>
        <d:Arson m:type="Edm.Int32">3</d:Arson>

If you’re curious about MurderAndNonEgligentManslaughter, I assume it is meant to be: “Murder And Non Negligent Manslaughter”. There are some other anomalies I happened across such as very few violent crimes in Illinois. Perhaps Chicago politicians are better at keeping the slate clean.

2. The second approach using a generated proxy service is more powerful.

On the left corner of the preview page there is a Download C# service class link. This is a generated convenience class that lets you invoke the service with your account, ID, and url, but handles the XML Linq transfer of the atom response into a nice class with properties. There is an Invoke method that does all the work of getting a collection of items generated from the atom entry records:

    public partial class DataGovCrimeByCitiesItem
        public System.String State { get; set; }
        public System.String City { get; set; }
        public System.Int32 Year { get; set; }
        public System.Int32 Population { get; set; }
        public System.Int32 Violentcrime { get; set; }
        public System.Int32 MurderAndNonEgligentManslaughter { get; set; }
        public System.Int32 ForcibleRape { get; set; }
        public System.Int32 Robbery { get; set; }
        public System.Int32 AggravatedAssault { get; set; }
        public System.Int32 PropertyCrime { get; set; }
        public System.Int32 Burglary { get; set; }
        public System.Int32 LarcenyTheft { get; set; }
        public System.Int32 MotorVehicleTheft { get; set; }
        public System.Int32 Arson { get; set; }

public List<DataGovCrimeByCitiesItem> Invoke(System.String state,
            System.String city,
            System.String year,
            int page)

Interestingly, you can’t just drop this proxy service code into the Silverlight side of a project. It has to be on the Web side. In order to be useful for a Bing Maps Silverlight Control application you still need to add a Silverlight WCF service to reference on the Silverlight side. This service simply calls the nicely generated Dallas proxy service which then shows up in the Asynch completed call back.

private void GetCrimeData(string state, string city, string year, int page, string crime)
{
    DallasServiceClient dallasclient = GetServiceClient();
    dallasclient.GetItemsCompleted += svc_DallasGetItemsCompleted;
    dallasclient.GetItemsAsync(state, city, year, page, crime);
}

private void svc_DallasGetItemsCompleted(object sender, GetItemsCompletedEventArgs e)
{
    if (e.Error == null)
    {
        ObservableCollection<DataGovCrimeByCitiesItem> results = e.Result as
            ObservableCollection<DataGovCrimeByCitiesItem>;

This is all very nice, but I really want to use it with a map. Getting the Dallas data is only part of the problem. I still need to turn the City, State locations into latitude, longitude locations. This can easily be done by adding a reference to the Bing Maps Web Services Geocode service. With the geocode service I can loop through the returned items collection and send each off to the geocode service getting back a useable LL Location.

foreach (DataGovCrimeByCitiesItem item in results)
{
    GetGeocodeLocation(item.City + "," + item.State, item);
}

Since all of these geocode requests are also async callbacks, I need to pass my DataGovCrimeByCitiesItem object along as the GeocodeCompletedEventArgs e.UserState. It is also a bit tricky determining exactly when all the geocode requests have completed; I use a countdown to check for a finish.

With a latitude, longitude in hand for each of the returned DataGovCrimeByCitiesItem objects I can start populating the map. I chose to use the bubble graph approach with the crime statistic turned into a diameter. This required normalizing by the maximum value. It looks nice, although I'm not too sure how valuable such a graph actually is. Unfortunately this CTP version of the Dallas data service has an items-per-page limit of 100. I can see why this is done to prevent massive data queries, but it complicates normalization since I don't have all the pages available at one time to calculate a maximum. I could work out a way to call several pages, but there is an odd behaviour where pages greater than 1 seem to loop back to the beginning of the result set to fill out the default 100 count. There ought to be some kind of additional query for count, max, and min of result sets. I didn't see this in my experiments.
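The normalization step itself is trivial; here is an illustrative Python sketch (the function name and the 40 px maximum diameter are my own, not from the project) that also shows why the per-page maximum matters:

```python
def bubble_diameters(values, max_px=40.0):
    """Scale a crime statistic into bubble diameters by normalizing
    against the maximum value seen."""
    m = max(values)
    return [max_px * v / m for v in values]

# Normalizing against a single 100-item page means the page maximum
# stands in for the true maximum, so diameters can shift between pages.
d = bubble_diameters([57, 565, 113])  # [~4.0, 40.0, 8.0]
```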

One drawback to my approach is the number of geocode requests that are accumulated. I should really get my request list only once per state and save locally. All the bubble crime calculations could then be done on a local set in memory cache. There wouldn’t be a need then for return calls and geocode loop with each change in type of crime. However, this version is a proof of concept and lets me see some of the usefulness of these types of data services as well as a few drawbacks of my initial approach.

Here is a view of the Access Report for my experiment. If you play with the demo you will be adding to the access tallies. Since this is CTP I don’t get charged, but it is interesting to see how a dev pay program might utilize this report page. Unfortunately, User ID is currently not part of the Access Report. If this Access Report would also sort by the User ID you could simply identify each user with their own unique ID and their share of the burden could be tracked.

Fig 4 – Dallas Data.Gov subscription Access Report


The interesting part of this exercise is seeing how the Bing Maps Silverlight Control can be the nexus of a variety of data service sources. In this simple demo I’m using the Bing Maps service, The Bing Maps Web Geocode Service, and the Dallas data service. I could just as easily add other sources from traditional WMS, WFS sources, or local tile pyramids and spatial data tables. The data sources can be in essence out sourced to some other service. All the computation happens in the client and a vastly more efficient distributed web app is the result. My server isn’t loaded with all kinds of data management issues or even all that many http hits.

Fig 5 – Distributed Data Sources – SOA

Hauling Out the Big RAM

Amazon released a handful of new stuff.

“Make that a Quadruple Extra Large with room for a Planet OSM”

Fig 1 – Big Foot Memory

1. New Price for EC2 instances

             US                          EU (Europe)
          Linux    Windows  SQL       Linux    Windows  SQL
m1.small  $0.085   $0.12    -         $0.095   $0.13    -
m1.large  $0.34    $0.48    $1.08     $0.38    $0.52    $1.12
m1.xlarge $0.68    $0.96    $1.56     $0.76    $1.04    $1.64
c1.medium $0.17    $0.29    -         $0.19    $0.31    -
c1.xlarge $0.68    $1.16    $2.36     $0.76    $1.24    $2.44

Notice the small instance, now $0.12/hr, matches Azure Pricing

Compute = $0.12 / hour

This is not really apples to apples since Amazon prices a virtual instance, while Azure prices a deployed application. A virtual instance can have multiple service/web apps deployed.

2. Amazon announces a Relational Database Service RDS
Based on MySQL 5.1, this doesn’t appear to add a whole lot since you always could start an instance with any database you wanted. MySQL isn’t exactly known for geospatial even though it has some spatial capabilities. You can see a small comparison of PostGIS vs MySQL by Paul Ramsey. I don’t know if this comparison is still valid, but I haven’t seen much use of MySQL for spatial backends.

This is similar to Azure SQL Server which is also a convenience deployment that lets you run SQL Server as an Azure service, without all the headaches of administration and maintenance tasks. Neither of these options are cloud scaled, meaning that they are still single instance versions, not cross partition capable. SQL Azure Server CTP has an upper limit of 10Gb, as in hard drive not RAM.

3. Amazon adds New high memory instances

  • High-Memory Double Extra Large Instance 34.2 GB of memory, 13 EC2 Compute Units (4 virtual cores with 3.25 EC2 Compute Units each), 850 GB of instance storage, 64-bit platform $1.20-$1.44/hr
  • High-Memory Quadruple Extra Large Instance 68.4 GB of memory, 26 EC2 Compute Units (8 virtual cores with 3.25 EC2 Compute Units each), 1690 GB of instance storage, 64-bit platform $2.40-$2.88/hr

These are new virtual instance AMIs that scale up as opposed to scale out. Scaled out options use clusters of instances in the Grid Computing/Hadoop type of architectures. There is nothing to prohibit using clusters of scaled up instances in a hybridized architecture, other than cost. However, the premise of Hadoop arrays is “divide and conquer,” so it makes less sense to have massive nodes in the array. Since scaling out involves moving the problem to a whole new parallel programming paradigm with all of its consequent complexity, it also means owning the code. In contrast scaling up is generally very simple. You don’t have to own the code or even recompile just install on more capable hardware.

Returning us back to the Amazon RDS, Amazon has presumably taken an optimized compiled route and offers prepackaged MySQL 5.1 instances ready to use:

  • db.m1.small (1.7 GB of RAM, $0.11 per hour).
  • db.m1.large (7.5 GB of RAM, $0.44 per hour)
  • db.m1.xlarge (15 GB of RAM, $0.88 per hour).
  • db.m2.2xlarge (34 GB of RAM, $1.55 per hour).
  • db.m2.4xlarge (68 GB of RAM, $3.10 per hour).

Of course the higher spatial functionality of PostgreSQL/PostGIS can be installed on any of these high memory instances as well. It is just not done by Amazon. The important thing to note is memory approaches 100Gb per instance! What does one do with all that memory?

Here is one use:

“Google query results are now served in under an astonishingly fast 200ms, down from 1000ms in the olden days. The vast majority of this great performance improvement is due to holding indexes completely in memory. Thousands of machines process each query in order to make search results appear nearly instantaneously.”
Google Fellow Jeff Dean keynote speech at WSDM 2009.

Having very large memory footprints makes sense for increasing performance on a DB application. Even fairly large data tables can reside entirely in memory for optimum performance. Whether a database makes use of the best optimized compiler for Amazon’s 64bit instances would need to be explored. Open source options like PostgreSQL/PostGIS would let you play with compiling in your choice of compilers, but perhaps not successfully.

Todd Hoff has some insightful analysis in his post, “Are Cloud-Based Memory Architectures the Next Big Thing?”

Here is Todd Hoff’s point about having your DB run inside of RAM – remember that 68Gb Quadruple Extra Large memory:

“Why are Memory Based Architectures so attractive? Compared to disk, RAM is a high bandwidth and low latency storage medium. Depending on who you ask the bandwidth of RAM is 5 GB/s. The bandwidth of disk is about 100 MB/s. RAM bandwidth is many hundreds of times faster. RAM wins. Modern hard drives have latencies under 13 milliseconds. When many applications are queued for disk reads latencies can easily be in the many second range. Memory latency is in the 5 nanosecond range. Memory latency is 2,000 times faster. RAM wins again.”

Wow! Can that be right? "Memory latency is 2,000 times faster."

(Hmm… 13 milliseconds = 13,000,000 nanoseconds
so 13,000,000n/5n = 2,600,000x? And 5Gb/s / 100Mb/s = 50x? Am I doing the math right?)
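Running the quoted figures through a quick Python check supports the skepticism: the latency gap is far larger than 2,000x, and the bandwidth gap well short of "many hundreds":

```python
# figures quoted in the post, converted to consistent units
disk_latency_ns = 13e6   # 13 ms expressed in nanoseconds
ram_latency_ns = 5.0     # ~5 ns memory latency
disk_bw_bytes = 100e6    # 100 MB/s disk bandwidth
ram_bw_bytes = 5e9       # 5 GB/s RAM bandwidth

latency_ratio = disk_latency_ns / ram_latency_ns   # 2,600,000x, not 2,000x
bandwidth_ratio = ram_bw_bytes / disk_bw_bytes     # 50x, not "many hundreds"
```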

The real question, of course, is what will actual benchmarks reveal? Presumably optimized memory caching narrows the gap between disk storage and RAM. Which brings up the problem of configuring a Database to use large RAM pools. PostgreSQL has a variety of configuration settings but to date RDBMS software doesn’t really have a configuration switch that simply caches the whole enchilada.

Here is some discussion of MySQL front-ending the database with In-Memory-Data-Grid (IMDG).

Here is an article on a PostgreSQL configuration to use a RAM disk.

Here is a walk through on configuring PostgreSQL caching and some PostgreSQL doc pages.

Tuning for large memory is not exactly straightforward. There is no “one size fits all.” You can quickly get into Managing Kernel Resources. The two most important parameters are:

  • shared_buffers
  • sort_mem
“As a start for tuning, use 25% of RAM for cache size, and 2-4% for sort size. Increase if no swapping, and decrease to prevent swapping. Of course, if the frequently accessed tables already fit in the cache, continuing to increase the cache size no longer dramatically improves performance.”

OK, given this rough guideline on a Quadruple Extra Large Instance 68Gb:

  • shared_buffers = 17Gb (25%)
  • sort_mem = 2.72Gb (4%)

This still leaves plenty of room, 48.28Gb, to avoid dreaded swap pagein by the OS. Let’s assume a more normal 8Gb memory for the OS. We still have 40Gb to play with. Looking at sort types in detail may make adding some more sort_mem helpful, maybe bump to 5Gb. Now there is still an additional 38Gb to drop into shared_buffers for a grand total of 55Gb. Of course you have to have a pretty hefty set of spatial tables to use up this kind of space.
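Restating that sizing arithmetic as a quick Python check (using the rounded 68 GB figure and the percentages from the tuning quote above):

```python
total_gb = 68          # Quadruple Extra Large (rounded from 68.4 GB)
os_gb = 8              # generous reservation for the operating system

shared_buffers = 0.25 * total_gb     # 25% starting point -> 17 GB
sort_mem = 0.04 * total_gb           # 4% starting point  -> ~2.72 GB
headroom = total_gb - shared_buffers - sort_mem   # ~48.28 GB left over

# after reserving the OS share and bumping sort_mem to 5 GB,
# everything else can go to shared_buffers:
final_shared = total_gb - os_gb - 5  # 55 GB
```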

Here is a list of PostgreSQL limitations. As you can see it is technically possible to run out of even 68Gb.


Maximum Database Size Unlimited
Maximum Table Size 32 TB
Maximum Row Size 1.6 TB
Maximum Field Size 1 GB
Maximum Rows per Table Unlimited
Maximum Columns per Table 250 – 1600 depending on column types
Maximum Indexes per Table Unlimited

Naturally the Obe duo has a useful posting on determining PostGIS sizes: Determining size of database, schema, tables, and geometry

To get some perspective on size, an Open Street Map dump of the whole world fits into a 90GB EBS Amazon Public Data Set configured for PostGIS with pg_createcluster. Looks like this just happened a couple weeks ago. Although 90GB is just a little out of reach for even a Quadruple Extra Large, I gather the current size of planet osm is still in the 60GB range and you might just fit it into 55GB RAM. It would be a tad tight. Well, maybe an Octuple Extra Large Instance with 136GB is not too far off. Of course who knows how big Planet OSM will ultimately end up being.

Another point to notice is the 8 virtual cores in a Quadruple Extra Large Instance. Unfortunately:

“PostgreSQL uses a multi-process model, meaning each database connection has its own Unix process. Because of this, all multi-cpu operating systems can spread multiple database connections among the available CPUs. However, if only a single database connection is active, it can only use one CPU. PostgreSQL does not use multi-threading to allow a single process to use multiple CPUs.”

Running a single connection query apparently won’t benefit from a multi cpu virtual system, even though running multi threaded will definitely help with multiple connection pools.

I look forward to someone actually running benchmarks since that would be the genuine reality check.


Scaling up is the least complex way to boost performance on a lagging application. The Cloud offers lots of choices suitable to a range of budgets and problems. If you want to optimize personnel and adopt a decoupled SOA architecture, you’ll want to look at Azure + SQL Azure. If you want the adventure of large scale research problems, you’ll want to look at instance arrays and Hadoop clusters available in Amazon AWS.

However, if you just want a quick fix, maybe not 2,000x but at least some x, better take a look at Big RAM. If you do, please let us know the benchmarks!

My EPSG:54004 mystery solved!

EPSG:54004 problem fig 1
Fig 1 – DIA Looks a Lot Better!

With a helpful comment from SharpGIS I was able to finally pin down my baffling problem with EPSG:54004.

The problem is in the datums.



As Morten pointed out the 54004 datum includes a flattening, 298.257223563 :

So 54004 should be treated as an Ellipsoid rather than a Sphere.

There is a subtle difference in 900913. If you notice 900913 also includes a flattening:


PROJCS["Google Mercator",
    GEOGCS["WGS 84",
        DATUM["World Geodetic System 1984",
            SPHEROID["WGS 84", 6378137.0, 298.257223563]],
        PRIMEM["Greenwich", 0.0],
        UNIT["degree", 0.017453292519943295],
        AXIS["Geodetic latitude", NORTH],
        AXIS["Geodetic longitude", EAST]],
    PROJECTION["Mercator_1SP"],
    PARAMETER["semi_minor", 6378137.0],
    PARAMETER["latitude_of_origin", 0.0],
    PARAMETER["central_meridian", 0.0],
    PARAMETER["scale_factor", 1.0],
    PARAMETER["false_easting", 0.0],
    PARAMETER["false_northing", 0.0],
    UNIT["m", 1.0],
    AXIS["Easting", EAST],
    AXIS["Northing", NORTH],
    AUTHORITY["EPSG","900913"]]

However, you might not notice in addition it includes an explicit minor axis parameter.
And this minor axis is identical to the major axis. The axis definition overrides the flattening in the Datum and is probably technically incorrect, but the idea was just to get a spherical mercator into a definition that people could use to match Google Maps. I’ve seen this definition in PostGIS, GeoServer, and OpenLayers.

I had already noticed this and played with a MercatorEllipsoid function to see if that would fix my problem. However, sadly, I made an error and miscalculated eccentricity. The real equation for e goes something like this:
double f = 1 / 298.257223563;
double e = Math.Sqrt(2 * f - Math.Pow(f, 2));

resulting in e = 0.081819190842621486;
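With the corrected eccentricity in hand, the ellipsoidal Mercator forward projection can be sketched in Java. This is an illustrative reconstruction, not the actual MercatorEllipsoid code; the class and method names are mine:

```java
class EllipsoidalMercator {
    static final double A = 6378137.0;                 // WGS84 semi-major axis
    static final double F = 1 / 298.257223563;         // flattening
    static final double E = Math.sqrt(2 * F - F * F);  // eccentricity = 0.08181919...

    // Forward projection: lon/lat in degrees -> ellipsoidal Mercator meters
    static double[] forward(double lonDeg, double latDeg) {
        double lon = Math.toRadians(lonDeg);
        double lat = Math.toRadians(latDeg);
        double x = A * lon;
        double esin = E * Math.sin(lat);
        // The (1 - e sin(lat))/(1 + e sin(lat)) ^ (e/2) term is the ellipsoidal
        // correction; dropping it gives the spherical (900913-style) y value.
        double y = A * Math.log(Math.tan(Math.PI / 4 + lat / 2)
                 * Math.pow((1 - esin) / (1 + esin), E / 2));
        return new double[] { x, y };
    }
}
```

At 45°N the ellipsoidal y lands about 30km south of the spherical value, which is exactly the kind of offset that had DIA in the wrong place.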

Once I made the correction for the proper eccentricity in MercatorEllipsoid, ESRI:54004 lines up with the EPSG:3857. DIA is back in its rightful place.

My MercatorEllipsoid function now calculates correct BBOX parameters for GetMap requests, but only for com.esri.wms.Esrimap services. Looks like ESRI is the expert and correctly produces ESRI:54004 with Datum as defined. However, not so with GeoServer.

GeoServer seems to ignore the flattening 298.257223563, or else assumes 54004 is like 900913, where the flattening is overridden by a minor axis parameter equal to the major axis:
PARAMETER["semi_minor", 6378137.0]

This leads to some problems. My WMS client now has to decide which service correctly interprets the DATUM on 54004. For now I just check for “com.esri.wms.Esrimap” in the WMS url and change datums accordingly. This will undoubtedly lead to problems with other services since I don’t yet know how MapServer or others treat 54004.
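For now the workaround in the client amounts to a simple service check, something like this sketch (the helper name is hypothetical; the real client code differs):

```java
class DatumCheck {
    // Treat EPSG:54004 as ellipsoidal only for ESRI's WMS servlet, since
    // GeoServer (and possibly others) serve it with a spherical minor axis.
    static boolean useEllipsoidalDatum(String wmsUrl) {
        return wmsUrl.contains("com.esri.wms.Esrimap");
    }
}
```

A brittle approach, obviously, since it keys off the URL rather than any advertised capability, but it works until the other servers' behavior is known.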


  1. ESRI is right again!
  2. Always check your math one more time
  3. Community responses produce answers

Thanks everyone!

Follow up on BLM LSIS Transparency

Silverlight MapControl OWS Viewer
Fig 1 – BLM LSIS does offer transparency

Ah, it looks like I ran into a different problem in my exploration of BLM LSIS. It is not a problem with Transparent=True, but a problem with Silverlight’s support of 24 bit png.

There is some discussion of this issue elsewhere.

Vish’s Rambling adds some discussion as well.

It is interesting to note how reluctant Microsoft is to admit to a bug of any kind. I guess it is a big target for the lawyers, but I’ve never seen that kind of reluctance in the Open Source world.

Hopefully Silverlight support for png transparency will be cleared up in a future release.

Chrome Problems

Fig 1 IE with Silverlight ve:Map component

With the introduction of Chrome, Google has thrown down the gauntlet to challenge IE and Firefox. Out of curiosity I thought it would be interesting to download the current Chrome Beta and see what it could do with some of the interfaces I’ve worked on. Someone had recently quipped, “isn’t all of Google Beta?” I guess the same could be said of Amazon AWS, but then again, in the “apples to apples” vein, I decided to compare IE8 Beta and Chrome Beta. The above screen shot shows an example of the new Silverlight ve:Map component in an ASP.NET Ajax site running on IIS6. The browser is IE8 beta on Vista, and, surprise (not), it all works as expected.

Fig 2 Chrome with Silverlight ve:Map component

Also not surprisingly, the same Silverlight ve:Map component in an ASP.NET Ajax site fares poorly in Chrome. In fact the component does not appear at all, while curiously the asp:MenuItems act oddly: instead of the expected drop down I get a refresh to a new horizontal row.

Fig 3 IE with Google Map component

Moving on to a Google Map component embedded in the same ASP page, IE8 beta displays the map component, including the newer G_SATELLITE_3D_MAP map type. The ASP drop down menu and tooltips all work.

Fig 4 Chrome with Google Map component

Since this is a Google Map component I would be disappointed if it did not work in Chrome, and it does. Except I noticed the G_SATELLITE_3D_MAP map type is missing. I guess Chrome Beta has not caught up with Google Map Beta. Again the ASP menu is not functional.

Fig 5 IE Google Map Control with Earth Mode – G_SATELLITE_3D_MAP

Back to IE to test the 3D Earth mode of my Google Map component. As seen above, it all works fine.

Fig 6 IE Silverlight Deep Earth

Now to check the new Silverlight DeepEarth component in IE. DeepEarth is a nice little MultiScaleTileSource library for smoothly spinning around the VE tile engines. It works as amazingly smoothly as ever.

Fig 7 Google Chrome Deep Earth

However, in Chrome, no luck, just a big white area. I suppose that Silverlight was not a high priority with Chrome.

Fig 8 IE SVG hurricane West Atlantic weather clip

Switching to some older SVG interfaces, I took a look at the Hurricane clips in the West Atlantic. It looks pretty good; Hanna is deteriorating to a storm and Ike is still out east of the Bahamas.

Fig 9 Chrome SVG hurricane West Atlantic weather clip

On Chrome it is not so nice. The static menu side of the svg frames shows up but the image and animation stack is just gray. Clicking on menu items verifies that events are not working. Of course this SVG is functional only in the Adobe SVG viewer, but evidently Chrome has some svg problems.

Fig 10 IE ASP .NET 3.5

Moving back to IE8, I browsed through a recent ASP .NET 3.5 site I built for an Energy monitoring service. This is a fairly complete demonstration of ListView and LINQ to SQL, and of course it works in IE8 beta.

Fig 11 Chrome ASP .NET 3.5

Surprisingly, Chrome does a great job on the ASP .NET 3.5 site. Almost all the features work as expected, with the exception of the same old Menu problems.

Fig 12 IE SVG OWS interface

Finally I went back down memory lane for an older OWS interface built with SVG, using the Adobe Viewer variety. There are some glitches in IE8 beta. Although I can still see WMS and WFS layers and zoom around a bit, some annoying errors do pop up here and there. Adobe SVG viewer is actually orphaned, ever since Adobe picked up Macromedia and Flash, so it will doubtless recede into the distant past as the new browser generations arrive. Unfortunately, there is little Microsoft activity in SVG, in spite of competition from the other browsers, Safari, Firefox, and Opera. It will likely remain a 2nd class citizen in IE terms, as Silverlight’s intent is to replace Flash, which itself is a proprietary competitor to SVG.

Fig 13 Chrome SVG OWS interface

Chrome and Adobe SVG are not great friends. Rumor has it that Chrome intends to fully support SVG, so if I ever get around to it, I could rewrite these interfaces for Firefox, Opera, and Chrome 2.0.

Chrome is beta and brand new. Although it has a lot of nice features and a quick clean tabbed interface, I don’t see anything but problems for map interfaces. Hopefully the Google Map problems will be ironed out shortly. There is even hope for SVG at some later date. I imagine even Silverlight will be supported, grudgingly, since I doubt that Google has the clout to dictate usage on the internet.

Deep Zoom a TerraServer UrbanArea on EC2

Fig 1 – Silverlight MultiScaleImage of a high resolution Denver image – 200.6Mb .png

Just to show that I can serve a compiled Deep Zoom Silverlight app from various Apache servers, I loaded this Denver example on a Windows 2003 Apache Tomcat server, and then a duplicate on a Linux Ubuntu 7.10 instance running in the Amazon EC2, this time using Apache httpd rather than Tomcat. Remember these are using beta technology and will require updating to Silverlight 2.0. The Silverlight install is only about 4.5Mb, so the install is relatively painless on a normal bandwidth connection.

Continuing the exploration of Deep Zoom, I’ve had a crash course in Silverlight. Silverlight is theoretically cross browser compatible (at least for IE, Safari, and FireFox), and it’s also cross server. The trick for compiled Silverlight is to use Visual Studio 2008 with .NET 3.5 updates. Under the list of new project templates is a template called ‘Silverlight application’. Using this template sets up a project that can be published directly to the webapp folder of my Apache Server. I have not tried a DeepZoom MultiScaleImage on Linux FireFox or Mac Safari clients. However, I can view this on a Windows XP FireFox updated to Silverlight 2.0Beta as well as Silverlight updated IE7 and IE8 beta.

Creating a project called Denver and borrowing liberally from a few published examples, I was able to add a ClientBin folder under my Denver_Web project folder. Into this folder goes the pyramid I generate using Deep Zoom Composer. Once the pyramid is copied into place I can reference this source from my MultiScaleImage element source. Now the pyramid is viewable.

To make the MultiScaleImage element useful, I added a couple of additional .cs touches for mousewheel and drag events. Thanks to the published work of Lutz Gerhard, Peter Blois, and Scott Hanselman, this was just a matter of including a MouseWheelHelper.cs in the project namespace and adding a few delegate functions to the main Page initialization code behind file.

Now I need to backtrack a bit. How do I get some reasonable Denver imagery for testing this Deep Zoom technology? Well, I don’t belong to DRCOG, which I understand is planning on collecting 6″ aerials. There are other imagery sets floating around Denver as well, I believe down to 3″ pixel resolution. However, the cost of aerial capture precludes any free and open source type of use. Fortunately, there is some nice aerial data available from the USGS. The USGS Urban Area imagery is available for a number of metropolitan areas, including Denver.

Fig 2 – Same high resolution Denver image zoomed in to show detail

USGS Urban Area imagery is a color orthorectified image set captured at approximately 1ft pixel resolution. The data is made available to the public through the TerraServer WMS. Looking over the TerraServer UrbanArea GetCapabilities layer I see that I can ‘GetMap’ this layer in EPSG:26913 (UTM83-13m). The best possible pixel resolution through the TerraServer WMS is 0.25m per pixel. To achieve this level of resolution I can use the max pixel Height and Width of 2000 over a metric bounding box of 500m x 500m.

For example:,4399768,511672,4400268&WIDTH=2000&HEIGHT=2000

This is nice data but I want to get the max resolution for a larger area and mosaic the imagery into a single large image that I will then feed into the Deep Zoom Composer tool for building the MultiScaleImage pyramid. Java is the best tool I have to make a simple program to connect to the WMS and pull down my images one at a time into the tiff format.
try {
    File outFile = new File(dir + imageFileName);
    URL u = new URL(url);
    HttpURLConnection geocon = (HttpURLConnection) u.openConnection();
    BufferedImage image = ImageIO.read(geocon.getInputStream());
    ImageIO.write(image, "tiff", outFile);
    geocon.disconnect();
    System.out.println("download completed to " + dir + imageFileName + " " + bbox);
} catch (IOException e) {
    e.printStackTrace();
}
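The loop over the area of interest just steps a 500m x 500m bounding box across the UTM grid and builds one GetMap request per tile. A sketch of that loop, with a placeholder base URL and grid origin (the real TerraServer endpoint and coordinates are omitted here):

```java
import java.util.ArrayList;
import java.util.List;

class TileRequests {
    // Build GetMap URLs for an n x n grid of 500m UTM tiles starting at (minX, minY).
    static List<String> buildUrls(String baseUrl, double minX, double minY, int n) {
        double tile = 500.0;   // 500m ground extent -> 2000px at 0.25m/pixel
        List<String> urls = new ArrayList<>();
        for (int row = 0; row < n; row++) {
            for (int col = 0; col < n; col++) {
                double x0 = minX + col * tile;
                double y0 = minY + row * tile;
                urls.add(baseUrl + "&BBOX=" + x0 + "," + y0 + ","
                        + (x0 + tile) + "," + (y0 + tile)
                        + "&WIDTH=2000&HEIGHT=2000");
            }
        }
        return urls;
    }
}
```

Each generated URL then feeds the download snippet above; a 6x6 grid yields the 36 tif tiles used in this experiment.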

Looping this over my desired area creates a directory of 11.7Mb tif images. In my present experiment I grabbed a set of 6×6 tiles, or 36 tiff files at a total of 412Mb. The next step is to collect all of these tif tiles into a single mosaic. The Java JAI package contains a nice tool for this called mosaic:
mosaic = JAI.create("mosaic", pbMosaic, new RenderingHints(JAI.KEY_IMAGE_LAYOUT, imageLayout));

Iterating pbMosaic.addSource(translated); over my set of TerraServer tif files and then using PNGImageEncoder, I am able to create a single png file of about 200Mb. Now I have a sufficiently large image to drop into the Deep Zoom Composer for testing. The resulting pyramid of jpg files is then copied into my ClientBin subdirectory of the Denver VS2008 project. From there it is published to the Apache webapp. Now I can open my Denver webapp for viewing the image pyramid. On this client system with a good GPU and dual core cpu the image zoom and pan is quite smooth and replicates a nice local application viewing program with smooth transitions around real time zoom pan space. On an older Windows XP running FireFox the pan and zoom is very similar. This is on a system with no GPU so I am impressed.

Peeking into the pyramid I see that the bottom level 14 contains 2304 images for a 200Mb png pyramid. Each image stays at 256×256 and the compression ranges from 10kb to 20kb per tile. Processing into the jpg pyramid compresses from the original 412Mb tif set => 200.5Mb png mosaic => 45.7Mb 3084 file jpg pyramid. Evidently there is a bit of lossy compression, but the end effect is that the individual tiles are small enough to stream into the browser at a decent speed. Connected with high bandwidth the result is very smooth pan and zoom. This is basically a Google Earth or Virtual Earth user experience all under my control!
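The pyramid geometry is easy to estimate: Deep Zoom halves the image at each level down to 1x1 pixel, so the deepest level index is ceil(log2(max(width, height))), and each level is cut into 256px tiles. A rough calculator, a sketch of the math rather than Composer's actual code:

```java
class PyramidMath {
    // Deepest level index: Deep Zoom level n spans 2^n pixels on the long side.
    static int maxLevel(int width, int height) {
        return (int) Math.ceil(Math.log(Math.max(width, height)) / Math.log(2));
    }

    // Tile count at full resolution with 256px tiles.
    static int tilesAtFullRes(int width, int height) {
        int cols = (width + 255) / 256;   // ceiling division
        int rows = (height + 255) / 256;
        return cols * rows;
    }
}
```

For the roughly 12000px mosaic here (6x6 tiles at 2000px), maxLevel comes out to 14, which squares with the bottom level 14 observed in the pyramid.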

Now that I have a workflow and a set of tools, I wanted to see what limits I ran into. The next step was to increment my tile set to an 8×8 for 64 tifs to see if my mosaic tool would endure the larger size as well as the DeepZoom Composer. My JAI mosaic will be the sticking point on a maximum image size since the source images are built in memory which on this machine is 3Gb. Taking into account Vista’s footprint I can actually only get about 1.5Gb. One possible workaround to that bottleneck is to create several mosaics and then attempt to splice them in the Deep Zoom Composer by manually positioning them before exporting to a pyramid.

First I modified my mosaic program to write a Jpeg output with jpgParams.setQuality(1.0f); This results in a faster mosaic and a smaller export. The JAI PNG encoder is much slower than JPEG. With this modification I was able to export a couple of 3000m x 3000m mosaics as jpg files. I then used Deep Zoom Composer to position the two images horizontally and exported as a single collection. In the end the image pyramid is 6000m x 3000m and 152Mb of jpg tiles. It looks like I might be able to scale this up to cover a large part of the Denver metro UrbanArea imagery.

The largest mosaic I was able to get Deep Zoom Composer to accept was 8x8, or 16000px x 16000px, which is just 4000m x 4000m on the ground. Feeding this 143Mb mosaic through Composer resulted in a pyramid consisting of 5344 jpg files at 82.3Mb. However, scaling to a 5000m x 5000m set of 100 tifs, the 221Mb mosaic failed on import to Deep Zoom Composer. I say failed, but in this prerelease version the import finishes with a blank image shown on the right. Export works in the usual quirky fashion in that the export progress bar generally never stops, but in this case the pyramid also remains empty. Another quirky item to note is that each use of Deep Zoom Composer starts a SparseImageTool.exe process which continues consuming about 25% of cpu even after the Deep Zoom Composer is closed. After working awhile you will need to go into Task Manager and close down these processes manually. Apparently this is "pre-release."

Fig 3 – Same high resolution Denver image zoomed in to show detail of Coors Field; players are visible

Deep Zoom is an exciting technology. It allows map hackers access to real time zoom and pan of large images. In spite of some current size limitations on the Composer tool, the actual pyramid serving appears to have no real limit. I verified on a few clients and was impressed that this magic works in IE and FireFox, although I don’t have a Linux or Mac client to test. The compiled code serves easily from Apache and Tomcat with no additional tweaking required. My next project will be adapting these Deep Zoom pyramids into a tile system. I plan to use either an OWS front end or Live Maps with a grid overlay. The deep zoom tiles can then be accessed by clicking on a tile to open a Silverlight MultiScaleImage. This approach seems like a simple method for expanding coverage over a larger metropolitan area while still using the somewhat limiting Deep Zoom Composer pre release.

Wide Area HVAC controller using WPF and ZigBee Sensor grid

One project I’ve been working on recently revolves around an online controller for a wide area HVAC system. HVAC systems can sometimes be optimized for higher efficiency by monitoring performance in conjunction with environment parameters. Local rules can be established for individual systems based on various temperatures, humidity, and duct configurations. Briefly, a set of HVAC functions, consisting of on/off relay switches and thermistors, can be observed from an online monitoring interface. Conversely, state changes can be initiated online by issuing a command to a queue. These sensors and relays might be scattered over a relatively large geographic area and in multiple locations inside a commercial building.

It is interesting to connect a macro geospatial world with a micro world, drilling down through a local facility to a single thermistor chip. In the end it’s all spatial.

Using a simple map view allows drill down from a wide area to a building, a device inside a building, a switch bank, and an individual relay or analog channel for monitoring or controlling. The geospatial aspect of this project is somewhat limited; however, the zoom and pan tools used in the map location also happen to work well in the facilities and graphing views.

The interface can be divided into three parts:
1) The onsite system – local base system and Zigbee devices
2) The online server system – standard Apache Tomcat
3) The online client interface – WPF xbap, although svg would also work with a bit more work

Onsite System

The electronically impaired, like myself, may find the details of controller PIC chip sets, relays, and thermistor spec sheets baffling, but really they look more hacky than they are:

Fig 0 – Left: Zigbee usb antenna ; Center: thermistor chip MCP9701A; Right: ProXR Zigbee relay controller

The onsite system is made up of sensors and controller boards. The controller boards include a ZigBee antenna along with a single bank of 8 relays and an additional set of 8 analog inputs. The sensors are wired to the controller board in this development mode. However, ZigBee enabled temperature sensors are also a possibility, just more expensive. See SunSpot for an example (open source hardware?).

ZigBee is a wifi type communications protocol based on IEEE 802.15.4. It allows meshes of devices to talk to each other via RF as long as they are within about 100-300 ft of another node on the mesh. Extender repeaters are also available. ZigBee enabled devices can be scattered around a facility and communicate back to a base system by relaying messages node to node through an ad hoc mesh network.

The onsite system has a local pc acting as the base server. The local onsite server communicates with the outside world via an internet router and monitors the ZigBee network. ASUS EeePCs look like a good candidate for this type of application. Messages originating from outside are communicated down to the individual relay through the ZigBee mesh, while state changes and analog readings originating from a controller relay or sensor channel are communicated up the ZigBee network and then passed outside by the local onsite server.

The local server must have a small program polling the ZigBee devices and handling communications to the outside world via an internet connection. The PC is equipped with a usb ZigBee antenna to communicate with the other ZigBee devices in the network. This polling software was written in Java, even though that may not be the best language for serial USB com control in Windows. The target system will be Linux based. The ZigBee devices we selected came with a USB driver that treats a USB port like a simple COM port.

Since this was a Java project the next step was finding a comm api. Sun’s JavaComm has discontinued support of Windows, although it is available for Linux. Our final onsite system will likely be Linux for cost reasons, so this is only a problem with the R&D system, which is Windows based. I ended up using the RXTX library, RXTXcomm.jar.

Commands for our ProXR controller device are a series of numeric codes, for example <254;140;3;1>. This series puts the controller in command mode (254), issues a set bank status command (140), a byte indicating relays 0 and 1 on (3), and a bank address (1). The result is that relays 0 and 1 are switched to the on position. Commands are issued similarly for reading relay state and analog channels; <254;166;1>, for example, reads all 8 analog I/O channels as a set of 8 bytes.
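The relay byte is just a bitmask, one bit per relay, so building a bank command can be sketched like this (a hypothetical helper, not the actual polling code):

```java
class ProXRCommand {
    // Build a "set bank status" command: mode 254, command 140, relay bitmask, bank.
    static String setBank(int bank, int... relaysOn) {
        int mask = 0;
        for (int r : relaysOn) {
            mask |= 1 << r;   // relay n maps to bit n of the status byte
        }
        return "<254;140;" + mask + ";" + bank + ">";
    }
}
```

Turning on relays 0 and 1 of bank 1 produces the <254;140;3;1> sequence described above.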

Going into prototype mode, we picked up a batch of three wire MCP9701A thermistor chips for a few dollars. The trick is to pick the right resistance to get voltage readings into the mid range of the 8bit or 10bit analog channel read. Using 8 bit output lets us poll for temperature with around .5 degree F resolution.
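Converting an analog count back to temperature follows the MCP9701A's nominal transfer function, roughly 19.5mV per degree C with a 400mV output at 0°C. A sketch assuming a 5V ADC reference (the reference voltage and scaling are assumptions; check the spec sheet for the actual configuration):

```java
class Thermistor {
    static final double VREF_MV = 5000.0;  // assumed 5V ADC reference
    static final double MV_PER_C = 19.5;   // MCP9701A nominal slope
    static final double MV_AT_0C = 400.0;  // MCP9701A nominal output at 0 degrees C

    // 8-bit ADC count -> degrees Celsius
    static double countToCelsius(int count) {
        double mv = count * VREF_MV / 256.0;
        return (mv - MV_AT_0C) / MV_PER_C;
    }

    static double celsiusToFahrenheit(double c) {
        return c * 9.0 / 5.0 + 32.0;
    }
}
```

One nice property of the 19.5mV/°C slope is that it nearly matches one 8-bit step at a 5V reference, so each count is on the order of a degree; a lower reference or 10-bit reads tighten the resolution.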

The polling program issues commands and reads results on separate threads. If state is changed locally it is communicated back to the online server on the next polling message, while commands from the online command queue are written to the local controller boards with the return. In the meantime every polling interval sends an analog channel record back to the server.
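The poll cycle itself reduces to: read the channels, post them, apply whatever commands came back with the response. A minimal single-cycle sketch, where the Board and Server interfaces are stand-ins for the serial I/O and the HTTP round trip (not the actual program's types):

```java
import java.util.List;

class PollCycle {
    interface Board {                       // stand-in for the ZigBee/serial layer
        int[] readAnalog();
        void write(String command);
    }

    interface Server {                      // stand-in for the online servlet call
        List<String> postReadings(int[] channels);  // returns queued commands
    }

    // One polling interval: push readings up, apply returned commands locally.
    static void pollOnce(Board board, Server server) {
        int[] channels = board.readAnalog();
        for (String cmd : server.postReadings(channels)) {
            board.write(cmd);
        }
    }
}
```

In the real program this runs on a timer with reads and writes on separate threads, but the queue-with-the-return pattern is the same.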

Online Server

The online server is an Apache Tomcat service with a set of servlets to process communications from the onsite servers. Polled analog readings are stored in a PostgreSQL database with building:device:bank:channel addresses as well as a timestamp. The command queue is another PostgreSQL table which is checked at each poll interval for commands addressed to the building address which initiated the poll. Any pending commands are returned to the polling onsite server, where they are sent out to the proper device:bank:relay over the ZigBee network.

Two other tables simply provide locations of buildings as longitude, latitude in the wide area HVAC control system. Locations of devices inside buildings are stored in a building table as floor and x,y coordinates. These are available for the client interface.

Client Interface

The client interface was developed using WPF xbap to take advantage of xaml controls and a WMS mapping interface. Initially the client presents a tabbed menu with a map view. The map view indicates the wide area HVAC extents with a background WMS image for reference. Zooming in to the building location of interest allows the user to select a building to show a floor plan with device locations indicated.

Fig 1 HVAC wide area map view

Once a building is selected the building floor plans are displayed. Selecting an individual device determines the building:device address.

Fig 2 Building:device address selection from facilities floor plan map

Finally individual relays can be selected from the device bank by pushing on/off buttons. Once the desired switch configuration is set in the panel, it can be sent to the command queue as a building:device:bank address command. Current onsite state is also double checked by the next polling return from the onsite server.

Fig 3 Relay switch panel for selected building:device:bank address.

The analog IO channels are updated to the server table at the set polling interval. A selection of the analog tab displays a set of graph areas for each of the 8 channels. The on/off panel initiates a server request for the latest set of 60 polled readings which are displayed graphically. It won’t be much effort to extend this analog graph to a bidirectional interface with user selectable ranges set by dragging floor and ceiling lines that trigger messages or events when a line is crossed.

Fig 4 Analog IO channel graphs

This prototype incorporates several technologies using a Java based Tomcat service online and a Java RXTXcomm Api for the local Zigbee polling. The client interface is also served out of Apache Tomcat as WPF xaml to take advantage of easier gui control building. In addition OGC WMS is used for the map views. The facilities plan views will be DXF translations to WPF xaml. Simple graphic selection events are used to build addresses to individual relays and channels. The server provides historical command queues and channel readings by storing time stamped records. PostgreSQL also has the advantage of handling record locking on the command queue when multiple clients are accessing the system.

This system is in the prototype stage but illustrates the direction of control systems. A single operator can maintain and monitor systems from any location accessible to the internet, which is nearly anywhere these days. XML rendering graphics grammars for browsers like svg and xaml enable sophisticated interfaces that are relatively simple to build.

There are several OGC specifications oriented toward sensor grids. The state of the art is still in flux, but by virtue of the need for spatial management of sensor grids, there will be a geospatial component in a “ubiquitous sensor” world.

Xaml on Amazon EC2 S3

Time to experiment with Amazon EC2 and S3. This site is using an Amazon EC2 instance with a complete open source GIS stack running on Ubuntu Gutsy.

  • Ubuntu Gutsy
  • Java 1.6.0
  • Apache2
  • Tomcat 6.0.16
  • PHP5
  • MySQL 5.0.45
  • PostgreSQL 8.2.6
  • PostGIS 1.2.1
  • GEOS 2.2.3-CAPI-1.1.1
  • Proj 4.5.0
  • GeoServer 1.6

Running an Apache2 service with a jk_mod connector to Tomcat lets me run the examples of xaml xbap files with their associated Java servlet utilities for pulling up GetCapabilities trees on various OWS services. This is an interesting example of combining open source and WPF. In the NasaNeo example, Java is used to create the 3D terrain models from JPL SRTM data and drape them with BMNG imagery, all served as WPF xaml to take advantage of native client bindings.

I originally attempted to start with a public ami based on Fedora Core 6. I found loading the stack difficult, with hard-to-find RPMs and awkward installation issues. I finally ran into a wall with the PostgreSQL/PostGIS install. In order to load it I needed a complete gcc make package to compile from sources. It did not seem worth the trouble. At that point I switched to an Ubuntu 7.10 Gutsy ami.

Ubuntu, based on Debian, is somewhat different in its directory layout from the Fedora base. However, Ubuntu apt-get was much better maintained than the Fedora Core yum installs. This may be due to using the older Fedora 6 rather than a Fedora 8 or 9, but there did not appear to be any usable public ami images available on AWS EC2 for the newer Fedoras. In contrast to Fedora, on Ubuntu installing a recent version of PostgreSQL/PostGIS was a simple matter:
apt-get install postgresql-8.2-postgis postgis

In this case I was using the basic small 32 bit instance ami with 1.7Gb memory and 160Gb storage at $0.10/hour. The performance was very comparable to some dedicated servers we are running, perhaps even a bit better since the Ubuntu service is setup using an Apache2 jk_mod to tomcat while the dedicated servers simply use tomcat.

There are some issues to watch for on the small ami instances. The storage is 160Gb, but the partition allots just 10Gb to root and the balance to a /mnt point. This means the default installations of mysql and postgresql will have data directories on the smaller 10Gb partition. Amazon has done this to limit ec2-bundle-vol to a 10GB max. ec2-bundle-vol is used to store an image to S3, which is where the whole utility computing gets interesting.

Once an ami stack has been installed it is bundled and stored on S3, and that ami is then registered with AWS. Now you have the ability to replicate the image on as many instances as desired. This allows very fast scaling or failover with minimal effort. The only caveat of course is dynamic data. Unless provision is made to replicate mysql and postgresql data to multiple instances or S3, any changes can be lost with the loss of an instance. This does not appear to occur terribly often, but then again AWS is still Beta. Also important to note: the DNS domain pointed to an existing instance will also be lost with the loss of your instance. Bringing up a new instance requires a change to the DNS entry as well (several hours), since each instance creates its own unique amazon domain name. There appear to be some workarounds for this requiring more extensive knowledge of DNS servers.

In my case the data sources are fairly static. I ended up changing the datadir pointers to /mnt locations. Since these are not bundled in the volume creation, I handled them separately. Once the data required was loaded I ran a tar on the /mnt/directory and copied the .tar files each to its own S3 bucket. The files are quite large so this is not a nice way to treat backups of dynamic data resources.

Next week I have a chance to experiment with a more comprehensive solution from Elastra. Their beta version promises to solve these issues by wrapping Postgresql/postgis on the ec2 instance with a layer that uses S3 as the actual datadir. I am curious how this is done but assume for performance the indices remain local to an instance while the data resides on S3. I will be interested to see what performance is possible with this product.

Another interesting area to explore is Amazon’s recently introduced SimpleDB. This is not a standard sql database but a type of hierarchical object stack over on S3 that can be queried from EC2 instances. This is geared toward non typed text storage which is fairly common in website building. It will be interesting to adapt this to geospatial data to see what can be done. One idea is to store bounding box attributes in the SimpleDB and create some type of JTS tool for indexing on the ec2 instance. The local spatial index would handle the lookup which is then fed to the SimpleDB query tools for retrieving data. I imagine the biggest bottleneck in this scenario would be the cost of text conversion to double and its inverse.
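The local spatial index idea can be sketched simply: keep item names and bounding boxes in memory on the EC2 instance, intersect against a query box, then fetch only the matching items from SimpleDB. A toy in-memory version (JTS or an R-tree would replace the linear scan; the item names here are placeholders):

```java
import java.util.ArrayList;
import java.util.List;

class BBoxIndex {
    static class Entry {
        final String itemName;
        final double minX, minY, maxX, maxY;
        Entry(String itemName, double minX, double minY, double maxX, double maxY) {
            this.itemName = itemName;
            this.minX = minX; this.minY = minY;
            this.maxX = maxX; this.maxY = maxY;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    void add(String itemName, double minX, double minY, double maxX, double maxY) {
        entries.add(new Entry(itemName, minX, minY, maxX, maxY));
    }

    // Return item names whose bbox intersects the query bbox; these names
    // would then be used to pull the full records from SimpleDB.
    List<String> query(double minX, double minY, double maxX, double maxY) {
        List<String> hits = new ArrayList<>();
        for (Entry e : entries) {
            boolean disjoint = e.maxX < minX || e.minX > maxX
                            || e.maxY < minY || e.minY > maxY;
            if (!disjoint) {
                hits.add(e.itemName);
            }
        }
        return hits;
    }
}
```

Keeping the index local sidesteps the text-to-double conversion cost on every query; only the final fetch of matching items pays it.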

Utility computing has an exciting future in the geospatial realm – thank you Amazon and Zen.

This is a test of WordPress 2.2.1

I’ve manually run a blog for the past couple of years, but decided it was about time to set up a more automated approach. WordPress is a popular weblog tool using PHP. It was very straightforward to install WordPress 2.2.1 on a server with a MySQL DB. After playing around with a few options it was not difficult to start importing items from my existing site.

I managed to edit the html from the old site and even found a workaround for embedded YouTube videos without too much effort. So far I’ve found WordPress easy to use, but I have yet to work on customizing the styling.

The recommended w.bloggar tool is actually a better interface for editing and adding content. For example, I found that modifying the YouTube embed code to use <div> in place of <p> did not work in the WordPress editor, but worked fine from a w.bloggar publish.

Overall the WordPress experience has been pleasant. In the past I’d attempted to use a Java weblog tool, Roller Weblog, without much success. I now see that Roller has been released from Apache incubation, and I will need to check out the 3.1 version. There doesn’t seem to be much advantage to Java over PHP in my current configuration. WordPress is a tool to get the job done. I have not yet thought of any real use for customized map interfaces in a weblog that can’t be accomplished with a straightforward embed. If I want to work with a 3D connected graph sometime, using WPF to show article relationships in a 3-axis spatial model, I might need to work with a weblog tool that is .NET 3.0 friendly.

Now I just need to read up on WordPress and see about some customization of styling.