Demographic Terrains


Fig 1 – Population skyline of New York - Census SF1 P0010001 and Bing Road Map

Demographic Terrain

My last blog post, Demographic Landscapes, described leveraging new SFCGAL PostGIS 3D functions to display 3D extruded polygons with X3Dom. However, there are other ways to display census demographics with WebGL. WebGL meshes are a good way to handle terrain DEM. In essence the US Census is providing demographic value terrains, so 3D terrain meshes are an intriguing approach to visualization.

Babylon.js is a powerful 3D engine written in javascript for rendering WebGL scenes. In order to use Babylon.js, US Census SF1 data needs to be added as WebGL 3D mesh objects. Babylon.js offers a low effort tool for generating these meshes from grayscale height map images: BABYLON.Mesh.CreateGroundFromHeightMap.

Modifying the Census WMS service to produce grayscale images at the highest SF1 polygon resolution i.e. tabblock, is the easiest approach to generating these meshes. I added an additional custom request to my WMS service, “request=GetHeightMap,” which returns a PixelFormat.Format32bppPArgb bitmap with demographic values coded as grayscale. This is equivalent to a 255 range classifier for population values.

Example GetHeightMap request:
http://<server>/CensusSF1/WMSService.svc/<token>/NAD83/USA/SF1QP/WMSLatLon?REQUEST=GetHeightMap&SERVICE=WMS&VERSION=1.3.0&LAYERS=TabBlock&STYLES=P0010001 &FORMAT=image/png&BGCOLOR=0xFFFFFF&TRANSPARENT=TRUE&CRS=EPSG:4326 &BBOX=39.25212350301827,-105.70779069335937,40.22049698135099,-104.23836930664062
&WIDTH=1024&HEIGHT=1024

Fig 2 - grayscale heightmap from Census SF1 P001001 at tabblock polygon level for Denver metro area

Once a grayscale image is available it can be added to the Babylon scene, as a heightmap.

/* Name
  * Height map image url
  * mesh Width
  * mesh Height
  * Number of subdivisions (increase the complexity of
this mesh in order to improve the visual quality of it)
  * Minimum height : The lowest level of the mesh
  * Maximum height : the highest level of the mesh
  * scene
  * Updatable: if this mesh can be updated dynamically in the future (Boolean)
  **/
var ground = BABYLON.Mesh.CreateGroundFromHeightMap("ground", heightmap,
256, 256, 500, 0, 10, scene, false);

WMS GetMap requests are also useful for adding a variety of textures to the generated demographic terrain.

var groundMaterial = new BABYLON.StandardMaterial("ground", scene);
groundMaterial.diffuseTexture = new BABYLON.Texture( image, scene);

Fig 3 – texture from Bing Maps AerialWithLabels over population terrain


Sample Babylon.js Demographic Terrain Scene using ArcRotateCamera
<!DOCTYPE html>
<html>
<head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    <title>Babylon HeightMap</title>
    <script type="text/javascript" src="//code.jquery.com/jquery-1.11.0.min.js"></script>
    <!-- Babylon.js -->
    <script src="http://www.babylonjs.com/hand.minified-1.2.js"></script>
    <script src="http://cdn.babylonjs.com/2-1/babylon.js"></script>
    <style>
        html, body {
            overflow: hidden;
            width: 100%;
            height: 100%;
            margin: 0;
            padding: 0;
        }

        #renderCanvas {
            width: 100%;
            height: 100%;
            touch-action: none;
        }
    </style>
</head>
<body>
    <canvas id="renderCanvas"></canvas>
    <script>
        if (!BABYLON.Engine.isSupported()) {
            $("body").html('<div id="webglmessage"><h1>Sorry, your Browser does no implement WebGL</h1></div>');
        }
        else {
            var canvas = document.getElementById("renderCanvas");
            var engine = new BABYLON.Engine(canvas, true);

            var createScene = function () {
                var scene = new BABYLON.Scene(engine);

                // Light
                var spot = new BABYLON.SpotLight("spot", new BABYLON.Vector3(0, 30, 10), new BABYLON.Vector3(0, -1, 0), 17, 1, scene);
                spot.diffuse = new BABYLON.Color3(1, 1, 1);
                spot.specular = new BABYLON.Color3(0, 0, 0);
                spot.intensity = 0.5;

                //ArcRotateCamera Camera
                var camera = new BABYLON.ArcRotateCamera("Camera", -1.57079633, 1.0, 256, new BABYLON.Vector3.Zero(), scene);
                camera.lowerBetaLimit = 0.1;
                camera.upperBetaLimit = (Math.PI / 2) * 0.9;
                camera.lowerRadiusLimit = 30;
                camera.upperRadiusLimit = 256;
                scene.activeCamera = camera;
                scene.activeCamera.attachControl(canvas);

                var image = 'textures/NY_Map.jpg';
                var heightmap = 'textures/NY_HeightMap.jpg';

                // Ground
                var groundMaterial = new BABYLON.StandardMaterial("ground", scene);
                groundMaterial.diffuseTexture = new BABYLON.Texture(image, scene);

                var ground = BABYLON.Mesh.CreateGroundFromHeightMap("ground", heightmap, 256, 141, 750, 0, 2.5, scene, false);
                ground.material = groundMaterial;

                // Skybox
                var skybox = BABYLON.Mesh.CreateBox("skyBox", 800.0, scene);
                var skyboxMaterial = new BABYLON.StandardMaterial("skyBox", scene);
                skyboxMaterial.backFaceCulling = false;
                skyboxMaterial.reflectionTexture = new BABYLON.CubeTexture("textures/skybox", scene);
                skyboxMaterial.reflectionTexture.coordinatesMode = BABYLON.Texture.SKYBOX_MODE;
                skyboxMaterial.diffuseColor = new BABYLON.Color3(0, 0, 0);
                skyboxMaterial.specularColor = new BABYLON.Color3(0, 0, 0);
                skybox.material = skyboxMaterial;

                return scene;
            }

            var scene = createScene();

            engine.runRenderLoop(function () {
                scene.render();
            });

            // Resize
            window.addEventListener("resize", function () {
                engine.resize();
            });
        }
    </script>
</body>
</html>

Modifying the experimental Census UI involves adding a link button, scale dropdown, and a camera selector to the Demographics tab.

Fig 4 – “View HeightMap” button with Camera selector added to experimental Census UI

Babylon.js offers a range of cameras, including cameras for use with VR headsets.

  • ArcRotate – rotates a target pivot using mouse and cursor keys
  • Free – first person shooter camera
  • Touch – includes touch gesture events for camera control
  • WebVRFree – dual eye images for VR head set devices

WebVRFreeCamera is suited for use in Google Cardboard headsets.


Fig 5 – Babylon.js WebVRFreeCamera Census P001001 terrain


I have a $4 Google Cardboard headset on order. Soon I can try it using a Babylon.js WebVRFreeCamera to navigate population terrains.


Fig 6 - $4.00 Google cardboard kit

Microsoft HoloLens will up the coolness but not the hipster factor while improving usability immensely ( when released ). I’m inclined to the minimalist movement myself, but I’d be willing to write a Windows 10 app with the HoloLens SDK to see how well it performs.

Fig 7 - Microsoft HoloLens is a future possibility for viewing Census Terrains

Performance

Viewing performance is fine once loaded as WebGL, but producing the height map involves:

  1. (Server) PostGIS query join to return intersected tabblocks of selected demographic
  2. (Server) Polygon draw to grayscale image with population value normalized by US maximum
  3. (Client) Babylon translates grayscale to webgl mesh

This can require some patience looking at the heavenly but empty skybox for a few seconds, 5-10s on my laptop.

Future directions for inquiry

  • Compare performance with batch processed SF1 tiles. Tiles could combine a 3D vector mesh with a 2D value array to reduce size of multiple demographic tile pyramids.
  • Explore Babylon.js LOD mesh simplification.
  • Explore Babylon.js octree submeshes with multiple mesh tiles.
  • Use PostGIS MapAlgebra on multi-variate value arrays.

Fig 8 – Population view of Denver - Census SF1 P0010001 scale 3

Increasing the scale exaggerates relative population. My how Denver has grown!


Fig 9 – Population view of Denver - Census SF1 P0010001 scale 20

Territorial Turf

Fig 1 – Territories from Census Zip Code Tabulation Areas (ZCTAs) on Bing Maps

An interesting GIS development over the years has been the evolution from monolithic applications to multiple open source plug and play tools. Considering GIS as a three tier system with back end storage, a controller in the middle, and a UI display out front, more and more of the middle tier is migrating to either end.

SQL DBs, such as SQL Server, Oracle Spatial, and especially PostGIS, now implement a multitude of GIS functions originally considered middle tier domain. On the other end, the good folks at turf.js and proj4js continue to push atomic GIS functions out to javascript, where they can fit into the client side UI. The middle tier is getting thinner and thinner as time goes on. Generally the middle is what costs money, so rolling your own middle with less work is a good thing. As a desirable side effect, instead of kitchen sink GIS, very specific user tools can be cobbled together as needed.

Looking for an excuse to experiment with turf.js, I decided to create a Territory Builder utilizing turf.js on the client and some of the US Census Bureau Services on the backend.

US Census Bureau does expose some “useful” services at TigerWeb. I tend to agree with Brian Timoney that .gov should stick to generating data exposed in useful ways. Apps and presentation are fast evolving targets and historically .gov can’t really hope to keep up. Although you can use the census custom TigerWeb applications for some needs, there are many other occasions when you would like to build something less generic. For example a Territory builder over Bing Maps.

Here is the simplified POC work flow:

1. Use Census WMS GetMap to show polygons on a canvas over Bing Maps.
2. Use Census WMS GetFeatureInfo to return a GeoID of selected polygon(s).
3. Use Census REST service and GeoID to return selected polygon vertices.
4. Use Turf.js merge function to merge selected polygons into a single Territory polygon
5. Display territory polygon using Bing Maps Ajax v7 Microsoft.Maps.AdvancedShapes polygon

Fig 2 – Territory polygon with a demographic overlay on top of Bing Maps

Note: Because there doesn’t appear to be an efficient way to join demographic data to geography with current TigerWeb service APIs, the demographic tab of this app uses a custom WMS PostGIS backend, hooking SF1 to TIGER polygons.

1. Census WMS GetMap over Bing Maps

WMS GetMap requests simply return an image. In order to overlay the image on a Bing Map this Territory app uses an html5 canvas and context app.context.drawImage(imageObj, 0, 0); The trick is to stack the canvas in between the Bing Map and the Bing Map navigation, scale, and selector tools. TigerWeb WMS conveniently exposes epsg:3857 which correctly aligns with Bing Map tiles.

    function showCensus() {
        if (app.canvas == null) {
            app.canvas = document.createElement('canvas');
            app.canvas.id = 'censuscanvas';
            app.canvas.style.position = 'relative';
            var mapDiv = app.map.getRootElement();
            //position this canvas under Bing navigation and over map tiles
            mapDiv.insertBefore(app.canvas, mapDiv.childNodes[3]);
            app.context = app.canvas.getContext("2d");
        }
        //match canvas size to map size
        app.canvas.height = app.map.getHeight();
        app.canvas.width = app.map.getWidth();

        var imageObj = new Image();
        //image onload function draws image on context
        imageObj.onload = function () {
            app.canvas.style.opacity = app.alpha / 100;
            app.context.drawImage(imageObj, 0, 0);
        };
        //prepare a TigerWeb WMS GetMap request
        //use proj4.js to transform ll to mercator bbox
        var b = app.map.getBounds();
        var epsg3857 = "+proj=merc +a=6378137 +b=6378137 +lat_ts=0.0 +lon_0=0.0 +x_0=0.0 +y_0=0 +k=1.0 +units=m +nadgrids=@null +wktext  +no_defs";
        var sw = proj4(epsg3857).forward([b.getWest(), b.getSouth()]);
        var ne = proj4(epsg3857).forward([b.getEast(), b.getNorth()]);
        bbox = sw[0] + "," + sw[1] + "," + ne[0] + "," + ne[1];

        var url = "";
        if (tab == "Territory") {
            app.alpha = 100;
            // GetMap request
            url = "http://tigerweb.geo.census.gov/arcgis/services/TIGERweb/tigerWMS_Current/MapServer/WmsServer?REQUEST=GetMap&SERVICE=WMS&VERSION=1.3.0&LAYERS=" + app.censusPolygonType + "&STYLES=&FORMAT=image/png&BGCOLOR=0xFFFFFF&TRANSPARENT=TRUE&CRS=EPSG:3857&BBOX=" + bbox + "&WIDTH=" + app.canvas.width + "&HEIGHT=" + app.canvas.height;
        }
        imageObj.src = url;
    }

2. Census WMS GetFeatureInfo for obtaining GeoID

Unfortunately the WMS GetMap request has no IDs available for polygons, even if requested format is image/svg+xml. SVG is an xml format and could easily contain associated GeoID values for joining with other data resources, but this is contrary to the spirit of OGC WMS service specs. Naturally we must obey OGC Law, which is too bad. Adding a GeoID attribute would allow options such as choropleth fills directly on existing svg paths. For example adding id = “80132″ would allow fill colors by zipcode with a bit of javascript.

http://tigerweb.geo.census.gov/arcgis/services/TIGERweb/tigerWMS_Current/MapServer/WmsServer?REQUEST=GetMap&SERVICE=WMS&VERSION=1.3.0 &LAYERS=2010 Census ZIP Code Tabulation Areas&STYLES=&FORMAT=image/svg+xml&BGCOLOR=0xFFFFFF&TRANSPARENT=TRUE&CRS=EPSG:3857 &BBOX=-11679625.942909468,4709198.547476525,-11645573.246808422,4737900.651597611&WIDTH=891&HEIGHT=751

<g transform="matrix(1.3333333 0 0 -1.3333333 445.5 375.5)">
<path id="80132" d="M300.50809 -258.78592 L302.54407 -257.91829 L303.17977 -257.55608 L304.45554 -256.98889
                                .
				.
L280.30116 -268.55133 L280.7184 -268.34637 L298.33448 -259.95117 L300.50809 -258.78592 "
  stroke="#99454A" fill="none" stroke-opacity="1" stroke-width="1.5"
 stroke-linecap="round" stroke-linejoin="round" stroke-miterlimit="10" />

Instead TigerWeb WMS exposes GetFeatureInfo requests. Using a click event we calculate a lat,lon location and send a GetFeatureInfo request to find a GeoID for the enclosing polygon.

Request:
http://tigerweb.geo.census.gov/arcgis/services/TIGERweb/tigerWMS_Current/MapServer/WmsServer?REQUEST=GetFeatureInfo&SERVICE=WMS&VERSION=1.3.0&CRS=EPSG:4326 &BBOX=13.6972399985862,-128.010213138288,52.2800368298775,-53.444917258507&WIDTH=1061&HEIGHT=549 &INFO_FORMAT=text/xml&QUERY_LAYERS=2010 Census ZIP Code Tabulation Areas&X=388&Y=190

Response:
<?xml version=”1.0″ encoding=”UTF-8″?>
<FeatureInfoResponse xmlns=”http://www.esri.com/wms” xmlns:esri_wms=”http://www.esri.com/wms”>
<FIELDS INTPTLON=”-101.2099848″ INTPTLAT=”+38.9263824″ CENTLON=”-101.2064010″ CENTLAT=”+38.9226272″ STGEOMETRY=”Polygon” OBJECTID=”16553″ AREAWATER=”163771″ AREALAND=”1327709219″ FUNCSTAT=”S” ZCTA5CC=”B5″ MTFCC=”G6350″ NAME=”ZCTA5 67764″ LSADC=”Z5″ BASENAME=”67764″ GEOID=”67764″ ZCTA5=”67764″ OID=”221404258331221″/>
</FeatureInfoResponse>

Since cross browser restrictions come into play it’s necessary to add a bit of server side proxy code to actually return the xml.

Some MVC controller magic:

 function GetCensusFeature(e) {
    var x = e.pageX - $("#map").offset().left;
    var y = e.pageY - $("#map").offset().top;
    var FeatureUrl = {
        url: "http://tigerweb.geo.census.gov/arcgis/services/TIGERweb/tigerWMS_Current/MapServer/WmsServer?REQUEST=GetFeatureInfo&SERVICE=WMS&VERSION=1.3.0&LAYERS=" + app.censusPolygonType + "&STYLES=&FORMAT=image/png&BGCOLOR=0xFFFFFF&TRANSPARENT=TRUE&CRS=EPSG:3857&BBOX=" + bbox + "&WIDTH=" + app.canvas.width + "&HEIGHT=" + app.canvas.height + "&INFO_FORMAT=text/xml&QUERY_LAYERS=" + app.censusPolygonType + "&X=" + x + "&Y=" + y
    }
        app.postit('/api/Operation/GetFeatureInfo', {
            data: JSON.stringify(FeatureUrl),
            success: function (response) {
                if (response.length > 0) {
                    var parser = new DOMParser();
                    var xmlDoc = parser.parseFromString(response, "text/xml");
				          .
				          .

Eventally gets here to a C# Controller which can proxy the GetFeatureInfo request:

        /// <summary>
        /// GetFeatureInfo
        /// </summary>
        /// <param name="FeatureUrl"></param>
        /// <returns>xml doc</returns>
        [HttpPost]
        public async Task<IHttpActionResult> GetFeatureInfo(Models.FeatureUrl feature)
        {
            var wc = new HttpClient();
            var getFeatureInfoRequest = new Uri(feature.url);
            var response = await wc.GetAsync(getFeatureInfoRequest);
            response.EnsureSuccessStatusCode();
            var content = await response.Content.ReadAsStreamAsync();
            var result = "";
            using (var reader = new StreamReader(content))
            {
                while (!reader.EndOfStream)
                {
                    result += reader.ReadLine();
                }
            }
            return Ok(result);
        }

Finally, on success, javascript can parse the xml for GeoID and add a canvas context box with some text information:

     success: function (response) {
              if (response.length > 0) {
                    var parser = new DOMParser();
                    var xmlDoc = parser.parseFromString(response, "text/xml");
                    if (tab == "Territory" && xmlDoc.getElementsByTagName("FeatureInfoResponse")[0].hasChildNodes()) {
                        app.context.beginPath();
                        app.context.rect(x, y, 125, 25);
                        app.context.fillStyle = 'white';
                        app.context.fill();
                        app.context.lineWidth = 1;
                        app.context.strokeStyle = 'black';
                        app.context.stroke();
                        app.context.fillStyle = 'black';
                        app.context.font = '11pt Calibri';
                        var fields = xmlDoc.getElementsByTagName("FIELDS")[0];
                        app.context.fillText(fields.getAttribute('NAME'), x + 5, y + 15);
                        var geoid= fields.getAttribute('GEOID')
					            .
					            .

Fig 3 – building territories from Census Block Groups on Bing Maps

3. Census REST to obtain vertices

The GEOID retrieved from our proxied GetFeatureInfo request allows us to grab vertices with another TigerWeb service, Census REST.

This spec is a little more proprietary and requires some detective work to unravel.
FeatureUrl.url = “http://tigerweb.geo.census.gov/arcgis/rest/services/TIGERweb/PUMA_TAD_TAZ_UGA_ZCTA/MapServer/1/query?where=GEOID%3D” + geoid + “&geometryPrecision=6&outSR=4326&f=pjson”;

There are different endpoint urls for the various polygon types. In this case Zip Codes are found in PUMA_TAD_TAZ_UGA_ZCTA. We don’t need more than 1 meter resolution so precision 6 is good enough and we would like the results in epsg:4326 to avoid a proj4 transform on the client.

This follows the same process as previously sending the REST request through a proxy controller. You can see the geojson geometry result with this sample request:
http://tigerweb.geo.census.gov/arcgis/rest/services/TIGERweb/PUMA_TAD_TAZ_UGA_ZCTA/MapServer/1/query?where=GEOID%3D80004&geometryPrecision=6&outSR=4326&f=pjson

Census REST doesn’t appear to offer a simplify parameter so the coordinates returned are at the highest resolution. Highly detailed polygons can easily return several thousand vertices, which is a problem for performance, but the trade-off is eliminating hosting data ourselves.

4. turf.js merge function to Build Territory

Finally the interesting part, where we get to use turf.js to handle merging multiple polygons into a single territory polygon.

var FeatureUrl = {
url: "http://tigerweb.geo.census.gov/arcgis/rest/services/TIGERweb/PUMA_TAD_TAZ_UGA_ZCTA/MapServer/1/query?where=GEOID%3D" + geoid + "&geometryPrecision=6&outSR=4326&f=pjson";
}
app.postit('/api/Operation/GetFeatureGeo', {
    data: JSON.stringify(FeatureUrl),
    success: function (response) {
        var polygon = JSON.parse(response);
        if (polygon.features.length == 0 || polygon.features[0].geometry.rings.length == 0) {
            $('#loading').addClass('hide');
            alert("No Geo Features returned.")
            return;
        }
        if (app.territoryFeatures == null) {
            app.territoryFeatures = new Array();
            app.territory = turf.polygon(polygon.features[0].geometry.rings);
            app.territoryFeatures.push(app.territory);
        }
        else {
            var tpolygon = turf.polygon(polygon.features[0].geometry.rings);
            app.territoryFeatures.push(tpolygon);
            var fc = turf.featurecollection(app.territoryFeatures);
            try {
                app.territory = turf.merge(fc);
            }
            catch (err) {
                $('#loading').addClass('hide');
                alert("turf.js error: " + err.message);
                return;
            }
        }
        if (app.territory.geometry.coordinates.length > 0) {
            displayTerritory();
            $('#totalDivID').removeClass('hide');
        }

        $('#loading').addClass('hide');
    },
    error: function (response) {
        alert("TigerWeb error:" + response);
    }
});

5. Display Territory Bing Maps Ajax polygon

The final rendering just uses Bing Maps Ajax v7 Microsoft.Maps.AdvancedShapes to add the new territory to the map.

function displayTerritory() {
    app.territoryLayer.clear();
    var coordLen = app.territory.geometry.coordinates.length;
    if (app.territory.geometry.type == 'Polygon') {
        var rings = new Array();
        for (var i = 0; i < coordLen; i++) {
            var vertices = new Array();
            for (var j = 0; j < app.territory.geometry.coordinates[i].length; j++) {
                vertices.push(new Microsoft.Maps.Location(app.territory.geometry.coordinates[i][j][1], app.territory.geometry.coordinates[i][j][0]));
            }
            rings.push(vertices);
        }
        var polygon = new Microsoft.Maps.Polygon(rings, { fillColor: new Microsoft.Maps.Color(100, 100, 0, 100) });
        app.territoryLayer.push(polygon);
    }
    else if (app.territory.geometry.type == 'MultiPolygon') {
        var multi = new Array();
        for (var i = 0; i < coordLen; i++) {
            var ringslen = app.territory.geometry.coordinates[i].length;
            for (var j = 0; j < ringslen; j++) {
                var vertices = new Array();
                for (var k = 0; k < app.territory.geometry.coordinates[i][j].length; k++) {
                    vertices.push(new Microsoft.Maps.Location(app.territory.geometry.coordinates[i][j][k][1], app.territory.geometry.coordinates[i][j][k][0]));
                }
                multi.push(vertices)
            }
        }
        var polygon = new Microsoft.Maps.Polygon(multi, { fillColor: new Microsoft.Maps.Color(100, 100, 0, 100) });
        app.territoryLayer.push(polygon);
    }
    else {
        $('#loading').addClass('hide');
        alert("geometry type is " + app.territory.geometry.type + ". Territory requires a Polygon or MultiPolygon.")
        return;
    }
}

Fig 4 – Territory Total for P0040003 Latino Origin – Map: Quantile population classifier Census Block Groups on Bing Map

Summary

TigerWeb offers some useful data access. With TigerWeb WMS and REST api, developers can customize apps without hosting and maintaining a backend SQL store. However, there are some drawbacks.

Some potential improvements:
1. Adding an svg id=GeoID would really improve the usefulness of TigerWeb WMS image/svg+xml, possibly eliminating steps 2 and 3 of the workflow.

2. Technically it’s possible to use the TigerWeb REST api to query geojson by area, but practically speaking the results are too detailed for useful performance. A helpful option for TigerWeb REST would be a parameter to request simplified polygons and avoid lengthy vertice transfers.

turf.js is a great tool box, however, occasionally the merge function had trouble with complex polygons from TigerWeb.

Fig 6 - turf merge had some rare difficulties with complex polygons from TigerWeb

In the end, territories are useful for query aggregates of SF1 demographic data.

Fig 5 – Territory Total for P0120025 Male 85 Years and older – Map: Jenks population classifier Census Block Group on Bing Map

2020 The Last Census?

Fig 1 - SF1QP Quantile Population County P0010001 P1.TOTAL POPULATION Universe: Total population

Preparation for US 2020 Census is underway at this mid-decennial point and we’ll see activity ramping up over the next few years. Will 2020 be the last meaningful decennial demographic data dump? US Census has been a data resource since 1790. It took a couple centuries for Census data to migrate into the digital age, but by Census 2000, data started trickling into the internet community. At first this was simply a primitive ftp data dump, ftp2.census.gov, still very useful for developers, and finally after 2011 exposed as OGC WMS, TigerWeb UI, and ESRI REST.

However, static data in general, and decennial static data in particular, is fast becoming anachronistic in the modern era. Surely the NSA data tree looks something like phone number JOIN Facebook account JOIN Twitter account JOIN social security id JOIN bank records JOIN IRS records JOIN medical records JOIN DNA sequence….. Why should this data access be limited to a few black budget bureaus? Once the data tree is altered a bit to include mobile devices, static demographics are a thing of the past. Queries in 2030 may well ask “how many 34 year old male Hispanic heads of households with greater than 3 dependents with a genetic predisposition to diabetes are in downtown Denver Wed at 10:38AM, at 10:00PM?” For that matter let’s run the location animation at 10 minute intervals for Tuesday and then compare with Sat.

“Privacy? We don’t need no stinking privacy!”

I suppose Men and Black may find location aware DNA queries useful for weeding out hostile alien grays, but shouldn’t local cancer support groups also be able to ping potential members as they wander by Star Bucks? Why not allow soda vending machines to check for your diabetic potential and credit before offering appropriate selections? BTW How’s that veggie smoothie?

Back to Old School

Fig 2 - SF1QD Quantile Density Census Block Group P0050008 P5.HISPANIC OR LATINO ORIGIN BY RACE Universe: Total population

By late 2011 census OGC services began to appear along with some front end data web UIs, and ESRI REST interfaces. [The ESRI connection is a tightly coupled symbiotic relationship as the Census Bureau, like many government bureaucracies, relies on ESRI products for both publishing and consuming data. From the outside ESRI could pass as an agency of the federal government. For better or worse “Arc this and that” are deeply rooted in the .gov GIS community.]

For mapping purposes there are two pillars of Census data, spatial and demographic. The spatial data largely resides as TIGER data while the demographic data is scattered across a large range of products and data formats. In the basement, a primary demographic resource is the SF1, Summary File 1, population data.

“Summary File 1 (SF 1) contains the data compiled from the questions asked of all people and about every housing unit. Population items include sex, age, race, Hispanic or Latino origin, household relationship, household type, household size, family type, family size, and group quarters. Housing items include occupancy status, vacancy status, and tenure (whether a housing unit is owner-occupied or renter-occupied).”

The intersection of SF1 and TIGER is the base level concern of census demographic mapping. There are a variety of rendering options, but the venerable color themed choropleth map is still the most widely recognized. This consists of assigning a value class to a color range and rendering polygons with their associated value color. This then is the root visualization of Census demographics, TIGER polygons colored by SF1 population classification ranges.

Unfortunately, access to this basic visualization is not part of the 2010 TigerWeb UI.

There are likely a few reasons for this, even aside from the glacially slow adoption of technology at the Bureau of the Census. A couple of obvious reasons are the sheer size of this data resource and the range of the statistics gathered. A PostGIS database with 5 level primary spatial hierarchy, all 48 SF1 population value files, appropriate indices, plus a few helpful functions consumes a reasonable 302.445 GB of a generic Amazon EC2 SSD elastic block storage. But, contained in those 48 SF1 tables are 8912 demographic values which you are welcome to peruse here. A problem for any UI is how to make 8912 plus 5 spatial levels usable.

Fig 3 – 47 SF1 tables plus sf1geo geography join file

Filling a gap

Since the Census Bureau budget did not include public visualization of TIGER/Demographics what does it take to fill in the gap? Census 2010 contains a large number of geographic polygons. The core hierarchy for useful demographic visualization is state, county, tract, block group, and block.

Fig 4 – Census polygon hierarchy

Loading the data into PostGIS affords low cost access to data for SF1 Polygon value queries such as this:

-- block tabblock
SELECT poly.GEOM, geo.stusab, geo.sumlev, geo.geocomp, geo.state, geo.county, geo.tract, geo.blkgrp, geo.block, poly.geoid10, sf1.P0010001, geo.stlogrecno
FROM tabblock poly
JOIN sf1geo geo ON geo.geoid10 = poly.geoid10
JOIN sf1_00001 sf1 ON geo.stlogrecno = sf1.stlogrecno
WHERE geo.geocomp='00' AND geo.sumlev = '101' AND ST_Intersects(poly.GEOM, ST_GeometryFromText('POLYGON ((-104.878035974004 38.9515291859429,-104.721023973742 38.9515291859429,-104.721023973742 39.063158980149,-104.878035974004 39.063158980149,-104.878035974004 38.9515291859429))', 4269))
ORDER BY geo.state, geo.county, geo.tract, geo.blkgrp, geo.block

Returning 1571 polygons in 1466 ms. Not too bad, but surely there’s room for improvement. Where is Paul Ramsey when you need him?

Fig 5 - PostgreSQL PostGIS Explain Query

Really Old School – WMS

Some considerations on the data:

A. Queries become unwieldy for larger extents with large numbers of polygons

Polygon Counts
county 3,233
tract 74,133
blockgroup 220,740
tabblock 11,166,336

These polygon counts rule out visualizations of the entire USA, or even moderate regions, at tract+ levels of the hierarchy. Vector mapping is not optimal here.

B. The number of possible image tile pyramids for 8912 values over 5 polygon levels is 5 * 8192 = 44,560. This rules out tile pyramids of any substantial depth without some deep Google like pockets for storage. Tile pyramids are not optimal either.

C. Even though vector grid pyramids would help with these 44,560 demographic variations, they suffer from the same restrictions as A. above.

One possible compromise of performance/visualization is to use an old fashioned OGC WMS GetMap request scheme that treats polygon types as layer parameters and demographic types as style parameters. With appropriate use of WMS <MinScaleDenominator> <MaxScaleDenominator> the layers are only rendered at sufficient zoom to reasonably limit the number of polygons. Using this scheme puts rendering computation right next to the DB on the same EC2 instance, while network latency is reduced to simple jpeg/png image download. Scaling access to public consumption is still problematic, but for in-house it does work.

Fig 6 – Scale dependent layer rendering for SF1JP - SF1 Jenks P0010001 (not density)

Fig 7 - a few of 8912 demographic style choices

There are still issues with a scale rendering approach. Since population is not very homogenous over US coverage extent, scale dependent rendering asks to be variable as well. This is easily visible over population centers. Without some type of pre-calculated density grid, the query is already completed prior to knowledge of the ideal scale dependency. Consequently, static rendering scales have to be tuned to high population urban regions. Since “fly over” US is generally less interesting to analysts, we can likely live with this compromise.

Fig 8 - SF1QD SF1 Quantile Density Census Tract P0010001/geographic Area



Classification schemes

Dividing a value curve to display adequately over a viewport range can be accomplished in a few different ways: equal intervals, equal quantile, jenks natural break optimization, K-means clustering, or “other.” Leaning toward the simpler, I chose a default quantile (guarantees some color) with a ten class single hue progression which of course is not recommended by color brewer. However 10 seems an appropriate number for decennial data. I also included a jenks classifier option which is considered a better representation. The classifier is based only on visible polygons rather than the entire polygon population. This means comparisons region to region are deceptive, but after all this is visualization of statistics.

“There are three kinds of lies: lies, damned lies, and statistics.” Mark Twain

Fig 9 – SF1JP SF1 Jenks Census Tract P0010001 (not density)

In order to manage Census data on a personal budget these compromises are involved:

1. Only expose SF1 demographic data for 2010 i.e. 8912 population value types
2. Only provide primary level polygon hierarchy – state, county, tract, blockgroup, block
3. Code a custom OGC WMS service – rendering GetMap image on the server
4. Resolution scale rendering to limit polygon counts down the polygon hierarchy
5. Provide only quantile and Jenks classifier options
6. Classifier applied only to viewport polygon selection

This is a workable map service for a small number of users. Exposing as an OGC WMS service offers some advantages. First there are already a ton of WMS clients available to actually see the results. Second, the Query, geometry parsing, and image computation (including any required re-projection) are all server side on the same instance reducing network traffic. Unfortunately the downside is that the computation cost is significant and discouraging for a public facing service.

Scaling could be accomplished in a few ways:

1. Vertical scaling to a high memory EC2 R3 instance(s) and a memory tuned PostGIS
2. Horizontal auto scaling to multiple instances with a load balancer
3. Storage scaling with pre-populated S3 tile pyramids for upper extents

Because this data happens to be read only for ten years, scaling is not too hard, as long as there is a budget. It would also be interesting to try some reconfiguration of data into NoSQL type key/value documents with perhaps each polygon document containing the 8912 values embedded along with the geometry. This would cost a bit in storage size but could decrease query times. NoSQL also offers some advantages for horizontal scaling.

Summary

The Census Bureau and its census are obviously not going away. The census is a bureaucracy with a curious inertial life stretching back to the founding of our country (United States Constitution Article 1, section 2). Although static aggregate data is not going to disappear, dynamic real time data has already arrived on stage in numerous and sundry ways from big data portals like Google, to marketing juggernauts like Coca Cola and the Democratic Party, to even more sinister black budget control regimes like the NSA.

Census data won’t disappear. It will simply be superseded.

The real issue for 2020 and beyond is, how to actually intelligently use the data. Already data overwhelms analytic capabilities. By 2030, will emerging AI manage floods of real time data replacing human analysts? If Wall Street still exists, will HFT algos lock in dynamic data pipelines at unheard of scale with no human intervention? Even with the help of tools like R Project perhaps the human end of data analysis will pass into anachronism along with the decennial Census.

Fig 10 - SF1JP SF1 Jenks Census Blocks P0010001