Developing nations – the network infrastructure challenge

The world is becoming more urban – so how do we ensure our infrastructure meets the social, economic and climatic challenges of the 21st century? This was the theme of the recent International Symposium for Next Generation Infrastructure (ISNGI) hosted by the SMART Infrastructure Facility, at the University of Wollongong. The symposium highlighted some of the great research going on in Australia and around the world to understand how we can make our cities and their infrastructure sustainable for future generations.

Sanitation Network Modelling Poster

But what about developing nations? How do we model infrastructure when there’s no data? Or when the system is changing so fast that traditional data collection techniques become redundant? How do we quantify the infrastructure requirements of a slum when its population fluctuates by 800,000 people annually? More importantly, how do you engage with that community to understand their needs?

One solution is to use open data and open tools. The world is becoming more connected, and crowd-sourced data offer, for the first time, an insight into infrastructure in some of the world’s poorest cities and informal settlements which have never before been mapped. The Map Kibera project is a really great example of this. In collaboration with colleagues from the UK, we built a prototype model to demonstrate the utility of data from Map Kibera and Open Street Map for spatio-topological network modelling, to optimise road-based sanitation for Kibera. I presented this work at ISNGI, and Ruth recently presented a poster of this work at the International Water Association Congress and Exhibition in Nairobi. We’ve demonstrated it’s possible – the challenge now is to make it work in the real world.

Creating spatial magic with GeoAlchemy2 and PostGIS

I recently discovered the GeoAlchemy2 project – a replacement for the original GeoAlchemy package, focused on providing PostGIS support for SQLAlchemy. The SQLAlchemy package is a “Python SQL Toolkit and Object Relational Mapper”. In a nutshell this means you can write Python classes and map them to PostgreSQL tables without the need to write SQL statements – pretty cool!
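For example, a minimal (hypothetical) mapping looks like this:

from sqlalchemy import Column, Integer, String
from sqlalchemy.orm import declarative_base

Base = declarative_base()

# Maps to a "roads" table with integer "id" and text "name" columns
class Road(Base):
    __tablename__ = 'roads'
    id = Column(Integer, primary_key=True)
    name = Column(String)

Instances of Road can then be queried and persisted from Python, with SQLAlchemy generating the SQL behind the scenes.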

PostGIS is great for doing spatial stuff, but if you’re using it as a back-end for a Python app then you can spend a lot of time writing Python wrappers around SQL statements, and even with the excellent Psycopg2 package this can be tricky. This is especially true if you’re using the OGR Python bindings to handle PostGIS reads and writes.

Enter GeoAlchemy2. I’ve been experimenting with it for a week, and in that time I’ve learnt this:

For developing geospatial Python apps with PostGIS, GeoAlchemy2 is nothing short of revolutionary.

You can call PostGIS functions in Python, which means you can use them (and the data) directly within your Python application logic. Here’s an example. The SQL statement below uses PostGIS to create a new line geometry between a point and the closest point on the nearest line.
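(A minimal sketch rather than the exact statement – it assumes tables points and lines, each with a geometry column named geom.)

SELECT ST_MakeLine(p.geom, ST_ClosestPoint(l.geom, p.geom)) AS link_geom
FROM points AS p, lines AS l
WHERE p.id = 1
ORDER BY ST_Distance(p.geom, l.geom)
LIMIT 1;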

Now here’s a snippet from a Python script, performing the same process using GeoAlchemy2.
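(Again a sketch under the same assumptions – the mapped Line class, the session and the point instance are set up in the complete script below.)

def make_link_line(point_geom, line_geom):
    # Build an SQL expression for the line joining a point to the
    # closest point on a line
    return func.ST_MakeLine(point_geom,
                            func.ST_ClosestPoint(line_geom, point_geom))

# Find the nearest line to the point and build the link geometry in one query
row = (session.query(Line, make_link_line(point.geom, Line.geom).label('link'))
       .order_by(func.ST_Distance(Line.geom, point.geom))
       .first())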

You’ll notice here that we’re actually calling our own Python function “make_link_line” during the query to create the new geometry. This exemplifies how we can move PostGIS objects around inside the script. Once the query runs we can access the returned data in our application from the row variable. Below is the complete script.

Nearest neighbour PostGIS and GeoAlchemy2 script:
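(A minimal reconstruction – the connection string, table names and SRID are placeholders rather than the original project’s values.)

# nearest_link.py - find the nearest line to a point and create a link
# geometry between them, calling PostGIS from Python via GeoAlchemy2
from sqlalchemy import create_engine, Column, Integer, func
from sqlalchemy.orm import declarative_base, sessionmaker
from geoalchemy2 import Geometry

Base = declarative_base()

class Point(Base):
    __tablename__ = 'points'
    id = Column(Integer, primary_key=True)
    geom = Column(Geometry('POINT', srid=27700))

class Line(Base):
    __tablename__ = 'lines'
    id = Column(Integer, primary_key=True)
    geom = Column(Geometry('LINESTRING', srid=27700))

def make_link_line(point_geom, line_geom):
    # Build an SQL expression for the line joining a point to the
    # closest point on a line
    return func.ST_MakeLine(point_geom,
                            func.ST_ClosestPoint(line_geom, point_geom))

engine = create_engine('postgresql://user:password@localhost/network')
session = sessionmaker(bind=engine)()

# Take a point to work with
point = session.query(Point).first()

# Nearest line to the point, plus the new link geometry, in a single query
row = (session.query(Line, make_link_line(point.geom, Line.geom).label('link'))
       .order_by(func.ST_Distance(Line.geom, point.geom))
       .first())

# row.link is a PostGIS geometry we can feed straight into other
# PostGIS functions, e.g. to print it as WKT
print(session.query(func.ST_AsText(row.link)).scalar())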

The script above is just a simple example, but it shows how powerful GeoAlchemy2 is for embedding PostGIS objects and methods inside Python. I’m really looking forward to digging deeper into the functionality of GeoAlchemy2 and SQLAlchemy to integrate them into my own projects. Check out the official tutorials for more examples: https://geoalchemy-2.readthedocs.org/en/latest/#tutorials

Confidence intervals of simple linear regression

Plotting confidence intervals of linear regression in Python

After a friendly tweet from @tomstafford, who mentioned that this script was useful, I’ve re-posted it here in preparation for the removal of my Newcastle University pages.

This script calculates and plots confidence intervals around a linear regression, evaluated at a series of new x-values. Unable to find anything similar on the internet, I developed my own implementation based on Statistics in Geography by David Ebdon (ISBN: 978-0631136880).
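For reference, the interval the script plots around the fitted line at a new x-value $x_p$ is (following Ebdon):

$$\hat{y}_p \;\pm\; t_{n-2}\sqrt{\frac{\sum_i (y_i-\hat{y}_i)^2}{n-2}\left(\frac{1}{n}+\frac{(x_p-\bar{x})^2}{\sum_i x_i^2 - n\bar{x}^2}\right)}$$

where $\hat{y}_i$ are the fitted values, $\bar{x}$ is the mean of the observed x-values, and $t_{n-2}$ is the two-tailed 95% t-value with $n-2$ degrees of freedom – exactly the quantities computed in the script below.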

Plot of linear regression with confidence intervals

# linfit.py - example of confidence limit calculation for linear regression fitting.

# References:
# - Statistics in Geography by David Ebdon (ISBN: 978-0631136880)
# - Reliability Engineering Resource Website:
#   http://www.weibull.com/DOEWeb/confidence_intervals_in_simple_linear_regression.htm
# - University of Glasgow, Department of Statistics:
#   http://www.stats.gla.ac.uk/steps/glossary/confidence_intervals.html#conflim

import numpy as np
import matplotlib.pyplot as plt

# example data
x = np.array([4.0,2.5,3.2,5.8,7.4,4.4,8.3,8.5])
y = np.array([2.1,4.0,1.5,6.3,5.0,5.8,8.1,7.1])

# fit a curve to the data using a least squares 1st order polynomial fit
z = np.polyfit(x,y,1)
p = np.poly1d(z)
fit = p(x)

# get the coordinates for the fit curve
c_y = [np.min(fit),np.max(fit)]
c_x = [np.min(x),np.max(x)]

# predict y values of original data using the fit
p_y = z[0] * x + z[1]

# calculate the y-error (residuals)
y_err = y - p_y

# create series of new test x-values to predict for
p_x = np.arange(np.min(x),np.max(x)+1,1)

# now calculate confidence intervals for new test x-series
mean_x = np.mean(x)			# mean of x
n = len(x)				# number of samples in original fit
t = 2.447				# appropriate t value (two-tailed 95%, n=8, df=n-2=6)
s_err = np.sum(np.power(y_err,2))	# sum of the squares of the residuals

confs = t * np.sqrt((s_err/(n-2))*(1.0/n + (np.power((p_x-mean_x),2)/
			((np.sum(np.power(x,2)))-n*(np.power(mean_x,2))))))

# now predict y based on test x-values
p_y = z[0]*p_x+z[1]

# get lower and upper confidence limits based on predicted y and confidence intervals
lower = p_y - abs(confs)
upper = p_y + abs(confs)

# set-up the plot
plt.gca().set_aspect('equal')
plt.xlabel('X values')
plt.ylabel('Y values')
plt.title('Linear regression and confidence limits')

# plot sample data
plt.plot(x,y,'bo',label='Sample observations')

# plot line of best fit
plt.plot(c_x,c_y,'r-',label='Regression line')

# plot confidence limits
plt.plot(p_x,lower,'b--',label='Lower confidence limit (95%)')
plt.plot(p_x,upper,'b--',label='Upper confidence limit (95%)')

# set coordinate limits
plt.xlim(0,11)
plt.ylim(0,11)

# configure legend
plt.legend(loc=0)
leg = plt.gca().get_legend()
ltext = leg.get_texts()
plt.setp(ltext, fontsize=10)

# show the plot
plt.show()

Raspberry Pi Temperature Server

Using NodeJS to develop a temperature server with the Raspberry Pi

The Raspbian Linux distribution for the Raspberry Pi includes some useful kernel drivers for accessing devices connected to the Pi’s GPIO pins. Based on the tutorial from the University of Cambridge Computer Laboratory, I’ve been playing around with the DS18B20 digital thermometer on the Pi. I’ve connected the sensor to the Pi using a standard electronics breadboard and the excellent Adafruit Pi Cobbler breakout connector kit (kudos to Mills for her excellent soldering skills). When the required GPIO kernel modules are loaded, a file containing the current thermometer reading is written under the /sys/bus directory (see the tutorial link above for more).
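For reference, the sensor file (it appears under a path like /sys/bus/w1/devices/28-*/w1_slave) looks something like this – the t= value on the second line is the temperature in thousandths of a degree Celsius (the readings here are illustrative):

72 01 4b 46 7f ff 0e 10 57 : crc=57 YES
72 01 4b 46 7f ff 0e 10 57 t=23125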

Raspberry Pi & DS18B20 digital thermometer

Developing with NodeJS

I originally wrote a Python CGI script as part of the CPC Pi Hack event to parse the sensor file and display the temperature on a web page (although we didn’t submit our hack in the end), but I’ve also been looking for a project for a while to try out NodeJS. The result is a prototype JavaScript server/client app to serve temperature from the Pi as a JSON string and display a graph of current temperature on the client.

This was my first NodeJS app and I was impressed with the speed of development and the readability of the documentation and examples – so much so that I managed to write the bulk of the server on my netbook during a flight from Leeds to London! The biggest disadvantage of using NodeJS on the Pi is the time it takes to compile (1 hour 58 minutes with the Pi CPU clocked at 950 MHz). While this may put some developers off, the build process was painless, and the compile time is more than made up for by the efficient asynchronous nature of NodeJS once you get it running (first-pass testing shows no noticeable increase in memory/CPU use when the server is under load, although I’ve not investigated this thoroughly).

The server code is divided into two parts. The first is a dynamic response, generated when a request for the “temperature.json” URL is received: the server reads the sensor file, parses the data and returns the temperature with a Unix time-stamp in JSON notation. Here’s a snippet from the server code parsing and returning the temperature data:

// Read data from file (using fast node ASCII encoding).
var data = buffer.toString('ascii').split(" "); // Split by space

// Extract temperature from string and divide by 1000 to give celsius
var temp = parseFloat(data[data.length-1].split("=")[1])/1000.0;

// Round to one decimal place
temp = Math.round(temp * 10) / 10;

// Add date/time to temperature
var jsonData = [Date.now(), temp];

// Return JSON data
response.writeHead(200, { "Content-type": "application/json" });
response.end(JSON.stringify(jsonData), "ascii");

The second section of the server uses the node-static module to serve a client-side page which performs an AJAX call for “temperature.json” and plots current temperature. The plot is created using the highcharts JavaScript package to create a dynamic graph which moves along the x-axis over time (check out the highcharts demo page to get a better idea of dynamic charts). One thing to note is that the sensor is only accurate to ±0.5 °C, so although the temperature is rounded to one decimal place, the precision implied by the default highcharts y-axis may be a bit misleading. Overall though the highcharts package is pretty slick, and the plot looks great on my new Nexus 7!
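The client-side polling boils down to something like this (a simplified sketch – the chart variable and the five-second interval are illustrative; the full code is in the GitHub repo linked below):

// Fetch the latest reading and push it onto the chart's series
function pollTemperature(series) {
    jQuery.getJSON('/temperature.json', function (data) {
        // data is [unix-time-in-ms, temperature]
        series.addPoint(data, true, true); // redraw, and shift the oldest point off
    });
}

setInterval(function () { pollTemperature(chart.series[0]); }, 5000);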

Raspberry Pi Temperature Sensor Plot

I’ve pushed the code to GitHub as it may be useful to others who are also new to the Pi and NodeJS. One of the features of NodeJS I like the most is the ease of testing and deployment – you can run the server right in the terminal window without super-user permissions and get debugging info straight away. Furthermore, given the ease of creating web apps and the small resource footprint (obviously depending on what you’re doing), I’ll definitely be looking to use NodeJS/JavaScript as a development platform for the Pi in the future.

3DLondon

Visualising London’s buildings with 3D mapping

I recently discovered a JavaScript library called OSM Buildings which provides a 3D representation of buildings from Open Street Map (OSM) data on a 2D map. OSM Buildings doesn’t generate a complete 3D model, but uses a clever shading technique to make buildings appear ’3D’. It can be used with either the OpenLayers or Leaflet JavaScript map APIs.

3DLondon map screenshot

I was interested in whether OSM Buildings could be used to visualise non-OSM data, in particular the Geoinformation Group’s UKMap product, to create a complete ’3D’ city map. UKMap is a commercial, attribute-rich map database designed to compete with the UK’s Ordnance Survey products, and contains a LiDAR-derived height for most structures in urban areas. Furthermore, UK academics can access the UKMap data via the MIMAS Landmap service. As part of my PhD research I obtained a copy of the UKMap basemap and building heights for London from the Landmap team.

The OSM Buildings API provides a method to add a buildings layer from a GeoJSON object, so I wrote a simple Python CGI script to grab building features from the UKMap PostGIS database and return them as GeoJSON. The embedded SQL query (via the excellent Psycopg module) uses the new PostGIS 2.0 function ST_AsGeoJSON to return a JSON representation of the geometry. Unfortunately this isn’t as straightforward as it seems, as PostGIS doesn’t return any feature attributes. However, after reading this post at the Postgres OnLine Journal and installing Andrew Dunstan’s backport of JSON support for PostgreSQL 9.1, I managed to get the query to return a proper JSON representation of the attributes and geometry.

To reduce data volume (there are ~5 million features in the database) the CGI script accepts a bounding-box parameter in well-known text (WKT) form, and only features within the bounding box are returned. Here’s a snippet showing the embedded SQL query:

cursor.execute('''
    SELECT row_to_json(fc)
    FROM (SELECT 'FeatureCollection' As type, array_to_json(array_agg(f)) As features
          FROM (SELECT 'Feature' As type,
                ST_AsGeoJSON(ST_Transform(lg.the_geom, 4326))::json As geometry,
                row_to_json((SELECT l FROM (SELECT height) As l)) As properties
                FROM "3DLondon" As lg
                WHERE ST_Within(lg.the_geom,
                                ST_Transform(ST_GeomFromText(%s, 4326), 27700))
                LIMIT 100000) As f) As fc;''', [bbox_wkt])

A sample of the features and their geometry returned as GeoJSON:

{"type":"FeatureCollection","features":[{"type":"Feature","geometry":{"type":"MultiPolygon","coordinates":[[[[-0.099928777847266,51.514483900172905],[-0.100343210931384,51.514457880418973],
//...

Thanks to some good documentation at OSM Buildings, the front end is a simple JavaScript function which adds the buildings to the map via a jQuery AJAX call.

// Request data from GeoJSON server and add to map
function getBuildings(bounds_wkt) {
    jQuery.postJSON('/3DLondon/get_geojson.py?bbox_wkt=' + bounds_wkt,
        function (data) {
            build.addTo(map)
                 .setStyle({ wallColor: 'rgb(200,190,180)', roofColor: 'null',
                             strokeRoofs: true })
                 .geoJSON(data);
        });
}

Using this function it was fairly straightforward to add some functions to listen for Leaflet map events to refresh the building data when the map was moved or zoomed.

//Listen for map pan events and get new buildings data
map.on('moveend', function(e) {
   getBuildings(getBBoxWKT());
});

From the screenshot above you can see that OSM Buildings is doing a great job of rendering stuff in ’3D’. The basemap is Open Street Map data, styled and served by CloudMade with OSM building footprints removed for clarity. In a nutshell I was pretty impressed with both the OSM Buildings and the Leaflet libraries. You can check out the complete source code at https://github.com/talltom/3DLondon, and here’s another screenshot over a larger area with the buildings rendered in blue.

3DLondon map of St Paul’s/The City and surrounds

PostGISDroid GitHub pages

Following in the footsteps of the Raster Processing Suite, I’ve added GitHub pages for the PostGISDroid script, which I wrote last year. This was a prototype Python script, built on the Scripting Layer for Android, to log the location of an Android device to a remote PostGIS server. Check it out here: http://talltom.github.com/PostGISDroid.

A track from PostGISDroid in Quantum GIS.

If you’re looking for a full-blown Android client for mobile data acquisition check out the Sense Cloud Framework: http://ceg-sense.ncl.ac.uk/geoanorak/nclsensecloud.html.

Raster Processing Suite at GitHub

As mentioned previously, I’m in the process of migrating content from my Newcastle pages to this site. Given that I’m already hosting most of my software projects on GitHub, I’ve taken the leap and created GitHub pages for the Raster Processing Suite QGIS plug-in. The Raster Processing Suite is a nifty bit of software I developed during my PhD to help with exploratory numerical analysis of time-series AVHRR satellite imagery (see Holderness et al., 2013). Part of the PyRaster tools (more on those soon), it is a raster calculator with a Python scripting interface (including automatic script generation) for raster image processing in Quantum GIS (see screenshot below).

GitHub pages provide free web hosting for content related to software projects. You can check out the Raster Processing Suite at https://github.com/talltom/PyRaster. Furthermore, with a bit of tweaking I’ve also managed to get the pages to host the QGIS XML repository for direct download into QGIS: http://talltom.github.com/Raster-Processing-Suite/qgis_repo.xml.

I’ll post links to GitHub pages for other projects in the coming weeks.

Screenshot of the Raster Processing Suite QGIS plug-in