This post originally appeared on the author’s blog.

I’ve recently been getting involved in the Open Data movement through association with ODI Leeds, where I spend some of my working week as a co-worker. I’m the sort of person who learns by doing, so I’ve been on the lookout for a project that would give me an opportunity to research and use Open Data. I came across the OpenCorporates datasets via the ODI Awards and, given my involvement in the Hebden Bridge Business Forum, a plan began to form…

The Plan

The OpenCorporates data covers businesses, past and current, across a range of geographic areas. I was interested in data about businesses in the HX7 postcode district, specifically the distribution and density of these businesses across the district. This suggested creating a map, but I wasn’t sure which would be the clearest way of visualising the data. I decided to trial two approaches:

  1. Count of businesses registered in a postcode
  2. Individual business entities located by postcode

For good measure, I decided to add Postcode district boundaries for the HX postcodes as a reference layer.
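
The first approach boils down to grouping business records by postcode and counting them. The snippet below is a minimal sketch of that idea rather than the code actually used for the site: the record shape (a `postcode` string on each business), the function names and the sample records are all made up for illustration.

```javascript
// Hypothetical record shape: { name, postcode }. Not the real OpenCorporates schema.

// Upper-case and collapse whitespace so "hx7 8aa" and "HX7  8AA" count as one postcode.
function normalisePostcode(postcode) {
  return postcode.trim().toUpperCase().replace(/\s+/g, ' ');
}

// Build a Map of postcode -> number of businesses registered there.
function countByPostcode(businesses) {
  const counts = new Map();
  for (const business of businesses) {
    const key = normalisePostcode(business.postcode);
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return counts;
}

// Made-up sample records, purely for illustration.
const sampleBusinesses = [
  { name: 'Example A Ltd', postcode: 'HX7 8AA' },
  { name: 'Example B Ltd', postcode: 'hx7  8aa' },
  { name: 'Example C Ltd', postcode: 'HX7 6ZZ' },
];

const sampleCounts = countByPostcode(sampleBusinesses);
console.log(sampleCounts.get('HX7 8AA')); // 2
console.log(sampleCounts.get('HX7 6ZZ')); // 1
```

Normalising the postcode before counting matters because the same postcode can appear with different casing or spacing in the source records.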

Preparing the Data

I obtained the data from a range of sources

Presenting the data in GeoJSON format seemed sensible, given I would be using Cloudmade Leaflet to display the map. I wrote a JavaScript preparation script which created two GeoJSON files: the first with geocoded business data and the second with postcodes augmented with business counts. There were a few issues to resolve in this process:

  • OS Code-Point data was coded to the OS National Grid, which is not natively supported by Leaflet. While I could have used a plugin such as Proj4Leaflet, I chose to pre-process the data with the gbify-geojson library.
  • Geocoded businesses would interfere with others located at the same postcode. To avoid this, I added a little bit of jitter to the coordinates, positioning them randomly in a roughly circular area around the postcode centre.
  • The KML file also needed converting to a GeoJSON form (although, again, plugins are available). I used the online toGeoJson tool to do this, given it was a one-off requirement.
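
The jitter mentioned in the list above could be sketched along the following lines. This is an illustration under stated assumptions, not the script from the repository: the function names, the 30 metre radius and the example coordinates (roughly central Hebden Bridge) are mine, and the metres-to-degrees conversion is a flat-earth approximation that is only reasonable over short distances.

```javascript
// Approximate length of one degree of latitude in metres (an approximation).
const METRES_PER_DEGREE_LAT = 111320;

// Return a point chosen uniformly at random inside a circle of `radiusMetres`
// around `[lon, lat]`. GeoJSON stores coordinates as [longitude, latitude].
// `random` is injectable so the behaviour can be tested deterministically.
function jitter([lon, lat], radiusMetres, random = Math.random) {
  // Taking the square root spreads points evenly over the disc
  // instead of clustering them near the centre.
  const distance = radiusMetres * Math.sqrt(random());
  const angle = 2 * Math.PI * random();

  const dLat = (distance * Math.sin(angle)) / METRES_PER_DEGREE_LAT;
  // A degree of longitude shrinks with the cosine of the latitude.
  const metresPerDegreeLon = METRES_PER_DEGREE_LAT * Math.cos((lat * Math.PI) / 180);
  const dLon = (distance * Math.cos(angle)) / metresPerDegreeLon;

  return [lon + dLon, lat + dLat];
}

// Wrap a (hypothetical) business record as a GeoJSON Point feature,
// jittered around its postcode centre. The 30 m default is an assumed value.
function toJitteredFeature(business, postcodeCentre, radiusMetres = 30) {
  return {
    type: 'Feature',
    geometry: { type: 'Point', coordinates: jitter(postcodeCentre, radiusMetres) },
    properties: { name: business.name, postcode: business.postcode },
  };
}

// Approximate coordinates, for illustration only.
const exampleCentre = [-2.01, 53.74];
console.log(toJitteredFeature({ name: 'Example A Ltd', postcode: 'HX7 8AA' }, exampleCentre));
```

Doing this at preparation time means the jittered positions are baked into the GeoJSON file, so the markers stay put between page loads.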

Building the Site

As previously mentioned, I used the Cloudmade Leaflet mapping API, along with the astoundingly beautiful Stamen Watercolor map tiles. My code is available on GitHub, and the site is also hosted there. The finished rendering looks like this.

A map showing the distribution of businesses in Hebden Bridge

Visualisation of Hebden Bridge businesses, based on OpenCorporates data

Green circles show the number of businesses registered at a given postcode, and blue circles are the individual businesses. I prefer the individual business visualisation, as it gives a good sense of density (especially given the effect of overlaid semi-transparent circles), yet still reveals details as you zoom in.
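
For anyone wanting to reproduce the two circle layers, one way to express the styling is as plain option objects of the kind Leaflet's `L.circleMarker` accepts. The sketch below is an assumption about how this might be done, not the styling from the repository: the function names, base radius and opacity values are invented, and scaling the radius by the square root of the count is my own choice so that circle area, rather than radius, grows with the number of businesses.

```javascript
// Assumed base radius in pixels for a single business.
const BASE_RADIUS = 4;

// Options for a green "count" circle. The radius grows with the square root
// of the count, so the circle's area is proportional to the count.
function countCircleStyle(count) {
  return {
    radius: BASE_RADIUS * Math.sqrt(count),
    color: 'green',
    weight: 1,
    fillOpacity: 0.4,
  };
}

// Options for a blue circle representing one business. A low fill opacity and
// no outline mean overlapping circles darken, giving a sense of density.
function businessCircleStyle() {
  return {
    radius: BASE_RADIUS,
    color: 'blue',
    stroke: false,
    fillOpacity: 0.3,
  };
}

console.log(countCircleStyle(9).radius); // 12
```

In a Leaflet page these objects would typically be passed as the second argument to `L.circleMarker(latlng, options)` from inside the `pointToLayer` callback of a GeoJSON layer.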

A word on Registered Addresses

There are a couple of facts that I’d not really considered when starting the project. Both concern the registered address: it is common practice (though not universal) for a company to give its accountant’s address as its registered address. This has three implications:

  1. There may be businesses operating in the region with a registered address outside the area
  2. There may be businesses that don’t operate in the area but have a registered address here
  3. What I’m really visualising in a good number of cases is the most popular accountants in the area!

Next Steps

I’d like to continue developing this into a toolset for the local business community. To be truly useful, this would ideally include the concept of a business address (e.g. retail premises, office). This would help with the regional anomalies mentioned above. Some of this is available in OpenCorporates, but needs to be manually added. I also need to consider unregistered businesses such as sole traders, as these form a significant part of the business life of the town. Finally, it’d be great to tag businesses by sector. Again, this is available for some businesses on OpenCorporates (in the Industry Codes field), but not all. Some approaches I’ve considered include:

  • Providing a business registration form to capture extra information about local businesses
  • Augmenting the data with further Open Data published by Calderdale Council
  • Researching other datasets and augmenting manually (e.g. yell.com)
  • Local research (e.g. walking the area) and recording business details. I took this approach previously with the Hebden Rising site I created in the aftermath of a serious flooding incident in the area.

Some of these are clearly more labour-intensive and, more importantly, may limit my ability to share the data. Ultimately, this could be not only a great tool for local businesses, but also a real boon to visitors to the region. I see great potential in this dataset and look forward to developing the applications. Watch this space…

Giles Dring, Freelance IT Consultant
