This post originally appeared on the author’s blog.
I’ve recently been getting involved in the Open Data movement through association with ODI Leeds, where I spend some of my working week as a co-worker. I’m the sort of person who learns by doing, so I’ve been on the lookout for a project which would give me an opportunity to research and use Open Data. I came across the OpenCorporates datasets via the ODI Awards: given my involvement in the Hebden Bridge Business Forum, a plan began to form…
The OpenCorporates data covers businesses, past and current, across a range of geographic areas. I was interested in data about businesses in the HX7 postcode district, specifically the distribution and density of these businesses across the region. This suggested creating a map, but I wasn’t sure which would be the clearest way of visualising the data. I decided to trial two approaches:
- Count of businesses registered in a postcode
- Individual business entities located by postcode
For good measure, I decided to add Postcode district boundaries for the HX postcodes as a reference layer.
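The first approach boils down to grouping business records by postcode and counting them. As a rough illustration only: the sketch below is in Python and assumes each record is a dict with a hypothetical `postcode` key. The post doesn't describe the actual record layout or the code used for this step, and the sample records are made up.

```python
from collections import Counter


def normalise_postcode(raw):
    """Upper-case a UK postcode and put a single space before the 3-character inward code."""
    compact = "".join(raw.split()).upper()
    return compact[:-3] + " " + compact[-3:]


def count_by_postcode(businesses):
    """Count businesses per postcode; `postcode` is an assumed field name."""
    return Counter(normalise_postcode(b["postcode"]) for b in businesses)


# Made-up sample records, for illustration only
sample = [
    {"name": "Example Bakery Ltd", "postcode": "hx7 8aa"},
    {"name": "Example Books Ltd", "postcode": "HX78AA"},
    {"name": "Example Cycles Ltd", "postcode": "HX7 6EU"},
]

for postcode, count in count_by_postcode(sample).most_common():
    print(postcode, count)
```

Normalising the postcode first matters because the same postcode can turn up with different spacing or case, which would otherwise split one location into several counts.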
Preparing the Data
I obtained the data from a range of sources:
- Business data from the OpenCorporates API, searching on registered addresses containing HX7
- Postcode geocoding from the Ordnance Survey Code-Point Open data, used to geocode the business data
- Postcode district boundaries from a KML file on the HX postcode area Wikipedia page!
The data needed some processing before it could be mapped:
- OS Code-Point data was coded to the OS National Grid, which is not natively supported by Leaflet. While I could have used a plugin such as Proj4Leaflet, I chose to pre-process the data with the gbify-geojson library.
- Because the geocoding was by postcode, businesses sharing a postcode would be plotted on exactly the same point and obscure one another. To avoid this, I added a little bit of jitter to the coordinates, positioning them randomly in a roughly circular area around the postcode centre.
- The KML file also needed converting to a GeoJSON form (although, again, plugins are available). I used the online toGeoJson tool to do this, given it was a one-off requirement.
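The jitter step described above can be sketched as follows. This is a minimal Python illustration, not the code used for the site: the 30 metre radius, the function name and the sample coordinates are all assumptions, and the metres-to-degrees conversion is a small-offset approximation (which is one reason the area is only roughly circular).

```python
import math
import random

METRES_PER_DEGREE_LAT = 111_320.0  # approximate length of one degree of latitude


def jitter(lat, lon, max_radius_m=30.0, rng=random):
    """Return (lat, lon) moved to a random point within max_radius_m of the input."""
    # Taking the square root spreads points evenly over the disc
    # instead of bunching them near the centre.
    distance = max_radius_m * math.sqrt(rng.random())
    bearing = rng.uniform(0.0, 2.0 * math.pi)

    # Convert the metre offsets to degrees; a degree of longitude
    # shrinks with the cosine of the latitude.
    dlat = distance * math.cos(bearing) / METRES_PER_DEGREE_LAT
    dlon = distance * math.sin(bearing) / (
        METRES_PER_DEGREE_LAT * math.cos(math.radians(lat))
    )
    return lat + dlat, lon + dlon


# Illustrative coordinates only, roughly in the Hebden Bridge area
print(jitter(53.74, -2.01))
```

Passing a seeded `random.Random` instance as `rng` makes the output repeatable, which is handy if the pre-processed file is regenerated and the markers should stay put.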
Building the Site
As previously mentioned, I used the Cloudmade Leaflet mapping API, and the astoundingly beautiful Stamen Watercolor map tiles. My code is available on GitHub, and the site is also hosted there. The finished rendering looks like this.
Green circles show the number of businesses registered at a given postcode, and blue circles are the individual businesses. I prefer the individual business visualisation, as it gives a good sense of density (especially given the effect of overlaid semi-transparent circles) while still revealing detail as you zoom in.
A word on Registered Addresses
There are a couple of facts that I’d not really considered when starting the project. These concern the registered address: it is common practice (though not universal) to give the company accountant’s address as the registered address. This has three implications:
- There may be businesses operating in the region with a registered address outside the area
- There may be businesses that don’t operate in the area but have a registered address here
- What I’m really visualising in a good number of cases is the most popular accountants in the area!
I’d like to continue developing this into a toolset for the local business community. To be truly useful, this would ideally include the concept of a business address (e.g. retail premises, office). This would help with the regional anomalies mentioned above. Some of this is available in OpenCorporates, but needs to be manually added. I also need to consider unregistered businesses such as sole traders, as these form a significant part of the business life of the town. Finally, it’d be great to tag this by business sector. Again, this is available for some businesses on OpenCorporates (in the Industry Codes field), but not all. Some approaches I’ve considered include:
- Providing a business registration form to capture extra information about local businesses
- Augmenting the data with further Open Data published by Calderdale Council
- Researching other datasets and augmenting manually (e.g. yell.com)
- Local research (e.g. walking the area) and recording business details. I took this approach previously with the Hebden Rising site I created in the aftermath of a serious flooding incident in the area.
Some of these are clearly more manually intensive and, more importantly, may limit my ability to share the data. Ultimately, this could be not only a great tool for local businesses, but a real boon to visitors to the region. I see great potential in this dataset and look forward to developing the applications. Watch this space…
Giles Dring, Freelance IT Consultant