This is a guest blog by James Moulding. If you would like to write for us, get in touch at email@example.com
Yesterday, I attended a #FlashHacks event by OpenCorporates at Hub Kings Cross. #FlashHacks are mini-hacks organised by OpenCorporates for coders and non-coders who want to liberate corporate data for a better world.
At #FlashHacks, Taras Fedirko, an anthropologist from Durham University researching campaigns for greater corporate transparency, and I, an organiser for Cybersalon, a think-tank on issues in network cultures, came to learn a bit more about working with and visualising data.
Starting off, we were introduced to Kibana, a data visualisation tool, and worked from various saved queries. Working from OpenCorporates’ expansive datasets of corporate data, covering some 85 million companies, we explored data on companies set up between 1900 and 2014. We began by differentiating the results by type of economic activity (SIC codes, using the ‘sic_division_titles’ query) and then narrowed the data further to companies that had dissolved in this period (using the ‘current_status’ query).
We played with a range of parameters in an attempt to make the data meaningful through increasingly precise categorisation. Attempting to display more recent data, particularly around the recent economic recession, we tried to display the top 15 postcodes by the number of businesses dissolved. However, in some cases the database could not distinguish between the first and second parts of a postcode. Despite this, it was interesting to see Greater London postcodes figure quite prominently in this data.
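One way around the postcode problem, had we been working outside Kibana, would be to cut each postcode down to its first (outward) part before counting. The following is only a sketch in Python, not something we ran on the day: the function names `outward_code` and `top_districts` are made up for illustration, and it assumes the postcodes arrive as plain strings in the standard UK format.

```python
import re
from collections import Counter

# Standard UK format: outward part (e.g. "N1", "EC1V") followed by an
# inward part that is always one digit and two letters (e.g. "9AB").
_POSTCODE = re.compile(r"([A-Z]{1,2}\d[A-Z\d]?)(\d[A-Z]{2})?")

def outward_code(postcode: str) -> str:
    """Return the first part of a postcode, with or without a space in it."""
    cleaned = postcode.strip().upper().replace(" ", "")
    match = _POSTCODE.fullmatch(cleaned)
    # Fall back to the cleaned string if the value is not a recognisable postcode.
    return match.group(1) if match else cleaned

def top_districts(postcodes, n=15):
    """Count postcodes by outward part and return the n most common."""
    counts = Counter(outward_code(p) for p in postcodes if p and p.strip())
    return counts.most_common(n)
```

Calling `top_districts(list_of_postcodes)` would then give the 15 most frequent postcode districts, regardless of whether the source data kept the space between the two parts.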
Continuing to mess around with the data, we set out to compare the aggregate number of dissolutions in the IT sector between Tony Blair’s and David Cameron’s periods of office. We abandoned this because the IT sector is distributed across several SIC codes and Kibana was not able to run the query.
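For anyone wanting to try this comparison on exported data instead, the idea can be sketched in a few lines of Python. This is an illustration under several assumptions: the record fields `sic_code` and `dissolution_date` are hypothetical names, the SIC codes listed are placeholders rather than a vetted definition of the IT sector, and Cameron's period is cut off at the end of 2014 because that is where our data stopped.

```python
from datetime import date

# Blair took office on 2 May 1997 and left on 27 June 2007;
# Cameron took office on 11 May 2010.
PERIODS = {
    "Blair": (date(1997, 5, 2), date(2007, 6, 27)),
    "Cameron": (date(2010, 5, 11), date(2014, 12, 31)),  # assumed data cutoff
}

# Placeholder codes only: a real analysis would need an agreed list.
IT_SIC_CODES = {"62012", "62020", "63110"}

def count_dissolutions(records, sic_codes, periods=PERIODS):
    """Count dissolved companies in the given SIC codes for each period."""
    counts = {name: 0 for name in periods}
    for rec in records:
        if rec.get("sic_code") not in sic_codes:
            continue
        dissolved = rec.get("dissolution_date")
        if dissolved is None:  # company has not been dissolved
            continue
        for name, (start, end) in periods.items():
            if start <= dissolved <= end:
                counts[name] += 1
    return counts
```

Treating the sector as a set of codes, rather than a single code, is what gets around the problem of IT being spread across several SIC categories.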
Finally, we decided to compare the frequency of dissolutions across the 10 economic sectors with the most business closures, between Tony Blair’s and David Cameron’s periods of office. We attempted to export the data as .CSV files so that we could visualise it more constructively in an external spreadsheet, because Kibana cannot show percentages on the charts. However, the export button in Kibana didn’t work, and we discovered it had no link attached to it. This may be a bug in the Kibana software, so we have reported it.
Instead, we copied the figures ourselves into a Google Spreadsheet and visualised them there.
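The percentage step we did in the spreadsheet is simple enough to sketch in Python as well. This is a minimal illustration, assuming the copied figures are held as a mapping from sector name to number of dissolutions; the function name `percentage_shares` is our own invention for the example.

```python
def percentage_shares(counts):
    """Convert raw counts per sector into percentages of the total."""
    total = sum(counts.values())
    if total == 0:
        # Avoid dividing by zero when there are no dissolutions at all.
        return {sector: 0.0 for sector in counts}
    return {sector: round(100 * n / total, 1) for sector, n in counts.items()}
```

Running this once on the figures for each period of office gives shares that can be compared directly, even though the two periods differ in length and in the total number of closures.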
There are problems with the way we relied upon SIC codes to categorise the data, in that many businesses do not accurately list their organisation’s SIC code. Lots of new businesses select ‘Other Business Activities’, which leads to a bloated representation of that category.
It was lots of fun and we’ll definitely be back for another #FlashHacks event!
James is a politics graduate of the University of Westminster. He is developing an educational boardgame and app, Imperialism in Space, and is working on building a startup, OpenBar. James is currently an Event Coordinator for Cybersalon, and also spends his time working on supporting the Institute of Applied Social Innovation. He has helped develop e-learning research at the University of Westminster and was Assistant Conference Director of Wikimania 2014. James is also a proud player of Class Wargames.