Last week, the UK Companies House took the wraps off its URIs for companies. Now this is actually rather more significant than you might think (though frankly we are going to struggle to make it sexy).
First, it’s significant from the UK open data perspective, as in part it was the lack of a persistent URL which led to Companies Open House, an early precursor to OpenCorporates. Like quite a few company registers around the world, Companies House has URLs based on user sessions, which means that they change each time you visit them. The lack of permanent URLs for companies is a real problem when trying to reference companies, as anyone will know who’s ever been sent an email containing a Companies House URL only to find it takes you to the search page, not the page for the company concerned.
Second, it is significant because, with a little bit of arm-twisting by the Cabinet Office, Companies House has created these with the input of the open data community in the shape of OpenCorporates. Now URIs for companies are something we’ve got a bit of an opinion on, publishing over 27 million of them in over 30 jurisdictions, including 7 million for UK companies. While much about the process was cumbersome and frustrating, what was gratifying was that Companies House listened to our input and changed their URI structure (for help with understanding the difference between URLs and URIs, see this Wikipedia article and this StackOverflow answer).
Now the information you can get from those URIs is pretty bare to be honest, but that’s not really the point here. The point is that the government has now produced URIs which not only permanently identify the companies, but also follow the same principles as those used by the open data community, being based only on the company number.
So, the Companies House URI for Vodafone PLC is http://business.data.gov.uk/id/company/01833679 and the OpenCorporates one is http://opencorporates.com/id/companies/gb/01833679. Both of these redirect to an HTML web page with a similar URL by default, or to pages that return JSON, XML or RDF data if that is requested. You can see how easy it is to switch between the two, and in fact every UK company page on OpenCorporates now has a link to the Companies House page.
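To make the "easy to switch" point concrete, here is a minimal Python sketch. The two URI prefixes are taken from the Vodafone example above; everything else is our own illustration rather than anything Companies House or OpenCorporates publishes. The function names are hypothetical, and the idea that you ask for JSON by sending an `Accept: application/json` header is an assumption about how the data formats are requested. The sketch only builds the request and never sends it.

```python
from urllib.request import Request

# Prefixes taken from the Vodafone example in the post.
CH_PREFIX = "http://business.data.gov.uk/id/company/"
OC_PREFIX = "http://opencorporates.com/id/companies/gb/"


def companies_house_uri(company_number: str) -> str:
    """Build the Companies House URI from a UK company number."""
    return CH_PREFIX + company_number.strip()


def opencorporates_uri(company_number: str) -> str:
    """Build the OpenCorporates URI from a UK company number."""
    return OC_PREFIX + company_number.strip()


def to_opencorporates(ch_uri: str) -> str:
    """Switch a Companies House URI to its OpenCorporates equivalent."""
    if not ch_uri.startswith(CH_PREFIX):
        raise ValueError("not a Companies House company URI: " + ch_uri)
    return opencorporates_uri(ch_uri[len(CH_PREFIX):])


def to_companies_house(oc_uri: str) -> str:
    """Switch an OpenCorporates UK URI to its Companies House equivalent."""
    if not oc_uri.startswith(OC_PREFIX):
        raise ValueError("not an OpenCorporates UK company URI: " + oc_uri)
    return companies_house_uri(oc_uri[len(OC_PREFIX):])


def json_request(uri: str) -> Request:
    """Prepare (but do not send) a request asking for JSON.

    Assumption: the format is selected with an Accept header.
    """
    return Request(uri, headers={"Accept": "application/json"})
```

Because both schemes are based only on the company number, `to_opencorporates(companies_house_uri("01833679"))` gives back the OpenCorporates URI quoted above, with no lookup table needed.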
Some of the more astute business people out there are surely asking, “Why would they do that? Why would OpenCorporates help Companies House come up with a URI system that mirrors their own and so, arguably, makes their own less essential, reducing their chance of being the monopoly supplier?”
Well, it’s not naïvety; nor is it just a determination to do the Right Thing, although that’s very important to us, and in fact the choice of URI structure in OpenCorporates is designed to prevent anyone, including OpenCorporates, being able to have a monopoly over company identifiers. It’s also a realisation that unless we pull governments kicking and screaming into the open data world, we’ll end up with a world with little power for citizens and little opportunity for those who want to build a business based on open data and doing the right thing.
This is especially true for corporate data, and in a world where company registers publishing all their data in a free and open way is the exception rather than the rule (shamefully, none in the EU do so), this is an important step, and one we were happy to support.
Now the question is: will this just be another isolated island of URIs, or will the government actually become part of the open data community, by linking out to OpenCorporates and other open data projects? And will they use OpenCorporates’ reconciliation service to match their own data, suppliers, etc. to real-world legal entities, or continue using proprietary and closed identifiers such as Dun & Bradstreet’s poisonous DUNS numbering system? Answers in the comments.