OpenCorporates is proud to announce that it has today added German company data, making Germany the 130th jurisdiction in the OpenCorporates database and one of the largest. The German data adds over 5 million companies to the database.
This was a huge project for us, for a number of reasons:
- There are significant problems with the underlying data quality, particularly the identifiers. It took months of detailed analysis to understand these problems and to come up with the right, sustainable solutions.
- The data isn’t available from a single source, or as easily parseable structured data; most of it exists as a series of unstructured gazette notices in the Handelsregisterbekanntmachungen. We’ve spent months writing and testing code to extract the information embedded in these notices – addresses, officers, subsequent and previous registrations (more on that coming soon) – and now at last the result meets OpenCorporates’ high standards.
- This isn’t just about OpenCorporates having the data. Germany is an important jurisdiction and this is a crucial dataset that must be open for everyone. We’ve been working with Transparency International Deutschland and Open Knowledge Germany, among others, to use this dataset to show the German government what good looks like, so that the government itself will make the information available as open data. While we’re proud of the work we’ve done, we don’t think others should have to go through this pain to get German company information as high-quality structured open data.
We’ve also been working with some of the best journalists in Germany, giving them early access to the data to see whether they could find anything interesting. Today they delivered on this – in a major way – with stories from Süddeutsche Zeitung, NDR and CORRECTIV. Here’s what they’ve found so far:
- Süddeutsche Zeitung – The Owner Remains Secret
- NDR – Who is behind which company?
- CORRECTIV – WHO OWNS HAMBURG? Rent under palm trees
Over the next few days we’ll be reporting more on this landmark dataset, including our analysis of the German company data, and why it’s so important for it to be open data.
We’re also donating the dataset to Open Knowledge Germany for them to make available to all under an open licence (more on this tomorrow). Watch this space!