This is the latest in a series of blog posts relating to the opening up and launching of German company information as open data.
In Germany, the incorporation of companies (and subsequent changes) is handled not by a central register, but by district courts (‘Amtsgerichte’) – in fact by around 150 such courts (out of a total of 600). Unfortunately, the identifiers each court issues are not unique across Germany, because other courts can issue the same ones.
For example, there may be several courts that issue the identifier HRB 1234 – the identifier becomes unique only when paired with the court that issued it. Unfortunately there is no universally accepted consistent way of representing the court, meaning that national company identifiers in Germany are ill-defined and non-unique.
Even more problematic, when a company moves its headquarters from one court area to another, it gets issued a new identifier by the new court (which again is unique only to that court). Thus there are three problems with identifiers in Germany:
- Lack of uniqueness
- No official or consistent way of representing the identifiers
- Multiple registrations (and thus identifiers) per legal entity
This is a recipe for low data quality. It also makes combining essential datasets (such as procurement data and lobbying data) difficult, to say the least, creating an opaque corporate environment that benefits only criminal and other anti-social activities.
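To make the first problem concrete, here is a minimal sketch of what happens when two datasets are joined on the bare identifier alone. The records and company names are invented for illustration only.

```python
# Hypothetical records: (court name, identifier, company name).
records = [
    ("Amtsgericht Köln", "HRB 1234", "Example A GmbH"),
    ("Amtsgericht Bochum", "HRB 1234", "Example B GmbH"),
]

# Keying on the bare identifier silently overwrites one company with the other.
by_number = {number: name for _court, number, name in records}

# Keying on the (court, identifier) pair keeps both registrations distinct.
by_court_and_number = {
    (court, number): name for court, number, name in records
}

print(len(by_number), len(by_court_and_number))
```

The first dictionary ends up with one entry and the second with two, which is why the court has to be part of the key.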
This post will explain how OpenCorporates tackled the first two problems; the third problem – of multiple related registrations – will be addressed in a post in the next week or so.
Representing identifiers
Ignoring for a moment the problem of identifying the courts, the base identifier is actually not just a number, but a combination of the number and the register in which it is entered. Each court maintains a number of registers for different groups of company types. For example, Handelsregister Abteilung B – abbreviated to “HRB” – covers the following company types: public limited companies (AG), associations limited by shares (KGaA), limited liability companies (GmbH), and others. Handelsregister Abteilung A, on the other hand, covers individual enterprises (e.K.), commercial (general and limited) partnerships (OHG and KG), corporations under public law running a business, and EEIGs.
There are also registers for co-ops (GnR – Genossenschaftsregister), partnerships (PR – Partnerschaftsregister) and associations (VR – Vereinsregister). Because the same number can appear on different registers, you need to pair the number with the register, e.g. HRB123, even to identify the registration within the same court.
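As a rough sketch of this pairing, the snippet below normalises a register abbreviation and number into the compact form used in this post (e.g. HRB123). The function name and the list of accepted registers are assumptions made for illustration, not OpenCorporates' actual code, and real register data may contain forms this does not handle.

```python
import re

# Register abbreviations mentioned in this post (HRA is Handelsregister Abteilung A).
REGISTERS = ("HRA", "HRB", "GnR", "PR", "VR")

# Register abbreviation, optional whitespace, then the number, e.g. "HRB 123".
BASE_ID = re.compile(r"^\s*(HRA|HRB|GnR|PR|VR)\s*(\d+)\s*$", re.IGNORECASE)


def normalize_base_identifier(raw: str) -> str:
    """Return the register and number joined without spaces, e.g. 'HRB123'."""
    match = BASE_ID.match(raw)
    if match is None:
        raise ValueError(f"not a recognised register number: {raw!r}")
    register, number = match.groups()
    # Restore the canonical spelling of the register (e.g. 'gnr' -> 'GnR').
    canonical = next(r for r in REGISTERS if r.lower() == register.lower())
    return f"{canonical}{number}"


print(normalize_base_identifier("HRB 123"))
print(normalize_base_identifier("HRA 123"))
```

The two calls return different strings for the same number, which is the point: the number alone does not identify the registration.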
Lack of uniqueness
The problem of representing the courts is a little trickier, and we’ve seen a number of different solutions used. In our case, we could refer to our (public) policy document for handling company number problems. This states that we should make them unique within the jurisdiction by adding a prefix for the area or register (aka namespacing the identifier). We looked at several options for how to do this.
In the end, we used the so-called “XJustiz-ID” identifiers for the courts (examples can be found here). These are managed by the German Ministry of Justice to support data transmission between official agencies, and are also used by the EU in the so-called EUID (which is used in a DG-Justice portal for company information and is somewhat poorly documented). In other jurisdictions with multiple registers, but where there is no local official identifier, we plan to use the Registration Authorities list managed by the Global LEI Foundation.
This wouldn’t be too bad, if that’s all there was. However, it’s complicated considerably by the consolidation of courts over time, for example where the authority to register companies is transferred from several smaller courts to a single one. There are two ways this is represented in documents referring to the company (e.g. gazette notices). The first is a variation of the court name. For example: “Amtsgericht Köln früher [formerly] Amtsgericht Eschweiler”. The EUID represents this as a pairing of XJustiz-IDs, e.g. “R3101_R3104” in the above case.
However, in other cases, suffixes are appended by the register to the number (e.g. “HRB 123 PI”) – these numbers represent companies that had previously been registered in an adjoining district court which now no longer handles company registration. As an example, these are the suffixes that can be found in the Amtsgericht Pinneberg register – EL, IZ, ME, PI. They represent companies that were formerly registered with Ellerbek, Itzehoe, Meldorf and Pinneberg District Courts respectively. (This approach is consistent with the way that the EUID is constructed, which retains the suffix.)
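A minimal sketch of how such a suffixed number might be split into its parts follows. The names `RegisterEntry` and `parse_entry` are hypothetical, and the only suffix set included is the Pinneberg one listed above. Other courts would need their own sets.

```python
import re
from typing import NamedTuple, Optional, Set


class RegisterEntry(NamedTuple):
    register: str          # e.g. "HRB"
    number: str            # kept as a string, exactly as written
    suffix: Optional[str]  # e.g. "PI", or None when absent


# Suffixes listed above for the Amtsgericht Pinneberg register.
PINNEBERG_SUFFIXES = {"EL", "IZ", "ME", "PI"}

# Register letters, digits, then an optional run of suffix letters.
ENTRY = re.compile(r"^\s*([A-Za-z]+)\s*(\d+)\s*([A-Za-z]+)?\s*$")


def parse_entry(raw: str, known_suffixes: Set[str]) -> RegisterEntry:
    """Split e.g. 'HRB 123 PI' into register, number and optional suffix."""
    match = ENTRY.match(raw)
    if match is None:
        raise ValueError(f"cannot parse register entry: {raw!r}")
    register, number, suffix = match.groups()
    if suffix is not None:
        suffix = suffix.upper()
        # Reject suffixes this court is not known to use.
        if suffix not in known_suffixes:
            raise ValueError(f"unknown suffix {suffix!r} in {raw!r}")
    return RegisterEntry(register, number, suffix)


print(parse_entry("HRB 123 PI", PINNEBERG_SUFFIXES))
```

Rejecting unknown suffixes is a design choice in this sketch: it surfaces unexpected data rather than quietly folding it into the identifier.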
What this all looks like
When you put this all together, you get an identifier which looks like:
[XJustiz-ID for District Court][Optional: underscore and XJustiz-ID for former District Court][underscore for readability][Register type][Company Number][Optional: suffix for former District Court].
Here are some real-life examples:
- G1309_HRB1234. Company registered at the Neuruppin court, in the HRB register with identifier 1234.
- R2201_R2205_HRB1216. Company registered at the Bochum court, formerly the Herne-Wanne (Herne) court, in the HRB register with identifier 1216.
- X1721R_HRB918SB. Company registered at the Lübeck court, formerly the Schwarzenbek court (hence the suffix SB), in the HRB register with identifier 918.
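The layout above can be sketched in a few lines of code. This is an illustration under stated assumptions, not OpenCorporates' implementation: the function names `build_identifier` and `split_identifier` are hypothetical, and the parser simply assumes that underscores separate the court IDs from the register part, as they do in the three examples.

```python
import re
from typing import Dict, Optional


def build_identifier(
    court_id: str,
    register: str,
    number: str,
    former_court_id: Optional[str] = None,
    suffix: Optional[str] = None,
) -> str:
    """Join the components in the order described above."""
    courts = court_id if former_court_id is None else f"{court_id}_{former_court_id}"
    return f"{courts}_{register}{number}{suffix or ''}"


# Register letters, digits, then optional suffix letters, e.g. "HRB918SB".
LOCAL_PART = re.compile(r"^([A-Za-z]+)(\d+)([A-Za-z]*)$")


def split_identifier(identifier: str) -> Dict[str, Optional[str]]:
    """Reverse build_identifier for identifiers shaped like the examples."""
    *court_ids, local = identifier.split("_")
    match = LOCAL_PART.match(local)
    # Expect one court ID, or two when a former court is recorded.
    if match is None or not 1 <= len(court_ids) <= 2:
        raise ValueError(f"unexpected identifier shape: {identifier!r}")
    register, number, suffix = match.groups()
    return {
        "court_id": court_ids[0],
        "former_court_id": court_ids[1] if len(court_ids) == 2 else None,
        "register": register,
        "number": number,
        "suffix": suffix or None,
    }


print(build_identifier("R2201", "HRB", "1216", former_court_id="R2205"))
print(split_identifier("X1721R_HRB918SB"))
```

Keeping the number as a string avoids any assumption about leading zeros or other formatting in the source registers.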
The original numbers
Finally we have also stored the court name and identifiers in as close to their original form as possible in the “Native company number” field. This provides additional terms that our users can search with to match German companies without needing to know the court identifier.
Image: Adreßbuch sämmtlicher Bewohner der Stadt Heidelberg für 1846, Universitätsbibliothek Heidelberg (CC-BY-SA)
EU Horizon 2020 
The collection of the German company data has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 780247, TheyBuyForYou