If we had to describe 2015 in one line, it would be: The Year We Grew Up. Specifically, it was the year we went from being a feisty young start-up to being a critical part of the anti-corruption and business information infrastructure. Again and again, throughout the year, we had journalists, NGOs, lawyers, anti-corruption officials, government tax officers and academics telling us how critical OpenCorporates was. To strengthen this, we spent a lot of time working on back-end processes to improve the quality, freshness and scope of our data – not sexy stuff, but critical if people are going to depend on you.
And how can we not mention how proud and humbled we were to receive the Best Open Data Business Award, handed to us by Sir Nigel Shadbolt and Sir Tim Berners-Lee? In the five years since OpenCorporates launched, we’ve worked hard not just to create a critical resource for the modern world, but also to pioneer a new open data business model, one that prioritises public benefit while being powered by commercial revenue. This means a sustainable (and fast-growing) income that can power our growth and public-benefit impact, rather than being dependent on fickle short-term grants, and one that can effectively compete against traditional proprietary business information companies. Read our thoughts about winning the award here.
More data and features
We released advanced search on the website, to make it easier for people to get to the data they were looking for! Given that we have hundreds of millions of data points covering 90 million companies, this feature not only saves our users a lot of time (allowing exclusion of inactive companies or branches, or restricting to non-profit companies, for example), it also allows searches that weren’t possible before, for example searching addresses or alternative names. Of course, these features are free for all users.
2015 is also the year we made a concerted effort to mobilise our community to scrape financial licenses while our team focused on financial and business licenses from the US. Both of these datasets offer a richer view of business activity across the world. Our community scraped licenses from Bosnia and Herzegovina, Bahrain, Myanmar, Libya and many more. These include information about licensed lenders, brokers and banks operating in that jurisdiction, which can be local operators or subsidiaries of global corporations. Our team scraped business licenses from most US states, racking up over 21 million individual licenses. You can see them here and here.
On the core company data front, most of 2015 was focused on improving our internal processes around the acquisition of this, including tackling a number of tricky data quality issues, such as when identifiers issued by company registers are not unique. However, we didn’t stand still on adding new jurisdictions, with a significant number of new jurisdictions from the US (Kentucky, Indiana, North Dakota, Colorado, Nevada, Alabama and Louisiana) and elsewhere (Panama, Cyprus, Croatia, and Romania, for example). Panama in particular is an important secrecy jurisdiction often used by anonymous companies so having this data on OpenCorporates was hugely satisfying.
The v4 release of our API (our service which allows direct access to our data) is now serving up to 5 million requests a day, and allows searching by registered address, companies starting with a given phrase (e.g. ‘Barclays Bank’), filtering by multiple jurisdictions (e.g. Ireland and UK) and country (e.g. US), richer filtering of inactive and branch companies and a new nonprofit filter, to restrict or exclude nonprofit companies. Users with API keys can also get addresses (and dates of birth) for directors/officers, filter officers by address, date of birth, position or status, and use a completely new way of representing industry codes that is far more granular and allows more powerful search filtering. Don’t miss this walkthrough post by Justin about following the money with Python and our API.
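To give a feel for the filters described above, here is a minimal Python sketch that only builds a company search URL and does not call the API. The base URL, the `companies/search` path, the parameter names (`q`, `jurisdiction_code`, `country_code`, `inactive`, `branch`, `nonprofit`, `api_token`), the pipe separator for multiple jurisdictions and the trailing `*` for prefix matching are all assumptions for illustration, and `build_company_search_url` is a hypothetical helper; please check the API reference before relying on any of them.

```python
from urllib.parse import urlencode

# Assumed base URL for v0.4 of the API; verify against the API reference.
API_BASE = "https://api.opencorporates.com/v0.4"


def build_company_search_url(query, jurisdictions=None, country=None,
                             exclude_inactive=False, exclude_branches=False,
                             nonprofit=None, api_token=None):
    """Build a company search URL. All parameter names are assumptions."""
    params = {"q": query}
    if jurisdictions:
        # Multiple jurisdictions are assumed to be pipe-separated, e.g. "ie|gb".
        params["jurisdiction_code"] = "|".join(jurisdictions)
    if country:
        params["country_code"] = country
    if exclude_inactive:
        params["inactive"] = "false"
    if exclude_branches:
        params["branch"] = "false"
    if nonprofit is not None:
        # True restricts to nonprofits, False excludes them.
        params["nonprofit"] = "true" if nonprofit else "false"
    if api_token:
        params["api_token"] = api_token
    return f"{API_BASE}/companies/search?{urlencode(params)}"


# Active companies in Ireland and the UK whose names start with "barclays bank".
url = build_company_search_url(
    "barclays bank*", jurisdictions=["ie", "gb"], exclude_inactive=True
)
print(url)
```

Keeping URL construction separate from the HTTP call makes it easy to inspect exactly which filters are being sent before spending any of a daily request allowance.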
In late 2015, we started working on OpenGazettes, with the purpose of opening up government gazettes, aggregating them across multiple countries and enabling powerful searches into the unstructured and rich seam of corporate information contained in them. With a 300-year-old legacy, Gazettes are often the canonical source for company-related legal notices. They are particularly useful when researching or assessing private companies, including critical corporate events such as liquidation, dissolution, winding-up orders, annual general meetings or director actions. This work was possible due to the support of ODINE, an open data incubator by the European Union, who have helped fund & accelerate this. You can read our learnings here.
Investigations & Research
In what we hope will be a template for many other anti-corruption investigations, OpenCorporates helped Global Witness in their groundbreaking investigation into the jade industry in Myanmar. The OpenCorporates website has always been an essential part of the workflow for many journalists and investigators who are looking into the opaque world of corporations. However, Global Witness went further, taking a ‘dump’ of data from OpenCorporates, and then used data science and the OpenCorporates API (data service) to reveal the network of military elites, US-sanctioned drug lords and crony companies controlling Myanmar’s multi-billion dollar jade industry.
Using director and shareholder data collected, collated and published by OpenCorporates helped uncover a web of connections between companies and individuals, and substantiate interview information on the real owners of major jade businesses. OpenCorporates had this information (much of it was removed from the official register during the investigation), but the fact that it was machine-readable data, available via an API and programmatically combinable with other data, was essential to discovering the hidden connections. This whitepaper shines a light on the investigation and the role our data played in it.
In 2014, when the story of Moldova missing one eighth of its Gross Domestic Product (about $1 billion) went viral, there was a huge outcry in the country and across the globe. Moldova is one of the poorest countries in Europe. Thousands of protesters marched, calling on the government to investigate how $1 billion had mysteriously vanished from the state-owned Savings Bank, and private banks Unibank and the Social Bank. It led to mass street protests and contributed to the resignation of the Prime Minister in Moldova, but what was the British connection? Journalist Richard Smith connected a network of over 20,000 opaque UK LLPs and LPs to the bank fraud, aided by company ownership data provided by OpenCorporates. Data points as simple as addresses (one address has 1,500 companies registered to it) and common directors were used to uncover the sprawling network. You can read his analysis here.
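The idea of using something as simple as a shared address to surface a network can be sketched in a few lines of Python. This is only an illustration of the general technique, not a description of how Richard Smith carried out his analysis: the records are made up, and `normalise_address` and `shared_addresses` are hypothetical helpers.

```python
from collections import defaultdict


def normalise_address(address):
    """Lower-case and strip punctuation and extra spaces so near-identical addresses match."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in address.lower())
    return " ".join(cleaned.split())


def shared_addresses(companies, min_companies=2):
    """Return {normalised address: [company names]} for addresses used by several companies."""
    by_address = defaultdict(list)
    for company in companies:
        by_address[normalise_address(company["address"])].append(company["name"])
    return {
        address: names
        for address, names in by_address.items()
        if len(names) >= min_companies
    }


# Made-up records, for illustration only.
companies = [
    {"name": "Alpha Trading LLP", "address": "1 Example Street, London"},
    {"name": "Beta Holdings LP", "address": "1 Example  Street,  LONDON"},
    {"name": "Gamma Ltd", "address": "22 Sample Road, Leeds"},
]

clusters = shared_addresses(companies)
print(clusters)
```

The same grouping works for common directors by keying on a normalised officer name instead of the address, although names are far more ambiguous than addresses and need more careful matching.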
Every year we provide our data to students and researchers to power their research. This year we worked with LSE, Grenoble Ecole De Management, Singapore Management University, Skopje University, University of Zurich, George Mason University, KU Leuven, and University of Arizona.
We held data dives and FlashHacks, our community events where people can contribute bots and map corporate networks, in Madrid, Geneva, Ottawa, Berlin, Nottingham, Birmingham and London. We engaged activists, policy experts, developers, teachers, data journalists, investigators, businesses and concerned citizens to help us get a step ahead in making the corporate world more transparent. One excellent example of this is the community investigation into the corporate network of Donald Trump. This included transcribing his disclosure to the Federal Election Commission and checking companies with public records for more leads. You can find the OpenCorporates corporate grouping that resulted from this here, and this became a useful starting point for many journalists investigating Trump as the election cycle progressed.
It’s also worth mentioning the interesting write-up by Bex Sentance, a member of our community and budding data journalist, of her experience of coming to FlashHacks. We’re always fascinated to hear how people use our data and the mashups they come up with. Here’s a great one about Cooperatives in the UK and this one that makes it easier for people to request an SSL certificate. In the same spirit we made our data available for free to the Hack Make the Bank, Privacy Hack at MIT, and an Accountability hackathon.
The world of sport is not new to corruption, trafficking and money laundering. We assisted data journalists from IQ4News and University of Birmingham who were looking at the shady world of football agents connected via an international network of companies that are actively exploiting young players in Nigeria. According to a 2013 study conducted by Paris-based charity Foot Solidaire, about 15,000 young boys travel to Europe and other countries from West Africa each year, either through air travel or by walking across the Sahara Desert to jump on a boat to Europe. In Europe their money is taken from them and they are left to fend for themselves, leaving them in a cycle of exploitation. Using OpenCorporates data, they looked into this complex world of agents and their corporate structures. The investigation painted a picture of a complex network involving corporate structures in tax havens, agents with no obvious company registered, and many agent companies which were either dissolved or dormant.
We’ve been championing the importance of public and open data about who controls companies, also referred to as ‘beneficial ownership’. You may remember that in December 2014, frustrated by arguments that this would be too technically difficult to achieve, we decided to create a test site that would illustrate how it wouldn’t be. That became WhoControlsit. Originally created as a two-day internal proof-of-concept, it was improved significantly with the help of open data activists over a hack day in Berlin, demonstrating how complex relationships between entities and individuals/officers could be represented in code and visually. 2015 has been an important year in the fight for more beneficial ownership transparency and this prototype has been extremely useful in helping campaigners show the potential of this critical data.
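As a rough illustration of what representing these relationships in code can look like, here is a small Python sketch. It is not the WhoControlsit data model: the `Control` record and the `ultimate_interests` function are hypothetical, the names and shareholdings are made up, and treating control as simple multiplication of shareholdings along a chain (with no circular ownership) is a deliberate simplification of what beneficial ownership means in practice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Control:
    controller: str  # person or company holding the shares
    controlled: str  # company whose shares are held
    share: float     # fraction of shares held, between 0.0 and 1.0


def ultimate_interests(relationships, company):
    """Return {top-level holder: effective share} for a company.

    Follows chains of shareholdings upwards and multiplies the shares
    along each chain. Assumes there is no circular ownership.
    """
    totals = {}
    for rel in relationships:
        if rel.controlled != company:
            continue
        upstream = ultimate_interests(relationships, rel.controller)
        if not upstream:
            # Nobody holds shares in this controller, so it is top of the chain.
            totals[rel.controller] = totals.get(rel.controller, 0.0) + rel.share
        else:
            for holder, share in upstream.items():
                totals[holder] = totals.get(holder, 0.0) + rel.share * share
    return totals


# Made-up structure: Alice holds half of OpCo directly and half of HoldCo,
# which holds the other half of OpCo. Bob holds the rest of HoldCo.
relationships = [
    Control("Alice", "OpCo", 0.5),
    Control("HoldCo", "OpCo", 0.5),
    Control("Alice", "HoldCo", 0.5),
    Control("Bob", "HoldCo", 0.5),
]

interests = ultimate_interests(relationships, "OpCo")
print(interests)
```

Even this toy version shows why the data is worth having in structured form: Alice’s indirect stake through HoldCo only becomes visible once the chain is followed programmatically.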
We were thrilled to be included in Omidyar Network’s landmark report on the impact of open data in the UK. Our work in the successful campaign for a public beneficial ownership register was featured as one of six case studies in the report. The report marks five years since the United Kingdom launched its first open data portal and revisits whether the experience so far has rendered the wider public and business benefits promised.
To quote the author of the report, Becky Hogge: “This is a case about the contribution open data has to make to advocacy efforts on complex issues, and illustrates how moving the needle on complex issues like corruption and governance reform requires much more than opening government data…. A by-product of this sort of handling of open government data is the ability to speak the language of internal government bureaucracies. This turned out to be a key advocacy tool.”
We were also very happy to be featured in the renowned Somerset House as part of the ‘Big Bang Data’ exhibition exploring how data is transforming our world.
Chris spoke at the APIStrat conference in Berlin and you can see the writeup about it on Programmableweb here. He also spoke at the 13th Annual North American Offshore Alert conference, which brings together the financial intelligence and investigations community in Miami (May 3-5). We also highlighted the importance of open company data at the Open Data Conference in Ottawa, Canada and ran a session on following the corporate trail.
Every year brings its set of challenges and 2015 was no exception.
As OpenCorporates continues to scale to cover more jurisdictions, we discover limitations of our system. Early in 2015, we decided to invest considerable time in solving these challenges, which are often non-trivial, even though this would mean that in the short term we would add fewer jurisdictions. We stand by this decision as it will allow us to do more, faster and better in 2016.
On the data modelling front, we found that working with gazettes and business licences was even more complex than we had anticipated, and we expect this, and similarly tricky datasets, to present continuing challenges over the next few years.
Perhaps the toughest decision we made was to somewhat deprioritise community-written bots, even though since the beginning of OpenCorporates, this has been the backbone of how the wider community was able to help us grow. First, as our schemas become richer and more complex, the learning curve becomes steeper, which requires significant time investment by the bot-writer. Second, bot writers needed a significant amount of help from our data quality team, taking time away from other essential tasks – something we had been spending a lot of time doing in 2014. Though this was not an easy decision, we started using our FlashHacks and community Slack more to work on mapping corporate networks. This was done on a spreadsheet and later added to a corporate grouping.
Working on the Aviva, G4S and Trump corporate networks with a mix of coders, journalists, students, and researchers showed us the difficulties of working with filings. This was a thrilling challenge. How do we represent corporate control and back this up with data? How do we achieve good progress with this mapping with non-investigators in the space of 3 hours? We set this challenge for ourselves and are happy to report that we smashed this target. Over 20 FlashHacks and 4 Data dives, we mapped many networks including G4S and Aviva.
We were featured in the Guardian as one of the five ways open technology can boost democracy around the world, and in another Guardian piece by Brett Scott on how open data can unravel the complex dealing of multinational companies. Our CEO Chris Taggart was interviewed by CDAR about building a global open company database which you can read here. Community Manager Hera was interviewed about the need for open data in Interhacktives.
With every project comes a dose of learning. Sharing these learnings with the rest of the community ensures that these lessons don’t stop with us. We blogged this year about the schema for gazettes, who really controls companies, why corporate structures need to be open and an every-person guide to what beneficial ownership is.
WHAT’S UP NEXT
To say OpenCorporates has ambitious plans would be an understatement. For fairer markets and protection of democracy, transparency of company affairs is fundamental. And we know how important it is for everyone who depends on our data for us to be independent, sustainable and fighting the good fight for open company data.
So what’s next for us? We want to break through our internal target of 100 million companies. More jurisdictions. More features. More clients. More investigations. Better API. We’re going to continue pushing for public registries of beneficial ownership information because we believe that is a game-changing dataset for fighting corruption and improving due diligence checks.