The success story that is OpenCorporates is very much a team effort – not just the tiny OpenCorporates core team, but the whole open data community, who from the beginning have been helping us in so many ways, from writing scrapers for company registers, to alerting us when new data is available, to helping with language or data questions.
But one of the most common questions has been, “How can I get data into OpenCorporates?” Given that OpenCorporates’ goal is not just every company in the world but also all the public data that relates to those companies, that’s something we’ve wanted to allow, since we could not achieve that goal alone. It’s also something that will make OpenCorporates not just the biggest open database of company data in the world, but the biggest database of company data, open or proprietary.
To kick off this new era in corporate data, we are launching a #FlashHacks campaign.
Flash What? #FlashHacks.
We are inviting all Ruby and Python botwriters to help us crowdscrape 10 million data points into OpenCorporates in 10 days.
Can we do it?
With your help, we are confident we can beat the target.
Why is this important?
Information about the public and private sectors is of monumental importance to understanding and changing the world we live in. Transnational corporations can wield unprecedented influence on politics and the economy, and we have a limited capacity to understand this when we don’t know what these legal entities look like. The influence of these companies can be good or bad, and we don’t have a clear picture of it.
Company information is often not available, and when it is, it is buried in hard-to-use websites and PDFs. Fortunately, the work of the open data and transparency community has brought a tide of change. With the introduction of the Open Government Partnership and the G8 Open Data Charter, governments are committing to make this information easily and publicly available. Yet action on this front remains slow. And that’s why scraping is at the heart of the open data movement! Where would the open data community be if it had not been for bot-writers spending time deciphering formats and writing code to release data?
We want to use #FlashHacks as a celebration of the commitment of bot-writers and invite others to join us in changing the world through open data.
How you can join the crowdscraping movement
- Join missions.opencorporates.com and sign up!
- Have a look at the datasets we have listed on the Campaign page as inspiration. You can either write bots for these or even choose your own!
- Sign up to a mission! Send a tweet pledge to say you have taken on a mission.
- Write the bot and submit it on the platform.
- Tweet your success with the #FlashHacks tag! Don’t forget to upload the FlashHack design as your Twitter cover photo and Facebook cover photo to get more people involved.
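For anyone wondering what "writing the bot" might involve, here is a minimal Python sketch of the general idea: read a table from a register page and print one JSON record per row. It is only an illustration. The sample HTML, the field names, the `source_url` and `sample_date` keys, and the example URL are all hypothetical, and the output format the missions platform actually expects is not described in this post, so check the platform's own guidance before submitting.

```python
import json
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # completed rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text outside a cell (e.g. whitespace between tags) is ignored.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None


def scrape(html, source_url, sample_date):
    """Turn the first row into field names and every later row into a record."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows
    keys = [h.lower().replace(" ", "_") for h in header]
    records = []
    for row in body:
        record = dict(zip(keys, row))
        # Hypothetical provenance fields: where and when the data was seen.
        record["source_url"] = source_url
        record["sample_date"] = sample_date
        records.append(record)
    return records


# Made-up sample page standing in for a real register (assumption).
SAMPLE_HTML = """
<table>
  <tr><th>Company Name</th><th>Licence Number</th></tr>
  <tr><td>Example Holdings Ltd</td><td>A-001</td></tr>
  <tr><td>Sample Trading plc</td><td>A-002</td></tr>
</table>
"""

if __name__ == "__main__":
    for record in scrape(SAMPLE_HTML, "https://example.org/register", "2014-07-01"):
        print(json.dumps(record, sort_keys=True))
```

A real bot would fetch the page over HTTP instead of using an embedded string, and would need to cope with whatever quirks the source register has.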
Join us on our Google Group, share problems and solutions, and help build the open corporate data community.
If you are interested in covering this story, you can view the press release here.
Also of interest: Ruby and Python coders – can you help us?
Special thanks to the following:
- Alfred P. Sloan Foundation for generously providing the funding that makes this possible.
- Morph.io for providing a cool and light-weight platform for writing scrapers.