Open data at Global Witness in 2016

Cross-posted from the Global Witness blog, where it was first published.

I’m a self-confessed data junkie, but I appreciate that spreadsheets aren’t everyone’s idea of fun. Most people want their digested data packaged up into compelling stories and beautiful visuals. My job here at Global Witness is to do exactly that - tooling up the organisation and its partners so that we can take the troves of data we have access to and ask the questions of it that will help us create positive change.

Now is a very exciting time to be doing this. As the explosion of powerful, free data analysis tools intersects with our own efforts to bring business into the open, we’re seeing exciting new opportunities to help cut poverty and financial crime all over the world. The challenge, as we see it, is to find the right tools and equip the right people so that all this information can be translated into genuine accountability for those in power.

There is already a lot of valuable data relevant to our mission in the public domain. For example, billions of dollars’ worth of company-to-government payments are now published each year as part of the Publish What You Pay initiative that Global Witness helped to found. Elsewhere, recent US legislation means thousands of companies that use minerals in their products routinely publish information on their supply chains, to make sure they aren’t sponsoring conflict or bloodshed where the raw materials are mined.

As well as laws demanding information be published, there are non-profit organisations which exist to collect data and make it useful. OpenCorporates, the world’s largest free database of corporate data, now contains the records of over 97 million companies from around the world. MySociety’s nascent project contains the records of over 60,000 politicians from over 232 countries, which have been scraped and crowdsourced from numerous digital sources. Our friends at OpenOil now have a service which provides free access to over one million company filings related to the oil, gas and mining industries.

Visualisation from OpenOil's vast database of contracts from Nigeria's byzantine oil industry:

This data could revolutionise the way journalists and NGOs like Global Witness expose the corrupt networks that deprive some of the world’s poorest people of the wealth beneath their feet. But for that to happen, the data needs to be actively monitored and mined by groups equipped with the right skills and tools. This analysis is especially valuable in states with limited political will or capacity to undertake their own corruption investigations, as is the case in many countries where Global Witness works. So the opportunities are obvious, but it’s complicated.

No single public dataset is likely to unearth substantial corruption on its own. The real value tends to come when these datasets are combined in ways not anticipated by the publisher. For example, combining port records with company information has allowed us to show how the EU’s demand for illegal timber from the DRC undermines efforts to protect the world’s second largest rainforest. Elsewhere, company filings from multiple jurisdictions helped piece together the corrupt and complex deal behind the sale of Nigeria’s largest oil block, which deprived its citizens of over one billion US dollars. The more data that is available on companies and governments in a standardised form, the greater the opportunity for linking this information in meaningful ways to fight corruption.
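To make the idea of combining datasets concrete, here is a minimal sketch of linking shipment records to company records on a cleaned-up company name. It is an illustration only, not the method behind the investigations mentioned above: the function names, the field names and every record in it are invented, and real port or registry data would need far more careful cleaning.

```python
import re

# Legal-form suffixes to ignore when comparing names (an illustrative, incomplete list).
SUFFIXES = {"ltd", "limited", "sa", "sarl", "inc"}


def normalise(name):
    """Lower-case a company name, drop punctuation and common legal suffixes."""
    lowered = name.lower().replace(".", "")          # "S.A.R.L." -> "sarl"
    cleaned = re.sub(r"[^a-z0-9 ]", " ", lowered)    # other punctuation -> space
    words = [w for w in cleaned.split() if w not in SUFFIXES]
    return " ".join(words)


def link_records(shipments, companies):
    """Attach company details to each shipment whose exporter name matches."""
    by_name = {normalise(c["name"]): c for c in companies}
    linked = []
    for shipment in shipments:
        match = by_name.get(normalise(shipment["exporter"]))
        if match is not None:
            linked.append({**shipment,
                           "jurisdiction": match["jurisdiction"],
                           "director": match["director"]})
    return linked


# Invented sample records, standing in for port data and a company register.
shipments = [
    {"exporter": "Example Timber S.A.R.L.", "destination": "Port A", "volume_m3": 120},
    {"exporter": "Unknown Trading Co", "destination": "Port B", "volume_m3": 45},
]
companies = [
    {"name": "EXAMPLE TIMBER SARL", "jurisdiction": "Country X", "director": "A. Person"},
]

linked = link_records(shipments, companies)
for row in linked:
    print(row["exporter"], "->", row["director"], "in", row["jurisdiction"])
```

Matching on a normalised name is the crudest possible join. Where a dataset carries a company registration number, linking on that identifier is far more reliable, which is one reason standardised data matters so much.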

A screenshot of Global Witness’s Timber Trade Tracker built in collaboration with Open Knowledge that visualises the flows of illegal timber out of the DRC. The data is updated on a quarterly basis:

How do we stop all this data from becoming an overwhelming deluge of information? This is where the tools developed by civic hackers and socially minded tech firms come in. Take the tools for cross-checking lists of names against databases of “politically exposed people” (PEPs) now being developed by OCCRP and the Influence Mapping community. These are invaluable instruments in identifying conflicts of interest and corruption in the allocation of lucrative natural resources. Free software like OpenRefine puts powerful data processing algorithms at the fingertips of non-programmers, and tools like Python-Pandas, initially developed for the finance industry but now open source, let people manipulate spreadsheets millions of rows long. We put these tools to use in combination in our recent Myanmar jade investigation, using OpenCorporates data to help show how this vast trade is monopolised by a military elite at the expense of some of Myanmar’s most vulnerable people.
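For a flavour of what cross-checking names against a PEP list involves, here is a rough sketch using only Python's standard library. It does not represent the OCCRP or Influence Mapping tools: the `fold` and `screen` helpers, the 0.85 threshold and all of the names are assumptions made up for this example, and any hit it produces is a lead to check by hand, never proof of a connection.

```python
import unicodedata
from difflib import SequenceMatcher


def fold(name):
    """Strip accents, lower-case, and sort the name parts so word order is ignored."""
    ascii_name = (unicodedata.normalize("NFKD", name)
                  .encode("ascii", "ignore")
                  .decode("ascii"))
    parts = ascii_name.lower().replace(",", " ").split()
    return " ".join(sorted(parts))


def screen(names, pep_list, threshold=0.85):
    """Return (name, pep, score) for every pair at least as similar as the threshold."""
    hits = []
    for name in names:
        for pep in pep_list:
            score = SequenceMatcher(None, fold(name), fold(pep)).ratio()
            if score >= threshold:
                hits.append((name, pep, round(score, 2)))
    return hits


# Invented names: a list of licence holders and a tiny stand-in PEP list.
licence_holders = ["Jane Q. Sample", "Exemple, José", "Someone Else"]
pep_list = ["Jose Exemple", "Jane Sample"]

hits = screen(licence_holders, pep_list)
for name, pep, score in hits:
    print(f"{name!r} resembles {pep!r} (similarity {score})")
```

The threshold is a trade-off. Set it lower and more spelling variants are caught, but more false matches have to be reviewed; set it higher and the reverse happens. Comparing every name with every PEP is also slow on large lists, so real tools index the data first.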

Secrecy for sale in the British Virgin Islands as visualised in our Great Rip Off Map:

That’s not to say we shouldn’t also work tirelessly to improve the data that’s already out there. Existing sources of public data are far from perfect, and in many countries online access to public records remains a distant dream. Lack of standardisation makes information difficult to retrieve and connect. The secrecy for sale in the world’s tax havens means that many corporate structures used to funnel the proceeds of corruption remain hidden. Global Witness will continue to push for those gaps to be filled and to help develop strong, open and workable standards along with our partners in groups such as Publish What You Pay and the Financial Transparency and Open Contracting coalitions. And alongside our data work, we will be lobbying decision-makers to ensure that transparency becomes the expected norm in how the world does its business.

The open data revolution shows no sign of slowing in 2016. This year even more data will go online as hundreds of UK companies release project-level information on all payments to governments in the oil, gas and mining industries under new EU law. From July onwards all UK companies will be forced to publish who their real owners are in a publicly accessible register, something that Global Witness has been fighting for for many years.

OpenCorporates' proof-of-concept of a global beneficial ownership registry:

For anyone who wants to fight corruption and build a more open, just society, this is really exciting. For a data geek like me, it’s doubly so. But as the possibilities grow, so does the work involved in mining all the data, and we’re far from the only ones driving this. This week in Lima, I’ll be working with activists from all over the world to look at how they can apply these tools and techniques to their different contexts, and help make sure the new availability of information brings justice and accountability from their governments. Together we’ll be trawling this new information for insights and leads, so that the individuals and systems that perpetuate conflict, corruption and environmental devastation have no place to hide.

Watch this space!