Open Data in South Africa and Beyond
Every day, every hour, every minute, every second of our lives we are creating data. Consciously or unconsciously, everyone is creating tons of data just by being alive. Unconsciously, as a citizen of South Africa, you are a data point that moves and evolves: consuming electricity at some rate, using water at another rate, and being transported from point A to point B by some form of transportation. Consciously, we occasionally get sick, visit the doctor and get scans. Our cars break down and we get them checked out and fixed. As you can see, we have many opportunities for data creation.
What happens to all this data? Do you have access to it? Do you have a right to it? What can be done with it? These are not easy questions to answer or to get answers to. There is a movement, though, to make data open. Open Data, to put it simply, is a movement to make data available to everyone without restrictions or fees. I will separate the use of the data from access to it. A (relatively) simple argument to make is for data collected with public money (taxes, rates etc.). So let’s say your municipality gathers and stores data about the amount of water consumed by every household on a daily basis. This data might be used for the municipality’s own planning or monitoring. That still leaves the question of other potential uses of the data by other organisations. To get this data, one would have to know that it exists and go through a process of requesting access to it (which normally means jumping through hoops and accepting limitations).
If the data was collected by a public institution (a government institution, paid for by taxes), then shouldn’t the data automatically be open to anyone? This is one of the arguments for Open Data. A lot of research in any country is partly funded by public money, and so the argument goes that such research should always aim to make its data open and easily accessible once the research is concluded. The typical flow of research is that the researcher, student or lab collects data to pursue some project. Once the project is concluded, that data is often just archived and rarely used again.
What if someone from a different field or subject area could actually use that data? This could lead to new breakthroughs and understandings. Or someone else could use the insights gleaned from the data to develop an innovative new service. This archived and unused data could lead to better services for others. The argument now extends to private individuals and entities who have collected data, have kept it and, for a myriad of reasons, do not make it open to others even after its usefulness to them has reached zero. So imagine analyses of municipal water usage information leading to services that better predict when water problems might happen in different parts of the city, and that help avoid or minimise complete shutdowns.
So far I have focused on data of a personal nature and the value that is unlocked when it becomes open. Another important reason for open data is to keep our government accountable. With government data available to be scrutinised and analysed by outsiders, the citizens of the country can know what the government is doing and can ask the right questions. A good example of a service made possible by open data is the People’s Assembly. The People’s Assembly makes information about the South African president that should be public by law available to citizens. Code4SA, meanwhile, uses open data to create, visualise and push data-driven journalism. Open Africa, for its part, pushes for open data throughout the continent, allowing people and organisations to measure the pulse of the continent country by country and data entry by data entry.
I would be doing the topic a disservice, though, if I did not also cover the arguments against open data. Researchers and organisations toil away collecting data to carry out studies or reach specific goals. Some of them may not want to share data freely, but rather want to control where it goes, to guarantee that whoever uses it does so with goals or principles similar to those of the original gatherers of the data. I myself, as a researcher, have had many instances where I had to agree to use data I was getting access to in one manner but not in another. Public (governmental) datasets tend not to have such restrictions once a government department makes its data public. But think about medical data collected from public health institutions: what would be the consequences of simply making patient data (anonymised, of course) openly available without restrictions? What about ethics? What about respect for the individuals from whom the data was collected? In this case one can more easily argue that the data should be made available only to those who work in public health, who have ethics in mind and whose goal is to improve the public health care system.
A large number of research projects that allow access to their data disallow the use of that data for commercial purposes (by private businesses or private research institutions) or by non-research organisations. This, in my opinion, is understandable, though it is always going to lead to many debates. The data was collected in pursuit of understanding our world better; the original researchers were not collecting it so that a company could gain a commercial advantage. If a company seeks such data, I think it should either put its own resources into collecting, cleaning, preparing and analysing it (a large number of person-hours) or compensate the collectors for their effort.
The topic of Open Data is a very interesting one, especially for the African continent as the continent tends to upend a lot of established technology trends.