Quality data remain elusive while data access freedoms disappear. Serious mis-matches between data availability and human need should attract societal attention.
The journal we founded,
As instantaneous access to geographic, economic and social information grows, we remind ourselves that contemporary world views, shaped by this information, remain subservient to imposed perspectives and interests. Governments and corporations compile and exploit datasets and models, computer-based visualisations, data analytics and machine learning, and other artificial intelligence tools to shape citizen and consumer perspectives. “Customary” interpolations and extrapolations of patchy data mislead or support misconceptions about under-served parts of the world. Emerging usage of cloud storage and computing by countries, institutions and individuals, promoted as useful, affordable and convenient, introduces new barriers to data exchange while eroding essential concepts of reproducibility: when commercial entities control – and modify at will – provenance of global data products, science loses necessary traceability. Rapid exchanges of forecast products or satellite images via high-bandwidth connections, exploited between prominent centres, fail in many cases to ensure equitable global access.
Data-dependent society faces unprecedented challenges. Global pandemics
threaten human health. Storms, flash floods, droughts and heatwaves
exacerbated by climate change assault human and planetary well-being.
Biodiversity, when measurable, deteriorates on local as well as global
scales. These challenges occur amidst society-wide deficiencies in equity
and justice. At the same time, data journals such as Copernicus'
Countries accumulate genetic data to exploit commercial aspects, establish or maintain global pre-eminence, or protect national security (New York Times, 2021). Calls for “coordination and standardisation of data collection, data quality, monitoring, and reporting” to serve public health needs (Sachs et al., 2022) clearly conflict with plans to exploit data for national or monopolistic commercial benefit. Failure to define, collect, and share reliable COVID-19 infection and health data in a timely manner inhibits effective global, national and local responses (SPIEGEL, 2021; Washington Post, 2022). As this particular virus proves more persistent and more flexible than expected, accurate timely tracking of infections seems to recede. Do such outcomes represent public health success or public information failures? How will our global public health community develop necessary accessible tools to track and respond to the next pandemic? How will global societies share trustworthy warnings or efficacy assessments? Will data journals play useful roles?
Countries report greenhouse gas emissions by territory and sector to UNFCCC,
but many emissions reports arrive late, lack necessary detail, and, rarely,
exploit loopholes or otherwise manipulate data to hide non-compliance or to
project favourable impressions (Washington Post, 2021). Emissions data from
military operations remain largely “off the books”. As a consequence,
countries debate international climate policies based on flawed emissions
accounting. A global research community, evaluating shared up-to-date data
from remote sensing, ground-based networks and advanced models, expends
increasing time and effort to identify and resolve discrepancies (e.g. Deng
et al., 2022; Grassi et al., 2022). Amidst unfortunate uncertainty about data
veracity, how will society develop, certify and apply high-resolution
openly shared reliable global CO
Biodiversity data remain restricted. Habits and policies (e.g. the Nagoya Protocol promoted by the Convention on Biological Diversity, CBD, 2022) imposed nationally or adopted among international scientific communities, despite positive intentions, make no progress toward convenient global access and thereby inhibit systematic analyses. Countries and communities have not yet agreed on definitions of “key” or essential information. While abundance data emerge for some species, data for other species, or for the same species in other regions, remain hidden or blocked. Cross-border approaches remain rare: compiling sufficient data to plan or evaluate ecosystem-wide management options (e.g. protected areas, migration corridors) remains extraordinarily difficult. Ecologically relevant information remains largely unavailable in quality-certified open-access formats as practised by data journals. As land use and land use changes become more evident via remote sensing, more important for monitoring biodiversity as well as emissions, and more subject to national manipulation, will biodiversity communities agree on terms and issues? In ocean ecosystems largely hidden from remote sensing, will resource exploitation data remain subject to proprietary national and commercial policies?
We repeat our initial assessment: severe deficiencies in how science and society develop, share, validate and use data leave us increasingly vulnerable to mistakes and mis-steps as we confront planetary challenges. We note recent admonitions (Anthony Fauci, New York Times, 2022) that reinforce concerns: “It is our collective responsibility to ensure that public health policy decisions are driven by the best available data.” Open access to quality data via recently successful data journals represents a positive but meagre response, not scalable to vast varieties or volumes of data. We proclaim three urgent recommendations: (a) all publicly funded data, and all other data necessary to inform public policies, must be open; (b) we must train society to expect, discover and use open trustworthy data; (c) we insist on open data as the international, national, commercial, environmental and economic default rather than exception. Meeting these challenges presumes functional funded data infrastructure.
All data from all sources must emerge and remain as freely and openly accessible as technically and ethically feasible. We particularly urge that all data, of any form or format, used in, applied to, or serving as a basis for public environmental, economic, security and health policies reside in free well-documented accessible public repositories. Open data access as practised at the moment by data journals will represent a powerful antidote to inadvertent or deliberate biases, allowing and encouraging evaluation of completeness, accuracy and trustworthiness.
A world of free access to open data will only develop in parallel with a data-smart society. We call for focus on data availability, reliability and use as a highly relevant feature of “higher” and vocational education. We urge broad exposure to global health, biodiversity and environmental data issues for every student regardless of intended specialisation. We must offer next generations knowledge and tools to challenge access barriers, assimilate disparate data, produce skilful analyses, and scrutinise sources and biases. We identify clear roles for data journals: Borduas-Dedekind et al. (2022) report positive outcomes when students review datasets.
Beyond recommendations for data as a free open asset available to all citizens, we argue for a philosophical and technical change of direction: open data should exist as the default. Citizens should govern all data with societal relevance, except in rare cases when they have, a priori, agreed to exceptions. Mindful that a single researcher apparently initiated the rapid exchange of sequence data for the recent coronavirus that, within weeks, led to specific PCR tests and mRNA vaccines, we wish to see such brave decisions become common rather than rare, expected rather than serendipitous. A society that enjoys open access to data gains at least a science-based chance to improve its response to health, climate and biodiversity issues.
Next generations, expecting open access and trained to evaluate data quality, will take positive steps toward understanding how society in general, and politicians specifically, expose or hide and use or misuse their data.
We do not underestimate the magnitude of change needed to confront global
environmental and equity issues. We do not call for more data. Current or
future open data practices within scientific journals may have negligible
impact on larger social and political issues, but they set good examples and
occasionally provide benchmarks for plausibility checks. As countries
cooperate to identify and implement persistent coherent responses to health,
climate and biodiversity issues, they will quickly confront issues of data
access and reliability. We advocate for substantial improvement in how society
interacts with information and how cultures interact with each other through
data exchange, based on positive community experience with
Unfortunately, data access freedoms, dependent on careful collection, responsible development of algorithms and code, and effective quality assurance, disappear faster than researchers know, beyond the ken of most citizens. Serious mis-matches between data availability and human need should alarm researchers, data managers and journalists and should attract societal attention. During the most recent (2007–2008) International Polar Year (IPY), one of us (Carlson, 2011) reported “inadequate services, almost no international support, and few solutions”. We suspect this dismal situation has yet to improve for future international efforts; recovering IPY data after the fact required substantial effort (Driemel et al., 2015). To meet impending challenges, society must give urgent attention to openly shared trustworthy data.
We thank Falk Huettmann (long-time advocate of open data
policies and practices) and Andrew Hufton (founding chief editor of