Editorial: Science, data and society

Quality data remain elusive while data access freedoms disappear. Serious mis-matches between data availability and human need should attract societal attention.

The journal we founded, Earth System Science Data (ESSD, Copernicus), developed into the first high-impact mechanism to facilitate free exchange of reliable research data. ESSD now presents a remarkable variety and volume of openly accessible quality-certified data products covering many aspects of environmental, geophysical and biogeochemical science. Despite these positive developments, we hesitate to celebrate because we detect clear evidence that present deluges of information remain largely untrustworthy and/or not accessible. Science and the society it serves grow increasingly vulnerable to mistakes and mis-steps as a consequence of limitations on open sharing of trustworthy data.
As instantaneous access to geographic, economic and social information grows, we remind ourselves that contemporary world views, shaped by this information, remain subservient to imposed perspectives and interests. Governments and corporations compile and exploit datasets and models, computer-based visualisations, data analytics and machine learning, and other artificial intelligence tools to shape citizen and consumer perspectives. "Customary" interpolations and extrapolations of patchy data mislead or support misconceptions about under-served parts of the world. Emerging usage of cloud storage and computing by countries, institutions and individuals, promoted as useful, affordable and convenient, introduces new barriers to data exchange while eroding essential concepts of reproducibility: when commercial entities control, and modify at will, provenance of global data products, science loses necessary traceability. Rapid exchanges of forecast products or satellite images via high-bandwidth connections, exploited between prominent centres, fail in many cases to ensure equitable global access.
Data-dependent society faces unprecedented challenges. Global pandemics threaten human health. Storms, flash floods, droughts and heatwaves exacerbated by climate change assault human and planetary well-being. Biodiversity, when measurable, deteriorates on local as well as global scales. These challenges occur amidst society-wide deficiencies in equity and justice. At the same time, data journals such as Copernicus' ESSD or Nature Publications' Scientific Data receive an increasing volume of diverse submissions. Do more authors seek high impact factors of newly successful data journals? Or do we observe increasing recognition of individual and collective benefit from openly shared data? We highlight three examples wherein current data practices seem insufficient to support useful science-based social responses: COVID-19 infections (public health), CO2 emissions (climate), and species abundance and distributions (biodiversity).

Infections
Countries accumulate genetic data to exploit commercial aspects, establish or maintain global pre-eminence, or protect national security (New York Times, 2021). Calls for "coordination and standardisation of data collection, data quality, monitoring, and reporting" to serve public health needs (Sachs et al., 2022) clearly conflict with plans to exploit data for national or monopolistic commercial benefit. Failure to define, collect, and share reliable COVID-19 infection and health data in a timely manner inhibits effective global, national and local responses (SPIEGEL, 2021; Washington Post, 2022). As this particular virus proves more persistent and more flexible than expected, accurate timely tracking of infections seems to recede. Do such outcomes represent public health success or public information failures? How will our global public health community develop necessary accessible tools to track and respond to the next pandemic? How will global societies share trustworthy warnings or efficacy assessments? Will data journals play useful roles?

Published by Copernicus Publications. D. Carlson and H. Pfeiffenberger: Science, data and society

Emissions
Countries report greenhouse gas emissions by territory and sector to UNFCCC, but many emissions reports arrive late, lack necessary detail, and, rarely, exploit loopholes or otherwise manipulate data to hide non-compliance or to project favourable impressions (Washington Post, 2021). Emissions data from military operations remain largely "off the books". As a consequence, countries debate international climate policies based on flawed emissions accounting. A global research community, evaluating shared up-to-date data from remote sensing, ground-based networks and advanced models, expends increasing time and effort to identify and resolve discrepancies (e.g. Deng et al., 2022; Grassi et al., 2022). Amidst unfortunate uncertainty about data veracity, how will society develop, certify and apply high-resolution openly shared reliable global CO2 emission data?

Species distributions
Biodiversity data remain restricted. Habits and policies (e.g. the Nagoya protocol promoted by the Convention on Biological Diversity, CBD, 2022) imposed nationally or adopted among international scientific communities, despite positive intentions, make no progress toward convenient global access, with the effect of inhibiting systematic analyses. Countries and communities have not yet agreed on definitions of "key" or essential information. While abundance data emerge for some species, data for other species or for the same species in other regions remain hidden or blocked. Cross-border approaches remain rare: compiling sufficient data to plan or evaluate ecosystem-wide management options (e.g. protected areas, migration corridors) remains extraordinarily difficult. Ecologically relevant information remains largely unavailable in quality-certified open-access formats as practised by data journals. As land use and land use changes become more evident via remote sensing, more important for monitoring biodiversity as well as emissions, and more subject to national manipulation, will biodiversity communities agree on terms and issues? In ocean ecosystems largely hidden from remote sensing, will resource exploitation data remain subject to proprietary national and commercial policies?
We repeat our initial assessment: severe deficiencies in how science and society develop, share, validate and use data leave us increasingly vulnerable to mistakes and mis-steps as we confront planetary challenges. We note recent admonitions (Anthony Fauci, New York Times, 2022) that reinforce concerns: "It is our collective responsibility to ensure that public health policy decisions are driven by the best available data." Open access to quality data via recently successful data journals represents a positive but meagre response, not scalable to vast varieties or volumes of data. We proclaim three urgent recommendations: (a) all publicly funded data, and all other data necessary to inform public policies, must be open; (b) we must train society to expect, discover and use open trustworthy data; (c) we insist on open data as the international, national, commercial, environmental and economic default rather than exception. Meeting these challenges presumes functional funded data infrastructure.
A. All data from all sources must emerge and remain as freely and openly accessible as technically and ethically feasible. We particularly urge that all data, of any form or format, used in, applied to, or serving as a basis for public environmental, economic, security and health policies reside in free well-documented accessible public repositories. Open data access as practised at the moment by data journals will represent a powerful antidote to inadvertent or deliberate biases, allowing and encouraging evaluation of completeness, accuracy and trustworthiness.
B. A world of free access to open data will only develop in parallel with a data-smart society. We call for focus on data availability, reliability and use as a highly relevant feature of "higher" and vocational education. We urge broad exposure to global health, biodiversity and environmental data issues for every student regardless of intended specialisation. We must offer next generations knowledge and tools to challenge access barriers, assimilate disparate data, produce skilful analyses, and scrutinise sources and biases. We identify clear roles for data journals: Borduas-Dedekind et al. (2022) report positive outcomes when students review datasets.
C. Beyond recommendations for data as a free open asset available to all citizens, we argue for philosophical and technical change of direction: open data to exist as the default. Citizens should govern all data with societal relevance, except in rare cases when they have, a priori, agreed to exceptions. Mindful that a single researcher apparently initiated the rapid exchange of sequence data for the recent coronavirus that, within weeks, led to specific PCR tests and mRNA vaccines, we wish to see such brave decisions become common rather than rare, expected rather than serendipitous. A society that enjoys open access to data gains at least a science-based chance to improve its response to health, climate and biodiversity issues. Next generations, expecting open access and trained to evaluate data quality, will take positive steps toward understanding how society in general, and politicians specifically, expose or hide and use or misuse their data.
We do not underestimate the magnitude of change needed to confront global environmental and equity issues. We do not call for more data. Current or future open data practices within scientific journals may have negligible impact on larger social and political issues, but they set good examples and occasionally provide benchmarks for plausibility checks. As countries cooperate to identify and implement persistent coherent responses to health, climate and biodiversity issues, they will quickly confront issues of data access and reliability. We advocate for substantial improvement in how society interacts with information and how cultures interact with each other through data exchange, based on positive community experience with ESSD. We anticipate vociferous objections based on national, commercial, military and privacy interests. We do not discount those concerns but prefer them as rare exceptions rather than broad justification. We contend that humanity's necessary successful response to health, climate and biodiversity challenges will require careful competent open access to reliable data.
Unfortunately, data access freedoms, dependent on careful collection, responsible development of algorithms and code, and effective quality assurance, disappear faster than researchers know, beyond the ken of most citizens. Serious mis-matches between data availability and human need should alarm researchers, data managers and journalists and should attract societal attention. During the most recent (2007-2008) International Polar Year (IPY), one of us (Carlson, 2011) reported "inadequate services, almost no international support, and few solutions". We suspect this dismal situation has yet to improve for future international efforts; recovering IPY data after-the-fact required substantial effort (Driemel et al., 2015). To meet impending challenges, society must provide urgent attention to openly shared trustworthy data.