https://doi.org/10.5194/essd-2024-163
12 Jun 2024
Status: a revised version of this preprint was accepted for the journal ESSD and is expected to appear here in due course.

Global biogeography of N2-fixing microbes: nifH amplicon database and analytics workflow

Michael Morando, Jonathan Magasin, Shunyan Cheung, Matthew M. Mills, Jonathan P. Zehr, and Kendra A. Turk-Kubo

Abstract. Marine nitrogen (N) fixation is a globally significant biogeochemical process carried out by a specialized group of prokaryotes (diazotrophs), yet our understanding of their ecology is constantly evolving. Although marine dinitrogen (N2)-fixation is often ascribed to cyanobacterial diazotrophs, indirect evidence suggests that non-cyanobacterial diazotrophs (NCDs) might also be important. One widely used approach for understanding diazotroph diversity and biogeography is polymerase chain reaction (PCR)-amplification of a portion of the nifH gene, which encodes a structural component of the N2-fixing enzyme complex, nitrogenase. An array of bioinformatic tools exists to process nifH amplicon data; however, the lack of standardized practices has hindered cross-study comparisons. This has led to a missed opportunity to more thoroughly assess diazotroph biogeography, diversity, and their potential contributions to the marine N cycle. To address these knowledge gaps, a bioinformatic workflow was designed that standardizes the processing of nifH amplicon datasets originating from high-throughput sequencing (HTS). Multiple datasets are efficiently and consistently processed with a specialized DADA2 pipeline to identify amplicon sequence variants (ASVs). A series of customizable post-pipeline stages then detect and discard spurious nifH sequences and annotate the resulting quality-filtered nifH ASVs using multiple reference databases and classification approaches. This newly developed workflow was used to reprocess nearly all publicly available nifH amplicon HTS datasets from marine studies, and to generate a comprehensive nifH ASV database containing 7909 ASVs aggregated from 21 studies that represent the diazotrophic populations in the global ocean. For each sample, the database includes physical and chemical metadata obtained from the Simons Collaborative Marine Atlas Project (CMAP).
Here we demonstrate the utility of this database for revealing global biogeographical patterns of prominent diazotroph groups and highlight the influence of sea surface temperature. The workflow and nifH ASV database provide a robust framework for studying marine N2 fixation and diazotrophic diversity captured by nifH amplicon HTS. Future datasets that target understudied ocean regions can be added easily, and users can tune parameters and choose which studies to include to suit their specific focus. The workflow and database are available on GitHub (https://github.com/jdmagasin/nifH-ASV-workflow; Morando et al., 2024) and Figshare (https://doi.org/10.6084/m9.figshare.23795943.v1; Morando et al., 2024), respectively.

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this preprint. The responsibility to include appropriate place names lies with the authors.

Status: closed

Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
  • RC1: 'Comment on essd-2024-163', Anonymous Referee #1, 21 Jul 2024
  • RC2: 'Comment on essd-2024-163', Anonymous Referee #2, 25 Jul 2024
  • AC1: 'Comment on essd-2024-163', Jonathan Magasin, 22 Oct 2024


Data sets

nifH ASV database [Global biogeography of N2-fixing microbes: nifH amplicon database and analytics workflow] Michael Morando, Jonathan Magasin, Shunyan Cheung, Matthew M. Mills, Jonathan P. Zehr, and Kendra A. Turk-Kubo https://doi.org/10.6084/m9.figshare.23795943.v1

Interactive computing environment

DADA2 nifH pipeline Michael Morando, Jonathan Magasin, Shunyan Cheung, Matthew M. Mills, Jonathan P. Zehr, and Kendra A. Turk-Kubo https://github.com/jdmagasin/nifH_amplicons_DADA2

nifH ASV workflow (post-pipeline stages) Michael Morando, Jonathan Magasin, Shunyan Cheung, Matthew M. Mills, Jonathan P. Zehr, and Kendra A. Turk-Kubo https://github.com/jdmagasin/nifH-ASV-workflow


Viewed

Total article views: 560 (including HTML, PDF, and XML)
  • HTML: 390
  • PDF: 145
  • XML: 25
  • Total: 560
  • Supplement: 22
  • BibTeX: 17
  • EndNote: 15
Views and downloads calculated since 12 Jun 2024.

Viewed (geographical distribution)

Total article views: 534 (including HTML, PDF, and XML), of which 534 have a defined geography and 0 are of unknown origin.
Latest update: 13 Dec 2024
Short summary
Nitrogen is crucial in ocean food webs, but only some microbes can fix N2 gas into a bioavailable form. Most are known only by their nifH gene sequence. We created a software workflow for nifH data and ran it on 865 ocean samples, producing a database that captures the global diversity of N2-fixing marine microbes and the environmental factors that influence them. The workflow and database can standardize analyses of past and future nifH datasets to enable insights into marine microbial communities.