A 30-year ocean front dataset based on deep learning from 1993 to 2023 for the Northwest Pacific Ocean
Abstract. Ocean fronts are critical interfaces between different water masses, profoundly influencing atmosphere–ocean interactions, weather systems, marine ecosystems, and climate regulation. Accurate, long-term observations of ocean fronts are essential for advancing studies in meteorology, oceanography, and climate science. However, no publicly available, long-term ocean front dataset currently exists, and existing detection methods often rely on time-consuming manual labeling or on traditional algorithms with limited accuracy in complex frontal regions. In this study, we release the first publicly available 30-year ocean front dataset (1993–2023) for the Northwest Pacific, generated by applying a deep learning framework (Mask R-CNN), trained on manually annotated samples, to daily sea surface temperature (SST) fields. The dataset provides pixel-level frontal boundaries along with associated attributes, including position, intensity, and width, stored in NetCDF-4 format at 1/12° spatial and daily temporal resolution. Accuracy evaluation shows a mean average precision (mAP) exceeding 0.90 and smaller errors in front width and intensity than traditional gradient-based methods, while capturing more small-scale features. The dataset offers three main contributions: (1) filling the critical gap of a standardized, long-term ocean front product; (2) serving as a ready-to-use training resource for deep learning models, greatly reducing the need for manual labeling; and (3) providing benchmark samples for validation and intercomparison of other ocean front detection products. This dataset supports robust investigations of seasonal-to-interannual frontal variability and provides a valuable foundation for applications in meteorology, ecosystem management, and climate change research.
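For readers unfamiliar with the traditional gradient-based methods mentioned above, the following is a minimal sketch of one common variant: compute the horizontal SST gradient magnitude on a regular latitude/longitude grid and flag pixels above a threshold. It is not the Mask R-CNN framework used to build the dataset, nor necessarily the baseline evaluated in this study. The function names, the synthetic SST field, and the 0.03 °C/km threshold are all illustrative assumptions.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0


def sst_gradient_magnitude(sst, lat, lon):
    """Return |grad SST| in degC per km on a regular lat/lon grid.

    sst: 2-D array shaped (lat, lon); lat, lon: 1-D coordinates in degrees.
    """
    lat_rad = np.deg2rad(lat)
    lon_rad = np.deg2rad(lon)
    # Derivatives with respect to the angular coordinates (degC per radian).
    dT_dlat = np.gradient(sst, lat_rad, axis=0)
    dT_dlon = np.gradient(sst, lon_rad, axis=1)
    # Convert to physical distance; zonal spacing shrinks with cos(latitude).
    dT_dy = dT_dlat / EARTH_RADIUS_KM
    dT_dx = dT_dlon / (EARTH_RADIUS_KM * np.cos(lat_rad)[:, None])
    return np.hypot(dT_dx, dT_dy)


def front_mask(sst, lat, lon, threshold=0.03):
    """Flag pixels whose gradient magnitude reaches the threshold (degC/km).

    The default threshold is a hypothetical value chosen for illustration.
    """
    return sst_gradient_magnitude(sst, lat, lon) >= threshold


# Synthetic field (an assumption, not real data): a smooth 10 degC drop
# centred on 35N, sampled at 1/12 degree spacing.
lat = np.arange(30.0, 40.0, 1 / 12)
lon = np.arange(140.0, 150.0, 1 / 12)
sst = 20.0 - 5.0 * np.tanh((lat[:, None] - 35.0) / 0.5) * np.ones((1, lon.size))

mask = front_mask(sst, lat, lon)
print("frontal pixels:", int(mask.sum()), "of", mask.size)
```

A single fixed threshold of this kind is sensitive to the regional and seasonal range of SST gradients, which is one reason such methods can struggle in complex frontal regions.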