Sea ice in the Baltic Sea – revisiting BASIS ice, a historical data set covering the period 1960/1961–1978/1979

The Baltic Sea is a seasonally ice-covered, marginal sea in central northern Europe. It is an essential waterway connecting highly industrialised countries. Because ship traffic is intermittently hindered by sea ice, the local weather services have been monitoring sea ice conditions for decades. In the present study we revisit a historical monitoring data set, covering the winters 1960/1961 to 1978/1979. This data set, dubbed Data Bank for Baltic Sea Ice and Sea Surface Temperatures (BASIS) ice, is based on hand-drawn maps that were collected and then digitised in 1981 in a joint project of the Finnish Institute of Marine Research (today the Finnish Meteorological Institute (FMI)) and the Swedish Meteorological and Hydrological Institute (SMHI). BASIS ice was designed for storage on punch cards and all ice information is encoded by five digits. This makes the data hard to access. Here we present a post-processed product based on the original five-digit code. Specifically, we convert to standard ice quantities (including information on ice types), which we distribute in the current and free Network Common Data Format (NetCDF). Our post-processed data set will help to assess numerical ice models and provide easy-to-access unique historical reference material for sea ice in the Baltic Sea. In addition we provide statistics showcasing the data quality. The website http://www.baltic-ocean.org hosts the post-processed data and the conversion code. The data are also archived at the Data Publisher for Earth & Environmental Science, PANGAEA (doi:10.1594/PANGAEA.832353).


Introduction
The Baltic Sea is a seasonally ice-covered marginal sea situated in a densely populated and highly industrialised area in northern Europe (Fig. 1). Major shipping routes cross the regularly ice-covered regions (e.g. Granskog et al., 2006). The ice season lasts up to 7 months (Vihma and Haapala, 2009), with the maximum ice extent typically reached in late February. Interannual variations are large: expressed in terms of ice cover, they range between ≈ 10 and 100 % (e.g. Leppäranta and Myrberg, 2009). The ice can be classified into several types, which obstruct ship traffic to a varying degree: in coastal and archipelagic areas of the Baltic, the dominant ice type is generally (land)fast ice, which is solid, even and immobile (apart from very early and very late in the ice season). Further into the basins, as wind fetch increases and the ice cover is repeatedly broken, the ice is forced into motion (Uotila, 2001). Typical conditions there are characterised by an irregular ice coverage comprising floes of variable size, leads (i.e. linear areas of open water), belts of slush (i.e. a mixture of small ice crystals, basically from snow or liquid water), shuga (i.e. an accumulation of spongy white lumps a few centimetres across) and deformed ice patches (such as rafted and ridged ice). Wintertime shipping is challenging in that ships have to find their way through this "drift ice landscape" (Leppäranta and Myrberg, 2009), which can slow down or even stop their progress. Hence, any information on the actual ice state is of benefit to shipping. It is thus not surprising that record keeping of sea ice states started more than 1000 years ago (e.g. Ogilvie, 1984). While the information at that time was rather sparse and preserved through oral information exchange, the
This paper describes a data set based on historic ice charts from local weather services. The data were collected and then digitised in 1981 by a joint effort of the Swedish Meteorological and Hydrological Institute (SMHI) and the former Finnish Institute of Marine Research (today FMI), and led to the publication of the "Climatological Ice Atlas for the Baltic Sea, Kattegat, Skagerrak and Lake Vänern (1963-1979)" (SMHI and FIMR, 1982). The original initiative comprised the winters 1963/1964-1978/1979. The winters 1960/1961 and 1961/1962 were added later. The original database was named "Data Bank for Baltic Sea Ice and Sea Surface Temperatures", abbreviated as BASIS. BASIS is a composite of direct ice measurements and estimates from voluntarily observing ships, coast guards, icebreakers, lighthouses and harbour authorities. Additional information came from overflights operated by FMI, SMHI and the Swedish Air Force (Udin et al., 1981). From the late 1960s onwards observations from space became available and were partly included.
A problem related to BASIS is that the underlying ice charts are extrapolated from the irregular (as regards space and time) observations described above. The associated uncertainties are unclear and are largest far from the major shipping routes. Nevertheless, BASIS is the best available information on historic ice conditions in the Baltic Sea and goes beyond estimates of the historical ice extent (as, e.g. in Omstedt et al., 2004) because it includes unique information on the spatial distribution of ice concentrations, ice thicknesses and, particularly, ice types.
The problem with accessing this data set has been that it was encoded with a five-digit numerical code, representing the ice conditions in serially numbered grid boxes. Designed for utmost data compression, the encoding made storage on cardboard punch cards possible but, at the same time, seriously hindered the accessibility of the data.
This paper describes a post-processing procedure of BASIS developed for utmost accessibility in an age of ever-accelerating storage technology. In the following (Sect. 2), we describe the original data. Section 3 describes the post-processing procedure, which necessitated the introduction of ad hoc assumptions. Section 4 presents basic statistical analysis as a means to test the post-processing and reconcile our product with previous studies. We close with a summary and download instructions in Sects. 5 and 6, respectively.

The original database
The original BASIS ice data set is described in detail by Udin et al. (1981). The data are based on ice charts, which are still regularly provided by the local weather services for shipping. Ice charts summarise the prevailing knowledge of the Baltic sea ice situation. They are based on measurements, the interpretation of which is guided by the individual practical expertise of an ice analyst. Due to this subjective element, it is, as far as we can see, impossible to retrospectively quantify the uncertainties associated with the original data.
Additional uncertainties were introduced during the digitisation process, as the data were gridded and at most two ice types were considered per grid box (at the ice edge only one), while the potential occurrence of additional types was neglected. Figure 2a provides an overview of how often two (or originally possibly more) dominant ice types occur relative to the occurrence of only one ice type. As expected, the risk that some information on ice classes was lost is largest in the centres of the large basins, where several ice types mostly occur together. Another difficulty with the digitisation is that the classification into intervals is relatively coarse for some variables. This holds particularly for sea ice thickness. Furthermore, probably the biggest problem in interpreting the data today is that total ice concentration is not coded explicitly and can only be derived based on certain assumptions (cf. Udin et al., 1981, p. 19).
That said, the ice services already had a large observational network when compiling BASIS. As an operational product, the ice charts were continuously refined by experienced staff who were well aware that the welfare of ships and their crews depended on their work. One can thus expect that BASIS is the best source of information for historical sea ice conditions in the Baltic Sea.
Both the original five-digit code and documentation are today available at the online portal "Environment Climate Data Sweden" at http://www.smhi.se/ecds. The data there are distributed in the MATLAB® file format containing five-digit coded ice properties for 612 serially numbered grid boxes. The grid boxes are 15′ in latitude and 30′ in longitude, comprising ≈ 800 km² each. They cover Lake Vänern in Sweden and the whole Baltic Sea. The first two digits of the original five-digit code provide information about the dominant ice type, as described in detail in Table 1. Ice thickness, if available, is given by the third digit (Table 2). The fourth and fifth digits describe, if present, the second most abundant ice type in the respective grid box. If, however, a grid box contains a lead or the ice edge (as indicated by the first digit equalling "9"), the third digit does not encode ice thickness but, instead, encodes the fraction of open water in the box (Table 3).
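The digit layout described above can be illustrated with a short sketch. It is written in Python rather than in the MATLAB code distributed with the data, and the function name split_code as well as the returned field names are hypothetical. The meaning of the individual digit values (Tables 1-3) is deliberately left undecoded.

```python
def split_code(code):
    """Split a five-digit BASIS ice code into its digit groups.

    Illustrative only: the field names are hypothetical, and the digit
    values themselves still have to be interpreted via Tables 1-3.
    """
    text = str(code)
    if len(text) != 5 or not text.isdigit():
        raise ValueError("expected a five-digit code, got %r" % (code,))
    c1, c2, c3, c4, c5 = (int(ch) for ch in text)
    # A first digit of 9 marks a lead or the ice edge.
    lead_or_edge = (c1 == 9)
    return {
        "primary_type": (c1, c2),
        # C3 is a thickness index, except at leads and the ice edge,
        # where it encodes the open-water fraction instead.
        "thickness_index": None if lead_or_edge else c3,
        "open_water_index": c3 if lead_or_edge else None,
        # C4 and C5 describe the second most abundant ice type, if present.
        "secondary_type": (c4, c5),
        "lead_or_edge": lead_or_edge,
    }
```

For example, split_code(91300) flags a lead or ice-edge box and returns the third digit as an open-water index rather than as a thickness index.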
In addition to sea ice properties, sea surface temperature (SST) estimates were encoded in the original data. According to Udin et al. (1981) all codes containing fewer than five digits represent SSTs. However, we were unable to retrieve this information about SSTs because, apparently, the information about the positioning on the punch card has been lost and, hence, leading zeros and minus signs disappeared, which ultimately prevented the reconstruction.
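A minimal sketch of how such short codes might be separated from the five-digit ice codes is given below. It assumes that the codes are available as integers, so that a code with fewer than five digits is simply a value outside the range 10000 to 99999; the function name keep_ice_codes and the use of None as a placeholder are hypothetical choices, not part of the original data format.

```python
def keep_ice_codes(codes):
    """Replace every entry that is not a five-digit integer by None.

    Assumption: codes are stored as integers, so anything outside
    10000..99999 has fewer (or more) than five digits and is treated
    as unusable SST information.
    """
    cleaned = []
    for code in codes:
        if isinstance(code, int) and 10000 <= code <= 99999:
            cleaned.append(code)
        else:
            cleaned.append(None)
    return cleaned
```

Keeping a placeholder instead of dropping the entry preserves the position of each value, which matters if the position identifies the grid box.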

The post-processing
In a first step, we bin the original five-digit coded data on a latitude-longitude-time grid based on metadata provided by Udin et al. (1981), and store it in the current Network Common Data Format (NetCDF) (http://www.unidata.ucar.edu/software/netcdf/). All data containing fewer than five digits were removed before the subsequent analysis in order to exclude (as explained above) the inconclusive SST information. Subsequently we extract (or derive) the following:
1. pack ice concentration
2. ridged ice concentration
3. fast ice concentration
4. level ice concentration
5. consolidated ice concentration
6. rafted ice concentration
7. rotten ice concentration
8. belts of slush and shuga
9. total ice concentration
10. level ice thickness
The unit for 1 to 9 is percentage of ice cover and the unit of 10 is cm. Note that this list differs slightly from the original classification in BASIS as we restrict ourselves to contemporary ice types. That is, we do not consider different floe sizes in the classification of pack ice and do not distinguish between non-consolidated and consolidated ridged ice. We stick to the original terminology, "consolidated ice", while today's terminology varies and "compact pack ice" might be a more contemporary description.
Table 1. Ice type allocation according to the original five-digit (C1, C2, ... C5) BASIS ice punch card coding (Udin et al., 1981). If 9 > C1 > 2, the primary ice type is coded by C1 and C2 according to this table. If additionally, C4, C5 > 0, a secondary ice type exists and its coding is analogous to C1 and C2 in this table. Ice edge and (generally linear) areas of open water (leads) are coded differently (as described in the text).
Some of the above extractions are straightforward; others are based on assumptions that we explicitly state in the following. Pack ice and ridged ice concentrations are assigned first, which is straightforward since their concentration is always explicitly expressed in the five-digit code (a convenience that does not apply to all ice types). In a subsequent step, we allocate ice concentrations for all grid boxes at the ice edge and for all grid boxes that contain leads. In those cases, again, concentrations can be assigned without further assumptions since they are directly given by the five-digit code. In the remaining ice interior, away from leads and the ice edge, we base our decoding on the assumption that ice coverage is complete in the presence of fast, level, consolidated, rafted and rotten ice. An overview of how often total ice concentrations are explicitly provided is given in Fig. 2b. In the following we present a detailed description of the respective conversions.
1. Pack ice concentration is originally coded by numbers ranging from 0 to 9, representing the pack ice fraction on a scale of 0 to 1. Thus, the five-digit code encodes ranges with a precision of 0.1. We differ from the original coding in that we do not assign ranges but assign the respective mean value (of the ranges) to the post-processed data. Subsequently, we multiply the fraction by 100 to convert to a percentage.
2. Ridged ice concentration is explicitly coded according to Table 1, resolving the (irregular) intervals 0.1-0.3, 0.4-0.6, 0.7-0.8 and 0.9-1. In the (rare) cases where the original data report "growlers" and "disintegrating" ridges, we set the ridged ice concentration to 0.1. Finally, we multiply it by 100 to convert it to a percentage value. Note that this representation differs somewhat
3.-7. In the presence of leads and at the ice edge, the original code provides open water fraction with a precision of 1/10 (Table 3) and the primary ice type (according to Table 1). In this case, we can derive the concentrations of fast, level, consolidated, rafted and rotten ice directly by attributing the entire derived ice concentration to the one dominant ice type. Potential minor occurrence of an additional type may thus be neglected.
In the absence of leads and away from the ice edge, the original code does not contain information to explicitly derive concentrations of fast, level, consolidated, rafted and rotten ice. In those cases we are restricted to information merely indicating which (maximally two) ice types are most abundant. In order to infer concentrations nevertheless, we introduce ad hoc assumptions: experience suggests that the ice cover, away from the ice edge and leads, is generally rather complete for fast, level, consolidated, rafted and rotten ice. Hence we assume that the ice coverage here is always 100 % (note that the appearance of belts of slush and shuga is another exceptional case and is described in point 8). If only one primary ice type is given, which does not consist of pack ice or ridged ice, the 100 % is attributed to the primary type. If there is a primary ice type plus a secondary ice type, we assign (ad hoc) 60 % to the primary ice type and 40 % to the secondary ice type, except in cases where one of the ice types consists of pack ice or ridged ice. In the latter cases we assign the difference between 100 % and the pack or ridged ice concentration to the other ice type in the post-processed data.
Earth Syst. Sci. Data, 6, 367-374, 2014, www.earth-syst-sci-data.net/6/367/2014/
8. When the secondary ice type indicates the width of "belts of slush and shuga", the original code provides information about the width but not the length or positioning of the belts. An areal percentage could thus not be computed. Therefore, for the sake of simplicity, we assign a concentration of 10 % to any appearance of shuga in the post-processed data (and the primary ice type is set to 90 %, unless explicitly given). Note that shuga does not contribute to the total ice concentration.
9. Total ice concentration is calculated as the sum of the concentrations described above of pack, ridged, fast, level, consolidated, rafted and rotten ice.
10. Level ice thickness is indexed by numbers from 1 to 9 in the original data. We determine the corresponding thickness according to Table 2 and do not make any secondary assumptions. A peculiarity is that ice thickness information is intermittently lacking, e.g. in the presence of ice edges or leads. In those cases we assign −9999. Land is marked by −1 × 10³⁴, while zeros denote absence of ice.
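Several of the rules listed in points 1 to 10 can be summarised in a short sketch. The Python code below is an illustration under stated assumptions, not the MATLAB code distributed with the data: all function and variable names are hypothetical, ice types are represented by plain strings, the fraction ranges passed to range_mean_percent have to be taken from Table 1, and the case in which both ice types are pack or ridged ice is not covered by the text, so the handling of that case is an assumption.

```python
# Ice types that contribute to the total ice concentration (point 9).
ICE_TYPES = ("pack", "ridged", "fast", "level",
             "consolidated", "rafted", "rotten")
# Types whose concentration is stated explicitly in the five-digit code.
EXPLICIT_TYPES = ("pack", "ridged")


def range_mean_percent(lower, upper):
    """Point 1: replace a coded fraction range by its mean, in percent."""
    return 100.0 * (lower + upper) / 2.0


def interior_concentrations(primary, secondary=None, explicit=None):
    """Points 3-8 for grid boxes away from leads and the ice edge.

    `explicit` maps "pack" or "ridged" to the percentage given by the
    code. Returns a dict {ice type: percent}.
    """
    explicit = explicit or {}
    for name in (primary, secondary):
        if name in EXPLICIT_TYPES and name not in explicit:
            raise ValueError("missing explicit concentration for " + name)
    if secondary == "shuga":
        # Point 8: shuga gets 10 %, the primary type 90 % unless its
        # concentration is given explicitly.
        return {primary: explicit.get(primary, 90.0), "shuga": 10.0}
    if secondary is None:
        # One type only: explicit value if coded, else complete cover.
        return {primary: explicit.get(primary, 100.0)}
    if primary in explicit and secondary in explicit:
        # Assumption: both values are coded, so both are used as given.
        return {primary: explicit[primary], secondary: explicit[secondary]}
    if primary in explicit:
        return {primary: explicit[primary],
                secondary: 100.0 - explicit[primary]}
    if secondary in explicit:
        return {primary: 100.0 - explicit[secondary],
                secondary: explicit[secondary]}
    # Two types without explicit concentrations: ad hoc 60/40 split.
    return {primary: 60.0, secondary: 40.0}


def total_ice_percent(concentrations):
    """Point 9: sum over the seven ice types; shuga is left out."""
    return sum(concentrations.get(name, 0.0) for name in ICE_TYPES)


def thickness_status(value):
    """Point 10: interpret the special values of the thickness field."""
    if value <= -1.0e33:
        # Threshold instead of equality, because the land value may be
        # stored in single precision.
        return "land"
    if value == -9999:
        return "missing"
    if value == 0:
        return "no ice"
    return "ice"
```

As a usage example, interior_concentrations("fast", "ridged", {"ridged": 20.0}) assigns 20 % to ridged ice and the remaining 80 % to fast ice, and total_ice_percent applied to the result gives 100 %.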

Basic statistics
Here we show some statistics showcasing the quality of the BASIS data. The time period comprises the winters 1960/1961-1978/1979, a period known to feature high interannual variability, predominantly due to some particularly severe winters during the 1960s (e.g. Koslowski and Loewe, 1994). Figure 3 confirms the large interannual variability of the basin's ice coverage, which is consistent with previous studies (Leppäranta and Myrberg, 2009): the maximum coverage reached during an annual cycle ranges from as low as ≈ 15 % up to almost 80 %. Figure 4 features the climatological ice cover during December, January, February and March and, in addition, information on fast ice and ridges, which is, as far as we know, unique knowledge for the time period under consideration: during December high ice concentrations are restricted to the northernmost Bothnian Bay and the easternmost Gulf of Finland (cf. Fig. 1).
The ice builds up further into the season and generally peaks in February, then covering even parts of the Baltic Proper and the Danish Straits. Later in the season the total ice fraction declines but, even so, substantial ridged ice prevails during March in the Bothnian Bay, where it obstructs ship traffic.
The data quality and density of BASIS allow for more elaborate statistics than simply calculating climatological averages. In order to illustrate this, we perform an empirical orthogonal function (EOF) analysis of the winter mean total ice concentration, which captures patterns of major variability. The leading EOF (Fig. 5) explains 60.2 % of the variability of the total ice fraction and thus captures the predominant signal. The subsequent EOFs form rather small-scale patterns and explain 10 % or less of the variance. The analysis of the first principal component (PC, Fig. 5, lower panel) reveals that the major variability in ice fraction is closely related to changes in the large-scale atmospheric circulation: the leading PC has a Pearson correlation coefficient of 0.66 with the PC-based North Atlantic Oscillation (NAO) index (as applied, e.g. in Hurrell, 1995; Hurrell and Deser, 2009; Löptien and Ruprecht, 2005). This finding is in agreement with previous studies considering the relation between the NAO and ice extent (e.g. Tinz, 1996; Jevrejeva et al., 2003; Koslowski and Loewe, 1994). Note, however, that the relation between the NAO index and the ice extent is nonstationary (Omstedt and Chen, 2001), i.e. the correlations vary
When considering the spatial pattern (Fig. 5, upper panel), we find that the leading EOF indicates the largest variations in regions that are often (but not always) ice covered, as is the case in the western part of the Bothnian Bay, the Åland Sea, the northwestern Gulf of Finland and the Gulf of Riga. Further north, the amplitude decreases until the leading EOF pattern changes sign at the northernmost tip of the Baltic. Here, it is straightforward to argue that, as it is located furthest north, winter mean ice concentrations in the Bothnian Bay are always high and thus the interannual variability of the ice cover is rather weak. Also, as explained in Löptien et al. (2013), anomalous cold temperatures typically occur in combination with persistent northerly winds, which shift the ice further south and trigger the formation of leads.
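For readers unfamiliar with EOF analysis, the following sketch shows how a leading EOF, its principal component and the explained variance fraction can be obtained from a time-by-grid-box matrix. It is a generic textbook illustration in pure Python using power iteration, not the procedure used for Fig. 5: details such as area weighting or the treatment of land points are not specified here, and the function name leading_eof is hypothetical.

```python
def leading_eof(data, iterations=500):
    """Leading EOF, principal component and explained variance fraction.

    `data` is a list of rows, one per winter (at least two); each row
    holds the values at all grid boxes. Power iteration is used for
    clarity rather than efficiency.
    """
    n_time, n_space = len(data), len(data[0])
    # Anomalies: remove the time mean at every grid box.
    means = [sum(row[j] for row in data) / n_time for j in range(n_space)]
    anom = [[row[j] - means[j] for j in range(n_space)] for row in data]
    # Covariance matrix between grid boxes.
    cov = [[sum(a[i] * a[j] for a in anom) / (n_time - 1)
            for j in range(n_space)] for i in range(n_space)]
    total_var = sum(cov[i][i] for i in range(n_space))

    def apply_cov(vec):
        return [sum(cov[i][j] * vec[j] for j in range(n_space))
                for i in range(n_space)]

    # Arbitrary start vector; it must not be orthogonal to the leading EOF.
    eof = [1.0 / (j + 1) for j in range(n_space)]
    for _ in range(iterations):
        new = apply_cov(eof)
        norm = sum(v * v for v in new) ** 0.5
        if norm == 0.0:
            break  # no variance at all, nothing to iterate on
        eof = [v / norm for v in new]
    # Make sure the pattern has unit length even if the loop stopped early.
    length = sum(v * v for v in eof) ** 0.5
    eof = [v / length for v in eof]
    # Rayleigh quotient gives the variance carried by this pattern.
    eigenvalue = sum(e * c for e, c in zip(eof, apply_cov(eof)))
    # Principal component: projection of each winter onto the EOF.
    pc = [sum(a[j] * eof[j] for j in range(n_space)) for a in anom]
    explained = eigenvalue / total_var if total_var > 0 else 0.0
    return eof, pc, explained
```

The returned explained variance fraction is the quantity corresponding to the 60.2 % quoted above, and the returned PC is the time series that would be correlated with an index such as the NAO.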
In addition to total ice concentration, BASIS data contain information on the prevailing ice types. Figure 6 shows this by giving a climatological overview (omitting rotten ice because its occurrence is short-lived at the end of the ice season). We find that fast ice dominates throughout the ice season except for May, when pack ice prevails (Fig. 6a). While the fraction of level ice almost equals the fast ice fraction in December, the relative contribution of level ice decreases strongly throughout the ice season (Fig. 6b). This decrease is plausible because the ice is increasingly forced into motion, breaks up and deforms as the season proceeds. As regards interannual variations of the different ice types, we find that they have characteristics that differ substantially from one type to another. For example, fast ice is highly correlated with the seasonal basin-average ice cover (0.98) and the typical difference between mild and severe winters in maximum fast ice coverage is illustrated in Fig. 7. For seasonal mean pack ice and consolidated ice concentrations the correlations with the seasonal basin-average ice cover are lower, although still considerable (0.84 and 0.71, respectively). Average ridged ice correlates with a much lower value (0.58, corresponding to only 34 % of explained variance). This is probably related to the impact of local wind effects as explained in Haapala (2000) and Löptien et al. (2013). For rafted ice and the appearance of shuga, correlations with the seasonal mean total ice concentrations are negligible (0.2 and −0.09, respectively).
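The link between a correlation coefficient and the explained variance quoted above (0.58 corresponding to 34 %) is simply the square of the coefficient. The sketch below, with hypothetical function names, computes a Pearson correlation between two equally long time series and converts a coefficient into a percentage of explained variance.

```python
def pearson(x, y):
    """Pearson correlation coefficient of two equally long series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a series without variance has no correlation")
    return sxy / (sxx * syy) ** 0.5


def explained_variance_percent(r):
    """Share of variance explained by a linear relation with correlation r."""
    return 100.0 * r * r
```

For r = 0.58 this gives 33.64 %, i.e. the rounded 34 % stated in the text.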

Conclusions
The BASIS ice data set is unique in that it provides comprehensive ice information for the Baltic Sea for the period 1960/1961-1978/1979. However, because the underlying data format was designed for storage on punch cards, accessing it has always been difficult. This paper describes the conversion of the original data to the current and free file format NetCDF. In addition to the original five-digit numerical code, we provide extracted (and derived) prevalent quantities on a latitude-longitude-time grid. Specifically, we provide concentrations of pack ice, ridged ice, fast ice, level ice, consolidated ice, rotten ice, total ice concentration, ice thickness and a rough indicator of shuga and slush ice. This data set is of value as it sets a reference point in a gradually warming world. The relevance of the BASIS ice data set (and our easier-to-access derivative), however, goes beyond that: as the Arctic sea ice declines and waterways such as the Northwest Passage become more navigable, the need for sea ice nowcasts and forecasts is increasing. This relates not only to ice concentration and thickness but also to the modelling of ice properties, and in particular ridged ice (e.g. Funkvist and Kleine, 2007; Haapala, 2000; Haapala et al., 2005), because ridges are difficult to break and thus form substantial obstacles for ships. Also, the presence of ridged or deformed ice bears witness to preceding large ice stresses which can lead to a substantial slowdown and, in the worst case, even cause damage to ships (see, e.g. Suominen and Kujala, 2013; Pärn et al., 2007). Consistent with this, there is a fast-growing body of literature on deformed ice with a major focus on ridges (Haapala, 2000; Kankaanpää, 1988; Lensu, 2003; Leppäranta and Hakala, 1992; Leppäranta et al., 1995; Löptien et al., 2013). We expect that modelling lessons learnt in the Baltic may be applicable elsewhere. To this end, BASIS ice (and the post-processed product presented in this study) may well serve as a unique test bed to assess and develop sea ice models.

Data and code repository
The website www.baltic-ocean.org hosts the gridded original five-digit code, the post-processed data and relevant post-processing computer code. The data are provided in NetCDF format. The code is written in MATLAB® (Gilat, 2004). In addition, the post-processed data set is archived at the Data Publisher for Earth & Environmental Science, PANGAEA, doi:10.1594/PANGAEA.832353.

Figure 1. Bathymetry of the Baltic Sea. The colour-coding denotes depth in metres. The white line depicts the 10 m isobath. Sub-basins are abbreviated as follows: BB - Bothnian Bay; BS - Bothnian Sea; GF - Gulf of Finland; BP - Baltic Proper; GR - Gulf of Riga; BB2 - Bornholm Basin.

Figure 2. (a) Ratio between the occurrence of two (or originally potentially more) ice types and one ice type only. (b) Percentage of the ice observations where the total ice concentration could be derived from the original data set without additional (ad hoc) assumptions.

Figure 3. Temporal evolution of basin-averaged ice cover. Total ice cover, deformed ice and fast ice are denoted by black, red and green lines, respectively.

Figure 4. Climatological ice cover for 1960/1961-1978/1979. Panels (a), (b), (c) and (d) refer to December, January, February and March, respectively. The colour-coding denotes the degree of ice cover in %. The black line denotes regions where the averaged fast ice cover exceeds 20 %. The hatched areas denote regions hosting more than 5 % ridges.

Figure 5. EOF analysis of the winter mean total ice concentration. The upper panel shows the leading EOF of the total ice concentration. The lower panel shows the corresponding PC (black line) and the NAO index (blue line; Hurrell, 1995).

Figure 6. Climatological seasonal ice cycle, weekly resolved, for 1960/1961-1978/1979. (a) Basin-averaged cover by different ice types. The total cover is denoted by the black line; ice types as indicated in the legend. (b) As in (a) but normalised to the total ice cover, which, hence, is omitted.

Figure 7. Fast ice and ice edge in exceptional years: (a) the mild winter 1975/1976; (b) the severe winter 1966/1967. The coloured shading denotes the maximum fast ice cover occurring in the different years; the white contour line is the corresponding ice edge (here defined as the 1 % isoline of the seasonal maximum total ice concentration).

Table 2. Punch card coding of ice thickness (C3) (Udin et al., 1981). Original thickness bounds and thickness assigned in our post-processed product are listed in columns 2 and 3, respectively. Note that ice thickness in the presence of leads is not explicitly provided by the original data.

Table 3. Punch card coding of open-water fraction, which is given in the presence of leads or at the ice edge (in C3) (Udin et al., 1981). Original open-water fraction and ice concentrations assigned in our post-processed product are listed in columns 2 and 3, respectively.