Reply on RC2

Reading over the paper, I think what is missing is a thorough comparison to conventional, i.e. smoothness-constraint, ERT inversion. Since there is no ground-truth available, the authors cannot show that their approach provides superior accuracy in determining the IBPT. So the reader is somewhat left wondering why this additional computational effort is actually needed. Couldn't you achieve similar results by using "standard" processing schemes? To address this, I would suggest adding the results of a smoothness-constraint inversion to Fig. 2 and 7, which I believe will show the benefit of your inversion method clearly, and will highlight that the additional computational effort yields a more robust recovery of the subsurface structure.


We agree that comparing our results to other inversion strategies might help illustrate the advantages and/or disadvantages of our inversion routine. However, as also indicated in our reply to a similar comment from reviewer 1, our aim is not to compare and evaluate different inversion approaches. Rather, we want to present an alternative approach to image the IBPT interface and to estimate uncertainties in depth and resistivity when no borehole data are available, which is typical in subsea permafrost studies. Nevertheless, for interested readers, we will provide the smooth inversion in an appendix (see the supplement file in our answer to reviewer 1). Please note that a smooth inversion for the Bykovsky data set is already published and discussed in detail by Angelopoulos et al. (2019).
Another more fundamental comment refers to the spatial heterogeneity of the water resistivity you are trying to image. You describe the two field sites as places with different flow patterns feeding freshwater into the coastal system. I believe that this is likely causing spatial heterogeneity in the water resistivity going from the coast further into the sea. Yet, in your inversion approach, you only address the variation in the thickness of the sea-water layer, but not its resistivity. Why are you not addressing this? Is it because the variability in rho_w is small enough that it does not affect the inversion (if so, can you show that?), or is there another reason for not addressing it?
Our decision to use a homogeneous resistivity for the water layer in both of our case studies is based on CTD measurements near our ERT profiles. For the Bykovsky field site, CTD measurements offshore of the Bykovsky Peninsula in July 2017 (freely available at https://doi.org/10.1594/PANGAEA.895887) demonstrate that there is little vertical variation in the electrical resistivity (or conductivity) in the water column. Additionally, on a given day (e.g., 29 July), the water resistivity varied laterally by up to ~1 ohm-m. Because we do not expect significant stratification, assuming a homogeneous resistivity is appropriate. Although the CTD measurements indicate resistivity values around 13.7 ohm-m, we still allow resistivity variations between 11 and 15 ohm-m. To provide more clarity, we will add this reasoning to section 5.1.
For our Drew Point field site, CTD control points along our ERT transect indicate minimal lateral and vertical variation in water resistivity. To illustrate this, we add a supplementary figure (uploaded as supplementary material) with the CTD cast profiles at different offshore distances. As seen in this figure, the water resistivity ranges between 0.42 and 0.44 ohm-m for the first 600 m, with minor variations in the vertical direction. Because the resistivity variations are small both horizontally and vertically, we can justify our decision to use a homogeneous resistivity for the water layer in our Drew Point example. Although the CTD measurements indicate resistivity values around 0.43 ohm-m, we still allow resistivity variations between 0.2 and 2 ohm-m. Again, for clarity, we will extend this in section 5.2.

Line 38: Although I generally agree, you may want to check out the work by Wagner et al., who show an approach to get quantitative values of ice content from joint inversion of ERT and seismic data.
We think this is an excellent reference to highlight how in a different environment like the Alps, a combination of ERT and seismic data can help to quantify ice content. We will include this reference on line 530 to support our discussion about ice-content estimation.
Line 66: It might be better to stick with resistivity here rather than changing to conductivity.
We agree. For consistency, we will replace the word conductivity with resistivity in the whole text.
In the text you only refer to this plot to highlight the higher noise level, but I think you can also see that by comparing (c) and (g).
Following common terminology, a sounding refers to a set of electrode configurations collected with different spacings around one central sounding location. In our studies, the streamer we used had 10 channels and allowed us to measure ten electrode configurations with different spacings. As also suggested by reviewer 1, these plots will be removed in an updated version of the manuscript.
Line 160-161: Judging from c, it looks like levels 6 to 9 in general are noisier than the shallower ones.
We will rephrase these lines because Fig. 1d and h were removed. We will also indicate that the data become noisier after level 6.
Line 188: To improve clarity, it might be worth adding here how you describe the geometry of the interface. Are you using a specific function with x numbers of parameters, or do you have a layer thickness for each sounding location?
Following this comment and a similar comment from reviewer 1, we will add more details about our model and interface parameterization to section 4.1, including also the corresponding arctan function. The sum of the arctangent function can be seen as a set of coefficients (similar to 1D spline interpolation) that allows the creation of complex interfaces. More details regarding this parameterization can be found in Roy et al. (2005) and Rumpf and Tronicke (2015), which are cited in our manuscript.
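As a minimal sketch of such a parameterization, the snippet below evaluates an interface as a sum of arctangent terms. The exact functional form is an assumption made here for illustration (one base depth z0 plus, per node, a depth step dz, a horizontal position x0, and a width b); it is not taken from the manuscript, and all numerical values are hypothetical. The form was chosen because it yields 1 + 3 * (number of nodes) parameters per interface.

```python
import math

def interface_depth(x, z0, nodes):
    """Depth of one interface at position x.

    Assumed form (illustrative only):
        z(x) = z0 + sum_i dz_i * (0.5 + atan((x - x0_i) / b_i) / pi)
    nodes is a list of (dz, x0, b) tuples, one tuple per node.
    """
    return z0 + sum(
        dz * (0.5 + math.atan((x - x0) / b) / math.pi)
        for dz, x0, b in nodes
    )

# Hypothetical node values, for illustration only.
nodes = [(2.0, 30.0, 5.0), (-1.0, 70.0, 10.0)]

# Evaluate the interface on a distance vector x = 1, 2, ..., 100.
xs = list(range(1, 101))
depths = [interface_depth(x, 5.0, nodes) for x in xs]

# Parameters for this interface: z0 plus three values per node.
n_params = 1 + 3 * len(nodes)
```

Each node adds one smooth step to the interface, so a handful of nodes can describe a comparatively complex geometry sampled at many positions.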

Line 264: Is this an arbitrary number for the number of points of the interface, or where does it come from?
As shown in the answer to the comment on line 188, the number of points of an interface is given by the vector x (1, 2, 3,…,100). However, the number of nodes has a different meaning from the number of points of an interface. Typically, we set a smaller number of nodes because we wish to reduce the number of parameters while still creating complex interfaces. As shown in Arboleda-Zapata et al. (2022), a single interface is described by 1 + 3 * n_nod parameters, where n_nod is the number of nodes. We typically set n_nod between 3 and 7 for a distance vector of about 100-200 positions, which allows us to recreate relatively complex structures.
Line 265: I'm not entirely sure I follow how you get to 36? Five nodes for two interfaces should be 10 parameters describing the thickness, and then you need a resistivity for the water column, unfrozen sediments and frozen sediments.
The number of parameters follows from the parameterization of Arboleda-Zapata et al. (2022). Considering that each interface has the same number of nodes, the number of parameters is given by (n_int + 3 * n_nod * n_int) + (n_int + 1), where n_int is the number of interfaces and n_nod is the number of nodes. In our presented examples, we considered n_int = 2 and n_nod = 5, which results in 35 parameters. By checking this again, we identified a mistake, and 36 will be changed to 35. For completeness, we will add this simplified equation to the text.
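The count can be checked with a few lines of Python. This is only a sketch of the arithmetic above; the function name is hypothetical, and reading the (n_int + 1) term as one resistivity per layer is our interpretation of the formula.

```python
def n_parameters(n_int: int, n_nod: int) -> int:
    """Total number of model parameters for n_int interfaces with n_nod nodes each."""
    # Geometry: each interface contributes 1 + 3 * n_nod parameters.
    interface_params = n_int + 3 * n_nod * n_int
    # Remaining term of the formula, read here as one resistivity per layer.
    resistivity_params = n_int + 1
    return interface_params + resistivity_params

print(n_parameters(2, 5))  # 35
```

With n_int = 2 and n_nod = 5, the geometry accounts for 32 parameters and the remaining term for 3, giving 35.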

Line 330: This argumentation is a bit weak. Only because you have some sensitivity does not necessarily mean that you can resolve subsurface structures and that you can interpret the inverted models.
Having some sensitivity means that a change in resistivity may impact our cost function, thus influencing the final resistivity model. As we already pointed out in line 332, a more conservative way may be to start our interpretation at a position of x = -25 m, where most of the sensitivity is concentrated. However, while we agree that there is less sensitivity below the outermost electrodes close to the shoreline, our interpretation of features in the inverted ERT profiles was not based on the geophysical data alone, as discussed in lines 578-582. For example, the cryopeg features were encountered in drilling observations reported in Bull et al. (2020) and Bristol et al. (2021). Therefore, the interpretation of the nearshore features shown in MF2 (Figure 7) is plausible. Because our answers are already stated in the original manuscript, we do not find it appropriate to further extend the statement in line 330 for such an applied manuscript.
Line 335-336: These areas seem a little suspicious to me. Why do you first get almost no sensitivity, and then a comparable high value. This does not seem to agree with the expected sensitivity pattern.
We agree that these sensitivity patterns may look odd for conventional sensitivity analysis assuming lower resistivity contrast. However, these sensitivity patterns can be obtained with such high resistivity contrast between the horizontal layers, as also pointed out by Spitzer (1998). To assess up to what point these sensitivities below the outer electrode are present, we calculated the sensitivity considering other depths to IBPT. We found that the sensitivity below the external electrodes is almost zero by setting a depth to IBPT of 20 m. Because this additional analysis does not impact our presented results, we will leave the statement in lines 335-336 as it is presented.
Line 359: Why did you choose different PSO parameters for the two different sites? To compare the results, wouldn't it be better to use the same set of parameters?
Generally, there is no unique recipe for carrying out a PSO optimization. As a rule of thumb, setting the number of particles to one to three times the number of parameters is a good compromise. Because we noticed that the inversion of the Drew Point data set was converging much faster than that of the Bykovsky data set, we decided to lower the number of particles and the number of iterations to save some computational cost.
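As a small worked example of this rule of thumb, the sketch below computes the implied range of swarm sizes for a model with 35 parameters. The helper name is hypothetical, and the range is only a guideline, not the setting used for either site.

```python
def swarm_size_range(n_params: int) -> tuple[int, int]:
    """Rule-of-thumb bounds: one to three particles per model parameter."""
    return n_params, 3 * n_params

low, high = swarm_size_range(35)
print(low, high)  # 35 105
```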

Sensitivity analysis: Having a sensitivity study for each site feels a bit repetitive. Perhaps merging the two sensitivity studies would make sense?
Although we recognize that it may sound repetitive, we find it appropriate to leave the sensitivity analyses in their current positions. We wrote the sections in parallel, following the same structure and workflow. Some parameters considered in our sensitivity analysis are derived from previous subsections. We think that the current positions of the sensitivity sections are still appropriate for this manuscript, especially because the two subsea permafrost environments are so different.
Line 464-468: I may have missed that, but where do you show that in the 2D case? As I understand, you invert for rho_w and z_w only.
With this comment, we realized we did not mention the constraints used for our 2D inversion. We will include our considered constraints in a new version of the manuscript. Please see the answer to your second comment where we mentioned some of the considered constraints.