Supplement of A comparison of estimates of global carbon dioxide emissions from fossil carbon sources

Abstract. Since the first estimate of global CO2 emissions was
published in 1894, important progress has been made in the development of
estimation methods while the number of available datasets has grown. The
existence of parallel efforts should lead to improved accuracy and
understanding of emissions estimates, but there remains significant
deviation between estimates and relatively poor understanding of the reasons
for this. Here I describe the most important global emissions datasets
available today and – by way of global, large-emitter, and case examples – quantitatively compare their estimates, exploring the reasons for
differences. In many cases differences in emissions come down to differences
in system boundaries: which emissions sources are included and which are
omitted. With minimal work in harmonising these system boundaries across
datasets, the range of estimates of global emissions drops to 5 %, and
further work on harmonisation would likely result in an even lower range,
without changing the data. Some potential errors were found, and some
discrepancies remain unexplained, but it is shown to be inappropriate to
conclude that uncertainty in emissions is high simply because estimates
exhibit a wide range. While “true” emissions cannot be known, by comparing
different datasets methodically, differences that result from system
boundaries and allocation approaches can be highlighted and set aside to
enable identification of true differences, and potential errors. This must
be an important way forward in improving global datasets of CO2
emissions. Data used to generate Figs. 3–18 are available at
https://doi.org/10.5281/zenodo.3687042 (Andrew, 2020).



Soviet CO2 emissions during WW2
On the eve of World War II, the Soviet Union was the fourth-largest coal producer in the world, after the USA, Germany, and Britain. But Soviet coal production crashed during the war, apparently largely as a result of the German invasion of Ukraine in 1941. The invasion of the Soviet Union was motivated both by ideology and economics, and Ukraine "was expected to be both a giant breadbasket and a reservoir of essential minerals" (Priemel, 2015, p. 31).
The largest source of Soviet coal was in the industrial Donetsk region, and the city of Donetsk was almost completely destroyed and depopulated. Priemel (2015) describes how surprised the Germans were, when they arrived, to discover how fast the Soviets had progressed in building up the industry of the Eastern Ukraine, compared to the plans they had seen.
But according to Priemel, much of the industrial destruction in Donetsk was done by the Soviets under a scorched-earth policy, and the Germans encountered "complete devastation and emptiness". "In late June [1942], Hitler [ordered] the reconstruction of the Donbas coal mines." The repair of the destroyed Zaporizhzhya hydroelectric dam allowed flooded mines to be pumped out. But in September 1943 the Red Army returned, and the Germans in turn applied the same scorched-earth policy. In the two years of German occupation of the area, the Germans managed to produce "slightly more than four million tons [of coal] – some 5 percent of Soviet prewar production."
During this period, Soviet emissions of CO2 were dominated by combustion of coal. But while we see a significant dip in emissions from coal in 1942, there's a sharp increase again in 1943, when we would expect from the foregoing discussion that coal production would be limited.
There aren't many datasets that provide estimates of emissions in the Soviet Union during the Second World War. CDIAC is one, derived from the work of Andres et al. (1999), and these emissions are used in turn by the Global Carbon Project (GCP) and the Community Emissions Data System (CEDS). Andres et al. based their estimates primarily on the energy production data reported by Etemad and Luciani (1991) (hereafter EL91), adjusted using international trade data from Mitchell (various years; Andres et al., 1999). Mohr et al. (2015) also produced estimates, going back to original energy data sources, although they provide only the detailed energy data and not the emissions estimates.
In the following figure I plot Soviet CO2 emissions from coal directly from CDIAC, and using approximate energy-to-emissions factors for both Mohr and EL91. Given that I'm using approximate factors, and not adjusting Mohr or EL91 for international trade, we wouldn't expect exact matches, but the numbers are very close nonetheless.
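The kind of approximate conversion from coal production to CO2 emissions described here can be sketched as follows. This is only an illustration: the calorific value and carbon content are assumed, IPCC-style default values for hard coal, not the factors actually used for the figure, and full oxidation is assumed. The two production figures are EL91's reported hard coal production for 1940 and 1942.

```python
# Assumed default-style factors for hard coal (NOT the values used in the figure).
NCV_GJ_PER_T = 25.8        # net calorific value, GJ per tonne of coal (assumption)
CARBON_KG_PER_GJ = 25.8    # carbon content, kg C per GJ (assumption)
CO2_PER_C = 44.01 / 12.011 # ratio of molar masses of CO2 and C


def coal_mt_to_co2_mt(production_mt, ncv=NCV_GJ_PER_T, carbon=CARBON_KG_PER_GJ):
    """Convert coal production (Mt) to approximate CO2 emissions (Mt CO2).

    Assumes all coal is burned and fully oxidised; no trade adjustment.
    """
    energy_pj = production_mt * ncv      # Mt x GJ/t = 1e6 GJ = PJ
    carbon_mt = energy_pj * carbon / 1e3 # PJ x kg C/GJ = kt C; /1000 gives Mt C
    return carbon_mt * CO2_PER_C


# EL91 hard coal production (million tonnes), as quoted in this supplement.
el91_hard_coal_mt = {1940: 139.974, 1942: 90.000}
el91_co2_mt = {year: coal_mt_to_co2_mt(mt) for year, mt in el91_hard_coal_mt.items()}

for year, co2 in sorted(el91_co2_mt.items()):
    print(f"{year}: ~{co2:.0f} Mt CO2 from hard coal (approximate)")
```

Because a single pair of factors is applied to all hard coal, the result is only indicative; brown coal and peat would need their own, lower calorific values.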
We can clearly see that CDIAC's emissions follow closely the data from EL91 during the war years, with a minimum in 1942, a sharp increase in 1943, before another dip in 1944. In contrast, by examining Mohr's reported energy data it's clear that they have linearly interpolated between 1940 and 1945, and their estimates for 1941-1944 are therefore of little value. As shown in the following figure, EL91 (the source used by Andres) does not report production of either brown coal or peat during the war years, and these are not marked as zero, but as unknown.
Further, EL91's reported Soviet coal production during WWII appears to be estimated, with rounded figures compared with more precise figures in other years. For example, for 1942 they report hard coal production as 90.000 million tonnes, while in 1940 it was 139.974 million tonnes.
We can therefore conclude that the data from EL91 for 1941-1944 are unreliable and incomplete and we should look for other sources. I've located two additional Soviet data sources. The first is "The national economy of the USSR in the Great Patriotic War of 1941-1945" (MIPC, 1990; hereafter PW). Chapter 4 presents data tables of the fuel industry, with production by year for 1940-1945. The second is the "Statistical time series for the years 1913-1951" available at the RSAE archive (f. 1562, op. 41, d. 65; hereafter RSAE) (RSAE, n.d.). This report is marked both 'Top secret' and (presumably later) 'Declassified'.
As noted above, the substantial drop in Soviet coal production during WWII appears to have been largely a result of the German invasion of Ukraine in 1941, which cut off the Donetsk region, the largest source of Soviet coal. The following bar chart combines information from two tables in RSAE. Production figures by coal basin are from page 67, while the empty bars representing total production are from page 310.
The German armed forces didn't drive Soviet troops out of Eastern Ukraine until September 1941, so perhaps coal production could have been maintained through much of 1941. While these are Soviet statistics and therefore most likely only show production by the Soviets, rather than production in all Soviet territory, we recall the abovementioned information that German coal production during the occupation was only about 5% of pre-war production, so these are likely very close to full production in the entire territory, including that occupied by the Germans. RSAE and PW, the two Soviet data sources, agree precisely on total coal production for each year 1940-1945. The source used by CDIAC, EL91, agrees with RSAE precisely for years up to 1940 and from 1945, but it does not report production of brown coal or peat during 1941-1944. Moreover, EL91 reports production of hard coal that is greater than the Soviet reports of total coal during 1941-1944, while matching exactly outside of that period.
How reliable are these statistics? Bergson (1953) argues that, rather than tinkering with statistics, the Soviets mostly simply kept statistics secret if they were either sensitive or negative. The report I've used for coal production has "СОВ. СЕКРЕТ" on the cover ("Top secret") with "РАССЕКРЕЧЕНО" stamped next to that ("Declassified"). While I can't be entirely certain these are accurate statistics, I've shown that the statistics previously used during this period are not accurate, and the new statistics agree with historical evidence about the occupation of Ukraine during the war.

Emissions in the Netherlands Antilles and Aruba before 1950
In 1918 an oil refinery was established on the small island of Curacao, just off the north coast of South America, to refine Venezuelan crude, since Venezuela didn't have a suitable port. At the time, Curacao was part of the overseas territory of the Kingdom of the Netherlands called the Netherlands Antilles. In the 1920s, two more refineries followed on the island of Aruba. These refineries became the largest in the world at the time.
CDIAC's emissions data show very strange behaviour for this group of islands, with an abrupt drop from 113 Mt CO2 in 1948 to 7 Mt in 1950 (and no data in 1949). Estimates from 1950 onwards are derived from UN energy statistics, while those before 1950 were collated by Andres et al. (1999) using production data from Etemad and Luciani (1991) and trade data from Mitchell's International Historical Statistics. Etemad and Luciani (1991) don't present any production information for these islands. Mitchell's volumes have been continued as the Palgrave Macmillan International Historical Statistics series (Palgrave Macmillan Ltd, 2013).
The trade data show that the islands imported substantial crude oil and exported substantial petroleum products, as would be expected from the history of Curacao as an economy dominated by a single oil refinery. But it appears that somehow these exports didn't make it into the energy dataset used by Andres et al. (1999), whether because they were missing from the publication they used or because of a transcription error.
CDIAC's liquid-fuel emissions can be converted back to the approximate apparent consumption of crude oil using a factor of 0.85 tC per t oil (Figure SI-9). This demonstrates clearly that the emissions from this island group have been estimated without taking exported petroleum products into account.
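As a rough illustration of this back-conversion, the sketch below turns an emissions figure in Mt CO2 into apparent crude oil consumption in Mt, using the 0.85 tC per t oil factor quoted above. The function name is illustrative only, and the example input is the 1948 value of 113 Mt CO2 mentioned earlier, treated here as if it were entirely from liquid fuels, which is a simplifying assumption.

```python
C_PER_CO2 = 12.011 / 44.01   # mass of carbon per unit mass of CO2
CARBON_T_PER_T_OIL = 0.85    # tC per tonne of oil, as quoted in the text


def apparent_oil_consumption_mt(co2_mt):
    """Convert liquid-fuel emissions (Mt CO2) to apparent oil consumption (Mt oil)."""
    carbon_mt = co2_mt * C_PER_CO2
    return carbon_mt / CARBON_T_PER_T_OIL


# 1948 emissions for the island group, assumed here to be all liquid fuels.
oil_1948_mt = apparent_oil_consumption_mt(113.0)
print(f"Apparent oil consumption in 1948: ~{oil_1948_mt:.0f} Mt")
```

An apparent consumption of this size on a small group of islands is only plausible if the refined products were in fact exported rather than burned locally.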
This error results in ~100 Mt CO2 (~30 MtC) extra in 1948, and a total probably over 1 Gt CO2 between 1926 and 1948. However, this error does not propagate through to CDIAC's estimates for global emissions, because those are calculated not by summing national emissions, but rather by calculating emissions directly from summed global energy production data.

IEA emissions reconstruction
Replicating the methodology used by IEA to generate emissions estimates from energy data is a good test both of one's understanding of that methodology and of the completeness and transparency of its documentation.
In the following figure, global emissions reported by IEA are compared with global emissions calculated directly from IEA's energy data using their reported methodology. The difference is never more than 100 kt/yr. The reason for this residual is that the published energy data do not differentiate between two products for which IEA uses separate emission factors, Orimulsion and Other Hydrocarbons: in IEA's energy data Orimulsion is included in Other Hydrocarbons, so a single emission factor has to be used for both fuels in the reconstruction. For countries with no, or very low, use of other hydrocarbons, estimates differ only because of rounding in the data reported by IEA. Few countries report use of these hydrocarbons (Canada, Denmark, Guatemala, Italy, South Korea, Lithuania, UK), and only over a limited period.
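The origin of this residual can be illustrated with a minimal sketch. All numbers below are hypothetical placeholders chosen only to show the mechanism; they are not IEA energy data or IEA emission factors.

```python
def emissions_kt(energy_tj, ef_t_co2_per_tj):
    """Emissions in kt CO2 from energy use (TJ) and an emission factor (t CO2/TJ)."""
    return energy_tj * ef_t_co2_per_tj / 1000.0


# Hypothetical energy use, in TJ (illustrative only).
orimulsion_tj = 1000.0
other_hydrocarbons_tj = 500.0

# Hypothetical emission factors, in t CO2 per TJ (illustrative only).
EF_ORIMULSION = 77.0
EF_OTHER_HYDROCARBONS = 73.3

# "Reported" emissions: each product gets its own factor.
reported_kt = (emissions_kt(orimulsion_tj, EF_ORIMULSION)
               + emissions_kt(other_hydrocarbons_tj, EF_OTHER_HYDROCARBONS))

# Reconstruction: the published data merge both products into one category,
# so only a single factor can be applied to the combined energy.
combined_tj = orimulsion_tj + other_hydrocarbons_tj
reconstructed_kt = emissions_kt(combined_tj, EF_OTHER_HYDROCARBONS)

residual_kt = reported_kt - reconstructed_kt
print(f"Residual from merged categories: {residual_kt:.2f} kt CO2")
```

In this toy case the residual equals the Orimulsion energy multiplied by the difference between the two factors, so it vanishes for any country that uses neither product.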

Revisions of CRFs
While the official reporting by Annex I countries to the UNFCCC via the CRFs is often seen as the 'gold standard', significant revisions of these estimates do nevertheless occur. Here we demonstrate this with some selected examples, although these should not be taken as representative; in many cases revisions are relatively minor.
As part of the UNFCCC reporting requirements, revisions to previous estimates must be documented and explained.
The first case (Figure SI-13) shows CO2 emissions from combustion of gaseous fuels in Germany, which demonstrates that the final year's emissions in each report can be quite heavily revised, despite that year having finished more than 15 months before submission. In this case, however, no revisions are evident in earlier periods.

Figure SI-13: Comparison of the five most recent annual German CRF submissions for CO2 emissions from gas combustion.
The next example (Figure SI-14) shows CO2 emissions from combustion of liquid fuels in France. These are perhaps harder to estimate than emissions from gaseous or solid fuels because they are strongly affected by transport, which can cross international borders; a vehicle transport model is therefore required to derive territorial emissions from national fuel sales and vehicle traffic data.

Dataset comparisons for European Union countries
The following figures compare a number of datasets for the 28 countries of the European Union.

Double counting: Territorial overlap
Territorial boundaries have changed at various points through history. Andres et al. (1999) state: "Land exchanges between countries were also accommodated, when possible. For example, the emissions from Alsace-Lorraine were included with Germany or France, reflecting which political unit governed these lands at any given time. This maintained the integrity of political entities despite changes in national borders" (Andres et al., 1999, p. 760). In other words, emissions occurring in Alsace-Lorraine have been assigned to either Germany or France depending on which nation controlled the territory at the time.
An interesting case of such territorial changes is that of the Austrian Empire, of which Czechoslovakia was a part until WWI. Etemad and Luciani (1991) present coal and lignite production data for Austria and Czechoslovakia, but with some explicit double counting, as clearly indicated by the comment on Austrian production data "Czechoslovakia included until 1916" (p. 23). Subtracting Czechoslovakian production from the Austrian total indicates that Austrian coal production was not substantial: hard coal in Austria was about 13% of that in Czechoslovakia in 1915, with the figure for brown coal being about 20%.
However, CDIAC's dataset suggests that this double-counting may not have been properly handled, with Austrian emissions in the early 1900s almost as high as those estimated from Etemad and Luciani (1991), which are known to include emissions in Czechoslovakia (Figure SI-21).

Högbom's Estimate of Global CO2 Emissions
Högbom's article "On the probability of secular changes in the level of atmospheric CO2" (original title: "Om sannolikheten för sekulära förändringar i atmosfärens kolsyrehalt") was published in 1894 in the Svensk Kemisk Tidskrift (Swedish Chemistry Journal) (Högbom, 1894). Here 'secular' means 'over a period of about a hundred years'.
Here's the relevant text (p. 171, my translation): "Current global hard coal production is in round numbers 500 million tonnes per annum, or 1 tonne per km² of the Earth's surface. Transformed to CO2 this amount of coal represents approximately a thousandth part of the air's total CO2. This is equivalent to a layer of limestone of 0.003 mm over the entire surface of the Earth, or 3 mm over Sweden, or, expressed as a volume, 1.5 km³ of limestone." Unfortunately, Högbom didn't actually say how many tonnes of CO2 this was, but one can derive it directly from the information supplied.
Firstly, he writes that 500 Mt is the same as one tonne per km², meaning he's using an approximate surface area for the Earth of 500 million km² (it's actually about 510 million km²).
Then he writes that this is equivalent to 0.003 mm thickness of limestone (CaCO3) over 500 million km² of surface, or 1.5 km³. And 1.5 km³ is 1.5×10¹⁵ cm³, which, with a density of about 2.72 g/cm³, is about 4.1×10⁹ tonnes. The CO2 'content' of this mass is roughly 44/100 (ratio of the molar masses), i.e. 0.44 × 4.1×10⁹ = 1.8×10⁹ tonnes. This is also consistent with his 'thousandth part'. So Högbom estimated that about 1.8 Gt of CO2 were emitted at about the time he was writing. Given the starting point of 500 Mt of coal, 1800/500 = 3.6 tonnes CO2 per tonne coal, which is about the ratio of the molar masses of CO2 to C (3.664), such that he appears to have assumed that coal was equal or close to 100% carbon.
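The arithmetic above can be checked with a short script. It uses only the figures given in the text (500 million km², 0.003 mm, 2.72 g/cm³, a molar-mass ratio of 44/100, and 500 Mt of coal); the variable names are illustrative.

```python
# Figures taken from the text above.
EARTH_AREA_KM2 = 500e6             # Högbom's implied surface area of the Earth
LAYER_THICKNESS_MM = 0.003         # equivalent limestone layer
LIMESTONE_DENSITY_G_PER_CM3 = 2.72
CO2_FRACTION_OF_CACO3 = 44 / 100   # ratio of molar masses, CO2 : CaCO3
COAL_PRODUCTION_T = 500e6          # global hard coal production, tonnes per year

MM_PER_KM = 1e6
CM3_PER_KM3 = 1e15
G_PER_TONNE = 1e6

# Volume of the limestone layer over the whole surface.
volume_km3 = EARTH_AREA_KM2 * LAYER_THICKNESS_MM / MM_PER_KM

# Mass of that limestone, then the CO2 bound in it.
limestone_t = volume_km3 * CM3_PER_KM3 * LIMESTONE_DENSITY_G_PER_CM3 / G_PER_TONNE
co2_t = limestone_t * CO2_FRACTION_OF_CACO3

# Implied emission factor: tonnes of CO2 per tonne of coal.
co2_per_t_coal = co2_t / COAL_PRODUCTION_T

print(f"Limestone volume: {volume_km3:.2f} km3")
print(f"Limestone mass:   {limestone_t:.2e} t")
print(f"CO2:              {co2_t:.2e} t")
print(f"CO2 per t coal:   {co2_per_t_coal:.2f}")
```

The implied factor of about 3.6 sits just below the CO2-to-C molar-mass ratio of 3.664, which is what supports the reading that Högbom treated coal as essentially pure carbon.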
While this estimate is only for coal, use of natural gas at the time was almost zero, and oil was still in its infancy: almost all fossil energy was from coal.
The reason Högbom compared to limestone in this way was that he believed these emissions from coal combustion would be completely compensated for by mineralisation processes: formation of limestone and other carbonates taking CO2 out of the atmosphere. That is, his conclusion was that short-term emissions from burning fossil fuels would not affect the natural carbon balance. His interest was in changes in the carbon cycle over geological periods.

EIA's high coal production and consumption data
As discussed in Section 6.1.1 of the article, the Energy Information Administration's (EIA's) coal production and consumption data deviate significantly from those of both the International Energy Agency (IEA) and BP. Figure SI-22 demonstrates that the deviation between EIA and IEA coal consumption is associated with China's data, where EIA's estimates are significantly higher. Figure SI-24 shows that when coal production in mass terms is converted to energy terms, NBS and IEA match very closely, while EIA is significantly higher. This appears to be a result of not removing the non-combustible coal washing wastes. This then translates to higher estimates for CO2 emissions during these years, and also a changed trend.

International bunker fuels
By including emissions from international bunker fuel sales in a country's totals, significant deviations between datasets appear for some countries (Figure SI-25, Figure SI-26). By international agreement emissions from bunker fuels are not included in countries' accounts, and the two international bodies responsible for civil aviation and maritime navigation have been tasked with taking responsibility for these emissions. Their inclusion by BP and EIA in country totals therefore results in emissions estimates at odds with current rules on national emissions responsibility.
The International Maritime Organization (IMO) reports that emissions from combustion of international marine bunker fuels are probably underestimated in existing global datasets (IMO, 2014). The IMO has used locational data from ships combined with databases on ship characteristics to estimate emissions in domestic, international, and fishing trips, and this bottom-up approach differs considerably from the top-down approach used by the IEA and others, which is based on national reporting of total sales of fuels to international bunkers. The IMO report states that "The top-down estimates are also uncertain, including observed discrepancies between global imports and exports of fuel oil and distillate oil, observed transfer discrepancies among fuel products that can be blended into marine fuels, and potential for misallocation of fuels between sectors of shipping (international, domestic and fishing)" (IMO, 2014, p. 8). An example of the discrepancies referred to can be seen in Figure SI-27, plotted here from IEA data.
As discussed in Section 5.3 of the article, CEDS uses an independent estimate of international marine bunker fuel emissions, resulting in estimates that are more than 50% higher than those reported by the IEA in some years (Figure SI-28). The IEA's emissions dataset begins in 1971. CDIAC's begins in 1751, but includes bunker fuel emissions only from 1950. CEDS provides estimates through the entire time series. The use of the alternative estimation approach for marine bunkers results in total bunker fuel emissions that are much higher than those in both IEA and CDIAC (Figure SI-29).