the Creative Commons Attribution 4.0 License.
A consistent dataset for the net income distribution for 190 countries, aggregated to 32 geographical regions and the world from 1958–2015
Kanishka B. Narayan
Brian C. O'Neill
Stephanie Waldhoff
Claudia Tebaldi
Abstract. Data on income distributions within and across countries are becoming increasingly important to inform analysis of income inequality and to understand the distributional consequences of climate change. While datasets on income distribution collected from household surveys are available for multiple countries, these datasets often do not represent the same income concept, which makes comparisons across countries, over time and across datasets difficult. Here, we present a consistent dataset of income distributions across 190 countries from 1958 to 2015 measured in terms of net income. We complement the observed values in this dataset with values imputed from a summary measure of the income distribution, specifically the Gini coefficient. For the imputation, we use a recently developed principal-components-based approach that shows an excellent fit to data on income distributions compared to other approaches. We also present another version of this dataset aggregated from the country level to 32 geographical regions and the world as a whole. Our aggregation method takes into account both within-country and across-country income inequality when aggregating to the regional level. This dataset will enable more robust analysis of income distribution at multiple scales.
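To make the link between decile shares and the Gini coefficient concrete, the following is a minimal sketch of the forward direction only: computing an approximate Gini coefficient from the income shares of equal-sized population groups by applying the trapezoid rule to the Lorenz curve. It is not the imputation method described in the abstract (which goes from the Gini coefficient to deciles using a principal-components-based approach), and the function name `gini_from_decile_shares` is a hypothetical helper introduced here for illustration. Because incomes are treated as equal within each group, the result is a lower bound on the true Gini coefficient.

```python
from typing import Sequence


def gini_from_decile_shares(shares: Sequence[float]) -> float:
    """Approximate Gini coefficient from income shares of equal-sized groups.

    `shares` must be ordered from the poorest to the richest group
    (e.g. 10 decile shares). Shares are renormalised to sum to one.
    Assumption: income is equal within each group, so the Lorenz curve
    is piecewise linear and the result is a lower bound on the Gini.
    """
    total = sum(shares)
    if total <= 0:
        raise ValueError("shares must sum to a positive number")
    n = len(shares)
    cumulative = 0.0  # cumulative income share (Lorenz curve height)
    area = 0.0        # area under the piecewise-linear Lorenz curve
    for share in shares:
        previous = cumulative
        cumulative += share / total
        # Trapezoid over one population group of width 1/n.
        area += (previous + cumulative) / (2 * n)
    return 1.0 - 2.0 * area


if __name__ == "__main__":
    equal = [0.1] * 10
    print(gini_from_decile_shares(equal))
```

With perfectly equal decile shares the function returns a value of (approximately) zero; if all income accrues to the top decile it returns 0.9 rather than 1, which illustrates the within-group approximation.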
Kanishka B. Narayan et al.
Status: final response (author comments only)
RC1: 'Comment on essd-2023-137', Anonymous Referee #1, 14 Jun 2023
The paper attempts to create a large dataset of 190 countries over almost 70 years providing consistent information on net income distributions. Such an attempt is valuable. However, if such a database does not already exist, or exists only with limited scope (such as the LIS), it is because building one raises serious challenges, ultimately related to the lack of suitable data. I found the paper unconvincing in the way it tackles these challenges. I therefore believe that the database it intends to produce (and document) is unlikely to be taken up by other researchers and institutions.
- Imputing net income shares using consumption shares:
  - I am sceptical of this approach without more information about the estimation sample and the country-years for which such imputation is performed. If the two sets of countries are different, there are reasons to doubt that good R-squared values would translate into good out-of-sample predictions.
  - I understand that the current setup with 10 regressions does not ensure that all income shares add up to one. Why not run one regression and impose the constraint that the sum of all income shares must be equal to 1?
- Imputing net income deciles based on summary measures of the Gini coefficient:
  - The same point mentioned above about in-sample vs. out-of-sample predictions applies here too.
  - Where it is not known whether the Gini coefficients are based on income or consumption, it would be best to drop these countries and years to ensure consistency of the income concept.
- Aggregating income distributions to the regional level:
  - I do not see a clear motivation for this section (other than the need for the authors to carry out these analyses for another project/report).
  - The approach sounds highly problematic as it appears to confuse (or ignore the differences) between household net income and GDP per capita. In addition, it also appears to ignore (crucial) variations in income dispersion within income decile groups.
- The same issues apply to section 4, in which the authors aggregate country income distributions up to the global level.
- The differences between the different data sources shown in Figure 5 are concerning. Given these sizeable differences, it is far from clear that one could accurately assess inequality levels and trends using data imputed by the authors.
- US results (Fig 5): why is “original data” for net income not available for the whole period? The CPS is a large and representative survey that collects detailed income information and that has been running yearly since the 1960s.
Specific comments:
- Introduction: the paragraph starting on line 29 is odd because it suggests that “at the national level”, datasets on income inequality have been “limited to summary metrics”. That is clearly not true. In many countries, detailed microdata allow researchers and statistical agencies to produce detailed distributional analyses.
Citation: https://doi.org/10.5194/essd-2023-137-RC1
AC1: 'Comment on essd-2023-137 (Response to reviewers)', Kanishka Narayan, 08 Sep 2023
Dear Editor,
Thank you and the reviewers for the detailed comments on our manuscript. We have responded in detail to all review comments from both reviewers. The responses are in the attached PDF. The review comments have greatly improved the quality of our manuscript. Do let us know what you think. We did not attach the revised manuscript here because the instructions state that only author comments should be added. We can add it as well if required.
We look forward to hearing from you.
RC2: 'Comment on essd-2023-137', Anonymous Referee #2, 15 Jun 2023
The comment was uploaded in the form of a supplement: https://essd.copernicus.org/preprints/essd-2023-137/essd-2023-137-RC2-supplement.pdf
Data sets
A consistent dataset for net income deciles for 190 countries, aggregated to 32 geographical regions and the world from 1958-2015 Kanishka B. Narayan, Brian C. O'Neill, Stephanie Waldhoff, and Claudia Tebaldi https://zenodo.org/record/7093997
Viewed
HTML | PDF | XML | Total | Supplement | BibTeX | EndNote |
---|---|---|---|---|---|---|
875 | 223 | 37 | 1,135 | 74 | 9 | 7 |