Citation

Kolenikov, Stanislav & Angeles, Gustavo (2004). The Use of Discrete Data in PCA: Theory, Simulations, and Applications to Socioeconomic Indices. Chapel Hill, N.C.: Carolina Population Center MEASURE.

Abstract

The last several years have seen a growth in the number of publications in economics that use principal component analysis (PCA), especially in the area of welfare studies. This paper introduces principal component analysis and describes how discrete data can be incorporated into it. The effects of the discreteness of the observed variables on PCA are reviewed. The concepts of polychoric and polyserial correlations are introduced, with references to the existing literature demonstrating their statistical properties. A large simulation study is carried out to shed light on some of the issues raised in the theoretical part of the paper. The simulation results show that the currently used method of running PCA on a set of dummy variables, as proposed by Filmer & Pritchett (2001), is inferior to other methods for analyzing discrete data, both simple ones, such as using ordinal variables, and more sophisticated ones, such as using polychoric correlations.

Reference Type

Edited Book

Year Published

2004

Author(s)

Kolenikov, Stanislav
Angeles, Gustavo

ORCiD

Angeles - 0000-0003-4598-152X