Template-type: ReDIF-Paper 1.0
Author-Name: Juan Manuel Pérez-Salamero González
Author-Workplace-Name: Department of Financial Economics and Actuarial Science, University of Valencia (Spain).
Author-Name: Marta Regúlez-Castillo
Author-Workplace-Name: Department of Applied Economics III, University of the Basque Country (UPV/EHU), Bilbao (Spain).
Author-Name: Manuel Ventura-Marco
Author-Workplace-Name: Department of Financial Economics and Actuarial Science, University of Valencia (Spain).
Author-Name: Carlos Vidal-Meliá
Author-Workplace-Name: Department of Financial Economics and Actuarial Science, University of Valencia and Research Institute of Economic 
	Analysis (ICAE), Complutense University of Madrid.
Title: Automatic regrouping of strata in the chi-square test
Abstract: Pearson's chi-square test is widely employed in the social and health sciences to analyze categorical data and contingency tables 
	and to assess sample representativeness. For the test to be valid, the sample must be large enough to provide a minimum number of 
	expected elements per category. If the researcher chooses to regroup the strata so that this minimum-size requirement is met, 
	automatic regrouping procedures in statistical software would be very useful, especially when tests are applied sequentially. After 
	comprehensively reviewing the software that can carry out this test, we find that, with few exceptions, it offers no automatic 
	regrouping of the strata to meet this requirement. This paper develops functions that regroup strata automatically, no matter where 
	they are located, thus enabling the test to be performed within an iterative procedure. The functions are written in Excel VBA 
	(Visual Basic for Applications) and in Mathematica, so it would not be hard to implement them in other languages. The utility of 
	these functions is demonstrated on three different datasets. Finally, the functions are applied iteratively to the Continuous Sample 
	of Working Lives, a dataset that has been used in a considerable number of studies, especially on labor economics and the Spanish 
	public pension system.
Classification-JEL: C46, C88, H55
Keywords: Chi-square test, statistical software, VBA, Mathematica, Continuous Sample of Working Lives
Length: 25 pages
Creation-Date: 2017-10
Number: 2017-24
X-File-Ref: http://america.sim.ucm.es/repec/ucm/ref/doicae1724.txt
File-URL: https://eprints.ucm.es/id/eprint/45317/1/1724.pdf
File-Format: Application/pdf
Handle: RePEc:ucm:doicae:1724