Methodology for Handling Missing Values In TANKAN

June 2002
Kiyohito Utsunomiya
Katsurako Sonoda

The "Short-term Economic Survey of All Enterprises in Japan" (Tankan) is a sample survey that contains both judgment questions and quantitative items. For the latter, we use a standard stratified sampling procedure, producing an estimate of the population total for each individual stratum. If missing values occur in a stratum, the estimated total for the population of that stratum is obtained by weighting the sum of respondents' values, where the weight is the ratio of the stratum population to the number of respondents. This method is equivalent to estimating the stratum total after filling in missing values with the mean of the respondents in that stratum. However, in the Tankan survey, the variance within each stratum is comparatively large, so we must consider the possibility of large deviations between the stratum mean and the true values of nonrespondents. In light of this problem, this paper conducts experimental simulations in order to identify the most appropriate method for handling missing values in Tankan.
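
The equivalence between the weighting adjustment and mean imputation can be written out as a short derivation. The notation here is an assumption introduced for illustration and does not appear in the original text: N is the stratum population, n the number of sampled enterprises, R the set of respondents with r members, y_i the reported value of enterprise i, and \bar{y}_R the respondents' mean.

```latex
\begin{align*}
\bar{y}_R &= \frac{1}{r}\sum_{i \in R} y_i \\
\hat{T}_{\text{weight}} &= \frac{N}{r}\sum_{i \in R} y_i = N\,\bar{y}_R \\
\hat{T}_{\text{impute}} &= \frac{N}{n}\Bigl(\sum_{i \in R} y_i + (n - r)\,\bar{y}_R\Bigr)
  = \frac{N}{n}\bigl(r\,\bar{y}_R + (n - r)\,\bar{y}_R\bigr) = N\,\bar{y}_R
\end{align*}
```

Both estimators reduce to N times the respondents' mean, so any gap between that mean and the nonrespondents' true values passes directly into the estimated stratum total.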

We compare three methods: mean imputation (weighting adjustment); "last value carried forward" (a kind of cold-deck adjustment); and "last value multiplied by respondents' mean growth rate" (a kind of hot-deck adjustment). Simulations are made for three items: sales, current profits, and fixed investments.
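
As a rough illustration of how the three methods differ, the sketch below applies each of them to a single hypothetical stratum. Everything in it is an assumption made for illustration rather than the procedure used in the paper: the function names, the toy figures, the use of `None` to mark a missing response, the requirement that every enterprise has a previous-period value, and the reading of "respondents' mean growth rate" as the ratio of respondents' current total to their previous total.

```python
from statistics import mean


def mean_imputation(prev, curr):
    """Fill each missing current value with the mean of the current respondents."""
    fill = mean(v for v in curr if v is not None)
    return [fill if v is None else v for v in curr]


def last_value_carried_forward(prev, curr):
    """Fill each missing current value with the same enterprise's previous value."""
    return [p if c is None else c for p, c in zip(prev, curr)]


def last_value_times_growth(prev, curr):
    """Fill each missing value with the previous value scaled by respondents' growth.

    Assumption: growth is the respondents' current total over their previous total.
    """
    pairs = [(p, c) for p, c in zip(prev, curr) if c is not None]
    growth = sum(c for _, c in pairs) / sum(p for p, _ in pairs)
    return [p * growth if c is None else c for p, c in zip(prev, curr)]


def stratum_total(filled, population):
    """Expand the completed sample to the stratum population."""
    return population / len(filled) * sum(filled)


# Hypothetical stratum: 4 sampled enterprises out of a population of 8.
previous = [100.0, 200.0, 300.0, 400.0]
current = [110.0, 220.0, None, 440.0]  # the third enterprise did not respond
POPULATION = 8

for method in (mean_imputation, last_value_carried_forward, last_value_times_growth):
    filled = method(previous, current)
    print(method.__name__, round(stratum_total(filled, POPULATION), 2))
```

In this toy stratum the nonrespondent was previously above the stratum mean, so mean imputation gives the lowest total of the three, while the two methods that use the enterprise's own history stay closer to its earlier level.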

Our study yields two main findings. First, the method we are currently using (mean imputation) is not as accurate as the alternatives. Second, on the whole, the "last value carried forward" procedure seems to be the most suitable for Tankan's purposes, although this finding does depend upon the item under examination, as well as upon the type of industry and the size classification (large, medium, small) to which an enterprise belongs. These results reflect an important characteristic of our business survey data: namely, that the variance observed in the data of a given enterprise over time is smaller than the variance observed within an individual stratum of the survey at a given point in time.

In addition, our work suggests that the seasonality of the data should be taken into consideration when "last value carried forward" is employed, because enterprises respond on a semi-annual basis based upon annual projections.

