Clarifying Some Issues in the Regression Analysis of Survey Data
Keywords: Error term, standard linear model, extended linear model, nonignorable, sample
Abstract
The literature offers two distinct reasons for incorporating sample weights into the estimation of linear regression coefficients from a model-based point of view. Either the sample selection is nonignorable or the model is incomplete. The traditional sample-weighted least-squares estimator can be improved upon when the sample selection is nonignorable, but not when the standard linear model fails and needs to be extended. Conceptually, it can be helpful to view the realized sample as the result of a two-phase process. In the first phase, the finite population is drawn from a hypothetical superpopulation via simple random (cluster) sampling. In the second phase, the actual sample is drawn from the finite population. In the extended model, the parameters of this superpopulation are vague. Mean-squared-error estimation can become problematic when the primary sampling units are drawn within strata using unequal probability sampling without replacement. This remains true even under the standard model when certain aspects of the sample design are nonignorable.
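As a rough illustration of the traditional sample-weighted least-squares estimator named in the abstract, the sketch below solves the weighted normal equations (X'WX)b = X'Wy, where W is a diagonal matrix of sample weights. This is a minimal sketch and not code from the article: the function name `weighted_least_squares`, the simulated data, and the choice of weights are all hypothetical, and the weights are simply assumed to be inverse selection probabilities.

```python
import numpy as np


def weighted_least_squares(X, y, w):
    """Solve the weighted normal equations (X' W X) b = X' W y, W = diag(w).

    X : (n, p) design matrix, y : (n,) response, w : (n,) sample weights.
    Hypothetical helper for illustration only.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    Xw = X * w[:, None]  # scale each row of X by its sample weight
    return np.linalg.solve(Xw.T @ X, Xw.T @ y)


# Hypothetical toy data: intercept 1, slope 2, unit-variance noise.
rng = np.random.default_rng(0)
n = 200
x = rng.uniform(0.0, 10.0, size=n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 2.0 * x + rng.normal(0.0, 1.0, size=n)

# Assumed weights, standing in for inverse selection probabilities.
w = rng.uniform(1.0, 5.0, size=n)

beta_weighted = weighted_least_squares(X, y, w)
beta_unweighted = weighted_least_squares(X, y, np.ones(n))
```

In this toy setup the weights are unrelated to the error term, so the weighted and unweighted estimates should be close to each other. The two can diverge when selection is nonignorable or the standard linear model fails, which are the two cases the abstract identifies.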
How to Cite
Kott, P. S. (2007). Clarifying Some Issues in the Regression Analysis of Survey Data. Survey Research Methods, 1(1), 11-18. https://doi.org/10.18148/srm/2007.v1i1.47
Copyright for articles published in this journal is retained by the authors, with first publication rights granted to the journal. By virtue of their appearance in this open access journal, users can use, reuse and build upon the material published in the journal but only for non-commercial purposes and with proper attribution.