Detecting Fabricated Interviews Using the Hamming Distance
DOI:
https://doi.org/10.18148/srm/2023.v17i2.7961
Keywords:
Fabricated data, string distances, PISA data
Abstract
In the research literature on survey methodology, there is considerable discussion of interviewer effects and how to prevent data fabrication; however, there is little discussion on the detection of data fabrication by interviewers in published data, and there are even fewer papers examining the phenomenon of employees of survey research organizations fabricating data. Among them, Blasius and Thiessen (2015) show for the PISA 2009 principal data that employees of survey research organizations in some countries duplicate cases to generate data. While the authors focus on exact copies, more sophisticated data fabrication techniques might include duplicating whole cases and changing a few entries afterwards. By calculating Hamming distances and applying them to the same data, we show that – in some countries in particular – large parts of the data have been duplicated, and most of them have been retrospectively modified to a small degree.
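To make the idea in the abstract concrete, the following is a minimal sketch of how Hamming distances could flag near-duplicate cases. It is not the authors' implementation. The function names `hamming_distance` and `near_duplicate_pairs`, the toy response vectors, and the threshold of 1 are all assumptions chosen for illustration. The Hamming distance between two equal-length response vectors is the number of positions at which they differ, so an exact copy has distance 0 and a copy with a few changed entries has a small positive distance.

```python
from itertools import combinations


def hamming_distance(a, b):
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def near_duplicate_pairs(cases, max_distance):
    """Return (i, j, distance) for every pair of cases whose Hamming
    distance is at most max_distance (hypothetical helper)."""
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(cases), 2):
        d = hamming_distance(a, b)
        if d <= max_distance:
            pairs.append((i, j, d))
    return pairs


# Invented toy data: each tuple is one respondent's answers to six items.
cases = [
    (1, 3, 2, 4, 4, 1),
    (1, 3, 2, 4, 4, 1),  # exact copy of case 0
    (1, 3, 2, 4, 2, 1),  # copy of case 0 with one entry changed
    (4, 1, 3, 2, 1, 3),  # unrelated case
]

# Illustrative threshold only; a real analysis would need to justify it.
flagged = near_duplicate_pairs(cases, max_distance=1)
print(flagged)
```

Comparing all pairs takes time quadratic in the number of cases, which is manageable for modest samples. How to treat missing values and how to pick a threshold are left open here.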
Published
2023-08-08
How to Cite
Blasius, J., & Sausen, L. (2023). Detecting Fabricated Interviews Using the Hamming Distance. Survey Research Methods, 17(2), 131–145. https://doi.org/10.18148/srm/2023.v17i2.7961
Section
Articles
License
Copyright (c) 2023 Jörg Blasius, Lukas Sausen
This work is licensed under a Creative Commons Attribution 4.0 International License.