Detecting Fabricated Interviews Using the Hamming Distance


  • Jörg Blasius
  • Lukas Sausen, University of Bonn



Keywords: fabricated data, string distances, PISA data


The research literature on survey methodology discusses interviewer effects and the prevention of data fabrication at length; however, it says little about detecting interviewer fabrication in published data, and even fewer papers examine data fabrication by employees of survey research organizations. Among these, Blasius and Thiessen (2015) show for the PISA 2009 principal data that employees of survey research organizations in some countries duplicated cases to generate data. While those authors focus on exact copies, more sophisticated fabrication techniques might involve duplicating whole cases and then changing a few entries. By calculating Hamming distances for the same data, we show that, particularly in some countries, large parts of the data have been duplicated, and that most of the duplicates were subsequently modified to a small degree.
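To make the idea concrete, the following is a minimal sketch of how Hamming distances can flag near-duplicate cases. It is not the authors' procedure: the function names `hamming` and `near_duplicates`, the `max_distance` threshold, and the toy response vectors are all assumptions chosen for illustration, and the sketch ignores practical issues such as missing values. The Hamming distance between two cases is simply the number of items on which their recorded answers differ, so an exact copy has distance 0 and a copy with one altered entry has distance 1.

```python
from itertools import combinations


def hamming(a, b):
    """Count the positions at which two equal-length response vectors differ."""
    if len(a) != len(b):
        raise ValueError("cases must have the same number of items")
    return sum(x != y for x, y in zip(a, b))


def near_duplicates(cases, max_distance):
    """Return (i, j, distance) for every pair of cases whose Hamming
    distance is at most max_distance (a hypothetical cut-off)."""
    flagged = []
    for (i, a), (j, b) in combinations(enumerate(cases), 2):
        d = hamming(a, b)
        if d <= max_distance:
            flagged.append((i, j, d))
    return flagged


# Toy data (invented): each tuple is one case, each entry one item response.
cases = [
    (1, 3, 2, 4, 4, 1, 2, 3),
    (1, 3, 2, 4, 4, 1, 2, 3),  # exact copy of case 0
    (1, 3, 2, 4, 2, 1, 2, 3),  # case 0 with a single entry changed
    (4, 1, 1, 2, 3, 3, 4, 1),  # differs from the others on every item
]

print(near_duplicates(cases, max_distance=1))
# [(0, 1, 0), (0, 2, 1), (1, 2, 1)]
```

A search restricted to exact copies corresponds to `max_distance=0` and would report only the pair (0, 1); allowing a small positive distance also surfaces the modified copy. The pairwise loop grows quadratically with the number of cases, which is acceptable for a sketch but may need a more efficient approach for large data sets.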




How to Cite

Blasius, J., & Sausen, L. (2023). Detecting Fabricated Interviews Using the Hamming Distance. Survey Research Methods, 17(2), 131–145.



