Predicting Reliable Respondents

23 Jul

Setting aside concerns about sampling, the quality of survey responses on popular survey platforms is abysmal (see here and here). Both insincere and inattentive respondents are at issue. A common strategy for identifying inattentive respondents is to use attention checks. However, many of these attention checks stick out like sore thumbs, so an experienced respondent can easily spot them. A parallel worry is that inexperienced respondents may be confused by them. To address these concerns, we need a new way to identify inattentive respondents.

One way to identify such respondents is to measure twice. More precisely, measure immutable or slowly changing traits, e.g., sex, education, etc., twice across closely spaced survey waves. Then, code cases where people switch answers on such traits across the waves as problematic. Next, use data from the first survey, both survey items (e.g., self-reports) and metadata (e.g., survey response time, metadata on IP addresses, etc.), to predict problematic switches using modern ML techniques that allow variable selection, like LASSO (space on a survey is at a premium, so we want as few predictors as possible). Assuming the estimated equation holds in new samples, future survey creators can use the variables identified by LASSO to identify likely inattentive respondents.
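The measure-twice idea can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the panel is a made-up, deterministic toy dataset; the field names (`seconds`, `mobile`, `sex_w1`, `sex_w2`, `edu_w1`, `edu_w2`) are hypothetical; and the L1-penalized logistic regression is fit with a hand-rolled proximal gradient loop to keep the example dependency-free. In practice, one would likely reach for a standard LASSO implementation and pick the penalty by cross-validation.

```python
import math

TRAITS = ["sex", "edu"]             # traits asked in both waves (hypothetical names)
PREDICTORS = ["seconds", "mobile"]  # wave-1 metadata used as predictors (hypothetical)


def toy_panel():
    """Build a small, deterministic two-wave panel purely for illustration."""
    rows = []
    # (wave-1 completion time in seconds, switches sex across waves?, respondents)
    for seconds, switches, n in [(60, True, 8), (60, False, 2),
                                 (300, True, 2), (300, False, 28)]:
        for k in range(n):
            sex_w1 = "F" if k % 4 < 2 else "M"
            sex_w2 = ("M" if sex_w1 == "F" else "F") if switches else sex_w1
            rows.append({"seconds": seconds, "mobile": k % 2,
                         "sex_w1": sex_w1, "sex_w2": sex_w2,
                         "edu_w1": "BA", "edu_w2": "BA"})
    return rows


def switched(row):
    """Return 1 if any slowly changing trait differs between the two waves."""
    return int(any(row[f"{t}_w1"] != row[f"{t}_w2"] for t in TRAITS))


def standardize(rows, names):
    """Z-score each predictor so the L1 penalty treats them on a common scale."""
    cols = []
    for name in names:
        values = [float(r[name]) for r in rows]
        mean = sum(values) / len(values)
        sd = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
        cols.append([(v - mean) / sd for v in values])
    return [list(x) for x in zip(*cols)]


def soft_threshold(value, cutoff):
    """Shrink toward zero; anything within the cutoff becomes exactly zero."""
    if value > cutoff:
        return value - cutoff
    if value < -cutoff:
        return value + cutoff
    return 0.0


def fit_lasso_logit(X, y, lam=0.05, step=1.0, iters=2000):
    """L1-penalized logistic regression via proximal gradient descent."""
    n, p = len(X), len(X[0])
    intercept, weights = 0.0, [0.0] * p
    for _ in range(iters):
        # Residuals (predicted probability minus outcome) at the current fit.
        resid = []
        for xi, yi in zip(X, y):
            z = intercept + sum(w * v for w, v in zip(weights, xi))
            resid.append(1.0 / (1.0 + math.exp(-z)) - yi)
        intercept -= step * sum(resid) / n  # the intercept is not penalized
        for j in range(p):
            grad = sum(r * xi[j] for r, xi in zip(resid, X)) / n
            weights[j] = soft_threshold(weights[j] - step * grad, step * lam)
    return intercept, weights


rows = toy_panel()
y = [switched(r) for r in rows]      # 1 = problematic switch across waves
X = standardize(rows, PREDICTORS)    # wave-1 predictors only
intercept, weights = fit_lasso_logit(X, y)
coefs = dict(zip(PREDICTORS, weights))
selected = [name for name, w in coefs.items() if w != 0.0]
print("flagged switchers:", sum(y), "of", len(y))
print("selected predictors:", selected)
```

In this toy panel, fast completion times are built to go with switching while `mobile` is unrelated to it by construction, so the penalty should keep `seconds` and zero out `mobile`. The surviving variables are the ones a future survey would need to collect to flag likely inattentive respondents.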