Pairwise matching before randomization reduces standard errors (see here, for instance). The strategy is generally used to create balanced control and treatment groups from the available observations. But we can also use the insight for optimal sample recruitment, especially when we have a large panel of respondents with baseline data, as YouGov does. The algorithm is similar to what YouGov already uses, except that it is tailored to experiments:
- Start with a random sample.
- Come up with optimal pairs based on whatever criteria you have chosen.
- Sort the pairs by distance in descending order, so that the pairs with the largest distance are at the top.
- Starting from the top, pick one of the two points in the pair at random and find its best match in the rest of the panel file. (If you have multiple equivalent matches, pick one at random.)
- Proceed as far down the list as needed.
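
The steps above can be sketched in Python. This is a minimal illustration under several assumptions that are not in the original: the baseline data is a numeric matrix `X` with one row per panelist, distance is Euclidean, greedy nearest-neighbor pairing stands in for a truly optimal pairing, "as far down as needed" is expressed as a fixed count `n_fix`, and a pair is only replaced when the panel match is strictly closer. The function names `greedy_pairs` and `recruit_pairs` are hypothetical.

```python
import numpy as np


def greedy_pairs(X, idx):
    """Greedily pair the rows of X listed in idx (a stand-in for optimal pairing)."""
    remaining = list(idx)
    pairs = []
    while len(remaining) >= 2:
        a = remaining.pop(0)
        d = np.linalg.norm(X[remaining] - X[a], axis=1)
        k = int(np.argmin(d))
        pairs.append((a, remaining.pop(k), float(d[k])))
    return pairs


def recruit_pairs(X, n_sample, n_fix, seed=0):
    """Return (i, j, distance) pairs after repairing the n_fix worst pairs."""
    rng = np.random.default_rng(seed)
    # Step 1: start with a random sample of panelists.
    sample = rng.choice(len(X), size=n_sample, replace=False).tolist()
    # Step 2: pair within the sample (greedy here, by assumption).
    pairs = greedy_pairs(X, sample)
    # Step 3: largest distances first.
    pairs.sort(key=lambda p: p[2], reverse=True)
    # Panelists not in the sample are available as replacements.
    pool = sorted(set(range(len(X))) - set(sample))
    out = []
    for rank, (a, b, dist) in enumerate(pairs):
        if rank < n_fix and pool:
            # Step 4: keep one member at random, look for a better partner.
            keep = a if rng.random() < 0.5 else b
            d = np.linalg.norm(X[pool] - X[keep], axis=1)
            ties = np.flatnonzero(d == d.min())
            j = int(rng.choice(ties))  # break ties at random
            if d[j] < dist:  # assumption: replace only if strictly closer
                out.append((keep, pool.pop(j), float(d[j])))
                continue
        out.append((a, b, dist))
    return out
```

For example, `recruit_pairs(X, n_sample=40, n_fix=5)` returns 20 pairs in which up to five of the worst-matched pairs have had one member swapped for a closer panelist. Within each pair, treatment would then be assigned at random.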
Technically, we can go straight from step 1 to step 4 if we draw a random sample that is half the size we want for the experiment. We then just need to find the best match in the rest of the panel for each sampled respondent.
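
The shortcut can be sketched the same way. As before, the numeric matrix `X`, the Euclidean distance, and the function name `match_half_sample` are assumptions made for illustration. Matches are drawn without replacement and in sample order, so the result depends on that order and is not a globally optimal assignment.

```python
import numpy as np


def match_half_sample(X, n_pairs, seed=0):
    """Draw n_pairs panelists at random and match each to its nearest unused panelist."""
    rng = np.random.default_rng(seed)
    # The random half: one member of each eventual pair.
    seeds = rng.choice(len(X), size=n_pairs, replace=False).tolist()
    # Everyone else in the panel is a candidate match.
    pool = sorted(set(range(len(X))) - set(seeds))
    pairs = []
    for s in seeds:
        d = np.linalg.norm(X[pool] - X[s], axis=1)
        ties = np.flatnonzero(d == d.min())
        j = int(rng.choice(ties))  # break ties at random
        # pop() removes the match so it cannot be reused.
        pairs.append((s, pool.pop(j), float(d[j])))
    return pairs
```

This needs a panel with at least `2 * n_pairs` rows; the larger the panel relative to the sample, the closer the matches are likely to be.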