Every pair isolates one design variable.
You build the screenshot library and assemble the pairs yourself — the tool doesn't curate. Group them into sections by what they test: typography, colour palette, photography style, layout density. Hold everything constant except the one dimension you're probing.