Game-Based Assessments: Do They Measure Cognitive Ability or Gaming Acceptance?

The Promise

Game-based assessments have attracted considerable enthusiasm in talent acquisition circles. The appeal is intuitive: if you can measure cognitive ability through an engaging game rather than a dry paper-and-pencil test, you might improve the candidate experience, reduce test anxiety, attract younger applicants, and signal an innovative employer brand, all while gathering valid selection data. Ohlms and colleagues (2024) put this promise to an empirical test, examining both whether a game-based assessment measures what it claims to measure and how applicants react to it. The results are instructive on both counts.

The Study

The researchers developed a game-based assessment designed to measure verbal, numerical, and figural cognitive ability, the same three components measured by the conventional paper-and-pencil ability test that served as the comparison. One hundred and eighty-three participants completed both assessments, along with measures of applicant reactions and personality (Ohlms et al., 2024). Because the same individuals completed both tools, this within-person design allows scores and reactions to be compared directly across the two formats.

The Validity Finding

The game-based assessment showed a strong positive correlation of .51 with the paper-and-pencil test overall (Ohlms et al., 2024). This is a reasonably encouraging validity finding. A correlation of .51 between two measures designed to assess the same construct suggests meaningful convergence, providing evidence that the game-based assessment is capturing something genuinely related to cognitive ability rather than simply measuring how well people play games.

This is not a trivial finding. Many game-based assessments in commercial use have limited published validity evidence, and the field has been justifiably criticised for moving faster than the psychometric research supports. A correlation of .51 with an established cognitive ability measure is a reasonable starting point, though it also implies the two tools are not measuring identical things, leaving open the question of what accounts for the remaining variance.
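To make the "remaining variance" point concrete, the sketch below works through the arithmetic using only the two figures reported above (r = .51, N = 183). It is an illustration, not the authors' analysis: the function names are hypothetical, the confidence interval is not taken from the paper, and the calculation assumes that all 183 participants contributed to the correlation and that the Fisher z approximation is appropriate for these scores.

```python
import math


def shared_variance(r):
    """Proportion of variance two measures share, i.e. r squared."""
    return r ** 2


def fisher_ci(r, n, z_crit=1.96):
    """Approximate 95% confidence interval for a Pearson correlation.

    Uses the Fisher z transformation: z = atanh(r), with standard
    error 1 / sqrt(n - 3), then transforms the bounds back with tanh.
    """
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# Figures reported in the text (Ohlms et al., 2024).
r, n = 0.51, 183

shared = shared_variance(r)
lower, upper = fisher_ci(r, n)

print(f"Shared variance:   {shared:.2f}")
print(f"Unshared variance: {1 - shared:.2f}")
print(f"Approx. 95% CI for r: [{lower:.2f}, {upper:.2f}]")
```

Under these assumptions the two tools share about 26% of their variance, leaving about 74% unshared, and the interval for the correlation runs from roughly .39 to .61. The unshared portion should not be read entirely as a difference in what the tools measure: some of it reflects measurement error in both instruments, which lowers any observed correlation.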

The Applicant Reaction Finding

Here the results take a more surprising turn. Applicant reactions toward the game-based assessment were consistently worse than reactions toward the paper-and-pencil test (Ohlms et al., 2024). This directly contradicts one of the central assumptions driving enthusiasm for game-based assessments: that candidates will find them more engaging and fairer than traditional tests, and will view them more favourably.

The finding suggests that for at least some applicants, a game format in a selection context does not feel more engaging. It may feel less serious, less professional, or simply unfamiliar in a way that generates discomfort rather than enthusiasm. The selection context matters: what is enjoyable in a leisure context does not automatically translate into what feels appropriate or respectful in a high-stakes professional evaluation.

Who Reacts Positively and Who Does Not

The study reveals two important moderating factors: gender and video game experience. Male participants and those with more video game experience held considerably more positive perceptions of the game-based assessment than female participants and those with less gaming experience (Ohlms et al., 2024). This finding matters in practice, and it raises genuine fairness concerns.

If a game-based assessment is received more favourably by people who already play video games, and less favourably by those who do not, then applicant reactions to the tool will be systematically shaped by prior gaming exposure rather than by the quality of the assessment itself. Given that gaming participation varies considerably across demographic groups, including by gender, age, and cultural background, this differential reaction pattern means game-based assessments may not deliver the universally improved candidate experience their proponents promise. They may instead improve the experience for some groups while worsening it for others (Ohlms et al., 2024).

What This Means for Practice

The study presents a mixed picture that resists simple conclusions in either direction. Game-based assessments can measure cognitive ability with reasonable validity; this one did. But the assumption that they will be better received than conventional tests is not supported here, and the differential reactions by gender and gaming experience introduce fairness considerations that organisations should take seriously before deploying such tools at scale.

The broader implication applies across many innovations in assessment: the case for a new tool needs to rest on evidence about what it actually measures and how it is actually received, not on the intuitive appeal of the concept. Engagement and validity are both necessary conditions for a good selection tool, and this research does not support the assumption that a more game-like format automatically delivers both.

Reference

Ohlms, M. L., Melchers, K. G., & Kanning, U. P. (2024). Can we playfully measure cognitive ability? Construct-related validity and applicant reactions. International Journal of Selection and Assessment, 32(1), 91–107. https://doi.org/10.1111/ijsa.12450