Personality assessment has long relied on self-report questionnaires: people answer a series of questions about themselves, and their responses are scored against established frameworks like the Big Five. But self-report has well-known limitations, including social desirability bias and limited insight into one’s own traits. Fan and colleagues (2023) asked whether an AI chatbot could do something more indirect: infer personality from the way people naturally write and speak, without asking them directly.

How It Worked

The study involved 1,444 undergraduate students who completed a standard Big Five personality measure and then engaged in a 20- to 30-minute conversation with an AI chatbot (Fan et al., 2023). The chatbot analysed textual features of their responses and used machine learning to generate personality scores. The researchers then put these machine-inferred scores through a comprehensive battery of psychometric tests, examining reliability, factor structure, convergent and discriminant validity, and criterion-related validity.
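The general idea of scoring a trait from conversational text can be sketched as below. To be clear, the study's actual features, model, and weights are not reported here; the `text_features` helper, the word lists, and the `WEIGHTS` values are all invented for illustration only.

```python
import re

def text_features(text):
    # Crude, illustrative features; real systems use far richer ones.
    words = re.findall(r"[a-z']+", text.lower())
    n = max(len(words), 1)
    return {
        "word_count": len(words),
        "i_rate": sum(w in ("i", "me", "my") for w in words) / n,
        "positive_rate": sum(w in ("great", "love", "happy", "fun") for w in words) / n,
    }

# Hypothetical linear model mapping features to one trait score
# on a 1-5 scale (weights are made up, not from the study).
WEIGHTS = {"word_count": 0.01, "i_rate": -1.5, "positive_rate": 4.0}
BIAS = 3.0  # centre near the scale midpoint

def infer_extraversion(text):
    f = text_features(text)
    return BIAS + sum(WEIGHTS[k] * f[k] for k in WEIGHTS)

print(infer_extraversion("I love meeting new people, it is great fun"))
```

The point of the sketch is only the shape of the pipeline: free text goes in, numeric features come out, and a trained model turns those features into trait scores that can then be evaluated psychometrically.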

What the Results Showed

The picture that emerged was mixed but genuinely promising in parts. The machine-inferred scores showed acceptable reliability and produced a factor structure comparable to traditional self-report measures, suggesting the AI was picking up on something real and coherent (Fan et al., 2023).

Convergent validity was good: machine scores correlated meaningfully with self-reported scores on the same traits, with an average convergent correlation of .48 (Fan et al., 2023). This is a respectable figure, indicating the two approaches are measuring overlapping constructs.

Discriminant validity was weaker. The machine-inferred trait scores correlated with one another more strongly than psychometric standards recommend, meaning the AI struggled to cleanly separate the five traits (Fan et al., 2023). Criterion-related validity, the extent to which the scores predicted real-world outcomes like academic performance, was also low.
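The convergent and discriminant patterns described above can be illustrated with a toy multitrait-multimethod check: a convergent correlation compares machine and self-report scores on the *same* trait (higher is better), while a discriminant check correlates *different* traits within one method (lower is better). The scores below are fabricated for illustration; only the .48 average convergent correlation comes from the study.

```python
from statistics import mean, stdev

def pearson(x, y):
    # Pearson correlation between two equal-length score lists.
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)
    return cov / (stdev(x) * stdev(y))

# Fabricated 1-5 trait scores for six participants.
self_extraversion    = [3.2, 4.1, 2.5, 3.8, 4.5, 2.9]
machine_extraversion = [3.0, 4.3, 2.7, 3.5, 4.2, 3.1]
machine_conscient    = [3.8, 3.6, 3.3, 4.1, 3.4, 3.1]

# Convergent validity: same trait, different methods (want this high).
convergent = pearson(self_extraversion, machine_extraversion)

# Discriminant check: different traits, same method (want this near zero).
discriminant = pearson(machine_extraversion, machine_conscient)

print(f"convergent r = {convergent:.2f}, heterotrait r = {discriminant:.2f}")
```

In this toy data the convergent correlation is high and the heterotrait correlation is low; the study's finding was that for the machine scores the heterotrait correlations were higher than this ideal pattern allows.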

On the other hand, machine-inferred scores showed incremental validity over self-report in some analyses, meaning they added predictive information beyond what traditional questionnaires captured (Fan et al., 2023). This suggests the two approaches are not simply redundant.

Why It Matters

AI-based personality inference is already moving into applied contexts, including hiring and selection. This study provides one of the most rigorous psychometric evaluations of such a system to date, and its message is nuanced: the technology shows real promise but also real limitations. Strong convergent validity is encouraging; weak discriminant and criterion validity are not minor concerns in high-stakes applications (Fan et al., 2023). For organisations considering AI-based personality tools, this research is a useful reminder that psychometric rigour matters, and that novelty is not the same as validity.

Reference

Fan, J., Sun, T., Liu, J., Zhao, T., Zhang, B., Chen, Z., Glorioso, M., & Hack, E. (2023). How well can an AI chatbot infer personality? Examining psychometric properties of machine-inferred personality scores. Journal of Applied Psychology. Advance online publication. https://psycnet.apa.org/record/2023-43379-001