Last month, I discussed the issue of impartiality with reference to universities and research. This month, I want to look at the myth of impartiality from the perspective of the users and suppliers of psychometrics. With respect to users, my focus is HR professionals and recruiters. The suppliers I refer to are the plethora of assessment suppliers from around the world.
Much of the credibility of psychometric tests is assumed through their application. The general public’s interaction with psychometric assessment comes primarily through the job application process. The corollary is that those who are responsible for those processes must be skilled practitioners in their field and have a highly justifiable reason for the application of psychometrics and the application of a given assessment. This gives rise to the myth of impartiality in reference to practitioners.
The practitioner is often reliant on test providers as their source of information on psychometrics. However, this is akin to asking a financial advisor, who is selling a particular investment, to describe the principles of investment to you! It is important to recall that those who are psychologically trained are subject to issues of impartiality (as discussed in last month’s blog post).
Research has indicated that practitioners’ beliefs about the predictive power of tests do not match reality. While this may change over time, practitioners who lack the skills needed to read the statistics and understand how the tools are applied are unaware of their own blind spots when it comes to testing.
Examples I have witnessed include:
- Assuming that a correlation can be read as a percentage (%). For example, a common misconception is that a correlation of 0.3 between a conscientiousness scale and job performance accounts for 30% of the variability in performance, when the figure is 9% (0.3 squared).
- Talking about the validity of a test when it is not so much a test that is ‘valid’ as the scales inside the test that correlate with a given outcome.
- Not understanding that the relationship behind the correlation they cite as evidence for the value of the test is not necessarily linear. According to the research, the extreme ends of a scale are the most useful for predictive purposes, yet most practitioners warn of the problems with extremes. The contradiction between application and research is clear.
- Assuming a quoted validity is applicable to their organisation. Validity varies greatly between jobs, organisations, and time. These are only 3 variables. To talk of using a given validity as ‘applicable to your organisation’ is often a big leap in logic.
- Treating validity as a number on a page. Validity is ultimately a system of interacting parts that together produce an outcome, and reducing it to a single number makes this commonly relied-upon concept near redundant.
- While many practitioners ask about the size of a norm group, very few ask about its makeup.
- Those who do ask about the makeup of the norm group often fail to ask about the spread of the data.
- A classic example is the request for industry-based norms. People fail to understand that the request for industry-based norms has inherent problems such as the restriction of range that comes by taking a more homogenous sample. This is highly apparent when looking at industry norms for cognitive ability.
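Two of the statistical points in the list above can be illustrated with a short simulation. This is only a rough sketch under stated assumptions: the function names, the "true" correlation of 0.5, and the rule of keeping the top 20% of scorers to mimic a homogeneous industry sample are all hypothetical choices made for illustration, not figures from any real norm group.

```python
import random


def variance_explained(r):
    """Share of variance accounted for by a correlation r (r squared)."""
    return r ** 2


def pearson_r(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_restriction(true_r=0.5, n=20000, keep_top=0.2, seed=1):
    """Compare the correlation in a full sample with a restricted one.

    All parameters are hypothetical. Test scores and performance are
    simulated so that they correlate at roughly true_r; the restricted
    sample keeps only the top keep_top share of test scorers.
    """
    rng = random.Random(seed)
    scores = [rng.gauss(0, 1) for _ in range(n)]
    noise_sd = (1 - true_r ** 2) ** 0.5
    performance = [true_r * s + rng.gauss(0, noise_sd) for s in scores]

    full_r = pearson_r(scores, performance)

    # Keep only high scorers, as a homogeneous industry sample might.
    cutoff = sorted(scores)[int(n * (1 - keep_top))]
    kept = [(s, p) for s, p in zip(scores, performance) if s >= cutoff]
    restricted_r = pearson_r([s for s, _ in kept], [p for _, p in kept])
    return full_r, restricted_r


if __name__ == "__main__":
    print(f"r = 0.3 explains {variance_explained(0.3):.0%} of the variance")
    full_r, restricted_r = simulate_restriction()
    print(f"full sample r = {full_r:.2f}, restricted sample r = {restricted_r:.2f}")
```

In the simulated data the underlying relationship is the same in both samples; the correlation is lower in the restricted group only because the spread of scores is narrower. That is why the spread of a norm group, and not just its size or makeup, is worth asking about.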
A practitioner may be influenced by a product’s branding rather than its substance if critical evaluation tools are not used to evaluate the assessment more fully. If a tool is a ‘leadership tool’, then it is presumed to measure what’s needed for leadership. If the assessment claims to ‘predict psychopathic behaviour at work’, then it is assumed that it must do so. The practitioner is convinced that the right tool has been found for the job, and the brand may even justify its high cost.
Rather than being impartial, practitioners tend to use what they are comfortable with and endorse it accordingly. Often, they don’t have full knowledge of the options available to them (reference article), and testing may become a tick-box service that is transactional rather than strategic in nature. Many HR professionals are so busy with a multitude of HR concerns that they do not have the time to turn psychometrics into a strategic solution, nor do they investigate validity in a more sophisticated way. Ironically, this then elevates the aura of the psychometric tool, and the myth of impartiality continues.
The solution to this problem is relatively simple. First, HR professionals who use assessments need to attend some basic training that covers the myths and realities of psychometric testing. I’m proud to say that OPRA has been running these courses, together with thought pieces like this, since the late 1990s. The solution, however, is not necessarily to attend an OPRA course but to attend any course that takes a critical look at the application of psychometrics. Second, understand the limitations of testing and opt for a simple broad-brush measure of personality and cognitive ability that is cost-effective for the organisation, without giving the test more credibility than it is worth. Finally, adopt a more critical outlook on testing that enables one to truly be impartial.
Psychometric Test Providers
The final area of impartiality I want to look at is the test providers themselves; it is only fitting that I close with a critical review of the industry I’m entrenched in. The reality is that any claims to impartiality by someone who is selling a solution should be regarded with caution. Many people do not realise that the testing industry is increasingly lucrative as demonstrated by recent acquisitions. For example, in recent times we have seen the $660 million acquisition of SHL by CEB or Wiley’s purchase of Inscape, and more recently Profiles International.
It would be naïve to think that such businesses could be truly impartial. The fact is that testing companies build and hold a position much like other industries such as soft drink or food. The result is that innovation ceases and marketing takes over.
No technology of which we are aware- computers, telecommunications, televisions, and so on- has shown the kind of ideational stagnation that has characterized the testing industry. Why? Because in other industries, those who do not innovate do not survive. In the testing industry, the opposite appears to be the case. Like Rocky I, Rocky II, Rocky III, and so on, the testing industry provides minor cosmetic successive variants of the same product where only the numbers after the names substantially change. These variants survive because psychologists buy the tests and then loyally defend them (see preceding nine commentaries, this issue).
Sternberg, R. J., & Williams, W. M. (1997). Does the Graduate Record Examination predict meaningful success in the graduate training of psychologists? A case study. American Psychologist, 52,
The solution to this problem is not innovation for innovation’s sake. This tends to happen when we try to achieve greater levels of measurement accuracy and lose sight of what we are trying to achieve (such as predict outcomes). As an example, the belief that IRT-based tests will provide us with greater validity does not appear to be supported by recent studies (see article one and article two).
Moreover, we can contrast increased measurement sophistication with moves toward the likes of single-item scales, and the results are surprisingly equivalent (cf. Samuel, D.B., Mullins-Sweatt, S.N., & Widiger, T.A. (2013). An investigation of the factor structure and convergent and discriminant validity of the Five-Factor Model Rating Form. Assessment, 20(1), 24-35).
There is simply a limit to how much an assessment can capture the complexity of human behaviour, which is itself subject to free will. It is no more complex than this. Rather than highlighting the magical uniqueness of their test, psychometric test providers need to be upfront about the limitations of their assessments. No one has access to a crystal ball, and claims that one exists are fundamentally wrong.
The future for testing companies lies in acknowledging the limitations of their tests and recognising that they are simply part of an HR ecosystem. It is within that system that innovation can reside. The focus then moves away from trying to pretend that a given test is significantly better than others and instead focuses on how the test will add value through such things as:
- Integration with an applicant tracking system to aid screening
- Integration with learning and development modules to aid learning
- Integration with onboarding systems to ensure quick transition into work.
There is a range of solid, respectable tests available, and their similarities are far greater than their differences. Tests should meet minimum standards, but once these standards are met, the myth of impartiality is only addressed by accepting that there is a collection of quality tools of equivalent predictive power and that the ecosystem, not the assessment, should be the focus.
I realise I’m still a myth behind in the series and will follow up with a short piece that provides more support for the use of psychometrics in the industry, addressing the myth that psychometric tests have little value for employment selection.