I recommend you do not use it.
After using the Implicit Relational Assessment Procedure (IRAP) in many publications during my PhD and first postdoc, I began to scrutinize many of the basic claims made about the task. I believe the evidence suggests that the IRAP is not fit for purpose, either as a measure of relational responding or as a measure of implicit attitudes. Due to both properties of the task and the research practices in the community, much of the published IRAP literature is very likely non-replicable and non-credible. Below are some of the key publications highlighting these concerns. I recommend that people do not use the IRAP in their research and instead pursue other lines of research.
The IRAP’s criterion validity is greatly overstated. Vahey et al. (2015) reported a meta-analysis of the IRAP’s clinical criterion validity and has been frequently cited for sample size determination in subsequent IRAP publications, on the basis of its claim that N = 37 is often sufficient. This article demonstrates that there are serious errors and biases in Vahey et al. (2015) at almost every stage of data extraction and analysis. I made the authors of Vahey et al. (2015) aware of some of these errors in 2019, and of more in 2025, but they have declined to correct the original article.
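For context, the N = 37 figure falls out of a standard power analysis when the meta-analytic effect size is taken at face value. Below is a minimal sketch, assuming r ≈ .45 (the headline estimate from Vahey et al., 2015), a two-tailed α = .05, and 80% power; if that estimate is inflated by the errors described above, the required sample size is underestimated accordingly.

```python
from math import atanh, ceil
from scipy.stats import norm

def required_n_for_correlation(r, alpha=0.05, power=0.80):
    """Approximate N needed to detect a correlation r (two-tailed test),
    via the Fisher z transformation, where SE(z) = 1 / sqrt(N - 3)."""
    z_alpha = norm.ppf(1 - alpha / 2)  # two-tailed critical value
    z_beta = norm.ppf(power)          # quantile for the desired power
    z_r = atanh(r)                    # Fisher z of the target effect size
    return ceil(((z_alpha + z_beta) / z_r) ** 2 + 3)

print(required_n_for_correlation(0.45))  # -> 37
print(required_n_for_correlation(0.20))  # -> 194, if the true effect is more modest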
This file-drawer meta-analysis of published and unpublished results (N = 1839) suggests that the IRAP’s internal consistency is poor (α = .49) and its test-retest reliability is very poor (ICC2 = .10). If scores are calculated for individual trial types, as many IRAP proponents argue they should be, both forms of reliability are very poor (α = .27, ICC2 = .18). Low reliability reduces statistical power and replicability, suggesting that many published results may be false positives.
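Spearman’s classic attenuation formula makes the consequences concrete: the observable correlation between two measures is bounded by their reliabilities, r_obs = r_true × √(rel_X × rel_Y). A short sketch using the reliability estimates above; the criterion reliability of .80 is a hypothetical value chosen for illustration.

```python
from math import sqrt

def attenuated_r(r_true, rel_x, rel_y):
    """Spearman's attenuation formula: the correlation observable between
    two measures, given the true-score correlation and their reliabilities."""
    return r_true * sqrt(rel_x * rel_y)

# Hard ceiling on any observed IRAP-criterion correlation at alpha = .49,
# even with a perfect true correlation and a perfectly reliable criterion:
print(round(attenuated_r(1.00, 0.49, 1.00), 2))  # 0.70
# A true r = .45 against a criterion of (hypothetical) reliability .80:
print(round(attenuated_r(0.45, 0.49, 0.80), 2))  # 0.28
# Trial-type scores (alpha = .27) attenuate observed effects further still:
print(round(attenuated_r(0.45, 0.27, 0.80), 2))  # 0.21
```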
Sample sizes in IRAP studies are extremely low – lower than in social psychology prior to the start of the replication crisis – and have not risen meaningfully over time. Very small samples imply low statistical power and contribute to poor replicability.
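To illustrate what such sample sizes buy in power terms, here is a rough sketch using the same Fisher z approximation as above; the n = 30 and the candidate effect sizes are illustrative assumptions, not estimates drawn from the IRAP literature.

```python
from math import atanh, sqrt
from scipy.stats import norm

def power_for_correlation(r, n, alpha=0.05):
    """Approximate two-tailed power to detect a correlation r at sample
    size n, via the Fisher z approximation."""
    z_crit = norm.ppf(1 - alpha / 2)
    return norm.cdf(atanh(r) * sqrt(n - 3) - z_crit)

for r in (0.20, 0.30, 0.45):
    print(f"r = {r:.2f}, n = 30: power = {power_for_correlation(r, 30):.2f}")
# -> 0.18, 0.36, 0.71: well short of the conventional 0.80 in every case.
```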
There is consensus among IRAP researchers that IRAP effects are biased in some way. O’Shea et al. (2016) called this a positive framing bias, and Finn, Barnes-Holmes, et al. (2016, 2018) refer to the generic pattern as the single trial type dominance effect. While there is disagreement about its cause, the presence, replicability, and generalizability of these biases in IRAP effects are apparently uncontroversial. However, the necessary implications of this confound are not fully appreciated. This article uses a large open data set (N = 753) of IRAPs in multiple domains to show (a) that the majority of variance in IRAP effects is attributable to the generic pattern rather than to the stimulus domain being assessed by the IRAP, and (b) that this pattern is observed even when nonsense stimuli are used. Given that the conclusions of many IRAP studies rest on the presence of non-zero IRAP effects, and that these effects are observed regardless of the stimuli employed in the task, many conclusions in the IRAP literature are likely invalid or erroneous, being merely the result of a statistical artifact.
A key rationale for using the IRAP over other, more reliable and valid measures such as the IAT is that its four trial types are supposedly functionally independent of one another, yet little evidence has been presented for this claim. This analysis of 1464 participants across 35 IRAPs in 16 different domains suggests that the IRAP trial types are not independent and are typically correlated with one another.
This article was my bachelor's thesis project, and the results reported in it are heavily p-hacked, with the knowledge of both authors (although without a full understanding, at the time, that p-hacking was problematic). Specifically, we used optional stopping and flexibility in data processing to obtain a significant result. At least one unpublished conceptual replication of this study using the RRT, a similar implicit measure, found null results. I no longer believe the claims in this article are credible.
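To show why optional stopping alone undermines a result like this, here is a minimal simulation under the null hypothesis; the starting n, batch size, and maximum n are hypothetical values for illustration, not those used in the original study.

```python
import numpy as np
from scipy.stats import ttest_1samp

rng = np.random.default_rng(42)

def significant_with_optional_stopping(n_start=20, n_max=100, step=5, alpha=0.05):
    """One simulated 'study' with a true effect of exactly zero: re-test
    after each new batch of participants, stopping as soon as p < alpha."""
    data = rng.normal(size=n_start)
    while True:
        if ttest_1samp(data, 0).pvalue < alpha:
            return True  # a 'significant' result reached by peeking
        if len(data) >= n_max:
            return False
        data = np.concatenate([data, rng.normal(size=step)])

rate = np.mean([significant_with_optional_stopping() for _ in range(2000)])
print(f"False-positive rate with optional stopping: {rate:.2f}")
# Substantially above the nominal .05 of a single fixed-N test.
```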
A blog post by Chad Drake detailing his own negative experiences in dealing with the IRAP and its research culture.