Dropping the Anchor: The Use of Plausibility in Credibility Assessments

by Enide Maegherman, Tanja S. van Veldhuizen and Robert Horselenberg

Decisions in asylum seeking procedures tend to be based on a credibility assessment. This means that the story on which the asylum claim is based is probed. Four indicators are typically used to assess credibility, namely internal consistency, external consistency, sufficiency of detail and specificity, and plausibility. The relation between these indicators, and the problematic lack of understanding of the plausibility concept have been insufficiently addressed in previous research. According to the findings in this study, none of the indicators seem to be rated objectively or independently of each other. There appears to be an unconscious problem of subjectivity in the credibility assessment. This issue could arise from the use of the ill-defined plausibility indicator, or could be due to another factor influencing all four indicators. The limitations of this study point to the need for further research to elucidate the unidentified influences on the indicators used in the credibility assessment in asylum procedures.

Introduction

“Owing to the special situation in which asylum seekers often find themselves, it is frequently necessary to give them the benefit of the doubt when it comes to assessing the credibility of their statements and the documents submitted in support thereof” (R.C. v. Sweden: §50)

This position of the European Court of Human Rights in R.C. v. Sweden illustrates the challenging task that European immigration authorities face when processing requests for asylum. R.C. claimed to have been detained and tortured after being arrested during a student demonstration criticising the lack of freedom in Iran. He was allegedly sentenced without a formal trial; instead, his conviction was upheld in a religious trial. The special situation to which the court refers is the general lack of documentation to support the applicant’s story (Consterdine et al. 2012). One can imagine the difficulties for the Swedish authorities in assessing objectively whether these assertions are true in the absence of an arrest warrant, conviction or other documentary evidence. Because of this lack of documentary evidence, authorities must often rely on an assessment of the credibility of the asylum seeker’s claims about identity, origin, and persecution to determine whether the applicant is genuinely in need of international protection (UNHCR 2013; Gyulai et al. 2013).

Realising that this task presents a challenge for European immigration authorities, several organisations have taken steps towards more structured, objective, consistent and protection-oriented credibility assessment practices in asylum procedures (see e.g. EASO 2017; European Commission 2017; European Parliament 1999; and UNHCR 2013). As part of this structured and more objective approach, immigration authorities typically use similar indicators in their credibility assessment of asylum claims (UNHCR 2013). Using the same gauges would make the assessment less dependent on the individual decision maker. These indicators are the focus of this study, whose primary aim is to examine to what extent they can be used objectively and independently in the evaluation of oral statements, with a special focus on plausibility. Specifically, we study whether plausibility is a unique concept or whether it is related to other concepts used in credibility assessments.

The four common credibility indicators are internal consistency, external consistency, sufficiency of detail and specificity, and plausibility (UNHCR 2013). Determining consistency means considering the extent to which the information given by the applicant is in agreement. This includes information within a single statement and between statements (Vredeveldt et al. 2014). In asylum cases, internal consistency can be defined as “a lack of discrepancies, contradictions, and variations in the material facts asserted by the applicant” (UNHCR 2013). External consistency requires a comparison of the applicant’s statement with information from other sources, especially known facts about the applicant’s country of origin (Gyulai et al. 2013). Thus, external consistency refers to the accuracy of the statements in light of other evidence. External consistency can also refer to consistency with other witnesses. For the criterion of sufficiency of detail and specificity, the decision-maker needs to determine whether the statement provided by the applicant contains the level and nature of detail that would reasonably be expected from someone describing a genuine personal experience, in the individual and contextual circumstances described by the applicant (UNHCR 2013).

The fourth and final indicator is plausibility. Whereas the other three criteria mentioned have been clearly defined, UNHCR (2013: 176) acknowledges that “[i]n the context of the credibility assessment, the intended meaning of the term ‘plausible’ lacks clarity”. One possible definition, offered by the European Asylum Curriculum, is that the facts alleged by the applicant should be “believable and consistent” (EASO 2014: Section 3.2). This definition renders plausibility a redundant indicator, as it would be a combination of the other indicators. An alternative definition, in which plausibility is a unique indicator, is the degree to which the events claimed by the applicant are “reasonable or likely” (UNHCR 2013). In light of that definition, a statement is implausible if it is beyond human experience of possible occurrence.

Both judges and policy makers seem to endorse the latter definition, in which plausibility is a unique concept. For example, the International Association for Refugee Law Judges explicitly separates plausibility from the other indicators by stating that “decisions based solely on implausibility are likely to be less persuasive than those based on a wider range of criteria” (Mackey and Barnes 2013: 35). Moreover, in the UK policy brief on credibility assessments, plausibility is defined as follows: “The plausibility of an account is assessed on the basis of its apparent likelihood or truthfulness in the context of the general country information and/or the claimant’s own evidence about what happened to him or her” (Home Office 2015: section 5.6.4.). The Dutch Immigration and Naturalisation Service seems to define plausibility as the degree to which something is physically possible. It views something as implausible when the described order of events or actions is impossible according to reasonable standards (Kok 2016). These definitions contain a subjective element. The UK definition speaks of an “apparent likelihood” and the Dutch definition of “impossible according to reasonable standards”. For both definitions, questions arise about who decides what is reasonable or likely. The same problem applies in the Swedish context, in which plausibility is equated with realism; plausibility is operationalised as the extent to which a statement is “more or less realistic” (Granhag, Landström and Nordin 2017: 51). These elements allow decision-makers to base their plausibility judgment on their own frame of reference, prior knowledge and experiences (Granhag, Landström and Nordin 2017: 51-52; UNHCR 2013).

This does not mean that a plausibility judgment is always subjective when using these definitions. If an asylum seeker claims to have jumped over a 10-meter wall, one can assume that this is implausible. However, what we deem unlikely or physically impossible is sometimes also influenced by our prior experiences and knowledge. Take for example an asylum seeker who claims to have fled prison by jumping over a wall. We may implicitly assume that this wall was 10 meters high, because we imagine a Western prison system. Before concluding that this is implausible, however, one should verify what the alleged wall looked like (Van Veldhuizen 2017: 62-64). In line with this example, training manuals and policy documents warn against the subjective element in plausibility definitions, as plausibility in this sense is a culturally and personally determined concept (Gyulai et al. 2013; Mackey and Barnes 2013; UNHCR 2013). As a result, the credibility assessment is at risk of becoming intuitive or subjective.

In conclusion, internal consistency, external consistency, and detail and specificity initially appear to be more objective indicators than plausibility, as they are clearly defined and can essentially be ‘counted’ in a statement.[1] In contrast, plausibility at best seems a personally determined concept that is dependent on one’s personal knowledge and experiences, and at worst is merely a combination of other credibility indicators rather than an independent concept. The decision-maker must therefore always be aware of the assumptions on the basis of which they judge plausibility and adduce evidence to support their plausibility findings. This is most clearly stated in the EASO practical guide on evidence assessment, containing the following definition: “to be plausible the sequence of events has to have the quality of being likely and seemingly possible to a reasonable person” (EASO 2015: 12). The manual continues by stating that case officers should avoid speculation and subjective assumptions, and “a finding of implausibility must be based on reasonably drawn and objectively justifiable inferences and the case officer should give clearly articulate reasons for finding an account implausible”. It is questionable, however, whether case officers are always able to do so, as legal decision making is often not as deliberate and objective as intended (Colwell 2005). Therefore, plausibility presents a potential obstacle to the requirement that asylum decisions must be made individually, objectively and impartially (Qualification Directive 2013).

This is also problematic because an intuitive plausibility judgement may influence the remaining three concepts. This possibility cannot be excluded, as the literature has established that a feeling or belief can undermine purported objectivity. This has been shown in both investigators and decision makers in various contexts (e.g., Dror et al. 2006; Kassin et al. 2013). Adjudicators’ initial beliefs about a case can have a major influence on the interpretation of evidence and the reasoning in legal cases (Kassin et al. 2013; Burke 2005). The initial belief then functions as an anchor judgment, which influences all subsequent judgments and decisions in a case (Furnham and Boo 2011).

The question arises whether plausibility truly is a unique concept, as portrayed by policy documents. Also important is the relation between the four concepts used in the assessment of asylum claims. Is it possible to separate these concepts from each other? Is it possible to judge the quality of the statement objectively for each of these indicators, without being influenced by the statement’s context or by judgments of the other indicators (Dror et al. 2006; Kassin et al. 2013)?

To investigate how plausibility is determined in relation to the other concepts used in credibility assessments, the current study made use of vignettes containing statements made in diverging contexts (i.e. asylum seeking procedure or criminal investigation) and by different characters (i.e. Eritrean asylum seeker, Syrian asylum seeker, child victim, adult witness, and adult suspect). The fictional statements were manipulated in such a way that there was equal internal consistency, external consistency, and specificity and detail within them. Participants were subsequently asked to rate these three concepts, as well as plausibility, for each of the vignettes.

We expected that if plausibility is indeed a unique concept, ratings would differ due to the different contexts depicted in the vignettes. If plausibility, in contrast, were related to consistency and detail, the ratings should be similar for the different vignettes. If the ratings for internal consistency, external consistency, and detail and specificity change across the different vignettes, this would suggest that judgments of accounts according to these indicators are not as objective, or independent, as they are meant to be; the potentially subjective concept of plausibility may influence the remaining indicators or vice versa.

Methods

Participants

Participants for this study were laypeople recruited through advertisements on social media platforms, as well as through personal communication. A total of 121 participants completed the study. Three participants who provided ratings more than two standard deviations from the mean on multiple indicators across vignettes were removed. On average, the 118 participants included in the analyses were 38.73 years old (SD = 16.58). About half of the participants were female (51.7%). Most of the participants had completed either a master’s degree (49.2%) or a bachelor’s degree (33.9%).

Design

The study made use of a mixed experimental design. All participants were presented with all five vignettes, in a random order, and rated them on the four credibility indicators (internal consistency, external consistency, plausibility, and sufficiency of detail and specificity). The measures for the different ratings were therefore repeated across all participants. To control for any possible order effects in the rating of the indicators, the order in which participants rated the credibility indicators was manipulated to create four separate conditions. Therefore, order (of ratings) was a grouping variable.

Materials

Ethical considerations

We used Qualtrics Online Survey Software to conduct the study. Participants were first informed about the study. Prior to agreeing to take part in this study, participants were told that they were free to withdraw from the study at any time. They were also informed that the collected data would be stored anonymously. The material used in the study was also designed in such a way that it would not be shocking to participants. Participants were provided with contact details of the researcher both prior to being presented with the material and after the debriefing. They were asked to contact the researcher in case of any questions or concerns. No concerns about the nature of the material were raised by any of the participants. They were also provided with a more detailed explanation of the purpose of the study at the debriefing stage, in which we clarified further that the fictional vignettes were based on how credibility is assessed in asylum procedures.

Definitions

After giving their consent, participants were told that they would be presented with statements made by five different people. Participants were then asked to assess the credibility of each of these statements using four indicators often used in credibility assessments. A fifth indicator used in credibility assessment, namely consistency with other witnesses, was not studied in the current research, because the vignettes did not lend themselves to the inclusion of this criterion. For instance, for the child victim vignette, including a witness would have made it less realistic. Participants were provided with the following terms and definitions. They were also able to view these definitions again throughout the survey by hovering over the term with their mouse cursor. The definitions were taken from sources on assessing credibility in asylum seeking procedures (UNHCR 2013; Gyulai et al. 2013).

Sufficiency of detail and specificity: refers to the level and nature of detail provided by an individual in an interview. The quality of a statement increases as the interviewee can provide ample and vivid details. Vague, minimal, or very brief explanations undermine the quality of the statement.

Internal consistency: refers to the extent to which the statements obtained within one interview – or in successive interviews with the same person – are coherent and do not contain serious contradictions or discrepancies.

Consistency with available external information: refers to the extent to which statements are in line with other evidence or known facts. Contradictions or discrepancies between the facts presented by the applicant and other evidence or knowledge about the case undermine the quality of the statement.

Plausibility: refers to the extent to which the information provided by the applicant seems reasonable or probable. The plausibility of a fact is assessed on the basis of its apparent likelihood in the context of the described situation.

Rating the vignettes

In the second section, participants were presented with the five vignettes containing statements made by different fictional characters. The vignettes represented different contexts and characters, and were presented to participants in a random order. The first character was an asylum seeker from Eritrea, whose statement concerned his life in Eritrea to support his origin claim. The second character was an asylum seeker from Syria, whose statement described the reason he had fled his home country to support his persecution claim. The third vignette concerned an allegation of sexual abuse made by a child and her mother. The fourth pair of statements was made by an eyewitness, who described a collision between two cars. The fifth character was a man who was a suspect in an investigation concerning drugs found in a garage box he owned. Following the reasoning that the indicators can be used to objectively judge statements, there should be no influence of the individual providing the statement, nor of the context of the statement.

Each of the vignettes included two subsequent statements. The second statement was given a week after the first statement. In four vignettes, both statements were given by the character. However, in cases of child abuse, the initial statement is normally not given by the child, but by an adult who suspects the abuse has taken place. To simulate this within the current study, the child victim vignette contained one statement made by the mother, and one statement made by the child. All vignettes also included a third section which described other evidence or information that had been obtained in relation to the case.

The statements were written in such a way that they were all equally (in)consistent. We used elements from the literature on the accuracy of statements to accomplish this (Smeets et al. 2004). Each vignette included one omission, one commission, and one contradiction. Hence, each of the vignettes should receive a similar rating for internal consistency. In addition, we also manipulated the external information which participants should use to rate external consistency; each vignette contained one element supported by the external information, one contradicting element, and one element about which the external information was inconclusive. The vignettes were intended to be equally detailed and specific. The vignettes were proofread and rated by two of the researchers who were not involved in writing the vignettes.

Thus, the three indicators which are clearly defined in the credibility assessment literature were manipulated to be equal across vignettes. Consequently, we expected that ratings for these indicators would not differ greatly across the different vignettes. As plausibility does not have a concrete definition, it could not be manipulated. The aim of using different characters and contexts was to introduce a subjective element, in the sense that existing attitudes about the general trustworthiness of the different characters could influence the credibility ratings.

For each of the five vignettes, participants had to rate the four indicators using a slider ranging from 0 (not at all) to 100 (very). The order in which the indicators were rated differed per condition. Participants were randomly assigned to one of the four conditions.
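The assignment procedure described above can be illustrated with a short sketch. The text does not specify which four indicator orders were used, so the cyclic (Latin square) orders below are an assumption, as are the function names, the labels, and the fixed seed; the sketch only shows one way to randomise vignette order per participant while assigning each participant to one of four rating-order conditions.

```python
import random

INDICATORS = [
    "internal consistency",
    "external consistency",
    "sufficiency of detail and specificity",
    "plausibility",
]
VIGNETTES = ["Eritrean asylum seeker", "Syrian asylum seeker",
             "child victim", "adult witness", "adult suspect"]


def latin_square_orders(items):
    """Return len(items) cyclic orders so each item appears once per position.

    Assumption: the study's four orders are not reported; a cyclic Latin
    square is used here purely for illustration.
    """
    n = len(items)
    return [[items[(row + col) % n] for col in range(n)] for row in range(n)]


def assign(participant_id, rng):
    """Randomly pick a rating-order condition and shuffle the vignette order."""
    orders = latin_square_orders(INDICATORS)
    condition = rng.randrange(len(orders))
    return {
        "participant": participant_id,
        "condition": condition + 1,
        "indicator_order": orders[condition],
        "vignette_order": rng.sample(VIGNETTES, k=len(VIGNETTES)),
    }


rng = random.Random(2024)  # fixed seed only so the sketch is reproducible
schedule = [assign(pid, rng) for pid in range(1, 6)]
for entry in schedule:
    print(entry["participant"], entry["condition"], entry["vignette_order"])
```

Because order is a between-subjects grouping variable, each participant keeps the same indicator order for all five vignettes in this sketch.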

Intuition vs. Fact

After having rated the credibility of the vignettes, in the third section of the study, participants were asked to rate to what extent they had used their intuition to rate the various indicators. They were again provided with a slider which ranged from 0 to 100. A rating of zero meant they had given their previous ratings based purely on intuition, whereas a rating of 100 indicated they had used only specific information. This question helped gain insight into what participants believed had influenced their ratings.

Demographics

In the final section of the study, participants were asked about their demographic information. Questions included the participant’s age, sex, level of education, and topic of their educational background.

Results

Precluding order effects

For each vignette, a MANOVA was conducted, using the credibility ratings as the dependent variables and order as the independent variable. That way, we tested for any differences between the independent conditions, in which the order of the credibility ratings had been changed. A Bonferroni correction was applied, resulting in an alpha level of .01. No significant differences between the conditions were found for any of the indicators across the different vignettes. Thus, the order in which participants rated the four indicators was not related to differences in the ratings they provided. The data from participants in the various independent conditions were therefore combined for the remainder of the analyses.
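The corrected alpha level can be reproduced with simple arithmetic. The sketch below assumes a conventional family-wise alpha of .05 divided over the five vignette-level MANOVAs; the family-wise level is not stated in the text and the function name is illustrative.

```python
def bonferroni_alpha(family_alpha: float, n_tests: int) -> float:
    """Per-test alpha under a Bonferroni correction (family alpha / number of tests)."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return family_alpha / n_tests


# Assumption: family-wise alpha of .05, one MANOVA per vignette (five tests).
per_test_alpha = bonferroni_alpha(0.05, 5)
print(round(per_test_alpha, 4))
```

Under these assumptions the per-test threshold is .01, which matches the alpha level reported above.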

Correlations between credibility indicators

To determine whether plausibility is an independent concept, Pearson correlation analyses were conducted for each of the vignettes, using all four indicators. For each of the vignettes, the four credibility indicators were found to be significantly positively correlated. As can be seen in Table 1, the correlations ranged from medium (.388) to large (.709), suggesting the ratings for the different indicators were not independent of each other.
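For readers less familiar with the statistic, the following minimal sketch computes a Pearson correlation between two sets of slider ratings. The ratings are hypothetical values invented for illustration and are not data from the study; the function name is likewise an assumption.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


# Hypothetical 0-100 slider ratings from six raters for a single vignette.
plausibility = [70, 55, 80, 40, 65, 90]
internal_consistency = [60, 50, 85, 45, 55, 80]

r = pearson_r(plausibility, internal_consistency)
print(round(r, 3))
```

If the indicators were rated independently, coefficients close to zero would be expected; consistently positive coefficients indicate that raters who score a statement high on one indicator also tend to score it high on the others.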

Table 1: Pearson’s correlations between plausibility, internal consistency, external consistency, and sufficiency of detail and specificity

Differences between credibility indicators for the different vignettes

Repeated measures ANOVAs were used to determine if there was a difference for each of the different indicators between the vignettes. The descriptive statistics for the indicators for each of the vignettes are presented in Table 2.

Table 2: Means and SDs (in parentheses) for the credibility indicators for all vignettes on a scale from 0 (not at all) to 100 (very much)

Plausibility. A significant difference between the plausibility ratings given in response to the different vignettes was found (F(4, 468) = 26.836, p < .001, partial η² = .187). Post hoc Bonferroni pairwise comparisons revealed the statements made by the asylum seeker from Syria were rated as being significantly more plausible than the statements made by the Eritrean asylum seeker (p < .001), the child (p < .001), the witness (p < .001), and the suspect (p < .001). The vignette concerning the child was rated as being significantly more plausible than the vignettes concerning the Eritrean asylum seeker (p = .003) and the suspect (p < .001). The vignette concerning the witness was also rated as being significantly more plausible than the vignette concerning the suspect (p = .006).

External Consistency. A significant difference was found between the ratings for external consistency for the different vignettes (F(4, 468) = 20.748, p < .001, partial η² = .151). Post hoc Bonferroni pairwise comparisons were used to explore this difference. The Syrian asylum seeker was found to be significantly more consistent with the external information than the Eritrean asylum seeker (p < .001), the child (p < .001), the witness (p < .001), and the suspect (p < .001). The ratings for the child’s external consistency were also significantly greater than those for the Eritrean asylum seeker’s external consistency (p = .027).

Sufficiency of detail and specificity. A significant difference was found between the ratings given for sufficiency of detail and specificity for the different vignettes (F(4, 468) = 14.066, p < .001, partial η² = .107). Post hoc Bonferroni pairwise comparisons were used to explore this difference. The Syrian asylum seeker’s statements were rated as being significantly more detailed and specific than the statements made by the Eritrean asylum seeker (p < .001), the child (p < .001), the witness (p = .011) and the suspect (p < .001). The witness also received a significantly higher rating for sufficiency of detail and specificity than the suspect (p = .038).

Internal Consistency. There was a significant difference between the ratings given to each of the vignettes for internal consistency between the two statements (F(4, 468) = 20.812, p < .001, partial η² = .151). Post hoc Bonferroni pairwise comparisons were used to explore this difference. The Syrian asylum seeker was considered to be significantly more consistent than the Eritrean asylum seeker (p < .001), the child (p = .001), the witness (p < .001), and the suspect (p < .001). The child was also considered to be significantly more consistent than the witness (p < .001). The suspect was also considered to be significantly more consistent than the witness (p = .001).
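The effect sizes reported for the four repeated measures ANOVAs can be checked against the F values and degrees of freedom, since partial eta squared equals F × df_effect / (F × df_effect + df_error). The sketch below applies that standard identity to the values reported above; the function and dictionary names are illustrative.

```python
def partial_eta_squared(f_value: float, df_effect: float, df_error: float) -> float:
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F values and degrees of freedom as reported in the text.
reported = {
    "plausibility": (26.836, 4, 468),
    "external consistency": (20.748, 4, 468),
    "sufficiency of detail and specificity": (14.066, 4, 468),
    "internal consistency": (20.812, 4, 468),
}

for indicator, (f_value, df_effect, df_error) in reported.items():
    eta = partial_eta_squared(f_value, df_effect, df_error)
    print(f"{indicator}: {eta:.3f}")
# prints .187, .151, .107 and .151 (with a leading zero), matching the reported values
```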

 

Facts vs. Intuition

After having rated the credibility indicators for each of the vignettes, participants were asked to which extent they had used intuition or facts to provide their ratings. To do so, they used a slider ranging from 0 (intuition) to 100 (facts). The descriptive statistics for these results are presented in Graph 1.

Graph 1: Descriptive statistics for the extent to which participants used intuition (0) or fact (100) to determine external consistency, internal consistency, sufficiency of detail and specificity, and plausibility. Error bars represent the standard error of the mean.

A repeated measures ANOVA was conducted for the reported use of facts by participants for the different credibility indicators. The assumption of sphericity was violated for this analysis. As epsilon was < .75, the Greenhouse-Geisser correction was used. A significant difference was found (F(2.019, 236.224) = 50.92, p < .001, partial η² = .303). Simple planned contrasts were conducted. Participants’ rating of the extent to which they used facts to make their decision was significantly lower for plausibility than for external consistency (F(1, 117) = 72.66, p < .001, partial η² = .383), for internal consistency (F(1, 117) = 91.12, p < .001, partial η² = .438), and for sufficiency of detail (F(1, 117) = 52.01, p < .001, partial η² = .308). Thus, participants reported relying more on intuition when rating plausibility than when rating the other three credibility indicators.
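The fractional degrees of freedom follow from the Greenhouse-Geisser correction, which multiplies both the effect and error degrees of freedom by epsilon. The sketch below back-calculates epsilon from the reported effect degrees of freedom. It assumes four indicator levels and 118 participants, the latter inferred from the error degrees of freedom of the contrasts (117); epsilon itself is not reported in the text, and the function name is illustrative.

```python
def greenhouse_geisser_df(epsilon: float, n_levels: int, n_subjects: int):
    """Return (effect df, error df) after multiplying both by epsilon."""
    df_effect = n_levels - 1
    df_error = (n_levels - 1) * (n_subjects - 1)
    return epsilon * df_effect, epsilon * df_error


n_levels = 4      # four credibility indicators
n_subjects = 118  # assumption: inferred from F(1, 117) in the planned contrasts

# Back-calculate epsilon from the reported effect df of 2.019.
epsilon = 2.019 / (n_levels - 1)
df_effect, df_error = greenhouse_geisser_df(epsilon, n_levels, n_subjects)

print(round(epsilon, 3))                       # about .673, below the .75 threshold
print(round(df_effect, 3), round(df_error, 3))  # error df agrees with 236.224 within rounding
```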

Discussion

The main aim of the current study was to elucidate how people judge the credibility of statements made in a legal context. Specifically, we assessed whether plausibility is an objective credibility indicator independent of internal consistency, external consistency and level of detail. We found that, despite the fact that in policy documents plausibility is often defined as a unique and independent concept, plausibility ratings were significantly correlated with the other three indicators. In addition, the other indicators, which had been manipulated to be equal across vignettes, were rated significantly differently for the various vignettes, and were also correlated with each other. Together, these findings suggest that all ratings were dependent on the character making the statement and the specific context. As such, none of the credibility indicators seem to be rated objectively and independently of each other. The participants, however, appeared to recognise this only for plausibility: they reported relying more on intuition when rating plausibility than when rating internal consistency, external consistency, or sufficiency of detail and specificity.

Interpretation

A closer look at the plausibility ratings shows that the suspect’s statements are considered least plausible, followed by those of the Eritrean asylum seeker. Statements made by the victim, witness, and Syrian asylum seeker are deemed considerably more plausible. These findings may be explained by the Projected Motive Model (Levine et al. 2010), which postulates that people often rely on heuristics or simple decision rules to determine if they need to be concerned about possible deception. In light of this model, the plausibility ratings in the current study could be based on perceptions of general trustworthiness and deservingness of the character making the statements. For example, people would generally be less likely to believe suspects than alleged victims and witnesses of crimes, which reflects the differences we found. In addition, images of the war in Syria undoubtedly come to mind easily, whereas people are probably less aware of the political situation in Eritrea. As such, our participants may have implicitly regarded the Syrian asylum seeker as more deserving than the Eritrean applicant. Indeed, media exposure has previously also been shown to have an unconscious prejudicial and biasing impact (Ogloff and Vidmar 1994). However, other biases, based on, for example, race or sex, cannot be excluded, as individual attitudes towards the characters were not measured in the current study. Nevertheless, ratings of plausibility appear to have been based on factors other than a thorough assessment of the likelihood of claimed events.

A more informative finding is that plausibility ratings were correlated with ratings for the other credibility indicators, which suggests that plausibility is not a unique and independent indicator. Thus, even though in policy documents plausibility is defined as a unique indicator of credibility, in practice it seems difficult to distinguish the plausibility of a story from its consistency and level of detail. The findings thereby lend support to the first definition of plausibility discussed in this article, namely that plausibility is a combination of the other indicators.

However, contrary to expectations, ratings for the other credibility indicators also seem to be influenced by the context of the statements under consideration. The significant differences in internal consistency, external consistency and sufficiency of detail ratings across the five vignettes suggest that these judgments are more subjective than intended. One possible explanation for this finding is that the seemingly intuitive determination of plausibility influenced the supposedly objective determinations of the other credibility indicators. This can also be explained using the phenomenon of tunnel vision, where an initial belief influences the following judgements (Kassin et al. 2013). Given that there was no difference between the conditions in which participants rated the indicators in different orders, such an initial belief may be formed immediately after reading the vignette, and would therefore not be influenced by which rating is given first. Instead of plausibility being the initial belief or anchor which influenced the other indicators, it is also possible that participants used one subjective strategy for all credibility indicators, including plausibility. For example, a snap judgment about the trustworthiness of the character, possibly based on the projected motive for lying (Levine et al. 2010), could have influenced all the determinations relating to the actual quality of the statements.

In both cases, participants seem to have been unaware of these intuitive decision-making processes, as they reported that their ratings for internal consistency, external consistency, and sufficiency of detail and specificity were based more on facts or specific utterances than on intuition. They only seemed to be aware of the rather intuitive nature of their plausibility ratings. Such a lack of awareness could threaten the process of a fair and objective credibility assessment to an even greater extent, as officials may not recognise the need to counteract subjective influences if they do not understand the extent of their impact. It has been argued that some form of motivation is needed in order to move away from intuitive reasoning (Alter et al. 2007).

Limitations and alternative explanations

A brief consideration of the limitations of the manipulation is in order. We aimed for equal consistency and detail across vignettes. However, internal consistency was manipulated according to the definitions of inconsistency within statements used in the literature (Smeets et al. 2004). These definitions are mainly conceptual, and may not be identical to the operational definitions used in the determination of the consistency of two statements. Furthermore, a distinction can also be made between the prevalence and the weight of inconsistencies. For instance, an inconsistency in peripheral details such as the colour of the perpetrator’s shoes may not have the same impact as an inconsistent account about central details such as the perpetrator’s gender. We tried to account for this in our design of the vignettes, and we asked the two researchers who reviewed the vignettes to pay specific attention to it as well. Therefore, we still feel confident that – objectively – the vignettes were roughly equally (in)consistent. Nevertheless, future studies would be well advised to have a larger panel evaluate the vignettes.

Another limitation of the current study is the use of lay people, who have no training or experience in the use of the credibility indicators. Although they were provided with definitions of the indicators taken from the literature on credibility assessment, their responses cannot be equated to those of trained officials who deal with credibility assessments on a regular basis. Experience may influence the assessment process. Therefore, future research should attempt to investigate whether the experience of officials tasked with credibility assessment in real life causes them to make different decisions than the average population.

Conclusion and implications

Despite the limitations, the results of the study do give rise to a debate about the objectivity of credibility assessment in European asylum procedures, first of all because the results strongly suggest that plausibility is neither a unique nor an objective indicator. If plausibility judgements are indeed subjective, one could question whether plausibility should be used as a credibility indicator at all, or whether it should be removed from all policy documents and other guidance documents for asylum officials. Yet whether abandoning the plausibility indicator would be sufficient is also questionable. The other credibility indicators, despite being conceptually objective, appear to be subjective when applied. The current results imply that ratings for all the indicators are actually based on one anchor judgment. What exactly forms the anchor for the other ratings cannot be determined based on the current study. Considering that the order of rating the different indicators did not influence the results, it seems as though the anchor judgment is not necessarily based on one of the existing credibility indicators. In fact, it seems as though the anchor judgment is based on a rather quick intuitive judgment about the case at hand.

Although it is important to realise that the use of the indicators may not be as objective as intended, merely recognising and naming the risks of intuitive assessments may not be enough to prevent intuitive assessments in the future. As was outlined in the introduction, both training manuals and policy documents already warn against subjectivity in credibility assessments (Granhag, Landström and Nordin 2017; Gyulai 2013; Mackey and Barnes 2013; UNHCR 2013). Yet, as the findings of this study imply, such warnings do not seem to be sufficient to ensure deliberate decisions. Considering that there is currently no alternative to credibility assessments of oral statements, abandoning the credibility indicators altogether is also not an option (Van Veldhuizen 2017: 66). Instead, we argue that policy makers, case workers, and courts together should search for ways to increase the validity of the current method for credibility assessments.

Specifically, we contend that policy guidance and training programs for case officers should include scenario thinking as a tool for deliberate credibility assessments (see Van Veldhuizen 2017: 191-194). With scenario thinking, a decision-maker is forced to consider different possible explanations for a negative credibility finding. Think again of the applicant claiming to have jumped over a prison wall. This statement would likely be considered implausible. When considering alternative scenarios, one scenario is that the statement is fabricated; a second scenario is that the statement is an embellishment or exaggeration of a true event, and that the wall in fact was much lower; a third scenario is that there is a cultural misconception about what a prison wall looks like. All these scenarios can be further assessed in the interview, and perhaps with additional evidence, and eventually will feed into the decision about the two broader scenarios in the case: the applicant either is lying about escaping prison or the applicant is truthful about escaping from prison. By explicating the possible scenarios and by explaining which scenario is supported and why – that is, on the basis of which inferences and evidence – the reasoning of the decision maker becomes both more deliberate and more transparent. Moreover, the method seems to align with the duty of the determining authority to assess each asylum claim individually, in light of country of origin information and personal position and circumstances of the applicant, as laid down in the Qualification Directive (2011: Article 4.3), as well as with the right to “appropriate notification of a decision and of the reasons for that decision in fact and in law” (Asylum Procedure Directive 2013: ‘whereas 25’).

Furthermore, courts are also key players in preventing subjectivity in credibility assessments. Courts form the control mechanism to ensure that determining authorities function fairly, effectively, and efficiently. Part of a fair and effective asylum procedure is that decisions are made deliberately and are adequately motivated. Courts can play an active role by demanding that the decision of the administrative authority explicates the different scenarios that were investigated, along with the reasons and evidence for dismissing specific scenarios while holding on to others. A motivation of this kind is also necessary to provide for a full and ex nunc examination of both facts and points of law (Asylum Procedure Directive 2013: Article 46.3).

In that sense, our advice is consistent with what the EASO practical guide states on findings of implausibility, namely that such a finding “must be based on reasonably drawn and objectively justifiable inferences and the case officer should give clearly articulate reasons for finding an account implausible” (EASO 2015: 12). We merely extend this recommendation to other negative credibility findings: case officers should give clearly articulated reasons for findings of inconsistency, lack of detail, and implausibility. Courts should also demand such elaborate motivations. Moreover, policy makers should not only focus on what the desired state of affairs is – namely objective and deliberate assessments – and what potential risks are – for example intuitive decision making – but also on how these risks can be mitigated in practice.

As credibility assessments are at the heart of asylum cases, more knowledge about potential biases in credibility assessments in asylum cases is pivotal. The objectivity of this assessment should be safeguarded to ensure that every applicant receives a fair and deliberate assessment. A first step towards achieving this is to acknowledge that the credibility indicators may not be as objective as intended, and to raise awareness and instigate a discussion concerning this issue.


Enide Maegherman (MSc., LL.M) is a PhD student at the Law Faculty of Maastricht University. Her research focuses on falsification and decision-making in criminal law proceedings. E-mail: enide.maegherman@maastrichtuniversity.nl

Tanja S. van Veldhuizen (dr.) is a postdoctoral researcher within the Montaigne Centre for Judicial Administration and Conflict Resolution at Utrecht University. Her research primarily concentrates on credibility assessments in the EU asylum procedure from a legal psychological perspective. E-mail: t.s.vanveldhuizen@uu.nl

Robert Horselenberg (dr.) is an assistant professor in legal psychology at the Law Faculty of Maastricht University. He specialises in assessing statements in the context of criminal law, both as an expert witness and as a researcher. E-mail: robert.horselenberg@maastrichtuniversity.nl


The views expressed in this article are those of the authors and do not necessarily represent the position of OxMo.


Bibliography

ALTER, A. L., OPPENHEIMER, D. M., EPLEY, N. and EYRE, R. N. (2007) ‘Overcoming intuition: Metacognitive difficulty activates analytic reasoning’ Journal of Experimental Psychology: General 136: 569-576.

ASYLUM PROCEDURE DIRECTIVE (2013) Directive 2013/32/EU on common procedures for granting and withdrawing international protection [Recast APD 2005], European Parliament and Council of the European Union, Official Journal of the European Union.

BURKE, A. S. (2005) ‘Improving prosecutorial decision making: Some lessons of cognitive science’ William and Mary Law Review 47: 1587-1634.

COLWELL, L. H. (2005) ‘Cognitive heuristics in the context of legal decision making’ American Journal of Forensic Psychology 23: 17-41.

CONSTERDINE, E., PENDRY, L. and MCKINLAY, P. (2013) Establishing identity for international protection: challenges and practices. National contribution from the United Kingdom, Brussels, European Migration Network.

DROR, I. E., CHARLTON, D. and PÉRON, A. E. (2006) ‘Contextual information renders experts vulnerable to making erroneous identifications’ Forensic Science International 156: 74-78.

EASO (2014) ‘EASO e-learning platform. European Asylum Curriculum. Module 7: Evidence Assessment’ [Online], Malta, European Asylum Support Office. Available: https://ceac.easo.europa.eu/eac/ [Accessed January 10 2014].

EASO (2015) EASO practical guide: Evidence Assessment, Luxembourg: Publications Office of the European Union.

EASO (2017) EASO Training Curriculum, Luxembourg: Publications Office of the European Union.

EUROPEAN COMMISSION (2017) A Common European Asylum System, Luxembourg: Publications Office of the European Union.

EUROPEAN PARLIAMENT (1999) ‘Tampere European Council 15 and 16 October 1999: Presidency conclusions’ [Online], Strasbourg, European Parliament. Available: http://www.europarl.europa.eu/summits/tam_en.htm [Accessed January 27 2014].

FURNHAM, A. and BOO, H. C. (2011) ‘A literature review of the anchoring effect’ The Journal of Socio-Economics 40: 35-42.

GRANHAG, P.A., LANDSTRÖM, S. and NORDIN, A. (2017) Evaluation of oral statements. A scientifically based decision-aid for migration cases, Gothenburg, Sweden, Gothenburg University.

GYULAI, G. (2013) Credibility assessment in asylum procedures: a multidisciplinary training manual, Budapest, Hungary, Hungarian Helsinki Committee.

HOME OFFICE (2015) Asylum policy instruction. Assessing credibility and refugee status, United Kingdom: Home Office.

KASSIN, S. M., DROR, I. E. and KUKUCKA, J. (2013) ‘The forensic confirmation bias: Problems, perspectives, and proposed solutions’ Journal of Applied Research in Memory and Cognition 2: 42-52.

KOK, S. (2016) Bij gebrek aan bewijs, De beoordeling van de geloofwaardigheid van het asielrelaas onder Werkinstructie 2014/10, Leiden, the Netherlands, Leiden University and the Dutch Council for Refugees.

LEVINE, T. R., KIM, R. K. and BLAIR, J. P. (2010) ‘(in)Accuracy at detecting true and false confessions and denials: An initial test of a projected motive model of veracity judgments’ Human Communication Research 36: 82-102.

MACKEY, A. and BARNES, J. (2013) Assessment of credibility in refugee and subsidiary protection claims under the EU Qualification Directive: Judicial criteria and standards, Haarlem, the Netherlands.

OGLOFF, J. R. P. and VIDMAR, N. (1994) ‘The impact of pretrial publicity on jurors: A study to compare the relative effects of television and print media in a child sex abuse case’ Law and Human Behavior 18: 507-525.

QUALIFICATION DIRECTIVE (2011) Qualification Directive 2011/95/EU [Recast QD 2004], European Parliament and Council of the European Union, Official Journal of the European Union.

R.C. V. SWEDEN (9 March 2010). Application no. 41827/07. Strasbourg: European Court of Human Rights.

SMEETS, T., CANDEL, I. and MERCKELBACH, H. (2004) ‘Accuracy, completeness, and consistency of emotional memories’ The American Journal of Psychology 117: 595-609.

UNHCR (2013) Beyond proof: credibility assessment in EU asylum systems, Brussels: UNHCR.

VAN VELDHUIZEN, T. (2017) Where I come from and how I got here: Assessing credibility in asylum cases, Enschede, Gildeprint Drukkerijen.

VREDEVELDT, A., VAN KOPPEN, P. J. and GRANHAG, P. A. (2014) ‘The inconsistent suspect: A systematic review of different types of consistency in truth tellers and liars’. In: BULL, R. (ed.) Investigative Interviewing, New York, NY: Springer New York.


[1] However, these indicators are not always valid indicators of truthfulness in light of legal psychological research about human memory and strategies of truth-tellers and liars (see Van Veldhuizen 2017: chapter 2). Moreover, despite the clear definition and the fact that inconsistencies and details are in principle observable and countable, how much weight should be attached to a particular inconsistency or how much detail may be expected in light of the applicant’s background remains ambiguous.


Appendix

Vignette 1 – Eritrean asylum seeker:

A man applies for asylum in Europe after fleeing his home-country. He claims to come from Eritrea. Therefore, the man is interviewed about his origin. During the first interview, he made the following statement:

In Eritrea, I lived in Gemilab in the Anseba region. It was a small village. The nearest large cities were Kerkebet, Olef, and Uague. To go to the school in Gemilab, I used to have to walk for about one hour every day. After passing through the town square, I would take the third turn left. I was also a member of the Eritrean Orthodox Church in my home-town.

One week later, the man was interviewed again. He then made the following statement:

Before I left Eritrea, I was living in the Anseba region, in a small village called Gemilab. The closest big town nearby was Uague. When I used to walk to school, I would first cross the town square, and then take the third turn right. I am a member of the Eritrean Orthodox Church. Because of my beliefs, I often take part in confession, which has to happen to a priest.

The following additional information has been obtained by the authorities:

The town of Gemilab is indeed in the Anseba region. The closest town is Kerkebet, with Olef and Uague being the 2nd and 3rd closest. The route the applicant would walk to the school he claims to have attended could not be confirmed. The Eritrean Orthodox Church does consider Confession as one of its core sacraments. Members of the religion are encouraged to confess to someone they are close to, known as a “soul father”.

Vignette 2 – Syrian asylum seeker:

A man applies for asylum in Europe. He is interviewed about his reasons for leaving his home-country Syria. During the first interview, he made the following statement:

I left my home in Aleppo because it was not safe for me and my family. After the fighting began, we were scared to leave the house. Our neighbour was attacked and abducted by supporters of Assad because he had told people about his support for the revolution. I don’t know what happened to him. He was taken late at night by 3 men who wore masks. He didn’t want to go, so they beat him until he lost consciousness. His wife saw all of this. She was also beaten because she was screaming. After this, we started preparing to leave, because I had also expressed my support for the revolution to acquaintances.

One week later, the man was interviewed again. He then made the following statement:

Living in Aleppo after the fighting started was very unsafe for us. We became scared to leave the house. My neighbour was abducted, and I don’t know what happened to him. He was attacked because he had been sending photos to a press agency, I think in France. When he was taken, he protested so much they beat him until he was unconscious before they took him. His wife was also beaten because she was screaming. She saw her husband being abused. The masked men had guns with them, but they didn’t shoot. My family was so scared after this we decided to leave, because we were afraid that I would also be targeted due to my support for the revolution.

The following additional information has been obtained by the authorities:

The applicant gave an accurate description of Aleppo and records indicate fighting took place in his neighbourhood during the time he describes. No information could be found about reasons for the neighbour’s arrest. The arrest described by the applicant does not contradict what is known about the way in which militant groups operate.

Vignette 3 – Victim:

A woman comes to the police station to report the alleged sexual abuse of her daughter. In her statement, she described the situation as follows:

I think my 7 year old daughter is being sexually abused by her teacher. She came home yesterday and was acting strange so I asked her what was wrong. At first she wouldn’t tell me, but then she told me her teacher had kept her inside during break-time. He had asked her to take off her clothes and tried to kiss her. When I asked her when this happened, she said that it first happened right after the Christmas holidays and had happened once a week since then.

One week later, the child was interviewed. She described the situation as follows:

After we came back to school after the Christmas holidays, my teacher sometimes asked me to stay inside when the other kids went outside to play. When I had to stay inside, I had to take my clothes off. Once he also took pictures, but he wouldn’t show them to me. I only had to stay inside with the teacher twice. I didn’t like it when I had to stay inside, so I was upset. I told my mummy about it as well.

During the police investigation of the alleged abuse, the following information was collected:

During a medical examination, no signs of abuse were found. The teacher in question denies the allegations. The browsing history on the computer in his house showed he had searched child pornography on 3 separate occasions in the past year. No pictures were found on his phone, but the phone had recently been reset to factory settings.

Vignette 4 – Witness:

A man who witnessed a car accident was questioned about what he saw. He gave the following statement:

I was walking to the supermarket when I saw two cars hit each other at the cross roads. One car, the blue one, was coming from the High street, and crashed into the side of the black car. I think the black car ignored the red light. The black car then got pushed off the road and ended up on its side. The blue car swerved to the right and came to a stop. It was incredibly loud, so a lot of people turned up after it happened.

One week later, the man was again interviewed about what he saw. He then gave the following statement:

When I was on my way to the supermarket, I saw a car crash. It happened at the cross roads. The blue car crashed into the side of the black car, which then got pushed off the road. Both cars seemed to be exceeding the speed limit. The blue car swerved to the right before it stopped. I think the blue car ignored the red light at the crossing. After it happened, a lot of people arrived at the scene. I think they must have heard the noise.

The police investigation of the car accident resulted in the following information:

A call to the emergency services described a collision of a blue and black vehicle at the High Street and St. John Street. Traces of blue paint were found on the side of the black car. No brake marks were found. There was some evidence of alcohol in the driver of the black car’s blood, but his blood alcohol level was below the legal limit. No video footage could be obtained from the traffic cameras due to a technical malfunction.

Vignette 5 – Suspect:

Drugs were found in a rented garage box. The man who rents the garage box is a suspect in the investigation and made the following statement during the police interview:

I own the garage box behind the station, but I don’t know anything about the drugs that were found in there. I only use it to store junk from around the house. I haven’t visited the garage box in more than 2 months. Occasionally, I let my friends borrow my key to the garage box. For a while, I have suspected that there is a copy of the key somewhere. These friends are people I’ve known for a couple of years. I met them in a soccer team that I played in a while ago.

One week later, the suspect was questioned about the drugs again. He then made the following statement:

I only use the garage box behind the station to store old stuff from around the house, so I don’t know anything about the drugs that were found there. I do own it, but the last time I was there was last month. I sometimes let my friends borrow the key. I’ve known these friends for several years. None of them have had problems with the police, and I have never seen them take drugs themselves. I first met them through a soccer team I was in a while ago.

The police investigation into the drugs that were found resulted in the following information:

The suspect was convicted for drug possession 10 years ago. The garage box also contained old furniture and other things that belonged to the suspect. None of the items found in the garage box belonged to anyone else. The person renting the garage box next to the suspect’s garage box recognised one of the suspect’s friends as someone he had seen around the garage box. This friend has previously been named in an investigation into drug dealing, but was not convicted.