A primary emphasis in personnel selection research identifies determinants of future work performance, where tests of general mental ability and personality traits, employment interviews, and work sample tests are examples of established and valid predictors (Schmidt & Hunter, 1998; Schmidt et al., 2016). However, some selection methods, such as work sample tests (e.g., assessment center exercises) and interviews, are more complex relative to psychometric tests in terms of measurement methodology. While results from standardized psychometric tests generally depend on item responses from the test-takers themselves, assessment ratings obtained from interviews and assessment center exercises rely on how a candidate is perceived and interpreted by others. Accordingly, psychometric tests often provide an operationalization of well-defined constructs. In contrast, human judgment produces assessment ratings where the constructs measured are less well defined (Roth et al., 2005). Due to this essential difference, research investigating the overlap between test results and ratings from assessors is vital for an increased psychological understanding of complex predictors.
While several meta-analyses have examined if candidate ratings obtained in employment interviews and assessment center exercises are associated with cognitive abilities and personality traits (Berry et al., 2007; Collins et al., 2003; Salgado & Moscoso, 2002; Hoffman et al., 2015; Huffcutt et al., 2001), military studies on this matter are scant and often limited to the trait of extraversion (e.g., Darr et al., 2018; Thomas et al., 2001). Due to likely differences in civilian and military selection programs, such as job competencies rated and assessment center exercises employed, there is a need for further military studies. Detailed knowledge of military predictors is relevant for optimizing selection programs, which is important considering that military organizations often recruit candidates to life-long careers where employees might participate in high-risk activities and operations (Campbell et al., 2010; Rumsey & Arabian, 2014). The present study aimed at providing such knowledge by applying the well-established Five-Factor Model (FFM) for studying if candidate personality traits were associated with ratings in a military interview and a field selection exercise at an officer selection program.
Although studies investigating the predictive validity of military selection interviews exist (e.g., Darr & Catano, 2016; Køber et al., 2017), there is a lack of published military studies relevant to the present investigation. However, the civilian personnel selection literature reports that the FFM personality traits, to some degree, are embedded in interview ratings. The meta-analysis by Salgado and Moscoso (2002) documented that observed mean correlations between traits and ratings obtained in conventional interviews ranged from r = .12 to .17. Regarding behavioral interviews, where structure and questions are pre-planned in more detail relative to conventional interviews, smaller correlations were found. Here, the strongest correlations were mean r = .10 for extraversion and mean r = .08 for conscientiousness. Those findings were in line with results from another meta-analysis that focused exclusively on structured interviews (Roth et al., 2005).
Other authors have tried to explain the associations between extraversion, conscientiousness, and interview ratings. For instance, some have hypothesized that these traits fuel advantageous self-efficacy mechanisms, meaning that the likelihood of interview success is related to interview self-efficacy (Tay et al., 2006). A more recent study by Wiersma and Kappe (2016) offered a somewhat different explanation by noting that extraversion and conscientiousness may be considered as incentive-enhancing preferences, where the former trait leads to such characteristics as assertiveness, decisiveness, and ambitiousness, and the latter to intrinsic motivation and a striving to perform above average. These authors claimed that extraversion is typically visible in interviews and, therefore, more easily assessed relative to conscientiousness, which is perhaps intuitively compelling. This explanation is supported by the findings of Salgado and Moscoso (2002) for conventional interviews, where extraversion had a higher association with ratings compared with conscientiousness. However, in structured interviews, extraversion seems to have a somewhat lower association relative to conscientiousness (Roth et al., 2005; Salgado & Moscoso, 2002).
The field selection exercise investigated in the present study was quite different from a standard civilian assessment center (AC), both in duration and content. It lasted seven days and nights in the outdoors (i.e., the field), where candidates participated in a war-like scenario. Nevertheless, in terms of measurement methodology, the field selection exercise mirrors an AC as the candidates were rated by assessors when engaging in behavioral simulation exercises (International Taskforce on Assessment Center Guidelines, 2015). The civilian research concerning AC construct embeddedness is voluminous (see e.g., Hoffman et al. (2015) and Thornton and Gibbons (2009) for reviews). For the present study, research focusing on personality and AC overall assessment ratings (OARs) and studies using military samples are of primary interest. One meta-analysis based on civilian samples reported quite high operational validities corrected for unreliability in criteria for the FFM personality traits (.16–.47), where extraversion had the largest validity (Collins et al., 2003). While conscientiousness was not investigated in this study, the authors also found a high operational validity for cognitive ability (.65). On the other hand, more modest results were reported in the review and meta-analysis by Hoffman et al. (2015), also focusing on civilian samples, where several operational validities were less than .10 for the FFM traits.
Paralleling the results of large-scale meta-analyses using civilian samples (e.g., Barrick et al., 2001), meta-analyses using military samples have also detected that the FFM relates to job performance, where especially conscientiousness has demonstrated predictive validity (Darr, 2011; Salgado, 1998). In the few personality studies regarding military ACs and short-duration training performance, however, extraversion has received the most attention in research. In a five-week training and evaluation course for the U.S. Army Reserve Officer Corps (Thomas et al., 2001), it was found that extraversion was positively related to leadership ratings among approximately 800 cadets (r = .14). Darr et al. (2018) argued that extraversion might be advantageous in basic military training, as such contexts are typically collective, allowing participants to interact with and lead others. In their study of 251 candidates undergoing basic officer training in a 15-week course, Darr et al. (2018) found that the dominance aspect of extraversion was positively related to performance (r = .16). An investigation of 60 junior officers completing a five-week course measured all FFM traits (Calleja et al., 2019) and found that conscientiousness was related to “planning performance” (r = .27).
Three Norwegian studies relevant to the present study aim have used samples from the same selection program as the current investigation (Hystad et al., 2011; Martinsen et al., 2020; Sørlie et al., 2020). Of those studies, Sørlie et al. (2020) included the same sample used in the present study. While Sørlie et al. (2020) investigated the predictive validity of a person-organization fit measure, the authors reported that extraversion and openness had minor isolated predictive impacts (β = .17 and –.10) on the field selection exercise OAR when other variables were included in the regression model. We will expand on those results in the present study by including more nuanced field selection exercise ratings and also including the interview ratings. Hystad et al. (2011) and Martinsen et al. (2020) used data obtained some years earlier at the same selection program for their studies. The first study documented some predictive validity of dispositional hardiness toward final admission decisions and is thus more indirectly relevant as neuroticism is frequently found to be negatively associated with resilience variables (Oshio et al., 2018). The second study (Martinsen et al., 2020) also focused on final admission decisions and found that those offered officer training had lower scores on neuroticism and higher scores on extraversion and conscientiousness relative to those not selected.
The purpose of the present study was to explore if candidate personality traits were associated with ratings in a selection program for military officers. Selection officers rated candidates in a competency-based interview and a field selection exercise simulating a war-like scenario. We measured the FFM personality traits with the established NEO-PI-3 (McCrae et al., 2005). A shorter military FFM test was added (the Norwegian Military Personality Inventory; NMPI) for obtaining construct validity estimates for the less comprehensive NMPI. Additionally, it was of interest to investigate if this test would produce the same results as the NEO, considering that the NMPI was developed for the military. As general research has shown advantages for contextualized personality measures with respect to predictive validity (e.g., Shaffer & Postlethwaite, 2012), the NMPI may be a promising tool for military selection. Due to the impact of cognitive ability on performance (Schmidt & Hunter, 1998; Schmidt & Hunter, 2004), including in studies of ratings in employment interviews (Berry et al., 2007) and AC methods (Collins et al., 2003), scores on general mental ability (GMA) were used for purposes of statistical control.
At the selection program, the criteria on which candidates were rated were formulated as military leadership competencies. The competencies rated are believed to be in line with general research on effective leadership (Yukl, 2012) and with individual prerequisites for successful development into a mission command leader, the espoused leadership philosophy of the Norwegian Armed Forces (Defence Staff Norway, 2012). The five competencies, and the gist of their content, are: role model (acts in line with NAF’s core values, is open to feedback, shows integrity); task focus (takes the initiative, works systematically toward goals, prioritizes adequately); mental robustness (can cope with high demands and stressful life events, is emotionally stable, and adapts to uncertain circumstances); cooperation (gains trust from others, communicates efficiently, delegates, and supports others); and development (stimulates autonomy in others and encourages reflection, original thinking, and self-development in others).
Due to the lack of military studies, civilian findings formed the basis for hypothesis development for the interview. Small positive associations between extraversion and conscientiousness on the one hand and the interview OAR on the other, in line with the findings of Salgado and Moscoso (2002) and Roth et al. (2005), were expected. The general arguments put forth by Tay et al. (2006) and Wiersma and Kappe (2016) regarding the relevancy of these traits for interviews also supported this expectation. The following hypothesis was thus formulated:
H1: Extraversion and conscientiousness will show statistically significant positive associations with the interview OAR—after controlling for age, sex, and GMA.
In the interview, the three competencies of role model, mental robustness, and development were rated. The following hypothesis was formulated based on content similarities between personality traits and the competencies:
H2: Conscientiousness will be positively associated with role model ratings, neuroticism negatively with mental robustness, and openness positively with development.
Expected findings would necessarily parallel the results from Sørlie et al. (2020), considering that those authors used parts of the same data set as the present study (the field selection exercise OAR and the NEO). However, as the current study also included the NMPI, we expanded the hypothesis development pertaining to the field selection exercise. Although acknowledging that civilian ACs and military field selection exercises are equivalent in terms of measurement methodology, hypothesis development was clouded by the different results reported in meta-analyses of the former (e.g., Collins et al., 2003; Hoffman et al., 2015). Furthermore, likely differences in duration and content between civilian ACs and military field selection exercises limited hypothesis development. Based on the military studies of Thomas et al. (2001), Darr et al. (2018), and Sørlie et al. (2020), however, we expected a positive association between extraversion and the field selection exercise OAR, and also a negative association with respect to openness based on the findings from the latter study. It was also reasonable to expect a positive association between conscientiousness and the OAR considering the predictive validity of this trait in military studies (Calleja et al., 2019; Darr, 2011; Fosse et al., 2015; Martinsen et al., 2020). Finally, based on the harsh elements of the field selection exercise, together with the findings of Martinsen et al. (2020) and Hystad et al. (2011), it was reasonable to expect that neuroticism would be negatively associated with the OAR.
H3: Extraversion and conscientiousness will show statistically significant positive associations with the field selection exercise OAR; whereas, neuroticism and openness will show statistically significant negative associations—after controlling for age, sex, and GMA.
In the field selection exercise, all five competencies were rated. Expectations in terms of competency-level associations had parallels with those for the interview. Additionally, we hypothesized that conscientiousness might be positively associated with task focus, and extraversion and agreeableness positively with cooperation, due to content similarities.
H4: Neuroticism will be negatively associated with mental robustness ratings, extraversion and agreeableness positively with cooperation, openness positively with development, and conscientiousness positively with role model and task focus.
Our study can contribute to the selection literature in several ways. First, it may increase the understanding of complex military predictors and provide results of similarities or differences relative to civilian findings. Second, the study may be valuable for evaluating the potential usefulness of incorporating a personality test in selection programs that use military interviews and field selection exercises. Third, as the present study used both OARs and ratings of specific military leadership competencies, study findings can uncover if these different competencies are unequal in personality overlap. Those findings may be of interest as job competencies are often less precisely defined and operationalized compared to psychological constructs (Furnham, 2008).
All participants in the present study were candidates attending a selection program for basic officer schools in the summer of 2016. The selection program lasted two weeks and selected officer education applicants for either the Army, Navy, or Air Force. Initially, there were 1287 candidates attending. The personality data were collected during the first two days, where candidates were introduced to the research project in a classroom setting. The final sample consenting to participate was N = 901, which included all candidates that attended the classroom brief. Accordingly, 386 candidates did not participate in the classrooms due to early termination at the selection program. Some may have chosen not to attend the classroom brief while continuing the selection program, but we did not register this number. Thus, the response rate was 70%, calculated from the total number of registered candidates. There were 207 women (23%) and 694 men (77%) in the final sample, and the age range was 18–34 years (M = 19.6, SD = 1.86).
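The sample figures in the preceding paragraph can be checked with a few lines of arithmetic. The sketch below uses only the counts reported in the text; the variable names are illustrative and not part of the study materials.

```python
# Counts reported in the text above.
registered = 1287     # candidates attending the selection program initially
participants = 901    # candidates consenting after the classroom brief
women, men = 207, 694

non_participants = registered - participants
response_rate = participants / registered
share_women = women / participants

print(non_participants)              # 386
print(round(response_rate * 100))    # 70
print(round(share_women * 100))      # 23
```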
This study was part of a larger research program at the Norwegian Defence University College: “The Leadership Candidate Study” (NAF, 2020), which is approved by the Norwegian Centre for Research Data. Acknowledging ethical challenges related to informed consent when obtaining data at military selection areas (e.g., obedience to authority and conformity pressure), we thoroughly informed candidates of the study purpose. Candidates were also informed that the decision to participate would not affect selection decisions and that the personality data would not be registered in their military records. Furthermore, we judged the measures to be unobtrusive, as they assess normal personality traits for which adverse psychological consequences of responding were unlikely.
During Week 1, in which the interviews took place, 91 candidates from the initial research-participant pool left the selection program for various reasons (e.g., self-choice, medical conditions, failed physical tests), resulting in 810 participants with complete interview ratings. The field selection exercise was carried out during the second week. The number of participants was further reduced, primarily due to self-choice, as candidates left the program during the first two days of the field selection exercise; in addition, some participants had already been selected out due to unsuccessful interview ratings. The final number of participants with complete field selection exercise ratings was 551. There was no systematic dropout with respect to sex or age.
All participants had, at age 17, passed the conscript assessment procedure in Norway, undergoing GMA testing. Since the 1950s, the Norwegian Armed Forces has used a GMA test that includes three subtests measuring reasoning, numerical, and verbal abilities (Køber et al., 2017). The administration time is one hour. A previous study (Skoglund et al., 2014) documented adequate parallel-form reliability by correlating the total score from a paper and a computerized version of the GMA test (r = .85). The present study used the mean GMA stanine score.
The NEO-PI-3 is a 240-item self-report 5-point Likert scale test, aiming to measure the FFM of personality, including six facets for each of the five domains (neuroticism, extraversion, openness to experience, agreeableness, and conscientiousness) (McCrae & Costa, 2010). The present study used raw scores based on the Norwegian version of the NEO-PI-3 (Martinsen, 2007). Test completion time was approximately 45 minutes. Factor analyses supporting the five-factor structure of this inventory are reported in Martinsen et al. (2020). Based on the present sample, Cronbach’s alpha values were .91 (neuroticism), .88 (extraversion), .87 (openness), .88 (agreeableness), and .91 (conscientiousness).
The NMPI is a self-report personality test consisting of 79 items developed by the Norwegian Armed Forces (Skoglund et al., 2020). The items are rated on a 7-point Likert scale, aiming to measure the five factors of (number of items in parenthesis) emotional stability (15), extraversion (18), openness to experience (17), agreeableness (16), and conscientiousness (13). In the development of the NMPI, most items were extracted from the International Personality Item Pool database (Goldberg et al., 2006) and then translated to Norwegian. Experienced military psychologists provided the remaining items in the NMPI. There are no facet scores for this test. The present study used raw scores, and the test completion time was approximately 15 minutes. Based on the present sample, Cronbach’s alpha values were .89 (emotional stability), .89 (extraversion), .78 (openness), .90 (agreeableness), and .87 (conscientiousness).
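The internal-consistency values reported for the NEO-PI-3 and the NMPI are Cronbach's alpha coefficients. As a minimal sketch of how such a coefficient is computed, the example below applies the standard formula to a small set of hypothetical item responses; the data and the function name `cronbach_alpha` are assumptions for illustration and do not come from the study.

```python
from statistics import pvariance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of k item ratings."""
    k = len(item_scores[0])
    # Variance of each item across respondents.
    item_vars = [pvariance([resp[i] for resp in item_scores]) for i in range(k)]
    # Variance of the summed scale score across respondents.
    total_var = pvariance([sum(resp) for resp in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical responses of five people to four 7-point items (not study data).
responses = [
    [5, 6, 5, 6],
    [3, 3, 4, 3],
    [6, 7, 6, 6],
    [2, 3, 2, 3],
    [4, 4, 5, 4],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 2))
```

Alpha rises when items covary strongly relative to their individual variances, which is why longer and more homogeneous scales tend to show higher values.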
Candidates were interviewed and rated by an experienced officer if they had passed a medical examination and physical tests. The interview lasted one hour, and the personality test results were not available. The three competencies of role model, mental robustness, and development were rated by a 9-point scale (1 indicated the weakest score and 9 the strongest). The interviewers were expected to provide their ratings based on a detailed scoring system that operationalized each competency into example answers along the 9-point scales. An OAR was calculated as the mean competency score and used for scoring interview leadership potential, a variable used in the final selection decision. There were five or six questions for each competency, and a mix of behavioral and situational questions was applied. Examples of behavioral questions were “Please tell me of one episode where you received challenging feedback about yourself” (role model), “Can you remember a situation of unpredictability in your studies/work?” (mental robustness), and “Can you give an example of finding a new and original solution to a problem” (development). No estimate of the interview interrater reliability was available.
In the seven-day field selection exercise, candidates took turns solving ongoing work sample cases as leaders within teams of five to seven. Typical for these cases were threats from hostile forces while maneuvering in difficult terrain, establishing camps, or providing first aid to wounded soldiers. The field selection exercise was physically demanding, and the candidates experienced frequent discomfort, including some lack of food and sleep. An experienced military selection officer followed the team and rated the candidates. With a few exceptions, officers did not rate the same candidates they had interviewed the week before. All five competencies were rated on a 9-point scale, operationalized in behaviorally anchored rating scales (BARS) adapted to the different work sample cases. The personality test results were not available to the selection officers. While isolated competencies were rated as per work sample, an across-exercise (i.e., work samples and other observations) system was applied in the end. This meant that the final competency rating was the product of multiple observations in different settings. Some selection officers used a mathematical approach (i.e., mean score) and others did not. A final OAR based on mean competency scores was used for scoring the field exercise leadership potential, used in the final selection decision. Interrater reliability estimates of the field exercise ratings were not available.
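Both the interview and the field selection exercise yield an OAR defined as the mean of the competency ratings. The sketch below illustrates that calculation for one hypothetical candidate; the ratings and the function name `overall_assessment_rating` are invented for illustration.

```python
from statistics import mean


def overall_assessment_rating(competency_ratings):
    """Return the OAR as the unweighted mean of the 1-9 competency ratings."""
    return mean(competency_ratings.values())


# Hypothetical ratings for a single candidate (not study data).
interview = {"role model": 7, "mental robustness": 6, "development": 5}
field_exercise = {
    "role model": 6,
    "task focus": 5,
    "mental robustness": 7,
    "cooperation": 6,
    "development": 5,
}

interview_oar = overall_assessment_rating(interview)
field_oar = overall_assessment_rating(field_exercise)
print(interview_oar, field_oar)
```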
IBM SPSS 26.0 was used for all statistical analyses. Only two to four subjects were missing the mean domain scores on the NEO. However, on the NMPI, there was a larger portion of missing data, with 45–58 subjects missing the mean scores on the factors. This difference in missing data was most likely due to respondent fatigue, as the NMPI came last in the questionnaire used. For those with complete ratings on the field selection exercise, the missing data for the NMPI had dropped to 21–34. One to two participants were missing GMA scores. The analyses used pairwise deletion of cases to handle missing data. Initial inspections of normality, linearity, multicollinearity, and homoscedasticity did not reveal any serious violations of the statistical assumptions. As the field selection exercise ratings were obtained when the team leader was part of a candidate group, dependency among observations was possible. However, Sørlie et al. (2020) reported no need for multilevel modeling based on their dataset (which included the sample used in the present study) by investigating the group-level variation of ratings based on a fixed model and a random model of the data.
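Pairwise deletion means that each correlation is computed from all cases that have values on both variables involved, so the effective N can differ from one coefficient to the next. The following is a minimal sketch of that idea in plain Python, assuming missing values are coded as `None`; it illustrates the principle and is not the SPSS procedure used in the study.

```python
from math import sqrt


def pearson_pairwise(x, y):
    """Pearson r and N using only cases where both values are present."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    return sxy / sqrt(sxx * syy), n


# Hypothetical scores with missing values (not study data).
trait = [2.1, 3.0, None, 2.9, 2.2, 3.3]
rating = [4, 6, 5, None, 5, 8]
r, n_used = pearson_pairwise(trait, rating)
print(round(r, 2), n_used)
```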
For investigating if correction for range restriction was appropriate, the standard deviations of GMA and personality trait scores in three groups based on the selection hurdles were inspected: (1) candidates attending the first two days of the selection program; (2) candidates obtaining interview ratings; and (3) candidates obtaining field selection exercise ratings. These standard deviations showed only minor differences. Furthermore, it would be imprecise to use GMA and NEO population norms as an unrestricted group for our purpose of investigating military officer candidates. Thus, because data from military studies relevant for correcting interrater unreliability were also lacking, observed associations were used for testing our hypotheses.
We chose to omit the intercorrelations between ratings in Table 1 for increasing readability and report these results here. There were large correlations between the competency ratings in both the interview (r = .72–.79) and the field selection exercise (r = .75–.84). The interview OAR and the field selection exercise OAR did not correlate strongly (r = .26), indicating a large amount of nonshared variance between these variables. Hierarchical regression analyses were used to test H1 (interview) and H3 (field selection exercise), with the candidate OARs as dependent variables. In the first analytic step, the control variables sex, age, and GMA were entered, followed by the FFM traits in the second. For testing H2 (interview) and H4 (field selection exercise), the correlations between the specific competency ratings and the FFM traits were used. Although the study aim did not include an investigation of the associations between NEO facets and candidate ratings, we have provided these correlations in a supplementary file (appendix).
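The two-step logic of the hierarchical regression can be sketched as follows: fit a model with the control variables only, fit a second model that adds the traits, and take the difference in R² as the increment attributable to personality. The example below is a minimal pure-Python illustration with one control and one trait on hypothetical data; the helper names `solve` and `r_squared` are assumptions, and the study itself reports standardized β values from SPSS rather than this code.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def r_squared(predictors, y):
    """R^2 of an ordinary least squares fit of y on predictors plus an intercept."""
    n = len(y)
    cols = [[1.0] * n] + predictors
    xtx = [[sum(u * v for u, v in zip(c1, c2)) for c2 in cols] for c1 in cols]
    xty = [sum(u * v for u, v in zip(c, y)) for c in cols]
    beta = solve(xtx, xty)
    fitted = [sum(b * c[i] for b, c in zip(beta, cols)) for i in range(n)]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Hypothetical toy data (not study data).
age = [18, 19, 19, 20, 21, 22, 23, 25]
extraversion = [2.1, 3.0, 2.5, 2.9, 2.2, 3.3, 2.6, 3.1]
oar = [4, 6, 5, 6, 5, 8, 6, 7]

r2_step1 = r_squared([age], oar)                # Step 1: control variable only
r2_step2 = r_squared([age, extraversion], oar)  # Step 2: control plus trait
r2_change = r2_step2 - r2_step1
print(round(r2_step1, 2), round(r2_step2, 2), round(r2_change, 2))
```

Because the Step 1 model is nested in the Step 2 model, R² cannot decrease when the traits are added; the question is whether the increase is larger than chance would produce.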
|NEO N (NMPI ES)||1.31 (5.24)||0.42 (0.86)||0.25–3.02 (1.73–6.93)||.15** (.19**)||–.03 (.06)||–.15** (.15**)||–.82**||.39**||–.23**||–.32**||–.41**|
|NEO E (NMPI E)||2.71 (5.19)||0.37 (0.74)||1.40–3.67 (2.61–7.00)||.15** (.06)||.10** (.07*)||–.11** (–.02)||–.31**||.80**||.33**||.61**||.28**|
|NEO O (NMPI O)||2.43 (4.83)||0.38 (0.61)||1.33–3.58 (2.76–6.82)||.09* (.01)||–.06 (–.05)||.18** (.15**)||–.12**||.32*||.80**||.31**||.10**|
|NEO A (NMPI A)||2.60 (5.72)||0.36 (0.67)||1.08–3.60 (2.13–7.00)||.20** (.18**)||–.06 (–.08*)||–.05 (–.07)||–.08*||.02*||.08*||.62**||.27**|
|NEO C (NMPI C)||2.93 (5.70)||0.37 (0.68)||1.52–3.88 (3.62–7.00)||.12** (.15**)||.01 (.02)||–.00 (–.05)||–.33**||.29**||.19**||.39**||.82**|
|Role model||6.26||1.67||1–9||.09**||.03||.00||–.06 (.07)||.19** (.19**)||.03 (.00)||.06 (.15**)||.09* (.10**)|
|Mental robust||6.05||1.69||1–9||.09**||.01||.02||–.11** (.13**)||.20** (.23**)||.03 (.01)||.04 (.13**)||.12** (.12**)|
|Development||5.96||1.68||1–9||.09**||.02||.03||–.07* (.08*)||.19** (.20**)||.05 (.02)||.06 (.15**)||.11** (.11**)|
|Role model||5.68||2.07||1–9||.02||.17**||–.08||–.00 (–.01)||.05 (.04)||–.09* (–.11*)||.04 (.02)||.03 (.04)|
|Task focus||5.44||2.07||1–9||.05||.15**||–.03||–.03 (.02)||.05 (.09*)||–.09* (–.10*)||–.02 (–.00)||.02 (.03)|
|Mental robust||5.66||2.08||1–9||–.02||.17**||–.04||–.07 (.03)||.07 (.07)||–.09* (–.10*)||–.00 (–.01)||.03 (.05)|
|Cooperation||5.56||1.85||1–9||.05||.17**||–.04||–.01 (–.00)||.08 (.06)||–.04 (–.09)||.06 (.02)||.06 (.07)|
|Development||5.56||1.83||1–9||.07||.14**||–.03||–.01 (.02)||.06 (.07)||–.03 (–.04)||.03 (.02)||.07 (.08)|
|Interview||6.09||1.53||1–9||–.10**||.02||.02||–.09* (.10**)||.21** (.23**)||.04 (.01)||.06 (.16**)||.11** (.12**)|
|Field exercise||5.58||1.82||1–9||.03||.18**||–.05||–.03 (.01)||.07 (.07)||–.08 (–.10*)||.02 (.01)||.05 (.06)|
Table 1 provides the correlations between the NEO domains and the NMPI factors on the one side and the interview and field selection exercise ratings on the other. Small statistically significant correlations were observed between the interview OAR and neuroticism/emotional stability, extraversion, agreeableness (NMPI only), and conscientiousness, with absolute values ranging from .09 to .23. Because neuroticism and emotional stability are scored in opposite directions, the correlation for neuroticism is negative (r = –.09) and that for emotional stability is positive (r = .10). For the field selection exercise OAR, only NMPI openness demonstrated a statistically significant correlation (r = –.10).
Table 2 summarizes the hierarchical multiple regression analyses using age, sex, and GMA as control variables. The inclusion of the NMPI factors provided a significant contribution to explaining the rating variance in both the interview (7%) and field selection exercise (3%); whereas, the NEO domains only showed a significant contribution for the interview (5%). Thus, personality variables contributed to an overall marginal increment in explained variance above that provided by the control variables, somewhat higher for the interview ratings relative to the field selection exercise ratings.
|Interview OAR (N = 810)||Field Exercise OAR (N = 551)|
|Predictor||Step 1 β||Step 2 β||Step 1 β||Step 2 β|
|Neuroticism||–.00 (.01)||.02 (–.04)|
|Agreeableness||–.01 (.05)||.02 (–.02)|
|Conscientiousness||.06 (.03)||.03 (.05)|
|R2||.01*||.06*** (.08***)||.03***||.05** (.06***)|
|R2 change||–||.05***(.07***)||–||.02 (.03*)|
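The significance of an R² change is conventionally evaluated with an F test that compares the increment per added predictor with the unexplained variance per residual degree of freedom. The sketch below applies that standard formula to the rounded interview values in Table 2 (NEO model), assuming N = 810, three control variables, and five added traits. Because the inputs are rounded and pairwise deletion makes the effective N vary, the result is only an approximation and is not a statistic reported by the authors.

```python
def f_change(r2_reduced, r2_full, n, k_full, k_added):
    """F statistic for the R^2 increment when k_added predictors are entered."""
    numerator = (r2_full - r2_reduced) / k_added
    denominator = (1 - r2_full) / (n - k_full - 1)
    return numerator / denominator


# Rounded values from Table 2, interview OAR, NEO domains.
# Assumed: 3 controls in Step 1 and 5 traits added in Step 2 (8 predictors in total).
f_interview = f_change(r2_reduced=0.01, r2_full=0.06, n=810, k_full=8, k_added=5)
df_numerator = 5
df_denominator = 810 - 8 - 1
print(round(f_interview, 2), df_numerator, df_denominator)
```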
For testing hypotheses 1 and 3, the standardized regression coefficients were used. The NMPI demonstrated a unique statistical contribution for extraversion and openness in both selection methods (β = .25/.17 and –.14/–.16). The NEO showed a unique statistical contribution for extraversion concerning ratings in both selection methods (β = .22/.11). However, for openness, a statistically significant beta value was reached for the field selection exercise only (β = –.11). Therefore, while extraversion was to some extent positively related to candidate ratings, openness was somewhat negatively related when controlling for age, sex, and GMA. The individual predictors of neuroticism, agreeableness, and conscientiousness were not significantly related to the candidate ratings.
Regarding hypotheses 2 and 4, concerning associations between the personality traits and the specific competency ratings, the correlations provided in Table 1 were used for investigation. Inspection of the correlations between the personality measures and the interview competency ratings (role model, mental robustness, and development) indicated no clear competency-dependent associative patterns with the five personality traits. Regarding the field selection exercise, a marginal tendency of associations was found between openness and the competencies of role model, task focus, and mental robustness, but not for cooperation and development.
The NEO and the NMPI coincide mostly in terms of associations toward the ratings, likely due to the correlation between these measures (shown in the upper right of Table 1). For agreeableness, however, there was a more moderate correlation (r = .62) pointing to somewhat different operationalizations between the measures regarding this trait. This may explain why NEO and NMPI agreeableness had different correlations toward the interview ratings.
Studies of construct embeddedness in interviews and assessment centers are important for an increased understanding of predictors (Berry et al., 2007; Collins et al., 2003; Roth et al., 2005). We set out to investigate this in a military setting by scrutinizing associations between FFM personality traits and ratings at an officer selection program when controlling for the well-known performance predictor of cognitive ability. Four hypotheses targeted expected associations between candidate personality and ratings of both OARs and five leadership competencies. Overall, our findings indicated that (1) personality traits were to some degree related to the OARs, where extraversion and openness demonstrated small isolated positive and negative associations, respectively, and (2) there was a lack of expected associations between FFM traits and single competencies. Thus, findings partially supported hypotheses 1 and 3, while no support was found for hypotheses 2 and 4.
The association between extraversion and candidate ratings was expected and is in agreement with previous civilian and military studies (Collins et al., 2003; Salgado & Moscoso, 2002; Sørlie et al., 2020; Thomas et al., 2001). Trait activation theory may shed light on extraversion embeddedness. Tett and Burnett (2003) theorized that situational cues increase the relevancy of a given personality trait, and arguably, a military selection arena holds several such cues for extraversion activation (Darr et al., 2018). However, our results for openness contrast with civilian findings (Collins et al., 2003; Salgado & Moscoso, 2002), which may point to differences in civilian and military selection processes concerning the attractiveness of this trait. Interestingly, a large-scale study documented that individuals low on openness were more likely to enter service in the German military than those scoring higher on this trait (Jackson et al., 2012). We speculate that, on average, action-oriented concrete thinkers may thrive more in military organizations relative to individuals with tendencies toward abstract thinking and aesthetic interests. As such, it is perhaps an advantage to be somewhat conventional when answering military interview questions and when persevering in a demanding exercise in the outdoors. It could also be that the interview questions and the work samples in the field selection exercise were suboptimal for triggering individual differences in openness.
It was somewhat surprising to observe the marginal personality overlap with the two selection methods, especially for the field selection exercise considering its duration of seven days. Acknowledging situational strength may contribute to understanding the scarcity of personality embeddedness. The gist of situational strength theory is that a strong situation provides guidelines or cues for expected behavior, whereas a weak situation does not (Judge & Zapata, 2015). In the field selection exercise, candidates were provided with uniforms and basic military equipment, and they certainly understood that they were under observation. Such contextual factors most likely constituted a strong situation, possibly constraining the manifestation of personality differences between candidates. While Darr (2011) reported an overall generalizability of published meta-analytic FFM estimates concerning the prediction of military job performance, it could be that situational forces are more salient in a selection setting. We also note that, considering the bandwidth debate over whether broad traits or their subcomponents are the best predictors of performance (e.g., Judge et al., 2013), NEO facets could perhaps show higher associations with the ratings relative to the NEO domains. However, as shown in the supplementary file, correlations at the NEO facet level were not clearly stronger than those at the NEO domain level. Some nuances can be seen, though, first and foremost regarding the field selection exercise, where no NEO domains correlated significantly with the OAR. Here, the facet of activity (extraversion) showed a significant positive association (r = .18), and the facets of ideas (openness), compliance (agreeableness), and depression (neuroticism) showed significant negative associations (r = –.13, –.11, and –.09, respectively).
By testing hypotheses of differential personality-competency associations, we could investigate possible personality overlaps in a more nuanced way relative to the usage of the OARs. However, our hypotheses were not supported, and there was otherwise no clear pattern in the correlations. This lack of a clear pattern, due to the high intercorrelations between the competency ratings, most likely points to a practice where interviewers and assessors rated candidates based on global evaluations. The high intercorrelations may, of course, be due to a “g” factor, where candidates who excel on one leadership competency excel on others as well; such tendencies have, for example, been demonstrated in ratings of job performance (Viswesvaran et al., 2005). However, it is also relevant to note findings from decision-making psychology, where several cognitive biases fuel so-called “system 1” thinking characterized by fast and intuitive information processing (Kahneman, 2011), which can potentially threaten the use of the interview scoring system and the BARS. One such likely bias is the halo effect, whereby a global evaluation of a person influences judgments of specific attributes (Nisbett & Wilson, 1977; Viswesvaran et al., 2005). We did not, however, obtain data on the decision-making processes of selection officers. Based on the high intercorrelations of competency ratings, it is difficult to argue for aspects of the construct validity of the competencies themselves. The present study shows that when using the leadership competencies in a practical selection context, ratings of the isolated competencies intercorrelate highly.
The results thus revealed a suboptimal rating practice at the officer selection program. Still, we do not intend to criticize the selection methods of the interview and the field exercise per se. With stronger associations between constructs measured by cost-friendly psychometric tests and judgment-based ratings, one could, from a predictive perspective, argue that the more costly rater-based selection methods are unnecessary (Collins et al., 2003). As seen in our study, with low associations, the argument can be turned around, thus pointing to a potential for incremental validity when using psychometric tests as predictors. Although we acknowledge the weak embeddedness of the established predictors of cognitive ability and personality traits in the ratings, we do not know the predictive validity of the interview and field selection exercise ratings toward military job performance. Nevertheless, such costly selection methods are valuable for purposes other than purely predictive ones. Among those are realistic job previews and beginning socialization into a military identity, which may foster positive applicant reactions and acceptance rates for chosen candidates.
The present study has some limitations. First, as all research participants were preselected through a conscript assessment procedure and had also actively applied to attend the selection program, there was some risk of range restriction in study variables. However, our purpose was not to generalize to the general population but to preselected candidates for officer selection. Thus, our relevant unrestricted group would be those attending the first two days of the selection program (i.e., before the interview and the field selection exercise). Considering, for example, the GMA scores, there were only minor differences in the standard deviations between unrestricted and assumed restricted groups: (1) candidates attending the first two days of the selection program, M = 6.62, SD = 1.23; (2) candidates obtaining interview ratings, M = 6.61, SD = 1.23; and (3) candidates obtaining field selection exercise ratings, M = 6.65, SD = 1.24. For NEO neuroticism: (1) M = 1.31, SD = 0.42; (2) M = 1.30, SD = 0.41; (3) M = 1.27, SD = 0.40. We also note that some social desirability in the self-report personality measures may have occurred, as the data were collected in a selection setting, and may thus have contributed to skewed distributions.
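The comparison of standard deviations above can be made concrete by computing the ratio of the restricted to the unrestricted standard deviation, often written u in the range-restriction literature, where values close to 1 indicate little restriction. The following is a minimal illustrative sketch using only the standard deviations reported above; the function name and the dictionary layout are our own assumptions and not part of the original analyses.

```python
def restriction_ratio(sd_restricted: float, sd_unrestricted: float) -> float:
    """Return u = SD_restricted / SD_unrestricted (hypothetical helper)."""
    if sd_unrestricted <= 0:
        raise ValueError("The unrestricted SD must be positive.")
    return sd_restricted / sd_unrestricted


# Standard deviations as reported in the text for the three groups.
reported_sds = {
    "GMA": {"unrestricted": 1.23, "interview": 1.23, "field_exercise": 1.24},
    "NEO neuroticism": {"unrestricted": 0.42, "interview": 0.41, "field_exercise": 0.40},
}

# Compute u for each assumed restricted group relative to the unrestricted group.
ratios = {
    measure: {
        group: restriction_ratio(sd, sds["unrestricted"])
        for group, sd in sds.items()
        if group != "unrestricted"
    }
    for measure, sds in reported_sds.items()
}

for measure, groups in ratios.items():
    for group, u in groups.items():
        print(f"{measure:16s} {group:15s} u = {u:.3f}")
```

With the reported values, all four ratios fall between roughly 0.95 and 1.01, which is consistent with the statement that the differences in standard deviations were minor.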
Second, the reliability and validity of competency ratings were unknown, first and foremost due to a lack of interrater reliability studies and of information on selection officers’ actual decision-making processes. As our criteria were ratings from a military interview and a field exercise in a simulated war context, which are not directly comparable to civilian employment interviews and assessment centers, interrater reliability estimates from civilian settings were not applicable, and we did not have relevant data for correcting the criteria for attenuation. Considering the field selection exercise, the authors did not have access to work-sample-specific ratings. Perhaps more nuanced personality-competency rating relationships would be observed with such data.
Third, unfortunately, it was not possible in our dataset to identify whether candidates applied for officer school in the Army, Navy, or Air Force. Because incumbents in these three branches within the Norwegian Armed Forces seem to have different military identities (Johansen et al., 2013), and also because they have had experiences of somewhat different traditions for officer selection prior to the implementation of the joint selection program (Hansen, 2006), personality saturation might have been somewhat branch dependent.
Future studies may investigate the reliability and predictive validity of the competency ratings. Findings would be important for evaluating whether the competencies are adequately measurable and are valid indicators of individual prerequisites for a mission command leader. Considering the frequent imprecision of job competencies relative to psychological constructs, as emphasized by Furnham (2008), such a study may be valuable. Another line of research might investigate selection officers’ decision-making processes when they are expected to use interview scoring systems and BARS. Such a study would be theoretically interesting and valuable for evaluating the practical adequacy of structured selection systems in military settings.
While low associations between candidate personality and ratings point to the usefulness of personality testing in the selection program (e.g., the potential for incremental validity), we also suggest that personality test scores may help assessors achieve more nuanced competency assessments of candidates. There are content similarities between the FFM and the mission command competencies that military psychologists and assessors may discuss to counteract the tendency toward global evaluations of candidates. At the time of writing, personality testing is not systematically used at the selection program (i.e., as a predictor). The NMPI, developed in-house, may be a promising tool for future test usage, where possible advantages for military organizations are a short administration time and a lack of proprietary restrictions. However, further reliability and predictive validity analyses of the NMPI are warranted before operational use.
In closing, we suggest that an awareness of whether high scorers on extraversion or openness are rated objectively can be important in military selection. While extraversion might be advantageous to some degree in military settings (Darr et al., 2018), a possible extraversion favorability in a selection program is suboptimal. Such favorability may be especially counterproductive when it camouflages low conscientiousness scores (Wiersma & Kappe, 2016), considering the predictive validity of the latter trait in the context of military job performance (Darr, 2011; Fosse et al., 2015; Salgado, 1998). Although high openness scorers presumably are few in military organizations (Jackson et al., 2012), this trait may very well be relevant for success both in educational programs and, ultimately, in the execution of leadership in the unpredictable and potentially dangerous contexts in which military officers might operate (Campbell et al., 2010). For example, open-mindedness and creativity are possibly more adaptive than rigidity and conventionality when engaging hostile forces under changing circumstances. Furthermore, we speculate that openness can also be advantageous when developing and employing mission command leadership principles (i.e., encouraging decentralized and disciplined initiatives), such as being generally self-reflective and forthcoming when subordinates present original solutions to challenges and problems.
1. While the five-factor model and the Big Five taxonomy belong to the questionnaire and lexical traditions, respectively, the contents of the five main personality factors are essentially equivalent (Simms et al., 2017).
The authors thank the Norwegian Armed Forces HR and Conscription Centre for supporting the data collection.
The authors have no competing interests to declare.
Tom H. Skoglund designed the study and wrote the manuscript. Tom H. Skoglund, Thomas Fosse and Ole Christian Lang-Ree collected the data. All authors analyzed and interpreted the data and gave feedback on the manuscript.
Barrick, M. R., Mount, M. K., & Judge, T. A. (2001). Personality and performance at the beginning of the new millennium: What do we know and where do we go next? International Journal of Selection and Assessment, 9(1–2), 9–30. DOI: https://doi.org/10.1111/1468-2389.00160
Berry, C. M., Sackett, P. R., & Landers, R. N. (2007). Revisiting interview-cognitive ability relationships: Attending to specific range restriction mechanisms in meta-analysis. Personnel Psychology, 60(4), 837–874. DOI: https://doi.org/10.1111/j.1744-6570.2007.00093.x
Calleja, J. A., Hoggan, B. L., & Temby, P. (2019). Individual predictors of tactical planning performance in junior military officers. Military Psychology, 32(2), 149–163. DOI: https://doi.org/10.1080/08995605.2019.1691405
Campbell, D. J., Hannah, S. T., & Matthews, M. D. (2010). Leadership in military and other dangerous contexts: Introduction to the special topic issue. Military Psychology, 22, S1–S14. DOI: https://doi.org/10.1080/08995601003644163
Collins, J. M., Schmidt, F. L., Sanchez-Ku, M., Thomas, L., McDaniel, M. A., & Le, H. (2003). Can basic individual differences shed light on the construct meaning of assessment center evaluations? International Journal of Selection and Assessment, 11(1), 17–29. DOI: https://doi.org/10.1111/1468-2389.00223
Darr, W. A. (2011). Military personality research: A meta-analysis of the Self Description Inventory. Military Psychology, 23(3), 272–296. DOI: https://doi.org/10.1080/08995605.2011.570583
Darr, W. A., & Catano, V. M. (2016). Determining predictor weights in military selection: An application of dominance analysis. Military Psychology, 28(4), 193–208. DOI: https://doi.org/10.1037/mil0000107
Darr, W. A., Ebel-Lam, A., & Doucet, R. G. (2018). Investigating the extravert advantage in training: Exploring reward sensitivity, training motivation, and self-efficacy as intermediary factors. Canadian Journal of Behavioural Science/Revue canadienne des sciences du comportement, 50(3), 172–184. DOI: https://doi.org/10.1037/cbs0000102
Fosse, T. H., Buch, R., Säfvenbom, R., & Martinussen, M. (2015). The impact of personality and self-efficacy on academic and military performance: The mediating role of self-efficacy. Journal of Military Studies, 6(1), 47–65. DOI: https://doi.org/10.1515/jms-2016-0197
Furnham, A. (2008). Intelligence and personality at work. Exploring and explaining individual differences at work. Routledge. DOI: https://doi.org/10.4324/9780203938911
Hansen, I. (2006). Bidrag til psykologitjenestens historie i Forsvaret fra 1946–2006 [Contributions to the History of the Norwegian Military Psychology Service 1946–2006]. Norwegian Defence University College.
Hoffman, B. J., Kennedy, C. L., Lopilato, A. C., Monahan, E. L., & Lance, C. E. (2015). A review of the content, criterion-related, and construct-related validity of assessment center exercises. Journal of Applied Psychology, 100(4), 1143–1168. DOI: https://doi.org/10.1037/a0038707
Huffcutt, A. I., Conway, J. M., Roth, P. L., & Stone, N. J. (2001). Identification and meta-analytic assessment of psychological constructs measured in employment interviews. Journal of Applied Psychology, 86(5), 897–913. DOI: https://doi.org/10.1037/0021-9010.86.5.897
Hystad, S. W., Eid, J., Laberg, J. C., & Bartone, P. T. (2011). Psychological hardiness predicts admission into Norwegian military officer schools. Military Psychology, 23(4), 381–389. DOI: https://doi.org/10.1080/08995605.2011.589333
International Taskforce on Assessment Center Guidelines. (2015). Guidelines and ethical considerations for assessment center operations. Journal of Management, 41, 1244–1273. DOI: https://doi.org/10.1177/0149206314567780
Jackson, J. J., Thoemmes, F., Jonkmann, K., Lüdtke, O., & Trautwein, U. (2012). Military training and personality trait development: Does the military make the man, or does the man make the military? Psychological Science, 23(3), 270–277. DOI: https://doi.org/10.1177/0956797611423545
Johansen, R. B., Laberg, J. C., & Martinussen, M. (2013). Military identity as predictor of perceived military competence and skills. Armed Forces & Society, 40(3), 521–543. DOI: https://doi.org/10.1177/0095327X13478405
Judge, T. A., Rodell, J. B., Klinger, R. L., Simon, L. S., & Crawford, E. R. (2013). Hierarchical representations of the five-factor model of personality in predicting job performance: Integrating three organizing frameworks with two theoretical perspectives. Journal of Applied Psychology, 98(6), 875–925. DOI: https://doi.org/10.1037/a0033901
Judge, T. A., & Zapata, C. P. (2015). The person-situation debate revisited: Effect of situation strength and trait activation on the validity of the Big Five personality traits in predicting job performance. Academy of Management Journal, 58(4), 1149–1170. DOI: https://doi.org/10.5465/amj.2010.0837
Køber, P., Lang-Ree, O. C., Stubberud, K., & Martinussen, M. (2017). Predicting basic military performance for conscripts in the Norwegian Armed Forces. Military Psychology, 29(6). DOI: https://doi.org/10.1037/mil0000192
McCrae, R. R., Costa, P. T., & Martin, T. A. (2005). The NEO-PI-3: A more readable revised NEO personality inventory. Journal of Personality Assessment, 84(3), 261–270. DOI: https://doi.org/10.1207/s15327752jpa8403_05
NAF (Norwegian Armed Forces). (2020). Lederkandidatstudien [Leadership Candidate Study]. Retrieved from https://forsvaret.no/hogskolene/Forskning/lederkandidatstudien
Nisbett, R., & Wilson, T. D. (1977). Telling more than we can know: Verbal reports on mental processes. Psychological Review, 84(3), 231–259. DOI: https://doi.org/10.1037/0033-295X.84.3.231
Oshio, A., Taku, K., Hirano, M., & Saeed, G. (2018). Resilience and Big Five personality traits: A meta-analysis. Personality and Individual Differences, 127, 54–60. DOI: https://doi.org/10.1016/j.paid.2018.01.048
Roth, P. L., Van Iddekinge, C. H., Huffcutt, A. I., Eidson, C. E., Jr., & Schmit, M. J. (2005). Personality saturation in structured interviews. International Journal of Selection and Assessment, 13(4), 261–273. DOI: https://doi.org/10.1111/j.1468-2389.2005.00323.x
Rumsey, M. G., & Arabian, J. M. (2014). Military Enlistment Selection and Classification: Moving Forward. Military Psychology, 26(3), 221–251. DOI: https://doi.org/10.1037/mil0000040
Salgado, J. F. (1998). Big Five personality dimensions and job performance in army and civil occupations: A European perspective. Human Performance, 11(2), 271–288. DOI: https://doi.org/10.1207/s15327043hup1102&3_8
Salgado, J. F., & Moscoso, S. (2002). Comprehensive meta-analysis of the construct validity of the employment interview. European Journal of Work and Organizational Psychology, 11(3), 299–324. DOI: https://doi.org/10.1080/13594320244000184
Schmidt, F. L., & Hunter, J. (1998). The validity and utility of selection methods in personnel psychology. Psychological Bulletin, 124(2), 262–274. DOI: https://doi.org/10.1037/0033-2909.124.2.262
Schmidt, F. L., & Hunter, J. (2004). General mental ability in the world of work: Occupational attainment and job performance. Journal of Personality and Social Psychology, 86(1), 162–173. DOI: https://doi.org/10.1037/0022-3514.86.1.162
Schmidt, F. L., Oh, I.-S., & Shaffer, J. A. (2016). The validity and utility of selection methods in personnel psychology: Practical and theoretical implications of 100 years of research findings. Working paper.
Shaffer, J. A., & Postlethwaite, B. E. (2012). A matter of context: A meta-analytic investigation of the relative validity of contextualized and noncontextualized personality measures. Personnel Psychology, 65(3), 445–493. DOI: https://doi.org/10.1111/j.1744-6570.2012.01250.x
Simms, L., Williams, T. F., & Simms, E. N. (2017). Assessment of the Five Factor Model. In T. A. Widiger (Ed.), The Oxford Handbook of the Five Factor Model. Oxford University Press. DOI: https://doi.org/10.1093/oxfordhb/9780199352487.013.28
Skoglund, T. H., Brekke, T.-H., Steder, F. B., & Boe, O. (2020). Big Five personality profiles in the Norwegian Special Operations Forces. Frontiers in Psychology, 11, 747. DOI: https://doi.org/10.3389/fpsyg.2020.00747
Sørlie, H. O., Hetland, J., Dysvik, A., Fosse, T. H., & Martinsen, Ø. L. (2020). Person-Organization Fit in a military selection context. Military Psychology, 32(3), 237–246. DOI: https://doi.org/10.1080/08995605.2020.1724752
Tay, C., Ang, S., & Van Dyne, L. (2006). Personality, biographical characteristics, and job interview success: A longitudinal study of the mediating effects of interviewing self-efficacy and the moderating effects of internal locus of causality. Journal of Applied Psychology, 91(2), 446–454. DOI: https://doi.org/10.1037/0021-9010.91.2.446
Thomas, J. L., Dickson, M. W., & Bliese, P. D. (2001). Values predicting leader performance in the U.S. Army Reserve Officer Training Corps Assessment Center: Evidence for a personality-mediated model. The Leadership Quarterly, 12(2), 181–196. DOI: https://doi.org/10.1016/S1048-9843(01)00071-6
Thornton, G. C., & Gibbons, A. M. (2009). Validity of assessment centers for personnel selection. Human Resource Management Review, 19(3), 169–187. DOI: https://doi.org/10.1016/j.hrmr.2009.02.002
Viswesvaran, C., Schmidt, F. L., & Ones, D. S. (2005). Is there a general factor in ratings of job performance? A meta-analytic framework for disentangling substantive and error influences. Journal of Applied Psychology, 90(1), 108–131. DOI: https://doi.org/10.1037/0021-9010.90.1.108
Wiersma, U., & Kappe, R. (2016). Selecting for Extroversion but rewarding for Conscientiousness. European Journal of Work and Organizational Psychology, 26(2). DOI: https://doi.org/10.1080/1359432X.2016.1266340
Yukl, G. (2012). Effective leadership behavior: What we know and what questions need more attention. Academy of Management Perspectives, 26(4), 66–85. DOI: https://doi.org/10.5465/amp.2012.0088