Development and Validation of a Military Training Mental Toughness Inventory

Three studies were conducted to develop and validate a mental toughness instrument for use in military training environments. Study 1 (n = 435) focused on item generation and testing the structural integrity of the Military Training Mental Toughness Inventory (MTMTI). The measure assessed ability to maintain optimal performance under pressure from a range of different stressors experienced by recruits during infantry basic training. Study 2 (n = 104) examined the concurrent validity, predictive validity, and test–retest reliability of the measure. Study 3 (n = 106) confirmed the predictive validity of the measure with a sample of more specialized infantry recruits. Overall, the military training mental toughness inventory demonstrated sound psychometric properties and structural validity. Furthermore, it was found to possess good test–retest reliability, concurrent validity, and predicted performance in 2 different training contexts with 2 separate samples.

The research literature on mental toughness has been dominated by qualitative approaches, which have significantly shaped our understanding of mental toughness (e.g., Bull, Shambrook, James, & Brooks, 2005; Connaughton et al., 2008; Coulter et al., 2010; Gucciardi, Gordon, & Dimmock, 2009a; Jones et al., 2002). However, some researchers have argued that qualitative methods have become overused (e.g., Andersen, 2011), whereas others have urged researchers to develop reliable and valid measures of mental toughness (e.g., Sheard, Golby, & van Wersch, 2009). Further, Hardy, Bell and Beattie (2014) argue that one of the limitations of adopting qualitative methods is that researchers are unable to differentiate between the causes of mental toughness, processes, outcomes, and other behaviors that are more likely to be correlates associated with mental toughness.
There are, however, some notable exceptions to the qualitative approaches, with several quantitatively derived mental toughness measures having been developed (e.g., the Mental Toughness Inventory (MTI; Middleton, Marsh, Martin, Richards, & Perry, 2004); the Sport Mental Toughness Questionnaire (SMTQ; Sheard et al., 2009); the Mental Toughness Questionnaire-48 (MTQ-48; Clough, Earle, & Sewell, 2002); and the Cricket Mental Toughness Inventory (CMTI; Gucciardi & Gordon, 2009)). Although these various measures of mental toughness have significantly contributed to the mental toughness literature and have gone some way to alleviating the overreliance on qualitative approaches, they are not without their critics (see, e.g., Gucciardi, Hanton, & Mallett, 2012). Hardy et al. (2014) argued that although the above measures capture a wide array of values, attitudes, cognitions, and affect, they do not explicitly capture mentally tough behavior. They further argue that psychological variables may influence mental toughness, or be correlates of it, but that the primary focus of such measures should be on assessing the presence or absence of mentally tough behavior. Hardy and colleagues also argue that the use of self-report measures in assessing behaviors may be questionable because of social desirability and self-presentation confounds. To this end, Hardy et al. developed an informant-rated, behavior-based Mental Toughness Inventory (MTI) in an elite sport context that was underpinned by the following definition: "the ability to achieve personal goals in the face of pressure from a wide range of different stressors" (p. 5). This definition of mental toughness was used to underpin the current research.
It is important to note that researchers into the concept of mental toughness are not alone in attempting to solve the dilemma of ameliorating the potential harmful effects of exposure to stress. Several similar yet subtly different constructs associated with stress exposure have been proposed, defined, and operationalized. These include the concepts of hardiness, resilience, and grit. Hardiness is viewed as a relatively stable personality characteristic, which involves courage, adaptability, and the ability to maintain optimal performance under exposure to stress. It has been conceptualized as a combination of three attitudes (commitment, control, and challenge), which provide an individual with existential courage and motivation to appraise stressful situations as opportunities for growth (Kobasa, 1979; Maddi, 2006, 2007). Hardiness and its core components of commitment, control, and challenge are viewed as fundamental to another similar concept, resilience (Maddi, 2007). Resilience is characterized by the ability to recover from negative emotional experiences and the ability to adapt to stressful situations. Another similar psychological construct proposed by Duckworth et al. (2007), which involves striving toward challenges and maintaining effort and persistence despite adversity, setbacks, and failure, is termed 'grit.' They define grit as "perseverance and passion for long-term goals" (Duckworth et al., 2007, p. 1087), with the emphasis on long-term stamina, rather than short-term intensity. Kelly et al. (2014) suggest that the concept of grit has obvious utility in the military domain in that it is synonymous with fortitude or courage and the essence of officer cadet development in military academies.
Although all of these psychological concepts describe psychological characteristics that are undoubtedly important in a military context, they differ from the current construct of mental toughness in that the current research is specifically examining mentally tough 'behavior,' that is, the ability to maintain goal focus and high levels of performance in the face of different stressors. The concepts of hardiness, resilience, and grit are described as a constellation of personality characteristics and are as such typically measured at this level. However, mental toughness in the current research is measured and conceptualized at the behavioral level. That is, although the behaviors will be to some extent underpinned by personality, the level of measurement is not personality per se. This is an important distinction that will help to further the mental toughness literature by offering a means by which the personality and behavior relationship can be examined. Indeed, Hardy et al. (2014) demonstrated that the current definition of mental toughness was underpinned by Gray and McNaughton's (2000) revised Reinforcement Sensitivity Theory (rRST). Hardy et al.'s (2014) MTI has been shown to have good psychometric properties and strong test-retest reliability, and to discriminate successfully between professional and nonprofessional athletes. A particular strength of the MTI (which sets it apart from other conceptualizations of mental toughness) is that it was conceptualized within a theoretically driven neuropsychological framework, namely Gray and McNaughton's (2000) rRST. rRST was used as it has the potential to offer a neuropsychological explanation of the maintenance of goal-directed behavior in the face of stressful stimuli. Hardy et al. were successful in examining the prediction of mental toughness from rRST personality traits.
In a further study, the MTI was used to evaluate the efficacy of a successful mental toughness training intervention (Bell, Hardy, & Beattie, 2013) that was underpinned by Hardy et al.'s findings.
The MTI and the use of rRST (Gray & McNaughton, 2000) appear to offer some promise in furthering our understanding of mentally tough behavior in elite sport. Consequently, based on Hardy et al.'s findings, there is a need to develop contextually relevant measures of mentally tough behaviors for other settings. One particular context where mental toughness is undoubtedly important is within the military. However, to date there appears to have been little or no empirical research conducted on mental toughness in the military domain, although there is evidence to suggest that it has recently started to be explored (e.g., Hammermeister, Pickering, & Lennox, 2011).
Military action requires soldiers to perform under intense pressure in highly stressful environments, characterized by fear, fatigue, and anxiety largely caused by risk to one's life. Typical combat stressors include, for example, exposure to enemy fire and improvised explosive devices, armed combat, and seeing colleagues killed or seriously injured. To demonstrate this, one soldier recently defined mental toughness as "gearing yourself up to go on a patrol in Afghanistan, outside the wire, the day after you lost a member of your squad to a sniper, and you know the sniper is still out there" (Lt Col. Burbelo; cited in Hammermeister et al., 2011, p. 4). The purpose of the present study was to develop a behaviorally based measure of mental toughness in a military training environment based on Hardy et al.'s (2014) definition and measure. Four independent samples, drawn from general and specialized infantry training platoons from a U.K.-based Army training establishment, were employed in the study.

Method
Stage I: Item development. Item development was underpinned by the behaviorally based approach adopted by Hardy et al. (2014). Environmental stressors were identified by conducting focus groups with recruit instructors and senior military personnel. An item pool representative of typical stressors experienced by recruits in training (e.g., feeling fatigued, being reprimanded, pressure to perform well) was developed by the authors and then presented back to the recruit instructors for further refinement. This resulted in a 15-item pool.
Participants and procedure. A total of 279 infantry recruits (M age = 21.45, SD = 3.16) who were between 5 and 24 weeks of training (M = 14.18 weeks, SD = 7.11) were reported on by 41 male infantry recruit instructors who had served for an average of 9.03 years in the Army (SD = 2.35) and had spent an average of 11.78 months as an instructor (SD = 5.89). For the instructors to accurately assess the recruits, a minimum of 5 weeks of supervision was set as an inclusion criterion (M = 11.73 weeks, SD = 6.84 weeks).
Infantry recruit instructors are responsible for training infantry recruits through a 26-week Combat Infantryman's Course (CIC). They are all experienced section corporals who are selected to serve a 24-month tenure at a training establishment before returning to their parent unit. The aim of the CIC is to train infantry recruits to the standards required of an infantry soldier to operate as an effective member of a platoon in extremely hostile environments. Infantry training is therefore designed to be both physically and mentally demanding, with the majority of instruction and training taking place outdoors and on field exercises. A recruit who fails to meet the required standards at any point in training is reallocated to an earlier point in training with another training platoon.
After receiving institutional ethical approval, instructors and recruits were verbally solicited to take part in the study, and informed of the nature of the study and the inclusion criteria. Confidentiality was assured and once the inclusion criteria were satisfied, informed consent was obtained. The same conditions for recruitment, participation and assurance of confidentiality were applied to all of the studies in this research program.
The instructors were asked to complete the 15 items that were retained from Stage I for each recruit in their section, rating how well the recruit was able to maintain a high level of personal performance when confronted with different stressful situations in training (example items included "when the conditions are difficult" and "when he has been reprimanded/punished"; see the Appendix). Responses were based on a 7-point Likert scale that ranged from 1 (never) to 7 (always), with a midpoint anchor of 4 (sometimes).
Results. Confirmatory factor analysis (CFA) using LISREL 8.80 (Jöreskog & Sörbom, 2006) was used in an exploratory way to refine the item pool. The fit statistics for the 15-item model were poor, χ²(90) = 511.23, p < .01; RMSEA = .10, CFI = .97, NFI = .96, SRMR = .06, GFI = .80. Post hoc item refinement was conducted using the standardized residuals, modification indices for theta delta, and theoretical rationale. This process identified a number of items that had considerable conceptual overlap with other items, were ambiguously worded, or referred to environmental conditions that may not be a universal stressor. Removal of these items resulted in a six-item scale that demonstrated a good fit to the data, χ²(9) = 17.95, p = .04; CFI = .99, RMSEA = .03, SRMR = .02, NFI = .99, NNFI = .99, GFI = .98. The mean mental toughness score was 4.17 (SD = 1.30) with an internal consistency (Cronbach's alpha) of .89. Factor loadings ranged from .72 to .81 (see Table 1 for items and descriptives).
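As an illustration of the internal consistency statistic reported above, the following is a minimal sketch of how Cronbach's alpha can be computed from a recruits-by-items matrix of instructor ratings. The function name and the toy ratings are hypothetical and are not taken from the study data; the formula used is the standard one, alpha = k/(k - 1) * (1 - sum of item variances / variance of total scores).

```python
from statistics import pvariance


def cronbach_alpha(ratings):
    """Cronbach's alpha for a list of rows (one row per recruit, one column per item)."""
    k = len(ratings[0])                      # number of items in the scale
    columns = list(zip(*ratings))            # transpose: one tuple of scores per item
    item_variance_sum = sum(pvariance(col) for col in columns)
    total_variance = pvariance([sum(row) for row in ratings])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical 1-7 ratings for five recruits on a six-item scale (illustration only).
toy_ratings = [
    [4, 5, 4, 4, 5, 4],
    [2, 3, 2, 3, 2, 2],
    [6, 6, 7, 6, 6, 7],
    [5, 4, 5, 5, 4, 5],
    [3, 3, 4, 3, 3, 3],
]
alpha = cronbach_alpha(toy_ratings)
```

Population variances are used for both the items and the total score; because the same denominator appears in numerator and denominator of the ratio, using sample variances throughout would give the same alpha.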
Stage II: Structural validity. The purpose of Stage II was to confirm the factor structure of the MTMTI on a separate sample.
Participants and procedure. A total of 156 recruits (M age = 21.33, SD = 2.90) between weeks 7 and 23 of training (M = 14.77 weeks, SD = 6.49) were reported on by 23 instructors (M age = 26.87, SD = 2.09) who had served for an average of 8.48 years in the Army (SD = 2.27) and had spent an average of 13.30 months as an instructor (SD = 5.46) training recruits. Instructors completed the 6-item MTMTI developed in Stage I.

MTMTI. The MTMTI developed and validated in Study 1 was used.
Concurrent validity of the MTMTI was tested by selecting variables that are theorized to correlate with mentally tough behavior (e.g., self-report mental toughness, self-confidence, and resilience measures). Predictive validity was tested by assessing the extent to which the MTMTI predicted performance.
Sport Mental Toughness Questionnaire. The Sport Mental Toughness Questionnaire (SMTQ; Sheard et al., 2009) is a 14-item measure that consists of three subscales: confidence, constancy, and control. These subscales can be combined to create a global measure of mental toughness. Items are rated on a 4-point Likert scale anchored at 1 (not at all true) and 4 (very true). Example items include "I have what it takes to perform well under pressure" (confidence); "I am committed to completing the tasks I have to do" (constancy); and "I worry about performing poorly" (control; reverse scored). CFA has been shown to provide good support for the 3-factor model (Sheard et al., 2009).
Self-confidence. Self-confidence was measured using a 5-item scale that was developed and validated by Hardy et al. (2010) in a military training context by asking, "compared to the most confident recruit you know, how would you rate your confidence in your ability to . . ." (e.g., "meet the challenges of training"). Responses are given on a 5-point Likert scale anchored at 1 (low) and 5 (high). This scale has been shown to have good psychometric properties and predictive validity in a military training context (Hardy et al., 2010).
Resilience scale. Resilience was measured using a 4-item resilience scale developed specifically for use in a military training context by Hardy et al. (2010). The stem and response format used were the same as for the self-confidence scale. Example items include "adapt to different situations in training and be successful." This scale has been shown to have good psychometric properties and predictive validity in a military training context (Hardy et al., 2014).
Performance. Performance was determined by the recruits' end of course final grades, based on their weekly reports and grades throughout the CIC. This grade is awarded by the platoon commander (Lieutenant or Captain) and ranges from 0 (fail) to 6 (excellent).
Procedure. To assess test-retest reliability, the MTMTI was administered at weeks 20 and 23 of training. The self-report SMTQ, resilience, and confidence scales were administered during week 23 of training, and the performance data were collected at the end of training (week 26).

Results
Descriptive statistics and correlations for all study variables are displayed in Table 2. The MTMTI demonstrated a good fit to the data (χ²(9) = 6.81, p = .66; RMSEA = .00, NNFI = 1.00, CFI = 1.00, SRMR = .01), although this result should be interpreted with caution due to the small sample size.
Test-retest reliability. The mean mental toughness score at week 20 was 4.95 (SD = 1.34), and the mean score at week 23 was 4.89 (SD = 1.36). A paired sample t test revealed that these means were not significantly different, t(103) = 0.63, p > .05. The test-retest reliability for the MTMTI was .72.
Concurrent validity. Table 2 demonstrates that the MTMTI significantly correlated with the global SMTQ (r = .43), the separate subscales of the SMTQ (confidence r = .37, constancy r = .40, and control r = .24), and Hardy et al.'s (2010) subscales of resilience (r = .35) and confidence (r = .33).
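The test-retest check reported above can be sketched as follows: a paired-samples t statistic on the two administrations, with n - 1 degrees of freedom (103 for 104 recruits), and a correlation between them. The text does not state which reliability coefficient was used, so the Pearson correlation here is an assumption, and the function names and toy scores are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def paired_t(x, y):
    """Paired-samples t statistic; degrees of freedom are len(x) - 1."""
    diffs = [a - b for a, b in zip(x, y)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


# Hypothetical mean MTMTI scores at two time points (illustration only).
week_20 = [4.8, 5.5, 3.2, 6.0, 4.1, 5.0]
week_23 = [4.6, 5.7, 3.5, 5.8, 4.0, 5.2]
retest_r = pearson_r(week_20, week_23)
t_stat = paired_t(week_20, week_23)
```

A nonsignificant t together with a high correlation is the pattern the text describes: mean ratings are stable and recruits keep broadly the same rank order across administrations.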
Predictive validity. Regression analysis revealed that mental toughness significantly predicted individual course performance (R² = .31; β = .56, p < .01). Furthermore, hierarchical regression analyses revealed that the MTMTI accounted for a significant proportion of variance in course performance (Block 2: ΔR² = .19; β = .48, p < .01) over and above that accounted for by the SMTQ (Block 1: R² = .15; β = .19, p < .01). We also tested whether the MTMTI accounted for variance in performance after controlling for all the self-report variables used in the current study. The results revealed that the MTMTI accounted for a significant proportion of variance in performance (Block 2: ΔR² = .18; β = .48, p < .01) over and above that accounted for by all the self-report measures (Block 1: R² = .17, p < .05).
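To make the hierarchical step concrete, the sketch below computes the change in R² when a second predictor is added to a one-predictor model. It covers only the simplest case (one covariate in Block 1, one predictor added in Block 2) and uses the standard closed-form expression for R² with two predictors in terms of pairwise correlations. The function names and the toy data are hypothetical, and this is not a reconstruction of the software or procedure used in the study.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def r_squared_two_predictors(y, x1, x2):
    """R² of y regressed on x1 and x2, from the three pairwise correlations."""
    r1, r2, r12 = pearson_r(y, x1), pearson_r(y, x2), pearson_r(x1, x2)
    return (r1 ** 2 + r2 ** 2 - 2 * r1 * r2 * r12) / (1 - r12 ** 2)


def delta_r_squared(y, block1, block2):
    """Variance in y explained by block2 over and above block1."""
    return r_squared_two_predictors(y, block1, block2) - pearson_r(y, block1) ** 2


# Hypothetical data: self-report score (Block 1), observer rating (Block 2), grade.
self_report = [2.5, 3.0, 3.2, 2.8, 3.6, 3.1]
observer = [3.9, 5.2, 4.4, 3.1, 6.0, 5.1]
grade = [3, 4, 4, 2, 6, 4]
increment = delta_r_squared(grade, self_report, observer)
```

With more than one covariate in Block 1, as in the second analysis reported above, the same quantity is obtained by fitting the two nested regression models and subtracting their R² values.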

Study 3: Further Test of Predictive Validity
Study 2 demonstrated the test-retest reliability and concurrent and predictive validity of the MTMTI. Furthermore, the MTMTI was shown to predict performance after controlling for self-reported mental toughness. The aim of Study 3 was to further test the predictive validity of the MTMTI in a specialized infantry context, namely the Parachute Regiment (Para).
Although initial training for the infantry is necessarily arduous and demanding, initial training for Para recruits is widely regarded by the British Army as being the most physically and mentally demanding of all Infantry regiments in the British Armed Forces (Wilkinson, Rayson, & Bilzon, 2008). Their specialist role requires them to operate at a higher intensity than the regular infantry, carrying heavy loads for longer distances, at a faster pace, as well as withstanding the hardships of operating independently in the field for long periods under harsh environmental conditions (Wilkinson et al., 2008). To determine their suitability for this role, at week 20 of the CIC Para recruits are required to undergo a pre-Para selection (PPS) test week, known colloquially as P-Company. P-Company consists of a series of physically demanding team and individual events that involve carrying personal equipment weighing 20 kg or more for distances of up to 32 km over severe terrain with time constraints, a steeplechase assault course, and an aerial confidence course. Two team events require the participants to run with a 60 kg log and an 80 kg stretcher for 2.5 km and 8 km, respectively. Pass rates typically range between approximately 40% and 70%.
Furthermore, the military performance indicators tend to be very physical in nature. However, while a specific level of fitness is required for military service, the various tests are designed to assess recruits' abilities to perform under stressful and arduous conditions. That is, it is not just fitness that determines the quality of a Para recruit but the ability to maintain a high level of performance in stressful and arduous conditions. Success on P-Company entitles a recruit to wear the coveted maroon beret and pass out of training into a Parachute Regiment unit. Conversely, failure results in the recruit being reallocated to a platoon earlier in the training cycle or transferred to another infantry regiment. The recruits have been training for this test week for the preceding 20 weeks. It is hypothesized that fitness will predict performance on P-Company but, more importantly, mental toughness will predict variance in performance on P-Company after controlling for fitness.

Method
Participants. Participants for Study 3 were 134 Para recruits (M age = 19.95, SD = 4.14) who were reported on by 20 different Para recruit instructors (M age = 28.71 years, SD = 2.92) who had served for an average of 10.65 years in the Army (SD = 2.63) and had spent an average of 10.95 months as an instructor (SD = 4.87). The recruits had been under the supervision of their respective instructors for between 7 and 20 weeks (M = 15.31 weeks, SD = 4.06).
Instruments. Mental toughness. The MTMTI was used to measure mental toughness.
Performance. During P-Company, participants can achieve a maximum of 70 points, determined by their performance on each event (i.e., up to 10 points for each of the 7 events; the aerial confidence course is a pass or fail test). Most of the points are awarded objectively based on time to complete or completion of an event and are awarded by P-Company staff who are independent of the recruits' regular training team. Performance scores in the current sample ranged from 10 to 70 (M = 49.95, SD = 15.07).
Fitness. An objective measure of fitness was used to control for individual fitness. During training, recruits are required to complete physical assessments to measure progression in individual fitness. One of these assessments is a two-mile loaded run in less than 18 minutes, carrying a 16-kg pack and rifle. Another assessment is a timed run over a steeplechase assault course consisting of several dry and water obstacles. Each event generates an individual time. Two-mile loaded run times for this cohort ranged from 15 minutes 30 seconds to 22 minutes 47 seconds (M = 18:39, SD = 1:37). The steeplechase times ranged from 18 minutes 30 seconds to 22 minutes 26 seconds (M = 20:19, SD = 1:08). To create an overall indication of fitness, these times were standardized within event and were then combined to create an overall score. We then multiplied the overall score by -1 so that a higher score was indicative of better performance.
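The composite fitness score described above can be sketched as follows. Times are converted to z scores within each event, the two z scores are combined, and the result is multiplied by -1 so that faster recruits receive higher scores. The text says only that the standardized times were "combined"; summing them is an assumption here (averaging would rank recruits identically), and the function names and times are hypothetical.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize a list of times within one event (sample SD)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def fitness_composite(run_seconds, steeplechase_seconds):
    """Sum the within-event z scores and reverse the sign so higher = fitter."""
    z_run = z_scores(run_seconds)
    z_steeple = z_scores(steeplechase_seconds)
    return [-1 * (a + b) for a, b in zip(z_run, z_steeple)]


# Hypothetical times in seconds for three recruits (illustration only).
run_times = [1000, 1100, 1200]          # two-mile loaded run
steeplechase_times = [1200, 1250, 1300]  # steeplechase assault course
composite = fitness_composite(run_times, steeplechase_times)
```

Standardizing within event before combining keeps either event from dominating the composite simply because its times are more spread out in raw seconds.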
Procedure. The fitness tests were conducted during week 18 of training and the MTMTI was administered at the end of week 19 of training. P-Company was conducted at week 20 of training.

Discussion
The purpose of the present series of studies was to develop and validate a measure of mentally tough behavior in a military training environment. Study 1 found good support for the structural validity of the MTMTI, whereas Study 2 found support for its concurrent validity, predictive validity, and test-retest reliability. The predictive validity of the MTMTI was further supported in a specialized infantry sample. Moreover, the predictive validity tests demonstrated that the MTMTI predicted objective performance while controlling for another measure of mental toughness (SMTQ in Study 2) and fitness (in Study 3). Overall, the MTMTI demonstrated good psychometric properties across 4 separate samples and the predictive validity was supported in two separate samples. Consequently, these results provide some further support for Hardy et al.'s (2014) proposal that mental toughness should be assessed via observer rather than self-report ratings.
The current research is an important first step in developing a valid measure of mental toughness in a military context. Having a valid scale that stands up well to both psychometric and predictive testing allows researchers to examine mental toughness from both applied and theoretical perspectives, which will help to further our understanding of mentally tough behavior. For example, the current measure will allow for further exploration of the neuropsychological underpinnings of mentally tough behavior across contexts; namely, whether Hardy et al.'s (2014) counterintuitive finding that mentally tough behavior was related to high levels of punishment sensitivity and low levels of reward sensitivity in cricketers generalizes to other settings (see Gray & McNaughton, 2000, for a review of reward and punishment sensitivity, and Hardy et al., 2014, for a description of how reward and punishment sensitivities might be related to mental toughness). It would seem prudent to examine these results across different contexts.
Based on the findings from Hardy et al. (2014), Bell et al. (2013) developed a successful multimodal intervention that was designed to impact mental toughness in elite level cricketers. Consequently, the MTMTI could potentially be used to evaluate similar mental toughness interventions in a military training environment. The intervention contained three main components: exposure to punishment-conditioned stimuli, coping skills training, and delivery in a transformational manner. While the results of the intervention indicated that it was successful in developing mental toughness, by the authors' own admission no attempt was made to measure the separate effects of the punishment-conditioned stimuli, the transformational delivery, or the efficacy of the coping skills. Thus, no conclusions can be drawn regarding which aspects of the intervention contributed most to the observed change in mental toughness, or indeed, whether these aspects interacted to impact the observed change in mental toughness. Consequently, further research is needed to delineate more precisely the effects that punishment-conditioned stimuli, transformational delivery, and coping skills have on the development of mental toughness.
Although the current measure has been demonstrated to perform well in the standard tests of measurement efficacy, it is noted that the scale is one-dimensional, that is, all the stressors fall under one global aspect. It is suggested that it might be possible to delineate the stressors into clusters. For example, some of the stressors identified in the MTMTI may fall under physical stress (e.g., tiredness) whereas others may fall under threats to ego (e.g., punishments). Further investigation of this would seem warranted. For example, all of the social pressure items (e.g., "he is not getting on with other section members") were deleted at Stage I because of inadequate fit. Indeed, the inclusion of a multidimensional aspect to the measurement of mentally tough behavior will allow for a closer examination of the construct of mental toughness. This would allow for more in-depth questions around mental toughness to be examined, such as, whether some individuals are better able to cope with certain types of stressors than other types of stressors (e.g., social stressors, threats to ego, physical stressors etc.). Furthermore, the role that underlying personality dimensions have in determining individual differences in ability to cope with different types of stressors would also be a worthwhile area of future research. However, to test these and other related questions one would need to develop a multidimensional measure of mentally tough behavior. A further limitation and area worthy of future research is to explore the possibility of whether the current anchors should be more reflective of behaviors rather than a Likert-type scale.
To sum up, the current series of studies has gone some way toward developing and validating a measure of mental toughness in a military training environment that will hopefully stimulate further theoretical and applied research in this area.