Metacognition, metamemory, and mindreading in high-functioning adults with autism spectrum disorder.

Metacognition refers to cognition about cognition and encompasses both knowledge of cognitive processes and the ability to monitor and control one's own cognitions. The current study aimed to establish whether metacognition is impaired in autism spectrum disorder (ASD). According to some theories, the ability to represent one's own mental states (an aspect of metacognition) relies on the same mechanism as the ability to represent others' mental states ("mindreading"). Given numerous studies have shown mindreading is impaired in ASD, there is good reason to predict concurrent impairments in metacognition. Metacognition is most commonly explored in the context of memory, often by assessing people's ability to monitor their memory processes. The current study addressed the question of whether people with ASD have difficulty monitoring the contents of their memory (alongside impaired mindreading). Eighteen intellectually high-functioning adults with ASD and 18 IQ- and age-matched neurotypical adults participated. Metamemory monitoring ability and mindreading ability were assessed by using a feeling-of-knowing task and the "animations" task, respectively. Participants also completed a self-report measure of metacognitive ability. In addition to showing impaired mindreading, participants with ASD made significantly less accurate feeling-of-knowing judgments than neurotypical adults, suggesting that metamemory monitoring (an aspect of metacognition) was impaired. Conversely, participants with ASD self-reported superior metacognitive abilities compared with those reported by neurotypical participants. This study provides evidence that individuals with ASD have metamemory monitoring impairments. The theoretical and practical implications of these findings for our current understanding of metacognition in ASD and typical development are discussed.

Metacognition can be broadly defined as "thinking about thinking". More specifically, it refers to an individual's awareness of cognitions and encompasses "metacognitive knowledge", "metacognitive monitoring", and "metacognitive control". Metacognitive knowledge refers to one's beliefs and factual knowledge about cognitive processes in general (in self and others), whereas metacognitive monitoring and control refer respectively to one's awareness of and ability to regulate one's own current, online mental states and cognitive activity (Flavell, 1979).
One extensively studied component of metacognition is metamemory, which refers to an individual's knowledge of memory processes, and ability to monitor and control their own memory. Nelson and Narens' (1990) influential model of metamemory divides metamemory (monitoring and control) processes into two levels: the "object-level" and the "meta-level".
The object-level consists of first-order memory processes (i.e., memory itself), whilst the meta-level consists of dynamic, second-order representations of the object-level. This model is supported by neuropsychological (e.g., Janowsky, Shimamura, & Squire, 1989; Shimamura & Squire, 1986) and psycho-pharmacological (e.g., Dunlosky et al., 1998) data, which highlight a dissociation between memory and metamemory. According to Nelson and Narens' model, through metamemory monitoring individuals create a meta-representation of the object-level (Nelson & Narens, 1990). Additionally, metamemory control processes use information held at this meta-level to feed back to the object-level, allowing individuals to alter object-level processes and implement different strategies during learning (e.g., by allocating more study time to information that one believes one has not learnt). It is partly for this reason that metamemory is considered essential for adaptive functioning, allowing one to tailor one's behaviour according to one's strengths and weaknesses in object-level memory.
As such, if an individual's metamemory monitoring is inaccurate the strategies they implement during learning are likely to be ineffective.

Metamemory judgments
One of the most commonly used classic paradigms for assessing metamemory monitoring involves asking people to make feeling-of-knowing (FOK; Hart, 1965) judgements. During a typical FOK task, participants are asked (during a study phase) to memorise a series of stimulus pairs (e.g., pairs of words, such as "pen-key", "computer-elephant" etc.). Participants are then presented (during a cued-recall test phase) with one stimulus from each pair (the cue; e.g., "pen"), and asked to recall its missing pair (the target; e.g., "key"). Importantly, on trials in which participants fail to correctly recall the target they are asked to judge the likelihood that, at a later point, they would be able to recognise it.
Finally, participants are then presented with the cue and are asked to select the unrecalled target from several options (a recognition test phase). The accuracy of participants' judgments on metamemory tasks is typically assessed using Gamma correlations (Goodman & Kruskal, 1954), which measure the association between individuals' predictions about their future ability to recognise the correct target with their actual subsequent recognition performance (see the Method section for a detailed description of how Gamma correlations are calculated).

Metacognition as "applied theory of mind"
Theory of mind (ToM) is the ability to attribute mental states, such as beliefs, desires, and intentions, to self and others in order to explain and predict behaviour (Premack & Woodruff, 1978). While most research into ToM focuses on awareness of other minds (henceforth called "mindreading"), research into metacognition focuses on awareness of one's own mind. Indeed, given the potential role of metacognition in self-regulation, Flavell (2000) considered metacognition an example of "applied ToM". Several different perspectives have been proposed to explain the potential relation between mindreading and metacognition. According to one perspective (e.g., Carruthers, 2009; Frith & Happé, 1999), the ability to represent one's own mental states (metacognition) relies on the same underlying metarepresentational mechanism as the ability to understand mental states in others (mindreading). Crucially, according to this one-mechanism theory, no dissociation should exist between mindreading and metacognition ability; individuals who demonstrate mindreading impairments should also demonstrate impaired metacognition.
However, this proposal has been disputed. According to a version of the "simulation theory", our ability to read other minds stems from our ability to directly introspect the contents of our own mind, and then use this information to mentally simulate the contents of another's mind in imagination (e.g., Goldman, 2006). From this perspective, metacognition is both ontogenetically and phylogenetically prior to, and foundational for, mindreading. According to a third theory, proposed by Nichols and Stich (2003), mindreading and metacognition are underpinned by separate mechanisms; the "monitoring mechanism" is responsible for access to/awareness of one's own mental states, whereas a separate "mindreading mechanism" is responsible for processing information about others' mental states. Crucially, both of these latter two theories imply that there should be some people who manifest diminished mindreading abilities, despite undiminished metacognition. Indeed, both Goldman, and Nichols and Stich explicitly suggest that people with autism spectrum disorder (ASD) present precisely this pattern of impaired mindreading, but intact metacognition.

Metacognition in Autism Spectrum Disorder
Autism spectrum disorder (ASD) is a developmental disorder diagnosed on the basis of social-communication deficits, and fixated interests and repetitive behaviours (American Psychiatric Association, 2013). It is widely acknowledged that ASD is characterised by diminished mindreading ability (see Yirmiya, Erel, Shaked, & Solomonica-Levi, 1998).
However, until recently the question of whether metacognition is diminished among people with ASD has remained largely unexplored.
The study of metacognition in ASD could have important implications for educational practice among individuals with ASD. Metacognition in general and, more specifically, metamemory play key roles in aspects of learning and decision-making that we know people with ASD have difficulties with. According to Nelson and Narens' (1990) metamemory model, information gained by monitoring one's own memory feeds back to memory functioning, allowing individuals to control their learning efficiently. As such, having a good awareness of what one has learnt can improve an individual's subsequent learning ability.
For example, when revising for an exam, if an individual can accurately assess what information they already know, they are able to spend their time effectively, revising the topics they do not know. This issue may be particularly relevant for intellectually high-functioning people with ASD, given that many of these individuals show significantly lower academic achievement than would be expected on the basis of their intelligence, which in turn impacts negatively on their life chances (see Estes, Rivera, Bryan, Cali, & Dawson, 2011). Indeed, the educational domains in which people with ASD frequently under-achieve are just those in which learning is known to be fostered by metacognitive training. Such training has been shown to remediate difficulties in reading comprehension (see Brown & Campione, 1996), writing (e.g., Sitko, 1998) and mathematical reasoning (e.g., Fuchs et al., 2003). In each of these domains, individuals with ASD show statistically significant underachievement, relative to IQ (see Estes, et al., 2011; Jones et al., 2009). It is possible that diminished metacognitive monitoring contributes to the lower-than-expected levels of academic achievement in ASD in these areas.
Thus, for several reasons it is important to establish the extent to which individuals with ASD show diminished metacognitive ability. In a seminal paper, Frith and Happé (1999) argued explicitly that individuals with ASD are as impaired at metacognition as they are at mindreading. More recently, Williams (2010) has taken up this idea, citing evidence that individuals with this disorder are as impaired at recognising their own and others' thought processes (Hurlburt, Happé, & Frith, 1994), emotions (Williams & Happé, 2010a) and specific mental states, such as beliefs and intentions (Williams & Happé, 2010b), as they are at recognising these states in others. Evidence from "self" versions of classic mindreading tasks (e.g., Williams & Happé, 2009), in which participants are asked to report their own previously held (now false) belief, also suggests that individuals with ASD demonstrate diminished awareness of their own beliefs. Each of these findings suggests that metacognition is impaired in individuals with ASD, which appears in keeping with the view that mindreading and metacognition rely on the same underlying mechanism. As such, in our view, the evidence from studies of mental state attribution in ASD provides support for the one-mechanism account. However, some have argued that there is a critical limitation with these types of studies that prevents definitive conclusions being drawn about metacognitive ability in ASD (see Carruthers, 2009; Nichols & Stich, 2003). The potential difficulty is that test questions in self versions of classic mindreading tasks require participants to recall their prior mental states, rather than report their current mental states. Simulation and two-mechanisms theories claim that only current mental states are directly accessible without the need for mindreading.
Thus, arguably, the results from the above studies do not necessarily show that metacognition is impaired in ASD, because these tasks require inferences to be drawn about past mental states (but see Williams, 2010, for a counter-argument).
By contrast, it is widely agreed that metamemory monitoring judgements are based on awareness of current mental states. As such, if the accuracy of metamemory monitoring is diminished among people with ASD, this would provide strong support for the suggestion that metacognition is diminished in ASD, contrary to the predictions that follow from the simulation/two-mechanisms theory. In this regard, a seminal study by Farrant, Boucher and Blades (1999) reported no metamemory impairment in ASD. This study was used by Nichols and Stich (2003) to support the suggestion that metamemory is unimpaired in individuals with ASD, and thus to support their two-mechanisms theory. However, an issue with this study is that Farrant et al. assessed metamemory knowledge. The one-mechanism account proposes that metacognitive monitoring/control, rather than metacognitive knowledge, necessarily relies on the same metarepresentational mechanism as mindreading. As such, Farrant et al.'s study cannot be taken as conclusive evidence that all aspects of metamemory are typical in individuals with ASD. At most, it suggests that metamemory knowledge may be intact; the study did not assess metamemory monitoring or control.
In order to unambiguously test whether metacognition is impaired in ASD, evidence is instead required from studies of metacognitive monitoring (or control). Performance on FOK tasks relies on individuals monitoring current internal memory states. Only one study to date has examined metamemory in ASD using a FOK task (Wojcik, Moulin, & Souchay, 2013). Wojcik and colleagues assessed children's metamemory monitoring ability using two FOK tasks, one asking individuals to assess their memory for information stored episodically and one assessing memory for information stored semantically. Wojcik et al. reported that children with ASD were significantly poorer than typically developing children at making accurate FOK judgements, but only when assessing their episodic memory. However, there is a particular methodological difficulty affecting Wojcik et al.'s (2013) study that arguably prevents valid conclusions from being drawn. The difficulty is that the ASD and neurotypical groups were not matched for verbal IQ (VIQ). Matching for VIQ is essential in such studies, because differences between groups in this respect can potentially entirely explain between-group differences in experimental task performance (see Mervis & Klein-Tasman, 2004). Wojcik et al. (2013) recognised this limitation and tried to overcome it using an ANCOVA to "control" for group differences in VIQ. However, ANCOVA does not, in fact, solve this problem (see Miller & Chapman, 2001) and, thus, we cannot determine whether group differences were driven by diagnostic status or by VIQ differences. In the current study, we explored FOK accuracy among ASD and comparison groups that were closely matched for VIQ, as well as for age, PIQ, and FSIQ. If, as we predicted, between-group differences in FOK accuracy were apparent, this would provide the first definitive evidence of a diminution of this ability among individuals with ASD.

The Current Study
The aim of this study was to explore the extent to which individuals with ASD are able to accurately monitor their own memory. To examine this, a classic FOK task was employed. Our main prediction was that participants with ASD would make significantly less accurate FOK judgments than comparison participants. During the FOK task different types of errors can lead to inaccurate FOK judgements; individuals can make over-confident errors (in which individuals incorrectly predict they will recognise a word that they subsequently fail to recognise) and also under-confident errors (in which individuals fail to predict their subsequently successful recognition of a target word). The type of error made by people with ASD during metacognitive monitoring tasks has not been explored previously, but we predicted that individuals with ASD would make more FOK judgement errors overall, but would not be specifically biased towards over-confident or under-confident errors.
Additionally, the Meta-Cognitions Questionnaire (MCQ; Cartwright-Hatton & Wells, 1997) was used as a self-report measure of participants' beliefs about their own metacognitive ability. To our knowledge no study has previously assessed metacognitive ability in individuals with ASD using a self-report questionnaire. It was predicted that individuals in the ASD group would report diminished confidence in, and awareness of, their own thoughts, as reflected by lower scores on the cognitive self-consciousness sub-scale and higher scores on the cognitive confidence sub-scale of the MCQ.
A measure of mindreading ability was also included in the current study. It was important to assess participants' mindreading ability, because according to the one-mechanism theory, metacognitive impairments should only be apparent if mindreading impairments are also present. To assess mindreading ability, we employed a version of the animations task (Abell, Happé, & Frith, 2000). During this task, individuals are asked to view a series of clips in which animated triangles interact with one another. Participants are asked to provide descriptions of/explanations for the patterns of interaction between the triangles in each clip. An adequate explanation of the triangles' interactions requires the attribution of mental states (e.g., intentions, desires). We employed two conditions from the task, namely a mentalising condition and a goal-directed condition. Both of these conditions appear to rely on the mindreading system, although performance on the mentalising condition is thought to rely on mindreading to a greater extent than the goal-directed condition. Based on the findings from previous studies (e.g., Abell, et al., 2000; Lind, Williams, Bowler, & Peel, 2014), we predicted that participants with ASD would show diminished overall performance on the animations task, but not a group (TD/ASD) by condition (mentalising/goal-directed) interaction on the task.

A priori power analysis
Prior to commencing the study, G*Power 3.1 (Faul, Erdfelder, Lang, & Buchner, 2007) was used to conduct a power analysis to determine the sample size required to detect the predicted group differences in gamma correlation on the FOK task. In our view, no valid studies of FOK accuracy have been conducted among individuals with ASD. Thus, for the purpose of this power analysis, we could not predict an effect size for the between-group difference in FOK accuracy based on effect sizes found in previous studies. Therefore, based on our theoretical inclination toward the one-mechanism view, we predicted that metacognitive impairments in ASD should be of a similar magnitude to the magnitude of mindreading impairments in this disorder. As such, our prediction for the effect size associated with between-group difference in FOK accuracy in the current study was based on the effect size found for between-group differences in mindreading ability in studies of ASD.
In a meta-analysis exploring mindreading ability in individuals with ASD compared to neurotypical individuals, Yirmiya and colleagues reported an average Cohen's d of 0.88 (Yirmiya, et al., 1998). Thus, assuming d = 0.88 for between-group differences in metamemory accuracy and α = .05, it was established that a sample size of n = 17 participants per group would achieve Cohen's (1992) recommended power of .80.
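As a rough cross-check on the reasoning above, the per-group sample size for a two-group comparison can be sketched with the standard normal-approximation formula n ≈ 2((z_α + z_β)/d)². This is not the G*Power procedure itself: G*Power works from the noncentral t distribution, so its answer is expected to be close to, but not necessarily identical with, the value this approximation gives. The function name `n_per_group` and the `two_tailed` argument are illustrative assumptions, and the text does not state whether the power analysis assumed a one- or two-tailed test, so both options are exposed.

```python
import math
from statistics import NormalDist


def n_per_group(d, two_tailed, alpha=0.05, power=0.80):
    """Approximate per-group n for comparing two independent means.

    Uses the normal approximation n = 2 * ((z_alpha + z_beta) / d) ** 2,
    which is an assumption of this sketch (G*Power uses the noncentral t).
    """
    tail_alpha = alpha / 2 if two_tailed else alpha
    z_alpha = NormalDist().inv_cdf(1 - tail_alpha)  # critical value for alpha
    z_beta = NormalDist().inv_cdf(power)            # quantile for desired power
    return math.ceil(2 * ((z_alpha + z_beta) / d) ** 2)


if __name__ == "__main__":
    # d = 0.88 is the effect size taken from Yirmiya et al. (1998) in the text.
    for tails in (False, True):
        label = "two-tailed" if tails else "one-tailed"
        print(label, n_per_group(0.88, two_tailed=tails))
```

Because the approximation ignores the small-sample correction built into the t distribution, it tends to return a slightly smaller n than an exact calculation.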

Participants
Ethical approval for this study was obtained from Durham University ethics committee. Eighteen adults with ASD (13 males, 5 females) and 18 neurotypical comparison adults (11 males, 7 females) took part, all of whom gave written, informed consent before participating. One participant with ASD completed the MCQ incorrectly, and so that participant's data for this questionnaire could not be used. Participants in the ASD group had all received formal diagnoses of autistic disorder (n = 4) or Asperger's disorder (n = 14), according to DSM or ICD criteria (American Psychiatric Association, 2000; World Health Organisation, 1993).
In order to assess current ASD features, 15 of the 18 participants in the ASD group completed Autism Diagnostic Observation Schedule-Generic (ADOS; Lord et al., 2000) assessments. The remaining three participants declined to complete the ADOS, as they did not feel comfortable being filmed. The three participants who did not complete the ADOS had rigorous diagnoses and scored above the cut-off on the Autism-spectrum Quotient (see immediately below). All participants who completed the ADOS received a total score ≥7, the defined cut-off for ASD (Lord, et al., 2000). All participants completed the Autism-spectrum Quotient (AQ; Baron-Cohen, Wheelwright, Skinner, Martin, & Clubley, 2001), a self-report questionnaire that assesses ASD/ASD-like features. Fifteen out of 18 participants with ASD scored above the defined cut-off for ASD on the AQ (total score ≥26; Woodbury-Smith, Robinson, & Baron-Cohen, 2005). Only three participants missed this cut-off. However, all three of these participants scored well above the defined ASD cut-off on the ADOS (all ADOS scores among these three participants were ≥ 12). All comparison participants scored below the defined cut-off for ASD.
No participants, in either group, reported using any psychotropic medication or any history of neurological or psychiatric disorders (apart from ASD). The participant groups were closely equated for verbal and non-verbal ability (see Table 1 for participant characteristics). Verbal IQ (VIQ), performance IQ (PIQ), and full-scale IQ (FSIQ) were assessed using the full (four subtest) version of the Wechsler Abbreviated Scale of Intelligence (WASI; Wechsler, 1999). Groups were also closely equated for chronological age.

Materials and Procedures
Feeling-of-knowing task. The stimuli used in the FOK task were 80 word pairs, comprising 160 concrete nouns (80 cue words and 80 target words). Cue words were matched with the target words for syllable length and word frequency (Kucera & Francis, 1967), as reported in the MRC psycholinguistic database (Coltheart, 1981). The adequacy of this matching was confirmed by a non-significant effect of word type (cue/target) in a multivariate ANOVA (using Wilks' Lambda criterion) that included syllable length and word frequency as the dependent variables, F(2, 157) = 0.68, p = .93.
The procedure for the FOK task consisted of a study phase, a cued-recall test phase (during which FOK judgements were also made; see below), and a recognition test phase (see Figure 1 for a graphical representation of one trial of the task). The task was run on an LG desktop computer and lasted approximately 25 minutes. Before completing the task participants completed a practice version of the entire procedure, consisting of five word pairs. As such, individuals knew before the study phase that their memory for the word pairs would be tested, both by a cued-recall test and a recognition test.
Study phase. During the study phase, participants were presented with individual word pairs (e.g., "bear-bridge"), each consisting of a cue word ("bear") and a target word ("bridge"). Each word pair was presented individually for four seconds. After the study phase, there was a five-minute break, during which participants filled in the MCQ (see subsection below). Immediately after this break, participants completed the cued-recall test phase.

Cued-recall and FOK phase.
During the cued-recall phase, participants were shown individually presented cue words, in a random order, and were asked to recall the missing target word associated with each cue. Immediately after each recall attempt (i.e., on a trial-by-trial basis), participants were asked to make a FOK judgement as to whether they thought they would be able to recognise the missing target word at a later point (either "Yes" or "No"). As such, participants made FOK judgements for all cue words, regardless of whether their recall of the target word had been accurate or not. However, in our statistical analyses of FOK accuracy, we only included judgements made on trials in which participants failed to recall the target. This procedure is common to studies of FOK ability among typically and atypically developing populations. The procedure is designed to test participants' ability to judge the likelihood that they will be able to recognise information they have failed to recall.
Recognition phase. Immediately after the cued-recall phase, participants completed the recognition test phase. During the recognition test, participants were individually presented with all 80 cue words, in a random order, and were asked to identify the correct target word in a four-alternative, forced-choice recognition test. On each trial, participants were asked to click (using the computer's mouse) the word they thought had been previously paired with the cue, from a selection of four options; the correct target word, an incorrect target word (that had previously been paired with a different cue word), and two novel distractor words not previously used in the task. Importantly, for a given cue word, all participants were shown the same four options to choose from. Once participants had clicked on a response the next trial began. During the recognition test phase a target word only appeared as an option twice; once on a trial in which it was the correct target word and once on a trial as an incorrect target word. The same target word (appearing either as the correct or incorrect option) never appeared on two consecutive trials.
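The way trial-level data from the cued-recall and recognition phases feed into the FOK analysis can be illustrated with a short sketch. It assumes each trial is recorded as three booleans (target recalled, FOK judgement of "Yes", target later recognised); this representation and the function name `tally_fok_cells` are hypothetical and not taken from the study. Only trials on which recall failed are counted, and the four cells follow the labelling used in the Scoring section: (a) correct "Yes", (b) incorrect "Yes" (an over-confident error), (c) incorrect "No" (an under-confident error), and (d) correct "No".

```python
def tally_fok_cells(trials):
    """Count FOK prediction outcomes on trials where cued recall failed.

    trials: iterable of (recalled, fok_yes, recognised) boolean triples.
    Returns a dict with keys "a", "b", "c", "d".
    """
    cells = {"a": 0, "b": 0, "c": 0, "d": 0}
    for recalled, fok_yes, recognised in trials:
        if recalled:
            continue  # FOK accuracy is analysed only for unrecalled targets
        if fok_yes and recognised:
            cells["a"] += 1  # correct "Yes" prediction
        elif fok_yes and not recognised:
            cells["b"] += 1  # incorrect "Yes": over-confident error
        elif not fok_yes and recognised:
            cells["c"] += 1  # incorrect "No": under-confident error
        else:
            cells["d"] += 1  # correct "No" prediction
    return cells


if __name__ == "__main__":
    example_trials = [
        (True, True, True),     # recalled, so excluded from the tally
        (False, True, True),
        (False, True, False),
        (False, False, True),
        (False, False, False),
    ]
    print(tally_fok_cells(example_trials))
```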

Meta-Cognitions Questionnaire. The Meta-Cognitions Questionnaire (MCQ; Cartwright-Hatton & Wells, 1997) was used to assess participants' beliefs about their own thoughts, and the efficacy of different thought processes. The MCQ presents participants with individual statements (e.g., "I have little confidence in my memory for words and names"), and participants were asked to decide the extent to which they agreed with each statement, responding on a 4-point Likert scale (do not agree, agree slightly, agree moderately, agree very much). The questionnaire consists of 65 items comprising five subscales. We were interested in two of these subscales specifically. The Cognitive confidence and Cognitive self-consciousness subscales each address participants' awareness of their own thought processes and their confidence in their own cognitions, which are of particular relevance to this study. In contrast, the remaining subscales addressed issues about worrying and the effects intrusive negative thoughts may have on one's functioning, which seemed less related to the aims of the study.
Animations task. During the animations task, participants were required to provide a verbal description of eight silent video clips, each of which displayed an interaction between a large red triangle and a small blue triangle. These clips were taken directly from Abell et al. (2000). In four of the clips, an adequate explanation of the triangles' interaction required the attribution of propositional attitudes, such as beliefs, intentions, and/or desires. As in Abell et al.'s study, these four clips comprised a "mentalising" condition (assessing higher-level mindreading). In the remaining four clips, an adequate explanation of the triangles' interaction required the attribution of goal states, such as copying or following (lower-level mindreading), but not necessarily propositional attitudes. As in Abell et al. (2000), these four clips comprised a "goal-directed" condition.
Each clip was presented to participants on an LG desktop computer and the order in which the experimental clips were presented was counterbalanced across participants. Before undertaking the experimental trials, participants also completed two practice trials, to familiarise themselves with the task (one goal-directed and one mentalising). During practice trials participants were asked to describe the behaviour displayed by the triangles in each of the video clips, and the experimenter gave feedback after each description. During the experimental trials, participants were asked to watch the clip and provide a running commentary, describing how the triangles interacted. During experimental trials a digital solid state audio recorder was used to record participants' descriptions, which were later transcribed. No feedback was given on experimental trials.

Scoring
Feeling-of-knowing task. Two measures of participants' basic object-level memory performance were calculated on the FOK task. Recall ability was calculated as the proportion of target words participants correctly recalled during the cued-recall stage. Similarly, recognition ability was calculated as the proportion of target words participants correctly recognised during the recognition test phase of the task. Gamma scores (Goodman & Kruskal, 1954) were calculated to provide an index of overall FOK judgement accuracy. This analysis is recommended by Nelson (1984) and is commonly used to analyse FOK tasks (e.g., Kelemen, Frost, & Weaver, 2000; Nelson & Narens, 1990; Nelson, Narens, & Dunlosky, 2004; Wojcik, et al., 2013). Gamma scores are a non-parametric measure of association (between predictions and actual performance) and were calculated by comparing the number of correct predictions that each individual made with the number of incorrect predictions they made. To calculate gamma scores the formula $G = (ad - bc)/(ad + bc)$ was used, with (a) representing the number of correct "Yes" predictions an individual made, (b) the number of incorrect "Yes" predictions, (c) the number of incorrect "No" predictions, and (d) the number of correct "No" predictions. Gamma scores range from +1 to -1, where a score of 0 indicates chance-level accuracy, a large positive value indicates a good degree of accuracy, and a large negative value indicates less than chance-level performance on the task. However, the score cannot be calculated when two or more of the prediction rates (a, b, c, or d) are equal to 0. As such, the raw data were adjusted by adding 0.5 to each prediction frequency and dividing by the overall number of FOK judgements made (N) plus 1 (N+1). This correction is recommended by Snodgrass and Corwin (1988) and is routinely used when calculating gamma scores on metamemory tasks (e.g., Bastin et al., 2012; Wojcik, et al., 2013).
Animations task. Voice recordings of participants' commentaries were transcribed verbatim by an independent transcriber who was naïve to participants' diagnoses and to the hypotheses of the study. These transcriptions were then scored by the first author and a second, independent rater (who was blind to the hypotheses of the study and the diagnostic status of the participants) on the basis of scoring criteria outlined in Abell et al. (2000).
Participants' descriptions of each animation were given a score of 0, 1, or 2 according to their level of accuracy, defined as the extent to which the participant's description captured the intended meaning of the animation. As such, the total score achievable in each condition (mentalising/goal-directed) was eight. Inter-rater reliability for scores across the eight animations was almost perfect, Cronbach's α = .98.
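The gamma calculation described for the feeling-of-knowing task can be sketched as follows. This is a minimal illustration rather than the authors' analysis script: it assumes the standard 2 × 2 form of the Goodman-Kruskal gamma, G = (ad − bc)/(ad + bc), and applies the correction described in the text by adding 0.5 to each cell frequency and dividing by N + 1. The function name `corrected_gamma` is hypothetical.

```python
def corrected_gamma(a, b, c, d):
    """Goodman-Kruskal gamma for a 2 x 2 FOK table with the 0.5 correction.

    a: correct "Yes" predictions     b: incorrect "Yes" predictions
    c: incorrect "No" predictions    d: correct "No" predictions
    """
    n = a + b + c + d  # total number of FOK judgements analysed
    # Add 0.5 to each frequency and divide by N + 1, as described in the text,
    # so the score stays defined when some cells are empty.
    a_adj, b_adj, c_adj, d_adj = ((x + 0.5) / (n + 1) for x in (a, b, c, d))
    concordant = a_adj * d_adj  # pairs where prediction and outcome agree
    discordant = b_adj * c_adj  # pairs where prediction and outcome disagree
    return (concordant - discordant) / (concordant + discordant)


if __name__ == "__main__":
    # Made-up frequencies for illustration only (not data from the study).
    print(round(corrected_gamma(10, 5, 5, 10), 3))
    print(corrected_gamma(8, 0, 0, 8))  # defined even with two empty cells
```

Dividing every cell by the same constant (N + 1) does not change the value of gamma; it is the added 0.5 that prevents a zero denominator.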

Statistical Analyses
A standard alpha level of .05 was used to determine statistical significance. All reported significance values are for two-tailed tests. Where ANOVAs were used, we report partial eta squared ($\eta_p^2$) values as measures of effect size (≥ .01 = small effect, ≥ .06 = moderate effect, ≥ .14 = large effect; Cohen, 1969). Where t-tests were used, we report Cohen's d values as measures of effect size (≥ 0.20 = small effect, ≥ 0.50 = moderate effect, ≥ 0.80 = large effect; Cohen, 1969).

Feeling-of-knowing task
Memory (object-level) performance. Group differences in object-level memory performance were examined using independent-samples t-tests (see Table 2 for descriptive and inferential statistics). These indicated that individuals in the ASD group recalled significantly fewer target words than comparison participants in the FOK task. However, no significant group difference was found in the proportion of target words correctly recognised in the FOK task.
Metamemory performance. Group differences in metamemory monitoring accuracy were examined (see Table 2 for descriptive and inferential statistics). An independent-samples t-test indicated that there was a significant difference in gamma scores between the ASD and neurotypical groups. Thus, in accordance with our predictions, participants with ASD were significantly poorer at predicting their own memory performance than were neurotypical participants. Nonetheless, one-sample t-tests indicated that gamma scores were significantly above chance (i.e., significantly greater than 0) in both the ASD and neurotypical groups, all ts > 2.97, all ps < .009.
An additional analysis was also carried out to investigate whether the significant group difference in object-level recall of target words confounded performance at the meta-level of the task (i.e., FOK judgements). For the purpose of this analysis, two participants from each group were excluded to create ASD and neurotypical groups that were matched
Group differences in the specific type of errors participants made on the FOK task were also examined. Independent-samples t-tests indicated that participants in the ASD group made significantly more under-confident FOK errors than participants in the neurotypical group (see Table 2 for statistics). There was no significant group difference in the number of over-confident FOK errors made (see Table 2 for statistics).
Table 2 shows the means and standard deviations for the two key MCQ subscale scores in the ASD and neurotypical groups. A significant between-group difference was found in scores on the Cognitive self-consciousness subscale, indicating that participants in the ASD group believed they were superior at monitoring their own thoughts, and more aware of their own thought processes, compared to comparison adults. There was no significant between-group difference in scores on the Cognitive confidence subscale.

Animations task
Table 2 shows the means and standard deviations for performance on the animations task. A mixed-model ANOVA was carried out on these data with Group (neurotypical/ASD) entered as the between-subjects variable, and Animation Type (mentalising/goal-directed) entered as the within-subjects variable. There was a significant main effect of Group on animations scores, reflecting the fact that participants with ASD performed significantly less well than comparison participants on the task overall, F(1, 34) = 9.02, p = .005, partial η² = .21. There was also a significant main effect of Animation Type, indicating that, across both groups, scores were higher in the goal-directed condition than the mentalising condition, F(1, 34) = 72.82, p < .001, partial η² = .68. There was no significant Group by Animation Type interaction, F(1, 34) = 0.29, p = .59, partial η² = .01, suggesting that individuals in the ASD group were impaired at both higher- and lower-level mindreading, compared to individuals in the neurotypical group.
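The effect sizes reported alongside these F statistics can be checked against the standard identity relating partial eta squared to F and its degrees of freedom. The sketch below assumes the reported effect sizes are partial eta squared; the function name is hypothetical, and the F values and degrees of freedom are those reported for the animations task.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F values reported for the animations task, each with df = (1, 34).
reported = [
    ("Group", 9.02),
    ("Animation Type", 72.82),
    ("Group x Animation Type", 0.29),
]
for effect, f_value in reported:
    print(effect, round(partial_eta_squared(f_value, 1, 34), 2))
```

The rounded results (.21, .68, and .01) agree with the effect sizes reported for the two main effects and the interaction.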

Exploratory correlation analyses: Associations between metamemory ability, and mindreading ability and self-reported metacognitive skill
A series of correlational analyses was carried out to explore the relation between performance in each condition of the animations (mindreading) task and performance on the FOK (metacognition) task. It should be noted that, although the current study was sufficiently powered to detect predicted group differences in FOK accuracy, it was not sufficiently powered to detect moderately sized correlations (r = .30) between FOK accuracy and mindreading ability (see Discussion for further information regarding study power). The following correlation analyses should, thus, be considered exploratory. In summary, neither FOK accuracy (gamma score), nor the number of under-confident FOK errors made, nor the number of over-confident FOK errors made was associated significantly with performance in the mentalising condition of the animations task, or performance in the goal-directed condition of the animations task, among ASD or comparison participants, all rs ≤ -.32, all ps ≥ .201. Additionally, neither FOK accuracy (gamma score), nor the number of under-confident FOK errors made, nor the number of over-confident FOK errors made was associated significantly with scores on either of the MCQ subscales, among ASD or comparison participants, all rs ≤ -.43, all ps ≥ .077.

Discussion
Until now, no study has established the extent to which individuals with ASD are able to accurately monitor their own memory by judging feelings-of-knowing. The primary aim of this study was to establish this. The central experimental finding was that participants with ASD showed significantly diminished FOK accuracy.
This diminution was associated with a large effect size (d = 0.97), indicating a substantial difficulty with metamemory monitoring.
This result is in keeping with our predictions that individuals with ASD would show impairments in metamemory monitoring. However, there are several potential explanations for the observation of diminished gamma scores in the ASD group. One possibility is that individuals with ASD demonstrated a "positive illusory bias" during the task. The concept of a positive illusory bias refers to a tendency for an individual to self-assess their perceived competence as greater than their actual ability. This bias has been observed among individuals with attention deficit hyperactivity disorder (see Owens, Goldfine, Evangelista, Hoza, & Kaiser, 2007). More importantly, some studies have indicated that individuals with ASD tend to self-report their own social functioning more positively than parents will report (e.g., Lerner, Calhoun, Mikami, & De Los Reyes, 2012), and will self-report the level of their own autistic traits as less severe than parents will report (e.g., Johnson, Filliter, & Murphy, 2009). These studies have been interpreted as suggesting that individuals with ASD may also show a tendency to manifest a positive illusory bias. Demonstrating a positive illusory bias may indeed partly explain our finding that participants with ASD self-reported (on the MCQ) greater awareness of their own mental states than neurotypical comparison participants reported. This self-reported superior awareness among participants with ASD stood in direct contrast to their diminished performance on an objective, well-established measure of metamemory monitoring ability. As such, the idea that some individuals with ASD manifest a positive illusory bias provides a plausible explanation for the MCQ findings.
However, it is not apparent that a positive illusory bias can explain our central finding of diminished FOK accuracy among participants with ASD. Individuals who manifest a positive illusory bias would, by definition, overestimate their memory ability and would, thus, be expected to make more over-confident errors when making FOK judgements. In other words, diminished FOK accuracy among people whose judgements were driven by a positive illusory bias would be driven by over-confidence. Yet, participants with ASD did not specifically make significantly more over-confident errors than comparison participants.
Rather, individuals with ASD made significantly more under-confident errors than comparison participants. As such, it appears that demonstrating a positive illusory bias cannot explain the specific pattern of results shown in our study.
The finding that participants with ASD made significantly more errors of the under-confident type (i.e., they tended to recognise targets that they judged they would not recognise), but not the over-confident type, was contrary to our prediction that between-group differences in monitoring accuracy would be driven by an increase of both types of error among participants with ASD. This suggests that diminished performance on the FOK task among participants with ASD was driven by a relative lack of awareness of existing knowledge, rather than a belief in the possession of knowledge that does not, in fact, exist.
Training metacognitive skills has been shown to remediate difficulties in reading, writing and mathematical reasoning (see Brown & Campione, 1996; Fuchs et al., 2003; Sitko, 1998) in typical development. The results of the current study make it plausible to suggest that diminished metacognitive monitoring ability contributes to educational underachievement in these areas among people with ASD. If this turns out to be correct, it could have revolutionary effects on educational practices for people with ASD. We believe it is important for future research to build upon the current results by exploring the extent to which metacognitive impairments affect educational outcomes among individuals with ASD.
As well as having important educational implications, our central finding of reduced FOK accuracy in ASD also has theoretical implications. The central findings of diminished FOK accuracy alongside diminished mindreading ability are in keeping with the predictions of the one-mechanism theory of the relation between metacognition and mindreading. Of course, the results do not definitively prove the theory, but they are not in keeping with a key prediction made by both the simulation theory and the two-mechanisms theory, namely that metacognition is unimpaired in ASD. As such, the main results of this study provide some support for the one-mechanism account. Having said this, we did not find a significant positive association between FOK accuracy and performance in either the mentalising or goal-directed conditions of the animations task. The one-mechanism account would have predicted such associations between metamemory and mindreading, so the current results did not support the theory in this respect. However, caution should be taken when interpreting the results of the correlation analyses. The exploration of associations between FOK task performance and animations (mindreading) task performance was carried out as exploratory analysis, and no a priori power analysis was conducted to establish that the study had adequate power for this secondary aim. A subsequent power analysis (after completion of the study) was conducted with a view to determining what sample size would have been necessary to detect meaningful, statistically significant associations between metacognitive monitoring ability and mindreading ability. Assuming a moderate association (r = 0.30) and α = .05, a total sample size of n = 67 participants would be needed to achieve Cohen's (1992) recommended power of .80 for the correlational analyses. Thus, our study was underpowered to detect a meaningful association between these two abilities. This represents a limitation of our study. Future studies using larger sample sizes are warranted to further investigate relations between metacognitive monitoring and mindreading ability.
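To give a rough sense of how the required sample size for a correlational analysis depends on the assumed effect size, alpha, and power, the sketch below uses the Fisher z approximation. This is an assumption on our part: the text does not state which method or software was used for the power analysis, and approximations of this kind can differ by a few participants from exact methods, so the sketch is not expected to reproduce the reported figure exactly. The function name and the `two_tailed` parameter are hypothetical.

```python
import math
from statistics import NormalDist


def n_for_correlation(r, alpha=0.05, power=0.80, two_tailed=True):
    """Approximate total N needed to detect a correlation of size r.

    Uses the Fisher z approximation: N = ((z_alpha + z_beta) / atanh(r))^2 + 3.
    """
    standard_normal = NormalDist()
    tail_alpha = alpha / 2 if two_tailed else alpha
    z_alpha = standard_normal.inv_cdf(1 - tail_alpha)
    z_beta = standard_normal.inv_cdf(power)
    fisher_z = math.atanh(r)
    return math.ceil(((z_alpha + z_beta) / fisher_z) ** 2 + 3)


# Moderate association (r = .30), alpha = .05, power = .80.
print(n_for_correlation(0.30, two_tailed=True))
print(n_for_correlation(0.30, two_tailed=False))
```

Under either setting, the approximate requirement is several times the 18 participants per group tested here, which is consistent with the conclusion that the correlational analyses were underpowered.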
What is clear is that the current study was sufficiently powered to detect predicted group differences in FOK accuracy and that results indicated participants with ASD showed a substantial diminution of metamemory monitoring. Of course, there are other forms of judgement that can be used to assess metamemory, namely judgements of learning and judgements of confidence. It remains possible that people with ASD will show undiminished accuracy in these judgements. Judgements of learning involve assessing how well one thinks one has learnt a piece of information, and judgements of confidence involve making retrospective judgements about how certain one is in one's knowledge about a piece of information. The literature on typical development suggests that metamemory accuracy is only modestly correlated across different types of metamemory judgement (Kelemen et al., 2000; Leonesio & Nelson, 1990). This has led to suggestions that different metamemory judgements may be based on different sources of information. Metamemory judgements are thought to be based on mnemonic cues and it is possible that different judgements are based on different cues (see Koriat, 1993; Metcalfe, Schwartz, & Joaquim, 1993). Although we predict that individuals with ASD will demonstrate impairments across different metamemory judgements, this may not turn out to be the case. So far there have been only two published studies of judgement-of-confidence (JOC) accuracy in ASD (Wilkinson, Best, Minshew, & Strauss, 2010; Wojcik, Allen, Brown, & Souchay, 2011). Results from these studies have been inconsistent: whereas Wilkinson et al. (2010) report that confidence judgements made by children with ASD were less accurate than those made by typically developing children, Wojcik and colleagues report no impairments in JOC accuracy in children with ASD (Wojcik et al., 2011).
Thus, the study of metacognitive monitoring in ASD is in its infancy and, in our view, a sustained study of metamemory and its neurocognitive basis in ASD would be fruitful.
Future research should address these issues, and should also aim to address whether it is possible to foster metacognitive skills in individuals who do show impairments. In our view, a comprehensive investigation of metacognition in ASD is essential, given the consequences that impaired metacognitive monitoring and regulation may have on an individual's cognitive performance. It is hoped that alongside future research the findings from this study will help to establish a more definitive account of metacognitive ability in ASD, and that a greater understanding of this area will eventually contribute to successful remediation of cognitive and behavioural impairments in this disorder.