Strategy variability in numerosity comparison task: a study in young and older adults

Abstract We investigated strategies used by young and older adults in dot comparison tasks to further our understanding of mechanisms underlying numerosity discrimination and age-related differences therein. Participants were shown a series of pairs of dot collections and asked to select the larger collection in each pair. Analyses of verbal protocols collected on each trial, solution times, and percentages of errors documented the strategy repertoire and strategy distribution in young and older adults. Based on visual features of dot collections, both young and older adults used a set of 9 strategies and selected strategies on a trial-by-trial basis. The findings also documented age-related differences (i.e., strategy preferences) and similarities (e.g., number of strategies used by individuals) in strategies and performance. The strategy variability found here has important implications for understanding numerosity comparison and contrasts with previous findings suggesting that participants use a single strategy when they compare dot collections.

Two types of stimulus characteristics crucially influence numerosity comparison performance: numerical and visual features of dot collections. Regarding numerical properties of dot collections, previous studies found that participants are faster and more accurate when they compare smaller relative to larger collections of dots (e.g., Revkin et al., 2008) and when the two collections differ by a larger ratio than when they differ by a smaller ratio (e.g., Ansari et al., 2005; Barth, La Mont, Lipton, & Spelke, 2005; Pica, Lemer, Izard, & Dehaene, 2004). Thus, participants are faster when they compare small collections of 1 and 4 dots than large collections of 10 and 40 dots. They are also faster with larger-ratio collections (e.g., comparing collections of 24 and 12 dots) than with smaller-ratio collections (e.g., comparing collections of 24 and 18 dots). Such findings have been explained as resulting from the functioning characteristics of an "approximate number system" (ANS) or "number sense" (Dehaene, 1997, p. 5). The ANS is a universal system present in animals, children, and adults that allows comparison, addition, and subtraction of quantities without counting them. Previous studies have shown that ANS performance depends on the ratio between the quantities to be compared. Indeed, participants are faster comparing smaller than larger collections and larger-ratio (e.g., 8 vs. 16 dots) than smaller-ratio collections (e.g., 8 vs. 10 dots; Emmerton, 1998; Hauser et al., 2003; Pica et al., 2004).
In addition to numerical and visual features of dot collections, participants' performance in dot comparison is also influenced by task or situation characteristics, such as the method of presentation (e.g., sequential vs. simultaneous; Price, Palmer, Battista, & Ansari, 2012), set size (e.g., Barth, Kanwisher, & Spelke, 2002; De Smedt et al., 2013; Revkin et al., 2008), or the display time. For example, participants are faster when they compare two collections of dots displayed separately on a computer screen than when the two collections of dots are displayed in different colors but intermixed (e.g., Price, Palmer, Battista, & Ansari, 2012). They are also more accurate when collections of dots are displayed for longer than for shorter durations (e.g., Gilmore et al., 2016).
Finally, performance in numerosity comparison has been found to either change with participants' age during adulthood (e.g., Halberda et al., 2012;Li et al., 2010;Cappelletti et al., 2014;Norris, McGeown, Guerrini, & Castronovo, 2015;Trick, Enns, & Brodeur, 1996) or remain age-invariant (e.g., Gandini, Lemaire, & Dufau, 2008;Gandini, Lemaire, & Michel, 2009;Lemaire & Lecacheur, 2007;Watson, Maylor, & Bruce, 2005;Watson, Maylor, & Manson, 2002). For example, Lemaire and Lecacheur (2007) asked young and older adults to estimate numerosity of dot collections that included 40-460 dots. Although the results showed similar accuracy in young and older adults, older adults made more numerous but shorter eye fixations than young adults. In a large-scale study, Halberda et al. (2012) asked more than 10,000 participants ranging from 11 to 85 years of age to accomplish a dot comparison task. They found that the ability to discriminate two numerosities decreases with age after age 30. Also, Cappelletti et al. (2014) asked young and older adults to compare collections of dots that included between 5 and 16 dots. Collections were congruent (i.e., larger collections had a larger cumulative area, and smaller collections were displayed with a smaller cumulative area) or incongruent (i.e., larger collections had a smaller cumulative area, and smaller collections were displayed with a larger cumulative area). The researchers found that older adults' performance was poorer than young adults' in incongruent items but both groups performed equally well in congruent items. This result indicates that older adults were specifically impaired on incongruent items where it is necessary to inhibit the irrelevant information (i.e., cumulative area) to focus on relevant information (i.e., numerosity). Cappelletti et al. (2014) proposed that age-related declines on incongruent items result from declined efficiency of inhibition mechanisms (see also Norris et al., 2015). 
These findings suggest that one potential source of age-related differences in numerosity comparison performance may be age-related changes in general cognitive mechanisms (such as executive control).
In summary, previous studies showed that to select the more numerous of two dot collections, participants base their smaller/larger responses on the number of dots in each visual collection as well as on visual characteristics of stimuli. Participants use visual features of dot collections because one or several visual characteristics correlate with numerosity in each collection of dots. For example, larger collections of dots occupy larger dot area, and have a smaller dot density, a larger distance between dots, a larger convex hull, etc. Previous studies, however, suggest that convex hull may be the most crucial visual feature used by participants to make their smaller/larger responses (e.g., Gilmore et al., 2016). What still remains unknown is whether participants use different visual features for different items and whether they use several features on a given item. To discover this, it is important to assess how participants accomplish dot comparison tasks on a trial-by-trial basis. This is what we did in the present study to test the hypothesis that people use several strategies in dot comparison tasks, a hypothesis that is based on previous findings on strategic variations.

Previous findings on strategic variations in dot comparison tasks.
According to the strategy variability hypothesis tested here, participants may use several strategies to accomplish dot comparison tasks. A strategy is generally defined as a "procedure or a set of procedures to achieve a higher-level goal or task" (Lemaire & Reder, 1999, p. 365). In the present context of dot comparison tasks, strategies were distinguished on the basis of the visual features (e.g., distance between dots, dot size, total surface occupied by dots) that participants used to choose the more numerous of two dot collections. Lemaire and Siegler (1995) distinguished the following four strategy dimensions: strategy repertoire (i.e., which strategies are used?), strategy distribution (i.e., how often is each available strategy used?), strategy execution (i.e., how quickly and accurately is each strategy applied?), and strategy selection (i.e., how do people choose among available strategies on each item?). The present study aimed at testing the usefulness of this framework to study strategies in dot comparison tasks and age-related differences therein. Previous research on cognitive aging found significant differences between young and older adults in each of these strategy dimensions (see Lemaire, 2016, for an overview). Indeed, young and older adults tend to use different types (and numbers) of strategies, to use available strategies with different frequencies, and to have different strategy preferences. Moreover, older adults tend to execute strategies less efficiently (i.e., they tend to be slower and less accurate) and select strategies more poorly (e.g., they tend to choose the best strategy for each item less often than young adults). This has been found in a wide variety of cognitive domains, including pattern recognition, attention, memory, problem solving, decision making, reasoning, language, and arithmetic.
In the specific domain of numerosity estimation, previous data suggest that participants use different strategies and that young and older adults may differ in these strategies. For example, Gandini, Lemaire, and Dufau (2008) asked young and older adults to accomplish numerosity estimation tasks. They collected verbal reports and eye movements to investigate how participants provide a quick and rough estimate of the number of dots in collections. Results showed that both young and older adults used the same set of six different strategies, but varied in how well they executed each strategy (resulting in poorer performance in older adults), how often they used each strategy, and how the type of collections influenced their strategy use. Whether such strategy variability and systematic age-related differences in strategic variations can be observed in numerosity comparison tasks is still unknown. In the present study, we collected verbal protocols on each trial in young and older adults to determine whether participants use several strategies, and whether young and older adults differ in dot comparison strategies.

The present study
The present study aimed to further our understanding of how participants accomplish numerosity comparison tasks. Young and older adults performed a dot comparison task in which they had to determine as quickly and accurately as possible which collection has more dots. After each item, we collected verbal protocols and asked participants to indicate on the basis of which visual feature they selected the more numerous of the two dot collections. On each trial, participants saw two dot collections that varied in numerosity as well as visual features. Thus, one collection included 24 dots and the other included 12-48 dots. Some items were congruent (e.g., the more numerous dot collection was displayed with a larger convex hull) whereas other items were incongruent (e.g., the larger collection was displayed with a smaller convex hull).
Before investigating strategic variations, we determined whether collecting trial-by-trial verbal reports changed participants' approach to dot comparison tasks. We compared our participants' performance with the performance of participants in Gebuis and Reynvoet's (2012) study, as these researchers conducted the same experiment in young adults with the same items without collecting verbal protocols. Consistent with previous findings, we expected to observe congruency effects in both age groups (i.e., poorer performance on incongruent items relative to congruent items). We also expected to observe larger congruency effects in older adults than in young adults.
Next, we examined strategies used by young and older participants to accomplish dot comparison tasks, investigating which strategies they used, how many strategies individuals used, how often they used each strategy, and for which items they used each available strategy. Moreover, we analyzed age-related differences in these strategic variations. The objective was to determine whether older adults use fewer strategies than young adults, whether young and older adults exhibit different strategy preferences, and whether strategy selection is similarly influenced by item congruency in young and older adults.
Older adults were recruited from senior community centers and young adults were students at Aix-Marseille University (Marseille, France). Older adults completed the Mini-Mental State Examination (MMSE; Folstein et al., 1975), a clinical test that provides a global measure of cognitive impairment in older adults. No older adults were excluded because they all had scores larger than the usual cut-off score of 27 (mean: 28.0). All participants had normal or corrected-to-normal vision, and none of them had prior experience with the task or the experimental apparatus.
Stimuli. Participants completed a dot comparison task. Each trial in this task included two collections of white dots. These two collections were simultaneously displayed on a black background, side-by-side on a 15″ computer screen. Participants were asked to select which collection was more numerous using left and right keys marked on the keyboard. There were eight practice trials followed by a total of 80 experimental trials divided into four blocks. One dot collection always included 24 dots and the other 12, 16, 18, 20, 22, 26, 29, 32, 36, or 48 dots. This resulted in five ratio conditions (i.e., ratios of 0.5, 0.6, 0.7, 0.8, and 0.9 for numbers smaller or larger than 24). The dot collections were constructed following the method of Gebuis and Reynvoet (2011), using a freely available Matlab script provided online (http://titiagebuis.eu/Materials.html). This script controlled for cumulative surface area (i.e., area extended, item size, total surface, density) and convex hull, and generated four types of items (see Figure 1). The first type (fully congruent) included pairs of collections for which the more numerous one had a larger cumulative surface area and a larger convex hull. The second type (fully incongruent) included pairs of collections for which the more numerous one had a smaller cumulative surface area and a smaller convex hull. The third type (partially incongruent: cumulative surface area incongruent, convex hull congruent) included pairs of collections for which the more numerous one had a smaller cumulative surface area and a larger convex hull. The fourth type (partially congruent: cumulative surface area congruent, convex hull incongruent) included pairs of collections for which the more numerous one had a larger cumulative surface area and a smaller convex hull.
Procedure and Design. The presentation of stimuli was controlled by the E-Prime software.
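The numerical design of the stimuli can be sketched in a few lines of Python. This is only an illustration, not the Matlab script used to build the items: the function names are hypothetical, and the sketch assumes that the five ratio labels are obtained by truncating the smaller/larger ratio to one decimal (e.g., 16 vs. 24 dots gives 0.66, labeled 0.6), which reproduces the labels listed above but is not stated in the text.

```python
# Illustrative sketch of the numerical design (not the authors' Matlab script).
# Assumption: ratio labels are smaller/larger truncated to one decimal.
REFERENCE = 24
COMPARISONS = [12, 16, 18, 20, 22, 26, 29, 32, 36, 48]


def ratio_label(n, reference=REFERENCE):
    """Return smaller/larger truncated to one decimal (e.g., 16 vs. 24 -> 0.6)."""
    small, large = sorted((n, reference))
    return (small * 10 // large) / 10


def item_type(surface_congruent, hull_congruent):
    """Name the item type from the two controlled visual dimensions."""
    if surface_congruent and hull_congruent:
        return "fully congruent"
    if not surface_congruent and not hull_congruent:
        return "fully incongruent"
    if hull_congruent:
        return "partially incongruent"  # surface incongruent, hull congruent
    return "partially congruent"  # surface congruent, hull incongruent


if __name__ == "__main__":
    for n in COMPARISONS:
        print(f"{n} vs. {REFERENCE}: ratio {ratio_label(n)}")
```

Under this assumption, each of the five ratio labels is produced by exactly two comparison numerosities, one smaller and one larger than 24.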
Each trial began with a 500-ms blank screen, followed by a warning signal ("*") displayed for 400 ms in the center of the screen. Then, each item was displayed for 1500 ms (see Figure 2). Participants were asked to indicate, as quickly and accurately as possible, which collection included the larger number of dots by pressing the appropriate key (i.e., S or L key) on an AZERTY keyboard. Participants were asked to respond within 1500 ms, before the stimulus disappeared. If no response was given within 1500 ms, a blank screen appeared, and participants still had to press one of the two keys (i.e., S or L).
After each response, participants saw a signal (i.e., "?") displayed for an unlimited time, asking them to indicate "Which visual features influenced you to select the largest collection of dots?" On each trial, the experimenter recorded participants' verbal protocols with a tape recorder. Those protocols were subsequently coded by an independent coder for each item based on the information recorded by the experimenter. Two raters who independently classified cue-based strategies on 100 randomly chosen trials agreed on 96% of them.
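The inter-rater agreement reported above is a simple percent agreement, which can be computed as in the following minimal sketch. The function name and the two lists of codes are hypothetical and serve only to illustrate the computation; they are not the study's data.

```python
def percent_agreement(codes_a, codes_b):
    """Percentage of trials on which two raters assigned the same strategy code."""
    if len(codes_a) != len(codes_b):
        raise ValueError("Both raters must code the same trials.")
    if not codes_a:
        raise ValueError("At least one coded trial is required.")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100 * matches / len(codes_a)


# Hypothetical codes for 100 trials (illustration only, not the study's data).
rater_a = ["distance"] * 60 + ["dot size"] * 40
rater_b = list(rater_a)
for trial in (0, 1, 60, 61):  # the raters disagree on four trials
    rater_b[trial] = "total surface"

if __name__ == "__main__":
    print(percent_agreement(rater_a, rater_b))
```

Percent agreement does not correct for agreement expected by chance; a chance-corrected index would require the full cross-classification of codes, which is not reported here.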

Results
Results are reported in two main parts. First, we examined age-related differences in participants' performance (i.e., accuracy and response times). This also served the purpose of determining whether our procedure replicates findings previously reported by Gebuis and Reynvoet (2012) or whether collecting verbal protocols changed participants' approach to the task. Then, we investigated strategies used by young and older participants in the dot comparison task. Preliminary analyses were run with the block factor (i.e., Block 1, 2, 3, and 4). No main or interaction effects involving this factor proved significant. In all the results, unless otherwise noted, differences are significant at p<.05 or better.

Performance
Mixed-design ANOVAs were performed on mean percentages of errors and response times, with a 2 (Age: young, older adults) × 2 (Congruency: congruent, incongruent) × 2 (Condition: fully congruent, partially congruent) design and repeated measures on the last two factors. Results showed main effects of age, congruency, and condition on mean percentages of errors (see Table 1). Young adults made fewer errors than older adults (i.e., 30.1% and 41.5%, respectively; F(1,61)=10.98, MSe=2003, η²p=.15). Participants made fewer errors while comparing congruent items (i.e., 29.0%) than while comparing incongruent items (i.e., 42.6%; F(1,61)=24.86, MSe=11296, η²p=.29), and they made fewer errors in the partially congruent and incongruent conditions than in the fully congruent and incongruent conditions (34.3% and 37.3%; F(1,61)=5.67, MSe=533, η²p=.09). Finally, the Congruency × Condition interaction was significant (F(1,61)=199.60, MSe=42276, η²p=.77). Participants made 39.9% more errors on fully incongruent than on fully congruent items (F(1,61)=166.93, η²p=.73), and 12.7% more errors on partially congruent than on partially incongruent items (F(1,61)=13.16, η²p=.18). Interestingly, these findings replicate results reported by Gebuis and Reynvoet (2012), suggesting that the verbal protocols collected here did not change how participants approached the dot comparison task.
Analyses of response times showed main effects of age and congruency (see Table 2). Young participants were faster (963 ms) than older participants ( These results show that congruency effects were found for both the partially and the fully congruent/incongruent conditions in older adults, in contrast to young adults (who showed congruency effects only in the fully congruent/incongruent conditions). This suggests that, compared to young adults, who were more sensitive to convex hull, older participants were influenced by all the visual features.
In sum, collecting verbal protocols did not change the way participants accomplished the dot comparison task. Similar patterns of results were found both in Gebuis and Reynvoet's (2012) study, where no verbal protocols were collected, and in the present study, where we collected verbal protocols on each item.
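As a reading aid, the partial eta squared values reported in this section can be recovered from each F ratio and its degrees of freedom with the standard identity η²p = (F × df_effect) / (F × df_effect + df_error). The short sketch below applies this identity to three of the effects reported above; the function name is ours and is used only for illustration.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared computed from an F ratio and its degrees of freedom."""
    numerator = f_value * df_effect
    return numerator / (numerator + df_error)


if __name__ == "__main__":
    # (label, F, df_effect, df_error) taken from the error analysis above.
    reported = [
        ("Age", 10.98, 1, 61),
        ("Congruency", 24.86, 1, 61),
        ("Congruency x Condition", 199.60, 1, 61),
    ]
    for label, f_value, df_effect, df_error in reported:
        value = partial_eta_squared(f_value, df_effect, df_error)
        print(f"{label}: {value:.2f}")
```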

Strategies
Examination of verbal protocols collected on each item for each participant revealed that participants used nine different strategies. These strategies were based on different visual features (see Table 2):
-Distance: Participants focused on the (larger or smaller) distance between dots.
-Dot size: Participants based their choices on whether dots were small or large.
-Total surface: Participants focused on the total surface covered by dots.
-Shape: Participants selected a collection with a recognizable shape (e.g., butterfly).
-Dot size and distance: Participants combined the size of dots and the distance between dots.
-Dot size and shape: Participants used the size of dots and assigned a specific shape to the collection at the same time.
-Dot size and total surface: Participants combined the size of dots and the total surface covered by dots.
-Distance and total surface: Participants combined the distance between dots and the total surface covered by dots.
-Dot size, distance, and total surface: Participants combined the size of dots, distance between dots, and total surface covered by dots.
While 3 young and 1 older individuals used only a single cue on all items, 34 young and 25 older individuals used a single cue on some items and several cues on other items. We next analyzed the mean number of strategies used by young and older individuals and how often participants used each strategy. To increase the number of observations for congruent and incongruent items, we disregarded whether items were fully or partially congruent/incongruent.
The mean numbers of strategies used by individuals were analyzed with a 2 (Age: young, older adults) × 2 (Congruency: congruent, incongruent items) ANOVA, with age as the only between-participants factor. Results revealed that young and older individuals used an equal number of strategies (i.e., both groups used three strategies, F<1.0). Participants used significantly more strategies for congruent items than for incongruent items (i.e., 3.6 vs. 3.2 strategies; F(1,61)=6.15, MSe=6, η²p=.09).
Mean percent use of the four strategies that were used on at least 5% of items (i.e., Dot size, Total surface, Distance, and Dot size and distance) was analyzed with an ANOVA involving a 2 (Age: young, older adults) × 2 (Congruency: congruent, incongruent items) × 4 (Strategy: Dot size, Total surface, Distance, Dot size and distance) design, with repeated measures on the last two factors (see Table 3).
Results showed a main effect of strategy (F(3,183)=8.89, MSe=7798, η²p=.13). The distance strategy was used most often, followed by the dot size strategy, the dot size and distance strategy, and finally the total surface strategy.
Moreover, the Congruency × Strategy interaction was significant, showing that participants used strategies in different proportions on congruent and incongruent items (F(3,183)=60.49, MSe=14203, η²p=.50). This interaction was qualified by a significant Age × Congruency × Strategy interaction (F(1,61)=5.94, MSe=31659, η²p=.09). As shown in Table 3, strategy distributions on congruent and incongruent items differed between young and older adults. For congruent items, older adults used the dot size strategy most often (29.5%), followed by the distance strategy and the dot size and distance strategy, and used the total surface strategy least often. For incongruent items, they used the distance strategy most often (39.9%), followed by the total surface strategy and the dot size strategy, and used the dot size and distance strategy least often. For congruent items, young adults used the dot size and distance strategy most often (31.6%), followed by the dot size strategy, the distance strategy, and the total surface strategy. For incongruent items, they used the distance strategy most often (47.2%), followed by the total surface strategy, the dot size strategy, and the dot size and distance strategy.
To summarize, both age groups used the same set of nine strategies and an equal number of strategies while accomplishing the dot comparison task. However, strategy distributions differed between young and older adults when they compared congruent and incongruent items. Young adults tended to use strategies based on a single visual feature while comparing incongruent collections of dots and used strategies based on several visual features for congruent items. Older adults tended to use strategies based on a single visual feature for both congruent and incongruent items.
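The two descriptive measures used in this section (number of different strategies and percent use of each strategy, with a 5% inclusion threshold) can be derived from coded trials as in the following sketch. The function names and the coded trials are hypothetical; the example values are invented for illustration and are not the study's data.

```python
from collections import Counter


def percent_use(coded_trials):
    """Map each strategy label to the percentage of trials on which it was coded."""
    if not coded_trials:
        raise ValueError("At least one coded trial is required.")
    counts = Counter(coded_trials)
    total = len(coded_trials)
    return {strategy: 100 * count / total for strategy, count in counts.items()}


def frequent_strategies(coded_trials, threshold=5.0):
    """Return the strategies used on at least `threshold` percent of trials."""
    usage = percent_use(coded_trials)
    return sorted(s for s, pct in usage.items() if pct >= threshold)


# Hypothetical coded trials for one participant (illustration only).
trials = (["distance"] * 20 + ["dot size"] * 12
          + ["total surface"] * 7 + ["shape"] * 1)

if __name__ == "__main__":
    print(len(set(trials)))             # number of different strategies used
    print(percent_use(trials))          # percent use of each strategy
    print(frequent_strategies(trials))  # strategies above the 5% threshold
```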

General discussion
The objective of the present study was to further our understanding of how young and older adults accomplish numerosity comparison tasks. In this context, we adopted a strategy perspective and examined age-related differences in strategic variations in dot comparison tasks. In addition to performance, we assessed strategies used by participants on each item, and determined whether young and older adults exhibit different strategy repertoires and preferences and whether strategy selection is similarly influenced by item characteristics in both age groups. First, consistent with previous findings reported by Gebuis and Reynvoet (2012), we observed congruency effects in both age groups (i.e., poorer performance in incongruent items relative to congruent items). Older adults were slower and less accurate overall and showed larger congruency effects than young adults. Secondly, and most importantly, we found that several strategies were used to compare two collections of dots. Also, young and older adults had different strategy preferences depending on item congruency. These findings have important implications for our understanding of mechanisms underlying numerosity comparison and age-related differences in these mechanisms.

How do participants accomplish numerosity comparison tasks?
Previous studies on numerosity comparison found that participants are influenced by visual features (e.g., convex hull, surface area) of stimuli, above and beyond numerical properties. Most importantly, previous research aimed at determining which visual feature is most crucial (e.g., Gilmore et al., 2016; Halberda et al., 2008; Mazzocco et al., 2011). For example, Gilmore et al. (2016) tested the influence of dot area and convex hull on participants' accuracy. Results showed that convex hull was most influential on participants' performance. In this context, one issue is whether participants actually rely on only one visual feature (e.g., convex hull) and the same feature for all items, or whether they use different/several features for one item and across all items, an issue that we refer to as strategy variability. Here, we addressed this issue by collecting comparison strategies on each item in the context of a dot comparison task. We found that participants used at least nine different strategies to accomplish the dot comparison task and selected strategies on a trial-by-trial basis. Such evidence of strategic variations in numerosity comparison tasks has important implications for our understanding of mechanisms underlying numerosity comparison performance. First, previous research suggested that numerosity comparison relies on an "approximate number system" (ANS), defined by Dehaene (1997) as a cognitive system that underlies number sense and skills at discriminating between different numerosities. Upon encoding collections of dots, participants visually scan the stimulus, retrieve a numerical representation in long-term memory, compare the difference between the encoded representation and the retrieved representation, and then adjust their answer on the basis of this difference (e.g., Gandini et al., 2008; Siegel, Goldsmith, & Madson, 1982).
Using several strategies, each of which relies on different visual features, does not mean that participants do not process numerical features of dot collections, or that the ANS is not crucial for numerosity comparison. It means that, in addition to, and possibly in correlation with, the numerosity of dot collections, participants do not make use of a single visual feature but use different visual features for different items and sometimes for a given item. Note that the multiple-strategy use found here is not specific to dot comparison tasks. Indeed, Gandini et al. (2008) also found that participants used several strategies when shown dot collections that included 40-460 dots and asked to find as quickly as possible the approximate number of dots in each collection. Thus, multiple-strategy use is characteristic of numerosity comparison and contrasts with a view that emerged from previous research suggesting that participants use a single strategy.
Since participants use several strategies to select the most numerous dot collection, we examined how often they use these strategies and whether their strategy use is influenced by item characteristics. Our findings revealed that participants used strategies with different proportions; they used most frequently the distance strategy, followed by the dot size strategy, the dot size and distance strategy, and finally the total surface strategy. Surprisingly, participants never mentioned that they relied on convex hull. This is in contrast to previous findings suggesting that convex hull is the most influential visual feature on which participants base their smaller/larger selection (e.g., Gilmore et al., 2016). That participants did not refer to convex hull here does not mean they did not rely on it. It only means that this is not the way participants articulated their choices. As convex hull may be used less explicitly or consciously than other strategies, its use may be included in other strategies. For example, when participants mentioned the surface strategies, it is possible that they indeed used the surface strategy for some items and convex hull for other items. Yet, it was simpler for them to claim that they used the surface strategy. This does not undermine the validity of verbal protocols collected here as a number of previous studies in a wide variety of cognitive domains have shown the validity of verbal protocols to investigate strategies. It nevertheless suggests that future studies may examine whether participants use convex hull to compare collections of dots. This could be done in numerous ways, such as providing participants a list of strategies and asking them to select which strategy they used for each item from strategies proposed. 
Also, collecting eye movements may reinforce the validity of verbal protocols to assess strategies and refine the determination of which strategy is used for each item, as previous studies have found, including in numerosity estimation tasks (e.g., Gandini et al., 2008).
Another interesting finding here was that strategy use depended on whether numerical and visual features matched (e.g., larger collections had a larger convex hull) or mismatched (e.g., larger collections were displayed with a smaller convex hull). Individuals tended to use a smaller number of strategies and strategies based on a single visual feature for incongruent items, whereas they used a larger number of strategies and strategies based on several visual features for congruent items. One possible reason why participants used fewer strategies, and strategies based on a single visual feature more often, for incongruent items is that these items require more cognitive resources (i.e., they involve inhibiting irrelevant visual features to focus on the relevant numerosity dimension of the stimulus). Inhibiting a single visual feature requires fewer resources than inhibiting several visual features. This may have led participants to rely on single visual features for incongruent items. For congruent items, as several features match the numerosity of dot collections, it makes sense to rely on several visual features, as the information provided by these features converges on the same smaller/larger decisions. Relying on several visual features increases evidence accumulation in favor of smaller/larger decisions, and both speeds up these decisions and increases confidence in them.
At a very general level, it is important to note that the present findings of strategy variability extend to the numerosity comparison domain strategy variability previously found in many other cognitive domains (e.g., Campbell & Alberts, 2009;Duverne et al., 2004;Gandini et al., 2008;Hodzik & Lemaire, 2011;Hartley & Anderson, 1983). This is interesting because numerosity comparison could be viewed as one of these cognitive domains where strategy variability, if it exists, is less crucial than in many other domains, given how automatically people estimate numerosities of dot collections or are able to quickly see which of the two collections of dots has more dots. As in many other cognitive domains, multiple-strategy use suggests that investigating strategic variations in numerosity comparison should very useful to understand effects already documented in the subject literature, such as display time of collections (e.g., , sequential vs. simultaneous presentation of collections of dots (e.g., Price et al., 2012), or collections of dots displayed either intermixed or separately (e.g., Price et al., 2012). Note that here collections of dots were displayed for 1500 ms to both young and older adults. It could be argued that some age differences (e.g., larger use of multiple cues in young than in older adults) found here may be the result of the same presentation duration to both age groups. Given that older adults are known to process information more slowly than young adults (e.g., Salthouse, 1996), they may have not had enough time to use multiple cues. Although this is a possibility, in the present as well as in previous studies, older adults made their judgment in less than 1000 ms (e.g., Capelletti et al., 2014, found that older adults responded in 633 ms; see also Norris et al., 2015). Therefore, it is reasonable to think that 1500 ms was enough for older adults to accomplish our dot comparison tasks. 
Of course, given the differences in procedures across studies (e.g., we collected verbal protocols here, whereas previous studies did not), it is possible that the presentation duration of stimuli may result in age-related differences regarding strategic aspects of performance in dot comparison tasks, an issue that may be investigated in future studies.

Age-related differences in numerosity comparison strategies.
Strategy variability has important implications for understanding individual differences in numerosity comparison, such as aging effects. Here, we found both similarities and differences between young and older adults' strategies. First, young and older individuals used the same number and the same set of strategies. To our knowledge, the present study is the first to reveal no age-related differences in strategy repertoires. Previous work has reported age-related differences in the strategy dimensions involved in accomplishing cognitive tasks. Older adults tend to use simpler strategies and fewer strategies than young adults in a wide variety of cognitive domains, including arithmetic (e.g., Hodzik & Lemaire, 2011;Lemaire & Arnaud, 2008), reading (e.g., Shake, Noh, & Stine-Morrow, 2009), selective attention (e.g., Folk & Hoyer, 1992), episodic memory (e.g., Kuhlmann & Touron, 2012), and decision making (e.g., Mata & Nunes, 2010). The fact that strategy repertoires were the same in young and older adults here in dot comparison tasks suggests that age differences in strategy repertoire are not systematic and depend on cognitive domains and/or tasks. As discussed by Lemaire (2016), the literature does not provide sufficient data across multiple cognitive domains to identify the defining features of the domains or tasks in which young and older adults differ in the type and number of strategies, and of those showing no age differences.
Interestingly, the present study revealed that age-related differences in how often strategies were used depended on item congruency. Although young and older adults' strategy preferences were the same when comparing incongruent items, these preferences differed when comparing congruent items. Such findings are important when analyzing age differences in effects of item characteristics (such as item congruency) on participants' performance. Indeed, previous findings revealed that young and older adults' performance was the same on congruent items but differed on incongruent items (e.g., Cappelletti et al., 2014;Norris et al., 2015), yielding larger congruency effects in older than in young adults. The present results suggest that older adults' poorer performance on incongruent items relative to young adults does not stem from young and older adults using different strategies and/or from age differences in strategy preferences. Discarding age-related differences in strategies reinforces Cappelletti et al.'s proposal that age-related differences in performance on incongruent items are the result of age-related differences in inhibition.
One limitation of the present study is that, because strategy selection biases left too few observations in some conditions, we could not examine whether congruency effects differed as a function of strategies. Future studies may further investigate whether age-related differences in congruency effects are modulated by strategies, using procedures that make it possible to collect enough data points in each condition and that control for strategy selection biases. This is possible in experiments forcing young and older individuals to execute each available strategy on all items, as proposed in Siegler and Lemaire's (2017) choice/no-choice method.
In conclusion, the present study is the first to examine multiple-strategy use in young and older adults while accomplishing dot comparison tasks. It showed that both age groups used multiple strategies, that the two groups differed in strategy preferences, and that strategy use depended on item characteristics. Such age-related and condition-related differences in strategy use are important to take into account in future work on numerosity comparison, given that strategies differ in relative efficiency. Future work may adopt the present strategy approach to further investigate effects of stimulus, participant, and situation characteristics on performance when participants accomplish numerosity comparison tasks.
Financial Support: This work was supported by the Centre National de la Recherche Scientifique (French National Science Foundation), by a grant from the Agence Nationale de la Recherche (Grant # ANR-17-CE28-0003-01-01) to PL, and by a doctoral fellowship to AR from Aix-Marseille University.

Conflict of Interest Statement:
The authors declare that the research was conducted in the absence of any commercial or financial relations that could be construed as a potential conflict of interest.
Ethics statement: When the study was conducted, it was neither compulsory nor customary in France to seek explicit ethical approval for a study including only behavioral data, unlike studies including brain-imaging data. However, the participation in this study was anonymous and we used codenames in the data analyses. Before each experiment, written informed consent was obtained including the (a) research objectives, (b) procedure, (c) anonymity of data, and (d) possible benefits and/or risks of participation. Also, participants were informed that participation was voluntary and could be terminated at any time without any reason or negative consequences for the participant. The experiment started when each participant had read the information in the informed consent, and agreed to the rules of participation.
Dot size + Distance 11.0 11.5
Table C. Young and older adults' mean response times (in ms) and percentages of errors in the dot comparison task for each strategy on congruent or incongruent items.

Reaction times (ms) Error rates (%)
Young adults Older adults Total Young adults Older adults Total