Advances in the application of a computational Theory of Visual Attention (TVA): Moving towards more naturalistic stimuli and game-like tasks

Abstract: The theory of visual attention, "TVA", is an influential formal theory of attentional selection. It is widely applied in clinical assessment of attention and in fundamental attention research. However, most TVA-based research is based on accuracy data from letter report experiments performed in controlled laboratory environments. While such basic approaches to questions regarding attentional selection are undoubtedly useful, recent technological advances have enabled the use of increasingly sophisticated experimental paradigms involving more realistic scenarios. Notably, these studies have in many cases resulted in different estimates of capacity limits than those found in studies using traditional TVA-based assessment. Here we review recent developments in TVA-based assessment of attention that go beyond the use of letter report experiments and experiments performed in controlled laboratory environments. We show that TVA can be used with other tasks and new stimuli, that TVA-based parameter estimation can be embedded into complex scenarios, such as games that can be used to investigate particular problems regarding visual attention, and that TVA-based simulations of "visual foraging" can elucidate attentional control in more naturalistic tasks. We also discuss how these developments may inform future advances of TVA.


Introduction
How can we predict the way humans orient visual attention in their environment? This is a long-standing problem in cognitive psychology, with decades of research devoted to it (see Driver, 2001; Á. Kristjánsson & Egeth, 2020, for historical reviews). One proposal with a long history is that visual attention can be conceptualized as a biased competition (e.g., Desimone & Duncan, 1995) in which possible categorizations of visual stimuli race for encoding into visual short-term memory (VSTM). Bundesen's Theory of Visual Attention (TVA; Bundesen, 1990) formalizes this idea in a computational framework that links data to theoretical concepts with mathematical rigor. Most theoretical accounts of visual attention (e.g., Duncan & Humphreys, 1989; Treisman & Gelade, 1980; Wolfe, 2021) only enable qualitative predictions about human attention performance. Among those that are quantitative and have concrete implementations, the majority are image-processing models, mostly focusing on the computation of visual salience and the prediction of priority maps for given inputs (e.g., Itti, Koch, & Niebur, 1998; see Zelinsky et al., 2020, for recent developments that focus on top-down guidance and deep neural network approaches). Their outputs can be compared to human behavioral data (e.g., gaze data), but they cannot be fitted directly to empirical data from psychological experiments. By contrast, TVA provides a framework from which quantitative models can be derived and then fitted to experimental data to estimate psychologically meaningful parameters. Consequently, TVA allows for the derivation of quantitative hypotheses that can be formally tested. This mathematical rigor makes TVA more likely than many other approaches to enable cumulative progress within this research area (cf. Krüger et al., 2018).
As stated above, a central assumption in TVA is that categorizations of visual stimuli race in parallel for encoding into VSTM under the influence of different attentional control parameters. This results in considerable flexibility that is not present in other approaches such as feature integration theory (FIT; Treisman & Gelade, 1980). Even though FIT assumes a parallel feature processing stage, it posits a subsequent feature integration stage for stimulus selection. This has been used to explain serial search patterns in single-target search displays that include targets defined across multiple feature dimensions, requiring integration (Wolfe, 1994). However, data from more naturalistic tasks, such as the visual foraging paradigm in which participants search for many targets of various types, have shown that the impact of dealing with multiple feature dimensions is more gradual (e.g., it can be modulated by time pressure; see T. Kristjánsson et al., 2018), and so-called "super foragers" apparently experience only minor limitations compared to foraging for single-feature targets; to put it humorously, they forage as if they had never heard of feature integration theory. This highlights how dramatically their behavior differs from FIT predictions. Such theories, therefore, seem too rigid to explain search patterns in less restrictive tasks (cf. Á. Kristjánsson et al., 2020), and with its conjoint process of feature processing and categorization, TVA offers promise for improving on this.
The tendency within attention research has long been to try to uncover basic operational principles of visual attention with simple stimuli. For instance, central ideas within Treisman and Gelade's (1980) feature integration theory and Wolfe's (2021) guided search model have been tested with simple features (colors, shapes, orientations, etc.) and their combinations. The majority of TVA-based research is based on accuracy data from letter report experiments, a task-model combination commonly applied to estimate attentional parameters in clinical contexts. While such basic approaches are undoubtedly useful, recent technological advances have enabled the testing of increasingly sophisticated paradigms involving more realistic scenarios, and notably those studies have in many cases revealed different capacity limits than studies using simple but well-controlled stimuli (Á. Kristjánsson & Draschkow, 2021). Virtual reality experiments can increase immersion in real-world-like settings, and touch-screen technology allows direct interaction with the stimuli. It would therefore be highly useful to develop extensions to the basic TVA methodology to address attentional allocation in such paradigms.
Here we provide a review of recent advances in TVA-based assessment of attention that go beyond the use of simple letter report experiments and experiments performed in controlled laboratory environments. We show that TVA can be used with other tasks and new stimuli, that TVA-based parameter estimation can be embedded into complex scenarios, such as games that can be used to address certain problems regarding visual attention, and that TVA-based simulations of visual foraging can elucidate attentional control in naturalistic tasks. Figure 1B-O provides a quick overview of the range and diversity of such experiments. We believe that these developments will be able to inform future advancement of TVA.

Theory of Visual Attention
In this section, we give a short introduction to TVA and TVA-based assessment. TVA assumes that both recognition and selection of objects in the visual field involve making visual categorizations. A visual categorization has the form 'object x has feature i' (or equivalently, 'object x belongs to category i'), where x is an object in the visual field (e.g., a letter) and i is a perceptual feature (e.g., a certain color, shape, or orientation). A categorization of an object is made (or equivalently, selected) when the categorization is encoded into visual short-term memory (VSTM). Clearing VSTM starts a race among categorizations to be encoded into VSTM. Each object may be represented in the race by all possible categorizations of the object. An object becomes encoded in VSTM if, and only if, some categorization of the object becomes encoded in VSTM. TVA assumes that the capacity of VSTM is limited to K different objects.

Rate equation
The rate equation of TVA gives the processing rate (speed), v(x,i), at which a particular visual categorization (e.g., 'object x has feature i') is made, as a product of three terms: The first term, η(x,i), is the strength of the sensory evidence that object x has feature i. The second term, β_i, is a perceptual decision bias associated with feature i. The third term is the attentional weight of object x, w_x, divided by the sum of attentional weights across all objects in the visual field, S:

v(x,i) = η(x,i) β_i (w_x / Σ_{z∈S} w_z)

The total visual processing speed, C, is defined as the sum of all v values across all visual features, R, of all objects in the visual field, S:

C = Σ_{x∈S} Σ_{i∈R} v(x,i)
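As an illustration, the rate equation translates directly into a few lines of Python (a minimal sketch; the values chosen for η, β, and the attentional weights are invented for the example):

```python
# Minimal sketch of TVA's rate equation (all parameter values invented).

def processing_rate(eta_xi, beta_i, w_x, all_weights):
    """v(x,i) = eta(x,i) * beta_i * w_x / sum of all attentional weights."""
    return eta_xi * beta_i * w_x / sum(all_weights)

# Two objects with attentional weights 2.0 and 1.0: the first object
# receives two thirds of the processing capacity devoted to the display.
weights = [2.0, 1.0]
v = processing_rate(eta_xi=30.0, beta_i=0.9, w_x=weights[0], all_weights=weights)
print(round(v, 2))  # 30.0 * 0.9 * (2/3) = 18.0
```

Summing such v values over all objects and all of their features would yield the total visual processing speed C.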

Weight equation
The attentional weights in the rate equation are derived from pertinence values. Every visual feature j is assumed to have a pertinence value, π_j, reflecting the current importance of attending to objects with feature j. The attentional weight of an object x is given by the weight equation of TVA,

w_x = Σ_{j∈R} η(x,j) π_j,

where R is the set of all visual features, η(x,j) is the strength of the sensory evidence that object x has feature j, and π_j is the pertinence value of feature j.
In combination, the rate and weight equations of TVA describe two core mechanisms responsible for the selection of objects in the visual field. As an example, consider an experiment in which red letters are to be reported and blue letters are to be ignored. By setting the pertinence value of red high and the pertinence value of blue low, the weight equation of TVA states that the red letters will receive high attentional weights. If, in combination, the perceptual decision biases associated with letter types (A, B, C, etc.) are set high, the rate equation of TVA states that categorizations of the red letters with respect to letter types will have the highest processing rates and, thus, are likely to win the race and be encoded into VSTM.
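This selection scenario can be sketched in Python as follows (a hypothetical illustration; the η, π, and β values below are invented, since TVA does not prescribe particular numbers):

```python
# Weight equation followed by the rate equation for the red-target /
# blue-distractor example (all numerical values are invented).

def attentional_weight(eta_x, pertinence):
    """Weight equation: w_x = sum over features j of eta(x,j) * pi_j."""
    return sum(eta_x[j] * pertinence[j] for j in pertinence)

pertinence = {"red": 1.0, "blue": 0.1}      # task set: attend to red
eta = {
    "red_A":  {"red": 1.0, "blue": 0.0},    # a red letter 'A'
    "blue_B": {"red": 0.0, "blue": 1.0},    # a blue letter 'B'
}
weights = {obj: attentional_weight(f, pertinence) for obj, f in eta.items()}

# Rate equation with a high bias for letter-type categorizations
# (beta = 0.9) and a nominal sensory-evidence strength of 20:
beta, eta_letter = 0.9, 20.0
total_w = sum(weights.values())
rates = {obj: eta_letter * beta * w / total_w for obj, w in weights.items()}
print(rates)  # the red letter's categorization races ~10 times faster
```

With these invented values, the red letter receives ten times the attentional weight of the blue one, and its letter-type categorization correspondingly races ten times faster for VSTM encoding.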
Attention can be a vague concept, and different definitions can be found. Some researchers have even argued that there is no such thing as attention and that the term is at best a label for a certain category of experimental findings (e.g., Anderson, 2010). Importantly, according to this cautious account, attention is not the cause of certain processes or phenomena but a label for their consequences. TVA agrees quite well with this account: it does not contain any notion of attention as a cause. For instance, while the attentional weights do reflect the distribution of attention, their calculation is precisely defined on the basis of other components that combine the visual input and the task set. With regard to the vagueness often associated with concepts of attention, TVA can help by providing a mechanistic description of how objects are encoded into VSTM and how this process is modulated by factors typically attributed to attention. For example, an increase in an attentional weight is associated with a set of formally defined potential causes and consequences. For an in-depth description of TVA's mechanisms, how they relate to typical experimental findings, and how they may function at a neural level, we refer readers to Bundesen and Habekost (2008).

TVA-based assessment
Traditionally, TVA-based assessment has relied on experiments in which letters are used as stimuli. In whole-report experiments (e.g., Sperling, 1960), all letters should be reported, whereas in partial-report experiments (e.g., Duncan et al., 1999), only letters with a prespecified characteristic (for example, a feature such as color) should be reported and letters with another color should be ignored. Vangkilde et al. (2011; see also Sørensen et al., 2015) introduced the CombiTVA paradigm, combining whole-report and partial-report letter displays in a single experiment (see Figure 1A). This made TVA-based assessment easier and parameter estimation very reliable (Habekost, Petersen, & Vangkilde, 2014). Based on the CombiTVA paradigm, five parameters can be estimated: (1) K, the capacity of visual short-term memory, measured in number of letters; (2) C, the total visual processing speed (capacity), measured in letters processed per second; (3) t_0, the perceptual threshold, measured in seconds; (4) α, the top-down controlled selectivity, defined as the ratio between the attentional weight of a distractor and the attentional weight of a target, so that α values close to 0 indicate optimal selection of targets and values close to 1 indicate no prioritization of targets compared with distractors; and finally (5) w_index, the laterality index of the spatial distribution of attentional weighting, defined as the ratio between the sum of attentional weights assigned to objects in the left hemifield and the sum of attentional weights across the entire visual field. A w_index value close to 0.5 indicates unbiased spatial weighting of attention, whereas values close to 0 reflect a right-sided bias and values close to 1 indicate a left-sided bias.
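To give an intuition for how K, C, and t_0 jointly shape whole-report performance, the following Monte Carlo sketch simulates a simplified whole-report trial (equal attentional weights, a fixed effective exposure, and illustrative parameter values; it is not the maximum-likelihood fitting procedure used in actual TVA-based assessment):

```python
# Simplified Monte Carlo sketch of a TVA whole-report trial.
import random

def mean_whole_report_score(n_letters=6, K=4, C=60.0, t0=0.02,
                            exposure=0.10, n_trials=20000, seed=1):
    random.seed(seed)
    v = C / n_letters                  # per-letter rate with equal weights
    tau = max(exposure - t0, 0.0)      # effective exposure duration
    total = 0
    for _ in range(n_trials):
        # Exponential race: arrival times of the letters' categorizations.
        arrivals = sorted(random.expovariate(v) for _ in range(n_letters))
        # A letter is reported if it finishes before stimulus offset and
        # still fits into one of the K VSTM slots.
        total += sum(1 for t in arrivals[:K] if t <= tau)
    return total / n_trials

print(mean_whole_report_score())       # roughly 3 of 6 letters reported
```

Raising C (faster processing), raising K (more VSTM slots), or lowering t_0 (earlier start of processing) each increases the mean number of reported letters, which is the qualitative pattern the whole-report model exploits when fitting accuracy across exposure durations.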
The main application of TVA-based assessment has been to study attentional deficits in a range of neurological disorders and psychiatric conditions such as neglect, simultanagnosia, reading disturbances, neurodegenerative diseases, and neurodevelopmental disorders (see Habekost, 2015, for a review). Besides clinical studies, TVA-based assessment has also been used to study attentional functions in young healthy participants following physiological or cognitive interventions such as transcranial direct current stimulation (tDCS; Brosnan et al., 2017), pharmacological intervention (Finke et al., 2011; Vangkilde et al., 2011), meditation (Jensen et al., 2012; Lansner et al., 2019), and video gaming (Wilms et al., 2013; Schubert et al., 2015).

TVA-based assessment beyond letters
In this section, we review TVA-based assessment of attention with stimuli beyond letters. The majority of TVA-based studies have used experimental paradigms in which participants identify and report letters. Letters are used for the TVA-based assessment of attention described above and also in fundamental attention research (e.g., concerning the time course of attention; see A. Petersen et al., 2012). Typically, letters were used out of convenience, as highly overlearned stimuli that are easy to work with for most participants and for the experimenters, who can record responses with standard PC keyboards. Moreover, the set of response options is large, keeping levels of chance performance from guessing low. Most of the time, letters were used not because researchers were interested in letter processing per se, but to draw on these benefits. In fact, letters can be a hindrance for research that focuses on specific stimulus properties which cannot easily be combined with letters (e.g., a letter can be colored differently, but many alterations to its shape would render it illegible). Similarly, certain participant groups cannot read, such as patients with certain disorders, young children, or animals. These participant groups have proven highly informative for attention research with other paradigms and would likely advance TVA-based research were it not for the restrictions of using letters. Finally, letter processing seems to be a highly specialized capability, rendering the use of letters problematic from a fundamental perspective, as it might not illuminate the most general characteristics of attention and visual processing. For instance, using letters could have an impact on the lateralization of attentional parameters, producing a slight right hemifield processing advantage which might originate from the use of verbalizable material (Brosnan et al., 2017; Kraft et al., 2015).
Responding to these challenges, some researchers have started doing TVA-based assessment with stimulus material other than letters. These experiments still use the standard paradigms such as whole and partial report (recently also temporal-order judgments) in typical lab-based settings. However, using versatile stimulus material is a step toward dealing with the immense variability in the natural world, and we hence review these approaches here.

[Figure 1 caption fragment: In (b-o), parameter values are only reported for the main experimental conditions and not for control conditions with typical TVA tasks (such as CombiTVA on PCs). TVA-TOJ refers to the model for temporal-order judgments (cf. Tünnermann et al., 2015); TVA-ExG refers to the ex-Gaussian model described in Dyrholm et al. (2011); TVA-Ex indicates that the basic shifted exponential arrival time model was applied. This figure is best viewed in color with a large zoom.]

Starrfelt et al. (2013) and Sand et al. (2016) compared the recognition of letters to that of three-letter words to study the finding that words are recognized better than single letters (see Figure 1B-D). Starrfelt et al. also compared a single-item to a multi-item whole-report condition, and Sand and colleagues also tested the single-item effect in the peripheral visual field. These studies report the TVA parameters t_0, C (multi-item condition) or v (single-item condition) and, in Experiment 3 of Starrfelt et al., K and w_index as a laterality index. While there was no influence of the different conditions on t_0 in the study by Starrfelt et al., C was increased for words as compared to letters in the single-item condition. By contrast, with whole report of peripheral items, C and K were smaller for words. The study by Sand et al. (2016) used the single-item paradigm. Advantages for the processing of centrally presented words (smaller t_0 and higher v) were found in two experiments. The authors also found some evidence for an opposite effect (a letter processing benefit compared to words) at peripheral left visual field locations, manifested in higher v (and smaller t_0 in one of the experiments). Processing in the right visual field did not show significant differences for words compared to letters.
Another good example is Wang and Gillebert (2018), who used line drawings of fruits and vegetables in a whole/partial report combination analogous to the letter-based CombiTVA paradigm (see Figure 1E). Instead of letter names, keys with corresponding pictograms were used to indicate the response. To examine the reliability of TVA-parameter estimates with such stimuli, Wang and Gillebert had healthy participants work on the usual CombiTVA paradigm and their food-picture-based version and estimated five TVA parameters. As can be seen in Figure 1E, the parameters obtained from line drawings indicate lower performance on all parameters compared to estimates from letter-based assessment (e.g., see Figure 1A). However, the estimates correlated significantly with those from the letter task and also showed good internal reliability in split-half correlation tests. Other studies using similar approaches include Dall et al. (2021), who used Chinese characters (Figure 1F) differing in familiarity and number of elements and reported that only familiarity had an influence on the parameters K and C. Peers et al. (2005) successfully used single letters and faces to assess the processing speed, v, in patients suffering from different lesions and in healthy controls (see Figure 1G). The difference between letters and faces in healthy controls would be interesting for the present paper, but it was not explicitly tested; nor does it seem to be large.
One development which has improved the applicability of TVA to non-letter and more natural stimuli is the modelling of temporal-order judgment tasks with TVA. The temporal-order judgment (TOJ) is a well-established experimental task, which helped shape nascent 19th-century experimental psychology (Boring, 1957; Hoffmann, 2006). Participants judge the temporal order of two stimuli (uni- or multimodal) that appear in close succession. If participants make a binary judgment ("stimulus x first" vs. "stimulus y first"), the judgment data, summarized as the proportion of "stimulus x first" responses per onset interval, can be modelled with TVA (Tünnermann et al., 2015; Tünnermann et al., 2017), yielding a relative attentional weight estimate, w_x*, for stimulus x and a common C (TVA-TOJ model). Alternatively, the model can be parametrized with two processing rates (e.g., v_x = C⋅w_x* and v_y = C⋅(1 - w_x*)), one for each stimulus.
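The logic of the TVA-TOJ model can be illustrated with a small Monte Carlo sketch (a simplification that ignores t_0 and uses invented values for C and w_x*; the published model works with closed-form equations rather than simulation):

```python
# Monte Carlo sketch of the TVA-TOJ race between two stimuli.
import random

def p_x_first(soa, C=60.0, w_x=0.6, n_trials=20000, seed=0):
    """Proportion of 'stimulus x first' judgments at a given SOA (in s).
    Positive SOA means stimulus x is presented soa seconds AFTER y."""
    random.seed(seed)
    v_x, v_y = C * w_x, C * (1.0 - w_x)        # rates from C and w_x*
    wins = 0
    for _ in range(n_trials):
        t_x = soa + random.expovariate(v_x)    # arrival in the VSTM race
        t_y = random.expovariate(v_y)
        wins += t_x < t_y
    return wins / n_trials

# Sweeping the SOA traces out the psychometric function; at SOA = 0 the
# proportion equals w_x* itself (0.6 here).
for soa in (-0.05, 0.0, 0.05):
    print(soa, round(p_x_first(soa), 2))
```

Fitting the corresponding closed-form curve to observed judgment proportions across SOAs is what yields the estimates of C and w_x* in the studies discussed below.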
An important advantage of the TOJ task is that it is very simple: animals, cognitively impaired or illiterate people, and young children, who would all have trouble with the letter task, can make such judgments. Also, it can be applied to a wide range of stimuli. For instance, several TVA studies have used oriented and colored bars as often applied in studies on visual salience (Krüger et al., 2016, 2017; Krüger & Scharlau, 2021).
To quantify attention toward objects in the near space in which the observer could perform actions (e.g., grasping the object), compared to similar objects outside of this range, Tünnermann, Krüger, and Scharlau (2017) employed natural images in which two objects flickered asynchronously to implement a TOJ task (see Figure 1H). One object was in the immediate action space and the other one in a background area. A control condition used upside-down versions of the images, in which the action-space relationship was expected to be disturbed. The study revealed an action-space attention advantage, reflected in the relative attentional weight of the "probe" stimulus, w_p*, being larger than 0.5 (i.e., more attentional weight on the "probe" stimulus compared with the "reference" stimulus). A similar advantage was found with the upside-down images, indicating either that the action-space relationship was not successfully reduced or that there was a salience bias toward the action-space objects.
One interesting example of how TVA in combination with TOJ tasks can be used to cast new light on attentional studies comes from Tünnermann and Scharlau (2018a), who applied TVA-TOJ modelling to data from a study that used air puffs to stimulate the whiskers of mice. Similarities between rodent whisker sensing and vision (e.g., both feed into spatial representations of the environment; see C. C. Petersen, 2007) justify, if only as a working hypothesis, the application of the theory-based parameters of TVA, which might be well suited to uncover the processes underlying TOJs in mice. Tünnermann and Scharlau found that some mice showed a leftward bias (increased processing rate for left air puffs) but that the overall processing resources were roughly constant across mice. These conclusions became possible because of TVA's independent components of attention and hierarchical Bayesian modeling. Importantly, they differed from those of the original study, which reported a strong leftward bias in all mice that was assumed to reflect lateralization of the TOJ to the right hemisphere (Wada et al., 2005).
A caveat of this experimental method is that, so far, only the order of two stimuli can be modelled by TVA, which makes it impossible to derive an estimate of K. Moreover, the parameter t_0 cannot be estimated without further assumptions (see Tünnermann & Scharlau, 2018b, for a version that estimates t_0 based on the assumption that the race for VSTM is reset if the second stimulus appears within the t_0 range of the first stimulus). Nevertheless, combining TOJ tasks and TVA-based modelling can clearly provide a basis for advances toward integrating TVA into richer, more naturalistic scenarios and games, which is discussed below.

TVA outside the lab
In this section we review recent trends towards taking TVA-based research out of laboratory settings. The goal behind this is twofold: (1) moving outside the lab can make the method more accessible to various populations of interest, and (2) bringing tasks into behaviorally richer environments such as games and virtual reality enables studying attention under more ecologically valid conditions (Á. Kristjánsson & Draschkow, 2021). These approaches can also be combined.
The most straightforward possibility for moving TVA-based assessment of attention out of the lab is to implement typical TVA-based tasks (e.g., CombiTVA or TOJs) on portable devices such as tablets or smartphones, or with web-browser technology as online experiments. With these approaches, the task can be delivered more easily to participant groups which lab-based assessment typically does not reach. These include clinical populations that require bedside administration, as well as large international samples.
Conducting experiments which are typically highly controlled on consumer-grade mobile devices, or even on the unknown systems of internet users in highly variable, uncontrolled environments, raises concerns about reliability and comparability with typical lab-based measurements. However, several studies have shown that typical patterns of experimental results can be replicated using, for instance, smartphones (e.g., Brown et al., 2014) and that many issues with presentation can be circumvented (Woods et al., 2015). Recently, two TVA studies have addressed whether or not such measurements would be useful and sufficiently reliable. Wang et al. (2021) implemented the CombiTVA paradigm on an Android-based tablet and compared it to the typical PC-based CombiTVA paradigm (see Figure 1I). Analyses of the effect of device type on TVA parameters showed only a slightly lower w_index (i.e., a slightly larger right-sided bias of attention) and a higher t_0 when the CombiTVA paradigm was used on a tablet compared with the PC-based version. The tablet (but not the PC screen) was placed in the lower part of the visual field, where horizontal attentional asymmetries of this kind have also been observed with other tasks, potentially explaining the deviation in w_index values (as the authors highlight). Concerning the higher t_0, the authors speculated that this might result from the relatively slow pixel response time of the tablet's display (~5 ms).

[Figure caption fragment: The TVA-based model assumes that both stimuli race for VSTM entry and that their arrival order determines the temporal-order judgment; due to its attention benefit, the probe overtakes the reference stimulus in this example, despite being presented after it. (c) The resulting psychometric distribution of "probe first" judgments can be modeled by TVA-based equations, leading to estimates of the TVA parameters C (total visual processing speed) and w_p*, the relative attentional weight for the probe stimulus.]
Apart from the use of the tablet for stimulus presentation and responses, the general settings were kept constant between the devices. That is, participants performed both conditions (tablet and regular PC-based versions) under controlled lighting and viewing conditions. Wang et al. (2021) therefore concluded that their findings on the reliability of tablet-based testing "reflect the upper range of what can be achieved in clinical practice" (p. 11). To quantify this reliability, all estimated parameters from the tablet version (K, C, t_0, α, and w_index) were correlated with the corresponding estimates from the PC-based measurements, leading to coefficients ranging from .67 to .93. A measure of concordance in terms of absolute values indicated a similar range, indicating "inadequate" concordance between tablet and PC measurements only for the t_0 and w_index parameters. The test-retest reliability was significant on both devices and turned out to be in similar ranges, except for w_index, which was significantly more reliable on the tablet. The internal reliability (assessed with split-half correlations) was comparable for tablet- and PC-parameter estimates, except for the estimate of C, which was significantly more reliable in the tablet version. Wang et al. therefore concluded that tablet-based tests can provide reliable TVA-parameter estimates and may be a promising tool in clinical contexts.

Krüger et al. (2021) used mobile devices in three TOJ experiments to measure C and w_p*. The experiments used arrays of tilted line segments on the left and right side of a fixation mark, each of which contained a slightly larger element as a target (see Figure 1J). On half of the trials, one of the targets was salient. To implement a TOJ, the two targets flickered asynchronously with varying SOAs. In Experiment 1, the probe target (whose attentional benefit is measured) was an orientation singleton; in Experiments 2 and 3, a color singleton. Viewing distance, lighting conditions, and device type were not held constant as in Wang et al. (2021), so the reliability estimates for these experiments may indicate the lower range of what could be expected if the experimental program were optimized for a particular device and viewing and lighting conditions were at least somewhat controlled, as might be possible in many scenarios.
The estimated TVA parameters fell within reasonable ranges but differed in magnitude between device types, particularly for the total processing rate, C, which was 10 to 24 Hz lower in the mobile-assessment condition. The mismatch could be due to differences in stimulus presentation (e.g., retinal size at different screen sizes and viewing distances) or technical differences in how well the various devices could handle presentation timing. The C estimates from the mobile and PC-based conditions were positively correlated, with coefficients ranging from .58 to .67.
The relative attentional weight of the probe, w_p*, was estimated at between .53 and .6 (.5 indicates no attention benefit for the probe), depending on the salience manipulation, in the PC-based control condition. In the tablet condition, it was smaller by .01 to .03. Although there was no correlation between the w_p* of mobile and PC-based assessment in the orientation salience experiment, the correlations for the two color salience experiments were .61 and .32, respectively. Also, effect sizes in terms of Cohen's d and the "probability of superiority" (the probability that a random observation from one group has a larger value than a random observation from the other group) were calculated. Effect sizes for the salience effect were very high in the color experiments and did not differ between the mobile and PC-based conditions. In accordance with the w_p* results reported above, the effect size was moderate in the PC-based condition of the orientation experiment and small in its mobile condition (which is, as may be noted in passing, the most likely cause of the uncorrelated w_p* values in this experiment). In sum, even uncontrolled experiments on mobile devices allow sufficient precision to measure effects on both the total processing capacity, C, and attentional weights in the TVA-TOJ design, confirming the results of other studies and reviews of online experimentation (e.g., Brown et al., 2014; Germine et al., 2012; Lumsden et al., 2016; Woods et al., 2015).
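The "probability of superiority" is straightforward to compute from raw data; a minimal sketch with invented example values (ties counted as half, one common convention):

```python
# Sketch of the "probability of superiority" effect size: the chance
# that a random observation from one group exceeds a random observation
# from the other group (example data are invented).
from itertools import product

def probability_of_superiority(xs, ys):
    comparisons = [(x > y) + 0.5 * (x == y) for x, y in product(xs, ys)]
    return sum(comparisons) / len(comparisons)

salient = [0.62, 0.58, 0.66, 0.60]    # e.g., w_p* with a salient probe
baseline = [0.50, 0.52, 0.48, 0.55]   # e.g., w_p* without salience
print(probability_of_superiority(salient, baseline))  # → 1.0 here
```

A value of .5 indicates no difference between the groups; 1.0 means every observation in the first group exceeds every observation in the second.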
While the results are reassuring overall, an additional question is how well the theoretically derived model describes the data in the two conditions, mobile and PC-based. To answer this question, Krüger et al. (2021) computed Bayesian posterior predictive p values, which give the probability of observing the present or more extreme data under the fitted model. In all cases, the PC-based condition provided better model fits than the mobile condition. Also, model fits were worse when small temporal intervals were compared with longer ones. Since timing problems in browsers can be expected with small temporal intervals, this finding gives the comparison some credibility. Furthermore, Tünnermann and Scharlau (2018b) have argued that at very small intervals between the two stimuli, the course of the measurements reflects additional processes that are not captured by the TVA-TOJ model. Numerically, model fit was good for larger intervals in the PC-based condition; for all other combinations, it was low. However, the agreement of the data with the estimated curves provided no evidence for systematic misestimates of any parameters. Note also that quantitative model fits need to be weighed against theoretical model plausibility, which is high for TVA because of its basis in known cognitive parameters. Also, visually, model fit was good; that is, the judgment curves predicted by the model closely matched the actual data recorded for most of the participants.

Foerster et al. (2019) also attempted to make TVA-based assessment more accessible outside the lab, in particular for clinical populations. They focused on standardizing task presentation. Extending their earlier work (Foerster et al., 2016), in which they estimated the C, K, and t_0 parameters in experiments where participants performed a whole-report task with one of the first head-mounted VR displays that became available (see Figure 1K), they now obtained these parameters in addition to α and w_index in a VR-based CombiTVA task on a current consumer-grade device. The rationale was that with VR headsets, factors like viewing distance, angle, and lighting conditions would no longer vary as much as they do across different PC/monitor setups. Foerster and colleagues asked whether these devices can present stimuli accurately enough for reliable estimates of TVA parameters. They included a PC-based control condition in which participants performed the same task. The parameter estimates from the VR headsets correlated with those from the PC baseline (coefficients from .54 to .83, with the lowest correlation found for α). They also estimated test-retest reliability, finding no significant differences in reliability between the VR headset and PC conditions.
The findings of Foerster et al. (2016, 2019) and Wang et al. (2021) demonstrate that reliable TVA measurements can be obtained with mobile tablets set up in controlled environments and with VR headsets even without environmental control. In most cases, the parameters were comparable to the control conditions; hence these methods seem to enable measurement of TVA parameters whose absolute values can be compared between patients or between different studies from the fundamental domain. The less constrained approach of Krüger et al. (2021) might not be suited to obtaining values with absolute meaning, but it nevertheless allows detection of meaningful patterns in TVA parameters, such as increased attentional weights due to salience. Hence, even without basic experimental control, this approach can be used to answer many questions concerning attention and processing speed.
While the studies above all aimed at quantifying the usability of new measuring techniques, a recent study by Scharlau et al. (2020) used the insight that TVA parameters recorded under less controlled conditions are useful to target attention-related research questions in online experiments. Using a paradigm similar to that of Krüger et al. (2021; Experiments 1 to 3), they studied the influence of negation in language on visual processing capacity. As the study was carried out during the COVID pandemic, participants had to perform the experiments at home on the device at hand. Apart from advising against large screens, there were no recommendations about which device to use, and the authors performed no checks concerning stimulus appearance (an array of oriented bars, two pairs made salient by color). Negation in the instruction ("not blue!" as compared to "now green!", indicating the stimulus pair to be judged) reduced C by 4 to 18 Hz. If only one stimulus within a pair was negated, wp* was also reduced. The findings are compatible with other studies assessing TVA parameters in dual-task situations, which also reported effects of an additional task on TVA parameters estimated from a concurrently performed TVA-based task (e.g., tapping; Künstler et al., 2018; Poth et al., 2014); in the study by Scharlau et al., the further "task" was negation processing. Also important in the present context is that although the authors dispensed with screen checks or similar tests, the effect was present for all four colors tested (yellow, red, blue, and green).
In another current application of the TVA-TOJ methodology, Scheller et al. (2022) compared the effects of perceptual and social salience. Research on social salience effects typically associates arbitrary geometric shapes with socially meaningful concepts such as "self" or "other". With the restriction to letter stimuli lifted by the TVA-TOJ methodology, the typical stimuli from this research area can be used, opening up this field for model-based research.

TVA in games
The criticism that experimental tasks are often highly artificial, rigid, and perhaps not ecologically valid is not specific to TVA-based research but applies to a wide range of attention research. This criticism can pertain to a variety of features of experimental tasks, for instance that they consist of a sequence of separate trials instead of more continuous and dynamic attending. This criticism will be taken up below in the section on visual foraging. Here, we turn to a scenario that has some resemblance to laboratory tasks but is experienced as much less artificial, namely video games. Many video games consist of sequences of repeated challenges, such as jumps across ditches and over obstacles, or shooting and collecting objects. Like lab experiments, they are repetitive and artificial, yet people voluntarily spend much time gaming. So, while keeping to the idea of an experiment as a collection of separate trials belonging to different conditions, gamification might be a way to circumvent its perceived artificiality or tediousness (see, e.g., the review by Lumsden et al., 2016) or lack of effort (DeRight & Jorgensen, 2015).
Although different ways of gamifying TVA-based tasks are possible in principle, to our knowledge only the TOJ task has so far been embedded into games. As mentioned above, the basic idea is to incorporate the TOJ unobtrusively within a game, optimally as a direct part of it. This best case is realized in Experiments 4 and 5 of Krüger et al. (2021). The game was a race in which the player steered (as quickly as possible) a dragonfly or a spaceship through a tunnel with grids containing holes (see Figure 1K-L). The grids with the holes also served as the material for the TOJ. Two of the holes were salient so that they could be identified easily. (As an experimental manipulation, one was more salient than the other.) They flickered once each in quick succession, and the TOJ was made implicitly because the player had to fly through the hole that flickered second. If they managed this, they received a short boost to their momentum. Choosing the wrong hole reduced the momentum of the dragonfly or the spaceship and thus lessened the chances of winning the race. Gamification was also implemented by increasing difficulty (i.e., levels). In higher levels of Experiment 4, the players were challenged by wind gusts from random directions, which increased with level and which they had to counteract in order to fly through the correct hole. In Experiment 5, gaming conditions were varied firstly via the overall difficulty of the TOJ (i.e., the length of the interval between the flickers to be judged), and secondly via adapting spaceship speed to overall performance. The reason for having the players choose the hole that flickered second was to ensure that they observed the entire TOJ presentation and did not act before the second event occurred, despite the strong time pressure.
Compared to a similar experiment in which all gaming features were removed (lab condition), gaming, as expected, diminished the attentional weight of the salient hole and total visual processing speed, C, in Experiment 4. More relevant in the context of the present paper, visual inspection showed good model fit; that is, observed judgment data and model predictions closely resembled each other. As a caveat, quantitative estimates of model fit were low in both conditions. This may, however, have been caused by the very large number of repetitions per condition, which may have led the chosen estimate of model fit, posterior predictive check p values (Conn et al., 2018), to penalize even slight deviations harshly. There is thus neither reason to specifically trust nor mistrust TVA-TOJ model fit in games. It is certainly possible that other models could better describe the data from the game-like experiments (e.g., by adding parameters that capture the added flexibility), but model fit is not all that matters, and further parameters should not be added without theoretical justification. The visual fits Krüger et al. (2021) report are quite good in the sense that no fit departs obviously or systematically from the data, either within or across participants' data.
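The logic of a posterior predictive check can be sketched in a few lines of Python. This is a deliberately simplified illustration (a binomial judgment model at a single SOA, with a conjugate stand-in for the posterior), not the actual analysis of Krüger et al. (2021); the trial counts are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 32 "probe first" judgments in 40 trials at one SOA.
n_trials, observed = 40, 32

# Stand-in for posterior samples of the model-predicted judgment probability;
# in a real TVA-TOJ analysis these would come from the fitted model.
posterior_p = rng.beta(observed + 1, n_trials - observed + 1, size=5000)

# Simulate one replicated data set per posterior draw.
replicated = rng.binomial(n_trials, posterior_p)

# Posterior predictive p value: probability of data at least as extreme
# as the observation under the fitted model.
ppp = (replicated >= observed).mean()
print(round(ppp, 2))
```

Values near 0 or 1 indicate that the model rarely reproduces data like those observed; with many trials per condition, even small systematic deviations push the value toward the extremes, which is the harsh penalization discussed above.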
In Experiment 5, C was influenced by both experimental conditions: It was smaller when the TOJ was more difficult and also smaller when the speed of the spaceship was adapted to gaming performance. Parameter wp* was clearly larger for the more salient hole. However, the experiment also indicated potential problems with the gaming scenario. Firstly, some C values were unaccountably high (95 and 118 Hz in the nonadaptive condition). Secondly, the experiment revealed a trade-off pattern between C and wp*, which is strange given that these parameters are assumed to be independent. Whether these are indeed systematic weaknesses or spurious findings remains to be tested.
Another gaming-like scenario, used by Stratmann et al. (2019) and in Experiment 6 of Krüger et al. (2021), combined gaming with real-life activity: the TOJ task was embedded into cycling on a bike simulator through a virtual traffic scenario. The restrictions of the traffic simulation made it difficult to include the TOJ as a natural part of the task; instead, it was included as a helper task, similar to the dual-task situations mentioned in the previous section. From time to time, two gemstones appeared hanging in the air in front of the driver, who had to drive below the gem that flickered first (Stratmann et al., 2019; see Figure 1M). In Krüger et al. (2021), the gems were replaced by pylons, and the driver had to hit the pylon that flickered second (see Figure 1N). Both experiments allowed precise estimates of C, with reasonable means of approximately 60 Hz. Furthermore, in Stratmann et al., C decreased by 10.5 Hz with higher traffic density. Krüger et al. reported a reduced attentional weight for a more salient stimulus, which was unexpected. While this unexpected result is yet to be explained, it indicates that attention manipulations such as an increase in salience, known to be consistent and reliable in the lab, can have different consequences within behaviorally more complex scenarios.
To sum up, including the TOJ task in games seems a reasonable and productive way of taking TVA-based assessment beyond classical stimuli and designs. It is not yet clear whether this direction can also be taken with whole- and partial-report paradigms, which are more affected by basic conditions such as background contrast (conditions that are typically dynamic in games).

TVA into the wild
While tasks with discrete trial presentations and reactions can certainly provide useful information about attentional orienting, their main drawback is that they involve a reductionist approach, where the aim is to measure attention allocation in a "pure" sense, in isolation from other processes. Some authors have argued that more dynamic and continuous tasks may therefore be better indicators of actual attention allocation in real-world scenarios (Á. Kristjánsson & Draschkow, 2021; Hayhoe, 2017). In a recent study on selective attention in visual search tasks, Sauter, Stefani, and Mack (2020) list several points in which searching for objects in the real world differs from searching for predefined targets as studied in many laboratory search tasks. Two crucial points are that real-world searches allow for more elaborate search strategies, which strongly rely on memory (Gilchrist, North, & Hood, 2001), and often include complex search path planning (Riggs et al., 2017) to balance search efficiency against search effort. This second point emphasizes that real-world searches are usually performed as part of an interaction sequence with the environment and therefore include action planning. Action planning, however, can have a strong impact on perceptual processes and bias attention toward action-relevant object features or locations (e.g., Baldauf, Wolf, & Deubel, 2006; Fagioli, Hommel, & Schubotz, 2007; Wykowska, Schubö, & Hommel, 2009). Similar findings have been reported for visual working memory (e.g., Heuer & Schubö, 2017; Heuer, Ohl, & Rolfs, 2020; Olivers & Roelfsema, 2020).
Most laboratory attention tasks not only lack naturalistic actions but also use single rather than multiple targets, presented in discrete rather than continuous trials. Yet our daily interactions with the world seldom involve a single target in a given scene; rather, they require us to process multiple targets for various actions simultaneously, while we also need to monitor our environment for unexpected events. Tasks that allow more freedom include so-called adaptive-choice visual search tasks, in which observers are free to choose between two alternative targets in a context that changes dynamically over trials, and in which the difficulty of choosing one target over the other varies with the context at different times (Irons & Leber, 2016; Bergmann, Tünnermann, & Schubö, 2019).
An even less restrictive task is the visual foraging task, in which participants collect multiple targets of multiple types within a single display or "patch" on a given trial (Wolfe, 2013; Á. Kristjánsson, Jóhannesson, & Thornton, 2014; Tünnermann, Chelazzi, & Schubö, 2021; see T. Kristjánsson, Ólafsdóttir, & Á. Kristjánsson, 2019, for review; and Prpic et al., 2019, for a foraging task within a virtual natural environment). The targets can, for example, be all items of certain colors or shapes among other shapes, and there is no upper limit on the number of different targets or distractors (see T. Kristjánsson & Kristjánsson, 2018). These tasks yield a continuous measure of attentional orienting in time, while they also allow manipulation of several visual parameters, and can provide a more realistic picture of at least many aspects of attentional selection in the visual environment.

Modelling visual foraging with TVA
Less structured tasks present a challenge for TVA's analytical model, but one way of overcoming this is to perform simulations whose outcomes can be compared to empirical data. Initial advances in this direction have been made concerning the phenomenon of "run-like" behavior, which is often observed in foraging, originally in animals (e.g., Dawkins, 1971; Dukas, 2002; Dukas & Ellner, 1993) but more recently also in humans (Á. Kristjánsson et al., 2014). Á. Kristjánsson et al. had observers select 40 targets from two categories as quickly as possible from among 80 items (40 of which were distractors) by tapping them on an iPad. Once tapped, the targets disappeared. This study was originally inspired by a study by Dawkins (1971) in which young chickens selected food of different colors. Dawkins found that the chickens repeatedly selected same-colored food pellets, much more often than could be expected by chance, arguing that their attention was biased toward the recently selected items, although this differed somewhat with task difficulty (see also Bond, 1983). Á. Kristjánsson et al. subsequently found that foraging patterns in humans differed strongly with the nature and difficulty of the foraging task. If only color distinguished the targets from distractors, all observers were able to switch easily between target types (as also seen in animals foraging for conspicuous prey; Bond, 1983; Langley et al., 1996; Reid & Shettleworth, 1992). But when the targets could only be distinguished from distractors on the basis of two features (a conjunction of shape and color), observers tended to repeatedly select the same target type, and foraging strategies were similar to selection patterns of animals foraging for cryptic prey (Dukas, 2002; Dukas & Ellner, 1993). But perhaps the most interesting finding was that a subset of observers did not show this repeated selection but switched frequently even for conjunction targets. T. 
Kristjánsson, Thornton, and Kristjánsson (2018) then showed that when time constraints were imposed on the foraging time, most observers were able to switch frequently, even between conjunction-based targets, as if they were able to load their working memory with large amounts of information for short bursts of concentration. This means that many theories of attention have trouble explaining the results of recent visual foraging experiments in a straightforward way. One potential reason may be that these theories are intended to explain results of tasks that reflect reductionistic approaches to the experimental study of visual attention.
TVA's detailed description of the interplay between various components of attentional control and early vision seems promising for working toward models that resolve the difficulties outlined above. TVA already has attentional control parameters (pertinence π and bias β) that provide good flexibility for modeling such situations. They can assign graded levels of importance to certain features and categories. However, many of the questions raised by foraging experiments, such as the mechanics and strategies of how humans set, adjust, or switch between these settings, seem crucial but have not yet been considered within TVA. In the following, we describe recent advances in taking a TVA-based perspective on visual foraging tasks.
Before turning to a TVA-based simulation of foraging, a brief look at the modeling of foraging tasks in general seems useful. Perhaps the models most commonly applied to foraging data are optimal foraging models from behavioral ecology. These consider top-level decisions about whether foragers should invest time (or energy) to move toward a new location that might yield more payoff than the current one. These models have no direct connection to the early vision and attention processes modeled by TVA. At the other end, close to the low-level data, recent approaches model the distributions of target switches (Tünnermann, Chelazzi, & Schubö, 2021; Tünnermann & Schubö, 2022). An approach by Clarke, Hunt, and Hughes (2022) supplements this with a spatial bias that accounts for the fact that foraging unfolds in a spatially organized manner, with preferences for nearby locations. While optimal foraging accounts might be connected with TVA-based models of foraging only in the long term (e.g., if TVA can predict the perceived productiveness of a location), the models that estimate switching probabilities might more easily be connected with TVA concepts in the near future. Below we describe a forward simulation of foraging based on TVA. This allows the implementation of different foraging strategies and assessment of how they interact with TVA's encoding mechanisms. Setting up flexible attentional strategies requires amending TVA with mechanisms (often ad hoc solutions) that have not been modeled yet, such as switching between attentional control settings or spatial attention guidance across the display. In this way, the approach can be instructive about the shortcomings of current TVA and inspire future improvements.
In ongoing work, Tünnermann, Kristjánsson, and Schubö (2022) simulated foraging in patches similar to Á. Kristjánsson et al.'s (2014) conjunction displays (see Figure 3A). TVA's pertinence values, biases, attentional weights, and processing rates were calculated (see Figure 3B), and encoding times for items to enter VSTM were calculated assuming that they were exponentially distributed. Encoded targets were "collected", which means that the simulation removed them from the display to mimic removal by a human forager. A particularly efficient mode of conjunction foraging emerged when attentional control exclusively favored one target type and only changed occasionally, replicating human same-type selection runs (Figure 3B-C). In TVA, a disadvantage of searching for two conjunction targets at the same time arises from its implementation of attentional templates as importance scores for separate, individual features. For instance, to search for red disks and green squares (as in Á. Kristjánsson et al.'s, 2014, conjunction displays), the visual system could boost the pertinence values for red and green (e.g., π_red = high, π_green = high) and boost the categorical bias parameters for disks and squares (β_disk = high, β_square = high). In many situations, this would selectively boost the attentional weights and processing rates of the targets, red disks and green squares. However, in the conjunction foraging task the distractors share all these features, just in different combinations. Consequently, the processing of distractors would be enhanced as well, leading to both targets and distractors being encoded into VSTM with equal likelihood. If targets and distractors enter VSTM, the organism has to further analyse the VSTM content to decide whether or not to collect an item. But when only the pertinence and β values for one color and one shape are set to high values (e.g., π_red = high and β_disk = high, everything else = low), only the processing of the targets (or more 
precisely, the subset of them that has these features) is boosted, leading to only targets entering VSTM. In this case, the organism can trigger collecting actions toward any elements that enter VSTM without further analysis of the VSTM content. This could result in fast-paced selections in runs of same-type targets, just as is observed in humans during conjunction foraging (Á. Kristjánsson et al., 2014, 2020; see Clarke, Irons, James, Leber, & Hunt, 2022, and Wolfe et al., 2019, for replications). Of course, at some point no more targets of the preferred type would be available, and the organism would need to switch the pertinence and β values to match the other target type (e.g., π_green = high and β_square = high, everything else = low). The logic of this switching strategy, which avoids encoding distractors, predicts run-like foraging behavior based on how TVA implements attentional control. In addition to these switching strategies, other factors might contribute to run-like behavior. For instance, having the representation of a recently collected target in VSTM might generate feedback that enhances attentional control settings (pertinence and bias) that prioritize similar targets, resulting in priming (or, alternatively, in the absence of the costs associated with updating attentional control settings; cf. Tünnermann, Chelazzi, & Schubö, 2021).
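The exclusive control-setting strategy can be made concrete with a small forward simulation. The sketch below is our own simplification (binary feature evidence, two illustrative control levels, and a hypothetical patch composition), not the simulation code of Tünnermann, Kristjánsson, and Schubö (2022):

```python
import random

random.seed(1)

# Hypothetical conjunction patch: red disks and green squares are targets,
# red squares and green disks are distractors sharing the same features.
features = {"red_disk": ("red", "disk"), "green_square": ("green", "square"),
            "red_square": ("red", "square"), "green_disk": ("green", "disk")}
targets = {"red_disk", "green_square"}
items = [kind for kind in features for _ in range(20)]

C = 70.0               # total processing rate (Hz)
HIGH, LOW = 1.0, 0.01  # assumed control-setting levels, not TVA defaults

def encode_one(items, pi, beta):
    """One TVA-style race: weights from pertinence, rates from bias,
    exponentially distributed encoding times; the fastest item enters VSTM."""
    w = [pi[features[it][0]] for it in items]           # attentional weights
    total_w = sum(w)
    best, best_t = None, float("inf")
    for i, it in enumerate(items):
        v = C * beta[features[it][1]] * w[i] / total_w  # processing rate
        t = random.expovariate(v)
        if t < best_t:
            best, best_t = i, t
    return best

# Exclusive settings: favor one conjunction; switch only when it is exhausted.
pi = {"red": HIGH, "green": LOW}
beta = {"disk": HIGH, "square": LOW}
collected = []
while any(it in targets for it in items):
    if "red_disk" not in items:
        pi = {"red": LOW, "green": HIGH}     # switch attentional control
        beta = {"disk": LOW, "square": HIGH}
    idx = encode_one(items, pi, beta)
    if items[idx] in targets:
        collected.append(items.pop(idx))     # collect the encoded target
    # encoded distractors would be rejected after VSTM inspection (not modeled)

runs = 1 + sum(a != b for a, b in zip(collected, collected[1:]))
print(len(collected), runs)
```

With these settings, distractors almost never win the race, and the 40 targets are collected in long same-type runs, mirroring the run-like behavior described above.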
Implementing a simulation that moves spatially across a display to "collect" elements as human participants do, thereby enabling assessment of the efficiency of different strategies, requires additions to current TVA. For instance, a more precise notion of spatial attention is required. Nordfang et al. (2018) incorporated a spatial component into the weight equation of TVA. However, this component was modeled as priorities of discrete positions in the display (e.g., the locations where letters are presented in a canonical TVA task). For our simulations of foraging, we had to generalize this spatial component into an attention gradient that prioritizes elements near the current foraging location and fades out toward locations farther away. Adding a spatial preference like this leads to foraging that is spatially more coherent. Nearby elements are selected with higher probability, leading to a more orderly trajectory. However, this raises new questions: When is the current foraging location moved to a new position, or in other words, when is a saccade toward a new location generated? In our current simulation, we move the simulated gaze position to the location of the last element that was encoded (and collected), which is on average the one farthest from the current location among the encoded elements. Another plausible alternative would be to shift the focus spatially before the calculation of the attentional weights, so that the attentional weights already reflect the upcoming gaze position, as suggested by Schneider (2013). Our current mode seems like a more reflexive way to shift attention spatially, while Schneider's proposal seems more active. It is not unlikely that mechanisms of both kinds contribute to foraging behavior. Moving from one location to another in a foraging display requires shifting not only the gaze but also the target location for the action. Recent TVA-based studies (using the classic letter-based paradigm) have revealed that pre-motor 
shifts of attention related to eye and hand movements draw on the same attentional resources (Kreyenmeier et al., 2020). The interaction of directed action and gaze in foraging-like tasks is another highly interesting topic for future research.
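The effect of such a spatial gradient can be demonstrated with a toy simulation. Below, a Gaussian fall-off (an assumed shape; the width SIGMA is an arbitrary illustrative value) weights items by their distance from the current gaze position, and the gaze jumps to each collected element:

```python
import math
import random

random.seed(2)

# Hypothetical patch: 40 targets at random positions in a unit square.
items = [(random.random(), random.random()) for _ in range(40)]
SIGMA = 0.15  # assumed width of the spatial attention gradient

def spatial_weight(pos, gaze):
    """Gaussian fall-off of attentional weight with distance from gaze."""
    return math.exp(-math.dist(pos, gaze) ** 2 / (2 * SIGMA ** 2))

def forage(use_gradient):
    remaining, gaze, path = list(items), (0.5, 0.5), []
    while remaining:
        # Exponential race: the item with the smallest Exp(rate = weight)
        # sample is encoded (and collected) first, as in TVA's encoding model.
        times = [random.expovariate(spatial_weight(p, gaze) if use_gradient
                                    else 1.0) for p in remaining]
        gaze = remaining.pop(times.index(min(times)))
        path.append(gaze)  # the gaze moves to the collected element
    return path

def path_length(path):
    return sum(math.dist(a, b) for a, b in zip(path, path[1:]))

with_gradient = path_length(forage(True))
without_gradient = path_length(forage(False))
print(with_gradient < without_gradient)
```

With the gradient, nearby elements win the race far more often, so the simulated trajectory is much shorter and spatially more coherent than an unweighted selection order.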
As such considerations make foraging simulations proceed spatially across a scene, more questions arise: How long does an organism wait for elements to arrive in VSTM? In typical TVA experiments, this question does not come up because all targets are presented at high-priority locations and are encoded quickly. In foraging tasks, some targets might be far away from the focus of attention and therefore receive a very low attentional weight. For instance, if a stimulus far from the current attentional focus receives a relative weight of w* = 0.001, an observer with a typical total processing speed of C = 70 Hz would need, on average, more than 14 seconds (1 / (C⋅w*) + t0) to encode this stimulus. Realistically, it seems unlikely that an organism would stare at one location for this long, waiting for more stimuli to be encoded. Hence, to prevent foraging simulations from getting stuck in this way, an upper limit for processing the scene at one location seems necessary. What if no stimuli at a location reach VSTM (within such time limits)? Then priority must be given to locations farther away, either by shifting the focus of attention "intentionally", searching for new productive locations, or by broadening the focus so much that further elements fall within a range that receives substantial attentional weights, speeding up encoding. In our current implementation of the TVA-based foraging simulation, whenever the foraging gets stuck (no targets arrive in VSTM, even after switching to other target sets), the attentional focus is broadened by enlarging the spatial attentional weight component iteratively until new targets are encoded.
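The arithmetic behind the 14-second example, and one possible broadening rule, can be written out as follows. The deadline, doubling factor, and t0 value are illustrative assumptions, not estimated quantities:

```python
def expected_encoding_time(C, w_rel, t0=0.01):
    """Mean VSTM arrival time under TVA's exponential race: 1/(C*w*) + t0."""
    return 1.0 / (C * w_rel) + t0

# The worked example from the text: w* = 0.001 at C = 70 Hz.
t = expected_encoding_time(C=70.0, w_rel=0.001)
print(round(t, 2))  # prints 14.3 (more than 14 seconds)

def broaden_until_fast_enough(C, w_rel, t0=0.01, deadline=1.0, factor=2.0):
    """Iteratively enlarge the spatial weight component (here: double the
    item's relative weight) until encoding is expected within the per-location
    time limit -- a stand-in for the broadening mechanism described above."""
    steps = 0
    while expected_encoding_time(C, w_rel, t0) > deadline:
        w_rel *= factor
        steps += 1
    return steps

print(broaden_until_fast_enough(C=70.0, w_rel=0.001))
```

In this sketch, four broadening steps suffice to bring the expected encoding time below a one-second per-location limit.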
The considerations outlined above show where TVA needs additional assumptions to deal with data from unrestricted tasks like the visual foraging paradigm. These assumptions relate to the temporal dynamics of TVA parameters (e.g., switches in time between attentional control settings) and their spatial dynamics (e.g., how attentional weights change with distance from the currently attended location). Future work should include such mechanisms in the formal model and validate them with empirical data. Visual foraging tasks seem to be a promising candidate for generating such data because they are far less restrictive than current paradigms but still involve a well-defined task for which many factors can be experimentally controlled.

Challenges, limitations, and outlook
The studies discussed here, with TVA-based assessment in less restrictive environments, and the recent approach of simulating visual foraging with TVA highlight many challenges for TVA and limitations that need to be addressed in future work. For instance, running typical lab-based tasks on mobile devices (or online) outside of the lab requires standardized recording situations, since otherwise parameters may differ in their estimated values from lab-based versions of the same tasks. When TVA tasks are embedded into games or virtual reality (e.g., the bicycle simulation), TVA parameters only capture the part of attentional processing that is involved in the embedded task. Unknown portions of the total resources and unknown attentional weight distributions will apply to complex scenes that are not explicitly modeled. For instance, car traffic in the background can only be quantified as the resources that seem to be missing from the embedded task (e.g., the attentional weight associated with a street sign or a particular car at a given moment is not assessable). Similarly, in foraging patches, stimuli and backgrounds are kept simple to reduce the influence of unmodeled complexity.
These limitations reflect the fact that complex natural environments cannot yet be modeled with TVA in detail. In a typical TVA experiment, where letters are presented on a computer screen, it is quite clear which objects to model explicitly. This is much less clear in realistic scenes. In some cases, it might be adequate to ignore any objects irrelevant to the task. This can be achieved by setting their attentional weights to zero. Another approach might be to include the attentional weight of an extraneous noise object (i.e., the sum of all irrelevant information that is not directly a part of the task) and assume that the β values associated with its features are vanishingly small. Extraneous noise will thus take up processing resources from the relevant objects but have only a vanishingly small probability of being encoded into VSTM (see Bundesen et al., 1984, 1985; Bundesen, 1987, for implementations of extraneous noise in TVA).
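The extraneous-noise idea can be illustrated with a small numerical sketch (all values are illustrative, and the rate equation is simplified, with sensory evidence η absorbed into the weights):

```python
C = 70.0  # total processing capacity (Hz)

# Two relevant objects plus one "noise object" summarizing all irrelevant
# information; the noise weight and the tiny beta are illustrative values.
w = {"target_a": 1.0, "target_b": 1.0, "noise": 2.0}
beta = {"target_a": 1.0, "target_b": 1.0, "noise": 1e-6}

total_w = sum(w.values())
# Simplified TVA rate equation: v(x) = C * beta(x) * w(x) / sum of weights.
v = {x: C * beta[x] * w[x] / total_w for x in w}

v_without_noise = C * 1.0 * 1.0 / 2.0  # rate per target if noise were absent
print(v["target_a"], v["noise"], v_without_noise)
```

The noise object soaks up half of the capacity in this example (each target's rate drops from 35 Hz to 17.5 Hz), yet its own encoding rate is vanishingly small, so it almost never enters VSTM, just as described above.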
Foraging tasks raise many questions about how attentional strategies are managed in behaviorally continuous situations. Another reason for difficulties with more complex stimuli is that TVA's stimulus encoding model includes quantities that cannot easily be measured (and are therefore typically held constant). For instance, the strength of the sensory evidence η(x,i) that a stimulus x belongs to feature category i might vary strongly between the viewing conditions on a tablet device "in the wild" and a lab PC. For the multitude of objects and complex backgrounds in natural scenes, η(x,i) values are even less clear. One way forward might be to integrate TVA with image-processing approaches to modeling attention (e.g., Itti & Koch, 2001; Navalpakkam & Itti, 2007; Tünnermann et al., 2019) and estimate such quantities computationally from natural images or videos. Such models are often inspired by the operational principles of the visual cortex. They could provide a basis for questions such as which feature dimensions play a role or how the evidence

Table 1: New research questions inspired by the reviewed developments

Using non-letter stimulus material with TVA:
What can TVA parameters reveal about attention in certain participant groups (young children, cognitively impaired people, animals)?

Modeling the TOJ task for TVA-based assessment:
Is the race for VSTM reset if the asynchronous stimuli appear within the t0 interval? Or, more generally, how can TVA account for stimulus asynchrony?

TVA assessment with mobile devices such as tablets and VR headsets:
What can TVA parameters reveal about attention in (clinical) participant groups requiring bedside tests?

Including TVA assessment in games and traffic simulations:
What is the cause of the reduced attentional weights found for salient objects in a driving simulation? Are correlations between C and t0 systematic weaknesses or artefacts of TVA? More generally, do TVA parameters still capture independent components in complex scenarios?

Applying TVA to visual foraging tasks:
(How) can TVA be linked to models of optimal foraging? How are pertinence and bias parameters (or, more generally, attentional control settings) updated to implement certain foraging strategies? How are pertinence and bias affected by attentional priming? When foraging through spatially distributed stimuli, how are attention and gaze shifted (in light of TVA mechanisms)? How long does an organism wait for elements to arrive in VSTM (before foraging at another location)?
for certain features can be quantified in natural images (e.g., by applying Gabor pyramids for measures of orientation across scales). Deep neural networks trained to classify objects could provide a basis for quantifying the evidence that a certain object belongs to a certain category in natural scenes. TVA, on the other hand, can provide the top-down mechanisms that integrate this information in a psychologically plausible and interpretable manner. So far, the only image-processing attention model in which this has been attempted is the model by Wischnewski et al. (2010), which includes TVA concepts. Like our foraging simulation, this approach predicts which objects or locations are prioritized based on TVA mechanisms but does not allow estimation of parameters from experimental data, which is a worthy goal for future approaches.

Conclusion
The studies reviewed in this article show that TVA has progressed into diverse areas of visual attention research. With TVA, findings from various areas can be connected with a precision that would be impossible without a common formal framework. New applications mirror current trends such as the use of VR headsets and tablet PCs. As such devices become more and more available, they can take attention research out of the lab into richer environments. This review suggests that for areas such as assessment online, with mobile devices, or within games, TVA-based modelling is already a fruitful endeavor. These new applications and the initial advances toward modeling behaviorally continuous tasks such as visual foraging with TVA have revealed a range of new research questions (some of which are listed in Table 1) but also new challenges. As a general theory of visual attention, TVA should address these questions and meet these challenges. The framework should enable models of attentional behavior in complex visual scenarios and not only in artificial lab tasks; this is a focus for future TVA-based research and theory development.
Financial Support: This research received no specific grant from any funding agency, commercial or nonprofit sectors.

Figure 1 :
Figure 1: Range of experiments used in the field. (a) Presentation procedure, stimuli, and typical results from the letter-based CombiTVA paradigm. (b-o) Stimulus examples from TVA-based assessment of attention "beyond letters". Displays marked with * were drawn based on descriptions in the original articles and may not represent exact sizes, spatial relationships, colors, and faces. The image in (h) is adapted from Tünnermann et al. (2017) and reproduced with permission. The dashed boxes in subfigure (h) mark the objects of interest and were not present in the actual displays. Panels (e, j, l, m, n, and o) are adapted from the cited sources under Creative Commons licenses. In (b-o), parameter values are only reported for the main experimental conditions and not for control conditions with typical TVA tasks (such as CombiTVA on PCs). TVA-TOJ refers to the model for temporal-order judgments (cf. Tünnermann et al., 2015); TVA-ExG refers to the ex-Gaussian model described in Dyrholm et al. (2011); TVA-Ex indicates that the basic shifted-exponential arrival time model was applied. This figure is best viewed in color and at high magnification.

Figure 2 :
Figure 2: (a) General temporal-order judgment procedure: two stimuli are presented in succession with a variable temporal delay (stimulus onset asynchrony, SOA). The bluish highlight on the star indicates that attention is guided toward this stimulus (which defines it as the probe). (b) The TVA-based model assumes that both stimuli race for VSTM entry and that their arrival order determines the temporal-order judgment; due to its attention benefit, the probe in this example overtakes the reference stimulus despite being presented after it. (c) The resulting psychometric distribution of "probe first" judgments can be modeled by TVA-based equations, yielding estimates of the TVA parameters C (total visual processing speed) and w_p*, the relative attentional weight of the probe stimulus.
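The race depicted in panels (b) and (c) can be sketched with a small Monte Carlo simulation: each stimulus's VSTM arrival time is drawn from an exponential distribution with rate w * C, and the earlier arrival wins the judgment. This is a minimal illustration, not the authors' implementation; the parameter values (C = 60 Hz, w_p = 0.7) are arbitrary assumptions.

```python
import random

# Hedged sketch of the TVA-based TOJ race: exponential encoding durations
# with rates v_probe = w_p * C and v_ref = (1 - w_p) * C. Parameter values
# are illustrative only.

def p_probe_first(soa, C=60.0, w_p=0.7, n=50_000, seed=1):
    """Estimated probability of a 'probe first' judgment at a given SOA
    (in seconds). Positive SOA: probe is presented AFTER the reference."""
    rng = random.Random(seed)
    v_probe, v_ref = w_p * C, (1.0 - w_p) * C
    hits = 0
    for _ in range(n):
        t_probe = soa + rng.expovariate(v_probe)  # onset + encoding duration
        t_ref = rng.expovariate(v_ref)
        hits += t_probe < t_ref
    return hits / n
```

With w_p above 0.5, the attended probe tends to arrive in VSTM first even at small positive SOAs, reproducing the "overtaking" described in the caption; sweeping SOA traces out a psychometric curve like the one in panel (c).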

Figure 3 :
Figure 3: (a) Conjunction foraging on a tablet PC. (b) Sketch of a TVA-based foraging simulation. The arrival time t_{x,i} of a stimulus x in VSTM depends on attentional control settings (pertinence values π and bias values β; cf. Section "A Theory of Visual Attention"). If these are set selectively to one target type (e.g., red squares), only targets arrive in VSTM. (c) The foraging trajectory is then efficient, and the switching probability is reduced to small values, similar to what is observed for conjunction foraging with human participants.
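The mechanism sketched in panel (b) can be illustrated with a toy simulation: every remaining item races for VSTM with an exponential arrival time whose rate reflects the pertinence of its type, and the earliest arrival is tapped next. This is a simplified sketch under assumed parameter values, not the simulation from the reviewed work; the item types and pertinence settings are hypothetical.

```python
import random

# Toy TVA-style foraging run: arrival time t_x ~ Exp(v_x), with v_x
# proportional to the pertinence (pi) of the item's type. Values are
# illustrative assumptions.

TYPES = ["red_square", "green_disc"]  # two target types in the display

def forage(pertinence, n_per_type=20, seed=0):
    """Simulate one foraging run; return the proportion of switches
    between target types in the resulting selection sequence."""
    rng = random.Random(seed)
    items = [t for t in TYPES for _ in range(n_per_type)]
    sequence = []
    while items:
        # Each remaining item races for VSTM; earliest arrival is selected.
        arrivals = [(rng.expovariate(max(pertinence[t], 1e-9)), i)
                    for i, t in enumerate(items)]
        _, first = min(arrivals)
        sequence.append(items.pop(first))
    switches = sum(a != b for a, b in zip(sequence, sequence[1:]))
    return switches / (len(sequence) - 1)

# Selective setting: pertinence concentrated on one type -> long runs
selective = forage({"red_square": 1.0, "green_disc": 1e-6})
# Unselective setting: equal pertinence -> frequent type switches
mixed = forage({"red_square": 1.0, "green_disc": 1.0})
```

Under the selective setting, one target type dominates the race until it is exhausted, yielding the efficient, low-switching trajectory described in panel (c); with equal pertinence values, switching is frequent.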

Table 1 :
Recent advances toward real-world applications of TVA and the research questions they have inspired.