
# The International Journal of Biostatistics

Ed. by Chambaz, Antoine / Hubbard, Alan E. / van der Laan, Mark J.

Online ISSN: 1557-4679
Volume 13, Issue 1

# Parameter Estimation of a Two-Colored Urn Model Class

Line Chloé Le Goff
/ Philippe Soulier
Published Online: 2017-03-25 | DOI: https://doi.org/10.1515/ijb-2016-0029

## Abstract

Though widely used in applications, reinforced random walks on graphs have never been the subject of valid statistical inference. In this paper we develop a statistical framework for a general two-colored urn model. At each step, the probability of drawing a ball of a given color depends on the number of balls of each color and on a multidimensional parameter through a function called the choice function. We introduce two estimators of the parameter: the maximum likelihood estimator and a weighted least squares estimator, which is less efficient but closer to the calibration techniques used in the applied literature. In general, the model is an inhomogeneous Markov chain, a property which makes estimation of the parameter impossible on a single path, even an infinite one. We therefore assume that we observe i.i.d. experiments, each of a predetermined finite length, which is consistent with the usual experimental set-ups. We apply the statistical framework to a real-life experiment: the selection of a path among pre-existing channels by an ant colony. We performed experiments which consisted of letting ants pass through the branches of a fork, and we consider the particular urn model proposed by J.-L. Deneubourg et al. in 1990 to describe this phenomenon. We simulate this model for several parameter values in order to assess the accuracy of the MLE and the WLSE. We then estimate the parameter from the experimental data and evaluate confidence regions with Bootstrap algorithms. The findings of this paper do not contradict the biological literature, but give statistical significance to the values of the parameter found therein.

## Introduction

Urn models have been studied for nearly a century. In 1931, G. Pólya provided the first probabilistic result on the process of drawing balls from an urn initially containing one red ball and one black ball (see [1]). At each time step, a ball is drawn and put back in the urn together with an additional ball of the same color. The probability of drawing a red ball is the proportion of red balls in the urn. Pólya proved that, as the number of draws tends to infinity, the proportion of red balls tends to a random variable following the uniform distribution on $\left[0,1\right]$.

The Pólya urn is easily generalizable to a large class of two-colored urn models characterized by a choice function $f:\mathbb{N}×\mathbb{N}\to \left[0,1\right]$. Let ${R}_{n}$ and ${B}_{n}$ be the numbers of red and black balls in the urn after $n$ draws. Note that, by construction, ${R}_{n}+{B}_{n}=n$. The probability ${p}_{n+1}^{R}$ that the $\left(n+1\right)$-th ball is red is given by

$\begin{array}{r}{p}_{n+1}^{R}=f\left({R}_{n},{B}_{n}\right)=f\left({R}_{n},n-{R}_{n}\right)\phantom{\rule{thickmathspace}{0ex}}.\end{array}$

Consequently, the probability to draw a black ball at time $n+1$ is $1-{p}_{n+1}^{R}$.
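As an illustration, this general scheme can be simulated in a few lines. The following Python sketch (function names are ours, not the paper's) draws $n$ balls for an arbitrary choice function, with the original Pólya urn as an example:

```python
import random

def simulate_urn(f, n, seed=0):
    """Draw n balls; f(r, b) gives the probability that the next ball is red
    when r red and b black balls have been drawn so far."""
    rng = random.Random(seed)
    r = 0
    for k in range(n):
        if rng.random() < f(r, k - r):
            r += 1
    return r  # total number of red balls among the n draws

def polya_choice(r, b):
    """Polya urn started with one red and one black ball: the probability of
    drawing red is the current proportion of red balls in the urn."""
    return (1 + r) / (2 + r + b)
```

For instance, `simulate_urn(polya_choice, 1000, seed=1) / 1000` gives one realization of the proportion $R_n/n$, which by Pólya's theorem is approximately uniform on $\left[0,1\right]$ across realizations.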

In this paper we will consider a parametric model for the choice function $f$. That is, we will assume that $f$ is known up to a finite dimensional parameter $\theta \in \mathrm{\Theta }\subset {\mathbb{R}}^{d}$, $d\ge 1$. For instance, the parameter $\theta$ can be the initial quantity of balls of each color, in which case $\theta$ would be a $2$-dimensional vector in $\left({\mathbb{R}}_{+}^{\ast }{\right)}^{2}$. The previous equation becomes

$\begin{array}{r}{p}_{n+1}^{R}=f\left(\theta ,{R}_{n},{B}_{n}\right)=f\left(\theta ,{R}_{n},n-{R}_{n}\right)\phantom{\rule{thickmathspace}{0ex}}.\end{array}$(1)

Our goal will be to propose a valid statistical methodology to estimate the parameter $\theta$. An important issue must be raised. In certain urn models, consistent estimation of the parameter $\theta$ based on only one path ${R}_{1},\dots ,{R}_{n}$ of length $n$ may be impossible, even if $n$ tends to infinity. This will be proved formally in Theorem 8. Therefore, we will suppose that $N$ independent paths are observed, each consisting of a sequence of $n$ balls drawn according to the model defined in eq. (1). The statistical theory is developed as $n$ is fixed and $N$ tends to infinity. This framework may be unusual but is in accordance with data from real life experiments such as those which will be reported in Section 2. These data sets consist of a large number of replications of finite paths.

We define estimators for $\theta$ that we prove to be consistent and asymptotically normal under some usual regularity assumptions on the model (1). We study more precisely two particular cases: the maximum likelihood estimator (MLE) and the weighted least squares estimator (WLSE).

We have applied these statistical tools to the problem of path formation by an ant colony. One of the fundamental factors affecting an organism's survival is its ability to optimally and dynamically exploit its environment. For example, in order to take advantage of the best sites for resources, housing or reproduction, these areas must be discovered and exploited at the earliest opportunity. Many species of ants rise to this challenge by developing a network of paths which connects different strategic sites such as nests and food sources. These paths consist of pheromones, attractive chemical substances. We focus on a specific aspect of this phenomenon: the selection of a path among pre-existing channels. When exploring their environment, ants often face bifurcations and must bypass obstacles. As shown by experimental studies, the laying of pheromones by ants passing successively through a bifurcation results in two possible outcomes: either one branch is eventually selected and the other abandoned, or both branches end up being uniformly chosen (see [2]). The analysis of spontaneous path formation by a colony of ants is made difficult by the absence of any means to precisely measure, or even detect, the quantity of pheromone laid by the ants.

It is commonly assumed that when approaching a bifurcation, ants choose a branch and lay a certain constant amount of pheromone without ever turning back. Consequently, the quantity of pheromone laid on a branch is proportional to the number of ants which passed through it, and the phenomenon can be described by an urn model, as proposed by [2]. More precisely, they define a choice function with a two-dimensional parameter $\left(\alpha ,c\right)\in \left(0,\mathrm{\infty }{\right)}^{2}$ such that the probability ${p}_{n+1}^{R}$ for an ant to choose the right branch after ${R}_{n}$ passages through the right branch out of $n$ passages in total is given by

${p}_{n+1}^{R}=\frac{\left(c+{R}_{n}{\right)}^{\alpha }}{\left(c+{R}_{n}{\right)}^{\alpha }+\left(c+n-{R}_{n}{\right)}^{\alpha }}\phantom{\rule{thickmathspace}{0ex}}.$(2)

The probability to choose the left branch is consequently $1-{p}_{n+1}^{R}$. The parameter $\alpha$ makes this model non-linear with respect to the proportion of passages through one branch. It models the sensitivity of the ant to the concentration of pheromone. The parameter $c$ is the intrinsic attractiveness of each branch and can also be interpreted as the inverse of the attractiveness (or the strength) of the pheromone deposit laid by each ant.

Several probabilistic studies provide the asymptotic behavior of ${R}_{n}/n$ in terms of $\alpha$ and $c$ (see [1, 3, 4]). The two parameters act on different time scales, but they can contribute to the same effect (selection of one branch or uniformization of the traffic on the two branches) or have antagonistic effects. The model is thus characterized by four phases, according to the values of $\alpha$ and $c$: slow or fast uniformization; slow or fast selection. The phase most commonly considered in the literature is slow selection. This corresponds in our model to $\alpha$ and $c$ larger than $1$: selection of one branch will eventually happen, though slowly because of weak pheromone deposits (see [2]). However, the model may account for other possibilities, such as fast uniformization, which occurs when $\alpha <1$ and $c>1$. One purpose of this paper is to investigate more thoroughly these possibilities, which have been more or less overlooked in the previous literature.
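A direct implementation of the choice probability in eq. (2) makes these regimes easy to explore numerically. A minimal sketch (the function name is ours):

```python
def p_right(alpha, c, r, n):
    """Eq. (2): probability that the next ant turns right, given that
    r of the first n ants turned right."""
    right = (c + r) ** alpha
    left = (c + n - r) ** alpha
    return right / (right + left)
```

For example, after 9 of 10 passages through the right branch, `p_right(2.0, 0.5, 9, 10)` is close to 1 (strong amplification, $\alpha >1$ with strong deposits), whereas `p_right(0.5, 10.0, 9, 10)` remains much closer to $1/2$ (uniformization regime).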

Ethological studies have already provided values for the parameter $\left(\alpha ,c\right)$, but the methods used mainly consisted of calibration without control of their statistical validity (see [2, 5, 6, 7]); in particular, these methods do not produce confidence regions. However, this type of information supplies interesting and important elements to the behavioral discussion. In this paper, we define the MLE and the WLSE for the ant behavior model (2). We assess the quality of these estimators in a simulation experiment. We then use them on experimental data provided by a real-life experiment performed for this purpose with ants. We also compute a confidence region by a Bootstrap algorithm.

The model (2) can be applied, mutatis mutandis, to many other fields. For instance, we can consider a fork with two branches as a neuron having two axons. During each period of time, the length of one of the two axons increases, and the longer an axon is, the higher the probability that it will grow further. Thus the dynamics of this biological system may also be modeled by eq. (2) (see [8]). Furthermore, using the notion of a choice function (see Section 1.3), the model (1) can be adapted and applied to many situations where a binary choice occurs, or to even more complex situations such as networks with several nodes (see [9, 10]), for instance network exploration by an ant colony (see [7, 11, 12, 13, 14, 15, 16]).

The paper is organized as follows. The model (1) and the statistical framework are rigorously defined in Section 1. Section 1.1 is focused on the MLE and Section 1.2 on the WLSE. Under some usual regularity conditions, we prove that both estimators are consistent and asymptotically normal; moreover, the MLE is asymptotically efficient. The numerical implementation of the MLE may be difficult, unstable and lengthy; therefore a WLSE is considered and theoretically studied. This estimator does not match the theoretical performance of the MLE, but is easier to compute and is popular among practitioners. Section 1.3 proposes an extension of the general urn model to a class of vertex-reinforced random walks on graphs. Section 2 adapts the statistical framework to the ethological problem: we first introduce the assumptions on the ant behavior and define the model proposed by [2]; the four phases of the model are described more precisely and interpreted from the point of view of ethology; we then rewrite the MLE and the WLSE for this particular case and show that it is not possible to consistently estimate the parameter of the model on a single experiment of length $n$, even as $n$ tends to infinity. In order to assess the performance of the estimators, a short simulation experiment is reported in Section 2.4. Section 3 reports the study on the experimental data: the experimental protocol and the data produced are described in Sections 3.1 and 3.2, and Section 3.3 supplies the estimation results (computation of the estimators and their confidence regions). We provide some concluding remarks in Section 4 and prove the theoretical statistical results of this paper in Section 5.

## 1 Parameter estimation

Let us first define precisely the model studied and the statistical framework used. Let $\mathrm{\Theta }\subset {\mathbb{R}}^{d}$. We assume that $\left\{{X}_{k},k\ge 1\right\}$ is a sequence of Bernoulli random variables (representing the colors of the balls drawn: 1 for red and 0 for black) and that there exists a function $f:\mathrm{\Theta }×{\mathbb{N}}^{2}\to \left[0,1\right]$, called the choice function of the urn, and a “true value” ${\theta }_{0}$ of the parameter such that, for all integers $k$, $\begin{array}{r}\mathbb{P}\left({X}_{k+1}=1\mid {{F}}_{k}\right)=f\left({\theta }_{0},{Z}_{k},k-{Z}_{k}\right)\phantom{\rule{thickmathspace}{0ex}},\end{array}$

where ${Z}_{0}=0$ and for $k\ge 1$, ${Z}_{k}={X}_{1}+\cdots +{X}_{k}$ and ${{F}}_{k}$ is the sigma-field generated by ${Z}_{0},{X}_{1},\dots ,{X}_{k}$. The random walk $\left\{{Z}_{k},k\ge 0\right\}$ is an inhomogeneous Markov chain. For $n\ge 1$ and a sequence $\left({e}_{1},\dots ,{e}_{n}\right)\in \left\{0,1{\right\}}^{n}$, applying the Markov property, we obtain

$\begin{array}{r}\mathbb{P}\left({X}_{1}={e}_{1},\dots ,{X}_{n}={e}_{n}\right)=\prod _{k=0}^{n-1}f\left({\theta }_{0},{z}_{k},k-{z}_{k}{\right)}^{{e}_{k+1}}\left\{1-f\left({\theta }_{0},{z}_{k},k-{z}_{k}\right){\right\}}^{1-{e}_{k+1}}\phantom{\rule{thickmathspace}{0ex}},\end{array}$(3)

where ${z}_{k}={e}_{1}+\cdots +{e}_{k}$ with ${z}_{0}=0$, by convention.
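Eq. (3) can be evaluated directly by accumulating the product along the path. A short sketch (names ours):

```python
def path_probability(f, path):
    """Eq. (3): probability of observing the draw sequence `path`
    (a sequence of 0s and 1s) under the choice function f(i, k - i)."""
    prob, z = 1.0, 0
    for k, e in enumerate(path):
        p = f(z, k - z)  # P(next ball is red | z red balls among the first k)
        prob *= p if e == 1 else 1.0 - p
        z += e
    return prob
```

Since the two factors at each step sum to one over the two colors, the probabilities of all $2^n$ paths of length $n$ add up to 1, which gives a convenient sanity check for any choice function.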

We assume that we observe $N$ experiments, each consisting of a path of length $n$ of the model (3). In other words, we have a set of $N$ sequences of $n$ consecutive draws. For $j=1,\dots ,N$ and $k=1,\dots ,n$, let ${X}_{k}^{j}\in \left\{0,1\right\}$ denote the color of the $k$-th ball drawn in the $j$-th experiment. Let ${Z}_{0}^{j}=0$ and ${Z}_{k}^{j}=\sum _{i=1}^{k}{X}_{i}^{j}$, $k\ge 1$, be the total number of red balls drawn at time $k$ during the $j$-th experiment, so that ${X}_{k}^{j}={Z}_{k}^{j}-{Z}_{k-1}^{j}$. Throughout the paper, $n$ is fixed and our asymptotic results are obtained as $N$ (the number of experiments) tends to $\mathrm{\infty }$.

The choice function and the true value of the parameter are the same for all experiments, i.e. for $j=1,\dots ,N$, $k=0,\dots ,n-1$ and $i=0,\dots ,k$, $\begin{array}{r}\mathbb{P}\left({X}_{k+1}^{j}=1\mid {Z}_{k}^{j}=i\right)=f\left({\theta }_{0},i,k-i\right)\phantom{\rule{thickmathspace}{0ex}}.\end{array}$

For conciseness of notation, we will hereafter simply write  ${f}_{0}\left(\cdot ,\cdot \right)$ for  $f\left({\theta }_{0},\cdot ,\cdot \right)$. To prove the consistency and the asymptotic normality of the estimators introduced below, we need the following assumptions on the choice function $f$. For any function $g$ defined on $\mathrm{\Theta }$, we denote $\stackrel{˙}{g}$ and $\stackrel{¨}{g}$ the gradient and Hessian matrix with respect to $\theta$, ${\mathrm{\partial }}_{s}g$ the partial derivative with respect to the $s$-th component ${\theta }_{s}$ of $\theta$, $1\le s\le d$ and ${A}^{\prime }$ the transpose of the vector or matrix $A$.

#### Assumption 1

1. (Regularity) The set $\mathrm{\Theta }$ is compact with non-empty interior. For $0\le i\le k\le n-1$, ${f}_{0}\left(i,k-i\right)>0$ and the function $\theta \to f\left(\theta ,i,k-i\right)$ is twice continuously differentiable on $\mathrm{\Theta }$.

2. (Identifiability) If $f\left({\theta }_{1},i,k-i\right)=f\left({\theta }_{2},i,k-i\right)$ for all $0\le i\le k\le n-1$, then ${\theta }_{1}={\theta }_{2}$.

3. (Non-degeneracy) The $n\left(n+1\right)/2$-dimensional vectors $\left\{{\mathrm{\partial }}_{s}f\left({\theta }_{0},i,k-i\right),0\le i\le k\le n-1\right\}$, $1\le s\le d$, are linearly independent in ${\mathbb{R}}^{n\left(n+1\right)/2}$.

Assumption 1 ensures that ${\theta }_{0}$ is the unique maximizer of $L$ and that the Fisher information matrix

$\begin{array}{r}{{I}}_{n}\left({\theta }_{0}\right)=-\stackrel{¨}{L}\left({\theta }_{0}\right)=\sum _{k=0}^{n-1}\sum _{i=0}^{k}\frac{\mathbb{P}\left({Z}_{k}=i\right)}{{f}_{0}\left(i,k-i\right){\stackrel{ˉ}{f}}_{0}\left(i,k-i\right)}{\stackrel{˙}{f}}_{0}\left(i,k-i\right){\stackrel{˙}{f}}_{0}\left(i,k-i{\right)}^{\prime }\end{array}$

is invertible, where we denote ${\stackrel{˙}{f}}_{0}\left(i,k-i\right)=\stackrel{˙}{f}\left({\theta }_{0},i,k-i\right)$ and ${\stackrel{ˉ}{f}}_{0}\left(i,k-i\right)=1-{f}_{0}\left(i,k-i\right)$. The explicit expression of the probabilities $\mathbb{P}\left({Z}_{k}=i\right)$, $i,k\in \mathbb{N}$, is given in Section 5.1, eq. (13).
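The occupation probabilities $\mathbb{P}\left({Z}_{k}=i\right)$ satisfy the Markov recursion $\mathbb{P}\left({Z}_{k+1}=i\right)=\mathbb{P}\left({Z}_{k}=i\right)\left\{1-{f}_{0}\left(i,k-i\right)\right\}+\mathbb{P}\left({Z}_{k}=i-1\right){f}_{0}\left(i-1,k-i+1\right)$, so ${{I}}_{n}\left({\theta }_{0}\right)$ can be evaluated numerically. The sketch below (all names ours) uses a central finite-difference gradient as a crude stand-in for the analytic $\stackrel{˙}{f}$; it is an illustration, not the paper's implementation:

```python
def occupation_probs(f, n):
    """P(Z_k = i) for 0 <= i <= k <= n-1, via the Markov recursion."""
    probs = [[1.0]]                      # P(Z_0 = 0) = 1
    for k in range(n - 1):
        cur = probs[-1]
        nxt = [0.0] * (k + 2)
        for i in range(k + 1):
            pi = f(i, k - i)             # P(draw red | Z_k = i)
            nxt[i] += cur[i] * (1.0 - pi)
            nxt[i + 1] += cur[i] * pi
        probs.append(nxt)
    return probs

def fisher_information(f_theta, theta, n, eps=1e-6):
    """I_n(theta) from the displayed formula; f_theta(theta, i, j) is the
    choice function, differentiated in theta by central finite differences."""
    d = len(theta)
    f0 = lambda i, j: f_theta(theta, i, j)
    probs = occupation_probs(f0, n)
    info = [[0.0] * d for _ in range(d)]
    for k in range(n):
        for i in range(k + 1):
            grad = []
            for s in range(d):
                tp, tm = list(theta), list(theta)
                tp[s] += eps
                tm[s] -= eps
                grad.append((f_theta(tp, i, k - i) - f_theta(tm, i, k - i)) / (2 * eps))
            p = f0(i, k - i)
            w = probs[k][i] / (p * (1.0 - p))
            for s in range(d):
                for t in range(d):
                    info[s][t] += w * grad[s] * grad[t]
    return info
```

With the ant choice function of eq. (10), the resulting $2×2$ matrix is symmetric and, under Assumption 1, positive definite.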

## 1.1 Maximum likelihood estimation (MLE)

The structure of the model (3) yields an explicit expression of the likelihood ${V}_{N}\left(\theta \right)$. The independence of the $N$ experiments gives the following multiplicative form

$\begin{array}{r}{V}_{N}\left(\theta \right)=\prod _{j=1}^{N}\prod _{k=0}^{n-1}f\left(\theta ,{Z}_{k}^{j},k-{Z}_{k}^{j}{\right)}^{{X}_{k+1}^{j}}\left\{1-f\left(\theta ,{Z}_{k}^{j},k-{Z}_{k}^{j}\right){\right\}}^{1-{X}_{k+1}^{j}}\phantom{\rule{thickmathspace}{0ex}}.\end{array}$

The log-likelihood function ${L}_{N}$ based on $N$ paths is thus given by

$\begin{array}{r}{L}_{N}\left(\theta \right)=\sum _{j=1}^{N}\sum _{k=0}^{n-1}\left\{{X}_{k+1}^{j}logf\left(\theta ,{Z}_{k}^{j},k-{Z}_{k}^{j}\right)+\left(1-{X}_{k+1}^{j}\right)log\left\{1-f\left(\theta ,{Z}_{k}^{j},k-{Z}_{k}^{j}\right)\right\}\right\}\phantom{\rule{thickmathspace}{0ex}}.\end{array}$(4)

Let ${\stackrel{ˆ}{\theta }}_{N}$ be the maximum likelihood estimator of ${\theta }_{0}$, that is

${\stackrel{ˆ}{\theta }}_{N}=arg\underset{\theta \in \mathrm{\Theta }}{max}{L}_{N}\left(\theta \right)\phantom{\rule{thickmathspace}{0ex}}.$(5)

Define $L\left(\theta \right)={N}^{-1}\mathbb{E}\left[{L}_{N}\left(\theta \right)\right]$ (where the dependence on $n$ is omitted). Then, $\begin{array}{r}L\left(\theta \right)=\sum _{k=0}^{n-1}\mathbb{E}\left[{f}_{0}\left({Z}_{k},k-{Z}_{k}\right)logf\left(\theta ,{Z}_{k},k-{Z}_{k}\right)+\left\{1-{f}_{0}\left({Z}_{k},k-{Z}_{k}\right)\right\}log\left\{1-f\left(\theta ,{Z}_{k},k-{Z}_{k}\right)\right\}\right]\phantom{\rule{thickmathspace}{0ex}}.\end{array}$(6)
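In practice, the maximization in eq. (5) is carried out numerically. The sketch below (all names ours) simulates $N$ paths from the ant choice function of eq. (2)/(10) and recovers $\alpha$ by maximizing the log-likelihood of eq. (4) over a grid, with $c$ held fixed for simplicity; a full implementation would of course optimize over both coordinates of $\theta$:

```python
import math
import random

def f10(alpha, c, i, j):
    """Choice function of eqs. (2)/(10)."""
    return (c + i) ** alpha / ((c + i) ** alpha + (c + j) ** alpha)

def simulate_paths(alpha, c, n, N, seed=0):
    """Simulate N independent paths of n draws each."""
    rng = random.Random(seed)
    paths = []
    for _ in range(N):
        z, path = 0, []
        for k in range(n):
            x = 1 if rng.random() < f10(alpha, c, z, k - z) else 0
            path.append(x)
            z += x
        paths.append(path)
    return paths

def log_likelihood(alpha, c, paths):
    """Log-likelihood of eq. (4)."""
    total = 0.0
    for path in paths:
        z = 0
        for k, x in enumerate(path):
            p = f10(alpha, c, z, k - z)
            total += math.log(p if x else 1.0 - p)
            z += x
    return total

def mle_alpha(c, paths, grid):
    """Crude grid-search stand-in for the maximization in eq. (5)."""
    return max(grid, key=lambda a: log_likelihood(a, c, paths))
```

A grid search is the simplest stand-in for eq. (5); gradient-based optimizers converge faster but, as noted above, the MLE can be numerically unstable.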

Let ${N}\left(m,\mathrm{\Sigma }\right)$ denote the Gaussian distribution with mean $m$ and covariance $\mathrm{\Sigma }$.

#### Theorem 2

If Assumptions 1–(i) and 1–(ii) hold then the maximum likelihood estimator ${\stackrel{ˆ}{\theta }}_{N}$ is a strongly consistent estimator of ${\theta }_{0}$. If moreover Assumption 1–(iii) holds and ${\theta }_{0}$ is an interior point of $\mathrm{\Theta }$, then, as $N$ tends to $\mathrm{\infty }$, $\sqrt{N}\left({\stackrel{ˆ}{\theta }}_{N}-{\theta }_{0}\right)$ converges weakly towards ${N}\left(0,{{{I}}_{n}}^{-1}\left({\theta }_{0}\right)\right)$.

The proof is in Section 5.4. It is the consequence of a more general result stated and proved therein.

## 1.2 Weighted least squares estimation (WLSE)

Least squares estimators are very popular among practitioners. Moreover, for some urn models, the MLE may be numerically unstable, hence difficult (and lengthy) to compute, so the WLSE constitutes a convenient alternative. In order to describe this estimator, we introduce some notation. For $0\le i\le k\le n-1$, define

$\begin{array}{r}{a}_{N}\left(i,k-i\right)=\frac{1}{N}\sum _{j=1}^{N}{1}_{\left\{{Z}_{k}^{j}=i\right\}}\phantom{\rule{thickmathspace}{0ex}}\phantom{\rule{thickmathspace}{0ex}}\mathrm{a}\mathrm{n}\mathrm{d}\phantom{\rule{thickmathspace}{0ex}}\phantom{\rule{thickmathspace}{0ex}}{p}_{N}\left(i,k-i\right)=\frac{\frac{1}{N}\sum _{j=1}^{N}\phantom{\rule{thinmathspace}{0ex}}{1}_{\left\{{Z}_{k}^{j}=i\right\}}{X}_{k+1}^{j}}{{a}_{N}\left(i,k-i\right)}\end{array}$(7)

with the convention $\frac{0}{0}=0$. The quantity ${a}_{N}\left(i,k-i\right)$ is the empirical probability that $i$ red balls have been drawn at time $k$ and ${p}_{N}\left(i,k-i\right)$ is the empirical conditional probability that a red ball is again chosen at time $k+1$ given $i$ red balls were drawn at time $k$.

We further define ${q}_{N}\left(i,k-i\right)=1-{p}_{N}\left(i,k-i\right)$ and $\stackrel{ˉ}{f}\left(\theta ,i,k-i\right)=1-f\left(\theta ,i,k-i\right)$. Let $\left\{{w}_{N}\left(i,k-i\right),0\le i\le k\right\}$ be a sequence of weights and define the contrast function

${W}_{N}\left(\theta \right)=\sum _{k=0}^{n-1}\sum _{i=0}^{k}{w}_{N}\left(i,k-i\right)\left\{{p}_{N}\left(i,k-i\right)-f\left(\theta ,i,k-i\right){\right\}}^{2}\phantom{\rule{thickmathspace}{0ex}}.$(8)

The weighted least squares estimator minimizes ${W}_{N}$, that is ${\stackrel{ˆ}{\theta }}_{N}^{W}=arg\underset{\theta \in \mathrm{\Theta }}{min}{W}_{N}\left(\theta \right)\phantom{\rule{thickmathspace}{0ex}}.$(9)
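The empirical frequencies of eq. (7) and the contrast of eq. (8) are straightforward to compute. In this sketch (all names ours) we take the occupation frequencies ${a}_{N}$ as weights, a simple admissible choice since they converge almost surely to the positive limits $\mathbb{P}\left({Z}_{k}=i\right)$:

```python
def f10(alpha, c, i, j):
    """Choice function of eq. (10)."""
    return (c + i) ** alpha / ((c + i) ** alpha + (c + j) ** alpha)

def empirical_stats(paths):
    """a_N and p_N of eq. (7), stored as a[k][i] for 0 <= i <= k <= n-1,
    with the convention 0/0 = 0 for unvisited states."""
    n, N = len(paths[0]), len(paths)
    counts = [[0] * (k + 1) for k in range(n)]   # occupation counts of Z_k = i
    reds = [[0] * (k + 1) for k in range(n)]     # counts of X_{k+1} = 1 there
    for path in paths:
        z = 0
        for k, x in enumerate(path):
            counts[k][z] += 1
            reds[k][z] += x
            z += x
    p = [[reds[k][i] / counts[k][i] if counts[k][i] else 0.0
          for i in range(k + 1)] for k in range(n)]
    a = [[counts[k][i] / N for i in range(k + 1)] for k in range(n)]
    return a, p

def wlse_contrast(alpha, c, a, p):
    """Eq. (8) with the occupation frequencies a_N as weights."""
    return sum(a[k][i] * (p[k][i] - f10(alpha, c, i, k - i)) ** 2
               for k in range(len(a)) for i in range(k + 1))
```

Minimizing `wlse_contrast` over a grid of $\left(\alpha ,c\right)$ values then gives the WLSE of eq. (9).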

#### Theorem 3

If Assumptions 1–(i) and 1–(ii) hold and if the weights ${w}_{N}$ converge almost surely to a sequence of positive weights ${w}_{0}$, then the weighted least squares estimator ${\stackrel{ˆ}{\theta }}_{N}^{W}$ is a strongly consistent estimator of ${\theta }_{0}$.

If moreover Assumption 1–(iii) holds and ${\theta }_{0}$ is an interior point of $\mathrm{\Theta }$, then as $N$ tends to $\mathrm{\infty }$, $\sqrt{N}\left({\stackrel{ˆ}{\theta }}_{N}^{W}-{\theta }_{0}\right)$ converges weakly towards ${N}\left(0,{\mathrm{\Sigma }}_{n}\left({\theta }_{0}\right)\right)$, where ${\mathrm{\Sigma }}_{n}\left({\theta }_{0}\right)$ is a positive definite covariance matrix.

If moreover ${w}_{N}\left(i,k-i\right)={p}_{N}^{-1}\left(i,k-i\right){q}_{N}^{-1}\left(i,k-i\right){a}_{N}\left(i,k-i\right)$ for all $0\le i\le k\le n-1$, the estimator ${\stackrel{ˆ}{\theta }}_{N}^{W}$ is asymptotically efficient, i.e. ${\mathrm{\Sigma }}_{n}\left({\theta }_{0}\right)={{I}}_{n}\left({\theta }_{0}\right)$.

The proof is in Section 5.4, where an explicit expression for ${\mathrm{\Sigma }}_{n}\left({\theta }_{0}{\right)}^{-1}$ is supplied.

## 1.3 Generalization

It is possible to extend the statistical framework introduced in the previous sections to a large class of reinforced random walks on graphs. For instance, let $G=\left({X},{E}\right)$ be a non-oriented graph in which each vertex is connected to a finite number of vertices, with ${X}$ the set of its vertices and ${E}\subset \left\{\left\{i,j\right\}:\phantom{\rule{thickmathspace}{0ex}}i,j\in {X},\phantom{\rule{thinmathspace}{0ex}}i\ne j\right\}$ the set of its non-oriented edges. We write $x\sim y$ if $\left\{x,y\right\}\in {E}$. Let ${\mathbb{N}}^{{X}}$ be the set of integer vectors indexed by ${X}$. We define a random walk $X=\left({X}_{n}{\right)}_{n\ge 0}$ on $G$, i.e. a sequence of vertices such that, for all $n\ge 0$, ${X}_{n+1}\sim {X}_{n}$. The vector ${Z}_{n}\in {\mathbb{N}}^{{X}}$ is such that, for each vertex $x\in {X}$, ${Z}_{n}\left(x\right)$ is the number of times the walk $X$ has visited $x$ up to time $n$. Let $\mathrm{\Theta }\subset {\mathbb{R}}^{d}$, $d\ge 1$. We suppose that the walk $X$ is vertex-reinforced and that there exists a choice function $f:\mathrm{\Theta }×{X}×{\mathbb{N}}^{{X}}\to \left[0,1\right]$ and a parameter $\theta \in \mathrm{\Theta }$ such that, for all $n\ge 0$ and $x\in {X}$, $\begin{array}{rcl}\mathbb{P}\left({X}_{n+1}=x|{X}_{0},\cdots ,{X}_{n}\right)& =& f\left(\theta ,x,{Z}_{n}\right)\phantom{\rule{thinmathspace}{0ex}}{1}_{\left\{x\sim {X}_{n}\right\}}\phantom{\rule{thickmathspace}{0ex}}.\end{array}$

For instance, the choice function can be similar to the one proposed by [2], for $n\ge 0$ and $x\in {X}$, $\begin{array}{rcl}\mathbb{P}\left({X}_{n+1}=x|{X}_{0},\cdots ,{X}_{n}\right)& =& \frac{\left(c+{Z}_{n}\left(x\right){\right)}^{\alpha }}{\sum _{y\sim {X}_{n}}\left(c+{Z}_{n}\left(y\right){\right)}^{\alpha }}{1}_{\left\{x\sim {X}_{n}\right\}},\end{array}$

where $\mathrm{\Theta }=\left(0,+\mathrm{\infty }{\right)}^{2}$ and $\theta =\left(\alpha ,c\right)$.
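A minimal Python sketch of this vertex-reinforced walk with the [2]-type choice function above (all names ours; the graph is given as an adjacency dictionary):

```python
import random

def vrrw_step(neighbors, counts, x, alpha, c, rng):
    """One step of the vertex-reinforced walk: from vertex x, move to a
    neighbor y with probability proportional to (c + Z_n(y)) ** alpha."""
    nbrs = neighbors[x]
    weights = [(c + counts[y]) ** alpha for y in nbrs]
    u, acc = rng.random() * sum(weights), 0.0
    for y, w in zip(nbrs, weights):
        acc += w
        if u < acc:
            return y
    return nbrs[-1]  # guard against floating-point round-off

def simulate_vrrw(neighbors, start, alpha, c, n, seed=0):
    """Simulate n steps of the walk, returning the visited vertex sequence."""
    rng = random.Random(seed)
    counts = {v: 0 for v in neighbors}   # Z_n, the visit counts
    x, visits = start, [start]
    for _ in range(n):
        counts[x] += 1                   # record the visit to the current vertex
        x = vrrw_step(neighbors, counts, x, alpha, c, rng)
        visits.append(x)
    return visits
```

On a triangle, for instance, large $\alpha$ makes the walk increasingly prefer its most-visited vertices, the graph analogue of branch selection.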

The statistical framework introduced for the general urn model is easily adaptable to this vertex-reinforced random walk. Under adequate regularity conditions, it would not be difficult to prove the consistency and the asymptotic normality of the MLE and the WLSE by establishing a theorem similar to the general result proved in Section 5.3.

## 2 Application to an ethological problem

In 1990, Deneubourg et al. used a particular urn model to reproduce the sequences of consecutive choices made by ants at a fork [2]. The urn filled with balls of two colors is replaced by a fork with two branches: drawing a ball and adding a ball of the same color to the urn corresponds to an ant choosing a branch and reinforcing it with pheromone by going through it. The probability of drawing a red ball given the previous draws is then equal to the probability of choosing the right branch given the previous passages.

We first cast the model proposed by [2] in the statistical framework described in the previous section. Further, we provide a description of the model's behavior depending on the parameter value, together with an ethological interpretation of the parameters. We then prove that it is impossible to estimate the parameter on a single path. Finally, we report the study of the estimators' accuracy that we performed on simulated data.

## 2.1 Behavioral assumptions

We first introduce the hypotheses assumed on the ant behavior.

1. Each ant regularly deposits a constant amount of pheromone as it walks.

2. The ants are strictly identical, which means that every ant has the same reaction to the same amount of pheromone.

3. Pheromone trails do not evaporate during the experiment.

4. Each ant reaches the fork alone, chooses a branch and leaves the bifurcation by crossing only once into the chosen branch without passing through or reinforcing the non-chosen branch.

Under these assumptions, the quantity of pheromone laid on each branch is proportional to the number of passages through it. These assumptions are commonly made in path formation modeling. However, because of the inter-individual variability in ant behavior, the first two assumptions are unrealistic: for instance, perception noise implies that each ant may detect a different signal from the same quantity of pheromone. These assumptions are thus an approximation of the real ant behavior; the implicit hypothesis is that the inter-individual variability is small enough for all ants to be considered identical. As for the third assumption, we suppose that the pheromone trails are persistent enough for evaporation to be neglected during the experiment. Experimental protocols are designed to make the four assumptions more acceptable by choosing the ant species adequately and by placing the ants in an appropriate situation (see Section 3).

## 2.2 The model

The random variable ${X}_{k}$, introduced previously, is the choice of the $k$-th ant going through the fork (1 for right and 0 for left). Consequently, for $k\ge 1$, ${Z}_{k}$ is the number of passages through the right branch after $k$ passages. For $\theta =\left(\alpha ,c\right)\in \left(0,\mathrm{\infty }{\right)}^{2}$ and all integers $0\le i\le k$, we define the choice function

$\begin{array}{r}f\left(\theta ,i,k-i\right)=\frac{\left(c+i{\right)}^{\alpha }}{\left(c+i{\right)}^{\alpha }+\left(c+k-i{\right)}^{\alpha }}\phantom{\rule{thickmathspace}{0ex}}.\end{array}$(10)

We assume that the probability that an ant chooses the right branch at time $k+1$ given the first $k$ choices is given by:

$\begin{array}{rl}\mathbb{P}\left({X}_{k+1}=1|{{F}}_{k}\right)& =f\left(\theta ,{Z}_{k},k-{Z}_{k}\right)=\frac{\left(c+{Z}_{k}{\right)}^{\alpha }}{\left(c+{Z}_{k}{\right)}^{\alpha }+\left(c+k-{Z}_{k}{\right)}^{\alpha }}\phantom{\rule{thickmathspace}{0ex}},\end{array}$(11)

where $c>0$ is the intrinsic attractiveness of each branch and $\alpha >0$ measures the possible non-linearity of the choice. This process is an urn model which has been exhaustively investigated in the probability literature. We recall here its main features.

#### Theorem 4

1. If $\alpha <1$, then $\underset{n\to \mathrm{\infty }}{lim}\frac{{Z}_{n}}{n}=\frac{1}{2}$ a.s.

2. If $\alpha =1$, then ${Z}_{n}/n$ converges almost surely to a random limit with a Beta$\left(c,c\right)$ distribution, i.e. with density $x\to {\mathrm{\Gamma }}^{-2}\left(c\right)\mathrm{\Gamma }\left(2c\right){x}^{c-1}\left(1-x{\right)}^{c-1}$ with respect to Lebesgue’s measure on $\left[0,1\right]$, where $\mathrm{\Gamma }$ is the Gamma function.

3. If $\alpha >1$, then eventually only one branch will be chosen, i.e. $\begin{array}{r}\mathrm{\exists }x\in \left\{0,1\right\}\phantom{\rule{thickmathspace}{0ex}},\text{\hspace{0.17em}}\mathrm{\exists }{n}_{0}\in \mathbb{N}\phantom{\rule{thickmathspace}{0ex}},\text{\hspace{0.17em}}\mathrm{\forall }n\ge {n}_{0}\phantom{\rule{thickmathspace}{0ex}},\text{\hspace{0.17em}}{X}_{n}=x\phantom{\rule{thickmathspace}{0ex}}.\end{array}$

The case $\alpha <1$ is due to [3]; the case $\alpha >1$ to [4] and the case $\alpha =1$ to [1] (see [17] for an online access).

## 2.3 Ethological interpretation of the parameters and properties of the model

The parameter $\alpha$ characterizes the ants' differential sensitivity to the pheromone. When $\alpha >1$, the ants discriminate increasingly well between the growing amounts of pheromone laid on each branch, and are thus more likely to choose the branch with the most pheromone. Moreover, after a random but almost surely finite number of passages, one branch is eventually selected, i.e. all ants afterwards choose this branch (see Theorem 4–(iii)). In the opposite case, when $\alpha <1$, the ants become less and less able to perceive the difference between the amounts of pheromone laid on each branch as these amounts increase. This dampening effect is so strong that the proportion of passages on each branch converges to $1/2$ (see Theorem 4–(i)). It is important to note that, when $\alpha \ne 1$, the asymptotic behavior of the proportion of passages through each branch depends only on $\alpha$ and not on $c$. Furthermore, the larger $\alpha$ is, or the closer $\alpha$ is to zero, the faster these effects occur.

The role of $c$ is clearer when eq. (11) is rewritten as follows:

$\begin{array}{rl}\mathbb{P}\left({X}_{k+1}=1|{{F}}_{k}\right)& =\frac{{\left(1+{Z}_{k}/c\right)}^{\alpha }}{{\left(1+{Z}_{k}/c\right)}^{\alpha }+{\left(1+\left(k-{Z}_{k}\right)/c\right)}^{\alpha }}\phantom{\rule{thickmathspace}{0ex}}.\end{array}$

The parameter $c$ is the inverse of the reinforcement incrementation and thus can be interpreted as the inverse of the attractiveness (or the strength) of the pheromones laid at each passage. Consequently when $\alpha$ is neither very close to zero nor very large, $c$ has a strong short term influence. When $c$ is small compared to $1$, the first passage strongly reinforces the first chosen branch. Thus during the first few passages, a branch will be highly favored even if $\alpha <1$ (in which case the branches will eventually be uniformly crossed). When $c$ is large, the first passages weakly reinforce the chosen branches. Then if $\alpha >1$ (in which case a branch will eventually be selected), a large number of passages must be observed before the clear emergence of a preference. Naturally, the larger $c$ is or the closer $c$ is to zero, the longer these effects will be seen.

When $\alpha =1$, the asymptotic behavior of the passage proportion is determined by $c$ (see Theorem 4–(ii)). As $c$ grows from zero to infinity, the limiting distribution of ${Z}_{n}/n$ (as $n\to \mathrm{\infty }$) evolves continuously from two Dirac point masses at 0 and 1 to a single Dirac mass at 1/2. To illustrate this point, we show in Figure 1 the density of the Beta distribution for $c=0.1$ and $c=10$. We make some further comments.

• If $c<1$, a strong asymmetry in the choices of the branches appears. One branch is eventually chosen much more frequently than the other. Furthermore as $c$ tends to $0$, the Beta distribution tends to the distribution with two point masses at $0$ and $1$. This limit case corresponds to the situation in which a branch is selected, i.e. $\alpha >1$.

• If $c=1$, the limiting distribution is uniform on $\left[0,1\right]$.

• If $c>1$, ${Z}_{n}/n$ appears to be much more concentrated around $1/2$. This is similar to what is observed in the case $\alpha <1$.
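These regimes are easy to see numerically. The following sketch (ours, not code from the paper; function names are illustrative) simulates the urn using the rewritten choice probability above with $\alpha =1$ and compares the spread of ${Z}_{n}/n$ for $c=0.1$ and $c=10$; the empirical variance shrinks as $c$ grows, in line with the Beta$\left(c,c\right)$ limit.

```python
import random

def choice_prob(z, k, alpha, c):
    # P(X_{k+1} = 1 | Z_k = z) after k passages, eq. (11) in its rewritten form
    a = (1 + z / c) ** alpha
    b = (1 + (k - z) / c) ** alpha
    return a / (a + b)

def final_proportion(n, alpha, c, rng):
    # simulate one path of n choices and return Z_n / n
    z = 0
    for k in range(n):
        if rng.random() < choice_prob(z, k, alpha, c):
            z += 1
    return z / n

rng = random.Random(0)
variances = {}
for c in (0.1, 10.0):
    props = [final_proportion(400, 1.0, c, rng) for _ in range(300)]
    m = sum(props) / len(props)
    variances[c] = sum((p - m) ** 2 for p in props) / len(props)
    print(f"c = {c}: mean = {m:.2f}, variance of Z_n/n = {variances[c]:.3f}")
```

For $c=0.1$ most simulated proportions pile up near 0 or 1, while for $c=10$ they concentrate around $1/2$, as in Figure 1.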

Figure 1:

Graph of the Beta distribution for parameters $\left(c,c\right)$ with $c=0.1$, $c=1$ and $c=10$.

To summarize, the model possesses four phases: fast and slow selection and fast and slow uniformization. These phases are delimited by two phase transitions: a discontinuous one between $\alpha >1$ (branch selection) and $\alpha <1$ (branch uniformization), with a critical state at $\alpha =1$, and a smooth one between $c<1$ (strong pheromone deposits) and $c>1$ (weak pheromone deposits). These properties are summarized in the phase diagram in Figure 2. When $c$ is small, it is very likely that one branch will be favored during the first passages, so the empirical distribution of the choices resembles the Beta distribution with a small $c$ (the grey solid line in Figure 1). As the number of experiments increases, the shape of the empirical distribution gets closer and closer to its limit: a Dirac mass at $1/2$ if $\alpha <1$ and two Dirac masses at 0 and 1 if $\alpha >1$. When $c$ is large, the earlier passages do not show any preference between the branches. Again, as the number of experiments increases, the asymptotic behavior is progressively revealed.

Figure 2:

Phase diagram of the model. The graphs in the shaded boxes show the shape of the empirical distribution of ${Z}_{n}/n$ for small $n$ (left) and its limiting distribution (as $n\to \mathrm{\infty }$, right).

To date, the behavioral state most commonly used in the model is the slow selection of a branch (see [2, 18]). But at least two other states are interesting. Fast uniformization can describe the case where neither branch is preferred. Slow uniformization could reproduce the saturation phenomenon: there exists a threshold pheromone concentration above which ants can no longer detect variations in pheromone concentration (see [19]). In experiments involving many ants, one can first observe one branch being favored. But once this branch is saturated (its attractiveness stops increasing), ants go more and more through the other branch, whose attractiveness still increases. Eventually, the two branches are uniformly chosen.

#### Remark 5

In the context of an estimation procedure, the similar effects of $\alpha$ and $c$ (favorization/selection of a branch or not) induce an identifiability issue. Indeed, we observe a finite number of choices and consequently we only see the short term behavior. We have seen that the favorization of a branch in the first passages could be due to a pair of parameter values $\left(\alpha ,c\right)$ with $c$ small compared to $1$ and $\alpha$ close to $1$ or to a pair $\left(\alpha ,c\right)$ with $c$ close to $1$ and $\alpha$ large compared to $1$.

Thus, we can expect that the estimation of the parameters will be difficult when both parameters contribute to the same effect, e.g. $\alpha$ large and $c$ small (fast selection of one branch) or $\alpha$ small and $c$ large (no selection); and also when the parameters have competing effects: very small $c$ and $\alpha <1$, or very large $c$ and $\alpha >1$. The statistical procedure that we introduce in this paper partially circumvents this difficulty, since it focuses on the transition probabilities rather than on the general shape of a curve, which is what calibration methods do. This will be illustrated in Section 2.4.

## 2.2 Parameter estimation

We start with the maximum likelihood estimator, that is ${\stackrel{ˆ}{\theta }}_{N}=\left({\stackrel{ˆ}{\alpha }}_{N},{\stackrel{ˆ}{c}}_{N}\right)$ defined by eq. (5). The Fisher information matrix has the following expression.

${{I}}_{n}\left(\theta \right)=\sum _{k=0}^{n-1}\sum _{i=0}^{k}\mathbb{P}\left({Z}_{k}=i\right)\,f\left(\alpha ,c,i,k-i\right)\,\stackrel{ˉ}{f}\left(\alpha ,c,i,k-i\right)\,{J}\left(\alpha ,c,i,k-i\right),$(12)

with $\theta =\left(\alpha ,c\right)$ and, for $0\le i\le k\le n-1$, ${J}\left(\alpha ,c,i,k-i\right)=\begin{pmatrix}{\mathrm{log}}^{2}\left(\frac{c+i}{c+k-i}\right)&\mathrm{log}\left(\frac{c+i}{c+k-i}\right)\frac{\alpha \left(k-2i\right)}{\left(c+i\right)\left(c+k-i\right)}\\ \mathrm{log}\left(\frac{c+i}{c+k-i}\right)\frac{\alpha \left(k-2i\right)}{\left(c+i\right)\left(c+k-i\right)}&\frac{{\alpha }^{2}{\left(k-2i\right)}^{2}}{{\left(c+i\right)}^{2}{\left(c+k-i\right)}^{2}}\end{pmatrix}.$

It is important to note that the Fisher information matrix is not diagonal. Thus the estimation of each parameter has an effect on the estimation of the other.
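Eq. (12) can be evaluated numerically: the distribution of ${Z}_{k}$ is obtained by a forward recursion over the Markov chain, and the $2×2$ matrix is accumulated state by state. The sketch below is our own illustration (function names are ours); it also makes the non-zero off-diagonal term visible.

```python
import math

def f(alpha, c, i, j):
    # choice probability f(alpha, c, i, j) after i and j passages on the branches
    a = (c + i) ** alpha
    return a / (a + (c + j) ** alpha)

def z_dist(n, alpha, c):
    # P(Z_k = i) for k = 0..n-1 via the Markov property (forward recursion)
    dist = [[0.0] * (k + 1) for k in range(n)]
    dist[0][0] = 1.0
    for k in range(n - 1):
        for i, p in enumerate(dist[k]):
            if p == 0.0:
                continue
            q = f(alpha, c, i, k - i)
            dist[k + 1][i + 1] += p * q
            dist[k + 1][i] += p * (1 - q)
    return dist

def fisher_info(n, alpha, c):
    # 2x2 Fisher information matrix I_n(alpha, c) of eq. (12);
    # J is the outer product g g^T of the score vector g
    I = [[0.0, 0.0], [0.0, 0.0]]
    dist = z_dist(n, alpha, c)
    for k in range(n):
        for i, p in enumerate(dist[k]):
            fk = f(alpha, c, i, k - i)
            w = p * fk * (1 - fk)
            g = (math.log((c + i) / (c + k - i)),
                 alpha * (k - 2 * i) / ((c + i) * (c + k - i)))
            for r in range(2):
                for s in range(2):
                    I[r][s] += w * g[r] * g[s]
    return I
```

For instance, `fisher_info(100, 2.0, 20.0)` gives a symmetric matrix with a strictly negative off-diagonal term, confirming that the estimation of each parameter affects the other.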

#### Corollary 6

Let $\mathrm{\Theta }$ be a compact subset of $\left(0,\mathrm{\infty }{\right)}^{2}$ which contains $\left({\alpha }_{0},{c}_{0}\right)$. Then the maximum likelihood estimator ${\stackrel{ˆ}{\theta }}_{N}$ is consistent and asymptotically normal and efficient, i.e. $\sqrt{N}\left({\stackrel{ˆ}{\theta }}_{N}-{\theta }_{0}\right)$ converges weakly to ${N}\left(0,{{{I}}_{n}}^{-1}\left({\theta }_{0}\right)\right)$.

As mentioned above, we also use weighted least squares estimators defined by eq. (9) for several different weight sequences ${w}_{N}$, such that ${w}_{N}\left(i,k-i\right)$ converges almost surely to ${w}_{0}\left(i,k-i\right)>0$, for all $0\le k\le n-1$ and $0\le i\le k$.

#### Corollary 7

Let $\mathrm{\Theta }$ be a compact subset of $\left(0,\mathrm{\infty }{\right)}^{2}$ which contains $\left({\alpha }_{0},{c}_{0}\right)$ and assume that ${w}_{N}$ converges almost surely to positive weights ${w}_{0}$. Then the weighted least squares estimator ${\stackrel{ˆ}{\theta }}_{N}^{W}$ is consistent and asymptotically normal, i.e. $\sqrt{N}\left({\stackrel{ˆ}{\theta }}_{N}^{W}-{\theta }_{0}\right)$ converges weakly to ${N}\left(0,{\mathrm{\Sigma }}_{n}\left({\theta }_{0}\right)\right)$, where ${\mathrm{\Sigma }}_{n}\left({\theta }_{0}\right)$ is a positive definite covariance matrix. It is efficient if ${w}_{N}\left(i,k-i\right)={p}_{N}^{-1}{q}_{N}^{-1}{a}_{N}\left(i,k-i\right)$, for all $0\le k\le n-1$ and $0\le i\le k$.

The proofs of these corollaries are in Section 5.5. They are an application of Theorems 2 and 3. Since Assumption 1–(i) obviously holds, it remains only to check (ii) and (iii) of Assumption 1.

## 2.3 Estimation on a single path

The main feature of the binary choice model for ${\alpha }_{0}>1$ is that only one branch will be crossed eventually. It seems clear then that a statistical procedure based on only one path (one sequence of choices) cannot be consistent, since no new information will be obtained after one branch is eventually abandoned. This intuition is true and more surprisingly, it is also true in the case ${\alpha }_{0}=1$. This is translated in statistical terms in the following theorem. Let ${\mathrm{\ell }}_{n}$ denote the log-likelihood based on a single path of length $n$ and ${\stackrel{˙}{\mathrm{\ell }}}_{n}$ its gradient. The model is regular, so the Fisher information is ${\mathrm{v}\mathrm{a}\mathrm{r}}_{\theta }\left({\stackrel{˙}{\mathrm{\ell }}}_{n}\left(\theta \right)\right)$.

#### Theorem 8

1. If ${\alpha }_{0}=1$ and ${c}_{0}>0$, then $\underset{n\to \mathrm{\infty }}{lim}{{I}}_{n}\left({c}_{0}\right)<\mathrm{\infty }$.

2. If ${\alpha }_{0}<1$, ${n}^{-1}{\mathrm{\ell }}_{n}\left({\theta }_{0}\right)\to -log2$.

3. If ${\alpha }_{0}>1$, then ${\mathrm{\ell }}_{n}\left({\theta }_{0}\right)$ converges almost surely to a random variable as $n\to \mathrm{\infty }$.

The proof is in Section 5.6. Statement (i) means that, when ${\alpha }_{0}=1$, the Fisher information is bounded. This implies that the parameter ${c}_{0}$ cannot be estimated on a single path. This also implies that the length $n$ of each path should be taken as large as possible (theoretically infinite) in order to minimize the asymptotic variance of the estimators. Statements (ii) and (iii) imply that the maximum likelihood estimator is inconsistent, since the likelihood does not tend to a constant.

## 2.4 Simulation experiment

In order to assess the quality of the proposed estimators, we conducted a short simulation study. For several pairs $\left(\alpha ,c\right)$, we simulated 1,000 experiments of $N=50$ paths of length $n=100$ (recall that $n$ is the number of ants going through the bifurcation). These are reasonable values in view of the practical experiments with actual ants. We compare the performance of the maximum likelihood estimator ${\stackrel{ˆ}{\theta }}_{N}$ (MLE) defined in eq. (5) and of the weighted least squares estimator ${\stackrel{ˆ}{\theta }}_{N}^{W}$ (WLSE) defined in eq. (9) with the weights ${w}_{N}\left(i,k-i\right)={a}_{N}\left(i,k-i\right)$ defined in eq. (7). The asymptotically efficient WLSE with the weights ${w}_{N}\left(i,k-i\right)={a}_{N}\left(i,k-i\right){p}_{N}\left(i,k-i{\right)}^{-1}{q}_{N}\left(i,k-i{\right)}^{-1}$ provides a severely biased estimate of $\alpha$ and always estimates a very small value of $c$ with a very small dispersion. This is caused by the fact that the empirical ${p}_{N}$ and ${q}_{N}$ vanish frequently, so that the weights are infinite. We therefore do not report results for this estimator.
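As a concrete illustration of this set-up, the following sketch (our own code, with illustrative parameter values) simulates $N=50$ paths of length $n=100$ and evaluates the log-likelihood ${L}_{N}$ of eq. (4) on a coarse parameter grid; a real study would replace the grid by a numerical optimizer.

```python
import math
import random

def f(alpha, c, i, j):
    # choice probability for the right branch after i right and j left passages
    a = (c + i) ** alpha
    return a / (a + (c + j) ** alpha)

def simulate(N, n, alpha, c, rng):
    # N i.i.d. paths of n choices each
    paths = []
    for _ in range(N):
        z, xs = 0, []
        for k in range(n):
            x = 1 if rng.random() < f(alpha, c, z, k - z) else 0
            xs.append(x)
            z += x
        paths.append(xs)
    return paths

def log_likelihood(paths, alpha, c):
    # L_N(theta): sum of log transition probabilities over all observed choices
    ll = 0.0
    for xs in paths:
        z = 0
        for k, x in enumerate(xs):
            p = f(alpha, c, z, k - z)
            ll += math.log(p if x == 1 else 1 - p)
            z += x
    return ll

rng = random.Random(1)
paths = simulate(50, 100, 1.0, 1.0, rng)        # true parameters alpha = c = 1
grid = [(a / 2, c) for a in range(1, 7) for c in (0.5, 1, 2, 5, 10, 20)]
best = max(grid, key=lambda t: log_likelihood(paths, *t))
print("grid MLE (alpha, c):", best)
```

The likelihood at the true parameter pair dominates that of a distant pair, which is what makes the MLE workable at these sample sizes.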

## The theoretical standard deviation

We first evaluate numerically some values of the theoretical standard deviations of both estimators for several values of $\alpha$ and $c$. We have chosen arbitrary values of $\alpha$ and $c$ in the range $\left[0.5,2\right]$. We have also chosen values of $\alpha$ and $c$ which correspond to those found in the literature cited and to those that we estimated in the real life experiment described in Section 3. These results are reported in Table 1 and in Figure 3; their features are summarized in the following points.

• As theoretically expected, the asymptotic variance of the MLE, which is the Fisher information bound, is smaller than the variance of the WLSE, but the ratio between the variances of the two estimators is never less than one fourth. Moreover, their overall behavior is similar.

• The variance of the estimators of $\alpha$ is smaller when the two parameters do not contribute to the same effect. The worst variance occurs for $\alpha$ large and $c$ small, that is, when the parameter values imply fast selection of a branch. The variance tends to infinity as $\alpha$ tends to infinity.

• The variance of the estimators of $c$ increases with $c$ and tends to infinity when $\alpha$ tends to 0 and to $\mathrm{\infty }$.

• These effects are explained by the fact that the coefficients of the Fisher information matrix tend to zero when $\alpha$ tends to zero, except the coefficient corresponding to $\alpha$. See Formula (12).

Figure 3:

Theoretical standard deviation for $N=50$ paths of length $n=100$ of the MLE of $\alpha$ (a) and of $c$ (b), for $\alpha$ in $\left(0,2\right]$ and fixed values of $c$. The theoretical standard deviation of the WLSE has the same shape, but diverges faster as $\alpha$ tends to $0$ or $\mathrm{\infty }$.

## Performance of the estimators

Recall that we have simulated 1,000 experiments, each of $N=50$ paths of length $n=100$. Because of the length of the computations, each MLE was computed only 500 times. Table 1 reports the square root of the mean squared error (MSE) of both estimators based on the simulated data for the same values of the parameters; their features are summarized in the following points.

• For most values of the parameters, the MSE are close to the theoretical standard deviation.

• The MSE increases significantly when $c$ is large or when both parameters contribute to the same effect.

• This increase is more noticeable for the WLSE than for the MLE.

• This increase is in part due to the skewness of these estimators. For some values of the parameters, both estimators tend to overestimate the parameters.

• For the MLE, the MSE is much larger in the case of non-selection than in the case of selection where the empirical performance of the MLE nearly matches the theoretical value.

• These effects are always stronger for the estimation of $c$ than for the estimation of $\alpha$.

This degraded performance for some specific or extreme values of the parameters is in part due to numerical issues.

• In the case where selection of a branch is fast, many of the empirical weights used to compute the WLSE vanish, and the least squares method uses very few points to fit the curve. The MLE is not affected by this problem.

• In the case where both parameters contribute to non-selection, the probability of choosing one branch converges very fast to 1/2, and thus the experiment brings very little information. This affects both the MLE and the WLSE; in addition, many of the empirical weights vanish, so the WLSE is even less efficient.

The degraded performance may also be caused by the identifiability problem explained in Remark 5, i.e., similar effects of the two parameters make the estimation more difficult.

Table 1:

Theoretical standard deviations (TSD) for $N=50$ paths of length $n=100$ and square roots of the mean squared errors (MSE) for $500$ (for the MLE) or $1000$ (for the WLSE) simulated experiments of $N=50$ paths of length $n=100$. All figures in this table must be multiplied by $0.01$.

Table 2:

Monte-Carlo 95% confidence intervals for $500$ simulated experiments of $N=50$ paths of length $n=100$ and Bootstrap 95% confidence intervals for one simulated experiment of $50$ paths of length $100$.

## Bootstrap confidence intervals

Since the asymptotic variance depends on the unknown parameters, we have computed the pivotal Bootstrap 95% confidence intervals for the parameters based on one simulation of $N=50$ paths of length $n=100$ and a Bootstrap sample size of $500$ (see [20], Section 8.3, for details on this method). In Table 2, we have compared these Bootstrap intervals with the corresponding Monte-Carlo intervals, based on 500 simulations. The match is nearly perfect for the MLE for $\alpha$, but as before, the performance is poorer for the estimation of $c$. The intervals for $c$ are noticeably skewed to the right but always contain the true value. For further comparison, we only show here the results corresponding to the values of the parameters estimated in the real life experiment reported below and those corresponding to values found in the earlier literature.
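The pivotal Bootstrap construction can be sketched as follows (our own code, simplified to one dimension for readability: $c$ is held fixed at its true value and only $\alpha$ is re-estimated on each Bootstrap resample, with a coarse grid standing in for the actual optimizer).

```python
import math
import random

def f(alpha, c, i, j):
    # choice probability after i and j passages on the two branches
    a = (c + i) ** alpha
    return a / (a + (c + j) ** alpha)

def simulate(N, n, alpha, c, rng):
    paths = []
    for _ in range(N):
        z, xs = 0, []
        for k in range(n):
            x = 1 if rng.random() < f(alpha, c, z, k - z) else 0
            xs.append(x)
            z += x
        paths.append(xs)
    return paths

def mle_alpha(paths, c):
    # coarse grid MLE of alpha with c held fixed (for illustration only)
    def ll(alpha):
        s = 0.0
        for xs in paths:
            z = 0
            for k, x in enumerate(xs):
                p = f(alpha, c, z, k - z)
                s += math.log(p if x else 1 - p)
                z += x
        return s
    return max((0.2 * g for g in range(1, 16)), key=ll)

rng = random.Random(2)
c0 = 1.0
paths = simulate(20, 50, 1.0, c0, rng)           # true alpha = 1
est = mle_alpha(paths, c0)
B = 100                                          # Bootstrap sample size
boot = sorted(mle_alpha([paths[rng.randrange(len(paths))] for _ in paths], c0)
              for _ in range(B))
lo = 2 * est - boot[int(0.975 * B)]              # pivotal 95% interval
hi = 2 * est - boot[int(0.025 * B)]
print(f"alpha_hat = {est:.1f}, pivotal 95% CI = [{lo:.1f}, {hi:.1f}]")
```

Resampling is done at the level of whole paths, since the paths (not the individual choices) are i.i.d.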

## 3 Real life experiment with ants

In this section, we apply the previous estimators on data from a path selection experiment by a colony of ants.

## 3.1 Experiment description

This experiment was carried out in the Research Center on Animal Cognition (UMR 5169) of Paul Sabatier University, Toulouse, under the supervision of Guy Theraulaz, Hugues Chaté and the first author. A small laboratory colony (approximately 200 workers) of Argentine ants Linepithema humile was starved for two days before the experiment. During the experiment, the colony had access to a fork carved in a white PVC slab, partially covered by a Plexiglas plate (see Figure 4). The angle between the branches was ${60}^{\circ }$. The fork galleries had a square cross-section of $0.5$ cm. The entrance of the maze was controlled by a door. Food was never present during the experiment. The maze was initially free of any pheromone trail.

Figure 4:

The experimental setup: a fork carved in a white PVC slab, partially covered by a Plexiglas plate.

Each trial ($N=50$ in total) consisted of introducing ants one at a time at the entrance of the fork (see Figure 4). Once inside, an ant had to choose between the left and the right branch of the fork. As soon as the ant had made a choice and stepped into one branch, it was removed from the setup and another ant was introduced. All choices were recorded and a trial ended when 100 ants had passed through the fork.

This experimental protocol was designed to strengthen the behavioral assumptions described in Section 3. Returns to the fork were prevented, so each ant can be considered to have passed only once. There was never more than one ant in the setup. This implies that each ant in the maze received no cue about the previous passages other than the pheromone that had been laid. The species Linepithema humile was chosen in part to justify the assumption of identical pheromone deposits: these ants deposit the same type of pheromone regularly along their trajectory (see [21, 22]). All ants were prepared in the same way before the experiments to increase the credibility of the assumption that each ant behaved in the same way. The length of the experiments was limited so as to stay close to the half-life duration of the pheromone trails (see [9]).

## 3.2 Data representation

Figure 5(a) shows the 50 paths of length $n=100$, that is, $50$ choice sequences of 100 ants that went through the bifurcation. The paths are represented as random walks with increment $+1$ when the right branch is chosen and $-1$ when the left one is chosen. In fewer than ten experiments, a branch seemed to be selected, whereas in the others, selection of a branch was not obvious. Figure 5(b) shows the histogram of the distribution of ${Z}_{100}/100$, that is, the final proportion of choices of the right branch. There is no clear visual evidence that $\alpha >1$, as is claimed in the literature (see [2, 5, 6]).

Figure 5:

Data representation. (a) The 50 paths of $n=100$ ants choosing either right ($+1$) or left ($-1$). (b) Histogram of the final proportion of right passages (${Z}_{100}/100$).

## 3.3 Parameter estimation

Several values of these parameters have been proposed in the applied literature. [2] proposed $\alpha =2$, $c=20$ and more recently [6] suggested $\alpha =2.6$ and $c=60$. It must be noted, however, that these values were not obtained by a statistical method but by the calibration of a curve to a plot. Therefore, these methods do not provide confidence intervals. Moreover, a calibration method carries an inherent risk of overfitting, because of the identifiability problem explained in Remark 5. As illustrated in Figure 2, if for instance $\alpha$ and $c$ are both small, then both branches will be asymptotically equally chosen, but paths of finite length $n$ might be misleading and the calibration will suggest values of $\alpha$ and $c$ corresponding to the selection of a branch. The statistical procedure is based on the dynamics of the process and is thus less prone to this type of error. Nevertheless, we will see that our results do not contradict those of [2, 6], but complement them.

Table 3 shows the results of the maximum likelihood estimation and the weighted least squares estimation. Both estimates of $\alpha$ are close to 1.1 and the estimates of $c$ are between 3 and 7. The $95\mathrm{%}$ confidence intervals are slightly larger than the simulated ones (see Table 2). This increased variability may be due to the extreme paths which seem to show a very fast selection of one branch (see Figure 5). This may suggest that the ants did not have the same behavior and that the distribution of ${Z}_{100}/100$ could be a mixture of two distributions.

For both methods, the 95% Bootstrap confidence intervals of $\alpha$ contain the value 1. More precisely, as shown in Figure 6, approximately 1/3 of the Bootstrap estimates indicate weak pheromone deposits ($c>1$) and a weak differential sensitivity ($\alpha <1$), which means that the branches are eventually uniformly crossed. In almost all other cases, the estimates indicate weak pheromone deposits ($c>1$) and a strong differential sensitivity ($\alpha >1$), which means that a branch will eventually, though slowly, be selected. In only a few cases do the estimators give strong pheromone deposits ($c<1$) but a weak differential sensitivity ($\alpha <1$), which means that one branch is chosen more than the other at the beginning of the experiment, but the branches are eventually uniformly crossed. Finally, there are no values which imply both strong pheromone deposits ($c<1$) and a strong differential sensitivity ($\alpha >1$). Therefore, we can conclude with good confidence that pheromone deposits are weak, but we cannot confidently decide for $\alpha$.

The values obtained by [2] ($\alpha =2$, $c=20$) and more recently by [6] ($\alpha =2.6$, $c=60$) both lie in the confidence intervals for the WLSE given in Table 3. But the values of $\alpha$ suggested by these authors are outside the 95% confidence interval for the MLE. Thus these parameters, which imply a slow branch selection, are no more likely than a parameter set which would yield non-selection of a path.

Table 3:

The MLE and the WLSE for the 50 paths of real ants and their Bootstrap $95\mathrm{%}$ confidence intervals.

Figure 6:

Log-log scatterplots of the estimates $\left({\stackrel{ˆ}{\alpha }}^{\ast },{\stackrel{ˆ}{c}}^{\ast }\right)$ for the 500 Bootstrap samples for the MLE (a) and the WLSE (b) for the 50 paths of real ants.

Figure 7:

Graph of the estimates $\stackrel{ˆ}{\alpha }$ and their Bootstrap 95% confidence interval for the MLE (a) and the WLSE (b) for the 50 paths of real ants as a function of fixed value of $c$.

Figure 6 illustrates the fact that the two estimators are strongly positively correlated. There seem to be two cutoff values for $c$: if ${\stackrel{ˆ}{c}}^{\ast }>8$, then ${\stackrel{ˆ}{\alpha }}^{\ast }>1$, and if ${\stackrel{ˆ}{c}}^{\ast }<1.5$, then ${\stackrel{ˆ}{\alpha }}^{\ast }<1$. The values reported by [2, 6] mentioned above exhibit these features: both have $c>8$ and $\alpha >1$, and $\alpha$ increases with $c$.

If we fix the value of $c$ and estimate only $\alpha$, then the 95% Bootstrap confidence intervals for $\alpha$ are smaller. Figure 7 shows the estimated values of $\alpha$ and the confidence intervals as functions of the fixed value of $c$. We see that if $c$ is greater than $6$ for the MLE (or than $12$ for the WLSE), then the confidence intervals of $\alpha$ lie entirely above $1$. Furthermore if $c$ is less than $0.8$ for the MLE (or than $2$ for the WLSE), then the confidence intervals of $\alpha$ lie entirely under $1$. This shows that if the deposits are weak enough, i.e. $c>12$, we can conclude that a slow selection of a branch will occur with probability 1. On the other hand, if the deposits are strong enough, i.e. $c<0.8$, we can conclude that branches will eventually be uniformly crossed with probability 1.

## 4 Concluding remarks

No parameter estimation methods for reinforced random walks can be found in the literature. To partially fill this void, this article proposes a statistical framework to estimate the parameter of a general two-colored urn model. We define the maximum likelihood estimator (MLE) and the weighted least squares estimators (WLSE) for the parameter of this model and prove their consistency and asymptotic normality under some usual regularity conditions. The proof relies on a general result for a large class of estimators called minimum contrast estimators. The MLE is asymptotically efficient, but can be difficult (lengthy) to compute, which can be an issue especially when using Bootstrap algorithms. The WLSE is a suitable alternative. Moreover, this estimator is popular among practitioners.

We apply this statistical tool to the problem of path selection by an ant colony. For this purpose, we performed experiments with actual ants to collect data. The experiment consisted of introducing one hundred ants into a $Y$-shaped device, one at a time, and observing their successive choices. We also consider the particular urn model introduced by [2] to describe this phenomenon. This urn has two parameters, $\alpha$ and $c$, which have distinct biological interpretations but contribute to the same effect: either selection of a branch or uniformization of the choices. The parameter $c$ influences the short term behavior, whereas $\alpha$ determines the asymptotic behavior. Consequently the model exhibits four phases, which are illustrated in Figure 2. The case most commonly considered in the literature is that of slow selection, which corresponds to $\alpha >1$ and $c>1$: the ants will eventually always choose the same branch, but this selection will take a long time. For instance, [2] provides the values $\alpha =2$ and $c=20$. However, other phases can be relevant to describe ant behavior. For instance, fast uniformization, corresponding to $\alpha <1$ and $c>1$, can model the less likely but not negligible case in which ants do not select a branch.

After assessing the accuracy of the MLE and the WLSE on simulated data, we estimate the values of $\alpha$ and $c$ with the two estimators. We also evaluate confidence regions with Bootstrap procedures. The estimated values of $\alpha$ and $c$ ranged between 1.1 and 3 and between 3 and 7, respectively. This tends to imply that slow selection of a branch will occur. However, the Bootstrap sample gives a confidence level of $65\mathrm{%}$ for the hypothesis of slow selection, while the hypothesis of fast uniformization has a confidence level of $35\mathrm{%}$.

This low level of confidence for the commonly assumed slow selection phase might be explained by technical reasons. The number of experiments (50) is relatively small; increasing the number of replicates would reduce the confidence regions. Moreover, the competition between the parameters for the same effect induces an identifiability issue. For instance, the apparent preference for a branch may be due to $\alpha >1$ or to $c$ small with respect to 1. Therefore the model, which is biologically relevant, is statistically difficult to estimate. Indeed, for an ethological study, discriminating the ants' pheromone sensitivity from the pheromone deposit strength is meaningful. But for a statistical procedure, the similarity of the effects of the two parameters degrades the estimation performance.

However, the uncertainty may not come from an inefficiency of the statistical procedure, but from shortcomings of the ethological hypotheses. Indeed, the estimated confidence intervals computed from the experimental data are larger than the ones computed from the simulated data (for similar parameter values). Moreover the assumption that the inter-individual variability is negligible is strong. For instance, it may be necessary to consider that the pheromone deposit varies at each passage, i.e. that $c$ is random.

These ethological considerations will be further discussed in a forthcoming paper which will analyze more elaborate experimental designs. The ants will be observed while freely evolving in a network with several nodes. In addition to a data analysis, we will model the experiment with a reinforced random walk on a finite graph for which we have provided probabilistic results (see [23]). The statistical methodology introduced in this paper will be extended to a larger class of reinforced random walks.

## 5.1 Distribution of ${Z}_{k}$, for $k\in \mathbb{N}$

In order to compute the distribution of ${Z}_{k}$, we introduce some notations. Let ${{S}}_{k}$ be the set of sequences of length $k+1$ of integers ${i}_{0},\dots ,{i}_{k}$ such that ${i}_{0}=0$ and ${i}_{j}-{i}_{j-1}\in \left\{0,1\right\}$ for $j=1,\dots ,k$. For $i\le k$ let ${{S}}_{k}\left(i\right)=\left\{\left({i}_{0},\dots ,{i}_{k}\right)\in {{S}}_{k}\mid {i}_{k}=i\right\}$. Then we have

$\mathbb{P}\left({Z}_{k}=i\right)=\sum _{\left({i}_{0},\dots ,{i}_{k}\right)\in {{S}}_{k}\left(i\right)}\prod _{q=0}^{k-1}{f}_{0}{\left({i}_{q},q-{i}_{q}\right)}^{{i}_{q+1}-{i}_{q}}{\left(1-{f}_{0}\left({i}_{q},q-{i}_{q}\right)\right)}^{1-{i}_{q+1}+{i}_{q}}.$(13)
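Summing over ${{S}}_{k}\left(i\right)$ as in eq. (13) is exponential in $k$; in practice the same distribution is computed by a forward recursion over the Markov chain in $O\left({k}^{2}\right)$ operations. The sketch below (our own code, with illustrative parameter values for ${f}_{0}$) checks the two computations against each other for a small $k$.

```python
import itertools

def f0(i, j, alpha=1.5, c=2.0):
    # true choice function f_0 (illustrative parameter values)
    a = (c + i) ** alpha
    return a / (a + (c + j) ** alpha)

def dist_recursive(k):
    # P(Z_k = i) for i = 0..k via the Markov property, in O(k^2) operations
    d = [1.0]
    for q in range(k):
        nd = [0.0] * (len(d) + 1)
        for i, p in enumerate(d):
            fq = f0(i, q - i)
            nd[i + 1] += p * fq
            nd[i] += p * (1 - fq)
        d = nd
    return d

def dist_enumerated(k):
    # P(Z_k = i) by summing over every path in S_k(i), as written in eq. (13)
    d = [0.0] * (k + 1)
    for steps in itertools.product((0, 1), repeat=k):
        p, z = 1.0, 0
        for q, x in enumerate(steps):
            fq = f0(z, q - z)
            p *= fq if x else 1 - fq
            z += x
        d[z] += p
    return d

print(dist_recursive(8))
```

Both functions return the same probability vector; the recursive form is the one used for numerical work such as evaluating eq. (12).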

## A central limit theorem for the empirical conditional probabilities

For $0\le i\le k\le n-1$, recall the definition of ${a}_{N}\left(i,k-i\right)$ and ${p}_{N}\left(i,k-i\right)$ in eq. (7) and that ${\stackrel{ˉ}{f}}_{0}\left(i,k-i\right)=1-{f}_{0}\left(i,k-i\right)$.

#### Lemma 9

$\left\{\sqrt{N}\left({p}_{N}\left(i,k-i\right)-{f}_{0}\left(i,k-i\right)\right),0\le i\le k\le n-1\right\}$ converges weakly to a Gaussian vector with diagonal covariance matrix ${\mathrm{\Gamma }}_{0}$ whose diagonal elements are ${\gamma }_{0}\left(i,k-i\right)=\frac{{f}_{0}\left(i,k-i\right){\stackrel{ˉ}{f}}_{0}\left(i,k-i\right)}{\mathbb{P}\left({Z}_{k}=i\right)}.$(14)

#### Proof.

Define ${b}_{N}\left(i,k-i\right)={N}^{-1}\sum _{j=1}^{N}{1}_{{Z}_{k}^{j}=i}{X}_{k+1}^{j}$, the empirical estimate of $b\left(i,k-i\right)=\mathbb{P}\left({Z}_{k}=i,{X}_{k+1}=1\right)$, and $a\left(i,k-i\right)=\mathbb{E}\left[{a}_{N}\left(i,k-i\right)\right]=\mathbb{P}\left({Z}_{k}=i\right)$. Write then ${p}_{N}\left(i,k-i\right)-{f}_{0}\left(i,k-i\right)=\frac{{b}_{N}\left(i,k-i\right)-b\left(i,k-i\right)}{{a}_{N}\left(i,k-i\right)}-\frac{b\left(i,k-i\right)}{{a}_{N}\left(i,k-i\right)a\left(i,k-i\right)}\left({a}_{N}\left(i,k-i\right)-a\left(i,k-i\right)\right).$

Since the paths $\left({Z}_{1}^{j},\dots ,{Z}_{n}^{j}\right)$, $1\le j\le N$ are i.i.d., the multivariate central limit theorem holds for the sequence of $2n\left(n-1\right)$ dimensional vectors $\left\{\left({b}_{N}\left(i,k-i\right)-b\left(i,k-i\right),{a}_{N}\left(i,k-i\right)-a\left(i,k-i\right)\right),0\le i\le k\le n-1\right\}$. The proof is concluded by tedious computations using the Markov property, which we omit.

#### Remark 10

We can prove that the covariance matrix ${\mathrm{\Gamma }}_{0}$ is diagonal by a statistical argument. Consider the tautological model $\left\{f\left(i,k-i\right),0\le i\le k\le n-1\right\}$, i.e. $\theta =f$ with true value ${f}_{0}$. The log-likelihood is then ${L}_{N}\left(f\right)=\sum _{k=0}^{n-1}\sum _{i=0}^{k}{a}_{N}\left(i,k-i\right)\left\{{p}_{N}\left(i,k-i\right)\mathrm{log}f\left(i,k-i\right)+{q}_{N}\left(i,k-i\right)\mathrm{log}\stackrel{ˉ}{f}\left(i,k-i\right)\right\},$ where $\stackrel{ˉ}{f}\left(i,k-i\right)=1-f\left(i,k-i\right)$. Thus we see that $\left\{{p}_{N}\left(i,k-i\right),0\le i\le k\le n-1\right\}$ is the maximum likelihood estimator of ${f}_{0}$. This is a regular statistical model, so $\sqrt{N}\left({p}_{N}-{f}_{0}\right)$ converges weakly to the Gaussian distribution with covariance matrix ${I}_{n}^{-1}\left({f}_{0}\right)$, where ${I}_{n}\left(f\right)$ is the Fisher information matrix of the model. It is easily seen that ${I}_{n}\left({f}_{0}\right)$ is the $n\left(n-1\right)$ dimensional diagonal matrix with diagonal elements given by eq. (14).

## 5.3 A general result for minimum contrast estimators

Theorems 2 and 3 are a consequence of the general result we prove in this section. More precisely we demonstrate the consistency and the asymptotic normality of a general estimator of which the MLE and the WLSE are particular cases.

For $0\le i\le k\le n-1$, recall the definition of ${a}_{N}\left(i,k-i\right)$ and ${p}_{N}\left(i,k-i\right)$ in eq. (7) and that $\stackrel{ˉ}{f}\left(\theta ,i,k-i\right)=1-f\left(\theta ,i,k-i\right)$. Let ${w}_{N}\left(i,k-i\right)$, $0\le i\le k\le n-1$ be a sequence of random weights and let $G$ be a function defined on $\left[0,1\right]×\left(0,1\right)$. Define the empirical contrast function by

$\begin{array}{r}{\mathbb{W}}_{N}\left(\theta \right)=\sum _{k=0}^{n-1}\sum _{i=0}^{k}{w}_{N}\left(i,k-i\right)G\left({p}_{N}\left(i,k-i\right),f\left(\theta ,i,k-i\right)\right)\phantom{\rule{thickmathspace}{0ex}}.\end{array}$

For instance, choosing $G\left(p,q\right)=-plogq-\left(1-p\right)log\left(1-q\right)$ and ${w}_{N}\left(i,k-i\right)={a}_{N}\left(i,k-i\right)$ yields

$\begin{array}{rl}{\mathbb{W}}_{N}\left(\theta \right)=& -\sum _{k=0}^{n-1}\sum _{i=0}^{k}{a}_{N}\left(i,k-i\right)\left\{{p}_{N}\left(i,k-i\right)logf\left(\theta ,i,k-i\right)\\ & +{q}_{N}\left(i,k-i\right)log\stackrel{ˉ}{f}\left(\theta ,i,k-i\right)\right\}\\ =& -{N}^{-1}{L}_{N}\left(\theta \right)\phantom{\rule{thickmathspace}{0ex}},\end{array}$

so that minimizing ${\mathbb{W}}_{N}$ is equivalent to maximizing the likelihood ${L}_{N}$, defined in eq. (4). Choosing $G\left(p,q\right)=\left(p-q{\right)}^{2}$ yields the weighted least squares contrast function ${W}_{N}$, defined in eq. (8). We now define the minimum contrast estimator of ${\theta }_{0}$ by

$\begin{array}{r}{\stackrel{ˆ}{\theta }}_{N}^{\mathbb{W}}=arg\underset{\theta \in \mathrm{\Theta }}{min}{\mathbb{W}}_{N}\left(\theta \right)\phantom{\rule{thickmathspace}{0ex}}.\end{array}$
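As an illustration, the minimum contrast estimator can be computed by simulation. The sketch below is an illustrative assumption, not the authors' implementation: it uses the choice function $f\left(\alpha ,c,x,y\right)=\left(c+x{\right)}^{\alpha }/\left(\left(c+x{\right)}^{\alpha }+\left(c+y{\right)}^{\alpha }\right)$ of eq. (10), illustrative parameter values, and a coarse grid in place of a numerical optimizer. It simulates $N$ i.i.d. paths, forms ${a}_{N}$ and ${p}_{N}$, and minimizes the least squares contrast $G\left(p,q\right)=\left(p-q{\right)}^{2}$ with weights ${a}_{N}$.

```python
import numpy as np

def f(alpha, c, x, y):
    # Choice function of eq. (10); parameters (alpha, c) play the role of theta.
    a = (c + x) ** alpha
    return a / (a + (c + y) ** alpha)

def simulate_paths(alpha, c, n, N, rng):
    # N i.i.d. urn paths (Z_0, ..., Z_n), started at Z_0 = 0.
    Z = np.zeros((N, n + 1), dtype=int)
    for k in range(n):
        p = f(alpha, c, Z[:, k], k - Z[:, k])
        Z[:, k + 1] = Z[:, k] + (rng.random(N) < p)
    return Z

def empirical(Z, n):
    # a_N(i, k-i): fraction of paths with Z_k = i;
    # p_N(i, k-i): empirical frequency of drawing color 1 from state (i, k-i).
    a = np.zeros((n, n)); p = np.zeros((n, n))
    for k in range(n):
        for i in range(k + 1):
            sel = Z[:, k] == i
            a[k, i] = sel.mean()
            if sel.any():
                p[k, i] = (Z[sel, k + 1] == i + 1).mean()
    return a, p

def contrast(alpha, c, a, p, n):
    # Empirical contrast with G(p, q) = (p - q)^2 and weights w_N = a_N.
    return sum(a[k, i] * (p[k, i] - f(alpha, c, i, k - i)) ** 2
               for k in range(n) for i in range(k + 1))

rng = np.random.default_rng(0)
alpha0, c0, n, N = 2.0, 5.0, 20, 1000   # illustrative true value and sizes
Z = simulate_paths(alpha0, c0, n, N, rng)
a, p = empirical(Z, n)
# Coarse grid search for the minimum contrast estimator.
grid = [(al, c) for al in np.linspace(0.5, 3.5, 13)
                for c in np.linspace(1.0, 9.0, 17)]
best = min(grid, key=lambda t: contrast(t[0], t[1], a, p, n))
```

The grid search only keeps the sketch self-contained; in practice one would minimize the contrast with a numerical optimizer.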

In order to prove the consistency and asymptotic normality of ${\stackrel{ˆ}{\theta }}_{N}^{\mathbb{W}}$, we make the following assumptions on $G$ and on the weights ${w}_{N}\left(i,k-i\right)$. Let ${\mathrm{\partial }}_{2}G$ and ${\mathrm{\partial }}_{2}^{2}G$ denote the first and second derivatives of $G$ with respect to its second argument.

#### Assumption 11

The function $G$ is non-negative, twice continuously differentiable on $\left[0,1\right]×\left(0,1\right)$ with $G\left(p,q\right)-G\left(p,p\right)>0$ if $p\ne q$, ${\mathrm{\partial }}_{2}G\left(p,p\right)=0$ and ${\mathrm{\partial }}_{2}^{2}G\left(p,p\right)>0$.

#### Assumption 12

For all $0\le i\le k\le n-1$, ${w}_{N}\left(i,k-i\right)$ converges almost surely to ${w}_{0}\left(i,k-i\right)$ and ${w}_{0}\left(i,k-i\right)>0$.

#### Theorem 13

If Assumptions 1–(i), 1–(ii), 11 and 12 hold, then ${\stackrel{ˆ}{\theta }}_{N}^{\mathbb{W}}$ is consistent. If moreover ${\theta }_{0}$ is an interior point of $\mathrm{\Theta }$ and Assumption 1–(iii) holds, then $\sqrt{N}\left({\stackrel{ˆ}{\theta }}_{N}^{\mathbb{W}}-{\theta }_{0}\right)$ converges weakly to a Gaussian distribution with zero mean.

The exact expression of the variance is given in the proof.

#### Proof.

Under Assumptions 11 and 12, the strong law of large numbers shows that ${\mathbb{W}}_{N}\left(\theta \right)$ converges almost surely to $\begin{array}{r}\mathbb{W}\left(\theta \right)=\sum _{k=0}^{n-1}\sum _{i=0}^{k}{w}_{0}\left(i,k-i\right)G\left({f}_{0}\left(i,k-i\right),f\left(\theta ,i,k-i\right)\right)\phantom{\rule{thickmathspace}{0ex}}.\end{array}$

Assumptions 1–(ii) and 11 ensure that ${\theta }_{0}$ is the unique minimum of $\mathbb{W}$. Indeed, $G\left(p,q\right)>G\left(p,p\right)$ if $p\ne q$, so $\mathbb{W}\left(\theta \right)\ge \mathbb{W}\left({\theta }_{0}\right)$, with equality only if $f\left(\theta ,i,k-i\right)=f\left({\theta }_{0},i,k-i\right)$ for all $0\le i\le k\le n-1$. By Assumption 1–(ii), this implies $\theta ={\theta }_{0}$.

Moreover the convergence of ${\mathbb{W}}_{N}$ to $\mathbb{W}$ is uniform, since $\mathrm{\Theta }$ is compact and the function $f$ is twice continuously differentiable with respect to its first variable $\theta$. This yields the consistency of ${\stackrel{ˆ}{\theta }}_{N}^{\mathbb{W}}$. For the sake of completeness, we give a brief proof. Since ${\theta }_{0}$ minimizes $\mathbb{W}$ and ${\stackrel{ˆ}{\theta }}_{N}^{\mathbb{W}}$ minimizes ${\mathbb{W}}_{N}$, we have $\begin{array}{rl}0& \le \mathbb{W}\left({\stackrel{ˆ}{\theta }}_{N}^{\mathbb{W}}\right)-\mathbb{W}\left({\theta }_{0}\right)\\ & =\mathbb{W}\left({\stackrel{ˆ}{\theta }}_{N}^{\mathbb{W}}\right)-{\mathbb{W}}_{N}\left({\stackrel{ˆ}{\theta }}_{N}^{\mathbb{W}}\right)+{\mathbb{W}}_{N}\left({\stackrel{ˆ}{\theta }}_{N}^{\mathbb{W}}\right)-{\mathbb{W}}_{N}\left({\theta }_{0}\right)+{\mathbb{W}}_{N}\left({\theta }_{0}\right)-\mathbb{W}\left({\theta }_{0}\right)\\ & \le \mathbb{W}\left({\stackrel{ˆ}{\theta }}_{N}^{\mathbb{W}}\right)-{\mathbb{W}}_{N}\left({\stackrel{ˆ}{\theta }}_{N}^{\mathbb{W}}\right)+{\mathbb{W}}_{N}\left({\theta }_{0}\right)-\mathbb{W}\left({\theta }_{0}\right)\le 2{sup}_{\theta \phantom{\rule{thinmathspace}{0ex}}\in \phantom{\rule{thinmathspace}{0ex}}\mathrm{\Theta }}|{\mathbb{W}}_{N}\left(\theta \right)-\mathbb{W}\left(\theta \right)|\phantom{\rule{thickmathspace}{0ex}}.\end{array}$

Since ${\theta }_{0}$ is the unique minimizer of $\mathbb{W}$, for $ϵ>0$, we can find $\delta$ such that if $\theta \in \mathrm{\Theta }$ and $\parallel \theta -{\theta }_{0}\parallel >ϵ$, then $\mathbb{W}\left(\theta \right)-\mathbb{W}\left({\theta }_{0}\right)\ge \delta$. Thus $\begin{array}{rl}\mathbb{P}\left(\parallel {\stackrel{ˆ}{\theta }}_{N}-{\theta }_{0}\parallel >ϵ\right)& \le \mathbb{P}\left(\mathbb{W}\left({\stackrel{ˆ}{\theta }}_{N}\right)-\mathbb{W}\left({\theta }_{0}\right)\ge \delta \right)\\ & \le \mathbb{P}\left(2{sup}_{\theta \phantom{\rule{thinmathspace}{0ex}}\in \phantom{\rule{thinmathspace}{0ex}}\mathrm{\Theta }}|{\mathbb{W}}_{N}\left(\theta \right)-\mathbb{W}\left(\theta \right)|\ge \delta \right)\to 0\phantom{\rule{thickmathspace}{0ex}}.\end{array}$

The central limit theorem is a consequence of the consistency and Lemma 9. A first order Taylor expansion of ${\stackrel{˙}{\mathbb{W}}}_{N}\left(\theta \right)$ at ${\theta }_{0}$ yields

$\begin{array}{rl}0& ={\stackrel{˙}{\mathbb{W}}}_{N}\left({\stackrel{ˆ}{\theta }}_{N}^{\mathbb{W}}\right)={\stackrel{˙}{\mathbb{W}}}_{N}\left({\theta }_{0}\right)+{\stackrel{¨}{\mathbb{W}}}_{N}\left({\stackrel{˜}{\theta }}_{N}\right)\left({\stackrel{ˆ}{\theta }}_{N}^{\mathbb{W}}-{\theta }_{0}\right)\phantom{\rule{thickmathspace}{0ex}},\end{array}$

where ${\stackrel{˜}{\theta }}_{N}\in \left[{\theta }_{0},{\stackrel{ˆ}{\theta }}_{N}^{\mathbb{W}}\right]$. Setting ${\stackrel{˙}{f}}_{0}\left(i,k-i\right)=\stackrel{˙}{f}\left({\theta }_{0},i,k-i\right)$, we have

$\begin{array}{r}{\stackrel{˙}{\mathbb{W}}}_{N}\left({\theta }_{0}\right)=\sum _{k=0}^{n-1}\sum _{i=0}^{k}{w}_{N}\left(i,k-i\right){\mathrm{\partial }}_{2}G\left({p}_{N}\left(i,k-i\right),{f}_{0}\left(i,k-i\right)\right){\stackrel{˙}{f}}_{0}\left(i,k-i\right)\phantom{\rule{thickmathspace}{0ex}}.\end{array}$

Let ${\mathrm{\partial }}_{12}^{2}G$ be the mixed second derivative of $G$. Note that

${\mathrm{\partial }}_{2}G\left({f}_{0}\left(i,k-i\right),{f}_{0}\left(i,k-i\right)\right)=0\phantom{\rule{thinmathspace}{0ex}}.$

Thus, by the delta-method (see [24], Theorem 3.3.11) and since ${w}_{N}$ converges almost surely to ${w}_{0}$, we obtain that $\sqrt{N}{\stackrel{˙}{\mathbb{W}}}_{N}\left({\theta }_{0}\right)$ converges weakly towards

$\begin{array}{r}\sum _{k=0}^{n-1}\sum _{i=0}^{k}{w}_{0}\left(i,k-i\right){\mathrm{\partial }}_{12}^{2}G\left({f}_{0}\left(i,k-i\right),{f}_{0}\left(i,k-i\right)\right){\mathrm{\Lambda }}_{0}\left(i,k-i\right){\stackrel{˙}{f}}_{0}\left(i,k-i\right)\phantom{\rule{thickmathspace}{0ex}},\end{array}$

where ${\mathrm{\Lambda }}_{0}\left(i,k\right)$ are independent Gaussian random variables with zero mean and variance ${\gamma }_{0}\left(i,k\right)$ defined in eq. (14). Equivalently, $\sqrt{N}{\stackrel{˙}{\mathbb{W}}}_{N}\left({\theta }_{0}\right)$ converges weakly to a Gaussian vector with zero mean and covariance matrix $H\left({\theta }_{0}\right)$ defined by

$\begin{array}{r}H\left({\theta }_{0}\right)=\sum _{k=0}^{n-1}\sum _{i=0}^{k}{w}_{0}^{2}\left(i,k-i\right)\left\{{\mathrm{\partial }}_{12}^{2}G\left({f}_{0}\left(i,k-i\right),{f}_{0}\left(i,k-i\right)\right){\right\}}^{2}{\gamma }_{0}\left(i,k\right){\stackrel{˙}{f}}_{0}\left(i,k-i\right)\left({\stackrel{˙}{f}}_{0}\left(i,k-i\right){\right)}^{\prime }\phantom{\rule{thickmathspace}{0ex}}.\end{array}$

By the law of large numbers, ${\stackrel{¨}{\mathbb{W}}}_{N}\left(\theta \right)$ converges almost surely to $\stackrel{¨}{\mathbb{W}}\left(\theta \right)$ and this convergence is also locally uniform. Thus, ${\stackrel{¨}{\mathbb{W}}}_{N}\left({\stackrel{˜}{\theta }}_{N}\right)$ converges almost surely to $\stackrel{¨}{\mathbb{W}}\left({\theta }_{0}\right)$. Using again the fact that ${\mathrm{\partial }}_{2}G\left(p,p\right)=0$, we obtain

$\begin{array}{r}\stackrel{¨}{\mathbb{W}}\left({\theta }_{0}\right)=\sum _{k=0}^{n-1}\sum _{i=0}^{k}{w}_{0}\left(i,k-i\right){\mathrm{\partial }}_{2}^{2}G\left({f}_{0}\left(i,k-i\right),{f}_{0}\left(i,k-i\right)\right){\stackrel{˙}{f}}_{0}\left(i,k-i\right)\left({\stackrel{˙}{f}}_{0}\left(i,k-i\right){\right)}^{\prime }\phantom{\rule{thickmathspace}{0ex}}.\end{array}$

Denote for brevity $g\left(i,k-i\right)={w}_{0}\left(i,k-i\right){\mathrm{\partial }}_{2}^{2}G\left({f}_{0}\left(i,k-i\right),{f}_{0}\left(i,k-i\right)\right)$. Then, for any $u\in {\mathbb{R}}^{d}$, we have

$\begin{array}{rl}u\stackrel{¨}{\mathbb{W}}\left({\theta }_{0}\right){u}^{\prime }& ={\sum }_{k=0}^{n-1}{\sum }_{i=0}^{k}g\left(i,k-i\right){\left({\sum }_{s=1}^{d}{u}_{s}{\mathrm{\partial }}_{s}f\left({\theta }_{0},i,k-i\right)\right)}^{2}\phantom{\rule{thickmathspace}{0ex}}.\end{array}$(15)

By Assumption 12, $g\left(i,k-i\right)>0$ for all $0\le i\le k\le n-1$, thus eq. (15) is zero only if for all $k=0,\dots ,n-1$ and $i=0,\dots ,k$, we have $\sum _{s=1}^{d}{u}_{s}{\mathrm{\partial }}_{s}f\left({\theta }_{0},i,k-i\right)=0$. By Assumption 1–(iii), this is possible only if ${u}_{s}=0$ for all $s=1,\dots ,d$. Thus $\stackrel{¨}{\mathbb{W}}\left({\theta }_{0}\right)$ is positive definite.

We can now conclude that for large enough $N$, ${\stackrel{¨}{\mathbb{W}}}_{N}\left({\stackrel{˜}{\theta }}_{N}\right)$ is invertible and we can write

$\begin{array}{r}\sqrt{N}\left({\stackrel{ˆ}{\theta }}_{N}^{\mathbb{W}}-{\theta }_{0}\right)=-{\stackrel{¨}{\mathbb{W}}}_{N}^{-1}\left({\stackrel{˜}{\theta }}_{N}\right)\sqrt{N}{\stackrel{˙}{\mathbb{W}}}_{N}\left({\theta }_{0}\right)\phantom{\rule{thickmathspace}{0ex}}.\end{array}$

The right hand side converges weakly to the Gaussian distribution with zero mean and covariance matrix ${\stackrel{¨}{\mathbb{W}}}^{-1}\left({\theta }_{0}\right)H\left({\theta }_{0}\right){\stackrel{¨}{\mathbb{W}}}^{-1}\left({\theta }_{0}\right)$.

## 5.4 Proofs of Theorems 2 and 3

Theorems 2 and 3 are a consequence of Theorem 13.

#### Lemma 14

Assumption 12 holds for the weights ${w}_{N}\left(i,k-i\right)={a}_{N}\left(i,k-i\right)$ and ${w}_{N}\left(i,k-i\right)={a}_{N}\left(i,k-i\right){p}_{N}^{-1}\left(i,k-i\right){q}_{N}^{-1}\left(i,k-i\right)$, $0\le i\le k\le n-1$.

#### Proof.

For all $0\le i\le k\le n-1$, ${a}_{N}\left(i,k-i\right)$ converges almost surely to $\mathbb{P}\left({Z}_{k}=i\right)$ and ${a}_{N}\left(i,k-i\right){p}_{N}^{-1}\left(i,k-i\right){q}_{N}^{-1}\left(i,k-i\right)$ to $\mathbb{P}\left({Z}_{k}=i\right){f}_{0}\left(i,k-i{\right)}^{-1}{\stackrel{ˉ}{f}}_{0}\left(i,k-i{\right)}^{-1}$. Moreover Assumption 1-(i) implies that ${f}_{0}\left(i,k-i\right)>0$ and ${\stackrel{ˉ}{f}}_{0}\left(i,k-i\right)>0$ for all $0\le i\le k\le n-1$. Using Formula (13), this in turn implies that $\mathbb{P}\left({Z}_{k}=i\right)>0$ for all $0\le i\le k\le n-1$.

#### Proof of Theorem 2

As mentioned above, the maximum likelihood estimator minimizes the contrast function ${\mathbb{W}}_{N}$ obtained with the function $G\left(p,q\right)=-plogq-\left(1-p\right)log\left(1-q\right)$ and the weights ${a}_{N}\left(i,k-i\right)$. Thus the proof of Theorem 2 consists in checking Assumptions 11 and 12 in order to apply Theorem 13. Lemma 14 implies that Assumption 12 holds.

The function $G$ considered here satisfies Assumption 11. Indeed, for $p,q\in \left(0,1\right)$, define $K\left(p,q\right)=G\left(p,q\right)-G\left(p,p\right)=plog\left(p/q\right)+\left(1-p\right)log\left(\left(1-p\right)/\left(1-q\right)\right)\phantom{\rule{thinmathspace}{0ex}}.$

Note that $K\left(p,q\right)$ is the Kullback-Leibler divergence between the Bernoulli measures with respective success probabilities $p$ and $q$, so it is well known that $K\left(p,q\right)>0$ unless $p=q$. Indeed, by Jensen’s inequality, $\begin{array}{r}K\left(p,q\right)\ge -log\left(pq/p+\left(1-p\right)\left(1-q\right)/\left(1-p\right)\right)=log1=0\phantom{\rule{thickmathspace}{0ex}},\end{array}$

and by strict concavity of the log function, equality holds only if $p=q$. Moreover, ${\mathrm{\partial }}_{2}G\left(p,q\right)=-p/q+\left(1-p\right)/\left(1-q\right)$ so ${\mathrm{\partial }}_{2}G\left(p,p\right)=0$ and ${\mathrm{\partial }}_{2}^{2}G\left(p,p\right)={p}^{-1}\left(1-p{\right)}^{-1}>0$. ■
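These properties of $G$ can be sanity-checked numerically. The sketch below (the grid of values is an illustrative assumption) verifies that $K\left(p,q\right)=G\left(p,q\right)-G\left(p,p\right)$ is positive off the diagonal and that the derivative of $G$ in its second argument vanishes at $q=p$.

```python
import numpy as np

# Numerical sanity check (not a proof) of Assumption 11 for the
# likelihood contrast G(p, q) = -p log q - (1-p) log(1-q).
def G(p, q):
    return -p * np.log(q) - (1.0 - p) * np.log(1.0 - q)

ps = np.linspace(0.05, 0.95, 19)
# K(p, q) on a grid: zero on the diagonal, positive elsewhere.
K = np.array([[G(p, q) - G(p, p) for q in ps] for p in ps])
# Central finite difference of q -> G(p, q) at q = p (should be ~0).
h = 1e-7
d2 = np.array([(G(p, p + h) - G(p, p - h)) / (2.0 * h) for p in ps])
```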

#### Proof of Theorem 3

Again, the proof consists in checking Assumptions 11 and 12 in order to apply Theorem 13. Assumption 12 holds by virtue of Lemma 14, and Assumption 11 trivially holds for the function $G\left(p,q\right)=\left(p-q{\right)}^{2}$. If ${w}_{N}\left(i,k-i\right)={p}_{N}^{-1}\left(i,k-i\right){q}_{N}^{-1}\left(i,k-i\right){a}_{N}\left(i,k-i\right)$, then $H\left({\theta }_{0}\right)=2\stackrel{¨}{\mathbb{W}}\left({\theta }_{0}\right)=4{I}_{n}\left({\theta }_{0}\right)=4\sum _{k=0}^{n-1}\sum _{i=0}^{k}\frac{\mathbb{P}\left({Z}_{k}=i\right)}{{f}_{0}\left(i,k-i\right){\stackrel{ˉ}{f}}_{0}\left(i,k-i\right)}{\stackrel{˙}{f}}_{0}\left(i,k-i\right){\left({\stackrel{˙}{f}}_{0}\left(i,k-i\right)\right)}^{\prime }\phantom{\rule{thickmathspace}{0ex}}.$(16)

It follows that ${\mathrm{\Sigma }}_{n}\left({\theta }_{0}\right)={\stackrel{¨}{\mathbb{W}}}^{-1}\left({\theta }_{0}\right)H\left({\theta }_{0}\right){\stackrel{¨}{\mathbb{W}}}^{-1}\left({\theta }_{0}\right)={I}_{n}^{-1}\left({\theta }_{0}\right)$. ■
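The identity in eq. (16) can be verified numerically. In the sketch below, the choice function of eq. (10) and the parameter values are illustrative assumptions, $\mathbb{P}\left({Z}_{k}=i\right)$ is computed by the forward Markov recursion, and ${\gamma }_{0}\left(i,k\right)={f}_{0}{\stackrel{ˉ}{f}}_{0}/\mathbb{P}\left({Z}_{k}=i\right)$ is taken as the asymptotic variance of eq. (14); the gradient formulas are those derived in Section 5.5.

```python
import numpy as np

alpha0, c0, n = 1.5, 2.0, 8   # illustrative parameter values

def f(x, y, alpha=alpha0, c=c0):
    # Choice function of eq. (10).
    a = (c + x) ** alpha
    return a / (a + (c + y) ** alpha)

def fdot(i, j):
    # Gradient of f in (alpha, c) at (alpha0, c0); closed forms of Section 5.5.
    v = f(i, j) * f(j, i)
    da = v * np.log((c0 + i) / (c0 + j))
    dc = v * alpha0 * (j - i) / ((c0 + i) * (c0 + j))
    return np.array([da, dc])

# Cross-check fdot against central finite differences.
h = 1e-6
num_da = (f(1, 2, alpha0 + h, c0) - f(1, 2, alpha0 - h, c0)) / (2 * h)
num_dc = (f(1, 2, alpha0, c0 + h) - f(1, 2, alpha0, c0 - h)) / (2 * h)

# P(Z_k = i) by the forward Markov recursion, starting from Z_0 = 0.
P = {(0, 0): 1.0}
for k in range(n - 1):
    for i in range(k + 2):
        pr = 0.0
        if i >= 1:
            pr += P[(k, i - 1)] * f(i - 1, k - i + 1)
        if i <= k:
            pr += P[(k, i)] * (1.0 - f(i, k - i))
        P[(k + 1, i)] = pr

H = np.zeros((2, 2)); Wdd = np.zeros((2, 2)); In = np.zeros((2, 2))
for k in range(n):
    for i in range(k + 1):
        pk = P[(k, i)]
        f0 = f(i, k - i)
        w0 = pk / (f0 * (1.0 - f0))       # limit of a_N / (p_N q_N)
        gamma0 = f0 * (1.0 - f0) / pk     # asymptotic variance, eq. (14)
        g = np.outer(fdot(i, k - i), fdot(i, k - i))
        H += w0 ** 2 * 4.0 * gamma0 * g   # (d12 G)^2 = 4 for G = (p-q)^2
        Wdd += w0 * 2.0 * g               # d2^2 G = 2
        In += pk / (f0 * (1.0 - f0)) * g  # Fisher information I_n
```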

If the weights are chosen as ${w}_{N}\left(i,k\right)={a}_{N}\left(i,k\right)$, then ${w}_{0}\left(i,k-i\right)=\mathbb{P}\left({Z}_{k}=i\right)$ and

$\begin{array}{rl}H\left({\theta }_{0}\right)& =4{\sum }_{k=0}^{n-1}{\sum }_{i=0}^{k}\mathbb{P}\left({Z}_{k}=i\right){f}_{0}\left(i,k-i\right){\stackrel{ˉ}{f}}_{0}\left(i,k-i\right){\stackrel{˙}{f}}_{0}\left(i,k-i\right)\left({\stackrel{˙}{f}}_{0}\left(i,k-i\right){\right)}^{\prime }\phantom{\rule{thickmathspace}{0ex}},\end{array}$(17) $\begin{array}{rl}\stackrel{¨}{W}\left({\theta }_{0}\right)& =2{\sum }_{k=0}^{n-1}{\sum }_{i=0}^{k}\mathbb{P}\left({Z}_{k}=i\right){\stackrel{˙}{f}}_{0}\left(i,k-i\right)\left({\stackrel{˙}{f}}_{0}\left(i,k-i\right){\right)}^{\prime }\phantom{\rule{thickmathspace}{0ex}}.\end{array}$(18)

## 5.5 Proofs of Corollaries 6 and 7

Corollaries 6 and 7 are a consequence of Theorem 13. The assumptions on the weights ${w}_{N}$ and on the functions $G$ have already been verified in the previous section. It remains to prove that the choice function $f$ defined in eq. (10) satisfies Assumption 1. Assumption 1-(i) is obvious.

By elementary computations, we have, for $0\le i\le k\le n-1$, $\begin{array}{rl}f\left(\alpha ,c,i,k-i\right)=f\left({\alpha }_{0},{c}_{0},i,k-i\right)& ⇔{\left(\frac{c+i}{c+k-i}\right)}^{\alpha }={\left(\frac{{c}_{0}+i}{{c}_{0}+k-i}\right)}^{{\alpha }_{0}}\\ & ⇔\frac{\alpha }{{\alpha }_{0}}=\frac{log\left({c}_{0}+i\right)-log\left({c}_{0}+k-i\right)}{log\left(c+i\right)-log\left(c+k-i\right)}\phantom{\rule{thickmathspace}{0ex}}.\end{array}$(19)

Plugging the pairs $\left(i,k\right)=\left(0,1\right)$ and $\left(i,k\right)=\left(0,2\right)$ into eq. (19) yields

$\begin{array}{r}\frac{log\left({c}_{0}\right)-log\left({c}_{0}+1\right)}{log\left(c\right)-log\left(c+1\right)}=\frac{log\left({c}_{0}\right)-log\left({c}_{0}+2\right)}{log\left(c\right)-log\left(c+2\right)}\phantom{\rule{thickmathspace}{0ex}},\end{array}$

or equivalently

$\begin{array}{r}\frac{log\left(1+1/{c}_{0}\right)}{log\left(1+2/{c}_{0}\right)}=\frac{log\left(1+1/c\right)}{log\left(1+2/c\right)}\phantom{\rule{thickmathspace}{0ex}}.\end{array}$(20)

It is easily checked that the function $x\to log\left(1+x\right)/log\left(1+2x\right)$ is strictly increasing on $\left(0,\mathrm{\infty }\right)$. Thus eq. (20) implies that $c={c}_{0}$. Plugging this equality into eq. (19) yields $\alpha ={\alpha }_{0}$. This proves Assumption 1-(ii).
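The monotonicity used to deduce $c={c}_{0}$ from eq. (20) can be illustrated numerically; the grid below is an assumption of the sketch, and the check is an illustration rather than a proof.

```python
import numpy as np

# x -> log(1+x)/log(1+2x) on a fine grid of (0, infinity); strict
# monotonicity on the grid illustrates the claim behind eq. (20).
x = np.linspace(1e-3, 100.0, 200001)
r = np.log1p(x) / np.log1p(2.0 * x)
```

The function increases from the limit 1/2 at 0 towards the limit 1 at infinity, so eq. (20) pins down $c={c}_{0}$.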

We now prove that if $n\ge 2$, the vectors $\left\{{\mathrm{\partial }}_{\alpha }f\left({\theta }_{0},i,k-i\right),0\le i\le k\le n-1\right\}$ and $\left\{{\mathrm{\partial }}_{c}f\left({\theta }_{0},i,k-i\right),0\le i\le k\le n-1\right\}$ are linearly independent in ${\mathbb{R}}^{n\left(n-1\right)}$. For $0\le i\le k\le n-1$, we have, $\begin{array}{rl}{\mathrm{\partial }}_{\alpha }f\left(\alpha ,c,i,k-i\right)& =f\left(\alpha ,c,i,k-i\right)f\left(\alpha ,c,k-i,i\right)log\left(\frac{c+i}{c+k-i}\right)\phantom{\rule{thickmathspace}{0ex}},\\ {\mathrm{\partial }}_{c}f\left(\alpha ,c,i,k-i\right)& =f\left(\alpha ,c,i,k-i\right)f\left(\alpha ,c,k-i,i\right)\frac{\alpha \left(k-2i\right)}{\left(c+i\right)\left(c+k-i\right)}\phantom{\rule{thickmathspace}{0ex}}.\end{array}$

Let $\left(u,v\right)\in {\mathbb{R}}^{2}$ and assume that for all $i,j\le n-1$ such that $i+j\le n-1$, it holds that

$\begin{array}{r}ulog\left(\frac{{c}_{0}+i}{{c}_{0}+j}\right)+v\frac{{\alpha }_{0}\left(j-i\right)}{\left({c}_{0}+i\right)\left({c}_{0}+j\right)}=0\phantom{\rule{thickmathspace}{0ex}}.\end{array}$

Replacing $\left(i,j\right)$ for instance successively by $\left(0,1\right)$ and $\left(0,2\right)$ yields

$\begin{array}{r}\left\{\begin{array}{l}ulog\left(\frac{{c}_{0}}{{c}_{0}+1}\right)+v\frac{{\alpha }_{0}}{{c}_{0}\left({c}_{0}+1\right)}=0\phantom{\rule{thickmathspace}{0ex}},\\ ulog\left(\frac{{c}_{0}}{{c}_{0}+2}\right)+v\frac{2{\alpha }_{0}}{{c}_{0}\left({c}_{0}+2\right)}=0\phantom{\rule{thickmathspace}{0ex}}.\end{array}\end{array}$

If $\left(u,v\right)\ne \left(0,0\right)$, this implies

$\begin{array}{r}\frac{{c}_{0}+2}{{c}_{0}}log\left(\frac{{c}_{0}+2}{{c}_{0}}\right)-2\frac{{c}_{0}+1}{{c}_{0}}log\left(\frac{{c}_{0}+1}{{c}_{0}}\right)=0\phantom{\rule{thickmathspace}{0ex}}.\end{array}$

By strict convexity of the function $x\to xlogx$ on $\left(0,\mathrm{\infty }\right)$, this is impossible. Thus $u=v=0$ and Assumption 1-(iii) holds.

## 5.6 Proof of Theorem 8

#### Proof of Theorem 8, case ${\alpha }_{0}=1$

In this case the model is Pólya’s urn, and we have $\begin{array}{rl}{I}_{n}\left(c\right)& =\sum _{k=0}^{n-1}\frac{1}{2c+k}\left\{\mathbb{E}\left[\frac{1}{c+{Z}_{k}}\right]+\mathbb{E}\left[\frac{1}{c+k-{Z}_{k}}\right]-\frac{4}{2c+k}\right\}\phantom{\rule{thickmathspace}{0ex}}.\end{array}$(21)

The distribution of ${Z}_{k}$ is given by $\begin{array}{r}\mathbb{P}\left({Z}_{k}=i\right)=\left(\genfrac{}{}{0em}{}{k}{i}\right)\frac{c\left(c+1\right)\cdots \left(c+i-1\right)×c\left(c+1\right)\cdots \left(c+k-i-1\right)}{2c\left(2c+1\right)\cdots \left(2c+k-1\right)}\phantom{\rule{thickmathspace}{0ex}}.\end{array}$

Thus, $\begin{array}{r}\mathbb{E}\left[\frac{1}{c+{Z}_{k}}\right]=\sum _{i=0}^{k}\left(\genfrac{}{}{0em}{}{k}{i}\right)\frac{c\left(c+1\right)\cdots \left(c+i-1\right)×c\left(c+1\right)\cdots \left(c+k-i-1\right)}{2c\left(2c+1\right)\cdots \left(2c+k-1\right)}\frac{1}{c+i}\phantom{\rule{thickmathspace}{0ex}}.\end{array}$

For any $c>0$, there exist constants $0<{C}_{1}<{C}_{2}$ such that, for all integers $h\ge 1$, $\begin{array}{r}{C}_{1}{h}^{c}\le \prod _{i=1}^{h}\left(1+c/i\right)\le {C}_{2}{h}^{c}\phantom{\rule{thickmathspace}{0ex}}.\end{array}$

Therefore, there exists a constant $C>0$ such that for all $k\ge 1$, $\begin{array}{rl}\mathbb{E}\left[\frac{1}{c+{Z}_{k}}\right]& \le C{k}^{-2}\sum _{i=1}^{k-1}{\left(\frac{i}{k}\right)}^{c-2}{\left(1-\frac{i}{k}\right)}^{c-1}=\left\{\begin{array}{ll}O\left({k}^{-1}\right)& \text{if }c>1\phantom{\rule{thickmathspace}{0ex}},\\ O\left({k}^{-1}logk\right)& \text{if }c=1\phantom{\rule{thickmathspace}{0ex}},\\ O\left({k}^{-c}\right)& \text{if }c<1\phantom{\rule{thickmathspace}{0ex}}.\end{array}\end{array}$

In all three cases, the first series in eq. (21) is summable. By symmetry, the series formed by the second expectations is also summable, and so is the series of the terms $4/\left(2c+k{\right)}^{2}$. ■
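The two ingredients of this proof, the explicit distribution of ${Z}_{k}$ and the growth of the product $\prod _{i=1}^{h}\left(1+c/i\right)$, can be checked numerically; the values of $c$ in the sketch below are illustrative assumptions.

```python
import math

def prob(c, k, i):
    # P(Z_k = i) for the Polya urn started from (c, c), as displayed above.
    v = float(math.comb(k, i))
    for j in range(i):
        v *= (c + j)
    for j in range(k - i):
        v *= (c + j)
    for j in range(k):
        v /= (2.0 * c + j)
    return v

def mean_inv(c, k):
    # E[1/(c + Z_k)], the expectation appearing in eq. (21).
    return sum(prob(c, k, i) / (c + i) for i in range(k + 1))

def ratio(c, h):
    # prod_{i=1}^h (1 + c/i) / h^c, computed via log-Gamma:
    # prod = Gamma(h+1+c) / (Gamma(h+1) * Gamma(1+c)).
    lp = math.lgamma(h + 1 + c) - math.lgamma(h + 1) - math.lgamma(1 + c)
    return math.exp(lp - c * math.log(h))
```

For $c=1$ the urn is uniform, $\mathbb{P}\left({Z}_{k}=i\right)=1/\left(k+1\right)$, and the ratio $\prod \left(1+c/i\right)/{h}^{c}$ stays bounded away from 0 and infinity, as the constants ${C}_{1},{C}_{2}$ require.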

#### Proof of Theorem 8, case ${\alpha }_{0}<1$

In this case, we know by Theorem 4 that ${Z}_{n}/n$ converges almost surely to 1/2. This implies that $f\left(\theta ,{Z}_{n},n-{Z}_{n}\right)$ converges almost surely to 1/2 for all $\theta$. By Cesàro’s lemma, this implies that ${n}^{-1}{\mathrm{\ell }}_{n}\left(\theta \right)\to -log2$ a.s. ■
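This limit can be observed on a simulated path; the parameter values and the seed below are illustrative assumptions, using the choice function of eq. (10).

```python
import math
import random

# One long path of the urn with alpha < 1: Z_n/n approaches 1/2, so the
# normalized log-likelihood n^{-1} l_n(theta) approaches -log 2 at theta_0.
random.seed(12345)
alpha, c, n = 0.5, 1.0, 20000   # illustrative values, alpha < 1
z, ll = 0, 0.0
for k in range(n):
    u = (c + z) ** alpha
    p = u / (u + (c + k - z) ** alpha)   # probability of drawing color 1
    if random.random() < p:
        z += 1
        ll += math.log(p)
    else:
        ll += math.log(1.0 - p)
```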

#### Proof of Theorem 8, case ${\alpha }_{0}>1$

Let ${\mathrm{\Omega }}_{1}$ be the event that color 1 is eventually selected, which happens with probability 1/2 by Theorem 4. Then, on ${\mathrm{\Omega }}_{1}$, ${Z}_{n}/n\to 1$ and if $k>{T}_{\mathrm{\infty }}$, then ${X}_{k+1}=1$ and ${Z}_{k}=k-{Q}_{\mathrm{\infty }}$. Thus for large enough $n$, the log-likelihood on one path becomes $\begin{array}{r}{\mathrm{\ell }}_{n}\left(\theta \right)=\sum _{k=0}^{{T}_{\mathrm{\infty }}}{X}_{k+1}logf\left(\theta ,{Z}_{k},k-{Z}_{k}\right)+\left(1-{X}_{k+1}\right)log\left\{1-f\left(\theta ,{Z}_{k},k-{Z}_{k}\right)\right\}+\sum _{k={T}_{\mathrm{\infty }}+1}^{n}logf\left(\theta ,k-{Q}_{\mathrm{\infty }},{Q}_{\mathrm{\infty }}\right)\phantom{\rule{thickmathspace}{0ex}}.\end{array}$

As $k\to \mathrm{\infty }$, for any $\alpha >0$, $\begin{array}{r}logf\left(\theta ,k-{Q}_{\mathrm{\infty }},{Q}_{\mathrm{\infty }}\right)=-log\left\{1+\frac{\left(c+{Q}_{\mathrm{\infty }}{\right)}^{\alpha }}{\left(c+k-{Q}_{\mathrm{\infty }}{\right)}^{\alpha }}\right\}\sim -\frac{\left(c+{Q}_{\mathrm{\infty }}{\right)}^{\alpha }}{\left(c+k-{Q}_{\mathrm{\infty }}{\right)}^{\alpha }}\phantom{\rule{thickmathspace}{0ex}}.\end{array}$

If $\alpha \le 1$ the series is divergent and thus $\underset{n\to \mathrm{\infty }}{lim}{\mathrm{\ell }}_{n}\left(\theta \right)=-\mathrm{\infty }$. If $\alpha >1$ then the series is convergent and thus, on ${\mathrm{\Omega }}_{1}$, $\begin{array}{rl}\underset{n\to \mathrm{\infty }}{lim}{\mathrm{\ell }}_{n}\left(\theta \right)& =\sum _{k=0}^{\mathrm{\infty }}{X}_{k+1}logf\left(\theta ,{Z}_{k},k-{Z}_{k}\right)+\left(1-{X}_{k+1}\right)log\left\{1-f\left(\theta ,{Z}_{k},k-{Z}_{k}\right)\right\}\\ & =\sum _{k=0}^{{T}_{\mathrm{\infty }}}{X}_{k+1}logf\left(\theta ,{Z}_{k},k-{Z}_{k}\right)+\left(1-{X}_{k+1}\right)log\left\{1-f\left(\theta ,{Z}_{k},k-{Z}_{k}\right)\right\}\\ & +\sum _{{T}_{\mathrm{\infty }}+1}^{\mathrm{\infty }}logf\left(\theta ,k-{Q}_{\mathrm{\infty }},{Q}_{\mathrm{\infty }}\right)\phantom{\rule{thickmathspace}{0ex}}.\end{array}$

This implies that $arg\underset{\theta \in \mathrm{\Theta }}{max}{\mathrm{\ell }}_{n}\left(\theta \right)=arg\underset{\theta \in \mathrm{\Theta },\alpha >1}{max}{\mathrm{\ell }}_{n}\left(\theta \right)$ and that this argmax is a random variable which is a function of the whole path and does not depend on the true value ${\theta }_{0}$. ■

## Acknowledgements

We thank Guy Theraulaz and Hugues Chaté for providing their material framework and their field expertise, which allowed the first author to collect the data of the Argentine ants experiments. These experiments are part of the project TRACES supported by the CNRS. They were done during two visits of the first author, in April and July 2012, to the Centre de Recherches sur la Cognition Animale (CRCA, UMR 5169, Paul Sabatier University, Toulouse), whose hospitality is gratefully acknowledged. Line C. Le Goff has been supported by a grant from the French Ministry of Education, Research and Technology and then by a Fyssen Foundation post-doctoral grant.

## References

• 1.

Pólya G. Sur quelques points de la théorie des probabilités. Ann IHP 1931;1:117–61.

• 2.

Deneubourg JL, Aron S, Goss S, Pasteels J. The self-organizing exploratory pattern of the argentine ant. J Insect Behav 1990;3:159–68.

• 3.

Tarrès P. Localization of reinforced random walks, 2011. Available at: http://arxiv.org/abs/1103.5536, arXiv:1103.5536.

• 4.

Davis B. Reinforced random walk. Probab Theory Related Fields 1990;84:203–29.

• 5.

Vittori K, Talbot G, Gautrais J, Fourcassié V, Araujo A, Theraulaz G. Path efficiency of ant foraging trails in an artificial network. J Theor Biol 2006;239:507–15.

• 6.

Garnier S, Guérécheau A, Combe M, Fourcassié V, Theraulaz G. Path selection and foraging efficiency in argentine ant transport networks. Behav Ecol Sociobiol 2009;63:1167–79.

• 7.

Thienen W, Metzler D, Choe DH, Witte V. Pheromone communication in ants: a detailed analysis of concentration-dependent decisions in three species. Behav Ecol Sociobiol 2014;68:1611–27.

• 8.

Khanin K, Khanin R. A probabilistic model for the establishment of neuron polarity. J Math Biol 2001;42:26–40.

• 9.

Jeanson R, Ratnieks F, Deneubourg JL. Pheromone trail decay rates on different substrates in the pharaohs ant, monomorium pharaonis. Physiol Entomol 2003;28:192–8.

• 10.

Pemantle R. A survey of random processes with reinforcement. Probab Surv 2007;4:1–79.

• 11.

Arganda S, Nicolis S, Perochain A, Péchabadens C, Latil G, Dussutour A. Collective choice in ants: The role of protein and carbohydrates ratios. J Insect Physiol 2014;69:19–26.

• 12.

Aron S, Deneubourg JL, Goss S, Pasteels J. Functional self-organisation illustrated by inter-nest traffic in ants: the case of the Argentine ant. In: Alt W, Hoffmann G, editors. Biological motion: Proceedings of a workshop held in Königswinter, Germany, March 16–19, 1989. Berlin, Heidelberg: Springer; 1990:533–47.

• 13.

Beckers R, Deneubourg JL, Goss S. Modulation of trail laying in the ant lasius niger (hymenoptera: formicidae) and its role in the collective selection of a food source. J Insect Behav 1993;6:751–9.

• 14.

Dussutour A, Deneubourg JL, Fourcassié V. Amplification of individual preferences in a social context: the case of wall-following in ants. Proc R Soc London Ser B 2005;272:705–14.

• 15.

Nicolis S, Deneubourg JL. Emerging patterns and food recruitment in ants: an analytical study. J Theor Biol 1999;198:575–92.

• 16.

Nicolis S, Dussutour A. Self-organization, collective decision making and resource exploitation strategies in social insects. Eur Phys J B 2008;65:379–85.

• 17.

Freedman D. Bernard Friedman’s urn. Ann Math Stat 1965;36:956–70.

• 18.

Beckers R, Deneubourg JL, Goss S. Trails and u-turns in the selection of a path by the ant lasius niger. J Theor Biol 1992;159:397–415.

• 19.

Pasteels J, Deneubourg JL, Goss S. Transmission and amplification of information in a changing environment: the case of insect societies. In: Prigogine I, Sanglier M, editors. Laws of nature and human conduct: Specificities and unifying themes. Brussels: Gordes; 1987.

• 20.

Wasserman L. All of statistics: a concise course in statistical inference. New York: Springer; 2004.

• 21.

Van Vorhis Key S, Baker T. Trail-following responses of the argentine ant, iridomyrmex humilis (mayr), to a synthetic trail pheromone component and analogs. J Chem Ecol 1982;8:3–14.

• 22.

Aron S, Pasteels J, Deneubourg JL. Trail-laying behaviour during exploratory recruitment in the argentine ant, iridomyrmex humilis (mayr). Biol Behav 1989;14:207–17.

• 23.

Le Goff LC, Raimond O. Vertex reinforced non-backtracking random walks: an example of path formation, 2015. Available at: http://arxiv.org/abs/1506.01239, arXiv:1506.01239.

• 24.

Dacunha-Castelle D, Duflo M. Probability and statistics. vol. II. New York: Springer; 1986.

Published Online: 2017-03-25

Citation Information: The International Journal of Biostatistics, Volume 13, Issue 1, 20160029, ISSN (Online) 1557-4679.


© 2017 Walter de Gruyter GmbH, Berlin/Boston.