We show that counts of squarefree integers up to X in short intervals of size H tend to a Gaussian distribution as long as $H \to \infty$ and $H = X^{o(1)}$. This answers a question posed by R. R. Hall in 1989. More generally, we prove a variant of Donsker’s theorem, showing that these counts scale to a fractional Brownian motion with Hurst parameter $1/4$. In fact, we are able to prove that these results hold in general for collections of B-free integers as long as the sieving set B satisfies a very mild regularity property, with the Hurst parameter varying with the set B.
1.1 Statistics of counts of squarefrees
Let $S$ be the set of squarefree natural numbers (that is, natural numbers without a repeated prime factor; by convention, we include $1 \in S$). We write
$$N_S(x) = \#\{n \le x : n \in S\}$$
for the number of squarefrees no more than x. It is well known that $N_S(x) \sim \frac{6}{\pi^2} x$; thus the squarefrees have asymptotic density $6/\pi^2$. Our purpose in this note is to investigate their distribution at a finer scale. In particular, we will investigate the distribution of squarefrees in a random interval $(n, n+H]$, where n is an integer chosen uniformly at random from 1 to X, with .
If H is fixed and does not grow with X, at most squarefrees can lie in such an interval. Their distribution as is slightly complicated but completely understood; it may be described by Hardy–Littlewood type correlations which can be derived from elementary sieve theory (see [32, 31]). Or, more abstractly, the distribution of squarefrees in an interval of size can be described by a non-weakly mixing stationary ergodic process (see [7, 43]).
For H tending to infinity with X, matters become at once simpler and more difficult; simpler because some of the irregularities in the distribution just described are smoothed out at this scale, but more difficult in that natural conjectures become more difficult to prove.
be the count of squarefrees in the interval . R. R. Hall [16, 17] was the first to investigate the distribution of this count when H grows with X. In [16, Corollary 1], Hall proved that the variance of the number of squarefrees is of order if H is not too large with respect to X. More exactly, as , we have
as long as with .
Keating and Rudnick  studied this problem in a function field setting, connecting it with Random Matrix Theory, and suggested based on this that (1.1) will hold for . The best known result is , where it is shown that (1.1) holds for unconditionally and on the Lindelöf Hypothesis. In fact, in , it is shown that even an upper bound of order for for all would already imply the Riemann Hypothesis.
Because the count of squarefrees in such an interval is on average of order H, one might naively have expected the variance to also be of order H. That the variance is of order $\sqrt{H}$ speaks to the fact that the squarefrees are a rather rigid sequence. This can be discerned even visually in comparison, for instance, to the primes (Figure 1), and we will return to give a more exact description of it in Section 1.3.
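The contrast between a variance of order $\sqrt{H}$ and one of order H can be observed numerically. The following is a minimal sketch and plays no role in the proofs: it sieves the squarefrees up to X + H, counts them in each interval (n, n+H] for 1 ≤ n ≤ X, and prints the empirical mean and variance next to $6/\pi^2$ and $\sqrt{H}$. The half-open interval convention, the function names, and the values X = 200000 and H = 400 are assumptions made for illustration.

```python
import math

def squarefree_indicator(limit):
    """Return flag[0..limit] with flag[n] = 1 if n is squarefree, else 0."""
    flag = [1] * (limit + 1)
    flag[0] = 0
    for d in range(2, math.isqrt(limit) + 1):
        square = d * d
        # Every multiple of d^2 has a repeated prime factor.
        for m in range(square, limit + 1, square):
            flag[m] = 0
    return flag

def interval_counts(flag, X, H):
    """Counts of squarefrees in (n, n + H] for n = 1, ..., X, via prefix sums."""
    prefix = [0] * len(flag)
    for i in range(1, len(flag)):
        prefix[i] = prefix[i - 1] + flag[i]
    return [prefix[n + H] - prefix[n] for n in range(1, X + 1)]

def mean_and_variance(values):
    """Empirical mean and (population) variance of a list of numbers."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, var

# Illustrative parameters (assumptions, not taken from the paper).
X, H = 200_000, 400
flag = squarefree_indicator(X + H)
counts = interval_counts(flag, X, H)
mean, var = mean_and_variance(counts)

print("mean / H  :", mean / H, " vs 6/pi^2 =", 6 / math.pi ** 2)
print("variance  :", var, " vs sqrt(H) =", math.sqrt(H), " and H =", H)
```

For comparison, independent coin flips with success probability $6/\pi^2$ would give a variance proportional to H rather than to $\sqrt{H}$.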
In , Hall studied higher moments of counts of squarefrees in short intervals
where k is a positive integer, proving the upper bound . Various authors have asked whether this can be refined, with the most recent result,
for any as long as , being due to Nunes . For H in the range considered, this is an optimal upper bound up to the factor of . In , extensive numerical evidence is presented that suggests that these moments are in fact Gaussian.
Our first main result confirms this conjecture.
For , as ,
for every positive integer k, where if k is even and if k is odd. Here is an explicit constant.
Thus if and for some , we have
Note that the main term is the k-th moment of a centered Gaussian random variable with variance .
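For reference, the k-th moment of a centered Gaussian random variable with standard deviation σ is $1 \cdot 3 \cdots (k-1)\, \sigma^k$ for even k and 0 for odd k. The sketch below evaluates this closed form and compares it with a direct numerical integration against the Gaussian density. The function names, the midpoint rule, and the cutoff at 12 standard deviations are choices made for this illustration only.

```python
import math

def gaussian_moment(k, sigma):
    """k-th moment of a centered Gaussian with standard deviation sigma."""
    if k % 2 == 1:
        return 0.0
    # 1 * 3 * ... * (k - 1), which is the empty product 1 when k = 0.
    return math.prod(range(1, k, 2)) * sigma ** k

def gaussian_moment_numeric(k, sigma, steps=20000, cutoff=12.0):
    """Midpoint-rule approximation of the integral of t^k against the N(0, sigma^2) density."""
    left = -cutoff * sigma
    width = 2 * cutoff * sigma / steps
    total = 0.0
    for i in range(steps):
        t = left + (i + 0.5) * width
        total += t ** k * math.exp(-t * t / (2 * sigma * sigma))
    return total * width / (sigma * math.sqrt(2 * math.pi))

for k in range(1, 7):
    print(k, gaussian_moment(k, 1.5), gaussian_moment_numeric(k, 1.5))
```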
If , then for any fixed k, we have that H satisfies for some for sufficiently large X. Hence, by the moment method (see [4, Section 30]), we obtain the following result.
Then, for any ,
That is, the centered, normalized counts tend in distribution to a Gaussian random variable.
Gaussian limit theorems are known for the sums over short intervals of several important arithmetic functions (for instance divisor functions (see ) and the sums-of-squares representation function r (see )), and Gaussian limit theorems for counts of primes in short intervals are known under the assumption of strong versions of the Hardy–Littlewood conjectures , but Theorem 1.2 seems to be the first instance of an unconditional proof of Gaussian behavior for short interval counts of a non-trivial, natural number theoretic sequence.
[Figure 1: the squarefrees compared with the primes]
Hall in  asked also about the order of magnitude of the absolute moments
and as a standard corollary of Theorems 1.1 and 1.2, we obtain an asymptotic formula for these.
For fixed , let satisfy
We give a quick derivation in the language of probability. By [4, Theorem 25.12] and the subsequent corollary, if is a sequence of random variables tending in distribution to a random variable Y and for some , then . Having chosen the function , for each X, let be chosen randomly and uniformly, and define the random variables . Then, as , we have that tends in distribution to for G a standard normal random variable, by Theorem 1.2. Moreover, Theorem 1.1 implies for any even integer that . Thus the result follows by computing via calculus. ∎
For a given result relating to the behavior of an arithmetic function in short intervals, it is natural to consider the analogous problem in a short arithmetic progression. For example, in analogy to the quantity for and slowly growing, one might consider the quantity
where is chosen so large that (which corresponds in this context to H) is only slowly growing, and is a specified residue class. When a is coprime to q, the expected size of is
and one can define the analogous moments
As noted by Nunes [37, Sections 1.2, 3.2], one may reduce the estimation of to a quantity that is very similar to what is obtained in the course of estimating , making the analysis of the problem for arithmetic progressions nearly identical to that of short intervals. As such, one could very similarly obtain an arithmetic progression analogue of the Gaussian limit theorem, Theorem 1.2, if desired. However, unlike the short interval problem, the problem in progressions does not seem to admit a nice interpretation in the language of fractional Brownian motion (see Section 1.3). We have therefore chosen to focus on the short interval problem in this paper in order to avoid making this paper even longer.
It is natural to write our proofs in the more general setting of B-free numbers. We recall their definition shortly, but first we fix some notation.
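For a sequence J of natural numbers, we will write $\mathbf{1}_J$ for the indicator function of J, and
$$N_J(x) = \sum_{n \le x} \mathbf{1}_J(n)$$
for the count of elements of J no more than x.
We say that a sequence J is of index $\alpha \in [0,1]$ if $N_J(x) = x^{\alpha + o(1)}$.
A function L defined, finite, positive, and measurable on $[X_0, \infty)$ for some $X_0 > 0$ is said to be slowly varying if, for all $a > 0$,
$$\lim_{x \to \infty} \frac{L(ax)}{L(x)} = 1.$$
A sequence J is said to be regularly varying if $N_J(x) \sim x^{\alpha} L(x)$ for some $\alpha \in [0,1]$ and some slowly varying function L.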
For instance, the function is slowly varying for any . For any slowly varying function L, it is necessary that , but this condition is not sufficient. Clearly, in the definition above, α will be the index of the regularly varying sequence J. Further information on regularly varying sequences can be found in [41, Chapter 4.1].
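The standard definition of slow variation asks that L(ax)/L(x) tend to 1 as x grows, for every fixed a > 0. As a small numerical illustration, the sketch below compares the ratio for $L(x) = (\log x)^2$, which is slowly varying, with the ratio for $x^{0.1}$, which is not: its ratio stays at $a^{0.1}$. Both test functions and the choice a = 10 are examples picked here, not taken from the text.

```python
import math

def ratio(L, a, x):
    """The quantity L(a x) / L(x) whose limit as x grows defines slow variation."""
    return L(a * x) / L(x)

def log_squared(x):
    """A slowly varying example: (log x)^2."""
    return math.log(x) ** 2

def small_power(x):
    """Not slowly varying: x^0.1 is regularly varying of index 0.1."""
    return x ** 0.1

for x in (1e3, 1e6, 1e12, 1e24, 1e100):
    print(x, ratio(log_squared, 10.0, x), ratio(small_power, 10.0, x))
```

The first column of ratios drifts towards 1 only slowly, which reflects how weak the condition is.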
Fix a non-empty subset of pairwise coprime integers with ; we call such a set a sieving set. We say that a positive integer is B-free if it is indivisible by every element of B. For instance, if , B-frees are nothing but squarefrees. Another studied example is for some , for which B-frees are the m-th-power free numbers, i.e., integers indivisible by an m-th power of a prime. The notion of B-frees was introduced by Erdős , who was motivated by Roth’s work  on gaps between squarefrees. (See also [6, 22] for the closely related notion of convergent sieves. We also mention that  upper bounded the quantity mentioned in Remark 1.4.)
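The definition of B-free integers translates directly into a sieve. The following is a minimal sketch for sieving sets of the form $\{p^m : p \text{ prime}\}$, so that m = 2 gives the squarefrees and m = 3 the cube-free numbers. The function names are hypothetical, and the sieving set is truncated to its elements not exceeding the limit, which loses nothing because larger elements divide no integer in range.

```python
import math

def prime_power_sieving_set(limit, m):
    """Elements p**m <= limit of B = {p**m : p prime}, for an exponent m >= 2."""
    elements = []
    p = 2
    while p ** m <= limit:
        # Trial division is enough here because p stays small.
        if all(p % q for q in range(2, math.isqrt(p) + 1)):
            elements.append(p ** m)
        p += 1
    return elements

def b_free_flags(limit, sieving_set):
    """flag[n] is True exactly when no element of sieving_set divides n (1 <= n <= limit)."""
    flag = [True] * (limit + 1)
    flag[0] = False
    for b in sieving_set:
        for multiple in range(b, limit + 1, b):
            flag[multiple] = False
    return flag

limit = 100_000
squarefree = b_free_flags(limit, prime_power_sieving_set(limit, 2))
cubefree = b_free_flags(limit, prime_power_sieving_set(limit, 3))
print("squarefree density:", sum(squarefree) / limit)
print("cube-free density :", sum(cubefree) / limit)
```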
We write for the indicator function of B-free integers, and for the number of B-free integers . For the sets B we are considering here, it is known (see, e.g., [10, Theorem 4.1]) that
so that B-frees have asymptotic density .
We write for the multiplicative semigroup generated by B, that is the set of positive integers that can be written as a product of (possibly repeated) elements of B. By convention, , as 1 arises from the empty product. (For instance, if , then .)
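To make the semigroup concrete, the sketch below enumerates all products of generators up to a bound, including the empty product 1. For the generators $p^2$ it returns the perfect squares, so the count up to x is about $x^{1/2}$. The function name and the breadth-first enumeration are choices made for this illustration.

```python
import math

def semigroup_up_to(generators, limit):
    """Sorted list of all products (with repetition) of generators that are <= limit.

    The empty product 1 is always included.
    """
    gens = sorted(g for g in generators if 1 < g <= limit)
    elements = {1}
    frontier = [1]
    while frontier:
        next_frontier = []
        for x in frontier:
            for g in gens:
                y = x * g
                if y > limit:
                    break  # gens is sorted, so later generators overshoot too
                if y not in elements:
                    elements.add(y)
                    next_frontier.append(y)
        frontier = next_frontier
    return sorted(elements)

# Squares of the primes up to 100, used as generators.
prime_squares = [p * p for p in range(2, 101)
                 if all(p % q for q in range(2, math.isqrt(p) + 1))]
limit = 10_000
elements = semigroup_up_to(prime_squares, limit)
print(len(elements), "elements up to", limit, "; sqrt(limit) =", math.sqrt(limit))
```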
We introduce an arithmetic function , analogous to the Möbius function μ, defined by
Observe that and are multiplicative and relate via the multiplicative convolution
We denote by the subset of of elements n satisfying , i.e., those such that no divides n twice. (For instance, if , then we have .) We will often use without mention the fact that both and are closed under and , or equivalently, they are sublattices of the positive integers with respect to these two operations.
1.2.1 Variance and moments
Let be the count of B-frees in an interval . We consider the moments
If the sequence has index , then for each fixed positive integer k,
exists, and moreover, for any , we have
We will describe an explicit formula for later.
When , we have the following general result for the variance, building on ideas of Hausman and Shapiro  and Montgomery and Vaughan .
If the sequence is of index , then .
If is in addition a regularly varying sequence, then we prove an asymptotic formula for .
If the sequence is regularly varying with index , then
This generalizes a result of Avdeeva  which requires more robust assumptions about . It also gives a new proof for (1.1) that does not use contour integration and is essentially elementary.
In fact, we do not need an asymptotic formula for the variance to prove that the moments are Gaussian.
If has index , then
for every positive integer k. Here c is a constant depending only on α.
It is evident that Proposition 1.8 and Theorem 1.10 recover the moment estimate Theorem 1.1 for squarefrees. Moreover, for the same reasons as given for the central limit theorem there, we have the following theorem.
If the sequence has index , then for any ,
Combining Theorem 1.10 and Proposition 1.7, we see that, for each k, the moments will be asymptotically Gaussian as long as with
In Section 6.2, we give a further application of Theorem 1.10 to estimates for the frequency of long gaps between consecutive B-free numbers, improving results of Plaksin  and Matomäki .
1.3 Fractional Brownian motion
We have mentioned that the squarefrees and more generally B-frees in a random interval with are governed by a stationary ergodic process if H remains fixed. The number of B-frees in such an interval is on average. This process has measure-theoretic entropy which becomes smaller the larger H is chosen to be, and thus does not seem very “random”. This may be compared to primes in short intervals , which contain on average λ primes. In such intervals, primes are conjectured to be distributed as a Poisson point process, and thus appear very “random”.
Nonetheless, a glance at Figure 1 comparing squarefrees to primes – along with consideration of the central limit theorems we have just discussed – reveals that, at a scale of , B-frees still retain some degree of randomness. It turns out that there is a natural framework to describe the “random” behavior of B-frees at this scale (analogous to the Poisson process for primes above), and this is fractional Brownian motion.
We give here a short introduction to fractional Brownian motion, as we believe this perspective sheds substantial light on the distribution of B-frees; however, the remainder of the paper is arranged so that a reader only interested in the central limit theorems of the previous sections can avoid this material.
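A random process $\{Z(t) : t \in [0,1]\}$ is said to be a fractional Brownian motion with Hurst parameter $\gamma \in (0,1)$ if Z is a continuous-time Gaussian process which satisfies $Z(0) = 0$ and also satisfies $\mathbb{E}\, Z(t) = 0$ for all $t \in [0,1]$ and has covariance function
$$\mathbb{E}\, Z(t) Z(s) = \tfrac{1}{2}\left( |t|^{2\gamma} + |s|^{2\gamma} - |t - s|^{2\gamma} \right) \qquad (1.3)$$
for all $s, t \in [0,1]$.
Using $Z(0) = 0$, it is easy to see the covariance condition (1.3) is equivalent to
$$\mathbb{E}\, |Z(t) - Z(s)|^2 = |t - s|^{2\gamma}.$$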
For a proof that such a stochastic process exists and is uniquely defined by this definition, see e.g. .
Classical Brownian motion is a fractional Brownian motion with a Hurst parameter of $1/2$. If the Hurst parameter exceeds $1/2$, increments of the process are positively correlated, with a rise likely to be followed by another rise, while if the Hurst parameter is less than $1/2$, increments of the process are negatively correlated.
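The sign of the correlation between increments can be read off from the standard covariance function of fractional Brownian motion, $\tfrac{1}{2}(|s|^{2h} + |t|^{2h} - |t-s|^{2h})$ for Hurst parameter h. The sketch below computes the covariance of the two consecutive unit increments Z(1) - Z(0) and Z(2) - Z(1), which works out to $2^{2h-1} - 1$. The function names and the symbol h are assumptions of this illustration.

```python
def fbm_covariance(s, t, hurst):
    """Standard covariance E[Z(s) Z(t)] of fractional Brownian motion."""
    p = 2.0 * hurst
    return 0.5 * (abs(s) ** p + abs(t) ** p - abs(t - s) ** p)

def consecutive_increment_covariance(hurst):
    """Covariance of Z(1) - Z(0) and Z(2) - Z(1), expanded by bilinearity."""
    def c(s, t):
        return fbm_covariance(s, t, hurst)
    return c(1.0, 2.0) - c(1.0, 1.0) - c(0.0, 2.0) + c(0.0, 1.0)

for hurst in (0.25, 0.5, 0.75):
    print(hurst, consecutive_increment_covariance(hurst))
```

The value is negative for h = 1/4, zero for h = 1/2 (independent increments), and positive for h = 3/4.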
Donsker’s theorem is a classical result in probability theory showing that a random walk with independent increments scales to Brownian motion (see [5, Section 8]). We prove an analogue of Donsker’s theorem for counts of B-frees using the following setup. We select a starting point uniformly at random and define the random variables in terms of n by
where denotes the fractional part.
For integer τ, this is a random walk which increases on B-frees and decreases otherwise, and for non-integer τ, the function linearly interpolates between values; thus is continuous.
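A walk of this kind can be sketched for the squarefrees as follows. The precise formula is an assumption of this sketch: each integer step u adds the indicator of u being squarefree minus the density $6/\pi^2$, so the walk rises by $1 - 6/\pi^2$ on squarefrees and falls by $6/\pi^2$ otherwise, and non-integer times are handled by linear interpolation. No rescaling is applied, and the function names are hypothetical.

```python
import math

DENSITY = 6 / math.pi ** 2  # asymptotic density of the squarefrees

def is_squarefree(u):
    """Trial-division test that no square d^2 with d >= 2 divides u."""
    d = 2
    while d * d <= u:
        if u % (d * d) == 0:
            return False
        d += 1
    return True

def step(u):
    """Centered increment contributed by the integer u (assumed form)."""
    return (1.0 if is_squarefree(u) else 0.0) - DENSITY

def walk(n, tau):
    """Value at time tau >= 0 of the interpolated walk started just after n."""
    whole = math.floor(tau)
    frac = tau - whole
    total = sum(step(u) for u in range(n + 1, n + whole + 1))
    # Linear interpolation towards the next integer time.
    return total + frac * step(n + whole + 1)

n = 10
for tau in (0, 1, 2, 2.5, 3):
    print(tau, walk(n, tau))
```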
and choose a random integer at uniform. Suppose that is a regularly varying sequence of index , and define the function
where is defined by (1.2). Then, as a random element of , the function converges in distribution to a fractional Brownian motion with Hurst parameter as .
Our proof of this result follows similar ideas as the proof of Theorem 1.10.
Note that , so only a fractional Brownian motion with negatively correlated increments can be induced this way. It would be very interesting to understand functional limit theorems of this sort in the context of ergodic processes related to the B-frees described in e.g. [3, 8, 9, 10, 24, 27, 30]. There seem to exist only a few other constructions in the literature of a fractional Brownian motion as the limit of a discrete model, e.g. [1, 11, 18, 38, 44].
1.4 Notation and conventions
Throughout the rest of this paper, we allow the implicit constants in $\ll$ and $O(\cdot)$ to depend on k when considering a k-th moment, and implicit constants are always for a fixed sieving set B. Later, we will introduce a weight function φ, and implicit constants depend on φ as well. Throughout the paper, where B has an index α, for simplicity, we will assume that $\alpha \in (0,1)$. Some proofs would remain correct if $\alpha = 0$ or 1, but the proofs of our central results would not be. We use the notation as a subscript in some sums to mean . In general, we follow standard conventions; in particular, $e(x) = e^{2\pi i x}$, $\|x\|$ denotes the distance of $x$ to the nearest integer, $(a,b,c)$ is the greatest common divisor of a, b, and c, while $[a,b,c]$ is the least common multiple.
1.5 The structure of the proof
There is a heuristic way to understand the Gaussian variation of . Note that
The contribution of the first summand is close to the value around which oscillates. On the other hand, the functions are mean-zero functions of period d and thus are linear combinations of terms for . Upon reducing the fractions by the maximal divisor with , we see that
is approximated by a linear combination of terms for B-free and .
Heuristically, if X is large and is chosen uniformly at random, one may expect the terms to behave like a collection of independent random variables, and this would imply the Gaussian oscillation of .
Nonetheless, we do not quite have independence; instead, roughly speaking, one may use the same Fourier decomposition to relate the k-th moment to (weighted) counts of solutions to the equation
where for all i. Indeed, the realization that for the squarefrees are related to these counts appears already in .
It can be seen that Gaussian behavior will then follow from most solutions to the above equation being diagonal, meaning there is some pairing for all i, and this is what we demonstrate. Our main tool is the Fundamental Lemma and its extensions (see Lemmas 2.6, 4.1, and 5.5), developed by Montgomery–Vaughan  and later used by Montgomery–Soundararajan  to prove a conditional central limit theorem for primes in short intervals.
However, this strategy if used by itself is not sufficient to prove a central limit theorem; the Fundamental Lemma was already known to Hall who used it to obtain his upper bound . The reason this strategy does not work as it did for Montgomery–Soundararajan is that has size roughly ; for primes, the k-th moment of a short interval count is (conditionally) much larger. Thus, in order to recover a main term, our error terms must be shown to be substantially smaller than for the primes.
We obtain an upper bound for the number of off-diagonal solutions to (1.5) by bringing in two ideas in addition to those used by Montgomery–Soundararajan. The first is due to Nunes, who showed in the recent paper  that solutions to (1.5) make a contribution to only when is larger than for all i and the least common multiple of the is larger than ; this argument appears in Section 5.1. The second is an observation that bounding a term which appears in a variant of the Fundamental Lemma (Lemma 5.5) requires a more delicate treatment than that which appears in [34, Lemma 8]. However, it can be accomplished in our context by counting solutions to a certain congruence equation, which in turn can be estimated using an averaging argument relying crucially on the Pólya–Vinogradov inequality; this is done in Section 5.2. See Remark 5.10 for further discussion of how the more direct, but less quantitatively precise, approach of [34, Lemma 8] is insufficient in our context.
In order to prove that counts of B-frees scale to a fractional Brownian motion, it is not sufficient to consider the flat counts , but instead, we must consider the weighted counts
where φ belongs to a class of functions that includes step functions. The proof of a central limit theorem for flat counts carries over essentially unchanged to these weighted counts as long as φ is of bounded variation and compactly supported in . Convergence to a fractional Brownian motion, as a corollary of this central limit theorem, is discussed in Section 7.
2 An expression for moments
In this section, we prove Proposition 1.7, giving an expression for . The results proved in this section all suppose that the sieving set B is such that has index α. (We do not yet need to suppose that is regularly varying.)
In order to eventually have a nice framework to prove Theorem 1.14 on fractional Brownian motion, we generalize the moments we consider. Suppose is a bounded function supported in a compact subset of . Define by
For , this recovers as defined above. We have tried to write this paper so that a reader interested only in the more traditional Theorem 1.11 can read it with this specialization in mind.
We introduce the arithmetic function defined by
Given a bounded function supported on a compact subset , we also define the 1-periodic function
where, as usual, for . Finally, given , we denote by the subset of the group given by
If the sequence has index and φ is of bounded variation and supported in a compact subset of , then for all fixed integers and any , as long as ,
where is defined by the absolutely convergent sum
Obviously, Proposition 2.1 implies Proposition 1.7 by setting .
Our ultimate goal will be to prove generalizations of Theorems 1.1 and 1.10, showing that the moments and have Gaussian asymptotics for general φ. This will be the content of Theorems 6.1 and 6.2.
2.1 The Fundamental Lemma and other preliminaries
We prove Proposition 2.1 in the next subsection, but first we must introduce a few tools.
For and , we let
Note that is for . One has for all t, where
and recall is the distance from t to the nearest integer. The next lemma studies the first and second moment of . The second moment was studied in [34, Lemma 4], and the first moment is implicit in [37, Lemma 2.3]; we include a proof for completeness.
We use (2.2) to obtain
for . If , this is
and we use the estimate
to conclude. For , we use
and consider and separately. ∎
For φ supported in a compact subset of and of bounded variation,
where the implied constant depends on φ only.
Suppose φ is supported on an interval . By partial summation,
As φ is of bounded variation, the claim follows. ∎
We introduce the arithmetic function
For a sieving set B, for any and any ,
Note that vanishes for and is multiplicative for . If , then . But, for any , this implies for A a constant depending on ε and k only. Thus, for ,
where is the number of prime factors of d and we use the estimate (see e.g. [35, Theorem 2.10]). This implies the claim. ∎
The next lemma is a variation on an estimate of Nunes [37, Lemma 2.4], who treated the corresponding result when with .
Suppose the sequence has index . Let and be a k-tuple of distinct non-negative integers and let . We have
We prove the claim by induction on k. For , the inner sum in the left-hand side of (2.3) is (by considering and separately), so we obtain the bound
which is stronger than what is needed. We now assume that the bound holds for and prove it for .
We set and and introduce a parameter to be chosen later. We write the left-hand side of (2.3) as , where, in , we sum only over tuples with holding for all , and in , we sum over the rest.
Observe that, by comparing exponents in the factorizations, . To bound , observe that the condition on implies . Using the Chinese remainder theorem, the inner sum for is , and so we obtain the upper bound
where we have used Lemma 2.4 in the last step.
For each of the tuples summed over in , there is some j ( ) with . The tuples corresponding to this j contribute to at most
where we used the facts that
the number of will be , and
Applying the induction hypothesis to bound the last double sum, we obtain
Taking , we see that does not exceed the desired bound. ∎
Finally, throughout this paper, in order to control the inner sums defining , we will use the Fundamental Lemma of Montgomery and Vaughan . The following is a generalization of the result proved in , which corresponds to the case of B being the set of primes. The original proof in  works without any change under the more general assumptions. Let
Lemma 2.6 (Montgomery and Vaughan’s Fundamental Lemma).
Let , , …, be positive integers from , and set . For each , let be a complex-valued function defined on . Suppose each prime factor of r divides at least two of the . Then
Later, in Lemmas 4.1 and 5.5, we will cite variants of this result.
2.2 Proof of Proposition 2.1
We examine the inner sum defining . Note, for integers n,
where for notational reasons we have written
The function has period d. Considering it as a function on , it has mean 0. By taking the finite Fourier expansion in n, we have
From the definition (2.4), each term involves summing over indices m. Thus, from Lemma 2.5, we have
for a parameter z to be chosen later.
On the other hand, by (2.5),
where is defined as in (2.1). Note that if , we have
and in this latter case,
Furthermore, note using Lemma 2.3 and the first part of Lemma 2.2 that
Thus the contribution of terms for which is
Thus (2.7) is
We now complete the sum above. Directly applying Lemma 2.6 (and appealing to Lemma 2.3 and the second part of Lemma 2.2), we see that the corresponding sum over tuples
Hence, from (2.6), (2.8), (2.9),
(Note that the absolute convergence of this sum is implied by the above derivation.) Setting , we obtain the desired error term.
It remains to demonstrate . To do this, note that if , with , we can find a maximal such that , and so that is B-free. Moreover, writing does not affect the condition . Consequently, we have
Note also that, as , . Therefore, setting in each of the above sums, we have
The sums over factor as
and we see , as required. ∎
From (2.10), reindexing and , we have the following useful alternative expression for :