Squarefrees are Gaussian in short intervals

We show that counts of squarefree integers up to $X$ in short intervals of size $H$ tend to a Gaussian distribution as long as $H\rightarrow\infty$ and $H = X^{o(1)}$. This answers a question posed by R.R. Hall in 1989. More generally we prove a variant of Donsker's theorem, showing that these counts scale to a fractional Brownian motion with Hurst parameter $1/4$. In fact we are able to prove these results hold in general for collections of $B$-free integers as long as the sieving set $B$ satisfies a very mild regularity property, for Hurst parameter varying with the set $B$.

1. Introduction

1.1. Statistics of counts of squarefrees. Let $S \subset \mathbb{N}$ be the set of squarefree natural numbers (that is, natural numbers without a repeated prime factor; by convention we include $1 \in S$). We write
$$N_S(x) := |\{n \le x : n \in S\}|$$
for the number of squarefrees no more than $x$. It is well known that $N_S(x) \sim x/\zeta(2)$; thus the squarefrees have asymptotic density $1/\zeta(2) = 6/\pi^2$. Our purpose in this note is to investigate their distribution at a finer scale. In particular we will investigate the distribution of squarefrees in a random interval $(n, n+H]$, where $n$ is an integer chosen uniformly at random from $1$ to $X$, with $X \to \infty$.

If $H$ is fixed and does not grow with $X$, at most $O(1)$ squarefrees can lie in such an interval. Their distribution as $X \to \infty$ is slightly complicated but completely understood; it may be described by Hardy-Littlewood type correlations which can be derived from elementary sieve theory (see [32,31]). Or, more abstractly, the distribution of squarefrees in an interval of size $O(1)$ can be described by a non-weakly mixing stationary ergodic process (see [7,43]).
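The density statement above is easy to check numerically. The following is a minimal sketch, not part of the paper's argument; the function names `squarefree_sieve` and `count_squarefree` are chosen here for illustration. It marks every multiple of a square $d^2 > 1$ as non-squarefree and compares the resulting proportion with $6/\pi^2$.

```python
from math import isqrt, pi

def squarefree_sieve(limit):
    """Return a list flags with flags[n] True iff n is squarefree, for 0 <= n <= limit."""
    flags = [True] * (limit + 1)
    flags[0] = False
    # Striking out multiples of d*d for every d >= 2 (not only primes) is
    # redundant but harmless, and keeps the sketch short.
    for d in range(2, isqrt(limit) + 1):
        for m in range(d * d, limit + 1, d * d):
            flags[m] = False
    return flags

def count_squarefree(limit):
    """N_S(limit): the number of squarefree integers in [1, limit]."""
    return sum(squarefree_sieve(limit))

if __name__ == "__main__":
    x = 10**6
    print(count_squarefree(x) / x, 6 / pi**2)
```

The two printed numbers should agree to several decimal places, since the error in $N_S(x) \sim x/\zeta(2)$ is of much smaller order than $x$.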
For $H$ tending to infinity with $X$, matters become at once simpler and more difficult: simpler because some of the irregularities in the distribution just described are smoothed out at this scale, but more difficult in that natural conjectures become harder to prove. Let
$$N_S(n, H) := N_S(n+H) - N_S(n)$$
be the count of squarefrees in the interval $(n, n+H]$. R. R. Hall [16,17] was the first to investigate the distribution of this count when $H$ grows with $X$. In [16, Corollary 1], Hall proved that the variance of the number of squarefrees is of order $\sqrt{H}$ if $H$ is not too large with respect to $X$. More exactly, as $X \to \infty$ we have
$$\frac{1}{X}\sum_{n \le X}\Big(N_S(n,H) - \frac{H}{\zeta(2)}\Big)^2 \sim A\sqrt{H} \tag{1.1}$$
for an explicit constant $A > 0$, as long as $H \to \infty$ with $H \le X^{2/9-\varepsilon}$. Keating and Rudnick [25] studied this problem in a function field setting, connecting it with Random Matrix Theory, and suggested based on this that (1.1) will hold for $H \le X^{1-\varepsilon}$. The best known result is [13], where it is shown that (1.1) holds for $H \le X^{6/11-\varepsilon}$ unconditionally and for $H \le X^{2/3-\varepsilon}$ under the Lindelöf Hypothesis. In fact in [13] it is shown that even an upper bound of order $\sqrt{H}$ for $H \le X^{1-\varepsilon}$ for all $\varepsilon > 0$ would already imply the Riemann Hypothesis.
Because $N_S(n, H)$ is on average of order $H$, one might naively have expected the variance to also be of order $H$. That the variance is of order $\sqrt{H}$ speaks to the fact that the squarefrees are a rather rigid sequence. This can be discerned even visually in comparison, for instance, to the primes (Figure 1), and we will return to give a more exact description of it in Section 1.3.
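The rigidity described above can be observed empirically. The sketch below is an illustration only, with function names chosen here. It computes the mean and variance of the number of squarefrees in $(n, n+H]$ over $n = 1, \dots, X$. As a simplifying assumption it centers the variance at the empirical mean rather than at $H/\zeta(2)$, which makes little difference for large $X$.

```python
from math import isqrt

def squarefree_prefix_counts(limit):
    """prefix[n] = number of squarefree integers in [1, n], for 0 <= n <= limit."""
    flags = [1] * (limit + 1)
    flags[0] = 0
    for d in range(2, isqrt(limit) + 1):
        for m in range(d * d, limit + 1, d * d):
            flags[m] = 0
    prefix = [0] * (limit + 1)
    for n in range(1, limit + 1):
        prefix[n] = prefix[n - 1] + flags[n]
    return prefix

def interval_count_stats(X, H):
    """Empirical mean and variance of #squarefrees in (n, n+H] over n = 1..X."""
    prefix = squarefree_prefix_counts(X + H)
    counts = [prefix[n + H] - prefix[n] for n in range(1, X + 1)]
    mean = sum(counts) / X
    var = sum((c - mean) ** 2 for c in counts) / X
    return mean, var

if __name__ == "__main__":
    for H in (25, 100, 400):
        mean, var = interval_count_stats(200_000, H)
        # If the variance is of order sqrt(H), the last column should stay
        # roughly constant as H grows, while var / H shrinks.
        print(H, mean, var, var / H ** 0.5)
```

Running this for a few values of $H$ gives a quick sanity check that the variance grows far more slowly than the mean.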
In [17] Hall$^1$ studied higher moments of counts of squarefrees in short intervals,
$$M_k(X, H) := \frac{1}{X}\sum_{n \le X}\Big(N_S(n,H) - \frac{H}{\zeta(2)}\Big)^k,$$
where $k$ is a positive integer, proving the upper bound $\lim_{X\to\infty} M_k(X, H) \ll_k H^{(k-1)/2}$. Various authors have asked whether this can be refined, with the most recent result,
$$M_k(X, H) \ll H^{k/4+\varepsilon}$$
for any $\varepsilon > 0$ as long as $H \le X^{4/(9k)-\varepsilon}$, being due to Nunes [37]. For $H$ in the range considered, this is an optimal upper bound up to the factor of $H^{\varepsilon}$. In [2] extensive numerical evidence is presented that suggests that these moments are in fact Gaussian.
Our first main result confirms this conjecture.
Theorem 1.1.For 1 ≤ H ≤ X, as X → ∞, ), for every positive integer k, where Thus if H → ∞ and H ≤ X 4/(9k)−ε for some ε > 0, we have Note that the main term is the k-th moment of a centered Gaussian random variable with variance A √ H.
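For reference, the Gaussian moments appearing in the main term can be written out explicitly: a centered Gaussian with variance $\sigma^2$ has $k$-th moment $(k-1)!!\,\sigma^k$ for even $k$ and $0$ for odd $k$. The short sketch below (the function name is chosen here for illustration) evaluates this; taking `variance` equal to $A\sqrt{H}$ would give the main term described in the note above.

```python
def gaussian_moment(k, variance):
    """k-th moment of a centered Gaussian random variable with the given variance."""
    if k % 2 == 1:
        return 0.0  # odd moments vanish by symmetry
    double_factorial = 1
    for j in range(1, k, 2):  # 1 * 3 * 5 * ... * (k - 1)
        double_factorial *= j
    return double_factorial * variance ** (k / 2)
```

In particular the even moments grow like the $(k/2)$-th power of the variance, so a variance of order $\sqrt{H}$ corresponds to $k$-th moments of order $H^{k/4}$.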
If $H = X^{o(1)}$, then for any fixed $k$ we have that $H$ satisfies $H \le X^{4/(9k)-\varepsilon}$ for some $\varepsilon > 0$ for sufficiently large $X$. Hence by the moment method (see [4, Section 30]), we obtain the following result.
Theorem 1.2. Let $H = H(X)$ satisfy $H \to \infty$ and $H = X^{o(1)}$ as $X \to \infty$. Then, for any $z \in \mathbb{R}$,
$$\lim_{X\to\infty} \frac{1}{X}\Big|\Big\{n \le X : \frac{N_S(n,H) - H/\zeta(2)}{A^{1/2}H^{1/4}} \le z\Big\}\Big| = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{z} e^{-t^2/2}\,dt.$$
That is, the centered, normalized counts tend in distribution to a Gaussian random variable.

Gaussian limit theorems are known for the sums over short intervals of several important arithmetic functions (for instance divisor functions $d_k$ (see [28]) and the sums-of-squares representation function $r$ (see [20])), and Gaussian limit theorems for counts of primes in short intervals are known under the assumption of strong versions of the Hardy-Littlewood conjectures [33], but Theorem 1.2 seems to be the first instance of an unconditional proof of Gaussian behavior for short interval counts of a non-trivial, natural number theoretic sequence.
Hall in [17] asked also about the order of magnitude of the absolute moments λ , and as a standard corollary of Theorems 1.1 and 1.2 we obtain an asymptotic formula for these.
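Corollary 1.3. For fixed $\lambda > 0$, let $H = H(X)$ satisfy $H \to \infty$ and $H = X^{o(1)}$ as $X \to \infty$. Then
$$\frac{1}{X}\sum_{n \le X}\Big|N_S(n,H) - \frac{H}{\zeta(2)}\Big|^{\lambda} \sim \frac{2^{\lambda/2}}{\sqrt{\pi}}\,\Gamma\Big(\frac{\lambda+1}{2}\Big)\,(A\sqrt{H})^{\lambda/2}.$$

Proof. We give a quick derivation in the language of probability. By [4, Theorem 25.12] and the subsequent corollary, if $Y_j$ is a sequence of random variables tending in distribution to a random variable $Y$ and $\sup_j \mathbb{E}|Y_j|^{1+\varepsilon} < +\infty$ for some $\varepsilon > 0$, then $\mathbb{E}\,Y_j \to \mathbb{E}\,Y$. Having chosen the function $H = H(X)$, for each $X$ let $n \in [1, X]$ be chosen randomly and uniformly and define the random variables $\Delta_X = |(N_S(n,H) - H/\zeta(2))/(A^{1/2}H^{1/4})|^{\lambda}$. Then as $X \to \infty$, we have that $\Delta_X$ tends in distribution to $|G|^{\lambda}$ for $G$ a standard normal random variable, by Theorem 1.2. Moreover, Theorem 1.1 implies for any even integer $2\ell > \lambda$ that $\limsup_{X\to\infty} \mathbb{E}|\Delta_X|^{2\ell/\lambda} < +\infty$. Thus the result follows by computing $\mathbb{E}|G|^{\lambda}$ via calculus.

1 Note that our definition of $M_k$ differs slightly from [17]: Hall does not normalize by the factor $1/X$.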
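The calculus step at the end of the proof amounts to the classical formula $\mathbb{E}|G|^{\lambda} = 2^{\lambda/2}\,\Gamma((\lambda+1)/2)/\sqrt{\pi}$ for a standard normal $G$. The following short sketch (the function name is chosen here for illustration) evaluates it and can be compared against the familiar values $\mathbb{E}|G| = \sqrt{2/\pi}$, $\mathbb{E}|G|^2 = 1$ and $\mathbb{E}|G|^4 = 3$.

```python
from math import gamma, pi, sqrt

def abs_gaussian_moment(lam):
    """E|G|^lam for a standard normal random variable G and real lam > 0."""
    return 2 ** (lam / 2) * gamma((lam + 1) / 2) / sqrt(pi)
```

For even integers $\lambda = k$ this agrees with the double factorial $(k-1)!!$, so the formula interpolates the integer moments.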
Remark 1.4.For a given result relating to the behavior of an arithmetic function in short intervals, it is natural to consider the analogous problem in a short arithmetic progression.For example, in analogy to the quantity N S (n, H) for n ∈ [1, x] and H = H(x) slowly growing, one might consider the quantity N S (x; q, a) := |{n ≤ x : n ∈ S, n ≡ a mod q}|, where 1 ≤ q ≤ x is chosen so large that x/q (which corresponds in this context to H) is only slowly growing, and a mod q is a specified residue class.When a is coprime to q the expected size of N S (x; q, a) and one can define the analogous moments As noted by Nunes [37, Sections 1.2, 3.2], one may reduce the estimation of Mk (x; q) to a quantity that is very similar to what is obtained in the course of estimating M k (x, H), making the analysis of the problem for arithmetic progressions nearly identical to that of short intervals.As such, one could very similarly obtain an arithmetic progression analogue of the Gaussian limit theorem Theorem 1.2, if desired.However, unlike the short interval problem the problem in progressions does not seem to admit a nice interpretation in the language of fractional Brownian motion (see Section 1.3).We have therefore chosen to focus on the short interval problem in this paper in order to avoid making this paper even longer.
1.2. B-frees. It is natural to write our proofs in the more general setting of $B$-free numbers. We recall their definition shortly, but first we fix some notation. For a sequence $J$ of natural numbers we will write $\mathbf{1}_J$ for the indicator function of $J$, and
$$N_J(x) := |\{n \le x : n \in J\}|$$
for the count of elements of $J$ no more than $x$.
Definition 1.6. A function $L$ defined, finite, positive, and measurable on $[K, \infty)$ for some $K \ge 0$ is said to be slowly varying if for all $a > 0$,
$$\lim_{x\to\infty} \frac{L(ax)}{L(x)} = 1.$$
A sequence $J \subset \mathbb{N}$ is said to be regularly varying if $N_J(x) \sim x^{\alpha}L(x)$ for some $\alpha \in [0, 1]$ and some slowly varying function $L$.
For instance the function $L(x) = (\log x)^j$ is slowly varying, for any $j \in \mathbb{R}$. For any slowly varying function $L$ it is necessary that $L(x) = x^{o(1)}$, but this condition is not sufficient. Clearly in the definition above $\alpha$ will be the index of the regularly varying sequence $J$. Further information on regularly varying sequences can be found in [41, Chapter 4.1].

Fix a non-empty subset $B \subset \mathbb{N}_{>1}$ of pairwise coprime integers with $\sum_{b\in B} 1/b < \infty$; we call such a set a sieving set. We say that a positive integer is $B$-free if it is indivisible by every element of $B$. For instance if $B = \{p^2 : p \text{ prime}\}$, $B$-frees are nothing but squarefrees. Another studied example is $B = \{p^m : p \text{ prime}\}$ for some $m \ge 3$, for which $B$-frees are the $m$-th-power free numbers, i.e., integers indivisible by an $m$-th power of a prime. The notion of $B$-frees was introduced by Erdős [12], who was motivated by Roth's work [42] on gaps between squarefrees. (See also [6,22] for the closely related notion of convergent sieves. We also mention that [22] upper bounded the quantity $M_2(x; q)$ mentioned in Remark 1.4.)

We write $\mathbf{1}_{B\text{-free}} : \mathbb{N} \to \mathbb{C}$ for the indicator function of $B$-free integers, and $N_{B\text{-free}}(x)$ for the number of $B$-free integers $n \le x$. For the sets $B$ we are considering here, it is known (see e.g., [10, Theorem 4.1]) that
$$N_{B\text{-free}}(x) \sim M_B\,x, \qquad M_B := \prod_{b \in B}\Big(1 - \frac{1}{b}\Big).$$

We write $\langle B \rangle$ for the multiplicative semigroup generated by $B$, that is, the set of positive integers that can be written as a product of (possibly repeated) elements of $B$. By convention $1 \in \langle B \rangle$, as $1$ arises from the empty product. (For instance if $B = \{p^2 : p \text{ prime}\}$, then $\langle B \rangle = \{n^2 : n \in \mathbb{N}\}$.) We introduce an arithmetic function $\mu_B : \mathbb{N} \to \mathbb{C}$, analogous to the Möbius function $\mu$, defined by
$$\mu_B(n) = \begin{cases} (-1)^k & \text{if } n = b_1 \cdots b_k \text{ with distinct } b_1, \dots, b_k \in B, \\ 0 & \text{otherwise.} \end{cases}$$
Observe that $\mu_B$ and $\mathbf{1}_{B\text{-free}}$ are multiplicative and relate via the multiplicative convolution
$$\mathbf{1}_{B\text{-free}} = 1 * \mu_B.$$
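The convolution identity relating $\mu_B$ and the indicator of $B$-free integers can be verified directly on a finite range. The sketch below is illustrative only and its function names are chosen here. It assumes the definition $\mu_B(n) = (-1)^k$ when $n$ is a product of $k$ distinct elements of $B$ and $\mu_B(n) = 0$ otherwise, and it works with a finite truncation of $B$, which suffices for checking integers up to a given limit.

```python
def mu_B_table(B, limit):
    """Table of mu_B(n) for 0 <= n <= limit.

    B is a finite list of pairwise coprime integers > 1.
    Assumed definition: mu_B(n) = (-1)^k if n is a product of k distinct
    elements of B, and mu_B(n) = 0 otherwise.
    """
    mu = [0] * (limit + 1)
    mu[1] = 1
    for b in B:
        # Extend every product built so far by the new element b. Iterating
        # downwards ensures entries created in this pass are never revisited,
        # so b is used at most once in each product.
        for n in range(limit // b, 0, -1):
            if mu[n] != 0:
                mu[n * b] = -mu[n]
    return mu

def is_B_free(n, B):
    """True iff no element of B divides n."""
    return all(n % b != 0 for b in B)

def convolve_with_one(mu, n):
    """(1 * mu)(n): the sum of mu(d) over the divisors d of n."""
    return sum(mu[d] for d in range(1, n + 1) if n % d == 0)

if __name__ == "__main__":
    B = [4, 9, 25, 49]  # squares of the primes up to 7: enough for n <= 100
    mu = mu_B_table(B, 100)
    print(all(convolve_with_one(mu, n) == int(is_B_free(n, B))
              for n in range(1, 101)))
```

With $B$ the set of prime squares, `mu[d * d]` coincides with the classical Möbius function $\mu(d)$, which recovers the familiar identity $\mu^2(n) = \sum_{d^2 \mid n} \mu(d)$.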

We denote by
Proposition 1.7.If the sequence B has index α ∈ (0, 1), then for each fixed positive integer k, exists and moreover for any ε > 0 we have We will describe an explicit formula for C k,B (H) later.When k = 2 we have the following general result for the variance, building on ideas of Hausman and Shapiro [19] and Montgomery and Vaughan [34], Proposition 1.8.If the sequence B is of index α ∈ (0, 1), then C 2,B (H) = H α+o (1) .
If B is in addition a regularly varying sequence then we prove an asymptotic formula for C 2,B (H).Proposition 1.9.If the sequence B is regularly varying with index α ∈ (0, 1), then where This generalizes a result of Avdeeva [3] which requires more robust assumptions about B .It also gives a new proof for (1.1) that does not use contour integration and is essentially elementary.
In fact, we do not need an asymptotic formula for the variance to prove that the moments are Gaussian.
Theorem 1.10. If B has index $\alpha \in (0, 1)$, then for every positive integer $k$. Here $c$ is a constant depending only on $\alpha$.
It is evident that Proposition 1.8 and Theorem 1.10 recover the moment estimate Theorem 1.1 for squarefrees.Moreover, for the same reasons as given for the central limit theorem there, we have the following theorem.
If the sequence B has index α ∈ (0, 1), then for any z ∈ R, Remark 1.12.Combining Theorem 1.10 and Proposition 1.7, we see that for each k, the moments M k,B (X, H) will be asymptotically Gaussian as long as H, X → ∞ with .
In Section 6.2 we give a further application of Theorem 1.10 to estimates for the frequency of long gaps between consecutive B-free numbers, improving results of Plaksin [39] and Matomäki [29].
1.3.Fractional Brownian motion.We have mentioned that the squarefrees and more generally B-frees in a random interval (n, n + H] with n ≤ X are governed by a stationary ergodic process if H remains fixed.The number of B-frees in such an interval is M B H on average.This process has measure-theoretic entropy which becomes smaller the larger H is chosen to be, and thus does not seem very 'random'.This may be compared to primes in short intervals (n, n + λ log X ], which contain on average λ primes.In such intervals primes are conjectured to be distributed as a Poisson point process, and thus appear very 'random.' Nonetheless a glance at Figure 1 comparing squarefrees to primes -along with consideration of the central limit theorems we have just discussed -reveals that at a scale of H → ∞, B-frees still retain some degree of randomness.It turns out that there is a natural framework to describe the 'random' behavior of B-frees at this scale (analogous to the Poisson process for primes above) and this is fractional Brownian motion.
We give here a short introduction to fractional Brownian motion, as we believe this perspective sheds substantial light on the distribution of $B$-frees; however, the remainder of the paper is arranged so that a reader only interested in the central limit theorems of the previous sections can avoid this material.

Definition 1.13. A random process $\{Z(t) : t \in [0, 1]\}$ is said to be a fractional Brownian motion with Hurst parameter $\gamma \in (0, 1)$ if $Z$ is a continuous-time Gaussian process which satisfies $Z(0) = 0$, satisfies $\mathbb{E}\,Z(t) = 0$ for all $t \in [0, 1]$, and has covariance function
$$\mathbb{E}\,Z(t)Z(s) = \tfrac{1}{2}\big(t^{2\gamma} + s^{2\gamma} - |t - s|^{2\gamma}\big) \tag{1.3}$$
for all $s, t \in [0, 1]$.

Using $Z(0) = 0$, it is easy to see the covariance condition (1.3) is equivalent to
$$\mathbb{E}\,(Z(t) - Z(s))^2 = |t - s|^{2\gamma}.$$
For a proof that such a stochastic process exists and is uniquely defined by this definition see e.g. [36]. Classical Brownian motion is a fractional Brownian motion with a Hurst parameter of $\gamma = 1/2$. If $\gamma > 1/2$, increments of the process are positively correlated, with a rise likely to be followed by another rise, while if $\gamma < 1/2$, increments of the process are negatively correlated.
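The sign of the correlation between increments can be read off from the covariance function. The sketch below assumes the standard fractional Brownian motion covariance $\tfrac{1}{2}(s^{2\gamma} + t^{2\gamma} - |t-s|^{2\gamma})$ and computes the covariance of two consecutive increments by bilinearity; the function names are chosen here for illustration.

```python
def fbm_covariance(s, t, hurst):
    """Assumed standard covariance E[Z(s) Z(t)] of fractional Brownian motion."""
    return 0.5 * (s ** (2 * hurst) + t ** (2 * hurst) - abs(t - s) ** (2 * hurst))

def increment_covariance(hurst, step=1.0):
    """Covariance of the consecutive increments Z(step) - Z(0) and Z(2*step) - Z(step)."""
    a, b, c = 0.0, step, 2 * step
    # Cov(Z(b) - Z(a), Z(c) - Z(b)) expanded by bilinearity of the covariance.
    return (fbm_covariance(b, c, hurst) - fbm_covariance(b, b, hurst)
            - fbm_covariance(a, c, hurst) + fbm_covariance(a, b, hurst))

if __name__ == "__main__":
    for hurst in (0.25, 0.5, 0.75):
        print(hurst, increment_covariance(hurst))
```

For unit steps the value works out to $2^{2\gamma - 1} - 1$, which is negative for $\gamma < 1/2$, zero for $\gamma = 1/2$ and positive for $\gamma > 1/2$, matching the description above.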
Donsker's theorem is a classical result in probability theory showing that a random walk with independent increments scales to Brownian motion (see [5,Section 8]).We prove an analogue of Donsker's theorem for counts of B-frees using the following set up.We select a random starting point n ≤ X at uniform and define the random variables ξ 1 , ξ 2 , . . . in terms of n by where {τ } denotes the fractional part.For integer τ this is a random walk which increases on B-frees and decreases otherwise, and for non-integer τ , the function Q(τ ) linearly interpolates between values; thus Q(τ ) is continuous.
and choose a random integer n ∈ [1, X] at uniform.Suppose that B is a regularly varying sequence of index α ∈ (0, 1) and define the function where A α is defined by (1.2).Then, as a random element of C[0, 1], the function W X converges in distribution to a fractional Brownian motion with Hurst parameter α/2 as X → ∞.
Our proof of this result follows similar ideas as the proof of Theorem 1.10. Note that $\alpha/2 < 1/2$, so only a fractional Brownian motion with negatively correlated increments can be induced this way. It would be very interesting to understand functional limit theorems of this sort in the context of ergodic processes related to the $B$-frees described in e.g. [2,8,9,10,24,27,30]. There seem to exist only a few other constructions in the literature of a fractional Brownian motion as the limit of a discrete model, e.g. [1,11,18,38,44].

1.4. Notation and conventions. Throughout the rest of this paper we allow the implicit constants in $\ll$ and $O(\cdot)$ to depend on $k$ when considering a $k$-th moment, and implicit constants are always for a fixed sieving set $B$. Later we will introduce a weight function $\varphi$ and implicit constants depend on $\varphi$ as well. Throughout the paper where B has an index $\alpha$, for simplicity we will assume that $\alpha \in (0, 1)$. Some proofs would remain correct if $\alpha = 0$ or $1$, but the proofs of our central results would not. We use the notation $a \sim x$ as a subscript in some sums to mean $x < a \le 2x$. In general we follow standard conventions; in particular $e(x) = e^{2\pi i x}$, $\|x\|$ denotes the distance of $x \in \mathbb{R}$ to the nearest integer, $(a, b, c)$ is the greatest common divisor of $a$, $b$, and $c$, while $[a, b, c]$ is the least common multiple.
1.5.The structure of the proof.There is a heuristic way to understand the Gaussian variation of N B-free (n, H).Note that The Heuristically, if X is large and n ∈ [1, X] is chosen uniformly at random, one may expect the terms {e(n /d) : ( , d) B-free} to behave like a collection of independent random variables, and this would imply the Gaussian oscillation of N B-free (n, H).
Nonetheless we do not quite have independence; instead, roughly speaking, one may use the same Fourier decomposition to relate the k-th moment M k,B (X, H) to (weighted) counts of solutions to the equation (1.5) where i /d i < 1/H for all i.Indeed, the realization that M k (X, H) for the squarefrees are related to these counts appears already in [17].
It can be seen that Gaussian behavior will then follow from most solutions to the above equation being diagonal, meaning there is some pairing i /d i ≡ − j /d j mod 1 for all i, and this is what we demonstrate.Our main tool is the Fundamental Lemma and its extensions (see Lemmas 2.6, 4.1, and 5.5), developed by Montgomery-Vaughan [34] and later used by Montgomery-Soundararajan [33] to prove a conditional central limit theorem for primes in short intervals.
However, this strategy if used by itself is not sufficient to prove a central limit theorem; the Fundamental Lemma was already known to Hall who used it to obtain his upper bound M k (X; H) H (k−1)/2 .The reason this strategy does not work as it did for Montgomery-Soundararajan is that M k,B (X; H) has size roughly H αk/2 ; for primes the k-th moment of a short interval count is (conditionally) much larger.Thus in order to recover a main term, our error terms must be shown to be substantially smaller than for the primes.
We obtain an upper bound for the number of off-diagonal solutions to (1.5) by bringing in two ideas in addition to those used by Montgomery-Soundararajan.The first is due to Nunes, who showed in the recent paper [37] that solutions to (1.5) make a contribution to M k (X; H) only when d i is larger than H 1−o (1) for all i and the least common multiple of the d i is larger than H k/2−o (1) ; this argument appears in Section 5.1.The second is an observation that bounding a term T 4 which appears in a variant of the Fundamental Lemma (Lemma 5.5) requires a more delicate treatment than that which appears in [34,Lemma 8].However, it can be accomplished in our context by counting solutions to a certain congruence equation, which in turn can be estimated using an averaging argument relying crucially on the Pólya-Vinogradov inequality; this is done in Section 5.2.See Remark 5.10 for further discussion of how the more direct, but less quantitatively precise, approach of [34, Lemma 8] is insufficient in our context.
In order to prove that counts of B-frees scale to a fractional Brownian motion it is not sufficient to consider the flat counts N B-free (n, H), but instead we must consider the weighted counts where ϕ belongs to a class of functions that includes step-functions.A proof of a central limit theorem for flat counts remains essentially unchanged as long as ϕ is of bounded variation and compactly supported in [0, ∞).Convergence to a fractional Brownian motion as a corollary of this central limit theorem is discussed in Section 7.
1.6. Acknowledgements. O.G. is supported by funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant agreement No 851318). Most of this work was completed while A.M. was a CRM-ISM postdoctoral fellow at the Centre de Recherches Mathématiques. He would like to thank the CRM for its financial support. B.R. received partial support from an NSERC grant and US NSF FRG grant 1854398. Some work on this project was done during a research visit to the American Institute of Mathematics and we thank that institute for its hospitality. We thank Francesco Cellarosi for useful discussions, as well as the anonymous referee for carefully reading the paper and providing a number of helpful comments leading to improvements in its exposition.

2. An expression for moments
In this section we prove Proposition 1.7, giving an expression for C k,B (H).The results proved in this section all suppose that the sieving set B is such that B has index α.(We do not yet need to suppose that B is regularly varying.) In order to eventually have a nice framework to prove Theorem 1.14 on fractional Brownian motion, we generalize the moments we consider.Suppose For ϕ = 1 (0,1] this recovers M k,B (X, H) as defined above.We have tried to write this paper so that a reader interested only in the more traditional Theorem 1.11 can read it with this specialization in mind.
We introduce the arithmetic function g B = g defined by Given a bounded function ϕ : R → R supported on a compact subset [0, ∞) we also define the 1-periodic function where as usual, e(u) := e 2πiu for u ∈ R. Finally, given r ∈ [B] we denote by R B (r) the subset of the group R/Z given by Proposition 2.1.If the sequence B has index α ∈ (0, 1) and ϕ is of bounded variation and supported in a compact subset of [0, ∞), then for all fixed integers k ≥ 1 and any ε > 0, as long as where C k (H; ϕ) is defined by the absolutely convergent sum Obviously Proposition 2.1 implies Proposition 1.7 by setting ϕ = 1 (0,1] . Our ultimate goal will be to prove generalizations of Theorems 1.1 and 1.10, showing that the moments M k (X, H; ϕ) and C k (H; ϕ) have Gaussian asymptotics for general ϕ.This will be the content of Theorems 6.1 and 6.2.
2.1.The Fundamental Lemma and other preliminaries.We prove Proposition 2.1 in the next subsection but first we must introduce a few tools.
For t ∈ R and Y ≥ 1 we let Note that Proof.We use (2.2) to obtain and we use the estimate to conclude.For k = 2, we use and consider d ≤ H and d > H separately.
Lemma 2.3.For ϕ supported in a compact subset of [0, ∞) and of bounded variation, where the implied constant depends on ϕ only.
Proof.Suppose ϕ is supported on an interval [0, c].By partial summation, As ϕ is of bounded variation the claim follows.
We introduce the arithmetic function For a sieving set B, for any k ≥ 1 and any ε > 0, where ω(d) is the number of prime factors of d and we use the estimate ω(d) = o(log d) (see e.g.[35,Theorem 2.10]).This implies the claim.
The next lemma is a variation on an estimate of Nunes [37, Lemma 2.4], who treated the corresponding result when B = {p m : p prime} with m ≥ 2.
Proof.We prove the claim by induction on k.For k = 1, the inner sum in the left-hand side of (2.3) is X/d 1 (by considering X > d 1 and X ≤ d 1 separately), so we obtain the bound (1) , which is stronger than what is needed.We now assume that the bound holds for k 0 and prove it for and introduce a parameter z ≤ X to be chosen later.We write the left-hand side of (2.3) as T 1 + T 2 , where in T 1 we sum only over tuples d 1 , . . ., d k0+1 with D i ≤ z holding for all 1 ≤ i ≤ k 0 + 1, and in T 2 we sum over the rest.
Observe that by comparing exponents in the factorizations, D ≤ k0+1 i=1 (D i ) 1/k0 .To bound T 1 , observe that the condition on D i implies D ≤ (z ) 1+1/k0 .Using the Chinese remainder theorem, the inner sum for T 1 is X/D + 1, and so we obtain the upper bound where we have used Lemma 2.4 in the last step.
For each of the tuples summed over in T 2 , there is some j (1 ≤ j ≤ k 0 + 1) with D j > z .The tuples corresponding to this j contribute to T 2 at most where we used the facts that (i) the number of d j | n + m j will be X o (1) , and (ii) 1/d j = O(1).Applying the induction hypothesis to bound the last double sum, we obtain .
Finally, throughout this paper in order to control the inner sums defining C k (H; ϕ) we will use the Fundamental Lemma of Montgomery and Vaughan [34].The following is a generalization of the result proved in [34], which corresponds to the case of B being the set of primes.The original proof in [34] works without any change under the more general assumptions.Let For each 1 ≤ i ≤ k, let G i (ρ i ) be a complex-valued function defined on C(r i ).Suppose each prime factor of r divides at least two of the r i . 2 Then .
Later, in Lemmas 4.1 and 5.5, we will cite variants of this result.
2 Equivalently, each b ∈ B that divides r divides at least two of the r i .

2.2. Proof of Proposition 2.1.
Proof.We examine the inner sum defining M k (X, H; ϕ).Note for integers n, where for notational reasons we have written The function n → ψ H (n, d) has period d.Considering it as a function on Z/dZ it has mean 0. By taking the finite Fourier expansion in n we have Thus From the definition (2.4), each term ψ H (n, d) involves summing over H indices m.Thus from Lemma 2.5 we have for a parameter z to be chosen later.
On the other hand, by (2.5), where E X is defined as in (2.1).Note that if and in this latter case Furthermore, note using Lemma 2.3 and the first part of Lemma 2.2 that Thus the contribution of terms for which X .
We now complete the sum above.Directly applying Lemma 2.6 (and appealing to Lemma 2.3 and the second part of Lemma 2.2), we see that the corresponding sum over tuples (1) .
Hence from (2.6), (2.8), (2.9), where (Note that the absolute convergence of this sum is implied by the above derivation.)Setting z = X (1−ε)/(α+1) we obtain the desired error term.It remains to demonstrate To do this, note that if σ i = i /d i , with 0 < i < d i , we can find a maximal e i ∈ [B] such that i = e i l i , d i = e i d i , and so that (l i , d i ) is B-free.Moreover, writing σ i = l i /d i does not affect the condition σ 1 + • • • + σ k ∈ Z. Consequently, we have Note also that as i = 0, d i /e i = 1.Therefore, setting r j := d j /e j in each of the above sums, we have The sums over e j factor as The absolute convergence of this sum is implied by the above argument.
Remark 2.8.When ϕ = 1 (0,1] the corresponding expression for C 2 (H; ϕ) may also be calculated using correlation formulae for µ B .Indeed, upon expanding the k-th power in the definition of M k (X, H; 1 (0,1] and swapping orders of summation, we find When B = {p 2 : p prime}, Hall [17,Lemma 2] used this approach to compute the main terms C k (H; 1 (0,1] ).This was generalized straightforwardly to the setting with B = {p m : p prime} for m ≥ 3 by Nunes [37, Lemma 2.2] and can be generalized to B-frees as well.
3. Estimates for variance

Proof. Suppose B has index α. We first show that [B] has index α also. Plainly, (1) .
On the other hand, b≤x,b∈ B Introducing a parameter y = x ε , we upper-bound the left-hand sum as follows.For b ≥ y with b ∈ B we apply , and for b < y we use monotonicity in the form (1) .Since this holds for every ε > 0, we obtain the lower bound needed for the claim.
To prove an upper bound for N B (x), use (3.1) and note the left-hand side will be x α+o (1) Other authors have proved results in this area based on assumptions regarding the index of the set B (for instance [15]).Though we will not require it in what follows, for completeness' sake we note the following implication.
We will first show for any ε > 0 there is a constant C ε such that For notational reasons let This gives (3.2) for k = 1.But (3.2) then follows inductively for all k from the above bounds and [35,Theorem 2.10]).Hence for all x there exists k = o(log x) such that (1) .As ε is arbitrary this implies x α+o (1) .
Using (3.1) as before this implies N B (x) x α+o (1) as desired.
It seems likely that the converse to Proposition 3.2 is false, but we do not pursue this here.

3.2. Variance for B with index α. We now show that the exponent of the variance is determined by the index of B.
Proof.We have For an upper bound we apply Lemma 2.3 and the second part of Lemma 2.2 to see that |Φ H (σ)| 2 r min{r, H}.
Suppose ϕ : R → R is supported on an interval [0, c].Before embarking on the proof of the lower bound, we make the following observation.Since ϕ is of bounded variation we have We now split the proof of the lower bound into two cases, depending on whether or not Using the convolution formula 1 B-free = 1 * µ B , for each r ∈ [B] we obtain Moreover, if m ≥ 10cH then by Plancherel's theorem on Z/mZ, we obtain where in the last step we used the fact that ϕ is non-vanishing on some open interval.Since g(r) 2 r −2 for r ∈ [B], we find that uniformly in m, we may use positivity to restrict to m ≥ 10cH and apply (3.4), getting 1) , where in the second to last step we used the fact that [B] has index α by Lemma 3.1.This proves the lower bound in this case.

Case 2: $\int_0^{\infty} \varphi(t)\,dt = 0$. In this case, we again use positivity to restrict the sum in $C_2(H; \varphi)$ as Let $K \ge 10$ be a large constant. Then for $r \ge KcH$, we have $e(j/r) = 1 + O(1/K)$ uniformly for $1 \le j \le cH$, so from (3.3) we get since, again by Lemma 3.1, [B] has index α. The claim is thus proved in this case as well.

3.3. Variance for regularly varying B. If B is a regularly varying sequence we can say more; in this case we will show the asymptotic formula of Proposition 1.9. We begin with a useful expression for $C_2(H)$. Throughout this subsection we use the notation Lemma 3.4. We have Proof. We begin with the expression in Remark 2. .
But in this sum 1 /d 1 = 2 /d 2 and using Then the above expression for C 2 (H) simplifies to which in turn simplifies to (3.5).
We will use the following result of Pólya to estimate the above sum.
for some ε > 0, then Proof.This can be found in Pólya and Szegő's book [41,Problem No. 159 in Part II, Chapter 4 of Volume I]; see also Pólya's paper [40].
Proof of Proposition 1.9.We define For both F(s) and G(s) the above sum and Euler product converge absolutely for s > α.Note that for s > α we have where the coefficients u c are defined by this relation.(The coefficients u c will be supported on B but this fact will not be important.) Note the Euler product defining U(s) converges absolutely for s > α/2, and therefore the Dirichlet series also converges absolutely in this region.Hence it follows that for any ε > 0, with the implications which will be important later.By using the Dirichlet convolution implicit in (3.6), we can write But from the bound V (ξ) 2 min(1, 1/ξ 2 ), we have The last estimate follows because if x > 1 the first term vanishes while the second is bounded trivially, while if x ≤ 1 both terms satisfy the claimed estimate by (3.7).Therefore one may check that Pólya's proposition may be applied and where rearrangement of sums and integrals is justified in the second line of (3.8) by absolute convergence.By a change of variables τ = λ/tc, the last line can be simplified to since the sums over λ and c can then be simplified as ζ(2 − α) and U(α) respectively.But (see [14, formula 3.823 which recovers the constant (1.2) claimed in the proposition.
Remark 3.6.We will not require it, but with a bit more work one can show that if B is regularly varying with index α ∈ (0, 1), and ϕ is of bounded variation with compact support, where

4. Diagonal terms
In this section we show how the approximation of C k (H; ϕ) by Gaussian moments arises from terms in which r 1 , . . ., r k are paired and non-repeated in the sum defining this quantity.In the next section we will show that the remaining terms are negligible.
We say that the tuple $r_1, \dots, r_k$ is paired if $k$ is even and we may partition $\{1, 2, \dots, k\} = \bigcup_{i=1}^{k/2} \{a_i, b_i\}$ with $r_{a_i} = r_{b_i}$ and $a_i \ne b_i$. We say that $r_1, \dots, r_k$ is repeated if $r_{i_1} = r_{i_2} = r_{i_3}$ for some $i_1 < i_2 < i_3$. For $k = 2$ all the terms in $C_2(H; \varphi)$ must be paired.
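The two conditions just defined reduce to statements about multiplicities. Reading "paired" as meaning that the indices split into disjoint pairs carrying equal values (an interpretation assumed here), a tuple is paired exactly when every value occurs an even number of times, and it is repeated when some value occurs at least three times. A minimal sketch, with function names chosen for illustration:

```python
from collections import Counter

def is_paired(r):
    """True iff the indices of r split into disjoint pairs {a_i, b_i} with r[a_i] == r[b_i]."""
    return len(r) % 2 == 0 and all(m % 2 == 0 for m in Counter(r).values())

def is_repeated(r):
    """True iff some value occurs at three or more positions of r."""
    return any(m >= 3 for m in Counter(r).values())

def is_paired_and_non_repeated(r):
    """Equivalent to every value of r occurring exactly twice."""
    return is_paired(r) and not is_repeated(r)
```

Under this reading the paired and non-repeated tuples are exactly those in which every value appears twice. For odd $k$ no tuple can be paired at all.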
We adopt the abbreviations r = [r 1 , . . ., r k ] for the lcm and r = (r 1 , . . ., r k ) for the vector of the r i .Given integers r 1 , r 2 , . . ., r k > 1 belonging to [B], let Our approach in this section largely follows Montgomery and Soundararajan's proof of [33,Theorem 1].We will use the following variant of the Fundamental Lemma; the result is a generalization of [33, Lemma 2], which corresponds to B being the set of primes.The original proof3 works as is.
Lemma 4.1 (Montgomery and Soundararajan).Let q 1 , . . ., q k be integers with q i > 1 and Let G be a complex-valued function defined on (0, 1), and suppose that G 0 is a non-decreasing function on [B] such that The next lemma separates repeated or non-paired r from what we will show is the main contribution to C k (H; ϕ).Lemma 4.2.Let k ≥ 3, and suppose ϕ is of bounded variation and supported in a compact subset of [0, ∞).If k is odd we have If k is even we have where Proof.We first consider odd k.There are no vectors (r 1 , . . ., r k ) that are both paired and non-repeated.The proof of this case is concluded by recalling that g(n) is dominated by µ 2 B (n)/n and Φ H by F H .We now consider even k.By the triangle inequality, (4.1) ways in which the pairing in the first sum in (4.1) can occur.We take the pairing to be r i = r k/2+i without loss of generality.We further write ρ i = a i /r i and set b i to be the unique integer in [1, r i ] congruent to a i + a k/2+i modulo r i .Hence r is paired and non-repeated r1,...,r k >1 k i=1 g(r i ) This finishes the proof.
We now show that paired and non-repeated terms above can be reduced to powers of the variance.
Proof. Let $j$ be the number of values of $i$ for which $b_i \neq r_i$, so that $b_i = r_i$ for the remaining $k/2 - j$ values of $i$. Since there are $\binom{k/2}{j}$ ways of choosing the $j$ indices, we see that the left-hand side is where $W_0 \equiv 1$ and (1) by Lemma 3.3. The term $j = 0$ will contribute the main term, so it remains to bound the other terms. It is also clear that $j = 1$ contributes $0$ as $b_1/r_1 \in (0, 1)$. To finish the proof it suffices to show that $W_j(H) = O(H^{\alpha(j-1)+o(1)})$ for $j \ge 2$.
Montgomery and Soundararajan [33, equation (34)] showed that (4.2) for $B$ being the set of primes and $\phi = \mathbf{1}_{(0,1]}$. Their argument works as is for general $B$ and general $\phi$. Indeed, their proof proceeds by dropping the conditions $(a, n) = 1$ and $(b - a, n) = 1$ from the sum defining $J$, applying the estimate $\Phi_H \ll F_H$ (in the general case this is Lemma 2.3), and straightforwardly estimating the resulting sum of positive terms, so their bound applies to our sum as well.
We take $Z = H^{j/2}$ to find that the left-hand side is $\ll H^{\alpha(j-1)+o(1)}$ as $j \ge 2$.
Taken together, Lemma 4.2 and Proposition 4.3 show that paired and non-repeated terms in $C_k(H;\phi)$ are enough to recover the claimed main term.

5. Off-diagonal terms
In this section, we show that the repeated or non-paired terms contribute negligibly to $C_k(H;\phi)$. Our main tool will be a refinement of the Fundamental Lemma, due also to Montgomery and Vaughan [34, Lemma 8]. However, we will also crucially use ideas of Nunes to bound the range of $\mathbf{r}$ that we need to consider. We will also make critical use of the Pólya-Vinogradov inequality to bound a certain term that appears in the refined Fundamental Lemma; this is a new ingredient of our proof, and some argument of this sort seems to be essential when the index $\alpha$ is less than or equal to $1/2$ (see Remark 5.10 for a relevant discussion).

5.1. Preliminary estimates: Nunes's reduction in the range of $\mathbf{r}$. Following Nunes [37], in this subsection we will show that in the sum defining $C_k(H;\phi)$, those $\mathbf{r}$ for which $r$ is large yet each $r_i$ is relatively small make a negligible contribution. We first make a few simple observations.
The Fundamental Lemma only sees the $L^2$-norm of $F_H(i/r)$. Lemma 5.2 is superior to Lemma 5.1 in certain ranges, as it makes use of the much smaller $L^1$-norm. As Nunes does, we may combine (5.1) and (5.2), obtaining the bound

We now use Nunes's bound (5.2) to deal with $\mathbf{r}$ with small lcm. They turn out to contribute negligibly, a fact that is not detected directly by the Fundamental Lemma bound (5.1).

Lemma 5.3. Let $H, M > 1$. Suppose that $B$ has index $\alpha \in (0, 1)$. We have
Proof. Appealing to (5.2), the contribution of $\mathbf{r}$ with $r \le M$ is at most

The claimed bound then follows from Lemma 2.4 since $B$ has index $\alpha$.
We now use the Fundamental Lemma to show that among those $\mathbf{r}$ with $r > M$, the contribution of $\mathbf{r}$ with $r_i \le N$ is small. Here $M$ and $N$ are parameters to be chosen later.
Proof. If $r_i \le N$ for some $i$, and all $r_j > 1$ are in $[B]$, then (5.1) implies that , so that our sum is at most a sum over $r_1, \dots, r_k > 1$ with $r_i \le N$ for some $i$ and $r > M$. The claimed bound again follows from Lemma 2.4, as $B$ has index $\alpha$.

5.2. Using the Fundamental Lemma. To estimate off-diagonal contributions we use the following variant of the Fundamental Lemma. It generalizes [34, Lemma 7] of Montgomery and Vaughan, which corresponds to $B$ being the set of primes and $T = H^{1/9}$. The proof follows that of [34] essentially without change and so we do not include it here.
We use this to produce a first bound on repeated or non-paired $\mathbf{r}$ outside of the range treated by Nunes's bound.

Proposition 5.6. Let $k \ge 3$. Let $N$ be in the range $H \ge N \ge H^{8/9}$ and $M > 1$. Suppose that $B$ has index $\alpha \in (0, 1)$. We have where
Proof. Let $\mathbf{r}$ be a vector which is either repeated or non-paired with $r > M$ and such that $r_i > N$ for all $i$. It suffices to bound

If $\mathbf{r}$ is repeated with $r_{i_1} = r_{i_2} = r_{i_3}$, then we apply Lemma 5.5 with $r_{i_1}, r_{i_2}$ in place of $r_1, r_2$, obtaining

(The parameter $T$ is taken to be $H/N$.) To study the $T_i$'s, recall that $d = (r_{i_1}, r_{i_2}) = r_{i_1}$ factors as $d = st$, where

In our case, $t = 1$ and $s = d = r_{i_1} > N$. We have and $T_4 = 0$, so the total contribution of the $T_i$'s in the repeated case is at most which is absorbed in the error term since the series over $r$ is $\ll M^{\alpha-1+o(1)}$.

Suppose that $\mathbf{r}$ is non-paired and also non-repeated. The contribution of $\mathbf{r}$ is $0$ if there is a prime $p$ dividing only one of the $r_i$ (as then $S_H(\mathbf{r}) = 0$), so we assume that each prime divisor of $r$ divides at least two of the $r_i$. This implies that $r_i \mid \prod_{j \neq i} r_j$. As in [34, p. 323], this implies $r_i \le \prod_{j \neq i} (r_j, r_i)$ and so for each $i$ there exists $j \neq i$ such that $(r_j, r_i) \ge r_i^{1/(k-1)} > N^{1/(k-1)}$.

We claim that there is at least one pair $r_i, r_j$ with $i \neq j$, $(r_j, r_i) > N^{1/(k-1)}$ and $r_i \neq r_j$. Indeed, if there is no such pair, it means that each value in the multiset $\{r_i : 1 \le i \le k\}$ appears at least twice. We also know that each value appears at most twice, as we are in the non-repeated case. Hence each value appears exactly twice, contradicting the fact that we are in the non-paired case.
Hence necessarily $r_{i_1} \neq r_{i_2}$ with $(r_{i_1}, r_{i_2}) > N^{1/(k-1)}$ for some $i_1 \neq i_2$. We again apply Lemma 5.5 with $r_{i_1}, r_{i_2}$ in place of $r_1, r_2$, and $T = H/N$. By definition, $T_3 = 0$. As we have $T_1 = (\log H)/(H/N)^{1/2}$ and $T_2 = d^{-1/4} < N^{-1/(4(k-1))}$, we get that the total contribution of $T_1$, $T_2$ and $T_3$ in the non-paired case is at most which is also absorbed in the error term. We now treat the contribution of $T_4$ when $\mathbf{r}$ is non-paired and non-repeated, and show that it is at most $H^{k/2-1+o(1)} M^{\alpha-1+o(1)} E^{1/2}$, so it is absorbed as well.
Assuming always that Lemma 5.5 is applied with the first two elements of $\mathbf{r}$ (at the cost of a constant of size $k^2$ from permuting the indices), and recalling that if $\mathbf{r}$ is non-paired and non-repeated we may assume $(r_1, r_2) > N^{1/(k-1)}$, we see that $T_4$ contributes at most by Cauchy-Schwarz, where

(Note that the condition $t > d^{1/2}$ is the same as $t > s$.) The second sum is at most $\sum_{r > M} \tau_{B,\mathrm{lcm},k}(r)/r \ll M^{\alpha-1+o(1)}$.
To study the first sum, we write $r$ as $r_1' r_2' s t w$ where $r_i' = r_i/d$ and $w = r/(r_1' r_2' s t)$. Note that $r_1'$, $r_2'$, $s$ and $t$ are pairwise coprime so $w$ must be an integer. Instead of summing over $r_1$ and $r_2$ we sum over $r_1'$, $r_2'$, $s$ and $t$ (so $r_1$ and $r_2$ are determined). Given $r_1$, $r_2$ and $w$ we have $r$; given $r$, $r_1$ and $r_2$ there are at most $\tau(r)^{k-2}$ possibilities for $\mathbf{r}$ with these values of $r_1$, $r_2$ and $r$, since each $r_i$ ($3 \le i \le k$) divides $r = w r_1' r_2' s t$. Here $\tau$ is the usual divisor function. Hence we have

Because $[B]$ has index $\alpha$ and $\tau(n) = n^{o(1)}$, the innermost sum above is

Plugging the definition of $T_4$ into the last equation, and first summing over $s$, $t$ and only later over $r_i'$, we obtain
In order to get a good upper bound on $E$ we must estimate the frequency with which asτ /(as) is smaller than $1/H$. We will do this by reduction to congruence conditions and we will bound sums over such congruence conditions using the following simple consequence of the Pólya-Vinogradov inequality. (For Pólya-Vinogradov see e.g. [21, Theorem 12.5].)

Lemma 5.7. Let $\chi$ be a Dirichlet character modulo $q$ and suppose A q. We have

The bound for principal characters is evident, so we now consider non-principal ones. The sum over smaller $|i|$ satisfies the required bound by combining the Pólya-Vinogradov inequality 1≤|i|≤n with the trivial bound $|\chi| \le 1$. To bound the sum over larger $i$, we can either use the trivial bound $H/A$, or else again appeal to Pólya-Vinogradov together with partial summation as follows: where

In what follows we use the notation $a \sim x$ to mean $x < a \le 2x$. The following proposition will be used shortly.
Proposition 5.8. Suppose $B$ is of index $\alpha \in (0, 1)$. Let $s, t \in [B]$ be coprime positive integers with $t \le H$. Let $r$ be a $B$-free divisor of $t$.
We can detect (5.4) using orthogonality of characters, obtaining

By Lemma 5.7, the contribution of the principal character $\chi = \chi_0$ is which gives the first term in the required bound. We now consider the non-principal characters.
Applying the pointwise bound for the sum of $F$ twisted by $\chi$ as given in Lemma 5.7, we see that they contribute

This gives the second contribution to the bound, and we are done.
Proposition 5.9. Let $N$ be in the range $H \ge N \ge H^{8/9}$. Suppose that $B$ has index $\alpha \in (0, 1)$. In the notation of Proposition 5.6, we have

Proof. We dyadically decompose the inner sum over $a$ in the definition of $E$. $F_H(y)$. The denominator of the fraction asτ in reduced form is exactly $t/(t, \tau_1)$, since both $s$ and $a$ are coprime with $t$. We write the left-hand side of (5.5) as a sum over the possible values of asτ, and need to count the number of times a given value is obtained, that is, count solutions to asτ = $i/(t/r)$, which is an equation that determines $a$ modulo $t/r$ up to a sign, yielding the new inner sum over $a$.
Putting this in the definition of $E$ and applying the Cauchy-Schwarz inequality, we thus obtain

Now, by orthogonality we have $\mathbf{1}_{a_1 \equiv a_2 \bmod t/r}$. Thus, using this in the conclusion of Proposition 5.8 with $A = 2^v$, we get

Inserting this estimate back into our upper bound for $E$, we obtain a bound

For simplicity, assume that $r = 1$ and, fixing $v$, write $A = 2^v$. In the context of [34] we may replace $[B]$ with the set of divisors $t$ of the (squarefree) modulus $q$, in which case one may estimate the inner sum in $G_{1,v}(\tau_1)$ using the simple bound

held uniformly in $b \in (\mathbb{Z}/t\mathbb{Z})^\times$, tracing through the remainder of the proof of [34, Lemma 8] we would find that the corresponding savings obtained is of the shape $H^{1/2-\alpha+o(1)}$, and therefore only provides power savings for $\alpha > 1/2$. To deal with the most general situation (i.e., potentially with $\alpha \le 1/2$ and no guarantee of equidistribution) we cannot simply rely on pointwise counts for elements of $[B]$ in residue classes. The proof of Proposition 5.9 demonstrates that power savings in $H$ may be obtained upon averaging in both the residue class $b \bmod t$ and the modulus $t$.
6. The central limit theorem for general weights $\phi$

6.1. A proof of the central limit theorem. We assemble the estimates of the previous sections to show that $C_k(H;\phi)$ and therefore $M_k(X, H;\phi)$ exhibit Gaussian behavior.
Theorem 6.1. Suppose $\phi$ is of bounded variation and supported in a compact subset of $[0, \infty)$. Assume moreover that $\phi$ is non-vanishing on some open interval. If $B$ has index $\alpha \in (0, 1)$ then for every positive integer $k$. Here $c$ is a constant depending only on $\alpha$.
We apply Proposition 5.9 to bound $E$, so that the right-hand side is
We now choose $N = H^{1-c_1}$ and $M = H^{k/2-c_2/k}$ where the $c_i$ are sufficiently small with respect to $\alpha$, and recall $C_2(H;\phi) = H^{\alpha+o(1)}$ by Lemma 3.3.
Obviously this implies Theorem 1.10 and thus the central limit theorem, Theorem 1.11, for flat counts in short intervals. In fact, more generally, combining Theorem 6.1 with Proposition 2.1 we see that as long as $H \le X^{c_\alpha/k-\varepsilon}$, we have

By the moment method this implies that weighted counts also satisfy a central limit theorem.

Improving on work of Plaksin [39], Matomäki [29] used a sieve-theoretic method to show that for any $\varepsilon > 0$, $|G(X, H)| \ll X H^{-1+\varepsilon}$ for 1 ≤ H ≤ X

Since we have $(2-\alpha)k \ge 1$ whenever $k \ge 1$ and $\alpha \in (0, 1)$, our result improves on that of Matomäki in some range of $H$. Note that, for instance, if $k = 2$ then $c_\alpha/k \ge 1/6$ whenever $0 < \alpha < (7-\sqrt{33})/2 = 0.6277\ldots$, and our range contains hers, at least if $B$ does not consist only of primes.

Proof. This is a direct consequence of [23, Lemma 16.2 and Theorem 16.3].

We also have the following device for proving tightness.

Proof. This is a special case of [23, Corollary 16.9].
7.2. The $B$-free random walk. We apply these results to the random functions $W_X(t)$ with $X \to \infty$.
We need one last lemma regarding regularly varying sequences.
Lemma 7.3. If a sequence of natural numbers $J$ is regularly varying with index $\alpha \in (0, 1)$, then for any $\varepsilon > 0$, $\frac{N_J(tH)}{N_J(H)} \ll_\varepsilon t^{\alpha-\varepsilon}$, for all $t \le 1$ and $H$ larger than the first element of $J$.
Proof.Obviously the result is true if tH < 1, so suppose tH ≥ 1.
If $J \subset \mathbb{N}$ is regularly varying with index $\alpha \in (0, 1)$, then there is some slowly varying $L$ such that $N_J(x) \ll x^\alpha L(x)$ for all $x$ and $N_J(x) \gg x^\alpha L(x)$ for all $x$ larger than the first element of $J$. For convenience we will take $L(x)$ to be defined for all $x \ge 1$ with $\inf_{x \in [1,K]} L(x) > 0$ for any $K > 0$.
(The reader should check that this may be done.) It then follows from Karamata's representation of slowly varying functions (see [26,

$^{4}$ That is, for all $\varepsilon > 0$, there is a compact subset $K$ of $C[0, 1]$ such that $\mathbb{P}(Y_n \notin K) < \varepsilon$ for all sufficiently large $n$. For $C[0, 1]$ this is equivalent to the condition that $Y_n(0)$ is a tight family of real-valued random variables and $\lim_{\delta \to 0} \limsup_{n \to \infty} \mathbb{P}(\omega(Y_n, \delta) \ge \varepsilon) = 0$, where the modulus of continuity of a function $f \in C[0, 1]$ is given by $\omega(f, \delta) = \sup_{|s-t| \le \delta} |f(s) - f(t)|$; see [5, Theorem 7.3].
by Proposition 1.9. As $W_X(0) = 0$, this allows one to deduce that $W_X(t)$ has the same limiting covariance function as $Z(t)$.
Thus we will have the finite dimensional distributions of $W_X(t)$ tend to those of $Z(t)$ if we show for any $k \ge 1$ and any $t_1, \dots, t_k$ that the vector $(W_X(t_1), \dots, W_X(t_k))$ tends to a Gaussian vector. By the Cramér-Wold device [4, Theorem 29.4] this will be true if for any fixed real numbers $\theta_1, \dots, \theta_k$ the random variable and so the Gaussian behavior follows from Theorem 6.2.
We now demonstrate (ii). Note that for any positive integer $\nu$ and $\varepsilon > 0$, we have as long as $X$ (and thus $H$) is sufficiently large so that $N_B(H)$ is non-zero, using Lemma 7.3 in the last step. Hence choosing $\varepsilon$ smaller than $\alpha$ and $\nu$ large enough that $(\alpha - \varepsilon)\nu > 1$, condition (ii) follows from the Kolmogorov-Chentsov Theorem. This completes the proof.

Figure 1. A comparison between a short interval containing 125 primes with a short interval containing 125 squarefrees. The relative paucity of gaps and clusters of squarefrees is indicative of the rigidity of their distribution.

$[B] \subset \langle B \rangle$ the subset of $\langle B \rangle$ of elements $n$ satisfying $\mu_B^2(n) = 1$, i.e., those $n \in \langle B \rangle$ such that no $b \in B$ divides $n$ twice. (For instance, if $B = \{p^2 : p \text{ prime}\}$, then we have $[B] = \{n^2 : n \text{ squarefree}\}$.) We will often use without mention the fact that both $[B]$ and $\langle B \rangle$ are closed under gcd and lcm, or equivalently, they are sublattices of the positive integers with respect to these two operations.

Variance and Moments. Let $N_{B\text{-free}}(n, H) = N_{B\text{-free}}(n+H) - N_{B\text{-free}}(n)$ be the count of $B$-frees in an interval $(n, n + H]$. We consider the moments

(a) A random walk on squarefrees (b) A random walk on primes

Figure 2. A graph of Q(τ ) and an analogue for the primes in the short interval (n, n + H] with n = 875624586 and H = 3000.(a) illustrates a walk which increments by 1 − 6/π 2 if u is squarefree, and −6/π 2 otherwise.(b) illustrates a walk which increments by 1 − 1/ log u if u is prime, and −1/ log u if u is composite.
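The walk in panel (a) can be reproduced numerically. The following is a minimal Python sketch, assuming only the increment rule stated in the caption; the helper names `squarefree_flags` and `squarefree_walk` are illustrative, and no normalization (such as the one that defines $Q(\tau)$) is applied.

```python
import math

def squarefree_flags(n, H):
    """flags[i] is True exactly when u = n + 1 + i is squarefree, for u in (n, n + H]."""
    flags = [True] * H
    for d in range(2, math.isqrt(n + H) + 1):
        sq = d * d
        first = (n // sq + 1) * sq  # smallest multiple of d^2 exceeding n
        for m in range(first, n + H + 1, sq):
            flags[m - n - 1] = False
    return flags

def squarefree_walk(n, H):
    """Partial sums of the walk: step 1 - 6/pi^2 at squarefree u, step -6/pi^2 otherwise."""
    density = 6 / math.pi ** 2
    position, path = 0.0, []
    for is_squarefree in squarefree_flags(n, H):
        position += (1.0 if is_squarefree else 0.0) - density
        path.append(position)
    return path
```

Calling `squarefree_walk(875624586, 3000)` uses the interval quoted in the caption; the final entry of the path equals the number of squarefrees in the interval minus $6H/\pi^2$.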

Figure 3. A fractional Brownian motion and Brownian motion respectively, randomly generated using Mathematica, to be compared with the previous figure.
contribution of the first summand is close to the value $M_B H$ around which $N_{B\text{-free}}(n, H)$ oscillates. On the other hand, the functions $n \mapsto \{(n + H)/d\} - \{n/d\}$ are mean-zero functions of period $d$ and thus are linear combinations of terms $e(n\ell/d)$ for $1 \le \ell \le d - 1$. Upon reducing the fractions $\ell/d$ by the maximal divisor $b \mid (\ell, d)$ with $b \in \langle B \rangle$, we see that $N_{B\text{-free}}(n, H) - M_B H$ is approximated by a linear combination of terms $e(n\ell/d)$ for $(\ell, d)$ $B$-free and $d \in [B]$.
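The mean-zero property can be checked exactly for small parameters. The sketch below, with illustrative function names, also verifies the elementary identity that the number of multiples of $d$ in $(n, n+H]$ equals $H/d$ minus the fractional-part difference; it uses exact rational arithmetic so no rounding is involved.

```python
from fractions import Fraction

def frac_part_diff(n, H, d):
    """Exact value of {(n + H)/d} - {n/d}."""
    return Fraction((n + H) % d, d) - Fraction(n % d, d)

def multiples_in_interval(n, H, d):
    """Number of multiples of d in the interval (n, n + H]."""
    return (n + H) // d - n // d

def mean_over_period(H, d):
    """Average of n -> {(n + H)/d} - {n/d} over one full period n = 0, ..., d - 1."""
    return sum(frac_part_diff(n, H, d) for n in range(d)) / d
```

For example, `mean_over_period(7, 4)` is exactly zero, because adding $H$ permutes the residues modulo $d$.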

Proposition 3.5 (Pólya). If $N_R(x)$ is the counting function of a sequence $R$ which regularly varies with index $\alpha$, and if

(with no upper bound constraint on the range of $H$ if $B$ consists only of primes). As a consequence of our $k$th-moment bounds, we can prove the following.

Corollary 6.3. If $B$ is of index $\alpha \in (0, 1)$ then for any $k \ge 1$ and $1 \le H \le X^{c_\alpha/k-\varepsilon}$ we have $|G(X, H)| \ll_k X H^{-(2-\alpha)k}$.

Theorem 7.2 (Kolmogorov-Chentsov). Using the notation of the previous theorem, if $Y_n(0) = 0$ for all $n$ and if there are an absolute constant $C$ and constants $a, b > 0$ such that $\sup_n \mathbb{E}|Y_n(s) - Y_n(t)|^a \le C|s - t|^{1+b}$ for all $s, t \in [0, 1]$, then the sequence of random elements $Y_n$ of $C[0, 1]$ is tight.
Lemma 3.1. For a sieving set $B$, the sequence B has index $\alpha$ if and only if $[B]$ has index $\alpha$.
for $C_2(H;\phi)$. If we specialize to the case $\phi = \mathbf{1}_{(0,1]}$, then $\Phi_H(t) = E_H(t)$. If we use the identity

uniformly over both reduced residues $b \bmod t$, and over $t$. This follows from the equidistribution in residue classes of integers in an interval. While somewhat crude, this estimate suffices to produce the required power savings in $H$ in Montgomery and Vaughan's analogue of $E$, as found in Proposition 5.6. In contrast, when $B$ is a sparse set of index $\alpha < 1$ we cannot reasonably hope for such equidistribution in residue classes in general. Even if, say, the optimistic bound

and choose a random integer $n \in [1, X]$ uniformly. Suppose that $B$ is a regularly varying sequence of index $\alpha \in (0, 1)$ and $\phi$ is a real-valued function of bounded variation, supported in a compact subset of $[0, \infty)$ and non-vanishing on some open interval. Then the random variable tends to the standard normal distribution $N_{\mathbb{R}}(0, 1)$ as $X \to \infty$.

6.2. An application to long gaps. Estimate (6.1) allows us to obtain strong information about the frequency of long gaps between consecutive $B$-free integers. Given $1 \le H \le X$, let $G(X, H) := \{n \le X : N_{B\text{-free}}(n, H) = 0\}$. Thus, $|G(X, H)|$ records the number of length-$H$ intervals with an endpoint $n \in [1, X]$ that contain no $B$-free numbers.
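To make the definition of $G(X, H)$ concrete, here is a minimal Python sketch for the special case of squarefrees (that is, $B = \{p^2 : p \text{ prime}\}$). The function name `gap_starts` is illustrative; the sketch enumerates the set by brute force with a sieve and prefix sums, and is not meant to reflect the moment method used in the text.

```python
import math

def gap_starts(X, H):
    """Sorted list of n in [1, X] such that (n, n + H] contains no squarefree integer."""
    limit = X + H
    squarefree = [True] * (limit + 1)
    squarefree[0] = False
    for d in range(2, math.isqrt(limit) + 1):
        for m in range(d * d, limit + 1, d * d):
            squarefree[m] = False
    # prefix[u] = number of squarefree integers in [1, u]
    prefix = [0] * (limit + 1)
    for u in range(1, limit + 1):
        prefix[u] = prefix[u - 1] + (1 if squarefree[u] else 0)
    return [n for n in range(1, X + 1) if prefix[n + H] == prefix[n]]
```

For instance, `gap_starts(100, 3)` returns `[47, 97]`, corresponding to the runs 48, 49, 50 and 98, 99, 100 of consecutive non-squarefree integers, so $|G(100, 3)| = 2$ in this case.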