Bell’s stated assumptions in deriving his inequality were sufficient conditions. It is shown that a far simpler condition exists for derivation of the inequality: the mere existence of finite data sets regardless of their statistical or deterministic characteristics. When explicitly computing various quantum correlations, the non-commutation of some observables must be taken into account. The resulting variation in correlations among the observables leads to satisfaction of the Bell inequality. Bell’s mistaken assumption of the same functional form for all correlations is the principal reason for inequality violation. Upon correction of this error, it is no longer necessary to invoke non-locality or non-reality to explain violation of the Bell inequality.
Bell’s inequality  is derived from a difference of probability-averaged correlation functions. Thus, it is expected that if applied to correlations of finite data sets it might be violated in a manner analogous to the way in which finite statistical measures fluctuate with respect to probability averages. Further, derivation of the inequality, as Bell understood and stated it, appears to depend on assumptions of locality (no instantaneous interactions between detectors) and counterfactual definiteness (a function specifying detector outputs at all settings before observation). These conditions are commonly believed to differentiate classical from quantum physics.
As shown below, however, some of Bell’s stated assumptions for the derivation of the inequality are not logically necessary and have led to logically unnecessary conclusions in Bell’s theorem. As pointed out by researchers [2, 3], Bell-like inequalities were discovered before Bell in the development of the theory of probability. However, the inequality holds under a very general condition: the existence of data sets with equal numbers of items consisting of ± 1’s, one set for each variable in the inequality . The Bell inequality must be satisfied by mutual cross-correlations of these data sets regardless of their other characteristics, even their randomness or lack thereof, since their mutual existence at the time of correlation computation is all that is used in its derivation. This leads to the ultimate loophole in the Bell theorem: mutual cross-correlations of measured or predicted quantum data for three (or four) variables must identically satisfy a Bell inequality. (This conclusion has also been reached in Refs. [2–4].)
Experiments to which the Bell inequality has been applied include observables that do not all commute. Detector outputs for non-commuting observables that are not all measurable (in the theorem) may in fact be represented by quantum probability predictions of values at alternate settings to those used in acquisition of actual data. Bell’s main error in constructing the inequality and theorem was that he assumed rather than computed the correlations of these non-commuting alternative observables, as shown below. These correlations are mathematical creations. However, when quantum mechanical probabilities are used to compute correlations of predicted values with experimentally realizable variables, the resulting sets of correlations satisfy the Bell inequality, as they must, since all the corresponding data under consideration consist of ± 1’s.
A distinction between quantum and classical treatments of non-commutation resides in the fact that non-commutation is explicitly built into quantum formalism, whereas it appears to be an ad hoc add-on in classical considerations. However, upon reflection it may be observed that non-commutation occurs frequently in classical contexts, yet has not been included in an overarching formal logical structure as it has in quantum mechanics. Perhaps that is why in his hidden variables construction, Bell used a representation of spin correlation measurements that ignored non-commutation, and led to the assumption that any non-quantum accounting of quantum correlations should be represented as a second order stationary process. It should also be noted that non-commutation was not historically included in classical probability theory, though incompatibilities of probability spaces at alternative instrument settings have previously been recognized [2, 3]. Non-commutation provides a specific mechanism for the generation of such incompatibilities. A probability counterpart to the treatment of non-commutation in quantum mechanics has recently been given in the concept of conditional probabilities by Khrennikov .
The plan of this paper is as follows: Bell’s derivation of the inequality will be contrasted with an alternative derivation that demonstrates that it results only from the operation of computing mutual correlations of datasets consisting of ± 1’s . It is a purely algebraic result. When applied to either random or deterministic variables, it restricts the values of their mutual cross correlations, which is also indicated in . In the second part of the paper, quantum mechanically given probabilities are used to compute correlations among commuting and non-commuting observables for an experiment, and for a nonlocal hidden variable model. The correlations satisfy the Bell inequality, and contrary to what Bell assumed, they are not second order stationary . Their lack of stationarity ultimately results from their non-commutation. Quantum mechanical commuting observables possess common eigenstates whereas in general, non-commuting observables do not. This leads to qualitatively different behavior in measurement scenarios.
2 Analysis of Bell’s derivation of the inequality
The inequality that Bell derived applies to experiments that measure quantum mechanical correlations originating in pairs of entangled particle spins. Such correlations were calculated by Bell for the experimental schematic of Fig. 1. The quantities in the inequality are experimentally measured values consisting of ± 1’s corresponding to components of spin parallel and anti-parallel to the magnetic field directions on the A and B sides of the apparatus. The entanglement formalism furnishes joint probabilities allowing the calculation of count correlation C(a, b) = −cos (θa − θb) at field angles θa and θb used in the measurements on the two sides.
Bell represented the count occurrences as a stochastic process [1, 6] by defining a function A(a, λ) that may take values ± 1 only, where a = θa is the magnetic field direction (or polarizer direction in the optical case) on the A-side of the apparatus. Parameter λ represents one or more random variables having probability density ρ(λ) that determine through A(a, λ) which of the values ± 1 is observed. The outcome of two measurements, one on each particle of a pair, is determined by the value of random variable λ occurring in that trial. Readouts at all possible angular settings are represented by A(a, λ).
The readout for the particle on the B-side of the apparatus is given by B(b, λ) = −A(b, λ), so that one stochastic process is used to describe all observations, and a correlation of -1 automatically occurs for equal settings on the two sides. The correlation of the detected values for the particle pairs may then be expressed as a probability average using a conventional probability integral (where triangular brackets on the left-hand-side indicate this average):
Because three variables were necessary to obtain a condition on the correlations in the form of an inequality, Bell calculated the difference of two correlations arising from two different settings, b and c, on the B-side of the apparatus:
From Eq. (2.2)
since B(b, λ)² = 1. Taking absolute values of both sides produces
after bringing the absolute value inside the integral on the right-hand side. Defining C(x, y) = 〈A(x, λ)B(y, λ)〉, the Bell inequality is obtained:
All the correlations now occur between variables defined on opposite sides of the apparatus.
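Written out, the steps described above take the following form. This is a sketch reconstructed from the definitions given in the text (B(b, λ) = −A(b, λ), B(b, λ)² = 1, and a normalized density ρ(λ)); the equation labels of the original are deliberately omitted, since their exact assignment is not assumed here.

```latex
\begin{align*}
\langle A(a,\lambda)B(b,\lambda)\rangle
  &= \int A(a,\lambda)\,B(b,\lambda)\,\rho(\lambda)\,d\lambda, \\
\langle A(a,\lambda)B(b,\lambda)\rangle - \langle A(a,\lambda)B(c,\lambda)\rangle
  &= \int \bigl[A(a,\lambda)B(b,\lambda) - A(a,\lambda)B(c,\lambda)\bigr]\rho(\lambda)\,d\lambda \\
  &= \int A(a,\lambda)B(b,\lambda)\bigl[1 - B(b,\lambda)B(c,\lambda)\bigr]\rho(\lambda)\,d\lambda, \\
\bigl|\langle A(a,\lambda)B(b,\lambda)\rangle - \langle A(a,\lambda)B(c,\lambda)\rangle\bigr|
  &\le \int \bigl[1 - B(b,\lambda)B(c,\lambda)\bigr]\rho(\lambda)\,d\lambda
   = 1 - \langle B(b,\lambda)B(c,\lambda)\rangle \\
  &= 1 + \langle A(c,\lambda)B(b,\lambda)\rangle, \\
|C(a,b) - C(a,c)| &\le 1 + C(c,b).
\end{align*}
```

The third line uses B(b, λ)² = 1, the fourth uses |A(a, λ)B(b, λ)| = 1 together with 1 − B(b, λ)B(c, λ) ⩾ 0, and the fifth replaces B(c, λ) by −A(c, λ).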
When C(x, y) = −cos(θx − θy) is inserted in Eq. (2.5) for each of the correlations, with x and y taking appropriate values among a, b, and c, the inequality is violated at certain angular differences. Reasons given for this have been the stated assumptions believed necessary for the derivation of Eq. (2.5). First, observed values on each side of the apparatus depend only on the setting on that side and are independent of the setting on the opposite side, e.g., A(a, λ) is independent of b, rather than dependent on both a and b as A(a, b, λ) would be. (Otherwise changing a value from b to c would cause a new value A(a, c, λ) that would disallow factoring A(a, λ) from the difference terms in Eq. (2.2) .) Second, the function A(a, λ), as used in the derivation, may be interpreted to imply that readouts exist at all instrument settings (values of a) before measurement. This would seem to conflict with the wave function collapse postulate of quantum mechanics. These two assumptions of the theorem are frequently referred to as locality and reality and are often cited as the reason for inequality violation by experimentally observed cosine correlations.
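The violation can be checked numerically. The sketch below assumes the standard three-variable form |C(a, b) − C(a, c)| ⩽ 1 + C(c, b) for the inequality and inserts the cosine correlation for all three terms; the function name `bell_gap` and the chosen angles are illustrative, not taken from the paper.

```python
import math


def bell_gap(theta_a, theta_b, theta_c):
    """Return |C(a,b) - C(a,c)| - (1 + C(c,b)) with C(x,y) = -cos(theta_x - theta_y).

    A positive return value means the assumed inequality is violated.
    """
    def corr(theta_x, theta_y):
        return -math.cos(theta_x - theta_y)

    lhs = abs(corr(theta_a, theta_b) - corr(theta_a, theta_c))
    rhs = 1.0 + corr(theta_c, theta_b)
    return lhs - rhs


# Settings at 0, 60 and 120 degrees: the left side is 1, the right side is 0.5.
gap_violated = bell_gap(0.0, math.pi / 3, 2 * math.pi / 3)

# Settings at 0, 90 and 180 degrees: both sides equal 1, so the bound is just met.
gap_boundary = bell_gap(0.0, math.pi / 2, math.pi)
```

With the same cosine form used for every pair, the first set of angles exceeds the bound by 0.5, while the second sits on the boundary.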
The meaning of the last step in Eq. (2.6) that defines C(c, b) must be considered in the context of application of the inequality to a Bell experiment. From quantum mechanics, the measurements on opposite sides of the apparatus on the two different particles commute. However, two spin measurements on the same particle, i.e., on the same side of the apparatus, do not commute. The principles of quantum mechanics prescribe that if variable b is measured followed by b′, the probability of different outcomes for b′ are conditional on the outcome at the previous setting b. Also, if b is measured again, its previous value is not necessarily obtained, since the probabilities of its outcomes are now conditional on that of b′. By contrast, if the observables in question commute, the same value of b as before must be observed if it is measured again after b′.
Thus, pairs of commutative measurements on two different particles on opposite sides of a Bell apparatus are described by qualitatively different probabilities than pairs of non-commutative measurements on the same particle on one side. It will be shown later in explicit examples that this leads to differing functional forms for correlations among the variables and to satisfaction of the Bell inequality.
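The distinction between the two kinds of measurement pairs can be illustrated with spin operators. The sketch below assumes spin-1/2 components restricted to the x-z plane, σ(θ) = cos θ σz + sin θ σx, and uses hand-written 2×2 and 4×4 matrix helpers (`spin`, `matmul`, `kron`, `commutator_norm` are illustrative names, not from the paper).

```python
import math

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]


def spin(theta):
    """Spin component along angle theta in the x-z plane (Pauli-matrix units)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [s, -c]]


def matmul(x, y):
    """Product of two square matrices given as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def kron(x, y):
    """Kronecker (tensor) product of two square matrices."""
    n, m = len(x), len(y)
    return [[x[i // m][j // m] * y[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]


def commutator_norm(x, y):
    """Largest absolute entry of xy - yx."""
    xy, yx = matmul(x, y), matmul(y, x)
    n = len(x)
    return max(abs(xy[i][j] - yx[i][j]) for i in range(n) for j in range(n))


# Two settings applied to the same particle: the commutator is nonzero.
same_particle = commutator_norm(spin(0.3), spin(1.1))

# One setting on each particle of the pair: the operators commute.
opposite_sides = commutator_norm(kron(spin(0.3), IDENTITY),
                                 kron(IDENTITY, spin(1.1)))
```

For the same particle the commutator has magnitude 2|sin(θ2 − θ1)|, which vanishes only for parallel or anti-parallel settings; for opposite sides it is zero at all settings.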
In his book, Bell described the measurement B(c, λ) as a predicted alternative value, which allowed him to ignore the lack of commutation with measurement B(b, λ). B(c, λ) was the value that would have been observed had the angular setting been c rather than b for the same particle in the same measurement trial. It was a counterfactual that commuted with A(a, λ) but not with B(b, λ), and could not be measured at the same time as B(b, λ). Its replacement by −A(c, λ) was the corresponding counterfactual value that would have existed on the A-side at the same angular setting. Again, since this value now does not commute with A, it cannot be measured on the same A-side particle except by a second sequential measurement on the A-side.
After choosing to use a predicted value rather than a measurement for the third variable in the inequality, Bell assumed, without stating it, that the stochastic process represented by his function A(a, λ) was second order stationary , meaning that all pairs of correlations inserted in Eq. (2.5) have the same functional form dependent on coordinate differences. This will emerge as one of the most important flaws in the Bell theorem.
In Bell’s derivation, the inequality appears as a property of probabilistically computed correlations. As such, actual data fluctuations might be expected to cause some numerical disagreement with it. Further, the inequality results from the stochastic process represented by A(a, λ) and its assumed characteristics: locality and the definition of observables’ values independent of measurement. On this basis, it has been subject to experimental “test” by the physics community [9, 10].
3 Alternate derivation of the Bell inequality as a deterministic result
3.1 Three variables case
In the laboratory, one does not observe correlations such as C(a, b), but sets of ± 1’s from which correlations are calculated. A question may be asked as to how the statistical fluctuations that occur in such quantities due to finite sample size, affect satisfaction and interpretation of the Bell inequality. In Eq. (2.4), probability averages corresponding to infinite data sets are first computed, and the inequality is derived from them. Herein, Eq. (2.5) will be re-derived using a simpler notation to reveal its logic . (For the remainder of the paper, small letters a, b, c, will be used to denote random variables taking values ± 1 corresponding to physical setting angles θa, θb, etc.)
A situation is considered in which there are three variables a, b, and c at corresponding settings. At each setting a series of data consisting of ± 1’s occurs as the data acquisition procedure is repeated. At the i-th repetition, the data values at settings corresponding to a, b, and c are defined as ai, bi, and ci. From these data, one may compute:
Summing (3.1) over N data triplets from the three settings, and dividing by N produces
Taking absolute values of both sides,
Since the expression within the absolute value sign in the right-most term is zero or positive for each i, the absolute value may be removed to produce
Equation (3.4) has the same general form as Eqs. (2.4, 2.5) but must be identically satisfied by any three finite data sets, each set consisting of ± 1’s. Eq. (3.4) depends on no assumptions other than that the data are mutually cross-correlated. Because experiments produce correlation estimates from finite data sets rather than computed correlations for infinite numbers of particle pairs, it might be expected that Eq. (2.5) could be violated by statistical fluctuations. But inequality (3.4) cannot be violated by such fluctuations. Its importance justifies restating the obvious: The terminology as used herein implies that the value ai that multiplies bi is identical to that which multiplies ci, and that the bi and ci that occur in different terms are numerically identical for each value of i. Thus, Eq. (3.4) is not an inequality that holds for probabilistic averages. Rather it is a non-statistical result that holds for both deterministic (non-fluctuating) data and random data under mutual cross-correlation.
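The algebra behind this finite-data inequality can be written out as follows. It is a sketch reconstructed from the steps described in the text, using only that each of ai, bi, ci equals ± 1; the equation labels of the original are omitted rather than assumed.

```latex
\begin{align*}
a_i b_i - a_i c_i &= a_i b_i\,(1 - b_i c_i), \qquad \text{since } b_i^2 = 1, \\
\frac{1}{N}\sum_{i=1}^{N} a_i b_i - \frac{1}{N}\sum_{i=1}^{N} a_i c_i
  &= \frac{1}{N}\sum_{i=1}^{N} a_i b_i\,(1 - b_i c_i), \\
\left|\frac{1}{N}\sum_{i=1}^{N} a_i b_i - \frac{1}{N}\sum_{i=1}^{N} a_i c_i\right|
  &\le \frac{1}{N}\sum_{i=1}^{N} |a_i b_i|\,|1 - b_i c_i|
   = \frac{1}{N}\sum_{i=1}^{N} |1 - b_i c_i|, \\
\left|\frac{1}{N}\sum_{i=1}^{N} a_i b_i - \frac{1}{N}\sum_{i=1}^{N} a_i c_i\right|
  &\le 1 - \frac{1}{N}\sum_{i=1}^{N} b_i c_i .
\end{align*}
```

The last step uses 1 − bi ci ⩾ 0 for every i, so the absolute value inside the sum can be dropped.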
In cases of interest in quantum mechanics, the law of large numbers holds, i.e., the estimates in Eq. (3.4) applied to Bell correlations statistically converge as N → ∞, yielding Bell’s result of Eq. (2.5):
But in practice, all experimental data sets are finite. Even statistical convergence to a mean for large N entails small fluctuations about the mean. The means plus their fluctuations must identically satisfy Eq. (3.4).
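That finite data sets cannot violate the inequality, whatever their statistics, can be checked directly. The sketch below assumes the finite-data form |Σ ai bi − Σ ai ci| ⩽ N − Σ bi ci (the inequality multiplied through by N) and works in integer arithmetic so that no rounding enters; `bell_slack_times_n` is an illustrative name.

```python
import random


def bell_slack_times_n(a, b, c):
    """Return N - sum(b*c) - |sum(a*b) - sum(a*c)| for three equal-length lists of +/-1.

    This is N times the margin by which the finite-data inequality is satisfied,
    computed exactly in integers.
    """
    n = len(a)
    sum_ab = sum(x * y for x, y in zip(a, b))
    sum_ac = sum(x * z for x, z in zip(a, c))
    sum_bc = sum(y * z for y, z in zip(b, c))
    return n - sum_bc - abs(sum_ab - sum_ac)


rng = random.Random(0)
slacks = []

# Random data sets of assorted lengths, with no structure imposed.
for _ in range(1000):
    n = rng.randint(1, 50)
    a = [rng.choice((-1, 1)) for _ in range(n)]
    b = [rng.choice((-1, 1)) for _ in range(n)]
    c = [rng.choice((-1, 1)) for _ in range(n)]
    slacks.append(bell_slack_times_n(a, b, c))

# Deterministic (non-fluctuating) data sets are covered as well.
periodic_a = [1 if i % 2 == 0 else -1 for i in range(30)]
periodic_b = [1 if i % 3 == 0 else -1 for i in range(30)]
periodic_c = [-1 if i % 5 == 0 else 1 for i in range(30)]
slacks.append(bell_slack_times_n(periodic_a, periodic_b, periodic_c))

min_slack = min(slacks)
```

The minimum margin is never negative, for short lists as much as for long ones, which is the sense in which the inequality is identically satisfied rather than satisfied on average.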
Now suppose that a set of functions is proposed for the mutual correlations between data sets a, b, and c, and these correlations violate inequalities (3.4, 3.5). It follows that no data sets can exist that produce the proposed mutual correlations, since any data sets whatsoever must produce correlations that satisfy (3.4), and assuming that statistical limits exist, also (3.5).
Finally, note that for the case of a predicted counterfactual that Bell considered, as opposed to a real measurement, and without the assumption of non-locality, a set of values for b′ corresponding to those measured for a and b would necessarily satisfy the Bell inequality given mutual cross-correlation. (Thus, a value of b′ would exist for each ai and bi.) This must be true even if the predicted values are incorrect as long as they consist of ± 1’s. Bell did not predict such values, however, but assumed a value for the correlation function itself, C(b, b′).
3.2 Four variable Bell inequality
In attempting to avoid dealing with the question of the relation between a counterfactual and a measurement, a four variable CHSH  version of the Bell inequality has been used by experimentalists rather than Eq. (2.5). However, Bell experiments employ two measurements, one on each particle of an entangled pair. How these are related to the four measurements needed in a four variable inequality raises the same questions as occur in the three variable case. In this article, the three variable case will be emphasized for reasons of simplicity and clarity. Key logical findings are similar for the two cases.
The four variable inequality may be derived as follows. One assumes four lists a, a′, b, b′, with i-th members of the lists indicated as ai, a′i, bi, b′i. Each i-th member may take only values ± 1. One forms the expression
This may be summed over the N items in the lists to obtain:
After dividing by N, one obtains
Using the correlation notation of Eq. (2.5), and assuming that limits exist as N → ∞, this may be written
Since the only property used in the derivation of (3.7b) is that each subscripted variable may only take values of ±1, the conclusions stated above for the three variable inequality also follow for this four variable inequality. Further, as in the three variable case, if the experimentally validated C(x, y) = − cos(θx − θy) is inserted for each of the correlations, using the corresponding angular settings of x and y among a, a′, b, b′, then (3.8) is violated. It follows that no data sets exist that when mutually cross-correlated produce this set of correlation functions.
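Both statements can be checked by enumeration. The sketch below assumes one common sign convention for the four-variable expression, ai bi + ai b′i + a′i bi − a′i b′i (the paper's own placement of the minus sign is not assumed), and one standard set of setting angles; `chsh_term` and `chsh_cosine` are illustrative names.

```python
import math
from itertools import product


def chsh_term(a, a_prime, b, b_prime):
    """Per-trial four-variable expression for one quadruple of +/-1 values."""
    return a * b + a * b_prime + a_prime * b - a_prime * b_prime


# All 16 assignments of +/-1: the expression only ever takes the values -2 and +2,
# so its average over any N quadruples lies between -2 and +2.
term_values = {chsh_term(*combo) for combo in product((-1, 1), repeat=4)}


def chsh_cosine(theta_a, theta_a_prime, theta_b, theta_b_prime):
    """Same combination with C(x, y) = -cos(theta_x - theta_y) inserted for every pair."""
    def corr(theta_x, theta_y):
        return -math.cos(theta_x - theta_y)

    return (corr(theta_a, theta_b) + corr(theta_a, theta_b_prime)
            + corr(theta_a_prime, theta_b) - corr(theta_a_prime, theta_b_prime))


# Settings a = 0, a' = 90 degrees, b = 45 degrees, b' = -45 degrees.
s_cosine = chsh_cosine(0.0, math.pi / 2, math.pi / 4, -math.pi / 4)
```

At these angles the cosine form gives a magnitude of 2√2, which exceeds the bound of 2 that any four lists of ± 1’s must respect.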
4 Second order stationarity does not hold for Bell correlations
Algebraic equivalents to Bell’s inequalities were discovered in the development of classical probability theory by Boole, Vorob’ev, and others before Bell (see Sec. 8.3 of Ref.  for a discussion). These results and the related analysis given in Sec. 3 have been largely unknown in the quantum foundations community where it has been reasoned that inequality violation is due to the physical assumptions stated by Bell in the construction of the theorem. These assumptions were locality and realism, in which disbelief has thus been reinforced, and which are commonly believed to distinguish classical from quantum mechanical phenomena.
In the Bell theorem, due to the use of a counterfactual of a non-commuting operation, these issues become connected to the question of how correlations of three data sets are generated from measurements on two particles. If data to compute C(a, b) and C(a, b′) are measured in two separate runs, the value of the i-th ai measured with bi will not in general be the same as that measured with b′i. The factorization step that occurs in the first line in either of the above derivations of the inequality is not allowed, so that conditional correlations among the data sets must be constructed to allow application of the inequality. (An example is given in Sec. 5.1 below.)
Bell inserted an assumed correlation of a non-commuting counterfactual (rather than a calculated one) into the inequality. While commuting pairs of variables on opposite sides of a Bell apparatus are computed and observed to have negative cosine of angular differences correlation, the remaining correlation 〈−B(b) B(c)〉 involving the non-commuting counterfactual, is constrained by the laws of algebra and quantum mechanics to have a different form. In using a single functional form to describe correlations of three measurements, one non-commuting, on two particles, Bell assumed a stochastic process that was second-order stationary. Such processes may be common but are not universal . If the three Bell correlations resulted from such a process, the inequality would be satisfied (up to small fluctuations) even in the absence of mutual data cross-correlation. Since only one correlation functional form would exist for all angular setting pairs, the inequality (3.4) would suffer only small violations due to independently fluctuating statistical correlations, in contrast to its complete satisfaction by coordinated fluctuations under correctly cross-correlated data sets. The large experimental violations of the Bell inequality show that the assumption of second-order stationarity does not characterize the quantum variables in question.
A common view  is that violation of either the three or four variable inequalities is due to their misapplication due to non-locality or specification of readouts before measurement. If the A-side output is ai(θa, θb ), then the reading ai(θa, θb′) is different from ai(θa, θb), and the factoring step in the first line of the derivation cannot be carried out, as indicated before in Sec. 2. This problem may be avoided in the three-variable case (assuming the effect exists) by performing a measurement on the A-side before the setting and measurement on the B-side are decided . (The entangled state contains no time parameter.) Now, under the assumption that some nonlocal interference exists between measurements on the two sides, only three data sets exist (two real and one predicted counterfactual alternative). This implies that the inequality must be satisfied.
5 Explicit calculation of Bell correlations in the three variable case
Two examples will now be given to illustrate the principles discussed above. In Example 1, it will be shown that by using probabilities given by quantum mechanics, a set of measurable correlations may be computed that satisfy the Bell inequality, even for data taken in independent runs. Thus, no inconsistency between quantum mechanics and the analysis presented above occurs. In Example 2, hidden variables using nonlocal information are employed to simulate quantum probabilities for computation of correlations between measurements, and between measurements and a counterfactual. This is the case that Bell considered, but when the correlation is explicitly computed, rather than assumed, the Bell inequality must be satisfied in spite of the use of nonlocal information.
5.1 Example 1
The Bell inequality is not violated if quantum probabilities are used to predict correlations based on experimentally acquired data from separate experimental runs. Data from separate runs may be combined to form three data sets and compute C(b, b′) as predicted using the conditional probabilities of quantum mechanics.
Since quantum mechanics allows computation of conditional probabilities P(b|a) and P(b′|a) that result from entanglement, one can construct a factorable conditional probability P(b, b′|a) = P(b|a)P(b′|a) with data taken in two separate runs on different days. One experiment records data at settings corresponding to a and b, the other at settings for a and b′. Then b and b′ are statistically independent of each other except through their mutual correlation with a. The conditional correlation C(b, b′|a) then depends on products of probabilities each conditional on a. (This is not the situation Bell assumed in his derivation, which will be considered later.)
To establish the notation, first compute the correlation
These joint probabilities at angular settings θa and θb on opposite sides of a Bell apparatus correspond to the pairs of values taken by variables a and b. (From Eqs. (5.2), only opposite values of a and b can occur at equal angles.) Summing over values of a, b in Eq. (5.1) and using Eqs. (5.2) produces the Bell correlation
with C(a, b′) immediately following by replacing θb by θb′ in Eq. (5.3).
The conditional probabilities that will be needed to compute C(b, b′|a) are obtainable from the joint probabilities of Eqs. (5.2), since P(a, b) = P(b|a)P(a) with P(a) = P(b) = 1/2 for a, b = ± 1. (Recall that measurement operations yielding a and b commute.) Thus, the conditional probabilities for b-values given a-values are
Probabilities for P(b′|a) may be immediately obtained by substituting θb′, the setting of b′, for θb in (5.4). The correlation C(b, b′) is now:
The two final sums of (5.5) equal C(b, b′|a = 1) P(a = 1) and C(b, b′|a = −1) P(a = −1) which each equal 1/2 cos(θb − θa) cos(θb′ − θa). Consequently, upon adding them,
Thus, correlations may be computed that are measurable and consistent with both quantum mechanics and Bell’s inequality. The correlation C(b, b′) is different from the other two in that data at b and b′, each conditional on the value of a, are taken in different runs. Values for pairs a and b, and a and b′, are each obtained at the same time, since the two particles and two detector settings exist at the same time. Detector settings for b and b′ do not exist at the same time since one detector cannot be in two places at the same time. (Two detector systems could be used in sequence, however, although this has never been done, to the knowledge of the author.) Consequently, measurement correlations of b and b′ are conditional on a outcomes common to the two runs.
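The computation of Example 1 can be sketched numerically. The code below assumes the standard singlet-state joint probabilities, P(a, b) = ½ sin²((θb − θa)/2) for equal outcomes and ½ cos²((θb − θa)/2) for opposite outcomes, which reproduce C(a, b) = −cos(θb − θa) and are taken here to correspond to Eqs. (5.2). It also assumes the inequality in the form |C(a, b) − C(a, b′)| ⩽ 1 − C(b, b′); the function names are illustrative.

```python
import math


def joint_prob(a, b, delta):
    """Assumed singlet joint probability P(a, b) at angle difference delta."""
    if a == b:
        return 0.5 * math.sin(delta / 2) ** 2
    return 0.5 * math.cos(delta / 2) ** 2


def corr_ab(delta):
    """C(a, b): sum of a*b*P(a, b) over the four outcome pairs."""
    return sum(a * b * joint_prob(a, b, delta)
               for a in (-1, 1) for b in (-1, 1))


def corr_bb_prime(delta_b, delta_bp):
    """C(b, b') built from P(b, b'|a) = P(b|a) P(b'|a), with P(a) = 1/2."""
    p_a = 0.5
    total = 0.0
    for a in (-1, 1):
        for b in (-1, 1):
            for bp in (-1, 1):
                p_b_given_a = joint_prob(a, b, delta_b) / p_a
                p_bp_given_a = joint_prob(a, bp, delta_bp) / p_a
                total += b * bp * p_b_given_a * p_bp_given_a * p_a
    return total


def worst_slack(steps=60):
    """Smallest value of 1 - C(b,b') - |C(a,b) - C(a,b')| on a grid over [0, pi]^2."""
    worst = float("inf")
    for i in range(steps + 1):
        for j in range(steps + 1):
            d_b = math.pi * i / steps
            d_bp = math.pi * j / steps
            slack = (1.0 - corr_bb_prime(d_b, d_bp)
                     - abs(corr_ab(d_b) - corr_ab(d_bp)))
            worst = min(worst, slack)
    return worst


example_bb = corr_bb_prime(0.4, 1.3)
grid_worst = worst_slack()
```

Under these assumptions the conditional construction gives the product of cosines for C(b, b′), and the margin never drops below zero on the grid apart from rounding, since (1 − cos x cos y)² − (cos x − cos y)² = sin²x sin²y ⩾ 0.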
5.2 Example 2
The hidden variable concept that Bell used in derivation of the inequality will now be considered. The hidden variable model employed uses the setting angle θa corresponding to variable a, and a’s resulting measurement outcome as values from which conditional outcomes for variables b and b′ are computed. According to conventional thinking, the Bell inequality should be violated, since nonlocal information regarding setting θa, and the corresponding measurement value for a will be used in the determination of measurement values for b and b′. Variable b′ simply occurs at an alternate setting angle to that of b for the same value of the hidden variables and the same value of a. This situation allows a correlation of b with b′ to be computed even though only one of them can be measured together with a in the conventional configuration of a Bell experiment.
By using two statistically independent, uniformly distributed random parameters 0 ⩽ λ1, λ2 ⩽ 1 per triplet of measurements, both b and b′ may be assigned values simultaneously for each value of a, so that C(b, b′) may be computed as well as C(a, b) and C(a, b′) (see Fig. 2 and its caption). If the first variable λ1 ∈ [0, .5), a = 1. If λ1 ∈ [.5,1), a = −1. The quantum conditional probabilities will be produced using values of parameter λ2 to simulate
with probabilities P(b′, a) obtained from Eq. (5.7) by substituting θb′ for θb. (It is assumed, without loss of generality, that θb′ ⩾ θb.) Using uniformly distributed variable λ2, with 0 ⩽ λ2 ⩽ 1, and referring to Fig. 2, for a = 1:
Similarly, for a = −1,
To obtain corresponding values for b′, replace b by b′, and θb by θb′ in Eqs. (5.8). From Fig. 2, C(b, b′) may be obtained given that the probability that a = ± 1 equals ½ and C(b, b′|a = 1) = C(b, b′|a = −1):
Inserting these correlations into Bell’s inequality:
since θb′ ⩾ θb and 0 ⩽ (θb − θa), (θb′ − θa) ⩽ π, the right-hand side of (5.12b) is always positive under the conditions given, and the Bell inequality is satisfied.
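A simulation in the spirit of Example 2 is sketched below. The threshold rule is an assumption: λ1 fixes a, and a single λ2 is compared with sin²((θ − θa)/2) to decide whether b (or b′) equals a or −a, which reproduces the singlet conditional probabilities for each of b and b′ separately. The paper's own assignment in Eqs. (5.8) and Fig. 2 may differ in detail. The inequality is checked in the integer form |Σ ai bi − Σ ai b′i| ⩽ N − Σ bi b′i.

```python
import math
import random


def simulate(theta_a, theta_b, theta_bp, n, seed=0):
    """Return integer sums (sum a*b, sum a*b', sum b*b') over n simulated triplets."""
    rng = random.Random(seed)
    # Assumed thresholds: probability that the B-side value equals the A-side value.
    p_same_b = math.sin((theta_b - theta_a) / 2) ** 2
    p_same_bp = math.sin((theta_bp - theta_a) / 2) ** 2
    sum_ab = sum_abp = sum_bbp = 0
    for _ in range(n):
        lam1, lam2 = rng.random(), rng.random()
        a = 1 if lam1 < 0.5 else -1          # lambda_1 sets the A-side outcome
        b = a if lam2 < p_same_b else -a      # same lambda_2 drives b ...
        bp = a if lam2 < p_same_bp else -a    # ... and the counterfactual b'
        sum_ab += a * b
        sum_abp += a * bp
        sum_bbp += b * bp
    return sum_ab, sum_abp, sum_bbp


N_TRIALS = 20000
sum_ab, sum_abp, sum_bbp = simulate(0.0, math.pi / 4, math.pi / 2, N_TRIALS)

# Integer margin of the finite-data inequality; it cannot be negative.
slack_times_n = N_TRIALS - sum_bbp - abs(sum_ab - sum_abp)

# Estimated correlation with a, to compare with -cos(theta_b - theta_a).
estimate_ab = sum_ab / N_TRIALS
```

Because b and b′ are assigned in the same trial from the same values of a and λ2, the three sums come from mutually cross-correlated lists, so the inequality holds for the simulated data even though the rule uses the A-side setting and outcome.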
6 Discussion and Conclusion
The Bell inequality, as originally derived and applied in the theorem, appears to be the result of probability averaging of correlations of a stochastic process. In the completion of the theorem, the process was assumed to be second order stationary. However, Bell’s stated assumptions are only sufficient conditions for the derivation of the inequality. It may be derived without these assumptions and shown to be a deterministic result that depends on the mutual cross-correlation of data sets that Bell used. It must be identically satisfied assuming only that the data exist. Since a measurement on the A-side of a Bell apparatus may be completed before the detector setting on the B-side is decided upon, interference between detectors, assuming that it exists, cannot lead to inapplicability of the inequality because three data sets still exist that correspond to the three variables in the inequality.
The fact that measurements in Bell experiments involve two observables while three are used in the Bell inequality has been a major source of confusion in interpretation of the inequality. A third magnetic field could be added in tandem with the field on one side of the usual Bell apparatus in such a way that data for two variables could be obtained by retrodiction. Instead, Bell assumed that the third variable b′ would furnish an alternative measurement to b on the B-side of the apparatus. He then assumed that the correlations of b and b′ with a, and with each other, were all given by the same function. This is true for alternatives ab and ab′ for which the observables in the pairs commute, but not for the correlation of bb′ for which the observables do not commute.
Since quantum probabilities for values at b and b′ are each conditional on that of a, their correlation may be computed consistently with quantum mechanics. The three correlations now satisfy the Bell inequality and are experimentally measurable.
When nonlocal information regarding the experimental outcome on the A-side of a Bell apparatus is used to enable hidden variables that account for a measurement and a counterfactual on the B-side, the concept Bell employed to prove his theorem is realized. Now, however, the resulting correlations satisfy the Bell inequality although the two involving the counterfactual are mathematical creations that are not measurable. Thus, the inequality does not distinguish between locally and non-locally generated correlations.
Actual counter-examples must be constructed if Bell’s theorem is to be falsified. The creation of such counter examples has been claimed [17, 18], but their discussion involves issues beyond the scope of this article.
The author would like to thank Joe Foreman for stimulating conversations on the subject of this paper and Armen Gulian for suggestions leading to its clarification. He would also like to thank the referee whose criticisms resulted in improvement of the manuscript.
Bell J.S., Speakable and unspeakable in quantum mechanics, Cambridge University Press, Cambridge, 1987, Chap. 2
Hess K., Einstein was right, Pan Stanford Publishing Pte. Ltd., Singapore, 2015, Chap. 8
Sica L., Bell’s inequality violation due to misidentification of spatially non-stationary random processes, J. Mod. Opt., 2003, 50, (15-17), 2465-2474, doi:10.1080/0950034032000120858
Papoulis A., Pillai S.U., Probability, random variables, and stochastic processes, McGraw-Hill Companies, Inc., New York, N.Y., 2002, Chap. 9
Bell J.S., op. cit., p. 65
Weihs G., Jennewein T., Simon C., Weinfurter H., Zeilinger A., Violation of Bell’s inequality under strict Einstein locality conditions, Phys. Rev. Lett., 1998, 81, 5039-5043, doi:10.1103/PhysRevLett.81.5039
Mandl F., Quantum mechanics, John Wiley & Sons, New York, 1992, Chap. 5
Sica L., Bell correlations without entanglement: A local wave model using Gaussian-Poisson statistics and single count-pair selection, Applied Mathematics, 2014, 5, 2899-2907, http://dx.doi.org/10.4236/am.2014.518276
De Raedt H., Michielsen K., Hess K., Irrelevance of Bell’s theorem for experiments involving correlations in space and time: a specific loophole-free computer-example, arxiv:1605.05237v1
© 2017 Louis Sica
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 License.