# Understanding Bell’s theorem part 2: the nonlocal version

Continuing the series on Bell’s theorem, I will now write about its most popular version, the one that people have in mind when they talk about quantum nonlocality: the version that Bell proved in his 1975 paper The theory of local beables.

But first things first: why do we even need another version of the theorem? Is there anything wrong with the simple version? The problem is that it is rather misleading.
Quantum mechanics clearly respects no conspiracy and no action at a distance, but clearly does not respect determinism, so the most natural interpretation of the theorem is that trying to make quantum mechanics deterministic is a bad idea. Which is true, but is not the whole story: one might think that giving up determinism is enough to retain locality. It’s not. It’s enough to retain no action at a distance, but you still need to sacrifice a precious notion of locality in order to violate Bell inequalities: local causality.

Informally, it says that causes are close to their effects. A bit more formally, it says that probabilities of events in a spacetime region $A$ depend only on stuff in its past light cone $\Lambda$, and not on stuff in a space-like separated region $B$. So we have

• Local causality: $p(A|\Lambda,B) = p(A|\Lambda)$.

How do we derive a Bell inequality from that? Start with the identity
$p(ab|xy\lambda) = p(a|bxy\lambda)p(b|xy\lambda)$
and consider Alice’s probability $p(a|bxy\lambda)$: obtaining an outcome $a$ certainly counts as an event in $A$, and Alice’s setting $x$ and the physical state $\lambda$ certainly count as stuff in $\Lambda$. On the other hand, $b$ and $y$ are clearly stuff in $B$. So we have
$p(a|bxy\lambda) = p(a|x\lambda)$
Doing the analogous reasoning for Bob we have
$p(b|xy\lambda) = p(b|y\lambda)$
and substituting this back we get
$p(ab|xy\lambda) = p(a|x\lambda)p(b|y\lambda).$
This equivalence is called factorizability, and is all that we need. If we recall the decomposition of the probabilities we get from no conspiracy
$p(ab|xy) = \sum_\lambda p(\lambda)p(ab|xy\lambda)$
and join it with factorizability, we end up with
$p(ab|xy) = \sum_\lambda p(\lambda)p(a|x\lambda)p(b|y\lambda)$
Noting that for any coefficients $M^{ab}_{xy}$ the Bell expression
$p_\text{succ} = \sum_{abxy} \sum_\lambda M^{ab}_{xy} p(\lambda)p(a|x\lambda)p(b|y\lambda)$
is linear in $p(a|x\lambda)$ and in $p(b|y\lambda)$, so that its maximum is attained by deterministic probability distributions $p(a|x\lambda)$ and $p(b|y\lambda)$, the rest of the proof of the simple version of Bell’s theorem applies, and we’re done.
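
As a quick sanity check of that last step, here is a minimal Python sketch that finds the bound by brute force. It assumes a specific Bell expression that the derivation above does not fix: the CHSH game, with uniformly random binary inputs and a win whenever $a \oplus b = x \wedge y$. The function names are illustrative only.

```python
from itertools import product

def chsh_coefficients():
    """Assumed example: coefficients M[a, b, x, y] of the CHSH game.

    Inputs x, y are uniform (weight 1/4 each) and the players win
    when a XOR b equals x AND y.
    """
    return {
        (a, b, x, y): 0.25 if (a ^ b) == (x & y) else 0.0
        for a, b, x, y in product((0, 1), repeat=4)
    }

def local_bound(M):
    """Maximise the Bell expression over all deterministic local strategies."""
    best = 0.0
    # A deterministic strategy for Alice is a function x -> a,
    # stored as the tuple (a for x=0, a for x=1); likewise for Bob.
    for alice in product((0, 1), repeat=2):
        for bob in product((0, 1), repeat=2):
            value = sum(
                M[(alice[x], bob[y], x, y)]
                for x, y in product((0, 1), repeat=2)
            )
            best = max(best, value)
    return best

print(local_bound(chsh_coefficients()))  # 0.75
```

Since $p_\text{succ}$ is a convex combination over $\lambda$, no choice of $p(\lambda)$ can do better than the best deterministic strategy, so in this assumed example $3/4$ bounds every factorizable model.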

So there we have it, a perfectly fine derivation of Bell’s theorem, using only two simple and well-motivated assumptions: no conspiracy and local causality.

It annoys me to no end that people very often use factorizability as an assumption instead of local causality. Why would you go for some dry technical assumption instead of one with a clear physical meaning? Is it just a desperate move to avoid admitting that there is something nonlocal about quantum mechanics? Or maybe is there a good way to motivate factorizability? I don’t think so.

It was first postulated by Clauser and Horne in 1974. Their justification is that factorizability

…is a natural expression of a field-theoretical point of view, which in turn is an extrapolation from the common-sense view that there is no action at a distance.

What are they talking about? Certainly not about quantum fields, which do not factorize. Maybe about classical fields? But only those without correlations, because otherwise they don’t factorise either! Or are they thinking about deterministic fields? But then they could postulate determinism directly! And anyway why do they claim that it is an extrapolation of no action at a distance? They don’t have a derivation to be able to claim such a thing!

Nowadays, though, people motivate factorizability not with Clauser and Horne’s argument but with Reichenbach’s principle, which states that if two events A and B are correlated, then either A influences B, B influences A, or there is a common cause C such that
$p(AB|C) = p(A|C)p(B|C)$
It is easy to see that this directly implies factorizability for the Bell scenario.

It is often said that Reichenbach’s principle embodies the idea that correlations cry out for explanations. This is bollocks. It demands the explanation to have a very specific form, namely the factorised one. Why? Why doesn’t an entangled state, for example, count as a valid explanation? If you ask an experimentalist that just did a Bell test, I don’t think she (more precisely Marissa Giustina) will tell you that the correlations came out of nowhere. I bet she will tell you that the correlations are there because she spent years in a cold, damp, dusty basement without phone reception working on the source and the detectors to produce them. Furthermore, the idea that “if the probabilities factorise, you have found the explanation for the correlation” does not actually work.

I think the correct way to deal with Bell correlations is not to throw your hands in the air and claim that they cannot be explained, but to develop a quantum Reichenbach principle to tell which correlations have a quantum explanation and which do not. This is currently a hot research topic.

But leaving those grandiose claims aside, is there a good motivation for Reichenbach’s principle? I don’t think so. Reichenbach himself motivated his principle from considerations about entropy and the arrow of time, which simply do not apply to a simple quantum state of two qubits. There may be a motivation other than his original one, but I don’t know of any.

Therefore, as far as I know local causality is really the only way to motivate factorisability. So please do not conflate them or, even worse, claim that they are equivalent. They’re not.

To conclude, should we consider the nonlocal version of Bell’s theorem as the ultimate version, and forget about the simple version? We can’t, because the nonlocal version doesn’t allow you to do quantum key distribution based on Bell’s theorem. If you use the simple version of Bell’s theorem and believe in no action at a distance, a violation of a Bell inequality implies not only that your outcomes are correlated with Bob’s, but also that they are in principle unpredictable, so you have managed to share a secret key with him, which you can use, for example, for a one-time pad.

Update: Rewrote the paragraph about QKD.


### 4 Responses to Understanding Bell’s theorem part 2: the nonlocal version

1. Gláucia Murta says:

Hey Mateus,
Very nice set of posts making explicit the different hypotheses in Bell’s theorem! I really enjoyed reading them.

So, in this comment I’m going to summarize our offline discussions about my objection with respect to your last paragraph.
I disagree (and I guess you also disagree now) with your conclusion that the “nonlocal version” of Bell’s theorem “doesn’t allow you to do quantum key distribution”.
My claim is that the security proofs for key distribution are consistent with both the simple version and the nonlocal version:
* The key distribution protocols whose security relies on quantum mechanics use the Bell violation to estimate the best probability of Eve guessing an outcome of Alice. The idea is to look at all possible tripartite (Alice, Bob and Eve) quantum boxes such that the marginal of Alice and Bob is consistent with the Bell violation.
The more refined protocols, with better rates, make use of the self-testing properties of the CHSH inequality, and therefore they can estimate how much information a quantum Eve could have, given a set of states consistent with the particular violation.
* There are also protocols that use the violation of a Bell inequality to establish security against a non-signalling eavesdropper. (On that subject I cannot say much.) But as far as I understand, even in this case the security still holds independently of your interpretation. Again the security is based on optimizing over all possible tripartite (Alice, Bob and Eve) no-signalling boxes such that the marginal of Alice and Bob is consistent with the Bell violation and therefore does not admit a decomposition of the “local” form.

So in summary, the security proofs only assume that, given the violation, there exists no decomposition of the form $p(ab|xy)=\sum_{\lambda} p(\lambda)p(a|x\lambda)p(b|y\lambda)$ for the probability distribution of Alice and Bob’s outcomes.

Finally, I just want to comment that even in the nonlocal version we can still conclude that the outcomes of Alice and Bob “are in principle unpredictable” when one observes a violation. That is because the fact that they are locally causally correlated also implies that, in principle, there could exist a locally causal decomposition of your probability distribution where, given the knowledge of “lambda”, the outcomes of Alice and Bob are deterministic. Therefore the violation witnesses that no such decomposition exists.

Cheers!

2. Mateus Araújo says:

Thanks for the comment, Gláucia. I did have in mind QKD against no-signalling adversaries, precisely because the normal QKD against quantum adversaries simply assumes the validity of quantum mechanics, which makes Bell’s theorem just an inspiration for them, not a proof of security. But I am guilty of not making this clear =)

But I don’t quite agree with you about QKD against no-signalling adversaries. As you noticed, their proofs (e.g. in arXiv:quant-ph/0405101 or arXiv:0807.2158) only assume that the joint probability distribution of Alice, Bob, and Eve $p(abe|xyz)$ is non-signalling, and prove that if the marginal $p(ab|xy)$ violates a Bell inequality then $e$ cannot be perfectly correlated with $a$. So they don’t need to talk directly about Bell’s theorem or its assumptions. In this sense, you are right: QKD is going to work regardless of your interpretation of Bell’s theorem.

But there is still the question of explaining why QKD works. In the case where there is no violation of a Bell inequality, the situation is clear: the probabilities can be written as
$p(abe|xyz) = \sum_\lambda p(\lambda)p(a|x\lambda)p(b|y\lambda)p(e|z\lambda)$
for deterministic $p(a|x\lambda)$, $p(b|y\lambda)$, and $p(e|z\lambda)$, so Eve can know everything. She just needs to set $e=\lambda$ and wait until Alice and Bob announce their inputs $x$ and $y$ (they need to do this in order to do QKD). Since she built the boxes she knows the functions $p(a|x\lambda)$ and $p(b|y\lambda)$, so she can just calculate all $a$ and $b$ and have their key.
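
A toy Python sketch of this strategy, under assumptions that are not part of the argument itself: inputs and outputs are single bits, and $\lambda$ is encoded as a pair of lookup tables, one per box. All names are hypothetical.

```python
import random

def alice_box(x, lam):
    # Deterministic response of Alice's box: look up a in lambda's first table.
    return lam[0][x]

def bob_box(y, lam):
    # Deterministic response of Bob's box: look up b in lambda's second table.
    return lam[1][y]

def run_round(rng):
    # Hidden variable: one output bit per input, for each box.
    lam = (
        (rng.randint(0, 1), rng.randint(0, 1)),
        (rng.randint(0, 1), rng.randint(0, 1)),
    )
    x, y = rng.randint(0, 1), rng.randint(0, 1)
    a, b = alice_box(x, lam), bob_box(y, lam)

    e = lam  # Eve's output is simply a copy of lambda.
    # Once x and y are announced, Eve evaluates the known response functions.
    eve_guess = (alice_box(x, e), bob_box(y, e))
    return (a, b), eve_guess

rng = random.Random(0)
rounds = [run_round(rng) for _ in range(1000)]
print(all(outcomes == guess for outcomes, guess in rounds))  # True
```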

In the case where Alice and Bob do violate a Bell inequality, the probabilities can still be written as
$p(abe|xyz) = \sum_\lambda p(\lambda)p(a|xyz\lambda)p(b|xyz\lambda)p(e|xyz\lambda),$
for deterministic $p(a|xyz\lambda)$, $p(b|xyz\lambda)$, and $p(e|xyz\lambda)$, so a Bohmian Eve can still know everything using the same strategy as before. To rule that out, you need to assume that it is not possible that $a$ depends on $y$ and $b$ depends on $x$. The nonlocal version of Bell’s theorem does not help with that, as it can only say that these probabilities are not locally causal. The simple version, on the other hand, has the assumption no action at a distance ready to do the job. You can insist that it is true, and in this case there will be simply no deterministic decomposition for $p(abe|xyz)$.
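
A minimal Python sketch of such a deterministic but nonlocal decomposition. The example is an assumption made for illustration: it uses a PR box (non-signalling, though not quantum) written as $a = \lambda$, $b = \lambda \oplus xy$ with a uniformly random bit $\lambda$, so Bob's response function depends on Alice's input $x$. Function names are hypothetical.

```python
from itertools import product

def alice_box(x, y, lam):
    # Alice's deterministic response; here it happens to ignore x and y.
    return lam

def bob_box(x, y, lam):
    # Bob's deterministic response depends on Alice's input x: nonlocal.
    return lam ^ (x & y)

def joint_distribution(x, y):
    """p(ab|xy) obtained by averaging over a uniformly random bit lambda."""
    dist = {}
    for lam in (0, 1):
        outcome = (alice_box(x, y, lam), bob_box(x, y, lam))
        dist[outcome] = dist.get(outcome, 0.0) + 0.5
    return dist

def chsh_success():
    """Winning probability of the CHSH game with uniform inputs."""
    total = 0.0
    for x, y in product((0, 1), repeat=2):
        for (a, b), prob in joint_distribution(x, y).items():
            if (a ^ b) == (x & y):
                total += 0.25 * prob
    return total

def alice_marginal(x, y):
    dist = joint_distribution(x, y)
    return [sum(p for (a, _), p in dist.items() if a == v) for v in (0, 1)]

def bob_marginal(x, y):
    dist = joint_distribution(x, y)
    return [sum(p for (_, b), p in dist.items() if b == v) for v in (0, 1)]

print(chsh_success())  # 1.0
```

In this toy model the marginals are uniform whatever the other party's input is, so the observed statistics are non-signalling, yet an Eve who holds $\lambda$ and learns $x$ and $y$ can compute both $a$ and $b$.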

In any case, I have some rewriting to do in the post. Cheers!

3. Gláucia Murta says:

Hey Mateus,

Oh, I see your point about “explaining why QKD works” using a Bell violation. And I agree with you that the negation of Bell’s theorem in the Bohmian interpretation and in the nonlocal version does not allow us to rule out that Eve could have information about the key.
And if for the simple version we consider violation + no action at a distance, then we could rule out the deterministic box you’ve constructed.
However, I disagree that we can take no action at a distance for granted in the simple version. The simple version shows that assuming determinism + no action at a distance implies the Bell theorem. Therefore violating a Bell inequality just tells us we cannot have these 2 assumptions at the same time. So I would say that by itself the simple version also does not lead to QKD, unless we add the extra assumption that no action at a distance (no-signalling) holds even when violating a Bell inequality. But then why not add this extra assumption also for the nonlocal version?

Cheers!

4. Mateus Araújo says:

Indeed, one does need to assume no action at a distance to have QKD in the simple version, and then one might as well also assume it in the nonlocal version. In fact, most people I know do believe that local causality is false, but no action at a distance is true.

But I don’t think this makes much sense as a theorem, as no action at a distance is a strictly weaker assumption than local causality. I have never seen a theorem being proved from assumptions A and B, where A implies B.

Also sociologically, the authors that prefer to present the nonlocal version of Bell’s theorem usually say that the conclusion is that “the world is nonlocal”. It is just weird to say that you conclude that the world is nonlocal, but actually not so nonlocal that you violate no action at a distance, so in fact the world is also not deterministic.