Understanding Bell’s theorem part 3: the Many-Worlds version

This post is based on discussions with Harvey Brown, Eric Cavalcanti, and Nathan Walk.

After going through two versions of Bell’s theorem, one might hope to be done with it. Well, this was the situation in 1975, and judging by the huge amount of literature produced since then about Bell’s theorem, I think it is clear that the scientific community is far from being done with it. Why is that so? One reason is that many people really don’t want to give up any of the assumptions behind the simple version of Bell’s theorem: they are used to classical mechanics, which offers them a world with determinism and no action at a distance, and they want to keep it that way. But if you ask the quantum community specifically, they do not lose any sleep over the simple version: they are happy to give up determinism and keep no action at a distance. Instead, the real thorn in their side is the failure of local causality. It is after all a well-motivated locality assumption: even if it is not demanded by relativity, it seems to be a plausible extrapolation from it. Furthermore, the failure of local causality is not even a brute experimental fact that people must just accept and be done with. To see that your probabilities have changed as a result of a measurement done in a space-like separated region you need to know the result of said measurement. And then it is not space-like separated anymore: it has moved to your past light cone.

But this is just an abstract complaint about the theorem, one that doesn’t suggest any obvious solution. A more concrete problem, which is much easier to address, is that both the simple and the nonlocal versions blissfully ignore the Many-Worlds interpretation. Even if you don’t find this interpretation compelling, it is taken seriously by a big part of the scientific community, and I don’t think it is defensible to simply ignore it when discussing the foundations of quantum mechanics.

So how do we reformulate Bell’s theorem to take the Many-Worlds interpretation into account? On this point the literature is rather disappointing, as nobody seems to have tried to do that. The papers I know either exclude the Many-Worlds interpretation via an explicit assumption, or simply note that Bell’s theorem does not apply to it, as the derivation implicitly assumes that measurements have a single outcome. This is true, but rather unsatisfactory. Should we conclude then that Bell’s theorem is just a mistake? And how about local causality, is it violated or not? And how about quantum key distribution, does it work at all, or do we need to change cryptosystems if we believe that Many-Worlds is true?

Let us start by examining local causality, or more precisely one of the equations we used in the derivation:
\[ p(a|bxy\lambda) = p(a|x\lambda) \]
this says that the probability of Alice obtaining outcome $a$ depends only on her setting $x$ and the physical state $\lambda$, and not on Bob’s setting $y$ or his outcome $b$. We immediately have a problem: what can “Bob’s outcome $b$” possibly mean in the Many-Worlds interpretation? After all if Alice and Bob share an entangled state $\frac{\ket{00}+\ket{11}}{\sqrt2}$, then before Bob’s measurement their joint state is
\[ \ket{\text{Alice}}\frac{\ket{00}+\ket{11}}{\sqrt2}\ket{\text{Bob}} \]
which, after his measurement, becomes
\[ \ket{\text{Alice}}\frac{\ket{00\text{Bob}_0}+\ket{11\text{Bob}_1}}{\sqrt2}\]
So there is no such thing as “Bob’s outcome”. There are two copies of Bob, each seeing a different outcome. Maybe we can then use $b=\frac{\ket{00\text{Bob}_0}+\ket{11\text{Bob}_1}}{\sqrt2}$ in that equation, instead of $b=0$ or $b=1$. Does it work then? Well, there is still the problem that the equation is about the “probability of Alice obtaining outcome $a$”. But we know that there is also no such thing as “Alice’s outcome”: there will be two copies of Alice, each seeing a different outcome. So from a third-person perspective it makes no sense to talk about the “probability of Alice obtaining outcome $a$”. On the other hand, from Alice’s perspective she will experience a single outcome (if you experience more than one outcome, I want to know what you are smoking), so we can talk about probabilities in a first-person, decision-theoretic way. The equation is then about how much Alice should bet on experiencing outcome $a$, or more precisely the maximum she should pay for a ticket that gives her 1€ if the outcome she experiences is $a$.

So, how much should she? Well, the right hand side $p(a|x\lambda)$ is easy to decide: she only knows that she is making a measurement on one half of an entangled state, whose reduced density matrix is $\mathbb{1}/2$. Her probabilities are $1/2$, independently of the basis in which she measures. How about the left hand side of the equation, $p(a|bxy\lambda)$? Well, now she knows in addition that Bob is in the state $\frac{\ket{00\text{Bob}_0}+\ket{11\text{Bob}_1}}{\sqrt2}$ (we are assuming for simplicity that he measured in the Z basis). So what? How does that help her in predicting which outcome she will experience? This state has no bias towards 0 or 1, and there is no more information, outside her future light cone, that could help her make the prediction. This is no surprise, as in the Many-Worlds interpretation whatever Bob does is assumed to be a unitary, and unitaries applied to one half of an entangled state cannot affect the probabilities of measurements on the other half. For Bob’s measurement to affect Alice in any way it would have to cause a collapse of the wave function, and this is precisely what the Many-Worlds interpretation says does not happen. We must therefore conclude that $p(a|bxy\lambda) = 1/2$ and that this bastardised version of local causality is respected.
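The claim that a collapse-free measurement by Bob leaves Alice’s betting odds untouched can be checked numerically. What follows is a minimal sketch, not part of the original argument. It assumes that Bob’s measurement is modelled as a basis rotation followed by a CNOT onto a one-qubit memory, and that all measurement bases lie in the X-Z plane, so real amplitudes suffice. The names `rotate`, `bob_measures`, and `alice_prob` are made up for illustration.

```python
import math
from itertools import product

# Basis states are labelled (Alice's qubit, Bob's qubit, Bob's memory).
# Assumption: all bases lie in the X-Z plane, so amplitudes stay real.

def rotate(state, qubit, theta):
    """Re-express one qubit in the basis {cos t|0> + sin t|1>, -sin t|0> + cos t|1>}."""
    c, s = math.cos(theta), math.sin(theta)
    new = {}
    for bits in product((0, 1), repeat=3):
        lo = bits[:qubit] + (0,) + bits[qubit + 1:]
        hi = bits[:qubit] + (1,) + bits[qubit + 1:]
        a0, a1 = state.get(lo, 0.0), state.get(hi, 0.0)
        new[bits] = c * a0 + s * a1 if bits[qubit] == 0 else -s * a0 + c * a1
    return new

def bob_measures(state, theta):
    """Unitary (collapse-free) measurement: rotate Bob's qubit, then CNOT it onto his memory."""
    rotated = rotate(state, 1, theta)
    return {(a, b, m ^ b): amp for (a, b, m), amp in rotated.items()}

def alice_prob(state, phi, outcome):
    """Born weight of the branches in which Alice sees `outcome` in the basis at angle phi."""
    rotated = rotate(state, 0, phi)
    return sum(amp ** 2 for (a, _, _), amp in rotated.items() if a == outcome)

r = 1 / math.sqrt(2)
before = {(0, 0, 0): r, (1, 1, 0): r}    # (|00> + |11>)/sqrt(2), Bob's memory still blank
after = bob_measures(before, theta=0.0)  # Bob measures in the Z basis

for phi in (0.0, 0.3, math.pi / 4, 1.2):
    print(phi, alice_prob(before, phi, 0), alice_prob(after, phi, 0))
```

Both printed probabilities should stay at 1/2 (up to rounding) for every angle `phi`, and the same holds if `theta` is changed, since Bob’s operation is an orthogonal transformation that never touches Alice’s qubit.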

Does this imply that Bell inequalities are not violated in the Many-Worlds interpretation? Of course not! To derive them we needed the version of local causality where Bob had a single outcome. Can we still use it in some way? Well, Bob does obtain a single outcome from Alice’s point of view after they interact in the future (and become decohered with respect to each other), so then (and only then) we can talk about the joint probabilities $p(ab|xy)$. As eloquently put by Brown and Timpson:

We can only think of the correlations between measurement outcomes on the two sides of the experiment actually obtaining in the overlap of the future light-cones of the measurement events—they do not obtain before then and—a fortiori—they do not obtain instantaneously.

But at this point in time the assumption of local causality becomes ill-motivated: Bob’s measurement is now in Alice’s past light-cone, and it is perfectly legit for her probabilities to depend on it. The information from it had, after all, to sluggishly crawl the intervening space in order to influence her.

So the nonlocal version of Bell’s theorem simply falls apart in the Many-Worlds interpretation. Can we still derive some version of Bell’s theorem from well-motivated assumptions, or do we need to give up and say that it simply doesn’t make sense? Well, I wouldn’t be writing this post if I didn’t have a solution.

To do it, we start by formalising the version of local causality presented above. It says that Alice’s probability of experiencing outcome $a$ depends only on stuff in her past light-cone $\Lambda$, and not on anything else in the entire region $\Gamma$ outside her future light-cone.

  • Generalised local causality:  $p(a|\Gamma) = p(a|\Lambda)$.

Note that we had to condition on the entire region $\Gamma$ instead of only on Bob’s lab because the state $\frac{\ket{00\text{Bob}_0}+\ket{11\text{Bob}_1}}{\sqrt2}$ is defined in the former region, not in the latter.

I think it is fair to call this generalised local causality because it reduces to local causality if one assumes that Bob’s measurement had a single outcome, via some sort of wavefunction collapse. Note also that in the Many-Worlds interpretation generalised local causality is essentially the same thing as no action at a distance. This is because Many-Worlds is a deterministic theory (not in the sense that the outcome of the measurement is predictable, but in the sense that the post-measurement state is uniquely determined by the pre-measurement state), and therefore conditioning on the post-measurement state doesn’t bring us any additional information. This is not really a surprise, since local causality also reduces to no action at a distance for deterministic theories.

This brings us to the second assumption needed to derive the Many-Worlds version of Bell’s theorem. Since we have now some sort of no action at a distance, one might expect some sort of determinism to do the job and complete the derivation. This is indeed the case, but the terminology here becomes unfortunately confusing, because as explained above Many-Worlds is a deterministic theory, but not in the sense demanded by determinism. The assumption we need is predictability, i.e., that an observer with access to the physical state $\lambda$ can predict the measurement outcomes. As wittily put by Howard Wiseman, determinism means that “God does not play dice”, and predictability means that “God does not let us play dice”. Putting it in a more boring way, we simply write

  • Predictability:  $p(ab|xy\lambda) \in \{0,1\}$.

Using then predictability together with generalised local causality we can again prove Bell’s theorem, following the same steps we did for the simple version. The interesting thing is that while generalised local causality is always true, there are some situations where predictability holds and some where it doesn’t, and a violation of a Bell inequality implies that it does not hold.
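To make the step from these two assumptions to a Bell inequality concrete, here is a small sketch. It takes the CHSH inequality as the example (an assumption for illustration, since this section does not single out a particular inequality) and relabels the outcomes as $\pm1$. Predictability fixes the outcomes given $\lambda$, and generalised local causality lets Alice’s outcome depend only on $x$ and Bob’s only on $y$, so it suffices to enumerate all such assignments.

```python
from itertools import product

def chsh(a, b):
    """CHSH value for predetermined outcomes a[x], b[y] in {+1, -1}."""
    return a[0] * b[0] + a[0] * b[1] + a[1] * b[0] - a[1] * b[1]

# Every predictable, local assignment: Alice's outcome depends on x only, Bob's on y only.
values = [chsh(a, b)
          for a in product((1, -1), repeat=2)
          for b in product((1, -1), repeat=2)]

print(max(values), min(values))
```

Any probability distribution over $\lambda$ is a convex mixture of these 16 assignments, so the averaged CHSH value cannot leave the range spanned by `values`.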

I think it is instructive to consider some concrete examples to see how this works. The simplest case is where Alice and Bob share a pure product state and Eve knows it. For example their joint state could be
\[ \ket{\text{Alice}}\ket{00}\ket{\text{Bob}}\ket{\text{Eve}^{00}} \]
In this case it is clear that Eve can predict the result of their measurements (in the computational basis) and that therefore they cannot violate any Bell inequality. A slightly less simple case is where they all start in this same state, but Alice and Bob do a measurement in the superposition basis. Now Eve cannot predict the result of the measurement, but Alice and Bob still cannot violate a Bell inequality. This is ok, because predictability is a sufficient condition for a Bell inequality to hold, not a necessary one.
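One way to see why no choice of measurements can help here is the following short derivation. It is a sketch under assumptions not spelled out in the post: the Bell inequality is taken to be CHSH, the outcomes are relabelled $\pm1$, and $A_x$, $B_y$ are hypothetical names for Alice’s and Bob’s observables. For the product state $\ket{00}$ the correlations factorise,
\[ \langle A_x \otimes B_y \rangle = \bra{0}A_x\ket{0}\,\bra{0}B_y\ket{0} =: \alpha_x \beta_y, \qquad |\alpha_x|, |\beta_y| \le 1, \]
so that
\[ \left|\alpha_0\beta_0 + \alpha_0\beta_1 + \alpha_1\beta_0 - \alpha_1\beta_1\right| \le |\beta_0+\beta_1| + |\beta_0-\beta_1| = 2\max(|\beta_0|,|\beta_1|) \le 2 \]
whatever bases they choose, including the superposition basis in which Eve cannot predict anything.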

A more interesting case is where Alice and Bob share a maximally entangled state and Eve again knows it:
\[ \ket{\text{Alice}}\frac{\ket{00}+\ket{11}}{\sqrt2}\ket{\text{Bob}}\ket{\text{Eve}^{\phi^+}} \]
Eve’s knowledge doesn’t help her predict the outcome of the measurement, because there is no outcome of the measurement to be predicted. Both outcomes will happen, and eventually decoherence will make two copies of Eve, one in the 00 branch and another in the 11 branch, both ignorant of which branch they are in. In this case Alice and Bob will violate a Bell inequality, and correctly conclude that Eve couldn’t have possibly predicted their outcomes.
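The violation can be made explicit with a short calculation. The sketch below again uses CHSH as the example inequality and restricts to measurement bases in the X-Z plane. The particular angles are a standard choice that is assumed here rather than taken from the post, and `basis` and `correlation` are illustrative helper names.

```python
import math

def basis(theta):
    """Orthonormal measurement basis at angle theta in the X-Z plane."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c, s), (-s, c)]

def correlation(state, alpha, beta):
    """E = sum_ab (-1)^(a+b) p(ab) for a real two-qubit state {(i, j): amplitude}."""
    e = 0.0
    for a, va in enumerate(basis(alpha)):
        for b, vb in enumerate(basis(beta)):
            amp = sum(va[i] * vb[j] * c for (i, j), c in state.items())
            e += (-1) ** (a + b) * amp ** 2
    return e

r = 1 / math.sqrt(2)
phi_plus = {(0, 0): r, (1, 1): r}   # (|00> + |11>)/sqrt(2)

alice = (0.0, math.pi / 4)          # Alice's two settings
bob = (math.pi / 8, -math.pi / 8)   # Bob's two settings

chsh = (correlation(phi_plus, alice[0], bob[0])
        + correlation(phi_plus, alice[0], bob[1])
        + correlation(phi_plus, alice[1], bob[0])
        - correlation(phi_plus, alice[1], bob[1]))

print(chsh, 2 * math.sqrt(2))
```

For this state the correlation works out to $\cos 2(\alpha-\beta)$, so with these angles the CHSH value is $2\sqrt2$, above the bound of 2 that predictability would impose.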

The most interesting case is where Alice and Bob share the mixed state $\frac12\ket{00}\bra{00} + \frac12\ket{11}\bra{11}$ and Eve holds its purification. Their joint state is
\[ \ket{\text{Alice}}\frac{\ket{00}\ket{\text{Eve}^{0}}+\ket{11}\ket{\text{Eve}^{1}}}{\sqrt2}\ket{\text{Bob}}\]
and it is clear that both copies of Eve, $\ket{\text{Eve}^{0}}$ and $\ket{\text{Eve}^{1}}$, can predict the result of Alice and Bob’s measurements, and that they cannot violate any Bell inequality. Note that this state represents the case where Eve does her measurement before Alice and Bob. She could also make it after them; it makes no difference.
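Both statements can be checked with a brute-force sketch. It assumes CHSH as the Bell inequality, bases in the X-Z plane, and a coarse grid of 16 angles per setting, so it is a sanity check rather than a proof. Alice and Bob’s statistics are computed from the mixture obtained by tracing out Eve, and `branches`, `probs`, and `correlation` are illustrative names.

```python
import math

def basis(theta):
    """Orthonormal measurement basis at angle theta in the X-Z plane."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c, s), (-s, c)]

def probs(state, alpha, beta):
    """p(ab) for a real pure two-qubit state {(i, j): amplitude}."""
    return {(a, b): sum(va[i] * vb[j] * c for (i, j), c in state.items()) ** 2
            for a, va in enumerate(basis(alpha))
            for b, vb in enumerate(basis(beta))}

# (weight, Alice-Bob state) in the branches of Eve^0 and Eve^1 respectively.
branches = [(0.5, {(0, 0): 1.0}), (0.5, {(1, 1): 1.0})]

# Predictability: inside each of Eve's branches the Z-basis probabilities are 0 or 1.
for weight, state in branches:
    print(probs(state, 0.0, 0.0))

def correlation(alpha, beta):
    """Correlation for Alice and Bob's reduced state, i.e. the mixture over Eve's branches."""
    return sum(w * (-1) ** (a + b) * p
               for w, st in branches
               for (a, b), p in probs(st, alpha, beta).items())

n = 16
angles = [k * math.pi / n for k in range(n)]
E = [[correlation(x, y) for y in angles] for x in angles]

# Largest CHSH value over all grid choices of Alice's (i, j) and Bob's (k, l) settings.
best = max(E[i][k] + E[i][l] + E[j][k] - E[j][l]
           for i in range(n) for j in range(n)
           for k in range(n) for l in range(n))
print(best)
```

The grid maximum should not exceed 2, consistent with the correlation taking the product form $\cos 2\alpha\,\cos 2\beta$ for this state.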

This concludes the Many-Worlds version of Bell’s theorem, and my series of posts about it. I hope that they helped clear some of the misunderstandings about it, and that even if you disagree with my conclusions, you would agree that I’m asking the right questions. I’d like to finish with a quotation from the man himself:

The “many world interpretation” seems to me an extravagant, and above all an extravagantly vague, hypothesis. I could almost dismiss it as silly. And yet… It may have something distinctive to say in connection with the “Einstein-Podolsky-Rosen puzzle,” and it would be worthwhile, I think, to formulate some precise version of it to see if this is really so.


3 Responses to Understanding Bell’s theorem part 3: the Many-Worlds version

  1. Eric Cavalcanti says:

    Hey Mateus,

    A few peaceful comments on your post. ;)

    1. I don’t understand the difference between what you call predictability and what I (and Bell) would call determinism.

    2. I’d like to see a formal proof that MWI satisfies what you call “Generalized Local Causality”. For that you’d need to be more clear about where probabilities in MWI come from, what they apply to, and how you decide that this notion is satisfied in it.


  2. Mateus Araújo says:

    Hi Eric,

    Thanks for the comment!

    1 – There isn’t much of a difference, as predictability is just determinism rewritten to make sense in the Many-Worlds interpretation. This is clearest in the last example I gave: there it is not true that Alice’s outcome is a function of her setting and the physical state, so determinism is not satisfied, but it is true that both $\ket{\text{Eve}^0}$ and $\ket{\text{Eve}^1}$ can predict which outcome she will observe, so predictability is satisfied.

    2 – I could do that. Would a precise exposition of the Deutsch-Wallace approach to probabilities in Many-Worlds do the job, in your opinion? But I’m not sure how generalised local causality could possibly fail. After all the probabilities boil down to calculating mod squared amplitudes, and the amplitudes do not change outside of Alice’s future light-cone.

  3. Davi Barros says:


    Somehow I felt that this poster would fit in this comment box here.

