Meaningful Bell inequalities part 2: examples

Here I’ll show examples of the meaningful Bell inequalities I talked about in the previous post, where they are rephrased as nonlocal games and the gap between the local bound and the Tsirelson bound is made as large as possible.

First we have to talk about notation. Instead of writing Bell inequalities like
\[ \sum_{abxy} M^{ab}_{xy}p(ab|xy) \le L(M),\]I’m going to focus on the coefficients $M^{ab}_{xy}$. Since the meaningful Bell inequalities are nonlocal games, the coefficients have a very specific form,
\[M^{ab}_{xy} = \mu(x,y)V(a,b,x,y),\] where $\mu(x,y)$ is the probability that the referee asks questions $x$ and $y$ to Alice and Bob, and $V(a,b,x,y)$ is the probability that the referee accepts answers $a$ and $b$ to questions $x$ and $y$. Usually $V(a,b,x,y)$ is restricted to be either 0 or 1, so that it simply determines which answers will make them win or lose the game. Unfortunately we cannot make this restriction, as it would make it impossible to transform every Bell inequality into a nonlocal game. This might seem obvious, but curiously we couldn’t find any proof in the literature, so we had to do one ourselves in our paper, and it wasn’t easy. Even in the cases where the Bell inequality can be transformed into a nonlocal game with such a deterministic predicate, it will not necessarily be optimal, as we’ll see below in the CGLMP case, so there’s really no getting around that.

Braunstein-Caves inequalities

The first example I’d like to show are the Braunstein-Caves inequalities, known to the computer scientists as the Odd Cycle game (even though it works just as well for even cycles)1. Since they can be formulated as an XOR game, which is a particular case of unique games, their optimal version is just the XOR game, as I mentioned in the previous post. The game is very simply: it has $m$ inputs and 2 outputs per party, the inputs are uniformly distributed over integers $x,y \in \{0,\ldots,m-1\}$ that are either equal or differ by one mod $n$, and Alice and Bob win if they give equal outputs for all inputs except for $x=y=0$. More formally, the game has distribution over inputs
\[\mu(x,y) = \frac1{2m}([x=y]+[x=y+1 \mod m]),\] where $[\Pi]$ is 1 if the proposition $\Pi$ is true and 0 otherwise, and predicate
\[V(a,b,x,y) = [a\oplus b=\delta_{x0}\delta_{y0}],\]where $\delta_{ij}$ is the usual Kronecker delta.

The local bound of this nonlocal game, the maximal probability of winning it with local hidden variables, is
\[\omega_\ell = 1-\frac1{2m},\]and its Tsirelson bound, the supremal probability of winning it with quantum mechanics, is
\[\omega_q = \cos^2\left(\frac\pi{4m}\right).\] At first sight this is very disappointing, as $\omega_q-\omega_\ell$ decreases monotonically with $m$, so it would seem that the best game is the one for $m=2$, where it reduces to the CHSH game. What matters though is not really the gap $\chi$ between the local and Tsirelson bounds, but the expected $p$-value you get after playing $n$ rounds of the nonlocal game. The expected $p$-value is upperbounded by $(1-\chi^2)^n$, which reinforces the impression that the Braunstein-Caves game sucks, but a more refined upper bound is
\[p \le \left(\sqrt{\omega_\ell \omega_q}+\sqrt{(1-\omega_\ell)(1-\omega_q)}\right)^{2n},\] which is actually minimized for $m=4$. So yes, the Braunstein-Caves game is actually better than the CHSH game for rejecting local hidden variables! This is not all that matters, though; even though the expected $p$-value is smaller, the Braunstein-Caves game for larger $m$ is still useful for self-testing and proving measurement incompatibility.

$I_{mm22}$ inequalities

The second example I’d like to show are the $I_{mm22}$ inequalities. Like the Braunstein-Caves inequalities, they have 2 outputs and $m$ inputs per party, but unlike them, they are facets of the local polytope, so we could expect the $I_{mm22}$ inequalities to be better in all aspects. Well, not for the gap. We only have numerics here, as the Tsirelson bounds are not known exactly, and the $I_{mm22}$ inequalities cannot be transformed into a unique game or a CGLMP game, the only cases where we know the optimal form analytically, but the numbers are pretty disheartening:

$n$ $\omega_\ell$ $\omega_q$ $\chi$
2 0.7500 0.8536 0.1036
3 0.8000 0.8502 0.0502
4 0.8333 0.8654 0.0321
5 0.8571 0.8795 0.0224


In this case the small gap is not misleading: the expected $p$-value does go down monotonically with $m$, and thus the best $I_{mm22}$ inequality for rejecting local hidden variables is in fact the one for $m=2$, where it also reduces to the CHSH. Again, the $p$-value is not everything: the $I_{3322}$ inequality is violated by quantum states that do not violate the CHSH inequality, and its Tsirelson bound violation is probably attained only with infinite-dimensional quantum states, so it is still a pretty interesting inequality.

The example of the $I_{3322}$ inequality is also a good illustration of the arbitrariness in how you write a Bell inequality: pretty much all papers that talk about it formulate it differently, and it is rather tedious to switch from one form to another. To solve this problem we formulated it as a somewhat nice nonlocal game with the bounds mentioned above: if Alice and Bob are given inputs $(0,1)$, $(1,0)$, $(0,2)$, or $(2,0)$, which happens with probability $1/10$ each, they have to give different answers. If they are given instead inputs $(1,2)$ or $(2,1)$, which again happens with probability $1/10$, they have to give equal answers. The last case is when they get inputs $(0,0)$ or $(1,1)$, which happens with probability $1/5$ each. In this case they have to answer anything other than $(0,0)$. We couldn’t find even mildly nice nonlocal games for higher $m$, but since the solution is far from unique hopefully somebody else will manage to do it.

CGLMP inequalities

The third example I’d like to show is the CGLMP inequalities. Unlike the Braunstein-Caves and $I_{mm22}$, the gap finally gets larger than CHSH’s. Apparently it is easier to get a larger gap by increasing the number of outputs than by increasing the number of inputs. In any case, that’s what CGLMP does: it has 2 inputs and $d$ outputs per party. Using the form presented here, it’s easy to massage them into the nonlocal game with input distribution $\mu(x,y) = 1/4$ and predicate
\[V(a,b,x,y) = \sum_{k=0}^{d-1} \left(1-\frac{k}{d-1}\right)\left[a-b = (-1)^{x\oplus y}k + xy \mod d\right],\]which as mentioned before is already the optimal one.

Curiously, this predicate is probabilistic: if the answers satisfy the condition $\left[a-b = (-1)^{x\oplus y}k + xy \mod d\right]$ for some $k$, the referee accepts them with probability $\left(1-\frac{k}{d-1}\right)$. Even more curiously, it is always possible to make this predicate deterministic, at the cost of having a non-optimal nonlocal game. The best deterministic predicate we found2 is
\[ V(a,b,x,y) = \left[ (a-b)(-1)^{x\oplus y} \ge xy \right],\] which reduces the size of the gap by a factor of $(d-1)/d$.

The optimal version has a local bound of $\omega_\ell = 3/4$ for all $d$. The Tsirelson bounds are pretty much known: the lower bounds found here match the upper bounds from the NPA hierarchy within numerical precision, so with all probability they are the actual Tsirelson bounds. There’s no proof, however, except for the $d=2$ case, which is as usual just the CHSH inequality. For $d=3$ there’s an almost nice expression, $\frac{1}{24}(15+\sqrt{33})$, but for higher $d$ the lower bound is just the norm of a $d \times d$ matrix with irrational coefficients, there’s little hope for a simple form. The limit is known, though: for large $d$ the Tsirelson bound converges to 1.

Here are some value for small $d$:

$d$ $\omega_\ell$ $\omega_q$ $\chi$
2 0.7500 0.8536 0.1036
3 0.7500 0.8644 0.1144
4 0.7500 0.8716 0.1216
5 0.7500 0.8770 0.1270

CHSH-$d$ game

The fourth example I’d like to show is the CHSH-$d$ game, introduced here. It looks rather innocent: it has $d$ inputs and $d$ outputs per party, with uniform distribution over the inputs $\mu(x,y) = \frac{1}{d^2}$ and predicate
\[V(a,b,x,y) = [a+b=xy \mod d],\]
so it is a straightforward generalisation of the CHSH game. It is also a unique game, so this form is already the optimal one. Seems simple, no? Nothing could be further than the truth. Neither the local bound nor the Tsirelson bound is known; the only analytical result is an upper bound to the Tsirelson bound,
\[ \omega_q \le \frac1d + \frac{d-1}{d\sqrt{d}},\] valid only when $d$ is a prime number or a power thereof, and known not to be tight in general. It was first proved here, but I find the later proof much easier to understand. A fascinating variant of CHSH-$d$ for which the above expression is in fact the Tsirelson bound was found here.

The local bound is known only for small $d$, found here together with good lower bounds on the Tsirelson bound:

$d$ $\omega_\ell$ $\omega_q$ $\chi$
2 0.7500 0.8536 0.1036
3 0.6667 0.7124 0.0457
5 0.4800 0.5375 0.0575
7 0.3878 0.4587 0.0709
11 0.3058 0.3325 0.0267
13 0.2781 0.2987 0.0206


We see that the gap is maximal for $d=2$, with a smaller peak at $d=7$, and seems to go only down afterwards. So for all its beauty, the CHSH-$d$ game is useless for the bare purpose of rejecting local hidden variables.

Tailored inequalities

Introduced here, these are an ingenious generalisation of the Braunstein-Caves and together with a variant of the CGLMP inequalities, with $m$ inputs and $d$ outputs per party. They are tailored for the purpose of achieving maximal violation with the maximally entangled state of dimension $d$, and have the very convenient feature that the local and Tsirelson bounds are known exactly for all $m$ and $d$.

At first sight they look quite complicated, but when reformulated as a nonlocal game they are actually rather straightforward: the distribution over the inputs is the same as for the Braunstein-Caves inequalities,
\[\mu(x,y) = \frac1{2m}([x=y]+[x=y+1 \mod m]),\]and like in the CGLMP case the predicate is probabilistic, being given by
\[V(a,b,x,y) = \sum_{k=0}^{d-1} \alpha_k [ a-b + (-1)^{\delta_{xy}+1}k + \delta_{x0}\delta_{y0} = 0 \mod d],\]and the $\alpha_k$ are where all the fun is happening:
\[\alpha_k = \frac{g(k)-g(d-1)}{g(0)-g(d-1)},\] where
\[g(k) = \cot\left(\pi\frac{2mk+1}{2md}\right).\]Luckily this game shares the key property of the CGLMP that makes it already be the optimal solution: the predicate depends only on $a-b \mod d$, so we don’t need to do any numerics.

The non-signalling bound is simply 1, unlike in the paper, where it is a horrible expression, the local bound is
\[\omega_\ell = 1-\frac1{2m},\]the same as for the Braunstein-Caves inequalities (and also unlike in the paper, where it is another horrible expression), and the Tsirelson bound is
\[\omega_q = \frac{d g(\frac{d-1}{2m})-g(d-1)}{g(0)-g(d-1)},\]which I must admit is more complicated than the expression in the paper. Well, we cannot have everything. For fixed $d$ and large $m$ the Tsirelson bound converges to 1, and for fixed $m$ and large $d$ it converges to $\frac1{2m} + \frac{2m-1}{4m^2}\pi \cot(\frac{\pi}{2m})$.

The gap here decreases monotonically as a function of both $d$ and $m$, but not the expected $p$-value: it is minimised at $d=2$ and $m=4$, when the tailored inequality reduces to Braunstein-Caves. For a fixed $m$ increasing $d$ only decreases the $p$-value, but for a fixed $d$ the minimum seems to be attained at some $m$ that increases with $d$. If one wants a larger gap one could set $\alpha_k = (1-k/(d-1))$, so that the tailored inequalities strictly generalise CGLMP, but that would defeat both points of having a maximal violation for a maximally entangled state and having a simple expression for the Tsirelson bound.

Żukowski, Gisin, and Tsirelson

These are two closely related XOR games with the curious property that the measurement setting is a continuous variable, $x,y\in[0,1]$. One might hope that one can do better than CHSH by allowing this extravagance, but alas, it is not the case. I find them interesting anyway, because these games make it impossible to make the usual mistake of doing the experiment by estimating each individual conditional probability appearing in the Bell inequality. There are infinitely many! You actually need to do the experiment by playing the game: sampling the input according the probability distribution $\mu(x,y)$, and checking whether you won that round by looking at the predicate $V(a,b,x,y)$.

The first variant was introduced by Żukowski in 1993, and the second by Gisin in 1999. Żukowski proved the local bound for his variant, but not the Tsirelson bound. Gisin proved neither, and Tsirelson proved both bounds for Gisin’s variant.

Let’s start with Gisin’s version, the simplest one. Here the probability density over the inputs is uniform, $\mu(x,y) = 1$, and the predicate is given by
\[V(a,b,x,y) = [a\oplus b = [x \ge y]].\] The local bound is $\omega_\ell = 3/4$, and the Tsirelson bound is $\omega_q = 1/2+1/\pi$. The gap is thus $1/\pi-1/4 \approx 0.068$, smaller than CHSH’s.

In Żukowski’s version the probability density of the inputs is nontrivial,
\[\mu(x,y) = \frac1{8\pi}|\sin(2\pi(x+y))|,\]and the predicate a bit weird3,
\[V(a,b,x,y) = [a\oplus b = [x+y \ge 1/2 \mod 1]],\] but that pays off with a slightly larger gap: the local bound is $\omega_\ell = 1/2+1/\pi$, and the Tsirelson bound respects $\omega_q \ge 1/2+\pi/8$, so the gap is at least $\pi/8-1/\pi \approx 0.074$. Also, this nonlocal game has a nice generalisation for multiple parties, that in some sense is stronger than the usual WWŻB inequalities4

That’s it! I plan to use this post as a catalogue of the meaningful forms of cool Bell inequalities, so if you know some that I missed please let me know. I’m specially interested in Bell inequalities for which the gap increases with increasing number of inputs; I’m a bit surprised that I couldn’t find a single example of this happening in the physics literature, as the computer science literature is full of them.

Posted in Uncategorised | Leave a comment

Meaningful Bell inequalities

Since I’ve started learning about Bell inequalities, it always bothered me that they are usually present as just some linear combinations of conditional probabilities, without any physical interpretation besides the fundamental fact that the maximal value this linear combination can attain when the probabilities come from local hidden variable models is smaller than when they come from quantum mechanics5.

For example, the most famous Bell inequality, the CHSH inequality, is usually written as
\[ \langle A_0 B_0 \rangle + \langle A_0 B_1 \rangle + \langle A_1 B_0 \rangle – \langle A_1 B_1 \rangle \le 2,\] where
\[\langle A_xB_y\rangle := p(a{=}b|xy)-p(a{\neq}b|xy).\] It looks like just an arbitrary sum of conditional probabilities; it is justified by noting that this is a facet of the local polytope2, but still this doesn’t give an interpretation of what this sum measures.

To make things worse, there’s a lot of arbitrariness in how we write a Bell inequality: we can simply divide both sides by a constant, which clearly doesn’t make a difference in whether the inequality is violated or not, add a combination of probabilities that sum to 1 to the left hand side and 1 to the right hand side, which also clearly doesn’t make a difference, or add a no-signalling constraint, which I’ll talk about later.

As a consequence, the local bound of the CHSH inequality, 2, is just an arbitrary number, and likewise the Tsirelson bound $2\sqrt{2}$, as these transformations change them willy-nilly. Valerio Scarani famously has a football t-shirt with the player number $2\sqrt{2}$. But he might as well have put $\sqrt{2}$ on it, as people sometimes write the CHSH inequality divided by 2, or pretty much any real number – as long as somebody wrote a paper with the corresponding version of the inequality ;)

One can notice, though, that the CHSH expression is maximized when the correlations are such that $a=b$ when the inputs are $(0,0)$, $(0,1)$, or $(1,0)$, and $a\neq b$ when the inputs are $(1,1)$, and reformulate it as the probability that this happens. The result is the inequality
\[ \frac14 \left( p(a{=}b|00) + p(a{=}b|01) + p(a{=}b|10) + p(a{\neq}b|11) \right ) \le \frac34,\]which now has a nice interpretation as a nonlocal game: it measures the probability that Alice and Bob win the game of outputting $(a,b)$ for input $(x,y)$ such that $a\oplus b = xy$, when the inputs are given with uniform probability. The local bound $\frac34$ is no longer arbitrary, but it is the maximal probability of winning the game using local hidden variables, and likewise the Tsirelson bound $\frac{2+\sqrt{2}}{4}$ is the maximal probability of victory with quantum mechanics. As far as I know the first person that formulated the CHSH inequality like this was Boris Tsirelson, which makes it specially appropriate to call $\frac{2+\sqrt{2}}{4}$ a Tsirelson bound.

It turns out that every Bell inequality can be rewritten as a nonlocal game – provided we allow the winning condition of the nonlocal game be probabilistic – so all of them, together with their local and Tsirelson bounds, can be made meaningful in this way. End of story? No, because there is still arbitrariness in the nonlocal game formulation.

For example, one can add the equation
\[ \alpha\left(p(00|00) + p(01|00) + p(10|00) + p(11|00) \right) = \alpha\] to the nonlocal game form of the CHSH inequality (after multiplying it by 4, for convenience). We then divide it by $4+\alpha$, so that it is still a valid nonlocal game, albeit a weird one: when given input $(0,0)$, which happens with probability $\frac{1+\alpha}{4+\alpha}$, Alice and Bob win the game with certainty if their outputs $(a,b)$ are equal, and win the game with probability $\frac{\alpha}{1+\alpha}$ if they are different. That is, the referee will decided probabilistically whether to accept their answers. For the other inputs the winning condition doesn’t change, but they are now distributed with probability $\frac{1}{4+\alpha}$ each.

The local bound of the nonlocal game becomes $\frac{3+\alpha}{4+\alpha}$, and the Tsirelson bound $\frac{2+\sqrt{2}+\alpha}{4+\alpha}$. This might be surprising, since adding such a normalisation condition shouldn’t change anything! Well, it doesn’t change which correlations violate the inequality or not, but it certainly changes something: since both local hidden variables and quantum mechanics will by definition produce normalized probabilities, the normalisation condition cannot discriminate between them. Hence the larger the $\alpha$, the worse the nonlocal game will be at this task. Indeed both the local and Tsirelson bounds converge to 1 for large $\alpha$, as in this limit the game reduces to only asking the players to produce normalized probabilities, which they answer with “Yes, sure, no problem. But why?”.

Now, this is obviously silly, but we have to be able to deal with this silliness in a systematic fashion in order to remove this arbitrariness from the nonlocal game formulation. The way I propose to do it, in my new paper with Flavien Hirsch and Marco Túlio, is to optimise over the form of the nonlocal game to make the gap between the local and Tsirelson bounds as large as possible. We show that the expected $p$-value you get after playing $n$ rounds of a nonlocal game with gap $\chi$ is upperbounded by $(1-\chi^2)^n$, so there is a precise sense in which maximizing the gap between the local and Tsirelson bounds increases the power of the game for rejecting local hidden variables.

It turns out that optimizing over such normalisation conditions is rather easy: we just need to use them to make the smallest coefficient of each setting $(x,y)$ equal to zero, and rescale the game such that the largest coefficient of each setting is equal to one.

What is more difficult is dealing with the no-signalling constraints: we can also add to the CHSH game an equation like
\[ \gamma\left(p(00|00) + p(01|00) – p(00|01) – p(01|01) \right) = 0,\]which simply says that the marginal probability that Alice gets result 0 doesn’t depend on whether Bob’s input $y$ is 0 or 1. Now both local hidden variables and quantum mechanics will by definition respect this equation, so again it doesn’t change which correlations violate the inequality or not, but again it does change the power of the game. After a bit of work to remove the negative coefficients and renormalise the game, we see that the local bound becomes $\frac{3+\gamma}{4+2\gamma}$ and the Tsirelson bound $\frac{2+\sqrt2+\gamma}{4+2\gamma}$. For large $\gamma$ both bounds converge to $1/2$, which makes sense: in this limit the game reduces to asking Bob to signal to Alice, but since they cannot do it, irrespectively of using local hidden variables or quantum mechanics, they simply guess randomly, with probability of success $1/2$.

In this case it is trivial to do the optimisation, we just set $\gamma = 0$, and we have the maximal possible discriminating power, but in general this is quite hard: the best we could do is formulate the optimisation as a linear program, so at least it is easy to do it numerically. We also managed to find analytical solutions for two particular cases: when the nonlocal game can be turned into a unique game3 (as is the case for CHSH), doing that makes it optimal, and when the nonlocal game can be turned into a CGLMP game4, doing that makes it optimal.

In these two cases the solution is unique, so maximizing the gap $\chi$ gives you the optimal local bound, the optimal Tsirelson bound, and the optimal form of the Bell inequality5. But in general life is not so easy: fixing the gap $\chi$ to be the maximal one is not enough to determine the local and Tsirelson bounds. We can deal with that by maximising the Tsirelson bound as well, to make it easier to win the nonlocal game with quantum mechanics.

This is enough to get you optimal, unique, and physically meaningful local and Tsirelson bounds. Pure bliss, no? A bit awkward is the fact that this is still not enough to fix the form of the nonlocal game itself. Perhaps it doesn’t matter, and the differences between these form are as inconsequential as relabellings. Or perhaps they do matter, and somebody else will figure out how to pick up the best among them.

I’ll show the optimal forms of some Bell inequalities in a separate post, as this one is getting too long already. But there’s one more thing I’d like to say: Valerio, if you’re reading this, put $\frac{2+\sqrt{2}}{4}$ on your t-shirt.

Posted in Uncategorised | Leave a comment

The Choi-Jamiołkowski isomorphism: you’re doing it right

Yes, you are. There’s nothing to worry about. I’m just writing this blog post because there are some people trying to convince you otherwise, and I don’t want you to get trolled by them.

First of all, what is the Choi-Jamiołkowski isomorphism? It is one way to write a linear map $\mathcal E$ as a matrix $\mathfrak{C}(\mathcal E)$ so that the matrix is positive semidefinite if and only the map is completely positive. This is very convenient if you want to find out whether $\mathcal E$ is completely positive, or if you’re doing semidefinite programming. If all you want is to write the linear map as a matrix, you’re better off using a representation with a more convenient formula for applying it to a quantum state.

The usual way to define it is
\[\mathfrak{C}(\mathcal E) := \sum_{ij} \ket{i}\bra{j} \otimes \mathcal E(\ket{i}\bra{j}), \]and I assume this is what you’re doing. Yes, it is fine really, there’s nothing wrong with it. What Leifer, Allen et al., and Hoffreumon et al. complain about it is that it’s basis dependent. Which is true: if instead of the basis $\{ \ket{i} \}$ in the above formula you use the basis $\{ U\ket{i}\}$ for some unitary $U$, the resulting matrix does depend on $U$. To see that just let $U$ apply to $\ket{i}$ a phase that depends on $i$.

But so what? It will be a different matrix, but still a perfectly valid representation for $\mathcal E$. And why are you changing the basis anyway? I don’t think anybody ever did the Choi-Jamiołkowski isomorphism in a basis other than the computational one. Nevertheless, since the choice of basis clearly doesn’t matter for the isomorphism, perhaps it’s more elegant to use an isomorphism that doesn’t depend on it.

Leifer proposes using
\[\mathfrak{J}(\mathcal E) := \sum_{ij} \ket{j}\bra{i} \otimes \mathcal E(\ket{i}\bra{j})\]instead. It’s easy to see that this is indeed basis-independent, but it has a fatal flaw: it doesn’t represent completely positive maps as positive semidefinite matrices. For example, the representation of the identity map will have a negative eigenvalue. Since this was the main point of using the Choi-Jamiołkowski isomorphism in the first place, no thanks, I’ll pass.

Allen et al. and Hoffreumon et al. promise to give us the best of both worlds, an isomorphism which is basis-independent and represents completely positive maps as positive semidefinite matrices, by defining instead
\[\mathfrak{B}(\mathcal E) := \sum_{ij} \ket{i}_{A^*}\bra{j} \otimes \mathcal E(\ket{i}_A\bra{j}), \]where $\{\ket{i}_A\}$ is a basis for the primal Hilbert space $A$, and $\{\ket{i}_{A^*}\}$ the dual basis, belonging to the dual Hilbert space $A^*$, the space of linear functionals on $A$.

Now, what is this thing? What I understand as the dual of $\ket{i}$ is simply its bra $\bra{i}$, which would make this isomorphism the same thing that Leifer proposed. After a long discussion with Marco Túlio and Oreshkov, I finally understood what they meant: one can indeed write it as Leifer’s version of the isomorphism, but the operator $\ket{j}\bra{i}$ on the first tensor subsystem has to be understood as a map taking linear functionals to linear functionals, that is, it acts on bras to the left, being defined as
\[ (\bra{\psi})\ket{j}\bra{i} := \langle \psi | j \rangle \bra{i},\]or, to use their notation,
\[ \ket{i}_{A^*}\bra{j}(\ket{\psi}_{A^*}) := \langle \psi | j \rangle \ket{i}_{A^*}.\]Ok, this works, but it is quite inconvenient; I don’t want to use operators that act on the left or change the rules of the bra-ket notation in order to make calculations with this thing. Can’t we redefine this isomorphism so that it works in the usual way?

There is an easy way around it: remember that $\langle \psi | j \rangle = \overline{\langle j | \psi \rangle}$, so if we define these funny vectors as the complex conjugate of the usual vectors, $\ket{\psi}_{A^*} := \overline{\ket{\psi}_A}$, then linear algebra is back to normal.

We can then redefine the isomorphism as
\[\mathfrak{B}(\mathcal E) := \sum_{ij} \overline{\ket{i}\bra{j}} \otimes \mathcal E(\ket{i}\bra{j}), \]and it is easy to show that it is indeed independent of the choice of basis $\{\ket{i}\}$: let $\mathcal E$ be the identity map again (by linearity it is enough to consider this case). Its representation is given by
\[\mathfrak{B}(\mathcal I) = \sum_{ij} \overline{U\ket{i}\bra{j}U^\dagger} \otimes U\ket{i}\bra{j}U^\dagger. \]Inserting two resolutions of the identity matrix we get
\begin{align*} \mathfrak{B}(\mathcal I) &= \sum_{ijkl} \overline{U\ket{i}\bra{j}U^\dagger} \otimes \ket{k}\bra{k}U\ket{i}\bra{j}U^\dagger\ket{l}\bra{l} \\
&= \sum_{ijkl} \overline{U\ket{i}\bra{i}U^\dagger\ket{k}\bra{l}U\ket{j}\bra{j}U^\dagger} \otimes \ket{k}\bra{l} \\
&= \sum_{kl} \overline{\ket{k}\bra{l}} \otimes \ket{k}\bra{l},
\end{align*} where we used the fact that $\bra{i}M\ket{j} = \overline{\bra{j}M^\dagger\ket{i}}$.

This isomorphism is not actually basis-independent, though, because it depends on the basis we have chosen to do the complex conjugation in. And to add insult to the injury, the operator you get by choosing some specific basis for the complex conjugation is exactly the same as the operator you get by using this basis to define the usual Choi-Jamiołkowski isomorphism.

So yes, you are doing it right.

Posted in Uncategorised | Leave a comment

A great month for the quantum switch

If you’re a fan of the quantum switch, like me, the arXiv brought two great news this month: first, an Argentinian-Brazilian-Chilean collaboration finally managed to implement a superposition of more than two causal orders, as I had been bothering people to do since I first worked on the quantum switch. After the experiment was done they invited me, Alastair Abbott, and Cyril Branciard to help with the theory, which we gladly accepted. With their fancy new laboratory in Concepción, they managed to put 4 quantum gates in a superposition of 4 different causal orders, using the path degree of freedom (in optical fibres) of a photon as the control system, and the polarisation degree of freedom as the target system. Using the original algorithm I had developed for the quantum switch, that would have been pointless, as that algorithm required a target system of dimension at least $n!$ (and maybe even $n!^{n-1}$) to make use of a superposition of $n$ gates. To get around that they came up with a new algorithm, that seems just as hard as the old one, but only needs a target system of dimension 2 to work.

They also did an elegant refutation of a criticism that was levelled to previous versions of the experiment: people complained that depending on the state of the control system, the photons were hitting the waveplates implementing the unitaries at different points, which acted differently. In this way the experiment wasn’t actually probing the same unitaries in different orders, but rather different unitaries, which isn’t really interesting. Well here the experimentalists have shown that if you let it all four paths shine at a waveplate implement a unitary simultaneously, you get a beautiful interference pattern, showing that the paths are indistinguishable at the waveplate, and thus the same unitary is implemented, independently of the control system.

The other piece of great news appeared on the arXiv today: Barrett, Lorenz, and Oreshkov managed to show that no bipartite causally nonseparable process is purifiable. This means that if you have a process matrix, encoding the causal relationship between two parties, Alice and Bob, and this process matrix is not simply a mixture of saying that Alice acts before Bob or Bob acts before Alice, but rather encodes an indefinite causal order, then it is not possible to purify this matrix, that is, it is not possible to recover this process from a larger process, with more parties, that is compatible with the principle of conservation of information.

Now I hold the principle of conservation of information very close to my heart, and proposed a purification postulate, that if a process is not purifiable then it is unphysical. If you believe that, then the result of Barrett et al. implies that the only way to put two parties in an indefinite causal order is really just the quantum switch6. I had asked this question 3 years ago, managed only to show that a few processes are not purifiable, and speculated that none of them are. Now they came and did the hard part, solving the question completely for two parties.

UPDATE 13.03: And today the arXiv brought a paper by Yokojima et al. that proved the same result independently. They formulated it in a different way, that I find more clear: if you have a pure process, which encodes the causal relationships between Alice, Bob, a global past, and a global future, then this process is necessarily a variation of the quantum switch.

This paper also has a well-hidden gem: they showed that if you try to superpose two different causal orders, without using a control system to switch between them, then this will never be a valid process matrix. This plays wells in the theme “the quantum switch is the only thing that exists”, and allows us to be careless about talking about superpositions of causal orders. Since we can’t have superpositions without a control, then there’s no need to distinguish those with a control from those without.

Apparently this had been shown by Fabio Costa some years ago, but he kept the result a secret and got scooped.

Posted in Uncategorised | 8 Comments

Boris Tsirelson 1950-2020

Boris Tsirelson died on the 21st of January 2020 in Switzerland, via assisted suicide after being diagnosed with high grade cancer. It is with great sadness that I type these news. I never met him personally, but I appreciate his work, and had several pleasant interactions with him online. As an undergrad student I asked him some inane questions about quantum correlations, that he patiently and politely answered. I also brushed with him on Wikipedia, where he was an avid contributor.

He was a famous mathematician, but his work in physics was not always recognized: the groundbreaking paper where he proved Tsirelson’ bound and started the characterization of the set of quantum correlations was published in 1980, but only started to get noticed by the time of the fall of the Soviet Union. Discouraged by the lack of interest, he decided to quit the field, and upon doing so made what he called his “scandalous failure”: asserted without proof that the set of quantum correlations generated by tensor-product algebras is equal to the set of quantum correlations generated by commuting algebras. Today this is known as Tsirelson’s problem, since asking the right question is more important than getting the right answer. The story was told much better by Tsirelson himself in a contribution to the IQOQI blog.

The solution of Tsirelson’s problem a mere week before his death was serendipitous. He was fascinated by the result, and concluded: “Envy me, I am very lucky. I am comfortably leaving during a peak of fame.”

Posted in Uncategorised | 2 Comments

Infinity in Nature?

Last week a bomb landed on the arXiv: Ji et al. posted a proof that MIP*=RE which implies, among other goodies, a solution to Tsirelson’s problem. I’m speaking here as if their proof actually holds; I can’t check that myself, as the paper is 165 pages of serious complexity theory, and I’m not a computer scientist. What I could understand made sense, though, and the authors have a good track record of actually proving what they claim, so I’ll just be positive and assume that MIP* is in fact equal to RE.

My first reaction to the result was ill-informed and overly dramatic (which of course a journalist took from my comment on Scott Aaronson’s blog to the Nature News article about it), but a few days after that, I still find the result disturbing, and perhaps you should too.

To explain why, first I need to explain what Tsirelson’s problem is: it asks whether the supremum of the violation of a Bell inequality we can achieve with tensor product strategies, that is, with observables of the form $A_i = A_i’\otimes \mathbb{I}$ and $B_j = \mathbb{I} \otimes B_j’$, is the same as the one we can achieve with commuting strategies, where we only require that $[A_i,B_j] = 0$ for all $i,j$.

It is easy to show that these two values must coincide for finite-dimensional systems, and furthermore that the tensor product value can be approximated arbitrarily well with finite-dimensional systems, which opens up a terrifying possibility: if there exists a Bell inequality for which these two values differ, it would mean that the commuting value can only be achieved by infinite-dimensional systems, and that it can’t even be approximated by finite-dimensional ones! It would make it possible for an experiment to exists that would prove the existence of literal, unapproachable infinity in Nature. Well, Ji et al. proved that there exists a Bell inequality for which the tensor product value is at most $1/2$, whereas the commuting value is $1$. Do you feel the drama now?

Now for the soothing part: Ji et al. cannot show what this Bell inequality is, they cannot even say how many inputs and outputs it would need and, more importantly, they cannot show what the commuting strategy achieving the value $1$ is. It turns out that even showing a purely mathematical commuting strategy that can do this is really hard.

We need more than that, though, if we’re talking about an actual experiment to demonstrate infinity: we need the commuting strategy to be physical. That’s the part I was ill-informed about in my initial reaction: I thought it was natural for QFTs to only commute across spacelike separated regions, and not be separated by a tensor product, but this is not the case: not a single example is known of a non-pathological QFT that is not separated by a tensor product, at least when considering bounded spacetime regions that are spacelike separated and are more than some distance apart. Even that wouldn’t be enough, as a QFT that only commutes might be residually finite, which would mean that it can only achieve the tensor product value in a Bell inequality.

So, I’m not shitting bricks anymore, but I’m still disturbed. With this result the infinity experiment went from being mathematically impossible to merely phisically impossible.

UPDATE: Perhaps it would be useful to show their example of a nonlocal game that has a commuting Tsirelson bound larger than the tensor product Tsirelson bound. For that we will need the key construction in the paper, a nonlocal game $G_\mathcal{M}$ that has a tensor product Tsirelson bound of 1 if the Turing machine $\mathcal{M}$ halts, and a tensor product Tsirelson bound at most $1/2$ if the Turing machine $\mathcal{M}$ never halts.

Consider now $\mathcal{M}(G’)$ to be the Turing machine that computes the NPA hierarchy for the nonlocal game $G’$, and halts if at some level the NPA hierarchy gives an upper bound strictly less than 1 for the commuting Tsirelson bound.

The nonlocal game $G_{\mathcal{M}(G’)}$ will then have a tensor product Tsirelson bound of 1 if the nonlocal game $G’$ has a commuting Tsirelson bound strictly less than 1, and $G_{\mathcal{M}(G’)}$ will have a tensor product Tsirelson bound at most $1/2$ if the nonlocal game $G’$ has a commuting Tsirelson bound equal to 1 (as in this case the Turing machine $\mathcal{M}(G’)$ never halts).

The nonlocal game we need will then be the fixed point $G^*$ such that $G_{\mathcal{M}(G*)} = G^*$. It cannot have a commuting Tsirelson bound strictly less than 1, because then it would need to have a tensor product Tsirelson bound equal to 1, a contradiction. Therefore it must have a commuting Tsirelson bound equal to 1, which also implies that it must have a tensor product Tsirelson bound at most 1/2.

Posted in Uncategorised | 6 Comments

scientists4future #unter1000

I just signed the #unter1000 pledge to not fly to destinations that are under 1,000 km away, and I encourage every reader to sign as well2. We, scientists, need to set an example. We understand better than most the dire situation we are in, and we have more time and money than most to do something about it.

It’s a mild inconvenience for us: instead of flying from Vienna to Cologne in 3 to 5 hours2, we’ll need to spend the whole day (or night) on the train, sacrificing one day of work or of weekend. Yet, our work has flexible hours, and we can be productive inside the train. Unlike most. If we can’t do even this small sacrifice, what hope is there in the fight against climate change?

I don’t want to spread the misconception that individual action is the most effective way to fight climate change. It’s not. The most effective action is to vote the dinosaurs out of office. The government can make the biggest impact by cleaning up the electric grid and banning fossil cars. The second most effective action, though, is collective action, like this one. Don’t just do it yourself, but do it, tell everyone, and tell them to do it as well. The wider #flugscham campaign is having an effect. Domestic flights fell by 12% year-on-year last November in Germany, and by 14% in Sweden. Losses of this scale set a fire under the airlines’ asses to invest in technology to make flying carbon neutral. That’s the goal, because we don’t want to go back to the stone age. But while airplanes still burn fossil fuels, I won’t do it under 1,000 km.

Posted in Uncategorised | Leave a comment

Quantum supremacy won me a bet

About three years ago, on 21.03.2016, I made a bet with Miguel Navascués. At the time superconducting qubits were in their infancy: devices based on them had very few qubits, and quite shitty ones at that. Nevertheless, I was sure that this was the architecture of the future, as it seemed easy to scale up: there was no obvious problem with adding more qubits to a chip. Moreover, they consisted essentially of electronic circuits, which we know very well how to manufacture. Finally, Google and IBM were pouring serious amounts of money on it.

Miguel was skeptic. He had heard the same story before about ion traps, and instead of flourishing the architecture stagnated. People managed to get a dozen or so excellent qubits in a trap, but that was it. You can’t really add more without messing up the vibrational modes or squeezing the energy levels too close together. There were proposals about how to go around these problems, like linking multiple traps using photons and entanglement swapping, but they didn’t really do the trick. It will be the same with superconducting qubits, he said.

So, we made a bet. If in five years, by 21.03.2021, anybody in the world managed to achieve universal quantum control over 30 physical (not logical) superconducting qubits, Miguel would pay me the burger equivalent of 10 Big Macs in Vienna3. Otherwise, I would pay him.

As the years went by, it became clear that things were going well for superconducting qubits, but we still lacked a dramatic publication that demonstrated the capability (and I wasn’t in Vienna anyway). Until this October, when Google published its demonstration of quantum supremacy with 53 qubits. I wrote then Miguel, who graciously conceded the bet (we didn’t need to use our arbiter, Flaminia Giacomini). Photographic evidence indicates that he wasn’t happy about it, though.

First order photographer: Adán Cabello. Second order photographer: Zizhu Wang.

Posted in Uncategorised | 2 Comments

Superdeterminism is unscientific

Yesterday I saw with disappointment a new paper on the arXiv by Hossenfelder and Palmer, Rethinking Superdeterminism. There they argue that physics took a wrong turn when we immediately dismissed superdeterminism; instead it is a solution to the conundrum of nonlocality and the measurement problem.

No. It’s not. It’s a completely sterile idea. I’ll show why, by fleshing out the calculations of the smoking and cancer example they quote in the paper, and then examining the case of the Bell test.

Let’s suppose you do the appropriate randomized trial, and measure the conditional probabilities2
\[ p(\text{cancer}|\text{smoke}) = 0.15\quad\text{and}\quad p(\text{cancer}|\neg\text{smoke}) = 0.01,\]a pretty damning result. A tobacco company objects to the conclusion, saying that the genome of the subjects was correlated with whether you forced them to smoke2, such that you put more people predisposed to have cancer in the smoking group.

It works like this: the law of total probability says that
\[ p(a|x) = \sum_\lambda p(\lambda|x)p(a|x,\lambda),\] where in our case $a \in \{\text{cancer},\neg\text{cancer}\}$, $x \in \{\text{smoke},\neg\text{smoke}\}$, and $\lambda \in \{\text{predisposed},\neg\text{predisposed}\}$ is the hidden variable, in this case the genome determining whether the person will have cancer anyway. The tobacco company says that your results are explained by the conspiracy $p(\text{predisposed}|\text{smoke}) = 0.15$ and $p(\text{predisposed}|\neg\text{smoke}) = 0$, from which we can calculate the actual cancer rates to be
\begin{gather*}
p(\text{cancer}|\text{smoke},\neg\text{predisposed}) = 0 \\
p(\text{cancer}|\neg\text{smoke},\neg\text{predisposed}) = 0.01,
\end{gather*}so the same data indicates that smoking prevents cancer! If you assume, though, that $p(\text{predisposed}|\text{smoke}) = p(\text{predisposed}|\neg\text{smoke})$, then the absurd conclusion is impossible.

With this example I want to illustrate two points: first, that assuming $p(\lambda|x) \neq p(\lambda)$ is just a generic excuse to dismiss any experimental result that you find inconvenient, be it that smoking causes cancer or that Bell inequalities are violated. Second, that without assuming $p(\lambda|x) = p(\lambda)$ 3 you can’t conclude anything from your data.

In their paper, Hossenfelder and Palmer dismiss this example as merely classical reasoning that is not applicable to quantum mechanics. It’s not. One can always use the law of total probability to introduce a hidden variable to explain away any correlation, whether it was observed in classical or quantum contexts. Moreover, they claim that while $p(\lambda|x) = p(\lambda)$ is plausible in classical contexts, it shouldn’t be assumed in quantum contexts. This is laughable. I find it perfectly conceivable that tobacco companies would engage in conspiracies to fake results related to smoking and cancer, but to think that Nature would engage in a conspiracy to fake the results of Bell tests? Come on.

They also propose an experiment to test their superdeterministic idea. It is nonsense, as any experiment about correlations is without the assumption that $p(\lambda|x) = p(\lambda)$. Of course, they are aware of this, and they assume that $p(\lambda|x) = p(\lambda)$ would hold for their experiment, just not for Bell tests. Superdeterminism for thee, not for me. They say that when $x$ is a measurement setting, changing it will necessarily cause a large change in the state $\lambda$, but if you don’t change the setting, the state $\lambda$ will not change much. Well, but what is a measurement setting? That’s human category, not a fundamental one. I can just as well say that the time the experiment is made is the setting, and therefore repetitions of the experiment done at different times will probe different states $\lambda$, and again you can’t conclude anything about it.

Funnily, they say that “…one should make measurements on states prepared as identically as possible with devices as small and cool as possible in time-increments as small as possible.” Well, doesn’t this sound like a very common sort of experiment? Shouldn’t we have observed deviations from the Born rule a long time ago then?

Let’s turn to how superdeterministic models dismiss violations of Bell inequalities. They respect determinism and no action at a distance, but violate no conspiracy, as I define here. The probabilities can then be decomposed as
\[ p(ab|xy) = \sum_\lambda p(\lambda|xy)p(a|x,\lambda)p(b|y,\lambda),\]and the dependence of the distribution of $\lambda$ on the settings $x,y$ is used to violate the Bell bound. Unfortunately Hossenfelder and Palmer4 do not specify $p(\lambda|xy)$, so I have to make something up. It is trivial to reproduce the quantum correlations if we let $\lambda$ be a two-bit vector, $\lambda \in \{(0,0),(0,1),(1,0),(1,1)\}$, and postulate that it is distributed as
\[p((a,b)|xy) = p^Q(ab|xy),\] where $p^Q(ab|xy)$ is the correlation predicted by quantum mechanics for the specific experiment, and the functions $p(a|x,\lambda)$ and $p(b|y,\lambda)$ are given by
\[p(a|x,(a’,b’)) = \delta_{a,a’}\quad\text{and}\quad p(b|y,(a’,b’)) = \delta_{b,b’}.\] For example, if $p^Q(ab|xy)$ is the correlation maximally violating the CHSH inequality, we would need $\lambda$ to be distributed as
\[ p((a,b)|xy) = \frac14\left(1+\frac1{\sqrt2}\right)\delta_{a\oplus b,xy}+\frac14\left(1-\frac1{\sqrt2}\right)\delta_{a\oplus b,\neg(xy)}.\]The question is, why? In the quantum mechanical case, this is explained by the quantum state being used, the dynamical laws, the observable being measured, and the Born rule. In the superdeterministic theory, what? I have never seen this distribution be even mentioned, let alone justified.

More importantly, why should this distribution be such that the superdeterministic correlations reproduce the quantum ones? For example, why couldn’t $\lambda$ be distributed like
\[ p((a,b)|xy) = \frac12\delta_{a\oplus b,xy},\] violating the Tsirelson bound?5 Even worse, why should the superdeterministic distributions respect even no-signalling? What stops $\lambda$ being distributed like
\[ p((a,b)|xy) = \delta_{a,y}\delta_{b,x}?\]

In their paper, Hossenfelder and Palmer define a superdeterministic theory as a local, deterministic, reductionist theory that reproduces quantum mechanics approximately. I’m certain that such a theory will never exist. Its dynamical equations would need to correlate 97,347,490 human choices with the states of atoms and photons in 12 laboratories around the planet to reproduce the results of the BIG Bell test. Its dynamical equations would need to correlate the frequency of photons emitted by other stars in the Milky Way with the states of photons emitted by a laser in Vienna to reproduce the results of the Cosmic Bell test. Its dynamical equations would need to correlate the bits of a file of the movie “Monty Python and the Holy Grail” with the state of photons emitted by a laser in Boulder to reproduce the results of the NIST loophole-free Bell test. It cannot be done.

Posted in Uncategorised | 56 Comments

Nerd sniping

Last week a friend of mine, Felipe, left written the following equation in the blackboard of the coffee room: \[ \frac{a}{b+c} + \frac{b}{c+a} + \frac{c}{a+b} = 4,\]asking for a solution with positive integers.

It was a harmless sort of nerd sniping (as opposed to the harmful one), that he saw as a meme online. It looks innocent, but it is a rather difficult problem6. The idea was to post it to family WhatsApp groups, as revenge for the retarded math problems with fruits as variables that come up regularly. Finding the idea hilarious, I went on and posted it to my family’s WhatsApp group.

But… how does one solve it? I had inadvertently sniped myself. Trying a little bit, I managed to show that there is no integer solution for $a=b$, and therefore one couldn’t reduce the problem from a third-degree polynomial to a second degree one, so there was no easy way out. I also realised that any integer multiple of a solution is a valid solution, so one can fix $c=1$, look for a solution for rational $a,b$, and multiply this solution by the least common denominator to solve the original problem. Fixing then $c=1$ and rewriting $a,b$ as $x,y$ for clarity, we end up with the polynomial equation
\[ x^3-3x^2-3x^2y-3x + y^3-3y^2-3y^2x-3y-5xy + 1 = 0,\]which looks pretty much hopeless. To get some intuition I plotted the curve, obtaining this:

Well great. Now I was stuck. Off to Google then. Which immediately gave me this answer, a thorough explanation of where the problem came from and how to solve it. End of story? Not really. The answer depended on two magical steps2 that were neither easy nor explained. No no, I want to actually find the solution, not just reproduce it using stuff I don’t understand.

As the mathematician helpfully explains there, the core idea is the centuries-old chord and tangent technique: if one draws a line through two rational points on the curve, or on a tangent point, this line will intersect the curve again on a rational point. So if we have a single rational solution to start with, we can just iterate away and produce new rational points.

Well, this I can understand and do myself, off to work then! I did everything on the programming language that’s all the rage with the kids nowadays, Julia3 on a Jupyter notebook, which you can download here. I wrote up this solution having in mind people like me, who don’t really know anything, but have a generally favourable attitude towards math.

The first step is to brute force search for a rational solution. That’s easy enough, but gives us mostly repetitions of the points $(-1,-1)$, $(1,-1)$, and $(-1,1)$. They make some denominators of the original equation evaluate to zero, and by looking at the graph it is obvious that the point $(-1,-1)$ doesn’t work4, so I excluded them and selected a nice-looking solution as the starting point:
\[P_1 = \left(-\frac5{11},\frac9{11}\right)\]Now, I needed to calculate the slope of the curve at this point. This lies firmly on the straightforward-but-tedious territory5, so I won’t bore you with the details. The result is that
\[ \frac{\mathrm{d}y}{\mathrm{d}x} = \frac{-3x^2+6x+6xy+3y^2+5y+3}{-3x^2-5x-6xy+3y^2-6y-3},\]and with the slope we can easily find the tangent line. Now finding the intersection of this line with the curve is even more straightforward and tedious6, so I’m not going to even type out the answer (for the details check the Wikipedia page linked above). Anyway, now we have the intersection point
\[I = \left(\frac{5165}{9499},-\frac{8784}{9499}\right),\]but we have a problem: if we draw the line between $P_1$ and $I$ we won’t get a new point, as this line intersects the curve only at $P_1$ (twice) and $I$. We could just apply the same procedure that we did to $P_1$: calculate the tangent at $I$ and with that get a new intersection point. This works, but the integers in the numerator and denominator of the numbers we get by iterating this procedure grow hideously fast7, and the computer dies before returning a solution with only positive rationals.

There is an easy way out, though: note that the equation is symmetric under exchange of $x$ and $y$, so we can simply take the new point as
\[P_2 = \left(-\frac{8784}{9499},\frac{5165}{9499}\right),\]and the line through $P_1$ and $P_2$ does intersect the curve at a new point. Taking $P_3$ to be again its flipped version
\[P_3 = \left(-\frac{396650011}{137430135},-\frac{934668779}{137430135}\right),\]we can go on.

A new question does arise. Do we take $P_4$ from the line through $P_2$ and $P_3$, or from the line through $P_1$ and $P_3$? It turns out that both methods work, but the number of digits grows much faster with the former method than with the latter8, so it is unwise to use the former method in general. We iterate away, and find the solution with $P_{13}$, with the resulting integers having more than 150 digits.

Posted in Uncategorised | Leave a comment