## SDPs with complex numbers

For mysterious reasons, some time ago I found myself reading SeDuMi’s manual. To my surprise, it claimed to support SDPs with complex numbers. More specifically, it could handle positive semidefiniteness constraints on complex Hermitian matrices, instead of only real symmetric matrices as all other solvers.

I was very excited, because this promised a massive increase in performance for such problems, and in my latest paper I’m solving a massive SDP with complex Hermitian matrices.

The usual way to handle complex problems is to map them into real ones via the transformation
$f(M) = \begin{pmatrix} \Re(M) & \Im(M) \\ \Im(M)^T & \Re(M) \end{pmatrix}.$ The spectrum of $f(M)$ consists of two copies of the spectrum of $M$, and $f(MN) = f(M)f(N)$, so you can see that one can do an exact mapping. The problem is that the matrix is now twice as big: the number of parameters it needs is roughly twice what was needed for the original complex matrix, so this wastes a bit of memory. More problematic, the interior-point algorithm needs to calculate the Cholesky decomposition, which has complexity $O(d^3)$, so we are slowing the algorithm down by a factor of 8!
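The two properties of the mapping are easy to check numerically. Below is a minimal sketch, assuming NumPy; the helper names `real_embedding` and `random_hermitian` are made up for illustration. It writes the lower-left block as $-\Im(M)$, which equals $\Im(M)^T$ for Hermitian $M$ and keeps the map multiplicative even when the product $MN$ is not Hermitian.

```python
import numpy as np


def real_embedding(M):
    """Return the real matrix [[Re M, Im M], [-Im M, Re M]].

    For Hermitian M the imaginary part is antisymmetric, so -Im(M) = Im(M)^T
    and this agrees with the map f in the text.
    """
    A, B = M.real, M.imag
    return np.block([[A, B], [-B, A]])


def random_hermitian(d, rng):
    """Random d x d complex Hermitian matrix (illustrative helper)."""
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return (G + G.conj().T) / 2


rng = np.random.default_rng(0)
d = 4
M, N = random_hermitian(d, rng), random_hermitian(d, rng)

# Spectrum of f(M): every eigenvalue of M should appear twice.
eig_M = np.linalg.eigvalsh(M)
eig_fM = np.linalg.eigvalsh(real_embedding(M))
spectrum_doubled = np.allclose(eig_fM, np.repeat(eig_M, 2))

# Multiplicativity: f(MN) should equal f(M) f(N).
multiplicative = np.allclose(
    real_embedding(M @ N), real_embedding(M) @ real_embedding(N)
)

print("spectrum doubled:", spectrum_doubled)
print("multiplicative:", multiplicative)
```

The embedded matrix has size $2d \times 2d$, which is where the factor of $2^3 = 8$ in the Cholesky cost comes from.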

I then wrote a trivial SDP to test SeDuMi, and of course it failed. A more careful reading of the documentation showed that I was formatting the input incorrectly, so I fixed that, and it failed again. Reading the documentation again and again convinced me that the input was now correct: it must have been a bug in SeDuMi itself.

Lured by the promise of an 8 times speedup, I decided to dare the dragon, and looked into the source code of SeDuMi. It was written more than 20 years ago, and the original developer is dead, so you might understand why I was afraid. Luckily the code had comments, otherwise how could I figure out what it was supposed to do when it wasn’t doing it?

It turned out to be a simple fix, the real challenge was only understanding what was going on. And the original developer wasn’t to blame, the bug had been introduced by another person in 2017.

Now with SeDuMi working, I proceeded to benchmarking. To my despair, the promised land wasn’t there: there was no difference at all in speed between the complex version and the real version. I was at the point of giving up, when Johan Löfberg, the developer of YALMIP, kindly pointed out to me that SeDuMi also needs to do a Cholesky decomposition of the Hessian, an $m \times m$ matrix where $m$ is the number of constraints. The complexity of SeDuMi is then roughly $O(m^3 + d^3)$ using complex numbers, and $O(m^3 + 8d^3)$ when solving the equivalent real version. In my test problem I had $m=d^2$ constraints, so no wonder I couldn’t see any speedup.

I then wrote another test SDP, this time with a single constraint, and voilà! There was a speedup of roughly 4 times! Not 8, probably because computing the Cholesky decomposition of a complex matrix is harder than of a real matrix, and there is plenty of other stuff going on, but no matter, a 4 times speedup is nothing to sneer at.

The problem now was that this only worked when calling SeDuMi directly, which requires writing the SDP in canonical form. I wasn’t going to do that for any nontrivial problem. It’s not hard per se, but requires the patience of a monk. This is why we have preprocessors like YALMIP.

To take advantage of the speedup, I had to adapt YALMIP to handle complex problems. Löfberg is very much alive, which makes things much easier.

As it turned out, YALMIP already supported complex numbers but had it disabled, presumably because of the bug in SeDuMi. What was missing was support for dualization of complex problems, which is important because sometimes the dualized version is much more efficient than the primal one. I went to work on that.

Today Löfberg accepted the pull request, so right now you can enjoy the speedup if you use the latest git of SeDuMi and YALMIP. If that’s useful to you please test and report any bugs.

What about my original problem? I benchmarked it, and using the complex version of SeDuMi did give me a speedup of roughly 30%. Not so impressive, but definitely welcome. The problem is that SeDuMi is rather slow, and even using the real mapping MOSEK can solve my problem faster than it.

I don’t think it was pointless going through all that, though. First because there are plenty of people that use SeDuMi, as it’s open source, unlike MOSEK. Second because now the groundwork is laid down, and if another solver appears that can handle complex problems, we will be able to use that capability just by flipping a switch.


## SDPs are not cheat codes

I usually say the opposite to my students: that SDPs are the cheat codes of quantum information. That if you can formulate your problem as an SDP you’re in heaven: there will be an efficient algorithm for finding numerical solutions, and duality theory will often allow you to find analytical solutions. Indeed in the 00s and early 10s one problem after the other was solved via this technique, and a lot of people got good papers out of it. Now the low-hanging fruit has been picked, but SDPs remain a powerful tool that is routinely used.

I’m just afraid that people have started to believe this literally, and use SDPs blindly. But they don’t always work, you need to be careful about their limitations. It’s hard to blame them, though, as the textbooks don’t help. The usually careful The Theory of Quantum Information by Watrous is silent on the subject. It simply states Slater’s condition, which is bound to mislead students into believing that if Slater’s condition is satisfied the SDP will work. The standard textbook, Boyd and Vandenberghe’s Convex Optimization is much worse. It explicitly states

Another advantage of primal-dual algorithms over the barrier method is that they can work when the problem is feasible, but not strictly feasible (although we will not pursue this).

Which is outright false. I contacted Boyd about it, and he insisted that it was true. I then gave him examples of problems where primal-dual algorithms fail, and he replied “that’s simply a case of a poorly specified problem”. Now that made me angry. First of all because it amounted to admitting that his book is incorrect, as it has no such qualification about “poorly specified problems”, and secondly because “poorly specified problems” is rather poorly specified. I think it’s really important to tell the students for which problems SDPs will fail.

One problem I told Boyd about was to minimize $x$ under the constraint that
$\begin{pmatrix} x & 1 \\ 1 & t \end{pmatrix} \ge 0.$ Now this problem satisfies Slater’s condition. The primal and dual objectives are bounded, and the problem is strictly feasible, i.e., there are values for $x,t$ such that the matrix there is positive definite (e.g. $x=t=2$). Still, numerical solvers cannot handle it. Nothing wrong with Slater, he just claimed that if this holds then we have strong duality, that is, the primal and dual optimal values will match. And they do.

The issue is very simple: the optimal value is 0, but there is no $x,t$ where it is attained, you only get it in the limit of $x\to 0$ with $t=1/x$. And no numerical solver will be able to handle infinity.
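To see the unattained optimum concretely, here is a small sketch, assuming NumPy (the helper `min_eig` is just for illustration). It walks along the curve $t = 1/x$, where the matrix sits on the boundary of the positive semidefinite cone, and then checks that $x = 0$ itself is infeasible for any finite $t$.

```python
import numpy as np


def min_eig(x, t):
    """Smallest eigenvalue of the constraint matrix [[x, 1], [1, t]]."""
    return np.linalg.eigvalsh(np.array([[x, 1.0], [1.0, t]]))[0]


# Along t = 1/x the determinant x*t - 1 vanishes, so the matrix is
# (numerically) on the boundary of the PSD cone, while the objective x
# can be made as small as we like at the price of t blowing up.
boundary = [(x, 1.0 / x, min_eig(x, 1.0 / x)) for x in (1e-1, 1e-3, 1e-6)]
for x, t, lam in boundary:
    print(f"x = {x:.0e}, t = {t:.0e}, smallest eigenvalue = {lam:.2e}")

# At x = 0 the determinant is -1 for every finite t, so the matrix always
# has a strictly negative eigenvalue: the value 0 is never attained.
at_zero = [min_eig(0.0, t) for t in (1.0, 1e2, 1e4)]
print("smallest eigenvalues at x = 0:", at_zero)
```

The feasible set is $x > 0$, $xt \ge 1$, so the infimum of $x$ is 0 but every feasible point has $x > 0$.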

Now this problem is so simple that the failure is not very dramatic. SeDuMi gives something around $10^{-4}$ as an answer. Clearly wrong, as usually it gets within $10^{-8}$ of the right answer, but still, that’s an engineer’s zero.

One can get a much more nasty failure with a slightly more complicated problem (from here): let $X$ be a $3\times 3$ matrix, and minimize $X_{22}$ under the constraints that $X \ge 0, X_{33} = 0$, and $X_{22} + 2X_{13} = 1$. It’s easy enough to solve it by hand: the constraint $X_{33} = 0$ implies that $X_{13}$ must be equal to zero, otherwise $X$ cannot be positive semidefinite. In turn this implies that $X_{22} = 1$, and we’re done. There’s nothing to optimize. If you give this to SeDuMi it goes crazy, and gives 0.1319 as an answer, together with the message that it had numerical problems.
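The by-hand argument can be sanity-checked numerically. The sketch below assumes NumPy, and `build_X` and `is_psd` are hypothetical helpers. It builds matrices that satisfy the two linear constraints and shows that any nonzero $X_{13}$ destroys positive semidefiniteness, because the principal minor on rows and columns 1 and 3 has determinant $-X_{13}^2$.

```python
import numpy as np


def build_X(x11, x12, x13, x23):
    """Symmetric 3x3 matrix satisfying X_33 = 0 and X_22 + 2 X_13 = 1."""
    x22 = 1.0 - 2.0 * x13
    return np.array([[x11, x12, x13],
                     [x12, x22, x23],
                     [x13, x23, 0.0]])


def is_psd(X, tol=1e-12):
    """True if the smallest eigenvalue is non-negative up to tol."""
    return np.linalg.eigvalsh(X)[0] >= -tol


# With X_13 = 0 we can be feasible, and then X_22 = 1 is forced.
X_ok = build_X(x11=1.0, x12=0.0, x13=0.0, x23=0.0)
print("feasible with X_13 = 0:", is_psd(X_ok), "X_22 =", X_ok[1, 1])

# Any attempt to lower X_22 needs X_13 > 0, which is never PSD,
# no matter how large we make X_11.
attempts = [is_psd(build_X(x11, 0.0, x13, 0.0))
            for x11 in (1.0, 100.0)
            for x13 in (0.5, 0.1, -0.1)]
print("any attempt with X_13 != 0 feasible:", any(attempts))
```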

Now my point is not that SeDuMi should be able to solve nasty problems like this. It’s that we should teach the students to identify this nastiness so they don’t get bitten in the ass when it’s not so obvious.

And they are being bitten in the ass. I’m writing about this because I just posted a comment on the arXiv, correcting a paper that had mistakenly believed that when you add constraints to the NPA hierarchy the answers are still trustworthy. Don’t worry, it’s still possible to solve the constrained NPA hierarchy, you just need to be careful. To learn how, read the comment. Here I want to talk about how to identify nasty problems.

One might think that looking at the manual of a specific solver would help. After all, who could better tell which problems can’t be solved than the people who actually implemented the algorithm? Indeed it does help a bit. In the MOSEK Cookbook they give several examples of nasty problems it cannot handle. At least this dispels Boyd’s naïveté that everything can be solved. But they are rather vague, there’s no characterization of nasty or well-behaved problems.

The best I could find was a theorem in Nesterov and Nemirovskii’s ancient book “Interior-Point Polynomial Algorithms in Convex Programming”, which says that if the primal and dual problems are strictly feasible, then there will exist primal and dual solutions that reproduce the optimal value (i.e., the optimum will not be reached only in the limit). Barring the usual limitations of floating point numbers, this should indeed be a sufficient condition for the SDP to be well-behaved. Hopefully.

It’s not a necessary condition, though. To see that, consider a primal-dual pair in standard form
\begin{equation*}
\begin{aligned}
\min_X \quad & \langle C,X \rangle \\
\text{s.t.} \quad & \langle \Gamma_i, X \rangle = -b_i \quad \forall i,\\
& X \ge 0
\end{aligned}
\end{equation*}\begin{equation*}
\begin{aligned}
\max_{y} \quad & \langle b, y \rangle \\
\text{s.t.} \quad & C + \sum_i y_i \Gamma_i \ge 0
\end{aligned}
\end{equation*} and assume that they are both strictly feasible, so that there exist primal and dual optimal solutions $X^*,y^*$ such that $\langle C,X^* \rangle = \langle b, y^* \rangle$. We can then define a new SDP by redefining $C' = C \oplus \mathbf{0}$ and $\Gamma_i' = \Gamma_i \oplus \mathbf{0}$, where $\oplus$ is the direct sum, and $\mathbf{0}$ is an all-zeros matrix of any size you want. Now the dual SDP is not strictly feasible anymore (the matrix $C' + \sum_i y_i \Gamma_i'$ is identically zero on the added block, so it can never be positive definite), but it remains as well-behaved as before; the optimal dual solution doesn’t change, and an optimal primal solution is simply $X^* \oplus \mathbf{0}$. We can also do a change of basis to mix this all-zero subspace around, so the cases where it’s not necessary are not so obvious.
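As a quick illustration of why the padded dual cannot be strictly feasible, here is a sketch assuming NumPy, with randomly generated $C$ and $\Gamma_i$ standing in for a real problem (all names and sizes are made up). Whatever $y$ we try, the padded dual matrix keeps a zero block and therefore never becomes positive definite.

```python
import numpy as np


def direct_sum_zero(A, k):
    """Return A (+) 0_k, i.e. A padded with a k x k all-zeros block."""
    d = A.shape[0]
    out = np.zeros((d + k, d + k))
    out[:d, :d] = A
    return out


def random_symmetric(d, rng):
    """Random real symmetric matrix (illustrative stand-in for problem data)."""
    G = rng.normal(size=(d, d))
    return (G + G.T) / 2


rng = np.random.default_rng(0)
d, k, m = 3, 2, 2  # original size, padding size, number of constraints
C = random_symmetric(d, rng)
Gammas = [random_symmetric(d, rng) for _ in range(m)]

C_pad = direct_sum_zero(C, k)
Gammas_pad = [direct_sum_zero(G, k) for G in Gammas]

# Smallest eigenvalue of C' + sum_i y_i Gamma_i' over many random y.
smallest = []
for _ in range(1000):
    y = 10.0 * rng.normal(size=m)
    S = C_pad + sum(y_i * G for y_i, G in zip(y, Gammas_pad))
    smallest.append(np.linalg.eigvalsh(S)[0])

print("largest 'smallest eigenvalue' found:", max(smallest))
```

The smallest eigenvalue is always $\le 0$ up to rounding, because the padded block contributes $k$ exact zeros to the spectrum.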

Still, I like this condition. It’s rather useful, and simple enough to teach. So kids, eat your vegetables, and check whether your primal and dual SDPs are strictly feasible.


## Redefining classicality

I’m in a terrible mood. Maybe it’s just the relentless blackness of Austrian winter, but I do have rational reasons to be annoyed. First is the firehose of nonsense coming from the wormhole-in-a-quantum-computer people, that I wrote about in my previous post. Second are two talks that I attended here in Vienna in the past couple of weeks. One by Spekkens, claiming that he can explain interference phenomena classically, and another by Perche, claiming that a classical field can transmit entanglement, and therefore that the Bose-Marletto-Vedral proposed experiment wouldn’t demonstrate that the gravitational field must be quantized.

These talks were about very different subjects, but both were based on redefining “classical” to be something completely divorced from our understanding of classicality in order to reach their absurd conclusions. One might object that this is just semantics, you can define “classical” to be whatever you want, but I’d like to emphasize that semantics was the whole point of these talks. They were not trying to propose a physically plausible model, they only wanted to claim that some effect previously understood as quantum was actually classical.

The problem is that “classical” is not well-defined, so each author has a lot of freedom in adapting the notion to their purposes. One could define “classical” to strictly mean classical physics, in the sense of Newtonian mechanics, Maxwell’s equations, or general relativity. That’s not an interesting definition, though, first because you can’t explain even a rock with classical physics, and secondly because the context of these discussions is whether one could explain some specific physical effect with a new, classical-like theory, not whether current classical physics explains it (as the answer is always no).

One then needs to choose the features one wishes this classical-like theory to have. Popular choices are to have local dynamics, deterministic evolution, and trivial measurements (i.e., you can just read off the entire state without complications).

Spekkens’s “classical” theory violates two of these desiderata, it’s not local and you can’t measure the state. The entire theory is based on an “epistemic restriction”, that you have some incompatible variables that by fiat you can’t measure simultaneously. For me that already kills the motivation for studying such a theory: you’re copying the least appealing feature of quantum mechanics! And quantum mechanics at least has an elegant theory of measurement to determine what you can or can’t measure simultaneously, here you have just a bare postulate. But what makes the whole thing farcical is the nonlocality of the theory. In the model of the Mach-Zehnder interferometer, the “classical” state must pick up the phase of the right arm of the interferometer even if it actually went through the left arm. This makes the cure worse than the disease, quantum mechanics is local and if the particle went through the left it won’t pick up any phase from the right.

When I complained to Spekkens about this, he replied that one couldn’t interpret the vacuum state as implying that the particle was not there, that we should interpret the occupation number as just an abstract degree of freedom without consequence to whether the mode is occupied or not. Yeah, you can do that, but can you seriously call that classical? And again, this makes the theory stranger than quantum mechanics.

Let’s turn to Perche’s theory now. Here the situation is more subtle: we’re not trying to define what a classical theory is, but what a hybrid quantum-classical theory is. In a nutshell, the Bose-Marletto-Vedral proposal is that if we entangle two particles via the gravitational interaction, this implies that the gravitational field must be quantized, because classical fields cannot transmit entanglement.

The difficulty with this argument is that there’s no such thing as a hybrid quantum-classical theory where everything is quantum but the gravitational field is classical (except in the case of a fixed background gravitational field). Some such Frankensteins have been proposed, but always as strawmen that fail spectacularly. To get around this, what people always do is abstract away from the physics and examine the scenario with quantum information theory. Then it’s easy to prove that it’s not possible to create entanglement with local operations and classical communication (LOCC). The classical gravitational field plays the role of classical communication, and we’re done.

Perche wanted to do a theory with more meat, including all the physical degrees of freedom and their dynamics. A commendable goal. What he did was to calculate the Green function from the classical gravitational interaction (which subsumes the fields), and postulate that it should also be the Green function when everything else is quantum. The problem is that you don’t have a gravitational field anymore, and no direct way to determine whether it is quantum or classical. The result he got, however, was that this classical Green function was better at producing entanglement than the quantum one. I think that’s a dead giveaway that his (implicit) field was not classical.

The audience would have none of that, and complained several times that his classical field was anything but. Perche would agree that “quantum-controlled classical” would better describe his gravitational field, but would defend anyway calling it just “classical field” as an informal description.

If you want a theory with more meat, my humble proposal is to not treat classical systems as fundamentally classical, but accept reality: the world is quantum, and “classical” systems are quantum systems that are in a state that is invariant under decoherence. And to make them invariant under decoherence we simply decohere them. In this way we can start with a well-motivated and non-pathological quantum theory for the whole system, and simply decohere the “classical” subsystems as often as needed.

It’s easy to prove that the classical subsystems cannot transmit entanglement in such a theory. Let’s say you have a quantum system $|\psi\rangle$ and a classical mediator $|C\rangle$. After letting them interact via any unitary whatsoever, you end up in the state
$\sum_{ij} \alpha_{ij}|\psi_i\rangle|C_j\rangle.$ Now we decohere the classical subsystem (in the $\{|C_j\rangle\}$ basis, without loss of generality), obtaining
$\sum_{ijk} \alpha_{ij}\alpha_{kj}^*|\psi_i\rangle\langle\psi_k|\otimes|C_j\rangle\langle C_j|.$ This is equal to
$\sum_j p_j \rho_j \otimes |C_j\rangle\langle C_j|,$ where $p_j := \sum_i |\alpha_{ij}|^2$ and $\rho_j := \frac1{p_j}\sum_{ik} \alpha_{ij}\alpha_{kj}^*|\psi_i\rangle\langle\psi_k|$, which is an explicitly separable state, which therefore has no entanglement to transmit to anyone.
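This little proof can also be checked numerically. The following is a sketch assuming NumPy, with arbitrary dimensions and a random joint state standing in for "any unitary whatsoever"; the helper `partial_transpose_mediator` is made up for the example. It decoheres the mediator, compares the result with the explicit separable form, and uses the partial transpose as an entanglement witness.

```python
import numpy as np

rng = np.random.default_rng(1)
dS, dC = 3, 4  # dimensions of the quantum system and of the mediator (arbitrary)

# Random coefficients alpha_ij of the joint state sum_ij alpha_ij |psi_i>|C_j>.
alpha = rng.normal(size=(dS, dC)) + 1j * rng.normal(size=(dS, dC))
alpha /= np.linalg.norm(alpha)

psi = alpha.reshape(-1)  # flat index i*dC + j matches kron(|psi_i>, |C_j>)
rho = np.outer(psi, psi.conj())


def projector(j, dim):
    """Projector |C_j><C_j| in the decoherence basis."""
    P = np.zeros((dim, dim))
    P[j, j] = 1.0
    return P


# Decohere the mediator: rho -> sum_j (1 x |C_j><C_j|) rho (1 x |C_j><C_j|).
decohered = np.zeros_like(rho)
for j in range(dC):
    P = np.kron(np.eye(dS), projector(j, dC))
    decohered += P @ rho @ P

# Explicit separable form sum_j p_j rho_j x |C_j><C_j| from the text.
separable = np.zeros_like(rho)
for j in range(dC):
    p_j = np.sum(np.abs(alpha[:, j]) ** 2)
    rho_j = np.outer(alpha[:, j], alpha[:, j].conj()) / p_j
    separable += p_j * np.kron(rho_j, projector(j, dC))


def partial_transpose_mediator(state, dS, dC):
    """Transpose only the mediator indices of a bipartite density matrix."""
    r = state.reshape(dS, dC, dS, dC)
    return r.transpose(0, 3, 2, 1).reshape(dS * dC, dS * dC)


min_pt_before = np.linalg.eigvalsh(partial_transpose_mediator(rho, dS, dC))[0]
min_pt_after = np.linalg.eigvalsh(partial_transpose_mediator(decohered, dS, dC))[0]

print("matches separable form:", np.allclose(decohered, separable))
print("min PT eigenvalue before decoherence:", min_pt_before)
print("min PT eigenvalue after decoherence:", min_pt_after)
```

A negative partial-transpose eigenvalue certifies entanglement, so the state before decoherence is entangled while the decohered one shows no such signature, as expected from its explicitly separable form.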


## The death of Quanta Magazine

Yesterday Quanta Magazine published an article written by Natalie Wolchover, Physicists Create a Wormhole Using a Quantum Computer. I’m shocked and disappointed. I thought Quanta Magazine was the most respectable source of science news, they have published several quality, in-depth articles on difficult topics. But this? It falls so far below any journalistic standard that the magazine is dead to me. The problem is, if they write such bullshit about topics that I do understand, how can I trust their reporting on topics that I do not?

Let’s start with the title. No, scientists haven’t created a wormhole using a quantum computer. They haven’t even simulated one. They simulated some aspects of wormhole dynamics under the crucial assumption that the holographic correspondence of the Sachdev–Ye–Kitaev model holds. Without this assumption they just have a bunch of qubits being entangled, no relation to wormholes.

The article just takes this assumption for granted, and cavalierly goes on to say nonsense like “by manipulating the qubits, the physicists then sent information through the wormhole”. Shortly afterwards, though, it claims that “the experiment can be seen as evidence for the holographic principle”. But didn’t you just assume it was true? And how on Earth can this test the holographic principle? It’s not as if we can do experiments with actual wormholes in order to check if their dynamics match the holographic description.

The deeper problem, though, is that the article never mentions that this simulation can easily be done in a classical computer. Much better, in fact, than in a quantum computer. The scientific content of the paper is not about creating wormholes or investigating the holographic principle, but about getting the quantum computer to work.

As bizarre and over-the-top as the article is, it is downright sober compared to the cringeworthy video they released. While the article correctly points out that one needs negative energy to make a wormhole traversable, and that negative energy does not exist, and that the experiment merely simulated a negative energy pulse, the video has no such qualms. It directly stated that the experiment created a negative energy shockwave and used it to transmit qubits through the wormhole.

For me the worst part of the video was at 11:53, where they showed a graph with a bright point labelled “negative energy peak” on it. The problem is that this is not a plot of data, it’s just a drawing, with no connection to the experiment. Lay people will think they are seeing actual data, so this is straightforward disinformation.

Now how did this happen? It seems that Wolchover just published uncritically whatever bullshit Spiropulu told her. Instead of, you know, checking with other people whether it made sense? The article does quote two critics, Woit and Loll. Woit mentions that the holographic correspondence simulates an anti-de Sitter space, whereas our universe is a de Sitter space. Loll mentions that the experiment simulates 2d spacetime, whereas our universe is 4d. Both criticisms are true, of course, but they don’t touch the reason why the Quanta article is nonsense.

EDIT: Quanta has since then changed the title of the article to add the qualification that the wormhole is holographic, and deleted the tweet that said “Physicists have built a wormhole and successfully sent information from one end to the other”. I commend them for taking a step in the right direction, but they haven’t addressed the main problem, which is the content of the article and the video, so this is not enough to get back on my list of reliable sources. Wolchover herself is unrepentant, explicitly denying that she was fooled by the scientists behind the research. Well, the bullshit is her fault then.


## Doing induction like a physicist

If something is true for dimension 2, it doesn’t mean much. We know that 2 is very special. The set of valid quantum states is a sphere, we can have a basis of unitary and Hermitian matrices for the Hilbert space, extremal quantum correlations can always be produced by projective measurements, you can have a noncontextual hidden variable model, and so on. None of that holds for larger dimensions.

If something is true for dimensions 2 and 3, that’s already much better, but by no means conclusive. There exists a simple formula for SIC-POVMs for these dimensions, that doesn’t work for 4 onwards. If we go beyond dimensions, there are more interesting examples: for 2 qubits, there exists a single class of entangled states, namely the $|00\rangle+|11\rangle$ class. For 3 qubits, there are two classes, namely the $|000\rangle + |111\rangle$ and $|001\rangle + |010\rangle + |100\rangle$ classes. One could hope that for 4 qubits we would have three classes, but no, there are infinitely many. The same non-pattern happens for Bell inequalities. For bipartite inequalities with two outcomes per party, if each has 2 settings then there exists only one facet inequality, the CHSH. If each party has 3 settings, then there are two facets, CHSH and I3322. If they have 4 settings, though, there are 175 different facets. Ditto if you fix the number of settings to be 2, and increase the number of outcomes. For 2 outcomes, again only CHSH, for 3 outcomes you have CHSH and CGLMP, and for 4 outcomes at least 34 facets.

If something is true for dimensions 2, 3, and 4, then it will be also true for dimension 5, so we skip this one.

If something is true for dimensions 2, 3, 4, and 5, it is very good evidence that it will be true for all dimensions, but it is still not enough for a proof by physicist induction. We have even primes, odd primes, and prime powers, but no non-trivial composite numbers. MUBs are a good example, they exist for dimensions 2, 3, 4, and 5, but not 6.

If something is true for dimensions 2, 3, 4, 5, and 6, that’s it. It will be true for all dimensions. SIC-POVMs are a good example. It is not too hard to construct analytical examples for dimensions 2, 3, 4, 5, and 6, and from that we know that they always exist.

This is of course not true in mathematics, which is a demanding and capricious mistress. The most horrifying example I know is the logarithmic integral. Quantum mechanics, on the other hand, is a mother. She will not humiliate you, she will not lead you astray. She only wants you to do a bit of honest work with small dimensions, and she will reward you with the truth.

The only potential counterexample I know is the Tsirelson bound of the I3322 inequality, which is supposed to be 0.85 for dimensions 2 to 8, and from dimension 9 onwards it starts increasing. I don’t count it as an actual counterexample because nobody managed to actually prove that the Tsirelson bound is 0.85 for dimensions 2 to 6, there is just numerical evidence. And I do demand a proof for this part of physicist induction, the reasoning is already flimsy enough as it is.

## Do not project your relative frequencies onto the non-signalling subspace

It happens all the time. You make an experiment on nonlocality or steering, and you want to test whether the data you collected is compatible with hidden variables. You plug them into the computer and the answer is no, they are not. You examine them a bit more closely, and you see that they are also incompatible with quantum mechanics, because they are signalling. After a bit of cold sweating, you realize that they are very close to non-signalling, all the trouble happened because the computer needs them to be exactly non-signalling. You then relax, project them onto the non-signalling subspace, and call it a day.

Never do this. Experimental data is sacred. You can’t arbitrarily chop it off to fit your Procrustean bed.

First of all, remember that even if your probabilities are strictly non-signalling, the probability of obtaining relative frequencies that respect the no-signalling equations exactly is effectively zero. There’s nothing wrong with “signalling” frequencies. On the contrary, if some experimentalist reported relative frequencies that were exactly non-signalling I’d be very suspicious. What you should get in a real experiment are frequencies that are very close to non-signalling, but not exactly.

“That doesn’t help me”, you reply. “I can accept signalling frequencies all day long, but the computer still needs them to be non-signalling in order to test hidden variable models.”

Sure, but what the computer needs are non-signalling probabilities, that you should infer from the signalling frequencies.

“Exactly, and to infer non-signalling probabilities I just project the frequencies onto the non-signalling subspace.”

No! Inferring probabilities from frequencies is the oldest problem in statistics. People have studied this problem to death, and came up with several respectable methods. There’s no point in reinventing the wheel. And if you do insist on reinventing the wheel, you’d better be damn sure that it’s round.

To make it clear that this projection technique is a square wheel, I’ll examine in detail a toy version of the problem of getting non-signalling probabilities. The simplest case of the real problem involves getting from a 12-dimensional space of frequencies to an 8-dimensional non-signalling subspace, which is too much to do by hand for even the most dedicated PhD students. Instead I’ll go for the minimal scenario, a 2-dimensional space of frequencies that goes down to a 1-dimensional subspace.

Consider then an experiment with 3 possible outcomes, 0, 1, and 2, where our analogue of the no-signalling assumption is that $p_1 = 2p_0$. The possible relative frequencies we can observe are in the triangle bounded by $p_0 \ge 0$, $p_1 \ge 0$, and $p_0 + p_1 \le 1$. The possible probabilities are just the line $p_1 = 2p_0$ inside this triangle. Again, if we generate data according to these probabilities they will almost surely not fall on the $p_1 = 2p_0$ line. Let’s say we observed $n_0$ outcomes 0, $n_1$ outcomes 1, and $n_2$ outcomes 2. What is the probability $p_0$ we should infer from this data?

Let’s start with the projection technique. Compute the relative frequencies $f_0 = n_0/n$ and $f_1 = n_1/n$, where $n = n_0 + n_1 + n_2$ is the total number of outcomes, and project the point $(f_0,f_1)$ onto the line $p_1 = 2p_0$. Which projection, though? There are infinitely many. The most natural one is an orthogonal projection, but that already weirds me out. Why on Earth are we talking about angles between probability distributions? They are vectors of real numbers, sure, we can compute angles, but we shouldn’t expect them to mean anything. Doing it anyway, we get that
$p_0 = \frac15(f_0 + 2f_1)\quad\text{and}\quad p_1 = \frac25(f_0 + 2f_1),$ which do not respect positivity: if $f_0=0$ and $f_1=1$ we have that $p_0+p_1 = 6/5$, which implies that $p_2 = -1/5$. What now? Arbitrarily make the probabilities positive? Invent some other method, such as minimizing the distance from the point $(f_0,f_1)$ to the line $p_1 = 2p_0$? Which distance then? Euclidean? Total variation? No, it’s time to admit that it was a bad idea to start with and open a statistics textbook.
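For the sceptical, the orthogonal projection is a two-liner. This is a minimal sketch assuming NumPy, with `project_onto_line` a made-up name; it reproduces the negative "probability" for the extreme frequencies $f_0 = 0$, $f_1 = 1$.

```python
import numpy as np


def project_onto_line(f0, f1):
    """Orthogonally project (f0, f1) onto the line p1 = 2*p0.

    Returns (p0, p1, p2) with p2 = 1 - p0 - p1 fixed by normalization.
    """
    direction = np.array([1.0, 2.0])  # the line p1 = 2*p0 is spanned by (1, 2)
    f = np.array([f0, f1])
    p0, p1 = direction * (f @ direction) / (direction @ direction)
    return p0, p1, 1.0 - p0 - p1


p0, p1, p2 = project_onto_line(0.0, 1.0)
print(f"p0 = {p0:.3f}, p1 = {p1:.3f}, p2 = {p2:.3f}")
```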

You’ll find there a very popular method, maximum likelihood. We write the likelihood function
$L(p_0) = p_0^{n_0} (2 p_0)^{n_1} (1-3p_0)^{n_2},$which is just the probability of the data given the parameter $p_0$, and maximize it, finding
$p_0 = \frac13(f_0 + f_1)\quad\text{and}\quad p_1 = \frac23(f_0+f_1).$ Now maximum likelihood is probably the shittiest statistical method one can use, but at least the answer makes sense. The resulting probabilities are normalized, and they mean something: they are those which assign the highest probability to the observed data. My point is that even the worst statistical method is better than arbitrarily chopping off your data. Moreover, it’s very easy to do, so there’s no excuse.
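To show how little work maximum likelihood takes, here is a sketch assuming NumPy and some made-up counts. It maximizes the log-likelihood by brute force on a grid and compares the result with the closed-form answer $p_0 = (f_0 + f_1)/3$.

```python
import numpy as np


def log_likelihood(p0, n0, n1, n2):
    """Log of L(p0) = p0^n0 (2 p0)^n1 (1 - 3 p0)^n2."""
    return n0 * np.log(p0) + n1 * np.log(2 * p0) + n2 * np.log(1 - 3 * p0)


def ml_estimate(n0, n1, n2):
    """Closed-form maximizer p0 = (f0 + f1) / 3."""
    n = n0 + n1 + n2
    return (n0 + n1) / (3 * n)


# Hypothetical counts, chosen only for illustration.
n0, n1, n2 = 12, 31, 57

# Brute-force maximization over the allowed range 0 < p0 < 1/3.
grid = np.linspace(1e-6, 1 / 3 - 1e-6, 200001)
p0_grid = grid[np.argmax(log_likelihood(grid, n0, n1, n2))]
p0_closed = ml_estimate(n0, n1, n2)

print("grid search:", p0_grid)
print("closed form:", p0_closed)
```

Unlike the projection, the estimate always lies in $[0, 1/3]$, so $p_0$, $p_1 = 2p_0$, and $p_2 = 1 - 3p_0$ are automatically valid probabilities.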

If you want to do things properly, though, you have to do Bayesian inference. You have to multiply the likelihood function by the prior, normalize that, and compute the expected $p_0$ from the posterior in order to obtain a point estimate. It’s a bit more work, but in this case is still easy, and for a flat prior it gives
$p_0 = \frac13\frac{n_0 + n_1+1}{n+2}\quad\text{and}\quad p_1 = \frac23\frac{n_0 + n_1+1}{n+2}.$ Besides getting a more sensible answer and the ability to change the prior, the key advantage of Bayesian inference is that it gives you the whole posterior distribution. It naturally provides you a confidence region around your estimate, the beloved error bars any experimental paper must include. It’s harder to do, sure, but none of you got into physics because it was easy, right?
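The posterior mean can be cross-checked by brute-force integration. With a flat prior on $0 \le p_0 \le 1/3$ the posterior is proportional to $p_0^{n_0+n_1}(1-3p_0)^{n_2}$, so $3p_0$ follows a Beta distribution with parameters $n_0+n_1+1$ and $n_2+1$, whose mean is $(n_0+n_1+1)/(n+2)$. The sketch below assumes NumPy and made-up counts, and compares that closed form with a simple Riemann sum.

```python
import numpy as np


def posterior_mean_closed(n0, n1, n2):
    """Posterior mean of p0 for a flat prior on [0, 1/3]."""
    n = n0 + n1 + n2
    return (n0 + n1 + 1) / (3 * (n + 2))


def posterior_mean_numeric(n0, n1, n2, points=200000):
    """Posterior mean of p0 via a Riemann sum on a uniform interior grid."""
    p = np.linspace(0.0, 1 / 3, points + 2)[1:-1]  # drop endpoints to avoid log(0)
    log_w = (n0 + n1) * np.log(p) + n2 * np.log(1 - 3 * p)
    w = np.exp(log_w - log_w.max())  # unnormalized posterior, rescaled for stability
    return np.sum(p * w) / np.sum(w)  # grid spacing cancels in the ratio


# Hypothetical counts, chosen only for illustration.
n0, n1, n2 = 12, 31, 57

mean_closed = posterior_mean_closed(n0, n1, n2)
mean_numeric = posterior_mean_numeric(n0, n1, n2)

print("closed form:", mean_closed)
print("numerical:  ", mean_numeric)
```

The same weights `w` give the whole posterior on the grid, so error bars can be read off from its quantiles instead of being bolted on afterwards.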


## Redistribution

Stuck at home with corona, I decided to try my hand at writing science fiction to pass the time. The result was not science fiction at all, but I think it’s still fun to read, so I’m posting it here.

Trevor Norquist, the world’s first trillionaire, died in a fiery explosion. His private jet was hit by a Stinger missile when it was taking off from the Köln/Bonn airport. Panic was immediate and widespread: the entire EU closed its airspace in fear of another terrorist attack. Germany erected roadblocks in the area around the airport, searching every single car, and generating monstrous traffic jams.

Videos from the attack made it easy to pinpoint where the missile was fired from: a hunter’s watchtower in the nearby Königsforst. The police were there half an hour after the attack, but the assassin was long gone. He had abandoned the missile launcher there, and nothing else. Forensics went over it with zeal, but couldn’t find anything. No fingerprints, not even a drop of sweat. Clearly they were dealing with a professional.

—/—/—

“How the fuck did these fucking eco-terrorists get their hands on a fucking Stinger missile?!” – exclaimed Ernst Dieter, investigator of the Bundeskriminalamt. He was in a terrible mood. Just a week before the eco-terrorists had dynamited the iconic Bagger 288. He was already working overtime to coordinate security for Norquist’s visit. With the threat escalation, the work started cutting his sleep time. He had to get reinforcements from the nearby Hessen. Did anybody know how many forms he needed to sign to get police from another Bundesland to come? And all that because the stubborn prick wouldn’t accept postponing his visit. Norquist’s only concession to reality was giving up on visiting the mines themselves. Still, that left him with the problem of escorting him from the airport to the hotel through thousands of protesters. When he saw the jet taking off he finally relaxed a bit, and dared to dream that he would take the rest of the day off and sleep. “Norquist is not my problem anymore!”, he celebrated, only to have his stomach drop when he saw a bloody Stinger missile hitting the jet.

“We shouldn’t jump to conclusions” – said Robert Weil, his partner – “It doesn’t fit the style of Vergeltung der Klimaopfer, that you insist on calling eco-terrorists. They have never killed anyone before. They insist they are saboteurs, not terrorists”.
“And do you believe their propaganda now? Get serious. They hated the guy more than anything!” – countered Dieter angrily.
“Then why destroy Bagger 288 just before? This only served to increase security” – Weil pointed out.
“Distraction manoeuvre.” – answered Dieter – “It focused our attention on the ground, on the mines, when they knew that the real danger was in the air”
“You are really overestimating their competence.” – dismissed Weil – “We caught the clowns responsible for Bagger 288 in less than 2 days. Both by tracing the explosives they used and the IP address from which they posted the manifesto.”
“They are several people, Rob.” – replied Dieter – “They let the idiots handle Bagger 288, and got the real pros to get Norquist, which was the target that actually mattered.”
“I’m just asking you to keep an open mind, Ernst, plenty of people wanted Norquist dead. It could also be the Montenegrins.” – conjectured Weil.
At this moment, both their phones vibrated at the same time. This could only mean something from work, and indeed, it was an email from forensics about the Stinger launcher. It had been tracked to a batch of Stingers that Italy had provided to Ukraine as military help against Russia.
“I knew it!” – Dieter allowed half a smile to flicker through his face – “The fucking Russians gave them the missile.”
“The Russians? Not the Ukrainians?” – asked Weil.
“All the Stingers we gave the Ukrainians are accounted for.” – replied Dieter – “Either safely in storage, used in the war, or captured by the Russians. Guess which ones could end up here?”
“Fair enough, but I doubt the Russians would deal directly with Vergeltung der Klimaopfer” – countered Weil – “Doesn’t fit their ideology, and besides, why not just put the Stingers in the black market? They make a neat profit and don’t get involved in any messy affair.”
“They are involved, and they will pay for this! Selling Stingers to terrorists puts the blood on their hands!” – raged Dieter.
“I’m not sure how we could make them pay. There’s nothing left to sanction.” – replied Weil – “In any case, how many Stingers did they capture in Ukraine?”
After pausing a bit to think, Dieter answered slowly – “Fourteen.”
“Scheiße”.
“Ja, Scheiße.”

—/—/—

After a week of interrogating Vergeltung der Klimaopfer members, Ernst Dieter had to admit they probably had nothing to do with the death of Trevor Norquist. His interrogations were tough and produced results, or so he would say. Others would say that he was nothing but a torturer.
“Nonsense,” – he thought – “Torture is illegal, and I’m strictly following the new counter-terrorism law, that allows for enhanced interrogation of terrorism suspects.”
But all the enhancement was for nothing, he still could get no information out of them. “Maybe they really don’t know anything.” – he thought as he released another traumatised student.
“Maybe they are indeed the poorly-organised students that can barely afford legal explosives that they claim to be.” – he thought – “A black-market Stinger must cost millions of euros. And they aren’t point-and-click like an automatic camera, one needs training to handle one. No, we are after a wealthy, well-organised terrorist group with military background.”

Reluctantly, he turned to the other suspects Robert Weil had mentioned, the Montenegrins. That was harder work, as their criminal background was clean, so he had to restrict himself to “gentle” interrogations. Still, they were even worse fits for the terrorists behind Norquist’s assassination than Vergeltung der Klimaopfer. To start with, there weren’t many of them. Even in Montenegro itself, there were fewer than a million Montenegrins. In Köln he managed to find 10 that had joined the anti-Norquist protests. Some of them knew each other, but they were not an organisation in any way, shape, or form, just families making a living. Most had come to Köln a long time ago, during the Yugoslav wars, and a couple more had arrived after Norquist’s rise to power. Besides being elated by Norquist’s death, the only thing they had in common was a lack of money. The rich Montenegrins were those that stayed in Montenegro and profited from Norquist’s regime.

“No,” – he thought – “I have to look at the wealthy people that wanted Norquist dead.” There was no lack of those either. In fact, it was hard to find someone who didn’t want Norquist dead. Even he himself couldn’t shed a tear for Norquist, he was only angry at the assassination because it had been his job to keep him alive. “Maybe his ex-wives and children?” – he considered. Norquist had 9 children by 4 different women, that had already started an epic fight for the inheritance. – “Plenty of people would kill for hundreds of billions of euros, but kill their own father? Just to get the inheritance a bit sooner? No, that’s too absurd. Besides, his family could just poison or stab him. Using a Stinger indicates an external enemy.”

—/—/—

Dieter started considering Norquist’s trajectory from the start. He had been a mostly unknown investor, mainly busy with multiplying the couple of billions he had inherited, until he saw a golden opportunity in fossil fuel divestment. As the planet warmed, more and more big banks decided that the damage to their reputation was not worth the profit, and so cut off financing to fossil fuel projects. To Norquist this meant he could charge such projects higher interest rates, as they didn’t have much of a choice. He had the capital to spare and no use for a reputation, so he slowly specialized in financing coal power plants and oil drilling in Africa and Asia. The more polluting the better; not because Norquist wanted to pollute the planet per se, it’s just that the especially dirty projects had the most trouble in finding financing, and thus he could charge the highest interest rates.

And so started his hostile takeover of Montenegro. The country’s GDP was only 15 billion euros, or about 5% of his wealth. Money was therefore not a problem. The difficulty lay in that Montenegro was not for sale, and Norquist had the curious idiosyncrasy that he always respected the letter of the law. He started by buying the aluminium smelter in Podgorica and making a major expansion. The investment didn’t seem to make much sense as electricity in Montenegro was not particularly cheap or reliable, but the government of Montenegro was anyway overjoyed with the huge investment. So much that it didn’t think twice about allowing Norquist to build a massive extension of the port of Bar, under his private ownership, necessary to export the increased production of aluminium. Or about allowing Norquist to buy the Podgorica-Bar railway, in exchange for a renovation. At this point Norquist was in a position similar to Nokia in Finland: he was so important for the economy of the country that when he asked for something, the government listened. And what he asked for was so little: a mere reform of the campaign finance laws, so that anybody could donate as much as they wanted to any politician, without having to make the donation public. In other words, legalized bribery.

After this law went through, things went much quicker. The media market was completely deregulated, and Norquist promptly acquired all the major newspapers, radio stations, and television channels. This ensured his portrayal as the saviour of Montenegro, the man that had doubled the country’s GDP overnight, and the silencing of his critics. The next step was privatising the whole electricity generation system of the country, coupled with complete deregulation, allowing Norquist to deny power to whoever he wanted for any reason. Legalized extortion. This was followed by abrogating the international treaties Montenegro had entered to fight global warming, allowing fossil fuel power plants to be built again. Norquist then built massive coal and oil power plants to power his aluminium smelter. With the price of coal and oil so low, he managed to produce aluminium at a price that even Iceland couldn’t beat. He didn’t stop there: Bitcoin mining, hydrogen electrolysing, Norquist started any energy-intensive industry that he could think of, and built more fossil power plants for them.

By the time the Montenegrins realised that they were losing their country, it was too late. Anybody that dared to oppose Norquist’s plans found themselves subject to a barrage of negative media coverage, highlighting real or imaginary corruption affairs. The recalcitrant ones were forced into economic ruin by strategically timed power cuts. Social media became tightly controlled, and protest was criminalised. Montenegro became a Singapore-style “democracy”: elections were still held, and votes were still counted with strict correctness. It’s just that the government openly retaliated against those that voted against it, ensuring that it always won with huge majorities. Norquist had no taste for fake elections or murdering the opposition like in Russia. No, it was vital to him that the rule of law was strictly respected.

The final touch was a radical reform of the tax code. Norquist had an almost religious opposition to income or property taxes, and changed the state to rely only on consumption taxes, effectively eliminating his own tax burden. This proved wildly popular with the global super-rich, who parked their wealth in Montenegro en masse. The country quickly overtook Switzerland as the country of choice for tax avoidance, and thus became a massive financial centre, rivalling London, New York, and Tokyo. This sudden influx of cash would easily wreck the currency of such a small country, but Montenegro had the unique advantage of using the euro without being a member of the European Union. It thus enjoyed the stability of the currency without having to obey any of the European Union’s regulations.

And thus it came to pass that Montenegro rose from poverty and stagnation to be one of the most wealthy and dynamic nations in the world. It also achieved such an astounding air pollution that one often couldn’t even see the Sun, a feat that had only been achieved before by China in 2012. Internationally, it was a scandal. The brazen disregard for international norms caused it to be hit with sanctions after sanctions after sanctions. It was slowly becoming as isolated as Russia. There was even talk of war. Norquist didn’t care, he had already made his profit. Après moi, le déluge.

He had been yet another billionaire asshole, and now became the most hated person in the entire world. He wore his infamy with relish. He always bragged about being a self-made trillionaire, having started as a mere billionaire. To those that accused him of being unethical, he always replied that he had never broken any law. Deep down he believed that making money was the only measure of morality that mattered. Since he was so rich he must have been doing it right. It seemed to be working, until Norquist flew to Germany to make a deal to buy their brown coal at negative prices, and got hit by a Stinger.

—/—/—

He was proven wrong when another Stinger hit a private jet. This time the victim was Zhang Shaopeng, a Chinese electric vehicles tycoon, who was landing in Warsaw to close a deal to convert its bus fleet from diesel to electric ones. Now Zhang wasn’t a nice person – the reason his vehicles were consistently cheaper than the competition was his passion for using Uyghur forced labour – but he was hardly a prime target for environmentalists. His electric buses alone were responsible for cutting oil demand significantly, and he was the first manufacturer of electric cars that managed to produce them at lower cost than comparable fossil cars. This was a critical point in the transition away from oil: fossil cars became rich people’s toys, and electric cars the financially sensible choice.

“Scheiße!” – exclaimed Dieter – “Not again!”
“It was bound to happen” – said Weil sombrely – “I told the Kanzler that we couldn’t reopen the airspace before we recovered the thirteen Stingers on the loose, or caught the terrorists. But no, the planes must keep flying! Everybody believed that Norquist was the only target. That was just wishful thinking.”
“Fools!” – concurred Dieter – “There’s nothing we can do to defend civilian aircraft, the whole idea is that there will be nobody shooting at them!”
“Indeed.” – added Weil – “Civilian aircraft are sitting ducks. They broadcast their position, don’t have flares, can’t do high-g manoeuvres…”
“Yeah no shit Sherlock.” – interrupted Dieter – “Instead of blathering about military tactics, tell me what the Polish found out.”
Impervious to Dieter’s rudeness after long years of working together, Weil answered calmly – “As you know, everybody has been watching like crazy the forests near airports, and Las Kabacki was no exception. The Stinger was not fired from there, but from a communal garden, Kępa Służewiecka, right next to the Chopin airport. As before, the launcher was abandoned on the spot, but this time somebody saw the assassin.”
“Aha!” – exclaimed Dieter – “So watching the forests was not in vain!”
“Indeed it wasn’t” – agreed Weil – “A Polish pensioner was trimming his hedge when he saw somebody climbing on the roof of a shed a few blocks away. He was a bit surprised, people in the communal garden are usually too old for that. He was even more surprised when the guy put something on his shoulder and stared directly at the airport. He thought it was a TV camera. Then the “TV camera” spat out a Stinger and he saw the private jet exploding. Then he got really scared and hid inside the hedge. Got a few scratches from that.”
“I don’t care about his scratches!” – interrupted Dieter – “Do we have a description of the assassin?”
“I was getting there” – complained Weil – “White, tall, strong, short dark hair. The Polish couldn’t get more out of the pensioner, he wasn’t very close and his eyesight isn’t particularly good.”
“That describes half of Europe” – grumbled Dieter – “Doesn’t help much”.
“It does exclude the other half” – Weil pointed out – “You know how Bild has been making noises about Zombie ISIS being behind it.”
“Bild can write whatever nonsense they want” – Dieter replied curtly – “We’re in charge of the investigation, not them”.
“There’s more” – Weil closed his eyes and breathed deeply, his patience wearing thin – “While among the bushes, the pensioner heard a petrol engine starting up and going away.”
“That does make things easier, petrol engines are getting quite uncommon” – Dieter got a tiny bit happier.
“This is Poland we’re talking about, Ernst.” – countered Weil – “Petrol engines are still the majority there. But it allows us to exclude the environmentalists as the culprits.”
“You’ve got to be joking. You think some eco-terrorists would go as far as requiring their assassinations to be carbon-free?” – replied Dieter incredulously.
“I’m dead serious.” – replied Weil – “They are downright religious about being carbon-free. Have you forgotten that time when Partigiani Padani tried to hijack a cruise ship, but were quickly turned into minced meat by the police? It turns out that they were using a carbon-free alternative to gunpowder, which meant that their guns failed more often than not.”
“Meh. It’s not as if the eco-terrorists would be after Zhang anyway.” – replied Dieter – “I’m still struggling to see any connection between him and Norquist. Do they have any enemies in common at all? Why would anyone want both of them dead?”
At this point an assistant barged in: “Ernst, Robert, you’ve got to see this. They posted a manifesto.”
“They uploaded a torrent to ThePirateBay, and have been posting links to it all over social media: Twitter, Facebook, Reddit… it’s spreading like wildfire. Here is the file” – the assistant showed it on his tablet.
“For fuck’s sake, it’s 200 pages! And it’s written in German, French, English, and Spanish! Self-important lunatics.” – exclaimed Dieter.
A while later, Weil started summarising:
“They’re promising to “eradicate billionairism” in Europe. They’ve started with the “worst offenders”, but they emphasize that nobody with more than a billion euros is safe. They write that they don’t want to kill anybody, just redistribute wealth, so anyone can stop being a target by giving money away. Then there’s some blah blah about class war, media brainwashing people, democracy being a tool of oppression, and taking direct action. Lots of pages lamenting the death of the staff in the private jets, praising their “heroic sacrifice”, and warning anybody working for billionaires to get away.”
“Communists!” – cursed Dieter – “We’re in 2041 and have to deal with communist terrorists?! What is this, a 20th century revival?”
“It does give a hint to our next target: the richest man in Europe is now Jules Hermet, a French banker.” – replied Weil. He turned to the assistant: “Call Mr. Hermet. His security is not our responsibility, but we can help with intelligence.”
“Come on, the target will certainly not be Hermet!” – interjected Dieter – “Norquist and Zhang were caught by surprise, but how could they possibly get Hermet when everybody is expecting them?”
“They could have started with no-name billionaires instead of Norquist and Zhang. It would have been easier, but they went for the spectacle instead.” – countered Weil – “For terrorists it’s only the spectacle that matters. And what would be more spectacular than getting Hermet now?”
“Maybe.” – grumbled Dieter – “But they must also hit the poorest billionaires at some point. If only the richest one is in danger there is no terror. And without terror they have nothing, there’s no way they’ll manage to kill hundreds of billionaires.”
“Maybe.” – concurred Weil – “I wonder what they are hoping to achieve. Do they seriously expect billionaires to give money away?”
Dieter laughed – “Not even these lunatics can believe that. They did specify that they’ll only kill in Europe, so they’re probably expecting them to flee to the Caribbean or their bunkers in New Zealand or whatever.”
“That makes sense.” – said Weil pensively – “As the number of billionaires here dwindles, the terror increases among the remaining ones, so they might really believe they can make Europe free of billionaires.”
The assistant chimed in again – “Just got a message from the linguistics analyst. She says that the German version of the manifesto has some grammatical mistakes that are typical of native French speakers. She is now trying to find experts in English and Spanish to see if the pattern repeats”.
“French communists, hä!” – exclaimed Dieter – “We are after a French communist soldier. Someone that probably has experience with Stingers, and is no longer active in the military.”
“There can’t be so many.” – concurred Weil – “Let’s ask the French military for a list. Emphasis on those that are working on private security after leaving the military, it’s the perfect cover for a terrorist.”

—/—/—

“Ah these loudmouths. They couldn’t resist posting the damn manifesto.” – thought Pierre Barère annoyed – “at least they agreed to wait until I got the second target down. Otherwise the mission would certainly be a failure. Now it will just probably be a failure.”
His thoughts were interrupted by the train announcing the stop at the station of Lille.
“Argh how I hate trains. Stuck here with all this noise, all these peasants. With an airplane at least the torture is over quickly. But no, the airspace is closed. Motherfuckers.” – he chuckled as he realised the non-sequitur – “Ok, no, that’s my own fault.” – he laughed out loud. The feeling of power was good. His good mood quickly evaporated as his thoughts turned to the mission ahead.
“It had to be this fucking country! I had to show my real passport to board the train, and can’t even bring any weapons.” – cursed Barère mentally – “Knives! I have to terminate the target using bloody knives like a street thug. And I don’t even get a car. My “getaway car” is the bloody metro. Argh. Just want to get this over with and move on to the next target, that will be a proper operation with glamour.”

—/—/—

“Two days! Two days to get us the damn list! The bloody French are as stubborn and cranky as always. They act as if they’re doing us a big favour, when we’re the ones trying to save their asses!” – complained Dieter.
Weil ignored the whining and read the report from the IT specialist:
“It’s very promising. There are less than a hundred former soldiers employed in private security, we can tail them all. Most have social media accounts, and of these the vast majority have expressed right-wing opinions. Excluding these, we’re left with 27 suspects. One of them is employed as a bodyguard by the French Communist Party!”
“The French Communist Party?!” – snorted Dieter – “Do they still exist? I thought La France Insoumise had taken over the French far-left completely.”
“Not completely.” – answered Weil – “The communists refused to join. Maybe they still dream about the power they had back in the 20th century.”
“Let’s interrogate this… ” – Dieter looked at the file – “Pierre Barère then.”
“He took the Eurostar to London a couple of days ago.” – replied Weil – “and Jules Hermet is in France. We should focus on the other suspects now. We can always get Barère when he comes back to Schengen.”

—/—/—

“Doesn’t this freak ever leave his house!?” – thought Barère more annoyed than ever. He had installed a microcamera across the street from Frederic Hoyle, to establish his routine. It turned out, there wasn’t one. He saw a man coming every morning and leaving every evening, presumably his butler, but that was it. No sign of Hoyle. Lots of packages being delivered, though.
“I’d never seen such a low-key billionaire.” – thought Barère. “He owns a detached house in London, so he obviously has millions, but there’s nothing else! Where are the armies of servants, the fancy cars, the women, the parties? Is he trying hard to pretend to be poor, or is he just a freak?”
From the satellite pictures he could see a pool in the backyard, but that was pretty much it, no hidden helipad or similar extravagances. The roof was mildly interesting: it consisted entirely of solar panels. Such a large roof must generate a serious amount of power. Barère didn’t think much of it, and assumed that Hoyle was just a clean energy freak.
“This guy is supposed to be the richest man in Europe, the secret identity of Satoshi Nakamoto.” – thought Barère dismissively – “At least that’s what the Party says. Not my problem. They gave me a job, I’ll dispatch it, and go back to having fun. The first two targets were a lot of fun, I can’t complain.” – he basked in the memories of firing Stingers for a while, and slowly let his mood sour again as he came back to the reality of being holed up in a hotel in London with no end in sight.
“Am I going to have to break into his house?” – he thought with a tinge of desperation – “The Party couldn’t get the blueprints, or any information about the security system. That would be just foolish. Maybe I’ll catch the butler and beat some info out of him? That’s still dangerous, I’ll just have one night before Hoyle discovers that something is up and vanishes.”
At this moment, his camera showed what he had almost given up on seeing, Hoyle leaving the house.
“That’s my lucky break!” – Barère thought – “He’s alone, I might even manage to end this tonight.”
He ran out of his room, and calmly walked out of his hotel, that was strategically situated between Hoyle’s house and the nearest underground station.
“He’s walking straight towards me!” – Barère couldn’t believe his luck, and tightened the grip on the knife in his pocket.
“This is going to be so easy! Nobody on the street either, even the escape will be easy” – Barère could almost sing, until he noticed that Hoyle noticed him. First Hoyle slowed, then stopped, and then started walking back to his house.
“What a paranoid freak! Who turns around just because they saw someone looking at them on the street?!” – cursed Barère. He kept walking towards Hoyle at the same pace. Until Hoyle looked back discreetly, and quickened his pace. Barère also quickened his pace. Hoyle looked back again, and started running at full speed.
“Merde! It’s now or never!” – exclaimed Barère, and started running at full speed as well.
Hoyle was no match for Barère’s fitness, but he had a dozen metres of advantage, with the result that both reached Hoyle’s house at the same time. Barère had jumped with his knife out, aiming for Hoyle’s heart through his back, when the scene became visible to Hoyle’s security cameras. In less than a millisecond Hoyle’s AI correctly deduced that Hoyle was in mortal danger, and fired the house’s weapons.

Those were laser effectors that Rheinmetall had developed for the German Navy, but never used in combat. It turned out that atmospheric dispersion limited their range too much, and the power supply for a 60 kW pulse was too bulky. Neither was a problem for a house. Laser effectors also had the critical advantage that they were unknown, and their components passed for research lasers to untrained eyes. As such Hoyle had no problem smuggling them through the British customs, which were extremely effective at blocking explosives and firearms.

The lasers instantly vaporised Barère’s head, while making Hoyle’s skin only slightly warmer. In this respect they were completely successful. The fatal problem was completely unforeseen. While photons do carry momentum, despite being massless, the momentum of even a 60 kW laser pulse was still too small to make any measurable difference for Barère’s headless body, that drove the knife into Hoyle’s heart purely out of inertia.


## Thou shalt not optimize

It’s the kind of commandment that would actually impress me if it were in the Bible, something obviously outside the knowledge of a primitive people living in the Middle East millennia ago. It’s quite lame to have a book that was supposedly dictated by God, but only contains stuff that they could come up with themselves. God could just dictate the first 100 digits of $\pi$ to demonstrate that people were not just making shit up. There’s a novel by Iain Banks, The Hydrogen Sonata, that has this as a key plot point: as a prank, an advanced civilization gives a “holy book” to a primitive society from another planet containing all sorts of references to advanced science that the latter hadn’t developed yet. The result is that when the primitives eventually develop interstellar travel, they become the only starfaring civilization in the galaxy that still believes in God.

Why not optimize, though? The main reason is that I have carefully optimized some code that I was developing, only to realize that it was entirely the wrong approach and delete the whole thing. It wasn’t the first time this happened. Nor the second. Nor the third. But somehow I never learn. I think optimization is addictive: you have a small puzzle that is interesting to solve and gives you immediate positive feedback: yay, your code runs twice as fast! It feels good, it feels like progress. But in reality you’re just procrastinating.

Instead of spending one hour optimizing your code, just let the slow code run during this hour. You can then take a walk, relax, get a coffee, or even think about the real problem, trying to decide whether your approach makes sense after all. You might even consider doing the unthinkable: checking the literature to see if somebody solved a similar problem already and their techniques can be useful for you.

Another reason not to optimize is that most code I write is never going to be reused, I’m just running it a couple of times myself to get a feeling about the problem. It then really doesn’t matter if it takes ten hours to run when you could optimize it to run in ten minutes instead. Just let it run overnight.

Yet another reason not to optimize is that this really clever technique for saving a bit of memory will make your code harder to understand, and possibly introduce a bug. What you should care about is having code that is correct, not fast.

On the flipside, there are optimizations that do make your code easier to understand. I saw code written by somebody else that was doing matrix multiplication by computing each matrix element as $c_{i,j} = \sum_k a_{i,k}b_{k,j}$. That’s slow, hard to read, and gives plenty of opportunity for bugs to appear. You should then use the built-in matrix multiplication instead, but the optimization is incidental, the real point is to avoid bugs.
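To make the comparison concrete, here is a minimal sketch in Python with NumPy (my choice for illustration; the code I saw was not necessarily in this language, and `matmul_loops` is a made-up name). The hand-written version spells out the sum over $k$, while the built-in version is a single operator.

```python
import numpy as np


def matmul_loops(a, b):
    """Element-by-element product: c[i, j] = sum_k a[i, k] * b[k, j]."""
    n, m = a.shape
    m2, p = b.shape
    if m != m2:
        raise ValueError("inner dimensions do not match")
    c = np.zeros((n, p))
    for i in range(n):
        for j in range(p):
            for k in range(m):
                # Easy to mix up i, j, k here; that is where bugs creep in.
                c[i, j] += a[i, k] * b[k, j]
    return c


rng = np.random.default_rng(0)
a = rng.standard_normal((4, 3))
b = rng.standard_normal((3, 5))

c_loops = matmul_loops(a, b)
c_builtin = a @ b  # built-in matrix multiplication

print(np.allclose(c_loops, c_builtin))
```

Both give the same matrix up to rounding, but the one-liner leaves no indices to get wrong.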

There are two exceptions to the “thou shalt not optimize” commandment that I can think of: the first is if your code doesn’t run at all. Sometimes this indicates that you’re doing it wrong, but if you really can’t think of something better, yeah, optimizing it is the only way to get an answer. The second is if your paper is already written, the theorems are already proven, you’re certain your approach is the correct one, and you’re about to publish the code together with your paper. Then if you optimize it you are a great person and I’ll love you forever. Just remember to use a profiler.
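As a rough illustration of what using a profiler looks like, here is a sketch with Python's standard `cProfile` and `pstats` modules. The functions `slow_sum_of_squares`, `cheap_setup`, `main`, and `profile_report` are hypothetical stand-ins for your own code; the point is only that the report tells you where the time goes before you touch anything.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    """Deliberately naive hot spot: builds a whole list before summing."""
    return sum([i * i for i in range(n)])


def cheap_setup():
    """Cheap step that is not worth optimizing."""
    return 200_000


def main():
    n = cheap_setup()
    return slow_sum_of_squares(n)


def profile_report(func, top=5):
    """Run func under cProfile; return its result and a text report."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time so the expensive call chains come first.
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()


result, report = profile_report(main)
print(report)
```

In the printed table, `slow_sum_of_squares` should sit near the top and `cheap_setup` near the bottom, which is exactly the information you need to avoid optimizing the wrong function.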


## Tsirelson Memorial Workshop

(thanks to Flavio Baccari for the photo)

After more than two years of the world conspiring against me and Miguel, we finally managed to pull it off: the workshop in honour of Boris Tsirelson happened! In my opinion, and that of the participants I asked, it was a resounding success: people repeatedly praised the high level of the talks, and were generally happy with finally having a conference in person again. An often-overlooked but crucial part of every conference is the after-hours program, which is rather lame in online conferences if it happens at all. I didn’t really take part here as I have a young child, but the participants told me that it was quite lively. We did have problems because of the recent corona wave in Austria and the war in Ukraine, but still, it happened.

Initially we had planned to have a small workshop, but we ended up with 78 registered participants (and some unregistered ones). It was great having so much interest, but it did create problems. The idea was to have a small number of long talks, where the authors could go through their results in depth, and people would have plenty of time for discussion. We kept the talks long (45 minutes), but we ended up with a completely packed schedule (9 invited talks + 22 contributed). We thought this wouldn’t be a problem, as people could simply skip the talks they were not interested in and use that time for discussion. That didn’t work. It turns out students felt guilty about skipping talks (I never did), and there wasn’t a good place for discussing in our conference venue. We apologize for that. Another issue is that we had to review a lot of contributions (44); thanks to a small army of anonymous referees we managed to get through them, but we still had to do the painful job of rejecting good contributions for lack of space.

A curious piece of feedback I got from some participants is that the talks were too long. The argument was that if you are not familiar with the topic already you won’t be able to understand the technical details anyway, so the extra time we had to go through them was just tiresome. I should do some polling to determine whether this sentiment is widespread. In any case, several long talks on the same day are indeed tiresome; perhaps we could reduce the time to 30 minutes. What I won’t ever do is organize a conference with 20-minute talks (which unfortunately happens often); most of the time is spent in introducing the problem, the author can barely state what their result was, and there’s no chance of explaining how they did it.

There were two disadvantages of organizing a conference that I hadn’t thought of: first, that even during the conference I was rather busy solving problems, and couldn’t pay much attention to the talks, and secondly that I couldn’t present my own paper that I had written specially for it.

As for the content of the talks, there were plenty that I was excited about, like Brown’s revolutionary technique for calculating key rates in DIQKD, Ligthart’s impressive reformulation of the inflation hierarchy, Farkas and Łukanowski’s simple but powerful technique for determining when DIQKD is not possible, Plávala’s radically original take on GPTs, Wang’s tour de force on Bell inequalities for translation-invariant systems, Sajjad’s heroic explanation of the compression technique for a physics audience, among others. But I wanted to talk a bit more about Scarani’s talk.

He dug out an obscure unpublished paper of Tsirelson, where he studied the following problem: you have a harmonic oscillator with period $\tau$, and do a single measurement of its position at a random time, either at time 0, $\tau/3$, or $2\tau/3$. What is the probability that the position you found is positive? It turns out that the problem is very beautiful, and very difficult. Tsirelson proved that in the classical case the probability is at most $2/3$, but ironically enough couldn’t find out what the bound is in the quantum case. He showed that one can get up to 0.7054 with a really funny Wigner function with some negativity, but as for an upper bound he only proved that it is strictly below 1. Scarani couldn’t find the quantum bound either; he developed a finite-dimensional analogue of the problem that converges to the harmonic oscillator in the infinite limit and got some tantalising results using it, but no proof. The problem is still open, then. If you can prove it, you’ll be to Tsirelson what Tsirelson was to Bell.
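To get a feeling for where the classical $2/3$ comes from, here is a minimal sketch, assuming a classical oscillator with a definite amplitude and phase, so that $x(t) \propto \cos(2\pi t/\tau + \varphi)$. The function names are mine, not Tsirelson’s or Scarani’s. The three cosines are 120° apart and sum to zero, so they can never all be positive; the code just scans over the phase to confirm that two out of three is the best one can do. A classical probability distribution over phase space is a mixture of such trajectories, so it cannot do better.

```python
import math

def classical_score(phase: float) -> float:
    """Fraction of the times 0, tau/3, 2*tau/3 at which
    x(t) = cos(2*pi*t/tau + phase) is strictly positive."""
    angles = [phase + 2 * math.pi * k / 3 for k in range(3)]
    return sum(math.cos(a) > 0 for a in angles) / 3

def best_classical_score(samples: int = 10_000) -> float:
    """Scan the initial phase on a uniform grid and return the best score."""
    return max(classical_score(2 * math.pi * i / samples) for i in range(samples))

if __name__ == "__main__":
    print(best_classical_score())
```

This says nothing about the quantum case, of course; that is precisely the part that is still open.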


## The horrifying world of confidence intervals

We often see experimental results reported with some “error bars”, such as saying that the mass of the Higgs boson is $125.10 \pm 0.14\, \mathrm{GeV/c^2}$. What do these error bars mean, though? I asked some people what they thought they were, and the usual answer was that the true mass is inside those error bars with high probability. A very reasonable thing to expect, but it turns out that this is not true. Usually these error bars represent a frequentist confidence interval, which has a very different definition: it says that if you repeat the experiment many times, a high proportion of the confidence intervals you generate will contain the true value.
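The frequentist definition is easy to simulate. The following is a rough sketch under assumptions of my own choosing: Gaussian noise with known standard deviation, 25 data points per experiment, and the textbook 95% interval $\bar{x} \pm 1.96\,\sigma/\sqrt{n}$. The numbers are illustrative and have nothing to do with how the Higgs mass is actually measured. It repeats the experiment many times and counts how often the interval contains the true value.

```python
import math
import random

def coverage(true_mean=125.10, sigma=1.0, n=25, repeats=4000, seed=0):
    """Fraction of repeated experiments whose 95% confidence interval
    (sample mean +/- 1.96*sigma/sqrt(n)) contains the true mean."""
    rng = random.Random(seed)
    half_width = 1.96 * sigma / math.sqrt(n)
    hits = 0
    for _ in range(repeats):
        sample_mean = sum(rng.gauss(true_mean, sigma) for _ in range(n)) / n
        if abs(sample_mean - true_mean) <= half_width:
            hits += 1
    return hits / repeats

if __name__ == "__main__":
    print(coverage())  # should come out close to 0.95
```

Note what is being counted: the proportion of intervals, over many hypothetical repetitions, that cover the true value. Nothing here assigns a probability to the true value lying in the one interval you actually got.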

Fair enough, one can define things like this, but I don’t care about hypothetical confidence intervals of experiments I didn’t do. Can’t we have error bars that represent what we care about, the probability that the true mass is inside that range? Of course we can, that is a Bayesian credible interval. Confusingly enough, credible intervals will coincide with confidence intervals in most cases of interest, even though they answer a different question and can be completely different in more exotic problems.

Let’s focus then on the Bayesian case: is the intuitive answer people gave correct then? Yes, it is, but it doesn’t help us define what the credible interval is, as there will be infinitely many intervals that contain the true value with probability (e.g.) 0.95. How do we pick one? A nice solution would be to demand the credible interval to be symmetric around the estimate, so that we could have the usual $a\pm b$ result. But think about the most common case of parameter estimation: we want to predict the vote share that some politician will get in an election. If the poor candidate was estimated to get 2% of the votes, we can’t have the error bars to be $\pm$4%. Even if we could do that, there’s no reason why it should be symmetric: it’s perfectly possible that a 3% vote share is more probable than a 1% vote share.

A workable, if more cumbersome, definition is the Highest Posterior Region: it is a region where the posterior density at every point inside it is higher than at any point outside it. It is well-defined, except for some pathological cases we don’t care about, and is also the smallest possible region containing the true value with a given probability. Great, no? What could go wrong with that?

Well, for starters it’s a region, not an interval. Think of a posterior distribution that has two peaks: the highest posterior region will be two intervals, each centred around one of the peaks. It’s not beautiful, but it’s not really a problem, the credible region is accurately summarizing your posterior. Your real problem is having a posterior with two peaks. How did that even happen!?
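Here is a minimal sketch of how one could compute a highest posterior region numerically, assuming a made-up two-peaked posterior (an equal mixture of two unit-variance Gaussians centred at $-3$ and $3$) and a simple grid approximation. All names are hypothetical. The idea is to add grid cells in order of decreasing density until the desired probability mass is reached, and then merge neighbouring cells into intervals.

```python
import math

def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def posterior(x):
    # Hypothetical two-peaked posterior: equal mixture of N(-3, 1) and N(3, 1).
    return 0.5 * normal_pdf(x, -3.0, 1.0) + 0.5 * normal_pdf(x, 3.0, 1.0)

def highest_posterior_region(density, lo, hi, mass=0.95, n=20_000):
    """Grid approximation of the highest posterior region on [lo, hi]."""
    dx = (hi - lo) / n
    xs = [lo + (i + 0.5) * dx for i in range(n)]  # cell midpoints
    ps = [density(x) for x in xs]
    # Add cells from highest to lowest density until enough mass is collected.
    order = sorted(range(n), key=lambda i: ps[i], reverse=True)
    selected = [False] * n
    total = 0.0
    for i in order:
        if total >= mass:
            break
        selected[i] = True
        total += ps[i] * dx
    # Merge runs of adjacent selected cells into intervals.
    intervals, start = [], None
    for i, inside in enumerate(selected):
        if inside and start is None:
            start = i
        elif not inside and start is not None:
            intervals.append((xs[start] - dx / 2, xs[i - 1] + dx / 2))
            start = None
    if start is not None:
        intervals.append((xs[start] - dx / 2, xs[-1] + dx / 2))
    return intervals

if __name__ == "__main__":
    for a, b in highest_posterior_region(posterior, -10.0, 10.0):
        print(f"[{a:.2f}, {b:.2f}]")
```

For this posterior the region comes out as two separate intervals, one around each peak, with the whole valley around zero left out.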

But this shows a more serious issue: the expectation value of a two-peaked distribution might very well be in the valley between the peaks, and this will almost certainly be outside the highest posterior region. Can this happen with a more well-behaved posterior, that has a single peak? It turns out it can. Consider the probability density
$p(x) = (\beta-1)x^{-\beta},$ defined for $x \ge 1$ and $\beta > 2$. To calculate the highest posterior region for some confidence $1-\alpha$, note that $p(x)$ is monotonically decreasing, so we just need to find $\gamma$ such that
$\int_1^\gamma \mathrm{d}x\, (\beta-1)x^{-\beta} = 1-\alpha.$ The integral evaluates to $1-\gamma^{1-\beta}$, so solving that we get $\gamma = \frac1{\sqrt[\beta-1]{\alpha}}$. As for our estimate of the (fictitious) parameter we take the mean of $p(x)$, which is $\frac{\beta-1}{\beta-2}$. For the estimate to be outside the credible interval we then need that
$\frac{\beta-1}{\beta-2} > \frac1{\sqrt[\beta-1]{\alpha}},$ which is a nightmare to solve exactly, but easy enough if we realize that the mean diverges as $\beta$ gets close to 2, whereas the upper boundary of the credible interval tends to a finite value, $1/\alpha$, and stays below it for every $\beta > 2$. If we then choose $\beta$ such that the mean is $1/\alpha$, it will always be outside the credible interval!
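A quick numerical sanity check of this claim, as a sketch with function names of my own invention: setting the mean $\frac{\beta-1}{\beta-2}$ equal to $1/\alpha$ and solving for $\beta$ gives $\beta = \frac{2-\alpha}{1-\alpha}$, and we can then compare the mean with the upper boundary $\gamma = \alpha^{-1/(\beta-1)}$ for a few values of $\alpha$.

```python
def hpr_upper(alpha: float, beta: float) -> float:
    """Upper boundary gamma of the highest posterior region [1, gamma]."""
    return alpha ** (-1.0 / (beta - 1.0))

def mean(beta: float) -> float:
    """Mean of p(x) = (beta - 1) * x**(-beta) on x >= 1, valid for beta > 2."""
    return (beta - 1.0) / (beta - 2.0)

def beta_for_mean_one_over_alpha(alpha: float) -> float:
    """The beta for which the mean equals 1/alpha."""
    return (2.0 - alpha) / (1.0 - alpha)

if __name__ == "__main__":
    for alpha in (0.01, 0.05, 0.32):
        beta = beta_for_mean_one_over_alpha(alpha)
        print(alpha, beta, mean(beta), hpr_upper(alpha, beta))
```

For $\alpha = 0.05$ this gives $\beta \approx 2.05$, a mean of 20, and a credible interval that ends at about 17.2, so the estimate is indeed comfortably outside it.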

A possible answer is “deal with it, life sucks. I mean, there’s a major war going on in Ukraine, estimates lying outside the credible interval is the least of our problems”. Fair enough, but maybe this means we chose our estimate wrong? If we take our estimate as the mode of the posterior then by definition it will always be inside the highest posterior region. The problem is there’s no good justification for using the mode as the estimate. The mean can be justified as the estimate that minimizes the mean squared error, which is quite nice, but I know of no similar justification for the mode. Also, the mode is rather pathological: if our posterior again has two peaks, but one of them is very tall and has little probability mass, the mode will be there but will be a terrible estimate.

A better answer is that sure, life sucks, we have to deal with it, but note that the probability distribution $(\beta-1)x^{-\beta}$ is very pathological. It will not arise as a posterior density in any real inference problem. That’s fine, it just won’t help against Putin. Slava Ukraini!
