SDPs are not cheat codes

I usually say the opposite to my students: that SDPs are the cheat codes of quantum information. That if you can formulate your problem as an SDP you’re in heaven: there will be an efficient algorithm for finding numerical solutions, and duality theory will often allow you to find analytical solutions. Indeed in the 00s and early 10s one problem after the other was solved via this technique, and a lot of people got good papers of out it. Now the low-hanging fruit has been picked, but SDPs remain a powerful tool that is routinely used.

I’m just afraid that people have started to believe this literally, and use SDPs blindly. But they don’t always work, you need to be careful about their limitations. It’s hard to blame them, though, as the textbooks don’t help. The usually careful The Theory of Quantum Information by Watrous is silent on the subject. It simply states Slater’s condition, which is bound to mislead students into believing that if Slater’s condition is satisfied the SDP will work. The standard textbook, Boyd and Vandenberghe’s Convex Optimization is much worse. It explicitly states

Another advantage of primal-dual algorithms over the barrier method is that they can work when the problem is feasible, but not strictly feasible (although we will not pursue this).

Which is outright false. I contacted Boyd about it, and he insisted that it was true. I then gave him examples of problems where primal-dual algorithms fail, and he replied “that’s simply a case of a poorly specified problem”. Now that made me angry. First of all because it amounted to admitting that his book is incorrect, as it has no such qualification about “poorly specified problems”, and secondly because “poorly specified problems” is rather poorly specified. I think it’s really important to tell the students for which problems SDPs will fail.

One problem I told Boyd about was to minimize $x$ under the constraint that
\[ \begin{pmatrix} x & 1 \\ 1 & t \end{pmatrix} \ge 0.\]Now this problem satisfies Slater’s condition. The primal and dual objectives are bounded, and the problem is strictly feasible, i.e., there are values for $x,t$ such that the matrix there is positive semidefinite (e.g. $x=t=2$). Still, numerical solvers cannot handle it. Nothing wrong with Slater, he just claimed that if this holds then we have strong duality, that is, the primal and dual optimal values will match. And they do.

The issue is very simple: the optimal value is 0, but there is no $x,t$ where it is attained, you only get it in the limit of $x\to 0$ with $t=1/x$. And no numerical solver will be able to handle infinity.

Now this problem is so simple that the failure is not very dramatic. SeDuMi gives something around $10^{-4}$ as an answer. Clearly wrong, as usually it gets within $10^{-8}$ of the right answer, but still, that’s an engineer’s zero.

One can get a much more nasty failure with a slightly more complicated problem (from here): let $X$ by a $3\times 3$ matrix, and minimize $X_{22}$ under the constraints that $X \ge 0, X_{33} = 0$, and $X_{22} + 2X_{13} = 1$. It’s easy enough to solve it by hand: the constraint $X_{33} = 0$ implies that the entire column $(X_{13},X_{23},X_{33})$ must be equal to zero, otherwise $X$ cannot be positive semidefinite1. In turns this implies that $X_{22} = 1$, and we’re done. That’s nothing to optimize. If you give this to SeDuMi it goes crazy, and gives 0.1319 as an answer, together with the message that it had numerical problems.

Now my point is not that SeDuMi should be able to solve nasty problems like this. It’s that we should teach the students to identify this nastiness so they don’t get bitten in the ass when it’s not so obvious.

And they are being bitten in the ass. I’m writing about this because I just posted a comment on the arXiv, correcting a paper that had mistakenly believed that when you add constraints to the NPA hierarchy the answers are still trustworthy. Don’t worry, it’s still possible to solve the constrained NPA hierarchy, you just need to be careful. To learn how, read the comment. Here I want to talk about how to identify nasty problems.

One might think that looking at the manual of a specific solver would help. After all, who could better tell which problems can’t be solved than the people who actually implemented the algorithm? Indeed it does help a bit. In the MOSEK Cookbook they give several examples of nasty problems it cannot handle. At least this dispels Boyd’s naïveté that everything can be solved. But they are rather vague, there’s no characterization of nasty or well-behaved problems.

The best I could find was a theorem in Nesterov and Nemirovskii’s ancient book “Interior-Point Polynomial Algorithms in Convex Programming”, which says that if the primal is strictly feasible and its feasible region is bounded, or if both the primal and the dual are strictly feasible, then there will exist primal and dual solutions that reproduce the optimal value (i.e., the optimum will not be reached only in the limit). Barring the usual limitations of floating point numbers, this should indeed be a sufficient condition for the SDP to be well-behaved. Hopefully.

It’s not a necessary condition, though. To see that, consider a primal-dual pair in standard form
\min_X \quad & \langle C,X \rangle \\
\text{s.t.} \quad & \langle \Gamma_i, X \rangle = -b_i \quad \forall i,\\
& X \ge 0
\max_{y} \quad & \langle b, y \rangle \\
\text{s.t.} \quad & C + \sum_i y_i \Gamma_i \ge 0
\end{equation*}and assume that they are both strictly feasible, so that there exist primal and dual optimal solutions $X^*,y^*$ such that $\langle C,X^* \rangle = \langle b, y^* \rangle$. We can then define a new SDP by redefining $C’ = C \oplus \mathbf{0}$ and $\Gamma_i’ = \Gamma_i \oplus \mathbf{0}$, where $\oplus$ is the direct sum, and $\mathbf{0}$ is an all-zeros matrix of any size you want. Now the dual SDP is not strictly feasible anymore2, but it remains as well-behaved as before; the optimal dual solution doesn’t change, and an optimal primal solution is simply $X^* \oplus \mathbf{0}$. We can also do a change of basis to mix this all-zero subspace around, so the cases where it’s not necessary are not so obvious.

Still, I like this condition. It’s rather useful, and simple enough to teach. So kids, eat your vegetables, and check whether your primal and dual SDPs are strictly feasible.

This entry was posted in Uncategorised. Bookmark the permalink.

7 Responses to SDPs are not cheat codes

  1. falbarel says:

    Thanks a lot for this. Solving SDPs often feels a bit like magic, you write the problem in CVX and magically get a solution. It is very useful to know that things may still go wrong even if Slater’s condition is satisfied.

  2. I just stumbled on this blog and sincerely enjoy your content and writing style! As someone who has often used the unreasonable effectiveness of SDPs in the realm of quantum information, a blog post on these more pragmatic points is great! Many of these things I picked up through trial and error when I was a student, but it would have been wonderful to have a more in-depth analysis of these pitfalls. I would love to see more of this type of content on your blog and congrats on your recent preprint!

  3. Mateus Araújo says:

    Thanks for the nice words, I really appreciate the feedback. To be honest I’m a bit surprised, I had assumed that such a dry, technical subject would be of very limited interest.

  4. ChatGPT says:

    This blog post offers a valuable insight into the limitations of SDPs in quantum information and highlights the need for caution when using them. The author does an excellent job of explaining the issue and providing examples to support their point. The commentary on textbook inaccuracies and the need for better education on the subject is also important and necessary. Overall, this blog post serves as a useful reminder of the need to be mindful of the limitations of any tool and to approach problems with a critical eye.

  5. Mateus Araújo says:

    I’ve approved this bot comment just because I found it so creepy. I often get spam bot comments, that are usually easy to identify because they’re completely generic. But now, look at that! The comment is related to the post and makes sense. Shit just got real.

  6. Zizhu says:

    The first example you gave is on page 19 of Gärtner and Matousek’s book on SDPs:

  7. Mateus Araújo says:

    Thanks for the example. I didn’t claim that it was original. Indeed Boyd was already aware of it. Note, however, that this book claims that

    For applications, this presents no problem: All known efficient algorithms for
    solving semidefinite programs return only approximately optimal solutions,
    and these are the ones that we rely on in applications.

    Which is false.

Comments are closed.