Zhengfeng Ji
Seminars on Quantum Computing is a quantum computing course designed for undergraduate students in the computer science.
This is an incomplete set of notes prepared by Zhengfeng Ji for the course.
What is quantum computing, or more generally, what is quantum information science?
The first example question in ChatGPT: Explain quantum computing in simple terms.
The answer I got was:
Quantum computing is a type of computing that uses the principles of quantum mechanics, the science of the very small, to process information. Instead of using classical bits, which can be either 0 or 1, quantum computers use quantum bits or qubits, which can be in multiple states at the same time.
This ability to exist in multiple states simultaneously is called superposition and allows quantum computers to perform certain calculations much faster than classical computers. In addition, quantum computers can also use a phenomenon called entanglement, where the state of one qubit is correlated with the state of another, to further enhance their computational power.
Quantum computing has the potential to solve some of the world's most challenging problems in fields such as cryptography, drug discovery, and climate modeling. However, building a fully functional quantum computer remains a major scientific and engineering challenge, and many of the applications of quantum computing are still in the experimental stage.
Quantum + Information: The second quantum revolution
The success of quantum mechanics. Laser, GPS, etc.
From passive understanding of the collective effects to active manipulation of individual particles.
History and highlights
Current status
Primitive quantum computers are being built. Many players are racing to build quantum computers: Google, IBM, China, …
Different labs/companies are betting on different hardware routes: Superconducting qubits, ion traps, photonic systems, topological qubits, …
Nowadays, there are programmable quantum computers with 50-400 qubits.
Major problem: devices are noisy!
Supremacy and simulation of Google and USTC
Sycamore, 53 qubits
Zhuchongzhi, 66 qubits
The biggest competitor to quantum computers are clever classical algorithms: In Nov 2021, a group from Chinese Academy of Sciences used tensor network algorithms to reproduce the statistics of the Google supremacy experiment using 512 GPUs in 15 hours.
On the theory side: new algorithms, new protocols, fault-tolerant quantum computing, new insights on the power of quantum computers
More computer scientists get involved
Hype?
TIME's new cover: Quantum computers will transform our world—and create a 21st century "space race"
Quantum computing has a hype problem, MIT Technology Review, By Sankar Das Sarma.
Quantum computing startups are all the rage, but it's unclear if they'll be able to produce anything of use in the near future.
What you will learn in this subject
Is quantum computing hard to learn?
The mathematics is easy, the physics is hard.
There is a simple mathematical framework describing quantum mechanics with merely four postulates, about state space, evolution, measurement, and joint system, respectively, in both the pure state and mixed state settings.
The postulates are simple, but the truly amazing thing to note is how far these four simple postulates can bring us.
"Shut up and calculate!"
Mingsheng Ying and Zhengfeng Ji
Office: FIT 3-515 and East Main Building 8-307
Research direction of Mingsheng
Quantum computing, theory of programming, mathematical logic.
Research direction of Zhengfeng
Quantum computing, theory of computing.
Online: Wechat group, learn.tsinghua.edu.cn, email
Office hour: after each class
Homework on the quantum information basics. (40%)
Work on projects in groups and take an idea to the extreme (Given projects, or projects of your choice approved by your lecturers), write an essay, present your findings.
We will grade your project by its writing and format (10%), originality (20%), completion (20%), and presentation (10%).
Quantum Computation and Quantum Information (Mike & Ike)
Foundations of Quantum Programming by Mingsheng (Tsinghua Online Access)
Computer science
Mathematics
Physics
Three basic statements about the physical world
Probability
\(\sum_j P_j = 1, P_j \ge 0\)
Monotonicity
Single-slit case
Locality
Things can only propagate through the universe at a certain speed.
In physics, locality naturally emerges from Einstein's special theory of relativity, which implies that no signal can propagate faster than the speed of light.
Local Realism
Things have predetermined value, which depends on parameters in a local region.
Einstein: I like to think the moon is there even if I am not looking at it.
Church-Turing thesis
What is Church-Turing thesis?
TCS is mathematics but the Church-Turing thesis connects it to physics.
Information is physical.
Conversely: It from qubit.
Quantum mechanics and the above principles
Probability stays, yet monotonicity goes away.
Locality stays but realism is abandoned.
EPR and Bell's inequality.
Quantum computing does not falsify Church-Turing, but seriously challenges the extended Church-Turing thesis.
Watch this video.
Double-slit with an observer: which path did the electron take?
If there is a conscious observer, the distribution goes back to classical! That is the same as decoherence in a quantum system interacting with an environment!
Quantum mechanics is an (unusual) theory using probability.
A cat isn't found in a superposition of alive and dead states, because it interacts constantly with its environment. These interactions essentially leak information about the 'cat system' out.
This explains the difficulty of building a quantum computer. Quantum computers need to be isolated well from the environment yet still easy to manipulate.
Shor's factoring algorithm can be thought of as a large-scale interference experiment.
Born's rule
Probability stays but monotonicity goes.
God does play dice, and an unusual one.
Einstein liked inventing phrases such as "God does not play dice," "The Lord is subtle but not malicious." On one occasion Bohr answered, "Einstein, stop telling God what to do."
Instead of probabilities, quantum mechanics uses amplitudes.
\(p = \abs{\alpha}^2\).
Let \(\alpha_i\) be the amplitude that it lands on a position through slit \(i\).
\(\alpha=\alpha_1 + \alpha_2\).
Example \(\alpha_1 = 1/2\) and \(\alpha_2 = -1/2\). What is the probability \(p\)?
Cancellation!
Consider the problem of flipping a coin.
\(\Pr(\text{heads}) = p, \Pr(\text{tails}) = q\).
Apply a transform \(X\).
In general, the transform is a matrix of transition probabilities \(p(a|b)\).
Markov chain and stochastic matrix.
Now consider two independent coins. This leads to the tensor product very naturally.
More generally, the tensor product of two matrices \(A, B\) is defined as a block matrix
\begin{equation*} A \otimes B = \begin{pmatrix} A_{1,1} B & A_{1,2} B & \cdots & A_{1,n} B\\ A_{2,1} B & A_{2,2} B & \cdots & A_{2,n} B\\ \vdots & \vdots & \ddots & \vdots \\ A_{m,1} B & A_{m,2} B & \cdots & A_{m,n} B\\ \end{pmatrix}. \end{equation*}
Consider the CNOT matrix.
It creates a correlation between the two coins.
Quantum mechanics is similar, except that it uses amplitudes.
A quantum state is a unit vector of \(\complex^N\).
Finite dimensional spaces suffice for this course.
A qubit is a superposition of 0 and 1.
Quantum states are superposition of possibilities.
Dirac notation.
ket, bra (\(\ket{0}, \ket{1}, \ket{+}, \ket{-}, \ket{\psi}\)).
Unitary. \(U^\dagger U = I\).
Examples: \(I, X, Y, Z, H, S, T\), CNOT, rotations.
\begin{equation*} H = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1\\ 1 & -1 \end{bmatrix}. \end{equation*}
\begin{equation*} H \ket{0} = \ket{+}, H \ket{1} = \ket{-}. \end{equation*}
Why unitary? Unitary operators are the only operators that preserve the length of vectors.
一场从根上开始的革命 (by Mingsheng Ying)
Probability, amplitude, and Born's rule
Two-slit experiment
A (classical) random bit the mixture of \(0\) or \(1\) described by probabilities.
A quantum bit (qubit) is a superposition state described by amplitudes for \(0\) and \(1\).
1-norm vs. 2-norm theory.
Linear algebra for probability
A (random) bit is represented as a vector \(\begin{pmatrix} p_0 \\ p_1 \end{pmatrix}\).
\(X\) and CNOT as operations on the probability vector.
\(X\) is the NOT gate in matrix form!
CNOT says if the control bit is \(1\), flip the target bit.
Tensor product arises naturally.
Binary symmetric channel as a matrix.
Linear algebra for quantum mechanics
Quantum gates
Unitary gates
Pauli \(X, Y, Z\)
The Hadamard gate \(H\)
In some sense, the most important quantum gate.
The Hadamard gate \(H\) has the matrix form
\begin{equation} H = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1\\ 1 & -1 \end{bmatrix}. \end{equation}
It maps \(\ket{0}, \ket{1}\) to \(\ket{+}, \ket{-}\), and vice versa.
\(H\) and \(H^2\).
Destructive interference (in the tree for \(H^2 \ket{0}\)).
All quantum gates are changes of basis.
We can expand \(\ket{0}\) in the basis of \(\ket{+}\) and \(\ket{-}\).
\(\ket{0}, \ket{1}\) is not special.
There are different ways to measure in quantum!
What if we measure along another choice of basis? Say, \(\ket{+}, \ket{-}\)?
Decompose the state \(\ket{\psi}\) as a combination of \(\ket{+}\) and \(\ket{-}\).
Two views:
The projection picture on the circle.
Projective measurements.
Like calculating the probability of two independent random bits, we can use the tensor product to represent the state of multiple qubits.
It is now the tensor product of quantum states (vectors of amplitudes instead of vectors of probabilities).
Two-qubit states
A general two-qubit state is
\begin{equation} \ket{\psi} = \alpha \ket{00} + \beta \ket{01} + \gamma \ket{10} + \delta \ket{11} \end{equation}
It is the superposition of four possibilities.
Born's rule gives the probability of four outcomes.
In principle there's no distance limitation between the two qubits.
Let's say the first qubit is in Alice's hand. What happens if Alice measures it?
If Alice obtained \(0\) outcome, then the state is
\begin{equation} \frac{\alpha \ket{00} + \beta \ket{11}}{\sqrt{\abs{\alpha}^2 + \abs{\beta}^2}}. \end{equation}
This is called the Partial Measurement Rule, which follows from the measurement postulate.
\begin{equation*} \begin{split} M_0 & = \ket{0}\bra{0} \otimes I\\ M_1 & = \ket{1}\bra{1} \otimes I \end{split} \end{equation*}
Two-qubit gates
Single-qubit gates as a two-qubit gate: \(X \otimes I\) and \(I \otimes X\).
Applying the same single-qubit gate on the two qubits: \(H \otimes H\).
CNOT gate.
We need some facts from linear algebra to take our grasp of quantum mechanics to the next level.
\(A (\sum_j a_j \ket{v_j}) = \sum_j a_j A \ket{v_j}\).
Matrix entries \(A_{ij} = \bra{i} A \ket{j}\).
Hilbert space: vector space with an inner product.
What is an inner product \((\,\cdot\,, \,\cdot\,)\)?
A concrete example is
\begin{equation*} (\ket{u}, \ket{v}) = \sum_{j=1}^d {u_j}^* v_j. \end{equation*}
Question: is this the only possibility?
Dirac notation \(\langle u | v \rangle\).
The norm is induced by the inner product (of the Hilbert space).
\begin{equation*} \norm{\ket{u}} = \sqrt{\langle u | u \rangle}. \end{equation*}
The normalization of \(\ket{u}\) is \(\ket{u} / \norm{\ket{u}}\).
The decomposition of a state into basis
\begin{equation*} \ket{\psi} = \sum_j a_j \ket{\lambda_j}. \end{equation*}
Then
\begin{equation*} a_j = \langle \lambda_j | \psi \rangle. \end{equation*}
Orthonormal basis and completeness relation:
If \(\{ \ket{\lambda_i} \}\) form an orthonormal basis, then we have
\begin{equation*} \sum_j \ket{\lambda_j} \bra{\lambda_j} = I. \end{equation*}
Adjoints of operators
Let \(A\) be a linear operator on a Hilbert space. There exists a unique operator \(A^\dagger\) such that for all \(\ket{v}, \ket{w}\),
\begin{equation*} (\ket{v}, A \ket{w}) = (A^\dagger \ket{v}, \ket{w}). \end{equation*}
In the matrix form, \(A^\dagger\) is the conjugate transpose of \(A\).
\(A \ket{\lambda} = \lambda \ket{\lambda}\).
Matrix \(A\) is diagonalizable if \(A = \sum_j \lambda_j \ket{\lambda_j} \bra{\lambda_j}\) where \(\ket{\lambda_j}\) form an orthonormal set of eigenvectors of \(A\).
Example: What is the eigenvalues and eigenvectors of \(X\)?
Spectral decomposition theorem
Theorem. Any normal operator \(A\) on space \(V\) is diagonalizable with respect to some orthonormal basis of \(V\), and vice versa.
Families of diagonalizable matrices
Matrix | Condition | Eigenvalues |
---|---|---|
Normal | \(A\) and \(A^\dagger\) commute | Complex |
Hermitian | \(H = H^\dagger\) | Real |
Unitary | \(U^\dagger U = I\) | Phase |
Projection | \(P^2 = P, P = P^\dagger\) | \(\{0,1\}\) |
Reflection | \(R^2 = I, R = R^\dagger\) | \(\{\pm 1\}\) |
\(A \otimes B\) is a block matrix whose \(i,j\)-th block is \(a_{i,j} B\).
\begin{equation*} X \otimes Y = \begin{bmatrix} 0 & 0 & 0 & -i\\ 0 & 0 & i & 0\\ 0 & -i & 0 & 0\\ i & 0 & 0 & 0 \end{bmatrix} \end{equation*}
For \(A = \sum_a a \ket{a}\bra{a}\), define \(f(A) = \sum_a f(a) \ket{a}\bra{a}\).
Example: \(\exp(A)\).
State
The state of a closed quantum system can be described by a unit vector in a Hilbert space (called the state space).
Closed: no observer, no leak of information to the environment!
If a "particle" can be in one of the \(d\) possibilities from a finite set \(\Gamma\), the state space is \(\mathcal{H} = \complex^\Gamma\). That is, to each possibility in \(\Gamma\), a complex number (the amplitude) is assigned.
\begin{equation*} \ket{\psi} = \begin{bmatrix} \psi_1 \\ \psi_2 \\ \vdots \\ \psi_d \end{bmatrix} \end{equation*}
such that \(\norm{\ket{\psi}} = \abs{\psi_1}^2 + \cdots + \abs{\psi_d}^2 = 1\).
Evolution
The discrete time evolution of the closed system can be described by a unitary operator. If the state at time \(t_0\) is \(\ket{\psi_0}\), the state at time \(t_1\) can be obtained as
\begin{equation*} \ket{\psi_1} = U \ket{\psi_0}. \end{equation*}
It also follows from the Schrödinger equation
\begin{equation*} i \hbar \frac{\partial}{\partial t} \ket{\psi(t)} = H \ket{\psi(t)}. \end{equation*}
Take \(\hbar = 1\), the evolution from \(t_0\) to \(t_1\) is
\begin{equation*} U = \exp \bigl(-i H (t_1 - t_0) \bigr). \end{equation*}
Measurement
Composite system
The state of a composite system whose components have states \(\ket{\psi_1}, \ket{\psi_2}, \ldots, \ket{\psi_m}\) is the tensor product state
\begin{equation*} \ket{\psi_1} \otimes \ket{\psi_2} \otimes \cdots \otimes \ket{\psi_m}. \end{equation*}
The measurement postulate
Quantum measurements are described by a collection \(\{M_m\}\) of measurement operators. These are operators acting on the system being measured and must satisfy
\begin{equation*} \sum_m M_m^\dagger M_m = I. \end{equation*}
The index \(m\) refers to the measurement outcomes that may occur.
(Compare the \(U^\dagger U = I\) condition)
If the state of the system is \(\ket{\psi}\) immediately before the measurement then the probability that result \(m\) occurs is given by
\begin{equation*} p(m) = \bra{\psi} M_m^\dagger M_m \ket{\psi}. \end{equation*}
The state after the measurement is
\begin{equation*} M_m \ket{\psi} \big/ \sqrt{p(m)}. \end{equation*}
Why this general form?
It is essentially Born's rule when we use ancillary systems and apply a joint unitary operation.
It is known as the Naimark theorem.
Hint: Consider a unitary
\begin{equation*} U = \begin{pmatrix} M_1 & \cdots & \cdots\\ \vdots & \ddots & \vdots\\ M_m & \cdots & \cdots \end{pmatrix} \end{equation*}
Special case: projective measurement if \(M_m = \ket{\psi_m}\bra{\psi_m}\) and this will recover the Born's rule.
Observable: an operator that models the observation of a physical quantity.
For any Hermitian operator \(O\), the spectral decomposition theorem tells us that it has the form
\begin{equation*} O = \sum_m m P_m, \end{equation*}
where \(m\) is from a finite subset of \(\real\).
\begin{equation*} \E (O) = \sum_m m\, p(m) = \bra{\psi} O \ket{\psi} = \langle O \rangle. \end{equation*}
So the average value \(\langle O \rangle\) on state \(\ket{\psi}\) is the expectation of outcome value \(m\).
Measurement collapses the state.
The three polarizer "paradox".
Gates are:
Invertible
This means that unitary gates never erase information. Compare this with classical gates!
Deterministic
Nothing is random in the application of a quantum gate.
Continuous
There is a unitary operation that represents the evolution of a smaller period of time.
Explains the necessity of using complex numbers.
What is the square root of the \(Z\) gate?
Measurements are:
Irreversible
We cannot go back to the state before measurement in general!
The superposition state (wave function) collapses.
Probabilistic
Born's rule.
Discontinuous
The collapse of the wave function in measurement is considered instantaneous!
How does quantum mechanics reconcile these two conflicting operations? Born's rule!
Gates preserve the 2-norm, and measurements give the probabilities determined by the 2-norm.
If you want to think more about superposition, the lecture by Allan Adams on superposition is highly recommended.
Quantum measurements are described by measurement operators \(\{M_m\}\) satisfying
\begin{equation*} \sum_m M_m^\dagger M_m = I. \end{equation*}
The probability that \(m\) occurs is \(\bra{\psi} M_m^\dagger M_m \ket{\psi}\).
The post-measurement state given outcome \(m\) is proportional to \(M_m \ket{\psi}\).
An observable is a Hermitian operator whose eigenprojectors form a set of measurement operators indexed by the eigenvalues.
\(Z\) is an observable, which corresponds to the computational measurement \(\{\ket{a}\bra{a}\}\) with outcomes \((-1)^a\).
How do we process information using the four postulates?
Boolean functions, Boolean circuits, and Boolean formulas
A Boolean function is function \(f : \{0,1\}^n \to \{0,1\}\).
There are \(2^{2^n}\) of them!
A Boolean circuit is a model of computation using gates and wires.
A Boolean formula is a Boolean circuit where all gates have fan-out at most \(1\).
Functional completeness
A set of Boolean functions \(f_i : \{0,1\}^{n_i} \to \{0,1\}\) is functionally complete if the functions "generate" every \(f : \{0, 1\}^n \to \{0, 1\}\) for all \(n \ge 1\).
\(\{\text{AND}, \text{OR}, \text{NOT}\}\) is functionally complete.
Shannon formula:
\begin{equation*} \begin{split} f(x_1,...,x_n) & = ((\neg x_1) \land f(0,x_2,...,x_n)) \\ & \qquad \lor (x_1 \land f(1,x_2,...,x_n)). \end{split} \end{equation*}
\(\{\text{NAND}\}\) is functionally complete.
Disjunctive normal form (sum of minterms):
\begin{equation*} \bigvee_{f(a_1,...,a_n)=1} \bigwedge_{i=1}^{n}x_i^{a_i}, \text{ where } x_i^{a_i} = \begin{cases} \phantom{\neg} x_i & \text{if } a_i = 1,\\ \neg x_i & \text{if } a_i=0. \end{cases} \end{equation*}
Quantum circuit is our new language for describing quantum information processing.
Reversibility implies that the quantum circuit looks more regular (the same number of input and output qubits).
From left to right: initial state \(\ket{\psi_0}\), the first gate \(U_1\), the second gate \(U_2\), …, the last gate \(U_T\).
Reverse the order when writing them in an equation:
\begin{equation*} U_T \cdots U_2 U_1 \ket{\psi_0}. \end{equation*}
Multi-qubit gates CNOT, C-U, CCNOT (Toffoli) in circuit diagrams.
What does it mean to apply a single qubit \(U\) to a multi-qubit states?
Prepare a classical bit string \(x\) from the all-zero string.
\(H \ket{0}\) and measure.
Bell state preparation.
Qubit is the quantum analog of bit and is, therefore, the simplest yet most important concept in quantum information processing.
Some noteworthy examples of states are:
\begin{equation*} \begin{split} & \ket{0} = \begin{pmatrix} 1 \\ 0 \end{pmatrix},\quad \ket{1} = \begin{pmatrix} 0 \\ 1 \end{pmatrix},\\ & \ket{+} = \frac{\ket{0} + \ket{1}}{\sqrt{2}} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix},\\ & \ket{-} = \frac{\ket{0} - \ket{1}}{\sqrt{2}} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ -1 \end{pmatrix}. \end{split} \end{equation*}
Single-qubit states
\begin{equation*} \ket{\psi} = \alpha \ket{0} + \beta \ket{1}. \end{equation*}
Examples of single-qubit states \(\ket{0}, \ket{1}, \ket{+}, \ket{-}\).
\(\alpha\) and \(\beta\) are complex numbers.
Quantum theory is a \(2\)-norm theory: \(\abs{\alpha}^2 + \abs{\beta}^2 = 1\).
Global phases do not matter and so we can write
\begin{equation*} \ket{\psi} = \cos\frac{\theta}{2} \ket{0} + \sin\frac{\theta}{2}\, e^{i\varphi} \ket{1} \end{equation*}
Pure single-qubit states are on the Bloch sphere.
Before discussing mixed states in the Bloch sphere, we review the Pauli operators.
\begin{equation*} \begin{split} & \sigma_x = X = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix},\quad \sigma_y = Y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix},\\ & \sigma_z = Z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \end{split} \end{equation*}
Pauli operators are unitary, Hermitian, and traceless.
What are their eigenvalues and eigenvectors?
Pauli operators as observables.
Commutation and anti-commutation relations of Pauli operators
\begin{align} [\sigma_x, \sigma_y] & = 2 i \sigma_z, & \{\sigma_x, \sigma_y\} & = 0,\\ [\sigma_y, \sigma_z] & = 2 i \sigma_x, & \{\sigma_y, \sigma_z\} & = 0,\\ [\sigma_z, \sigma_x] & = 2 i \sigma_y, & \{\sigma_z, \sigma_x\} & = 0. \end{align}
For any unit vector \(\vec{n} = (n_x, n_y, n_z) \in \real^3\), define
\begin{equation*} \vec{n} \cdot \vec{\sigma} = \sum_{k \in \{x,y, z\}} n_k\, \sigma_k. \end{equation*}
It is unitary, Hermitian, and traceless
\begin{equation*} (\vec{n} \cdot \vec{\sigma})^2 = I. \end{equation*}
For pure state \(\ket{\psi} = \cos\frac{\theta}{2} \ket{0} + \sin\frac{\theta}{2} e^{i\varphi} \ket{1}\) and its coordinate \(\vec{n}_\psi\), we have
\begin{equation*} \vec{n}_\psi \cdot \vec{\sigma} = \begin{pmatrix} \cos\theta & \sin\theta\, e^{-i\varphi}\\ \sin\theta\, e^{i\varphi} & -\cos\theta \end{pmatrix}. \end{equation*}
What are the eigenvectors of the above operator?
The coordinate of \(\ket{\psi}\) in \(\real^3\) is
\begin{equation*} (n_x, n_y, n_z) = (\sin\theta \cos\varphi, \sin\theta \sin\varphi, \cos\theta). \end{equation*}
It would be more natural first to define the \(\vec{n} \cdot \vec{\sigma}\) operator for any coordinate and define the pure state as its eigenvalue-\(1\) eigenstate.
It explains the \(\frac{\theta}{2}\) used to define the state \(\ket{\psi}\).
Pauli \(X\), \(Y\), \(Z\) rotate the Bloch sphere by angle \(\pi\) along the \(x\), \(y\), \(z\) axis, respectively.
The example of the \(Z\) action:
\begin{equation*} \cos\frac{\theta}{2} \ket{0} + \sin\frac{\theta}{2} e^{i \varphi} \ket{1} \\ \mapsto \cos\frac{\theta}{2} \ket{0} - \sin\frac{\theta}{2} e^{i \varphi} \ket{1}. \end{equation*}
How about \(\vec{n} \cdot \vec{\sigma}\)?
We do quantum information processing simply by rotating the qubits in our system!
Hadamard \(H\), phase gate \(S\), \(\pi/8\) gate \(T\)
\begin{equation*} \begin{split} & H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix},\quad S = \begin{pmatrix} 1 & 0\\ 0 & i \end{pmatrix},\\ & T = \begin{pmatrix} 1 & 0 \\ 0 & e^{\pi i / 4} \end{pmatrix}. \end{split} \end{equation*}
Rotations through \(a = x, y, z\) axis
\begin{equation*} R_a(\phi) = \exp\Bigl( - \frac{\phi}{2} i \sigma_a \Bigr) = \cos\frac{\phi}{2} I - i\sin\frac{\phi}{2} \sigma_a. \end{equation*}
In particular, \(R_z(\phi) = \begin{pmatrix} e^{-\frac{\phi}{2} i} & 0 \\ 0 & e^{\frac{\phi}{2} i} \end{pmatrix}\), and \(T\) is \(R_z(\pi / 4)\) up to a global phase.
Rotations along \(\vec{n}\)
\begin{equation*} \begin{split} R_{\vec{n}}(\phi) & = \exp\bigl( - \frac{\phi}{2} i\, \vec{n} \cdot \vec{\sigma}\bigr)\\ & = \cos\frac{\phi}{2} I - i \sin\frac{\phi}{2} \vec{n} \cdot \vec{\sigma}. \end{split} \end{equation*}
Any single-qubit unitary operator \(U\) can be written as
\begin{equation*} U = e^{i\alpha} R_{\vec{n}}(\phi), \end{equation*}
for some \(\alpha\), \(\phi\), and \(\vec{n}\).
(Z-Y decomposition) Any single-qubit unitary \(U\) has the form
\begin{equation*} U = e^{i\alpha} R_z(\beta) R_y(\gamma) R_z(\delta). \end{equation*}
The computational (standard) basis measurement (\(Z\) measurement)
Is the state \(\ket{0}\) or \(\ket{1}\)?
(Born's rule) When we measure \(\ket{\psi} = \alpha \ket{0} + \beta \ket{1}\):
With probability \(\abs{\alpha}^2\), the measurement outcome is \(0\) and the state collapse to \(\ket{0}\);
With probability \(\abs{\beta}^2\), the measurement outcome is \(1\) and the state collapse to \(\ket{1}\).
Hadamard basis measurement (\(X\) measurement):
Is the state \(\ket{+}\) or \(\ket{-}\)?
When we measure \(\ket{\psi} = \alpha \ket{0} + \beta \ket{1}\):
With probability \(\abs{\alpha + \beta}^2/2\), the outcome is \(+\) and the state collapse to \(\ket{+}\);
With probability \(\abs{\alpha - \beta}^2/2\), the outcome is \(-\) and the state collapse to \(\ket{-}\).
Measurement operators
\begin{equation*} M_{+} = \frac{I + X}{2} = \ket{+}\bra{+},\quad M_{-} = \frac{I - X}{2} = \ket{-}\bra{-}. \end{equation*}
\(X\) as an observable.
Why is it called the Hadamard basis measurement?
A projective measurement in the direction of \(\vec{n}\)
Note that \(\vec{n} \cdot \vec{\sigma}\) is an observable.
Measurement operators
\begin{equation*} M_b = \frac{I + (-1)^b\, \vec{n} \cdot \vec{\sigma}}{2}, \text{ for } b = 0, 1. \end{equation*}
Check that \(M_b\) is indeed projective and \(M_0 + M_1 = I\).
Someone sends you a box, which is either a dud (empty box) or a bomb attached to a quantum device that will trigger the explosion if it measures a \(\ket{1}\) on the photon sent in. Your task is to detect if there is a bomb inside.
Not very interesting
Dud: No explosion, measures \(\ket{+}\).
Bomb: 50% explosion, 25% measures \(\ket{+}\), 25% measures \(\ket{-}\).
Start with \(\ket{0}\), rotate \(\epsilon\) angle, and send it in.
Use \(R_y(\epsilon)\) to implement the \(\epsilon\) rotation.
Repeat \(N = O(1/\epsilon)\) times without measuring.
Measure in the standard basis.
Dud: No explosion, measures \(\ket{1}\) if \(N = \pi/\epsilon\).
Bomb: \(O(\epsilon)\) probability of explosion, measures \(\ket{0}\).
\begin{equation*} \Pr(\text{explodes}) \le N \sin^2 \Bigl( \frac{\epsilon}{2} \Bigr) = O(\epsilon). \end{equation*}
This is known as the quantum Zeno effect.
Entanglement is yet another peculiar quantum phenomenon along with superposition and measurement.
Superpositions of 00, 01, 10, 11
\begin{equation*} \ket{\psi_{AB}} = \alpha_{00} \ket{00} + \alpha_{01} \ket{01} + \alpha_{10} \ket{10} + \alpha_{11} \ket{11}. \end{equation*}
Tensor products \(\ket{ab} = \ket{a} \otimes \ket{b}, \text{ for } a, b \in \{0, 1\}\).
Tensor product of two matrices
Vector representation \((\alpha_{ij})\)
Normalization condition \(\sum_{a,b} \abs{\alpha_{ab}}^2 = 1\).
CNOT, SWAP, Controlled-\(U\), …
\begin{equation*} \begin{split} \text{CNOT} :\; & \ket{a, b} \mapsto \ket{a, a\oplus b}, \\ \text{SWAP} :\; & \ket{a, b} \mapsto \ket{b, a}, \\ \text{Controlled-}U :\; & \ket{a, \psi} \mapsto \ket{a} \otimes U^a \ket{\psi}. \end{split} \end{equation*}
Circuit identities
Control vs. target
CNOT and single-qubit gates are all you need for universal quantum computing.
…, then they can no longer be described in the same way as before, viz. by endowing each of them with a representative of its own. I would not call that one but rather the characteristic trait of quantum mechanics, the one that enforces its entire departure from classical lines of thought. By the interaction the two representatives [the quantum states] have become entangled.
Four Bell states
\begin{alignat*}{3} & \ket{\beta_{00}} & = \frac{\ket{00} + \ket{11}}{\sqrt{2}},\quad && \ket{\beta_{01}} & = \frac{\ket{01} + \ket{10}}{\sqrt{2}},\\ & \ket{\beta_{10}} & = \frac{\ket{00} - \ket{11}}{\sqrt{2}},\quad && \ket{\beta_{11}} & = \frac{\ket{01} - \ket{10}}{\sqrt{2}}. \end{alignat*}
The first index determines the sign; the second determines the parity.
Prepare the Bell states
The Bell states are also known as the EPR states, and we often use \(\ket{\text{EPR}}\) to denote \(\ket{\beta_{00}}\).
Singlet state \(\ket{\beta_{11}}\) is an anti-symmetric state.
The other three states are all symmetric and span the symmetric subspace of the two-qubit state space.
It is easy to verify that this state is not a product of single-qubit states!
A picture of entanglement.
Basic rules of quantum mechanics imply the existence of EPR, arguably what troubled Einstein the most in quantum mechanics! It is what he called the 'spooky action at a distance'.
Consider two players Alice and Bob sharing an EPR state. Alice brings her particle to the moon while Bob stays on earth.
Fact: If Alice measures her particle, she instantaneously knows whether Bob's state is \(\ket{0}\) or \(\ket{1}\) (which Bob can confirm if he performs the standard basis measurement).
Is it a big deal?
It is not very different from a pair of two perfectly correlated classical bits.
A pair of gloves in two boxes.
EPR considers other possibilities: Alice can measure in a different basis.
There is another way to open the box.
We have measured \(Z\) basis, what about \(X\) measurement? That is, what happens if Alice measures \(\ket{\pm}\)?
Method I. \(M_0 = \ket{+}\bra{+} \otimes I\) and \(M_1 = \ket{-}\bra{-} \otimes I\).
Method II. Write EPR state
\begin{equation*} \ket{\text{EPR}} = \frac{\ket{{+}{+}} + \ket{{-}{-}}}{\sqrt{2}}. \end{equation*}
The conclusion is that Alice will instantaneously know whether Bob's post-measurement state is \(\ket{+}\) or \(\ket{-}\).
Consider two situations. In the first, Alice measures in the standard basis, she will obtain \(a=0,1\) with equal probability and Bob's qubit collapses to \(\ket{a}\). In the second, she measures in the Hadamard basis, she will obtain \(a={+},{-}\), with equal probability and Bob's qubit collapses to \(\ket{a}\).
It looks as if Alice's measurement choice affects Bob's final state! Would it allow Alice to send information instantaneously to Bob?
What are the states of Bob in the two situations?
Bob's view and Alice's view.
The mixed state framework will help us to answer the question and see why this won't allow superluminal communication.
In the first situation, Bob's state is in an equal distribution over \(\ket{0}, \ket{1}\). In the second, it is an equal distribution over \(\ket{+}\) and \(\ket{-}\). Can we distinguish these two cases in quantum mechanics?
Let's generalize a little bit and consider a distribution of pure states described by an ensemble \(\{(p_i, \ket{\psi_i})\}\). It means that our knowledge of the quantum system says that it is in state \(\ket{\psi_i}\) with probability \(p_i\).
Consider a quantum measurement \(\{M_0, M_1\}\) measuring the state ensemble. The probability that \(a\) occurs is
\begin{equation*} \Pr(a) = \sum_i p_i \bra{\psi_i} M_a^\dagger M_a \ket{\psi_i}. \end{equation*}
We can rewrite it using cyclic property of the trace as
\begin{equation*} \Pr(a) = \tr \Bigl(\Bigl(\sum_i p_i \ket{\psi_i}\bra{\psi_i} \Bigr) \, M_a^\dagger M_a \Bigr). \end{equation*}
We call
\begin{equation*} \rho = \sum_i p_i \ket{\psi_i}\bra{\psi_i} \end{equation*}
the density matrix of the ensemble. The probability is determined by the density matrix, so it makes sense to use the density matrix to describe the state of the system.
We now go back to the EPR paradox. When Alice measures \(\ket{0}, \ket{1}\), Bob's state is also \(\ket{0}\) or \(\ket{1}\), which Alice knows for sure as it will be the same as her measurement outcome. But Bob does not know the measurement outcome, so before Alice send the outcome to him, his knowledge of the system is that it is \(\ket{0}\) or \(\ket{1}\) with equal probability. The density matrix is
\begin{equation*} \rho = \frac{1}{2} (\ket{0}\bra{0} + \ket{1}\bra{1}) = \frac{I}{2}. \end{equation*}
If Alice measured \(\ket{+}, \ket{-}\). Bob's qubit is \(\ket{+}\) or \(\ket{-}\) with equal probability. Hence, the density matrix is
\begin{equation*} \rho' = \frac{1}{2} (\ket{+}\bra{+} + \ket{-}\bra{-}) = \frac{I}{2}. \end{equation*}
So even though the two ensembles look very different, the density matrices in the two situations are the same. It means that no measurement on Bob's system can tell the two cases apart. This makes perfect sense as the state of Bob should not change when Bob has done nothing to his system.
It explains why density matrix is more fundamental then ensemble of quantum states.
No superluminal communication is possible using the strategy above!
Mathematically, a density matrix is positive semidefinite and of trace \(1\) (iff there is a corresponding ensemble). It is a description of the state of a quantum system that is more general than state vectors for pure states.
Any pure state \(\ket{\psi}\) has a density matrix representation \(\rho = \ket{\psi}\bra{\psi}\).
Example: The density matrix for \(\ket{+}\) is
\begin{equation*} \ket{+}\bra{+} = \frac{1}{2} \begin{bmatrix} 1 & 1\\ 1 & 1 \end{bmatrix}. \end{equation*}
The spectral decomposition of a density matrix
\begin{equation*} \rho = \sum_j \lambda_j \ket{\psi_j}\bra{\psi_j} \end{equation*}
gives a possible ensemble \(\{(\lambda_j, \ket{\psi_j})\}\) for the density matrix.
Probability can be considered a special case where the density matrix is diagonal.
The state is now a density matrix.
Evolution: \(\rho \mapsto U\rho U^\dagger\).
Measurement: \(\Pr(m) = \tr \bigl( M_m^\dagger M_m \rho \bigr)\) and the post-measurement state is proportional to
\begin{equation*} M_m \rho M_m^\dagger. \end{equation*}
Tensor: \(\rho = \rho_1 \otimes \cdots \otimes \rho_n\).
Define an inner product on the \(2\) by \(2\) matrix space
\begin{equation*} (A, B) = \frac{\tr(A^\dagger B)}{2}. \end{equation*}
Together with \(I\), Pauli matrices form an orthonormal basis!
For all \(\rho\), we can write
\begin{equation*} \rho = \frac{I + \vec{n} \cdot \vec{\sigma}}{2}, \text{ where } n_a = \tr(\rho \sigma_a). \end{equation*}
The positive semidefinite condition is equivalent to \(\norm{\vec{n}} \le 1\).
The Bloch sphere has all valid single-qubit states.
What are the states of the north pole, the origin, and \((1, 0, 0)\)?
From coordinate \(\vec{n}\), we can compute the state as
\begin{equation*} \rho = \frac{I + \vec{n} \cdot \vec{\sigma}}{2}. \end{equation*}
Conversely, given the density matrix \(\rho\), the expansion in the Pauli operators basis gives a vector \(\vec{n} = (n_x, n_y, n_z)\)
\begin{equation*} n_a = \tr(\rho \sigma_a), \text{ for } a = x, y, z. \end{equation*}
Classical probability over a bit can be represented as diagonal density matrices corresponding to the line segment between the north and south poles.
Summarize: Things that can be identified with point \(\vec{n}\) on the sphere.
When we measure \(\rho = (I + \vec{n} \cdot \vec{\sigma})/2\):
With probability \(\tr(\rho\, \ket{0}\bra{0}) = (1 + n_z) / 2\), the measurement outcome is \(0\) and the state collapse to \(\ket{0}\bra{0}\);
With probability \(\tr(\rho\, \ket{1}\bra{1}) = (1 - n_z) / 2\), the measurement outcome is \(1\) and the state collapse to \(\ket{1}\bra{1}\).
What is the state of \(A\) given the state of \(AB\)? We cannot answer the question in the pure state framework, and we can now with density matrices.
Let \(\rho_{AB}\) the joint state of \(A\) and \(B\) systems. What is the state of \(B\) only? Is there a density matrix associated with system \(B\) given that we know the state of both \(A\) and \(B\)?
We need an effective representation of the state on \(B\) so that measurement only depends on this reduced state (not the joint state).
Let \(M\) be a measurement operator; the probability its outcome occurs is
\begin{equation*} \tr \bigl( \rho_{AB} (I_A \otimes M^\dagger M) \bigr), \end{equation*}
which is a linear functional of \(M^\dagger M\). So there is a \(\rho_B\) such that
\begin{equation*} \tr \bigl( \rho_{AB} (I_A \otimes M^\dagger M) \bigr) = \tr (\rho_B\, M^\dagger M). \end{equation*}
The partial trace is defined by
\begin{equation*} \tr_A : \ket{i,j}\bra{k,l}_{A,B} = \tr(\ket{i}\bra{k}_A) \ket{j}\bra{l}_B. \end{equation*}
Extend by linearity.
The reduced density matrix \(\rho_B = \tr_A (\rho_{AB})\).
What is the partial trace of the EPR state?
The reduced density matrices of \(\rho_{AB} = \rho_A \otimes \rho_B\) is \(\rho_A\) and \(\rho_B\). The converse is usually not true!
For any mixed state \(\rho_B\), there is a system \(A\) of the same dimension as \(B\) and a pure state \(\ket{\psi}_{AB}\) such that
\begin{equation*} \tr_A \bigl( \ket{\psi}\bra{\psi}_{AB} \bigr) = \rho_B. \end{equation*}
Use the spectral decomposition theorem again!
Example: The EPR state is a purification of \(I/2\).
Mixed states arise when we lose control of a system!
Homework I
Problem 3 about computing
\begin{equation*} \bra{x} H^{\otimes n} \ket{y}. \end{equation*}
Problem 4
Spectral decomposition
Consider a special case where \(A\) is Hermitian.
Let \(\lambda_1\) be an eigenvalue of \(A\) and \(\ket{e_1}\) a corresponding eigenvalue.
\begin{equation*} \begin{split} & \lambda_1 (\ket{e_1}, \ket{e_1}) = (\ket{e_1}, A\ket{e_1}) \\ = \, & (A\ket{e_1}, \ket{e_1}) = \lambda_1^* (\ket{e_1}, \ket{e_1}). \end{split} \end{equation*}
This means that \(\lambda_1\) is real.
Consider subspace \(K\) spanned by all vectors orthogonal to \(\ket{e_1}\).
We prove that \(K\) is an invariant subspace of \(A\). Suppose \(\ket{\psi} \in K\), we have \(\bra{e_1} A \ket{\psi} = \lambda_1 \langle e_1 | \psi \rangle = 0\). Use induction to complete the proof.
In standard textbooks, it says that \(A = UDU^\dagger\). Using Dirac notation, we write it as
\begin{equation*} A = U \bigl( \sum_j \lambda_j \ket{j}\bra{j} \bigr) U^\dagger. \end{equation*}
In this form, \(\ket{e_j} = U\ket{j}\) will be the eigenvectors of \(A\), and we write
\begin{equation*} A = \sum_j \lambda_j \ket{e_j}\bra{e_j}, \end{equation*}
and \(\ket{e_j}\)'s are orthonormal.
Measurement
Projective measurements
Measurement operators are projectors such as \(\ket{0}\bra{0}\) and \(\ket{1}\bra{1}\).
Note that \(\ket{\psi}\bra{\psi}\) is the projector to the span of \(\ket{\psi}\).
A set of projectors \(\{P_j\}\) form a projective measurement if \(\sum_j P_j = I\).
General measurement satisfies \(\sum_j M_j^\dagger M_j = I\)
When we don't care about the post-measurement state, we consider measurement operators \(N_j = M_j^\dagger M_j \ge 0\).
The completeness condition is \(\sum_j N_j = I\).
The probability that \(j\) occurs is \(\tr(N_j \rho)\).
They are called positive-operator valued measure (POVM).
(Schmidt Decomposition). For a bipartite pure quantum state \(\ket{\psi_{AB}}\), we can write
\begin{equation*} \ket{\psi_{AB}} = \sum_{j=1}^k \sqrt{\lambda_j} \, \ket{\psi_A^{(i)}} \otimes \ket{\psi_B^{(i)}}, \end{equation*}
where \(k \le \min(\dim_A, \dim_B)\), \(\lambda_j > 0\), and both \(\{\ket{\psi_A^{(i)}}\}\) and \(\{\ket{\psi_B^{(i)}}\}\) are orthonormal.
The meaning of the theorem says that, up to a change of local basis, an entangled state can be written as
\begin{equation*} \sum_j \sqrt{\lambda_j} \ket{j}_A \otimes \ket{j}_B. \end{equation*}
Hint: Consider the singular value decomposition of matrix \((\psi_{j,k})\).
EPR as a mirror!
For any matrix \(U\):
\begin{equation*} (U \otimes I ) \sum_j \ket{j,j} = (I \otimes U^T) \sum_j \ket{j,j} = \sum_{i,j} U_{i,j} \ket{i,j} \end{equation*}
\(\lambda_j\)'s are called the Schmidt coefficients
What are the Schmidt coefficients of the Bell states?
How about product states?
Thanks to this theorem, the theory of pure bipartite entanglement is well established.
Local operation and classical communication (LOCC)
Nielsen's Theorem
Let \(\ket{\psi}\) and \(\ket{\phi}\) be two entangled states on systems \(A,B\). Their Schmidt coefficients are \(\lambda = (\lambda_j)\) and \(\gamma = (\gamma_j)\) respectively.
Theorem (Nielsen). There is an LOCC transforming \(\ket{\psi}\) to \(\ket{\phi}\) if and only if the majorization condition \(\lambda \prec \gamma\) holds.
The condition \(\lambda \prec \gamma\) is that for all \(k\)
\begin{equation*} \sum_{j=1}^k \lambda_j^{\downarrow} \le \sum_{j=1}^k \gamma_j^{\downarrow}. \end{equation*}
A simple corollary is that there is an LOCC transforming an EPR pair to an arbitrary entangled pair of qubits.
EPR: Maximally entangled states.
Superpositions of 000, 001, 010, \(\ldots\), 111
GHZ state
\begin{equation*} \ket{\text{GHZ}} = \frac{\ket{000} + \ket{111}}{\sqrt{2}}. \end{equation*}
W state
\begin{equation*} \ket{\text{W}} = \frac{\ket{001} + \ket{010} + \ket{100}}{\sqrt{3}}. \end{equation*}
Entanglement is a resource for many quantum information processing tasks
Now utilized for good at a distance of at least 1200km (Mozi satellite).
Quantum entanglement—physics at its strangest—has moved out of this world and into space. In a study that shows China's growing mastery of both the quantum world and space science, a team of physicists reports that it sent eerily intertwined quantum particles from a satellite to ground stations separated by 1200 kilometers, smashing the previous world record. (Science, 15 JUN 2017)
A qubit is a unit of quantum information storing the quantum state of a two-level physical system.
Quantum information is more general than classical information. We can store a bit in a qubit. But to store a qubit, it may require an infinite number of bits.
Consider an encoding \(a \mapsto \rho_a\) for \(a=0, 1\).
Promise: The system is in one of the two states \(\rho_0\) and \(\rho_1\),
Problem: Decide which is the case.
Measurement is the only bridge between the quantum and classical worlds. So we use a measurement with measurement operators \(M_0\) and \(M_1\)
\begin{equation*} \begin{split} \text{Minimize: } & \Tr(M_0 \rho_1) + \Tr(M_1 \rho_0)\\ \text{Subject to: } & M_0 \ge 0, M_1 \ge 0, M_0 + M_1 = I. \end{split} \end{equation*}
The optimal \(M_0\) minimizes \(\Tr(M_0 (\rho_1 - \rho_0))\)
In the qubit case, we write \(\rho_b = (I + \vec{v}_b \cdot \vec{\sigma}) / 2\), and \(\rho_0 - \rho_1 = \vec{v} \cdot \vec{\sigma}\) for \(\vec{v} = (\vec{v}_0 - \vec{v}_1)/2\). The optimal \(M_0\) should be
\begin{equation*} \frac{1}{2} \left( I + \frac{\vec{v}}{\lVert \vec{v} \rVert} \cdot \vec{\sigma} \right) \end{equation*}
The minimal error probability is therefore \((2 - \lVert \vec{v}_0 - \vec{v}_1 \rVert )/ 4\).
The amount of information one can store and reliably read from a qubit is only a single bit.
The states \(\ket{0}\) and \(\ket{+}\) are different but not perfectly distinguishable.
Non-orthogonality is the key issue.
What is information? What is quantum information?
We can store the whole Internet in a single qubit, but we cannot reliably read information out.
For a quantum state \(\rho\), its von Neumann entropy is \(S(\rho) = -\tr (\rho \log(\rho))\).
The eigenvalues of \(\rho\) form a probability distribution whose Shannon entropy equals the von Neumann entropy \(S(\rho)\).
Mutual information \(H(X{\,:\,}Y) = H(X) + H(Y) - H(XY)\).
Theorem. Let \(\{(p_x, \rho_x)\}\) be an ensemble of states. Alice prepares \(\rho_x\) with probability \(p_x\) and sends it to Bob. Bob measures the state and gets outcome \(y\). Then the mutual information (accessible information) between \(X\) and \(Y\) satisfies
\begin{equation*} H(X{\,:\,}Y) \le S \Bigl( \sum_x p_x \rho_x \Bigr) - \sum_x p_x S(\rho_x). \end{equation*}
The right-hand side is called Holevo \(\chi\).
We will prove a poor man's version of Holevo's theorem.
Let \(X\) be a uniformly distributed random variable taking values in \([N]\). Consider an encoding of classical information \(x \mapsto \rho_x\) into quantum systems of \(n\) levels. For all measurement (POVM) \(\{M_m\}\), the probability that the measurement correctly guesses \(x\) is
\begin{equation*} p_x = \tr(M_x \rho_x) \le \tr(M_x). \end{equation*}
In expectation the success probability is therefore
\begin{equation*} \E_x p_x \le \frac{1}{N} \sum_x \tr(M_x) = \frac{n}{N}. \end{equation*}
Let \(\rho\) be an arbitrary qubit state, say \(\ket{0}\bra{0}\).
The quantum one-time pad works as follows:
For a player who does not know \(a,b\), the state is
\begin{equation*} \frac{1}{4} \sum_{a,b} U_{a,b} \rho U_{a,b}^\dagger = \frac{I}{2}. \end{equation*}
Unknown quantum states cannot be copied.
There is no unitary transformation \(U\) that can map any \(\ket{\psi} \ket{0}\) to \(\ket{\psi} \ket{\psi}\).
Compute the inner product.
Note: \(\ket{0}\) and \(\ket{1}\) can be cloned using the CNOT gate.
Given a large number of i.i.d samples of quantum state \(\rho\), find the density matrix for \(\rho\).
In the single-qubit case, \(I\), Pauli \(X\), \(Y\), \(Z\) form an orthonormal basis, so
\begin{equation*} \rho = \frac{\tr(\rho)I + \tr(X \rho) X + \tr(Y \rho) Y + \tr(Z \rho) Z}{2}. \end{equation*}
For example, to estimate \(\tr(Z \rho)\), we need to measure \(\rho\) in the \(Z\) basis multiple times, obtaining \(z_1, z_2, \ldots, z_m \in \{\pm 1\}\). Then \((\sum_i z_i) / m\) is an empirical estimate of \(\tr(Z \rho)\).
Note that the standard deviation is \(1/\sqrt{m}\) and so we need to repeat about \(O(1/\eps^2)\) times to have \(\eps\) precision.
Multiple qubit case is similar:
\begin{equation*} \rho = \frac{\sum_{P} \tr(\rho P) P}{2^n}, \end{equation*}
where the summation goes over \(n\)-qubit Pauli operators.
Check that \(I\) and Pauli operators form a basis also in the multiple qubit case.
But it's a summation of \(4^n\) terms!
In the lab, it would take days to tomography a system of more than 10 qubits.
Theorem. The sample complexity for tomography is \(\tilde{\Omega}(d^2/\eps^2)\), where \(d\) is the dimension and \(\eps\) is the precision in the so called trace distance.
To make things worse, each term needs high precision for the triangle inequality to give meaningful bounds.
There are \(4^n\) small real numbers \(\tr(\rho P)\), and we have to ensure that their error sum is still controlled.
How do we even know the quantum device works as expected when tomography of its state is impractical?
I | X | Y | Z | |
---|---|---|---|---|
I | 0 | 0 | 0 | |
X | 0 | -1 | 0 | 0 |
Y | 0 | 0 | -1 | 0 |
Z | 0 | 0 | 0 | -1 |
We need to estimate 15 real numbers \(\mu_P = \tr(\ket{\beta_{11}}\bra{\beta_{11}} P)\) for non-identity two-qubit Pauli operators \(P\).
For \(P = XX, YY, ZZ\), \(\mu_P = -1\),
For all other \(P\), \(\mu_P = 0\).
Note that in this simple case, it suffices to check the \(XX, YY, ZZ\) average values in the first case to certify that the state is the singlet state.
The singlet state is the unique state stabilized by \(-XX\) and \(-ZZ\).
Entanglement and quantum information
Bell states, Schmidt decomposition
Tomography
We have two players, Alice and Bob.
Alice and Bob share an EPR state.
Alice also has an input qubit given in some unknown state \(\ket{\psi}\).
Classically, there are two methods.
She wants to transfer this qubit to Bob using classical communication only.
Even if Alice knows the description of the state \(\ket{\psi}\), for example, maybe she prepared the state using a circuit on her own, the task is still really challenging.
To describe the state, we need to specify two real numbers!
The teleportation protocol needs to transfer two bits from Alice to Bob by consuming the shared EPR state.
We name the qubit in the shared EPR state between Alice and Bob and the input qubit as A, B, and C, respectively.
The Bell states form an orthonormal basis for the two-qubit system \(AC\).
The protocol is simple but very hard to find and only discovered in 90's (quantum mechanics was completely established in the 30's).
Teleporting an unknown quantum state via dual classical and Einstein-Podolsky-Rosen channels
[C. Bennett, G. Brassard, C. Crépeau, R. Jozsa, A. Peres and W. Wootters, '93]
The Origins of Quantum Teleportation - Charles Bennett
What is it that they don't have that they need that would enable them to do as well if they were in separate locations than if they were in the same location.
… So then we realized that by sharing entanglement between the two observers and by permitting them to communicate, you could simulate being in the same place.
\begin{alignat*}{3} & \ket{\beta_{00}} & = \frac{\ket{00} + \ket{11}}{\sqrt{2}},\quad && \ket{\beta_{01}} & = \frac{\ket{01} + \ket{10}}{\sqrt{2}},\\ & \ket{\beta_{10}} & = \frac{\ket{00} - \ket{11}}{\sqrt{2}},\quad && \ket{\beta_{11}} & = \frac{\ket{01} - \ket{10}}{\sqrt{2}}.\\ \end{alignat*}
Before Alice's measurement, the state is
\begin{equation*} \ket{\text{Init}} = \frac{1}{\sqrt{2}} {(\alpha\ket{0} + \beta\ket{1})}_C \otimes (\ket{00} + \ket{11})_{AB}. \end{equation*}
Expand the state using the Bell basis, we have
\begin{equation*} \begin{split} \ket{\text{Init}} \propto \, & \ket{\beta_{00}} \otimes {(\alpha\ket{0}+\beta\ket{1})}_B + \ket{\beta_{01}} \otimes {(\alpha\ket{1}+\beta\ket{0})}_B +\\ & \ket{\beta_{10}} \otimes {(\alpha\ket{0}-\beta\ket{1})}_B + \ket{\beta_{11}} \otimes {(\alpha\ket{1}-\beta\ket{0})}_B. \end{split} \end{equation*}
If it is the first you see this, you may feel a bit confused about how we get the above expansion. The key is to use linearity to do this on product state terms and take the sum. Alternatively, you may consider multiplying
\begin{equation*} I = \sum_{i,j} \ket{\beta_{ij}}\bra{\beta_{ij}}_{AC} \otimes I_B \end{equation*}
If the measurement has outcome \(a, b\), the state on \(B\) after correction is
\begin{equation*} \alpha \ket{0} + \beta \ket{1}. \end{equation*}
Measurements can be delayed to the end.
Proof: Use controlled gates to simulate classical control.
For \(a = 0, 1\),
\begin{equation*} \begin{split} & \ket{a} (\ket{00} + \ket{11}) \\ \mapsto\; & \ket{aa0} + \ket{a \bar{a} 1} \\ \mapsto\; & \ket{aaa} + \ket{a\bar{a}a}\\ \mapsto\; & \ket{0aa} + \ket{0\bar{a}a}\\ =\,\; & \sqrt{2} \ket{0} \ket{+} \ket{a}. \end{split} \end{equation*}
Trace in tensor network!
Tensors are generalizations of matrices.
Matrix \(A\) maps a pair of indices \((i,j)\) to a scaler \(A_{i,j}\).
A rank-k tensor \(T\) maps a \(k\)-tuple \((i_1, i_2, \ldots, i_k)\) to a scaler \(T_{i_1, i_2, \ldots, i_k}\).
Tensor diagrams.
We use a circle (or other shapes) to represent the tensor and \(k\) open edges connected to it to represent the \(k\) indices.
The dimension of each index is the number of different values it takes.
Tensor contraction generalizes matrix multiplication.
If two indices have the same dimension, the contraction operation identifies them and sums them over.
\begin{equation*} \sum_k A_{ijk} B_{lkm}. \end{equation*}
States, gates, and measurements are all tensors
Quantum circuits are tensor networks!
Teleportation as a tensor contraction:
The EPR state is a wire in the tensor picture;
We can move the tensor node freely along the wires.
Bob gets the one-time pad of Alice input qubit as \(U_{ab} = Z^a X^b\).
Specifying quantum states using operators:
\begin{alignat}{2} & XX & \ket{\text{EPR}} & = \ket{\text{EPR}}\\ & ZZ & \ket{\text{EPR}} & = \ket{\text{EPR}} \end{alignat}
\(\ket{\text{EPR}}\) is stabilized by \(XX\) and \(ZZ\).
\(XX\) and \(ZZ\) commute.
\(XX\) and \(ZZ\) as observables:
Measure both qubits in the Hadamard basis (\(X\) measurement), and output the parity.
\begin{equation*} \frac{I + XX}{2} = \frac{I+X}{2}\otimes \frac{I+X}{2} + \frac{I-X}{2} \otimes \frac{I-X}{2} \end{equation*}
The Bell measurement is the same as measuring the \(XX\) and \(ZZ\) observables.
We can measure both as they are compatible.
Back propagation of operators
Teleportation is explained in the stabilizer framework.
What if the input state on C is part of a larger system \(CD\) and the state on \(CD\) is \(\ket{\psi}_{CD}\)?
After the teleportation, the state on \(BD\) is \(\ket{\psi}\). That is, the correlation (entanglement) between \(CD\) is preserved and teleported to \(BD\)!
A special case is when \(\ket{\psi} = \ket{\text{EPR}}\), another EPR pair.
It is known as the Entanglement Swapping protocol.
Would teleportation allow faster-than-light quantum transmission?
Bob has one-time padded quantum information without the two bits.
Would teleportation allow the cloning of quantum information?
Did the Bell measurement learn anything about the input state?
The four Bell states can be transformed to each other locally.
The four Bell states are orthogonal to each other and are therefore perfectly distinguishable.
A scheme for sharing classical secrets (2 bits) among two players (each holding one qubit).
Question: is it possible to share quantum secrets?
Teleportation, superdense coding, and many other protocols use EPR and entanglement as resource, without which the protocols cannot be carried out.
They motivated the study of entanglement as a new type of resource.
How do we quantify the amount of entanglement needed for a certain task?
A mixed state \(\rho_{AB}\) is said to be separable if there is an ensemble of the form \(\bigl\{ \bigl( p_i, \rho_{A}^{(i)} \otimes \rho_{B}^{(i)} \bigr) \bigr\}_{i=1}^k\) such that
\begin{equation*} \rho_{AB} = \sum_{i=1}^k\, p_i\, \rho_{A}^{(i)} \otimes \rho_{B}^{(i)}. \end{equation*}
Separable states are preparable using LOCC only
Wide range of applications
[M. Ying, L. Zhou, and Y. Li, Reasoning
about Parallel Quantum Programs, '18],
[C. Piveteau and D. Sutter, Circuit Knitting with classical
communication, '22]
How to check if a mixed state is separable or not?
Partial transpose
\begin{equation*} (\cdot)^{T_B}: \ket{ij}\bra{kl} \mapsto \ket{il} \bra{kj}. \end{equation*}
(PPT condition). If \(\rho\) is a separable state, then \(\rho^{T_B}\) is positive semidefinite.
\begin{equation*} \sum_{i=1}^k p_i\, \rho_A^{(i)} \otimes \rho_B^{(i)} \mapsto \sum_{i=1}^k p_i\, \rho_A^{(i)} \otimes {\bigl( \rho_B^{(i)} \bigr)}^{T_B}. \end{equation*}
Check that the \(\ket{\text{EPR}}\) state is not PPT.
The PPT condition is a necessary and sufficient condition for separability only for \(2\) by \(2\) and \(2\) by \(3\) systems.
In general, it is NP-hard even to approximate with \(1/\poly\) error!
[S. Gharibian, Strong NP-Hardness of the Quantum Separability Problem, '08]
Distillable entanglement and entanglement cost
\(E_{\text{D}} \le E_{\text{C}}\).
Distillable entanglement \(E_{\text{D}}(\rho_{AB})\) is the rate at which we can generate the EPR state using LOCC.
Entanglement cost \(E_{\text{C}}(\rho_{AB})\) is the rate at which EPR can be transformed to the state \(\rho_{AB}\).
If \(\rho_{AB}\) is a pure state,
\begin{equation*} E_{\text{D}}(\rho_{AB}) = E_{\text{C}}(\rho_{AB}) = S(\rho_A) = S(\rho_B). \end{equation*}
Bound entanglement: entangled states which cannot be distilled exist!
[Horodecki\(^3\), Mixed-state entanglement and distillation, 1998]
Quantum networks
Teleportation empowers entanglement swapping and quantum repeaters.
Gate teleportation
[Aliferis and Leung, Computation by measurements: a unifying picture, 2014]
Port-based teleportation
We have seen many quantum gates such as \(H\) and CNOT. We now introduce two three-qubit gates called Toffoli and Fredkin.
Toffoli
\begin{equation*} \mathrm{Toffoli}: \ket{a, b, c} \mapsto \ket{a, b, c \oplus a b}. \end{equation*}
Fredkin
\begin{equation*} \mathrm{Fredkin}: \ket{a, b, c} \mapsto \begin{cases} \ket{a, b, c} & \text{ if } a = 0,\\ \ket{a, c, b} & \text{ if } a = 1. \end{cases} \end{equation*}
Both Toffoli and Fredkin are classically universal.
Simulate \(\mathrm{NAND}\) using Toffoli.
Reversible classical computation and Landauer's principle
Energy consumption and irreversibility in computation
In the scenario where a computer erases bit of information, it results in the dissipation of energy into its surrounding environment. The minimum amount of energy that is released is proportional to \(k_B T \ln 2\), where \(k_B\) is Boltzmann's constant and \(T\) denotes the temperature of the environment.
Does computation have to consume energy?!
It's possible to perform classical computation reversibly (using Toffoli or Fredkin).
Uncompute technique
Clean the computational garbage!
A set of quantum gates is universal if all other unitary operators can be generated exactly (or approximately) from that set.
Quantum computers are digital.
Approximation in the sense of the following distance
\begin{equation*} d(U, V) = \norm{U - V} = \sup_{\ket{\psi}} \norm{(U-V) \ket{\psi}}. \end{equation*}
Example:
CNOT and single-qubit gates (exact);
CNOT, H, T (approximate);
Toffoli, H (weakly universal).
Complex numbers are essential for quantum mechanics but not for quantum computing.
With an extra qubit, it is possible to map a circuit \(U\) with complex entries to a circuit \(V\) with real entries.
Trick: \(a + bi \mapsto \begin{bmatrix}a & b \\ -b & a\end{bmatrix}\).
According to Wikipedia, it was first announced by Robert M. Solovay in 1995 and independently proven by Alexei Kitaev in 1997.
Let \(G\) be a universal gate set for \(\mathrm{SU}(d)\). Then there is a constant \(c\) such that any unitary in \(\mathrm{SU}(d)\) can be approximated to error \(\eps\) using a product of elements in \(G\) of length \(\log^c(1/\eps)\).
Lemma. Let \([U,V] = UVU^{-1}V^{-1}\) be the group commutator of \(U, V\) and let \(S_\eps\) be \(\{ U \in \mathrm{SU}(2) : \norm{I - U} \le \eps \}\). If \(\Gamma\) is an \(\eps^2\)-net for \(S_\eps\), then \([\Gamma, \Gamma] = \{ [U, V] \mid U, V \in \Gamma \}\) is an \(\eps^3\)-net for \(S_{\eps^2}\).
Note that the lemma is not easy to use iteratively as the scaling of \(\eps\)-net and the size of the neighborhood of the identity do not match the initial condition.
The closedness condition under inverse is not needed anymore.
Suppose \(U_i, V_i\) are unitary operators satisfying \(\norm{U_i - V_i} \le \eps\) for all \(i = 1, 2, \ldots, t\). Then
\begin{equation*} \norm{U_t \cdots U_2 U_1 - V_t \cdots V_2 V_1} \le t \eps. \end{equation*}
Proof: By the triangle inequality and unitary invariance of the operator norm,
\begin{equation*} \begin{split} & \norm{U_t \cdots U_2 U_1 - V_t \cdots V_2 V_1}\\ =\; & \norm{U_t \cdots U_2 U_1 - U_t V_{t-1} \cdots V_1 + U_t V_{t-1} \cdots V_1 - V_t \cdots V_2 V_1} \\ \le\; & \norm{U_t \cdots U_2 U_1 - U_t V_{t-1} \cdots V_1} + \norm{U_t V_{t-1} \cdots V_1 - V_t \cdots V_2 V_1} \\ =\; & \norm{U_{t-1}\cdots U_2 U_1 - V_{t-1} \cdots V_2 V_1} + \norm{U_t - V_t } \\ \le\; & t \eps. \end{split} \end{equation*}
This means that for a circuit of \(m\) gates using one gate set can be transformed to a circuit of size \(m \polylog(m/\eps)\) using one other gate where the total approximation error is \(\eps\).
Let \(f : \X \rightarrow \Y\) be a function that we know how to efficiently implement classically.
This implies that we can efficiently perform the following transform
\begin{equation*} O_f : \ket{x}\ket{y} \mapsto \ket{x} \ket{y + f(x)}. \end{equation*}
The superposition principle then implies that we can efficiently prepare
\begin{equation*} \frac{1}{\sqrt{\abs{\X}}} \sum_{x\in \X} \ket{x} \ket{f(x)}. \end{equation*}
Use the uncompute technique.
Phase oracle: If \(\Y = \{0,1\}\), we can use the so-called "phase-kickback" technique to implement the transform
\begin{equation*} O_{f,\pm} : \ket{x} \mapsto (-1)^{f(x)} \ket{x}. \end{equation*}
We add a \(\ket{-}\) ancilla to implement the phase oracle using the standard oracle.
To implement the standard oracle, we may need a controlled phase oracle.
Let \(n\) be an integer and \(N = 2^n\).
We are given oracle access to \(f: \{0,1\}^n \rightarrow \{0,1\}\) with the promise that
Problem: find out which of the above two cases is true.
A classical algorithm will need \(\Omega(N)\) queries to \(f\) in the worst case.
Quantum parallelism and cancellation!
After Step 2, the state is
\begin{equation*} \frac{1}{\sqrt{\abs{\X}}} \sum_{x} (-1)^{f(x)} \ket{x}. \end{equation*}
The action of \(H^{\otimes n}\) is
\begin{equation*} H^{\otimes n} : \ket{x} \mapsto \frac{1}{\sqrt{\abs{\X}}} \sum_y (-1)^{x \cdot y} \ket{y}. \end{equation*}
So after Step 3, the state is proportional to
\begin{equation*} \frac{1}{\abs{\X}} \sum_{x, y} (-1)^{f(x) + x \cdot y}\, \ket{y}. \end{equation*}
The probability of observing \(y = 0\) is \(\left( \frac{1}{\abs{\X}} \sum_x (-1)^{f(x)} \right)^2\).
Bernstein-Vazirani algorithm.
Given oracle access to \(f(x) = s \cdot x\) where \(s\) is a hidden secret.
Problem: Find \(s\).
Analysis
Similar to the analysis of the Deutsch-Jozsa algorithm.
Before measurement, the state is
\begin{equation*} \frac{1}{\abs{\X}} \sum_{x, y} (-1)^{f(x) + x \cdot y}\, \ket{y} = \frac{1}{\abs{\X}} \sum_{x, y} (-1)^{x \cdot (y + s)}\, \ket{y} = \ket{s}. \end{equation*}
Any classical algorithm (deterministic or randomized) needs \(n\) queries.
The final answer consists of \(n\) bits, and one classical query learns at most \(1\) bit of information.
Simon's algorithm is the main inspiration for Shor's factoring quantum algorithm.
It shows that quantum computers are provably exponentially more efficient (in terms of the number of queries) than bounded-error randomized algorithms.
Simon's problem setup:
Oracle \(f: \{0,1\}^n \rightarrow \{0,1\}^n\) satisfies that \(f(x) = f(x')\) if and only if \(x = x'\) or \(s + x = x'\) for some hidden secret \(s \ne 0\).
Problem: Find \(s\).
Note that the function \(f\) outputs multiple bits now, and it is a \(2\)-to-\(1\) function.
The state on \(A\) after the measurement is
\begin{equation*} \frac{1}{\sqrt{2}}(\ket{x} + \ket{x + s})_A, \end{equation*}
for some random \(x\).
It is the superposition of preimages of \(f\) for \(f(x)\).
After Step 1.3 each iteration, the state on \(A\) can be written as
\begin{equation*} \begin{split} & \frac{1}{\sqrt{2^{n+1}}} \sum_y \bigl( (-1)^{x \cdot y} + (-1)^{(x + s) \cdot y} \bigr) \ket{y} \\ = \; & \frac{1}{\sqrt{2^{n+1}}} \sum_y (-1)^{x \cdot y} \bigl(1 + (-1)^{s \cdot y} \bigr) \ket{y}. \end{split} \end{equation*}
The measurement will output a random \(y\) such that \(s \cdot y = 0\).
Step 1.2 is not necessary.
Each \(y^{(i)}\) is sampled independently among those satisfying \(s \cdot y^{(i)} = 0\).
To determine \(s\), we need to have \(n-1\) linearly independent \(y^{(i)}\)'s.
If the dimension of \(\text{span}(y^{(1)}, y^{(2)}, \ldots, y^{(i-1)})\) has rank \(k < n-1\), then the probability that \(y^{(i)}\) is not in the span is \(1 - \frac{2^{k}}{2^{n-1}} \ge \frac{1}{2}\). This implies that the algorithm runs in \(O(n)\) iterations with high probability.
Simon's algorithm queries the oracle \(O(n)\) times.
Yet any classical algorithm needs \(\Omega(\sqrt{2^n})\) queries!
Upper bound: By the birthday paradox, we expect to find \(x, x'\) such that \(f(x) = f(x')\) after \(O(\sqrt{2^n})\) queries.
Lower bound: Any classical (randomized or deterministic) algorithm needs \(\Omega(\sqrt{2^n})\) queries.
We say that a sequence \(x_1, x_2, \ldots, x_k\) is bad if there is no collision.
This excludes at most \(k \choose 2\) possible \(s\)'s.
For all \(k \ll \sqrt{2^n}\), the probability that \(x_1, x_2, \ldots, x_k\) is bad can be bounded as
\begin{equation*} \begin{split} \Pr(x_1, \ldots, x_k \text{ is bad}) & = \prod_{j=2}^k \Pr(x_1, \ldots, x_j \text{ is bad} \,|\, x_1, \ldots, x_{j-1} \text{ is bad}) \\ & = \prod_{j=2}^k \left(1 - \frac{j-1}{2^n - {j-1 \choose 2}}\right)\\ & \ge 1 - \sum_{j=2}^k \frac{j-1}{2^n - {j-1 \choose 2}} \approx 1 - \frac{k^2}{2^{n+1}}. \end{split} \end{equation*}
Let \(\X=\{0,1,\ldots,N-1\}\) be a finite set.
The discrete Fourier transform transforms a function
\begin{equation*} f:\X \rightarrow \complex \end{equation*}
to another function \(\hat{f}\) where
\begin{equation*} \hat{f}(y) = \frac{1}{\sqrt{N}}\sum_{x\in\X} \omega_N^{xy} f(x). \end{equation*}
Here, \(\omega_N = e^{2\pi i/N}\) is the \(N\)-th root of unity.
The normalization factor makes the map a unitary operator!
A transformation on a sequence of \(N\) complex numbers.
An important tool in classical signal processing.
Cooley–Tukey algorithm, from \(O(N^2)\) to \(O(N\log N)\):
\begin{equation*} \hat{f}(y) = \frac{1}{\sqrt{N}}\sum_{x\in\X} \omega_N^{xy} f(x). \end{equation*}
Gauss (1805)!
Assume for simplicity \(N=2^n\).
Consider the even and odd parts of \(f\) defined on \(\{0, 1, \ldots, N/2 - 1\}\),
\begin{equation*} f_e(x) = f(2x),\quad f_o(x) = f(2x+1). \end{equation*}
Recursive structure: for \(0\le y < N/2\),
\begin{equation*} \hat{f}(\overline{0y}) = \frac{1}{\sqrt{2}} \Bigl[ \hat{f_\text{e}}(y) + \omega_N^y \hat{f_{\text{o}}}(y) \Bigr]. \end{equation*}
What about the second half? For \(1y\), we have
\begin{equation*} \begin{split} \hat{f}(\overline{1y}) & = \frac{1}{\sqrt{N}} \sum_{0 \le x < N/2} \bigl(\omega_N^{2 x (y + N/2)} f_e (x) + \omega_N^{(2x + 1) (y + N/2)} f_o(x) \bigr)\\ & = \frac{1}{\sqrt{2}} (\hat{f}_e(y) - \omega_N^y \hat{f}_o(y)). \end{split} \end{equation*}
The recursion relation is
\begin{equation*} \begin{split} \hat{f}(\overline{0y}) & = \frac{1}{\sqrt{2}} \Bigl[ \hat{f_\text{e}}(y) + \omega_N^y \hat{f_{\text{o}}}(y) \Bigr],\\ \hat{f}(\overline{1y}) & = \frac{1}{\sqrt{2}} \Bigl[ \hat{f_\text{e}}(y) - \omega_N^y \hat{f_{\text{o}}}(y) \Bigr]. \end{split} \end{equation*}
FFT: compute the fast Fourier transform (recursively) on the even and odd parts respectively and combine the results using the above rule.
For function \(f:\X \rightarrow \complex\), define quantum state.
\begin{equation*} \ket{f} = \sum_{x\in\X} f(x) \ket{x}. \end{equation*}
Quantum Fourier transform maps the superposition \(\ket{f}\) to \(\ket{\hat{f}}\)
Verify that \(F \ket{f} = \ket{\hat{f}}\).
Recall that quantum computation is about the processing of superposition states.
In QFT, we transform the amplitudes by DFT.
The QFT is a unitary transform:
\begin{equation*} F = \frac{1}{\sqrt{N}} \sum_{x,y\in\X} \omega_N^{xy}\ket{y}\bra{x}. \end{equation*}
Hadamard is the smallest Fourier transform (N=2)!
Hadamard and Toffoli are quantum universal.
Use the recursion in a quantum way.
\begin{equation*} \begin{split} \hat{f}(\overline{0y}) & = \frac{1}{\sqrt{2}} \Bigl[ \hat{f_\text{e}}(y) + \omega_N^y \hat{f_{\text{o}}}(y) \Bigr]\\ \hat{f}(\overline{1y}) & = \frac{1}{\sqrt{2}} \Bigl[ \hat{f_\text{e}}(y) - \omega_N^y \hat{f_{\text{o}}}(y) \Bigr] \end{split} \end{equation*}
Apply QFT of \(n-1\) qubits to the first \(n-1\) qubits only once!!!
Then do some post-processing.
Cheating?! No!
After the QFT on the first \(n-1\) qubits, we have
\begin{equation*} \ket{f} = \ket{f_{\text{e}}} \ket{0} + \ket{f_{\text{o}}} \ket{1} \mapsto \ket{\hat{f_{\text{e}}}} \ket{0} + \ket{\hat{f_{\text{o}}}} \ket{1}. \end{equation*}
Define function \(\hat{g}_{\text{o}}(y) = \omega_N^y \hat{f_{\text{o}}}(y)\), and apply \(n-1\) controlled phase gates, we have
\begin{equation*} \ket{\hat{f_{\text{e}}}} \ket{0} + \ket{\hat{g}_{\text{o}}} \ket{1}. \end{equation*}
Apply Hadamard to the last qubit, we get
\begin{equation*} \frac{\ket{\hat{f_{\text{e}}}} + \ket{\hat{g}_{\text{o}}}}{\sqrt{2}} \ket{0} + \frac{\ket{\hat{f_{\text{e}}}} - \ket{\hat{g}_{\text{o}}}}{\sqrt{2}} \ket{1}. \end{equation*}
Move the last qubit to the first by SWAP gates, and we have \(\ket{\hat{f}}\).
The last step is needed because we need \(\hat{f}(0y)\) and \(\hat{f}(1y)\).
We have shown an efficient implementation of QFT.
Complexity is \(n^2\) or \(\log^2 N\).
A result shows that, if small rotations are omitted, QFT still works approximately and has complexity \(n \log n\).
Do not expect it to be useful in most problems of signal processing!
You cannot read out the values of the amplitudes.
In phase estimation, we have access to controlled-\(U^x\) for all \(x\in \X\) and also an eigenstate \(\ket{\phi}\) of \(U\) where
\begin{equation*} U \ket{\phi} = e^{i\phi} \ket{\phi}. \end{equation*}
The problem is to estimate the phase \(\phi \in [0, 2\pi)\) to some desired precision.
Inverse Fourier transform
\begin{equation*} F^{-1} = \frac{1}{\sqrt{N}} \sum_{x,y\in\X} \omega_N^{-xy}\ket{y}\bra{x}. \end{equation*}
Phase estimation is a powerful quantum algorithmic subroutine used in many other quantum algorithms.
To get an \(m\)-bit estimate of \(\phi\), the phase estimation procedure performs the following with \(n = m + O(\log(1/\eps))\),
Prepare state \(\displaystyle \frac{1}{\sqrt{N}} \sum_{x\in\X} \ket{x} \ket{\phi}\);
Apply unitary \(\displaystyle \sum_{x\in\X} \ket{x}\bra{x} \otimes U^x\), and the state becomes \(\displaystyle \frac{1}{\sqrt{N}} \sum_{x\in\X} e^{i\phi x} \ket{x}\ket{\phi}\);
Apply an inverse quantum Fourier transform on the first register.
If \(\phi = 2\pi y / 2^n\) for some \(y\in \X\), the state in the first register before the inverse Fourier transform is \(F \ket{y}\).
The first register is a good approximation of \(\phi/2\pi\).
The approximation will have \(m\) bits of precision with probability at least \(1-\eps\).
If \(n = m\), the outcome is the best \(n\)-bit approximation with probability at least \(4/\pi^2\).
2015 Turing Award
New directions in cryptography
Given a cyclic group \(G = \langle g_0 \rangle\) generated by \(g_0\) and an element \(h\in G\), the size \(N=\abs{G}\) is known.
The discrete logarithm \(\log_{g_0} h\) of \(h\) in \(G\) with respect to \(g_0\) is the smallest non-negative integer \(\alpha\) such that \(g_0^\alpha = h\).
Important for classical cryptography (Diffie-Hellman key exchange)!
We can completely break the protocol, if we can solve the discrete logarithm for \(A\) and \(B\), the two messages exchanged by the players,
Yet, no efficient classical algorithm is known.
We attack the protocol by giving a quantum algorithm for computing the discrete logarithm of a given \(h\).
Consider the following unitary.
\begin{equation*} U: \ket{g} \mapsto \ket{hg} \end{equation*}
It is possible to implement controlled-\(U^x\) operators for any \(x\) by repeated squaring.
Define states \(\displaystyle \ket{\phi_y} = \frac{1}{\sqrt{N}} \sum_{x\in\X} \omega_N^{xy} \ket{g_0^x}\), then
\begin{equation*} U \ket{\phi_y} = \frac{1}{\sqrt{N}} \sum_{x\in\X} \omega_N^{xy} \, \ket{g_0^{x+\log_{g_0}h}} = \omega_N^{-y\log_{g_0} h} \ket{\phi_y}. \end{equation*}
Use phase estimation!
How do we prepare the eigenstate \(\ket{\phi_y} = \frac{1}{\sqrt{N}} \sum_{x\in\X} \omega_N^{xy} \ket{g_0^x}\)?
It is not easy as the phase \(\omega_N^{xy}\) depends on the discrete log of \(g_0^x\).
It won't work if we prepare \(\frac{1}{\sqrt{N}} \sum_{x \in \X} \omega_N^{xy} \ket{x} \ket{g_0^x}\).
Forgetting quantum information coherently (\(\ket{x}\) here) is hard!
We can sample a random pair of \((y, \ket{\phi_y})\).
Factoring has been widely studied, and no classical algorithm is known.
The best known classical algorithm runs in time \(2^{(\log N)^c}\) for \(c = 1/2\) (or \(1/3\) depending on number theory conjectures).
People believe factoring is hard and use it to construct public-key cryptography.
Rivest, Shamir, and Adleman receive 2002 Turing Award for this.
Factoring is in BQP.
Shor's algorithm for factoring runs in \(\poly (\log N)\) quantum time.
A challenge to extended Church-Turing thesis:
Any efficient computation performed on a physically realistic model can be carried out efficiently on a Turing machine.
How did Peter Shor discover his algorithm?
Knowing the period of \(x^0, x^1, x^2, \ldots \mod N\) helps solving the factoring problem for \(N\).
The smallest non-zero \(r\) such that \(x^r \equiv 1 \pmod{N}\).
If \(r\) is even and \(x^{r/2}+1\) and \(x^{r/2}-1\) are not multiples of \(N\), we have
\begin{equation*} x^r - 1 = (x^{r/2}-1)(x^{r/2}+1) = kN. \end{equation*}
Then \(\mathrm{gcd}(x^{r/2}-1, N)\) and \(\mathrm{gcd}(x^{r/2}+1, N)\) will be non-trivial factors of \(N\).
Number theory: The condition above happens with probability \(\ge 1/2\) over random \(x\).
Find the period of \(f(a) = x^a \pmod{N}\).
The period can be huge, and the problem is believed to be classically hard.
Phase estimation applied to
\begin{equation*} U \ket{y} = \ket{x y\;\mathrm{mod}\;N}. \end{equation*}
Some eigenvectors of \(U\):
\begin{equation*} \begin{split} \ket{u_s} & = \frac{1}{\sqrt{r}} \sum_{k=0}^{r-1} \omega_r^{-sk} \ket{x^k\;\mathrm{mod}\;N}.\\ U \ket{u_s} & = \frac{1}{\sqrt{r}} \sum_{k=0}^{r-1} \omega_r^{-sk} \ket{x^{k+1}\;\mathrm{mod}\;N} = \omega_r^{s} \ket{u_s}. \end{split} \end{equation*}
\(U\) is unitary, and by using modular exponentiation, we can implement controlled-\(U^j\).
How do we prepare the eigenstate \(\ket{u_s}\) depending on \(r\)?
We don't know how to prepare \(\ket{u_s}\) for given \(s\).
Yet, we have
\begin{equation*} \frac{1}{\sqrt{r}} \sum_{s=0}^{r-1} \ket{u_s} = \frac{1}{r} \sum_{s=0}^{r-1} \omega_r^{-sk} \ket{x^k\;\mathrm{mod}\; N} = \ket{1}. \end{equation*}
We can obtain high precision of the phase \(\varphi \approx s/r\) for some random \(r\).
Use continued fraction algorithm to find \(r\).
Suppose \(s/r\) is a rational number satisfying
\begin{equation*} \abs{\frac{s}{r} - \varphi} \le \frac{1}{2 r^2}. \end{equation*}
Then the continued fraction algorithm for \(\varphi\) can find \(s\) and \(r\) efficiently.
Some hidden detail: Need a minor fix when \(s\) and \(r\) are not coprime.
The development of Simon's and Shor's algorithms motivated the quantum algorithmic framework called hidden subgroup problem (HSP).
It generalized the technique to a lot more problems with group (algebraic) structures.
Simons problem: \(\{0, s\}\) is a subgroup of \(\{0, 1\}^n\).
The Abelian HSP problem is completely understood.
Most of the non-Abelian cases are open with a few exceptions.
A famous open problem is the dihedral subgroup problem, for which the best quantum algorithm (Kuperberg's algorithm) runs in time \(O(2^{\sqrt{n}})\).
Shor's algorithm breaks the RSA public-key cryptosystem if large-scale quantum computers can be built.
Similar algorithms can break elliptic-curve based public-key cryptography.
Post-quantum cryptography is emerging.
NIST PQC standardization process announced the third-round candidates in July, 2020 and the candidates to be standardized in July, 2022.
Year | Public-Key encryption/KEMs | Digital signatures | |
---|---|---|---|
2020 | Classic McEliece (Code-based) | CRYSTALS-DILITHIUM (Lattice-based) | |
2020 | CRYSTALS-KYBER (Lattice-based, LWE) | FALCON (Lattice-based) | |
2020 | NTRU (Lattice-based) | Rainbow (Multivariate) | |
2020 | SABER (Module Learning With Rounding) | ||
2022 | CRYSTALS-KYBER (Lattice-based, LWE) | CRYSTALS-DILITHIUM (Lattice-based) | |
2022 | FALCON (Lattice-based) | ||
2022 | SPHINCS+ (Hash-based) |
Both Shor and Grover were at Bell Labs at that time.
Suppose you have a Boolean function \(f : \{0,1\}^n \to \{0,1\}\) to which we can make oracle access as before.
We know that there is one and only one input \(x\) such that \(f(x) = 1\).
We call this \(x\) the marked item.
How many queries to \(f\) do we need to find the marked item?
Classically \(\Theta(N)\) for \(N = 2^n\).
In 1996, Grover gave a quantum algorithm that makes \(O(\sqrt{N})\) queries only.
How?! It's not very different from Elitzur-Vaidmen.
This marks the beginning of an interesting line of research on quantum algorithmic techniques different from QFT and HSP.
Remark: Grover's algorithm requires the oracle or QRAM (for database search).
Define Grover iterate as
\begin{equation*} G = H^{\otimes n} R H^{\otimes n}\, O_{f,\pm}, \end{equation*}
where \(R\) is the reflection along \(\ket{0^n}\).
An example for \(n=2\) and marked item \(x=10\):
\begin{equation*} \frac{1}{2} \begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \end{pmatrix} \mapsto \frac{1}{2} \begin{pmatrix} 1 \\ 1 \\ -1 \\ 1 \end{pmatrix} = \frac{1}{4} \begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \end{pmatrix} + \frac{1}{4} \begin{pmatrix} 1 \\ 1 \\ -3 \\ 1 \end{pmatrix} \mapsto \begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix}. \end{equation*}
Good state \(\ket{\mathrm{G}} = \ket{x}\) for \(f(x) = 1\) and bad state \(\ket{\mathrm{B}} = \frac{1}{\sqrt{N-1}} \sum_{x: f(x)=0} \ket{x}\).
The uniform superposition state \(\ket{\mathrm{U}}\) is in the span of \(\ket{\mathrm{G}}\) and \(\ket{\mathrm{B}}\),
\begin{equation*} \ket{\mathrm{U}} = \sin\theta\, \ket{\mathrm{G}} + \cos\theta\, \ket{\mathrm{B}}, \text{ for } \theta = \arcsin(1/\sqrt{N}). \end{equation*}
The Grover iterate consists of two reflections which generate a rotation of \(2\theta\).
After \(k\) iterations, the state is \(\sin\bigl((2k+1)\theta \bigr)\, \ket{\mathrm{G}} + \cos\bigl((2k+1)\theta \bigr)\, \ket{\mathrm{B}}\).
Choose \(k \approx \frac{\pi}{4\theta} = O(\sqrt{N})\) as \(\sin\theta \le \theta\).
This is known as BBBV (Bennett-Bernstein-Brassard-Vazirani), proved even before Grover's algorithm where the authors tried to show that there shouldn't be quantum polynomial time algorithms for NP-hard problems.
Quantum algorithms need \(\Omega(\sqrt{N})\) queries to solve the search problem.
There are now many proofs for this fact and there is an advanced theory called quantum query complexity answering related questions.
A quantum query algorithm for solving the search problem has the form
\begin{equation*} U_T O_T \cdots O_2 U_1 O_1 U_0. \end{equation*}
\(\phantom{f(}x\) | \(1\) | \(2\) | \(\cdots\) | \(x_0\) | \(\cdots\) | \(N\) |
---|---|---|---|---|---|---|
\(f(x)\) | \(0\) | \(0\) | \(\cdots\) | \(0\) | \(\cdots\) | \(0\) |
Hybrid argument proof idea:
First, consider a case where \(f(x) = 0\) for all \(x\) and so all oracle calls are identity maps.
There will be an \(x_0\) that is not queried very often because there are so many \(x\)'s to query.
If we reprogram the oracle so that \(f(x_0) = 1\), the algorithm will not notice this change (by a hybrid argument).
We use a hybrid argument to prove the claim.
Consider Hybrid \(0\) and let the state before the \(t\)-th query be \(\ket{\psi_{t}} = \sum_{x\in \X} \alpha_x^{(t)} \ket{x}\).
Then \(\sum_{x\in \X} \sum_{t=1}^T \abs{\alpha_x^{(t)}}^2 = T\).
There is an \(x_0\) such that \(\sum_{t=1}^T \abs{\alpha_{x_0}^{(t)}}^2 \le T/N\).
By Cauchy-Schwarz, \(\sum_{t=1}^T \abs{\alpha_{x_0}^{(t)}} \le T/\sqrt{N}\).
It is easy to show \(\norm{\xi_{t} - \xi_{t-1}} \le 2 \abs{\alpha_{x_0}^{(T-t+1)}}\) and \(\norm{\xi_{T} - \xi_0} \le 2 \sum_t \abs{\alpha_{x_0}^{(t)}} \le 2T/\sqrt{N}\).
I am a quantum engineer, but on Sundays I have principles.
(from a talk of John S. Bell)
I hope you are all in a quantum mood today. I know I am. Because this is going to be a pretty special lecture. In fact, if an hour from now, you are not feeling deeply disturbed and unsettled and feel like your grasp of reality has been forever altered, then I haven't done my job very well.
(from John Preskill, Ph CS 219A Lecture 6 Bell Inequalities)
Back in the '30s, many physicists were puzzled by the weird nature of quantum mechanics and quantum entanglement.
The \(X\) (or \(Z\)) measurement extracts one bit of classical information from the quantum system.
When and how is this piece of classical information formed?
Predetermined vs. freshly cooked up
Is the information there if we didn't measure it.
Is the moon there when nobody looks?
"We often discussed his notions on objective reality. I recall that during one walk, Einstein suddenly stopped, turned to me and asked whether I really believed that the moon exists only when I look at it."
"I cannot seriously believe in [the quantum theory] because it cannot be reconciled with the idea that physics should represent a reality in time and space, free from spooky actions at a distance."
In their famous 1935 paper, Einstein, Rosen, and Podolsky questioned the completeness of quantum theory after observing the correlation behavior of local measurements on the singlet state \(\ket{\beta_{11}}\).
Suppose Alice has the first qubit and measures the observable \(\vec{u} \cdot \vec{\sigma}\) and Bob measures his half of the state along the direction \(\vec{v} \cdot \vec{\sigma}\).
Consider the following quantity measuring the correlation between Alice and Bob's outcomes:
\begin{equation} Q(\vec{u},\vec{v}) = \bigbra{\beta_{11}} (\vec{u} \cdot \vec{\sigma})_A \otimes (\vec{v} \cdot \vec{\sigma})_B \bigket{\beta_{11}} = - \vec{u} \cdot \vec{v}. \end{equation}
Where is the paradox?
\begin{equation} \Pr\bigl[ \text{A and B have the same outcome} \bigr] = \frac{Q(\vec{u}, \vec{v}) + 1}{2} = \frac{1 - \vec{u} \cdot \vec{v}}{2}. \end{equation}
If Alice measures \(\sigma_x\) and Bob measures \(\sigma_z\), Bob will see a random \(\pm 1\) outcome. While if Alice measures \(\sigma_z\) and Bob measures \(\sigma_z\), Bob's outcome is already determined (the opposite of Alice's outcome) even before the measurement of Bob.
The puzzle here is that operations done on Alice's side seem to have influenced Bob's system right away, causing Bob's particle to behave differently.
There are two ways to open a box by \(X\) and \(Z\) measurements.
The result \(a\) of measuring \(\vec{u} \cdot \vec{\sigma}\) on Alice's side is determined by \(\vec{u}\) and supposedly some hidden variable \(\lambda\).
There are functions \(a\) and \(b\) and hidden variable \(\lambda\) so that the correlation can be written as
\begin{equation} P(\vec{u},\vec{v}) = \mathop{\mathbb{E}}_\lambda\, a(\vec{u},\lambda) b(\vec{v},\lambda). \end{equation}
We can explain all the Pauli measurements on the singlet state using a hidden variable model.
Let \(\lambda\) be a random unit vector in \(\real^3\) and define functions \(a, b\),
\begin{equation} \begin{split} a(\vec{u}, \lambda) & = \phantom{-}\sign(\lambda \cdot \vec{u}),\\ b(\vec{v}, \lambda) & = -\sign(\lambda \cdot \vec{v}). \end{split} \end{equation}
That means that even if we tomography the state, it is still impossible to exclude hidden variable models!
For any three binary properties \(A,B,C\) of a set of objects,
\begin{equation} N(A, \bar{B}) + N(B,\bar{C}) \ge N(A,\bar{C}). \end{equation}
Example:
A | B | C |
---|---|---|
Undergraduate | Female | CS Major |
Proof:
\begin{equation} \begin{split} \text{LHS} & = N(A,\bar{B},C) + N(A,\bar{B},\bar{C}) + N(A,B,\bar{C}) + N(\bar{A},B,\bar{C})\\ \text{RHS} & = N(A,B,\bar{C}) + N(A,\bar{B},\bar{C}). \end{split} \end{equation}
It fails quantum-mechanically!
Bell scenario
On state \(\ket{\beta_{11}}\) measure
\begin{equation} \begin{split} A & = \vec{n}_A \cdot \vec{\sigma},\\ B & = \vec{n}_B \cdot \vec{\sigma},\\ C & = \vec{n}_C \cdot \vec{\sigma}.\\ \end{split} \end{equation}
Born's rule
\begin{equation} N(A,\bar{B}) \approx N \cdot p(A, \bar{B}) = N \cdot \bigbra{\beta_{11}} \frac{I+A}{2} \otimes \frac{I+B}{2} \bigket{\beta_{11}} = \frac{1 - \vec{n}_A \cdot \vec{n}_B}{4} \cdot N. \end{equation}
Choose \(\vec{n}_A = (1,0,0), \vec{n}_B = (1/2, \sqrt{3}/2, 0), \vec{n}_C = (-1/2, \sqrt{3}/2, 0)\), then
\begin{equation} N(A,\bar{B}) + N(B,\bar{C}) = N/4 < N(A,\bar{C}) = 3N/8. \end{equation}
An improvement to the original Bell's inequality is called the CHSH inequality proposed by John Clauser, Michael Horne, Abner Shimony and Richard Holt. The CHSH inequality is as follows
\begin{equation} -2 \le E \le 2, \end{equation}
where
\begin{equation} E = P(\vec{u}_0,\vec{v}_0) + P(\vec{u}_0,\vec{v}_1) + P(\vec{u}_1,\vec{v}_0) - P(\vec{u}_1,\vec{v}_1). \end{equation}
The favored answers are those with \(a \oplus b = x \land y\).
By definition of \(P(\vec{u},\vec{v})\),
\begin{align*} \abs{E} = & \Bigl|\, \E_\lambda\, a(\vec{u}_0,\lambda) \bigl[ b(\vec{v}_0,\lambda) + b(\vec{v}_1,\lambda) \bigr] + a(\vec{u}_1,\lambda) \bigl[ b(\vec{v}_0,\lambda) - b(\vec{v}_1,\lambda)\bigr] \,\Bigr|\\ & \le \E_\lambda \, \Bigl| a(\vec{u}_0,\lambda) \bigl[ b(\vec{v}_0,\lambda) + b(\vec{v}_1,\lambda) \bigr] + a(\vec{u}_1,\lambda) \bigl[ b(\vec{v}_0,\lambda) - b(\vec{v}_1,\lambda)\bigr] \Bigr|\\ & \le 2. \end{align*}
The last step follows as either \(b(\vec{v}_0,\lambda) + b(\vec{v}_1,\lambda)\) or \(b(\vec{v}_0,\lambda) - b(\vec{v}_1,\lambda)\) must be \(0\).
Quantum mechanically, however, it is possible to choose vectors \(\vec{u}_s\) and \(\vec{v}_t\) such that
\begin{equation} Q(\vec{u}_s,\vec{v}_t) = - \vec{u}_s\cdot \vec{v}_t = (-1)^{st} \frac{\sqrt{2}}{2}, \end{equation}
which gives a corresponding value of \(E=2\sqrt{2}\), a violation of the classical bound by a factor of \(\sqrt{2}\).
Notice the 45-degree rotation between Alice and Bob's observables.
This is what makes it different from tomography.
CHSH as a game
CHSH inequality as a game value (maximum acceptance probability)
\begin{equation} \val(\text{CHSH}) = \frac{4 + E}{8} \le \frac{3}{4} = 0.75. \end{equation}
If Alice and Bob are allowed to share entanglement
\begin{equation} \vale(\text{CHSH}) = \frac{4 + E}{8} \ge \frac{4 + 2\sqrt{2}}{8} \approx 0.85. \end{equation}
Local hidden variable models cannot model all predictions of quantum mechanics.
There is a simple statistical test to tell quantum theory apart from classical theories.
If we measure on a Bell state, both Alice and Bob will observe a random bit, no matter which single qubit observable \(\vec{n} \cdot \vec{\sigma}\) they choose.
These random outcomes are not predetermined before the measurement.
Useful for certified randomness from quantum mechanics.
The solution we gave earlier for CHSH is optimal (assuming quantum mechanics).
Let \(\ket{\psi}\) the be shared state between Alice and Bob and let \(A^0, A^1\) be the observable Alice uses and \(B^0, B^1\) be the observable Bob uses, then
\begin{equation} \langle \text{CHSH} \rangle = \bigbra{\psi}\, \sum_{x,y} (-1)^{xy} A^x \otimes B^y\, \bigket{\psi} \le 2\sqrt{2}. \end{equation}
Proof.
\begin{equation} \biggl\langle {\Bigl( I \otimes B^0 - \frac{A^0 + A^1}{\sqrt{2}} \otimes I\Bigr)}^2 + {\Bigl( I \otimes B^1 - \frac{A^0 - A^1}{\sqrt{2}} \otimes I\Bigr)}^2\, \biggr\rangle = 4 - \sqrt{2} \langle \text{CHSH} \rangle. \end{equation}
If \((\ket{\psi}, \{A^x\}, \{B^y\})\) is \(\eps\)-optimal, and \(A = (A^0 + A^1)/\sqrt{2}\) then the first square simplifies to
\begin{equation} \norm{ \bigl( I \otimes B^0 - A \otimes I \bigr) \bigket{\psi}} \le O(\sqrt{\eps}). \end{equation}
This implies that
\begin{equation} \begin{split} \norm{ \bigl( A \otimes B^0 - A^2 \otimes I \bigr) \bigket{\psi}} & \le O(\sqrt{\eps}),\\ \norm{ \bigl( I \otimes (B^0)^2 - A \otimes B^0 \bigr) \bigket{\psi}} & \le O(\sqrt{\eps}). \end{split} \end{equation}
Take the sum, we get
\begin{equation} \norm{ \bigl( I \otimes I - A^2 \otimes I \bigr) \bigket{\psi}} \le O(\sqrt{\eps}). \end{equation}
Therefore, \(A^0\) and \(A^1\) are approximately anti-commuting on state \(\ket{\psi}\):
\begin{equation} \langle \{A^0,A^1\}^2 \rangle \le O(\eps). \end{equation}
Using Jordan's lemma, the approximate anti-commuting relation of \(A^0\) and \(A^1\) on the state implies the following structural result.
If \((\ket{\psi}, \{A^x\}, \{B^y\})\) is \(\eps\)-optimal for CHSH inequality, then there is an isometry \(V_A\) on Alice's ssytem such that
\begin{equation} \begin{split} \norm{(A^0 - V_A^\dagger (X \otimes I) V_A) \ket{\psi}} & \le O(\sqrt{\eps}),\\ \norm{(A^1 - V_A^\dagger (Z \otimes I) V_A) \ket{\psi}} & \le O(\sqrt{\eps}). \end{split} \end{equation}
The power of the rigidity theorem is that we now have a "classical leash" on the quantum system.
Useful for device-independent cryptography and information processing.
CHSH game has quantum and classical values \(0.85\) and \(0.75\), respectively.
It is sometimes useful to have a simple game where the quantum value is \(1\) while the classical value is less than \(1\).
Mermin-Peres magic square game is such an example.
Every row and column has even parity except the last column.
Prove that there is no perfect classical strategy.
Share two EPRs and measure the observables in the following square:
The last row has even parity outcomes and the last column has odd parity outcomes.
\begin{equation} \langle I \otimes (XZ \cdot ZX \cdot YY) \rangle = 1,\quad \langle I \otimes (XX \cdot ZZ \cdot YY) \rangle = -1. \end{equation}
Observables in the same row or column are commuting.
Observables in different rows and columns are anti-commuting.
The magic square game also enforces anti-commutativity and has rigidity.
A basic cryptographic task is to send a message (plain text) from Alice to Bob without letting a third party Eve to learn any information about the plain text by tapping the channel.
The one-time pad is information-theoretically secure.
But where would the key \(k\) come from in the first place?
Classically, primitives like Diffie-Hellman key exchange needs cryptographic assumptions (some problems are hard).
It is possible to establish secure keys between Alice and Bob using quantum key distribution (QKD) protocols.
Security based on quantum mechanics!
Basis \(0\) is the computational basis \(\{\ket{0}, \ket{1}\}\); basis 1 is the Hadamard basis \(\{\ket{+}, \ket{-}\}\).
If Eve doesn't know which basis the states are prepared, learning information about the state will cause disturbance to the state and will be detected.
There is a tradeoff between information gain and disturbance.
The uncertainty principle and no-cloning theorem put to work!
Invented by Bennett and Brassard in 1984 using ideas of Wiesner from 1960's.
Stories about S. Wiesner: https://scottaaronson.blog/?p=5730
Alice chooses \(n\) random bits \(a_1, a_2, \ldots, a_n\) and \(n\) random basis \(b_1, b_2, \ldots, b_n\). She sends \(n\) qubits encoding \(a_i\) in basis \(b_i\) for all \(i\).
1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | \(\cdots\) | \(n\) | |
---|---|---|---|---|---|---|---|---|---|---|---|
\(a\) | 0 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | \(\cdots\) | 0 |
\(b\) | 1 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | \(\cdots\) | 1 |
State | \(\ket{+}\) | \(\ket{1}\) | \(\ket{-}\) | \(\ket{0}\) | \(\ket{1}\) | \(\ket{-}\) | \(\ket{1}\) | \(\ket{1}\) | \(\ket{-}\) | \(\cdots\) | \(\ket{+}\) |
Bob measures each qubit in randomly chosen basis \(b'_i\) and obtains outcomes \(a'_i\).
Alice and Bob publish their random basis \(b_i\) and \(b'_i\) for \(i=1, 2, \ldots, n\).
Define \(\text{EQB} = \{ i \mid b'_i = b_i, 1 \le i \le n\}\). The expected size of \(\text{EQB}\) is \(n/2\). Alice randomly partitions \(\text{EQB} = \text{CHK} \cup \text{RAW}\) of equal size. If the fraction of \(a_i \ne a'_i\) for \(i \in \text{CHK}\) is larger than \(p\), Alice and Bob abort the protocol. Otherwise, they use the \(a\) and \(a'\) bits indexed by \(\text{RAW}\) as the raw key.
Alice and Bob perform two postprocessing steps, information reconciliation and privacy amplification.
The keys are random
They are not determined by Alice or Bob directly.
The communication between Alice and Bob does not need to be secure but is authenticated.
Alice knows she is talking to Bob and vice versa.
Commercial and government applications of QKD
Currently, it is not for personal use generally.
Entanglement based protocols
Alice and Bob measure on singlet state, and detect Eve by testing Bell inequality violations.
[Ekert, Quantum cryptography based on Bell's
theorem, 1991]
[Bennett, Brassard, Mermin, Quantum cryptography without Bell's
theorem, 1992]
Unconditional security proof for QKD is hard.
[Mayers, Unconditional security in quantum
cryptography, 1996]
[Lo and Chau, Unconditional security of quantum key distribution over
arbitrarily long distances, 1999]
[Shor and Preskill, Simple Proof of Security of the BB84 Quantum Key
Distribution Protocol, 2000]
Alice and Bob each received a measurement device built by their enemy, Eve.
The devices share some state \(\rho_{ABE}\) among Alice and Bob's devices and Eve's lab.
Each device has two buttons \(0\) and \(1\) and outputs a bit when a button is pressed. Other than that, there are no other guarantees.
Alice and Bob can perform CHSH inequality check by pressing the buttons and collecting data.
As long as the boxes are well shielded, that is, the boxes cannot not send out the keys after the QKD protocol (no radio built in), Alice and Bob can still run QKD protocols!
How is it possible?
Alice and Bob can use the rigidity of the CHSH game to test the devices!
A simple case to check is that if Eve stores some predetermined bit string in the devices and uses it as output, Eve cannot succeed.
Probably the first-ever quantum information protocol (Wiesner 1969, published 1983).
It is an application of the no-cloning theorem as the no-cloning theorem prevents counterfeiting of money.
The bank prints quantum bills, each of which has
To verify a bill, you bring it back to the bank!
The bank looks at the serial number and then measures each qubit in the bill in the basis it was prepared.
A counterfeiter cannot clone the quantum state in the banknote without the knowledge of the basis in which the single-qubit states are prepared.
This can be made rigorous.
[Molina, Vidick, Watrous, Optimal counterfeiting attacks and generalizations for Wiesner's quantum money, 2012]
Better schemes where we don't need to involve the bank for verification were also studied.
Public-key quantum money.
There is a clever attack on Wiesner's scheme if the bank returns the bill whether or not it passed verification.
How to fix the problem?
If the verification fails, the bank will not return the bill.
Secure now?
The attack uses a probe qubit initilized in \(\ket{0}\)
Repeat the following \(\frac{\pi}{2\eps}\) times:
Measure probe qubit.
If the first qubit is one of \(\ket{0}, \ket{1}, \ket{-}\), the probe qubit stays at \(\ket{0}\).
If the first qubit is \(\ket{+}\), the probe qubit rotates to \(\ket{1}\).
The overall probability that the bank verification ever fails is \(O(\eps)\), which can be arbitrarily small! (It is related to the Elitzur-Vaidman bomb detection problem)
How can we fix this attack?!