Lecture 1: Introduction to Quantum Computing

About this Course

Welcome

Seminars on Quantum Computing is a quantum computing course designed for undergraduate students in the computer science.

This is an incomplete set of notes prepared by Zhengfeng Ji for the course.

Quantum Computing

What is quantum computing, or more generally, what is quantum information science?

The first example question in ChatGPT: Explain quantum computing in simple terms.

ChatGPT example questions

The answer I got was:

Quantum computing is a type of computing that uses the principles of quantum mechanics, the science of the very small, to process information. Instead of using classical bits, which can be either 0 or 1, quantum computers use quantum bits or qubits, which can be in multiple states at the same time.

This ability to exist in multiple states simultaneously is called superposition and allows quantum computers to perform certain calculations much faster than classical computers. In addition, quantum computers can also use a phenomenon called entanglement, where the state of one qubit is correlated with the state of another, to further enhance their computational power.

Quantum computing has the potential to solve some of the world's most challenging problems in fields such as cryptography, drug discovery, and climate modeling. However, building a fully functional quantum computer remains a major scientific and engineering challenge, and many of the applications of quantum computing are still in the experimental stage.

You can try DeepSeek V3 and you will get something similar or better.

They all describe the outline and motivation of this course well!
Quantum + Information: The second quantum revolution

The success of quantum mechanics. Laser, GPS, etc.

From passive understanding of the collective effects to active manipulation of individual particles.

History, Highlights, Current Status

History and highlights

a. Early history: Feynman 1982, Deutsch 1985, Deutsch-Jozsa 1992 b. Shor's factoring algorithm 1994 (breaks RSA) c. Quantum cryptography (QKD, Bennett Brassard 84) d. Quantum teleportation (Bennett et al., 1993) e. Simulation of quantum systems (Feynman's dream)
Current status

Primitive quantum computers are being built. Many players are racing to build quantum computers: Google, IBM, China, …

Different labs/companies are betting on different hardware routes: Superconducting qubits, ion traps, photonic systems, topological qubits, and more recently, neutral atom platforms.

Nowadays, there are programmable quantum computers with 50-1000 qubits.

Major problem: devices are noisy! The noise level is around \(10^{-3}\).
Supremacy and simulation of Google and USTC

Google Sycamore

Sycamore, 53 qubits

Zhuchongzhi, 66 qubits

The biggest competitor to quantum computers are clever classical algorithms: In Nov 2021, a group from Chinese Academy of Sciences used tensor network algorithms to reproduce the statistics of the Google supremacy experiment using 512 GPUs in 15 hours.

Both Google and USTC have upgraded the system to battle against better classical simulation methods and hardware.
On the theory side: new algorithms, new protocols, fault-tolerant quantum computing, new insights on the power of quantum computers

More computer scientists get involved
Hype?

a. TIME's new cover: Quantum computers will transform our world—and create a 21st century "space race"

b. Quantum computing has a hype problem, MIT Technology Review, By Sankar Das Sarma.

Quantum computing startups are all the rage, but it's unclear if they'll be able to produce anything of use in the near future.

Learning Quantum Computing

What you will learn in this subject
1. Quantum information basics (for about 1/3 of the time)
2. Topics in quantum computing
3. Research skills. Get an idea of what the big questions are for quantum computing.
Is quantum computing hard to learn?

The mathematics is easy, the physics is hard.

There is a simple mathematical framework describing quantum mechanics with merely four postulates, about state space, evolution, measurement, and joint system, respectively, in both the pure state and mixed state settings.

The postulates are simple, but the truly amazing thing to note is how far these four simple postulates can bring us.

"Shut up and calculate!"
Reiterate over the four postulates from different perspectives.

Lecturers

Zhengfeng Ji, Yuan Feng, and Jianxin Chen
Office: 1-707, Ziqiang S&T Building
Research direction of Zhengfeng

Quantum computing, theory of computing.
Research direction of Yuan

Quantum computing, theory of programming, mathematical logic.
Research direction of Jianxin

Quantum computing, computer architecture.

Q & A

Online: Wechat group, learn.tsinghua.edu.cn, email
Office hour: after each class

Assessment

Homework on the quantum information basics. (40%)
Work on projects in groups and take an idea to the extreme (Given projects, or projects of your choice approved by your lecturers), write an essay, present your findings.

We will grade your project by its writing and format (10%), originality (20%), completion (10%), and presentation (20%).
Extra rule: If your project yields results deemed near-publishable by the teachers' evaluation, you will automatically receive an A+.

Textbook

a. Quantum Computation and Quantum Information (Mike & Ike)

b. Aaronson Lecture Notes

c. de Wolf Lecture Notes

d. Foundations of Quantum Programming by Mingsheng (Tsinghua Online Access)

CS, Mathematics, and Physics

Quantum Computing is Interdisciplinary

Computer science

Mathematics

Physics

Three basic statements about the physical world
1. Probability
  
  \(\sum_j P_j = 1, P_j \ge 0\)
  
  Monotonicity
  
  Single-slit case
2. Locality
  
  Things can only propagate through the universe at a certain speed.
  
  In physics, locality naturally emerges from Einstein's special theory of relativity, which implies that no signal can propagate faster than the speed of light.
3. Local Realism
  
  Things have predetermined value, which depends on parameters in a local region.
  
  Einstein: I like to think the moon is there even if I am not looking at it.
Church-Turing thesis

What is Church-Turing thesis?

TCS is mathematics but the Church-Turing thesis connects it to physics.

Information is physical.

Conversely: It from qubit.
Quantum mechanics and the above principles

Probability stays, yet monotonicity goes away.

Locality stays but realism is abandoned.

EPR and Bell's inequality.

Quantum computing does not falsify Church-Turing, but seriously challenges the extended Church-Turing thesis.

The Double-Slit Experiment

Watch this video.

Double-slit with an observer: which path did the electron take?

If there is a conscious observer, the distribution goes back to classical! That is the same as decoherence in a quantum system interacting with an environment!

Quantum mechanics is an (unusual) theory using probability.

A cat isn't found in a superposition of alive and dead states, because it interacts constantly with its environment. These interactions essentially leak information about the 'cat system' out.

This explains the difficulty of building a quantum computer. Quantum computers need to be isolated well from the environment yet still easy to manipulate.

Shor's factoring algorithm can be thought of as a large-scale interference experiment.

Born's rule
1. Probability stays but monotonicity goes.
  
  God does play dice, and an unusual one.
  
  Einstein liked inventing phrases such as "God does not play dice," "The Lord is subtle but not malicious." On one occasion Bohr answered, "Einstein, stop telling God what to do."
  
  Instead of probabilities, quantum mechanics uses amplitudes.
  
  \(p = \abs{\alpha}^2\).
2. Let \(\alpha_i\) be the amplitude that it lands on a position through slit \(i\).
  
  \(\alpha=\alpha_1 + \alpha_2\).
  
  Example \(\alpha_1 = 1/2\) and \(\alpha_2 = -1/2\). What is the probability \(p\)?
  
  Cancellation!

Linear Algebra Formulation

Linear Algebra for Probabilities

Consider the problem of flipping a coin.

\(\Pr(\text{heads}) = p, \Pr(\text{tails}) = q\).
Apply a transform \(X\).

In general, the transform is a matrix of transition probabilities \(p(a|b)\).

Markov chain and stochastic matrix.

Now consider two independent coins. This leads to the tensor product very naturally.

More generally, the tensor product of two matrices \(A, B\) is defined as a block matrix
\[\begin{equation*} A \otimes B = \begin{pmatrix} A_{1,1} B & A_{1,2} B & \cdots & A_{1,n} B\\ A_{2,1} B & A_{2,2} B & \cdots & A_{2,n} B\\ \vdots & \vdots & \ddots & \vdots \\ A_{m,1} B & A_{m,2} B & \cdots & A_{m,n} B\\ \end{pmatrix}. \end{equation*} \]
Consider the CNOT matrix.

It creates a correlation between the two coins.
Quantum mechanics is similar, except that it uses amplitudes.

Quantum (Pure) States

A quantum state is a unit vector of \(\complex^N\).

Finite dimensional spaces suffice for this course.

A qubit is a superposition of 0 and 1.

Quantum states are superposition of possibilities.

Dirac notation.

ket, bra (\(\ket{0}, \ket{1}, \ket{+}, \ket{-}, \ket{\psi}\)).

State Transformation

Unitary. \(U^\dagger U = I\).

Examples: \(I, X, Y, Z, H, S, T\), CNOT, rotations.

\[\begin{equation*} H = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1\\ 1 & -1 \end{bmatrix}. \end{equation*} \] \[\begin{equation*} H \ket{0} = \ket{+}, H \ket{1} = \ket{-}. \end{equation*} \]

Why unitary? Unitary operators are the only operators that preserve the length of vectors.

Reading I

Movie: The Quantum Tamers
一场从根上开始的革命 (by Mingsheng Ying)
Simulating physics with computers (by Richard P. Feynman)
40 years of quantum computing
- 18:01 Charlie Bennett - 1981 and the Beginning of Quantum Information
- 49:20 Peter Shor - Development of Quantum Algorithms and Error Correction

Lecture 2: Postulates of Quantum Mechanics

Recap

Probability, amplitude, and Born's rule

Two-slit experiment

A (classical) random bit the mixture of \(0\) or \(1\) described by probabilities.

A quantum bit (qubit) is a superposition state described by amplitudes for \(0\) and \(1\).

1-norm vs. 2-norm theory.
Linear algebra for probability

A (random) bit is represented as a vector \(\begin{pmatrix} p_0 \\ p_1 \end{pmatrix}\).

\(X\) and CNOT as operations on the probability vector.

\(X\) is the NOT gate in matrix form!

CNOT says if the control bit is \(1\), flip the target bit.

Tensor product arises naturally.

Binary symmetric channel as a matrix.
Linear algebra for quantum mechanics
Quantum gates
- Unitary gates
- Pauli \(X, Y, Z\)
- The Hadamard gate \(H\)

Hadamard Gates and Quantum Interference

In some sense, the most important quantum gate.
The Hadamard gate \(H\) has the matrix form
\[\begin{equation} H = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1\\ 1 & -1 \end{bmatrix}. \end{equation} \]
It maps \(\ket{0}, \ket{1}\) to \(\ket{+}, \ket{-}\), and vice versa.
\(H\) and \(H^2\).

Destructive interference (in the tree for \(H^2 \ket{0}\)).
All quantum gates are changes of basis.

We can expand \(\ket{0}\) in the basis of \(\ket{+}\) and \(\ket{-}\).

Born's Rule Revisited

\(\ket{0}, \ket{1}\) is not special.
There are different ways to measure in quantum!
What if we measure along another choice of basis? Say, \(\ket{+}, \ket{-}\)?

Decompose the state \(\ket{\psi}\) as a combination of \(\ket{+}\) and \(\ket{-}\).
Two views:
1. Apply \(H\), and measure in the standard basis,
2. Measure in the \(\ket{\pm}\) basis (\(X\) basis).
The projection picture on the circle.
Projective measurements.

Multiple Qubits

Like calculating the probability of two independent random bits, we can use the tensor product to represent the state of multiple qubits.

It is now the tensor product of quantum states (vectors of amplitudes instead of vectors of probabilities).

Two-qubit states
- A general two-qubit state is
  \[\begin{equation} \ket{\psi} = \alpha \ket{00} + \beta \ket{01} + \gamma \ket{10} + \delta \ket{11} \end{equation} \]
- It is the superposition of four possibilities.
- Born's rule gives the probability of four outcomes.
- In principle there's no distance limitation between the two qubits.
  
  Let's say the first qubit is in Alice's hand. What happens if Alice measures it?
  
  If Alice obtained \(0\) outcome, then the state is
  \[\begin{equation} \frac{\alpha \ket{00} + \beta \ket{11}}{\sqrt{\abs{\alpha}^2 + \abs{\beta}^2}}. \end{equation} \]
- This is called the Partial Measurement Rule, which follows from the measurement postulate.
  \[\begin{equation*} \begin{split} M_0 & = \ket{0}\bra{0} \otimes I\\ M_1 & = \ket{1}\bra{1} \otimes I \end{split} \end{equation*} \]
Two-qubit gates

Single-qubit gates as a two-qubit gate: \(X \otimes I\) and \(I \otimes X\).

Applying the same single-qubit gate on the two qubits: \(H \otimes H\).

CNOT gate.

More Linear Algebra Facts

We need some facts from linear algebra to take our grasp of quantum mechanics to the next level.

Matrix and Vectors

\(A (\sum_j a_j \ket{v_j}) = \sum_j a_j A \ket{v_j}\).

Matrix entries \(A_{ij} = \bra{i} A \ket{j}\).

Inner Product

Hilbert space: vector space with an inner product.

What is an inner product \((\,\cdot\,, \,\cdot\,)\)?
1. Linear in the second argument,
2. \({(\ket{u}, \ket{v})}^* = (\ket{v}, \ket{u})\),
3. \((\ket{u}, \ket{u}) \ge 0\) with equality iff \(\ket{u} = 0\).
A concrete example is
\[\begin{equation*} (\ket{u}, \ket{v}) = \sum_{j=1}^d {u_j}^* v_j. \end{equation*} \]
Question: is this the only possibility?

Dirac notation \(\langle u | v \rangle\).
The norm is induced by the inner product (of the Hilbert space).
\[\begin{equation*} \norm{\ket{u}} = \sqrt{\langle u | u \rangle}. \end{equation*} \]
The normalization of \(\ket{u}\) is \(\ket{u} / \norm{\ket{u}}\).
The decomposition of a state into basis
\[\begin{equation*} \ket{\psi} = \sum_j a_j \ket{\lambda_j}. \end{equation*} \]
Then
\[\begin{equation*} a_j = \langle \lambda_j | \psi \rangle. \end{equation*} \]
Orthonormal basis and completeness relation:

If \(\{ \ket{\lambda_i} \}\) form an orthonormal basis, then we have
\[\begin{equation*} \sum_j \ket{\lambda_j} \bra{\lambda_j} = I. \end{equation*} \]
Adjoints of operators

Let \(A\) be a linear operator on a Hilbert space. There exists a unique operator \(A^\dagger\) such that for all \(\ket{v}, \ket{w}\),
\[\begin{equation*} (\ket{v}, A \ket{w}) = (A^\dagger \ket{v}, \ket{w}). \end{equation*} \]
In the matrix form, \(A^\dagger\) is the conjugate transpose of \(A\).

Eigenvalues and Eigenvectors

\(A \ket{\lambda} = \lambda \ket{\lambda}\).

Matrix \(A\) is diagonalizable if \(A = \sum_j \lambda_j \ket{\lambda_j} \bra{\lambda_j}\) where \(\ket{\lambda_j}\) form an orthonormal set of eigenvectors of \(A\).

Example: What is the eigenvalues and eigenvectors of \(X\)?

Spectral decomposition theorem

Theorem. Any normal operator \(A\) on space \(V\) is diagonalizable with respect to some orthonormal basis of \(V\), and vice versa.

Families of diagonalizable matrices

Matrix	Condition	Eigenvalues
Normal	\(A\) and \(A^\dagger\) commute	Complex
Hermitian	\(H = H^\dagger\)	Real
Unitary	\(U^\dagger U = I\)	Phase
Projection	\(P^2 = P, P = P^\dagger\)	\(\{0,1\}\)
Reflection	\(R^2 = I, R = R^\dagger\)	\(\{\pm 1\}\)

Tensor Products

\(A \otimes B\) is a block matrix whose \(i,j\)-th block is \(a_{i,j} B\).
\[\begin{equation*} X \otimes Y = \begin{bmatrix} 0 & 0 & 0 & -i\\ 0 & 0 & i & 0\\ 0 & -i & 0 & 0\\ i & 0 & 0 & 0 \end{bmatrix} \end{equation*} \]

Matrix Functions

For \(A = \sum_a a \ket{a}\bra{a}\), define \(f(A) = \sum_a f(a) \ket{a}\bra{a}\).

Example: \(\exp(A)\).

Four Postulates

State

The state of a closed quantum system can be described by a unit vector in a Hilbert space (called the state space).

Closed: no observer, no leak of information to the environment!

If a "particle" can be in one of the \(d\) possibilities from a finite set \(\Gamma\), the state space is \(\mathcal{H} = \complex^\Gamma\). That is, to each possibility in \(\Gamma\), a complex number (the amplitude) is assigned.
\[\begin{equation*} \ket{\psi} = \begin{bmatrix} \psi_1 \\ \psi_2 \\ \vdots \\ \psi_d \end{bmatrix} \end{equation*} \]
such that \(\norm{\ket{\psi}} = \abs{\psi_1}^2 + \cdots + \abs{\psi_d}^2 = 1\).
Evolution

The discrete time evolution of the closed system can be described by a unitary operator. If the state at time \(t_0\) is \(\ket{\psi_0}\), the state at time \(t_1\) can be obtained as
\[\begin{equation*} \ket{\psi_1} = U \ket{\psi_0}. \end{equation*} \]

It also follows from the Schrödinger equation
\[\begin{equation*} i \hbar \frac{\partial}{\partial t} \ket{\psi(t)} = H \ket{\psi(t)}. \end{equation*} \]
Take \(\hbar = 1\), the evolution from \(t_0\) to \(t_1\) is
\[\begin{equation*} U = \exp \bigl(-i H (t_1 - t_0) \bigr). \end{equation*} \]
Measurement
Composite system

The state of a composite system whose components have states \(\ket{\psi_1}, \ket{\psi_2}, \ldots, \ket{\psi_m}\) is the tensor product state
\[\begin{equation*} \ket{\psi_1} \otimes \ket{\psi_2} \otimes \cdots \otimes \ket{\psi_m}. \end{equation*} \]

Measurement

The measurement postulate

Quantum measurements are described by a collection \(\{M_m\}\) of measurement operators. These are operators acting on the system being measured and must satisfy
\[\begin{equation*} \sum_m M_m^\dagger M_m = I. \end{equation*} \]
The index \(m\) refers to the measurement outcomes that may occur.

(Compare the \(U^\dagger U = I\) condition)

If the state of the system is \(\ket{\psi}\) immediately before the measurement then the probability that result \(m\) occurs is given by
\[\begin{equation*} p(m) = \bra{\psi} M_m^\dagger M_m \ket{\psi}. \end{equation*} \]
The state after the measurement is
\[\begin{equation*} M_m \ket{\psi} \big/ \sqrt{p(m)}. \end{equation*} \]
Why this general form?

It is essentially Born's rule when we use ancillary systems and apply a joint unitary operation.

It is known as the Naimark theorem.

Hint: Consider a unitary
\[\begin{equation*} U = \begin{pmatrix} M_1 & \cdots & \cdots\\ \vdots & \ddots & \vdots\\ M_m & \cdots & \cdots \end{pmatrix} \end{equation*} \]
Special case: projective measurement if \(M_m = \ket{\psi_m}\bra{\psi_m}\) and this will recover the Born's rule.

Observable: an operator that models the observation of a physical quantity.

For any Hermitian operator \(O\), the spectral decomposition theorem tells us that it has the form
\[\begin{equation*} O = \sum_m m P_m, \end{equation*} \]
where \(m\) is from a finite subset of \(\real\).
\[\begin{equation*} \E (O) = \sum_m m\, p(m) = \bra{\psi} O \ket{\psi} = \langle O \rangle. \end{equation*} \]
So the average value \(\langle O \rangle\) on state \(\ket{\psi}\) is the expectation of outcome value \(m\).
Measurement collapses the state.

The three polarizer "paradox".

General Properties of Gates and Measurements

Gates are:
1. Invertible
  
  This means that unitary gates never erase information. Compare this with classical gates!
2. Deterministic
  
  Nothing is random in the application of a quantum gate.
3. Continuous
  
  There is a unitary operation that represents the evolution of a smaller period of time.
  
  Explains the necessity of using complex numbers.
  
  What is the square root of the \(Z\) gate?
Measurements are:
1. Irreversible
  
  We cannot go back to the state before measurement in general!
  
  The superposition state (wave function) collapses.
2. Probabilistic
  
  Born's rule.
3. Discontinuous
  
  The collapse of the wave function in measurement is considered instantaneous!
How does quantum mechanics reconcile these two conflicting operations? Born's rule!

Gates preserve the 2-norm, and measurements give the probabilities determined by the 2-norm.

Reading II

If you want to think more about superposition, the lecture by Allan Adams on superposition is highly recommended.

Lecture 3: Single Qubit Revisited

Recap

Quantum Postulates

States
Evolution
Measurement
Composition

Measurement

Quantum measurements are described by measurement operators \(\{M_m\}\) satisfying

\[\begin{equation*} \sum_m M_m^\dagger M_m = I. \end{equation*} \]

The probability that \(m\) occurs is \(\bra{\psi} M_m^\dagger M_m \ket{\psi}\).
The post-measurement state given outcome \(m\) is proportional to \(M_m \ket{\psi}\).

An observable is a Hermitian operator whose eigenprojectors form a set of measurement operators indexed by the eigenvalues.

\(Z\) is an observable, which corresponds to the computational measurement \(\{\ket{a}\bra{a}\}\) with outcomes \((-1)^a\).

Global and Relative Phase

The global phase is not observable in quantum mechanics.

Quantum Circuits

How do we process information using the four postulates?

Classical Information Review

Boolean functions, Boolean circuits, and Boolean formulas
- A Boolean function is function \(f : \{0,1\}^n \to \{0,1\}\).
  
  There are \(2^{2^n}\) of them!
  - Simple Boolean functions: AND (\(\land\)), OR (\(\lor\)), NOT (\(\neg\)), NAND (\(\uparrow\)).
  - \(n = 1\) and \(n = 2\).
- A Boolean circuit is a model of computation using gates and wires.
  - Each gate has at most two inputs and one output;
  - The fan-out of a gate is unrestricted;
  - Fan-in 0 nodes correspond to input variables;
  - Fan-out 0 node(s) correspond to output(s).
- A Boolean formula is a Boolean circuit where all gates have fan-out at most \(1\).
Functional completeness
- A set of Boolean functions \(f_i : \{0,1\}^{n_i} \to \{0,1\}\) is functionally complete if the functions "generate" every \(f : \{0, 1\}^n \to \{0, 1\}\) for all \(n \ge 1\).
- \(\{\text{AND}, \text{OR}, \text{NOT}\}\) is functionally complete.
- Shannon formula:
  \[\begin{equation*} \begin{split} f(x_1,...,x_n) & = ((\neg x_1) \land f(0,x_2,...,x_n)) \\ & \qquad \lor (x_1 \land f(1,x_2,...,x_n)). \end{split} \end{equation*} \]
- \(\{\text{NAND}\}\) is functionally complete.
- Disjunctive normal form (sum of minterms):
  \[\begin{equation*} \bigvee_{f(a_1,...,a_n)=1} \bigwedge_{i=1}^{n}x_i^{a_i}, \text{ where } x_i^{a_i} = \begin{cases} \phantom{\neg} x_i & \text{if } a_i = 1,\\ \neg x_i & \text{if } a_i=0. \end{cases} \end{equation*} \]

Quantum Circuit Notations

Quantum circuit is our new language for describing quantum information processing.
Reversibility implies that the quantum circuit looks more regular (the same number of input and output qubits).
From left to right: initial state \(\ket{\psi_0}\), the first gate \(U_1\), the second gate \(U_2\), …, the last gate \(U_T\).
Reverse the order when writing them in an equation:
\[\begin{equation*} U_T \cdots U_2 U_1 \ket{\psi_0}. \end{equation*} \]
Multi-qubit gates CNOT, C-U, CCNOT (Toffoli) in circuit diagrams.
What does it mean to apply a single qubit \(U\) to a multi-qubit states?

Examples

Prepare a classical bit string \(x\) from the all-zero string.
\(H \ket{0}\) and measure.
Bell state preparation.

Qubit as a Sphere

Superpositions of 0 and 1

Qubit is the quantum analog of bit and is, therefore, the simplest yet most important concept in quantum information processing.

Some noteworthy examples of states are:

\[\begin{equation*} \begin{split} & \ket{0} = \begin{pmatrix} 1 \\ 0 \end{pmatrix},\quad \ket{1} = \begin{pmatrix} 0 \\ 1 \end{pmatrix},\\ & \ket{+} = \frac{\ket{0} + \ket{1}}{\sqrt{2}} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix},\\ & \ket{-} = \frac{\ket{0} - \ket{1}}{\sqrt{2}} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ -1 \end{pmatrix}. \end{split} \end{equation*} \]

Single-qubit states

Bloch sphere

\[\begin{equation*} \ket{\psi} = \alpha \ket{0} + \beta \ket{1}. \end{equation*} \]
Examples of single-qubit states \(\ket{0}, \ket{1}, \ket{+}, \ket{-}\).
\(\alpha\) and \(\beta\) are complex numbers.
Quantum theory is a \(2\)-norm theory: \(\abs{\alpha}^2 + \abs{\beta}^2 = 1\).
Global phases do not matter and so we can write
\[\begin{equation*} \ket{\psi} = \cos\frac{\theta}{2} \ket{0} + \sin\frac{\theta}{2}\, e^{i\varphi} \ket{1} \end{equation*} \]
Pure single-qubit states are on the Bloch sphere.

Pauli Operators

Before discussing mixed states in the Bloch sphere, we review the Pauli operators.

\[\begin{equation*} \begin{split} & \sigma_x = X = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix},\quad \sigma_y = Y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix},\\ & \sigma_z = Z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \end{split} \end{equation*} \]

Pauli operators are unitary, Hermitian, and traceless.
- What are their eigenvalues and eigenvectors?
- Pauli operators as observables.
Commutation and anti-commutation relations of Pauli operators
\[\begin{align} [\sigma_x, \sigma_y] & = 2 i \sigma_z, & \{\sigma_x, \sigma_y\} & = 0,\\ [\sigma_y, \sigma_z] & = 2 i \sigma_x, & \{\sigma_y, \sigma_z\} & = 0,\\ [\sigma_z, \sigma_x] & = 2 i \sigma_y, & \{\sigma_z, \sigma_x\} & = 0. \end{align} \]

Qubit Observables

For any unit vector \(\vec{n} = (n_x, n_y, n_z) \in \real^3\), define
\[\begin{equation*} \vec{n} \cdot \vec{\sigma} = \sum_{k \in \{x,y, z\}} n_k\, \sigma_k. \end{equation*} \]
It is unitary, Hermitian, and traceless
\[\begin{equation*} (\vec{n} \cdot \vec{\sigma})^2 = I. \end{equation*} \]
For pure state \(\ket{\psi} = \cos\frac{\theta}{2} \ket{0} + \sin\frac{\theta}{2} e^{i\varphi} \ket{1}\) and its coordinate \(\vec{n}_\psi\), we have
\[\begin{equation*} \vec{n}_\psi \cdot \vec{\sigma} = \begin{pmatrix} \cos\theta & \sin\theta\, e^{-i\varphi}\\ \sin\theta\, e^{i\varphi} & -\cos\theta \end{pmatrix}. \end{equation*} \]
What are the eigenvectors of the above operator?

The coordinate of \(\ket{\psi}\) in \(\real^3\) is

\[\begin{equation*} (n_x, n_y, n_z) = (\sin\theta \cos\varphi, \sin\theta \sin\varphi, \cos\theta). \end{equation*} \]

It would be more natural first to define the \(\vec{n} \cdot \vec{\sigma}\) operator for any coordinate and define the pure state as its eigenvalue-\(1\) eigenstate.

It explains the \(\frac{\theta}{2}\) used to define the state \(\ket{\psi}\).

Rotate!

Pauli Actions on the Bloch Sphere

Pauli \(X\), \(Y\), \(Z\) rotate the Bloch sphere by angle \(\pi\) along the \(x\), \(y\), \(z\) axis, respectively.
The example of the \(Z\) action:
\[\begin{equation*} \cos\frac{\theta}{2} \ket{0} + \sin\frac{\theta}{2} e^{i \varphi} \ket{1} \\ \mapsto \cos\frac{\theta}{2} \ket{0} - \sin\frac{\theta}{2} e^{i \varphi} \ket{1}. \end{equation*} \]
How about \(\vec{n} \cdot \vec{\sigma}\)?
We do quantum information processing simply by rotating the qubits in our system!

More Examples

Hadamard \(H\), phase gate \(S\), \(\pi/8\) gate \(T\)
\[\begin{equation*} \begin{split} & H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix},\quad S = \begin{pmatrix} 1 & 0\\ 0 & i \end{pmatrix},\\ & T = \begin{pmatrix} 1 & 0 \\ 0 & e^{\pi i / 4} \end{pmatrix}. \end{split} \end{equation*} \]
Rotations through \(a = x, y, z\) axis
\[\begin{equation*} R_a(\phi) = \exp\Bigl( - \frac{\phi}{2} i \sigma_a \Bigr) = \cos\frac{\phi}{2} I - i\sin\frac{\phi}{2} \sigma_a. \end{equation*} \]
In particular, \(R_z(\phi) = \begin{pmatrix} e^{-\frac{\phi}{2} i} & 0 \\ 0 & e^{\frac{\phi}{2} i} \end{pmatrix}\), and \(T\) is \(R_z(\pi / 4)\) up to a global phase.

Simple Circuit Identities

\(H^2 = I, T^2 = S, T^4 = Z, T^8 = I\),
\(XYX = -Y, HXH = Z, HYH = -Y, HZH = X\),
\(X R_x(\phi) X = R_x(\phi)\),
\(X R_y(\phi) X = R_y(-\phi)\),
\(X R_z(\phi) X = R_z(-\phi)\),
\(H = \frac{X + Z}{\sqrt{2}}\),
\(HSHZSHZHZHZST = THSHZSHZHZHZS\).

Arbitrary Single-Qubit Unitary Operation

Rotations along \(\vec{n}\)
\[\begin{equation*} \begin{split} R_{\vec{n}}(\phi) & = \exp\bigl( - \frac{\phi}{2} i\, \vec{n} \cdot \vec{\sigma}\bigr)\\ & = \cos\frac{\phi}{2} I - i \sin\frac{\phi}{2} \vec{n} \cdot \vec{\sigma}. \end{split} \end{equation*} \]
Any single-qubit unitary operator \(U\) can be written as
\[\begin{equation*} U = e^{i\alpha} R_{\vec{n}}(\phi), \end{equation*} \]
for some \(\alpha\), \(\phi\), and \(\vec{n}\).
(Z-Y decomposition) Any single-qubit unitary \(U\) has the form
\[\begin{equation*} U = e^{i\alpha} R_z(\beta) R_y(\gamma) R_z(\delta). \end{equation*} \]

Collapse!

Quantum Measurements on a Qubit

The computational (standard) basis measurement (\(Z\) measurement)

Is the state \(\ket{0}\) or \(\ket{1}\)?
(Born's rule) When we measure \(\ket{\psi} = \alpha \ket{0} + \beta \ket{1}\):
- With probability \(\abs{\alpha}^2\), the measurement outcome is \(0\) and the state collapse to \(\ket{0}\);
- With probability \(\abs{\beta}^2\), the measurement outcome is \(1\) and the state collapse to \(\ket{1}\).

Measurement in the Hadamard Basis

Hadamard basis measurement (\(X\) measurement):

Is the state \(\ket{+}\) or \(\ket{-}\)?
When we measure \(\ket{\psi} = \alpha \ket{0} + \beta \ket{1}\):
- With probability \(\abs{\alpha + \beta}^2/2\), the outcome is \(+\) and the state collapse to \(\ket{+}\);
- With probability \(\abs{\alpha - \beta}^2/2\), the outcome is \(-\) and the state collapse to \(\ket{-}\).
Measurement operators
\[\begin{equation*} M_{+} = \frac{I + X}{2} = \ket{+}\bra{+},\quad M_{-} = \frac{I - X}{2} = \ket{-}\bra{-}. \end{equation*} \]
\(X\) as an observable.
Why is it called the Hadamard basis measurement?

General Projective Qubit Measurement

A projective measurement in the direction of \(\vec{n}\)

Note that \(\vec{n} \cdot \vec{\sigma}\) is an observable.
Measurement operators
\[\begin{equation*} M_b = \frac{I + (-1)^b\, \vec{n} \cdot \vec{\sigma}}{2}, \text{ for } b = 0, 1. \end{equation*} \]
Check that \(M_b\) is indeed projective and \(M_0 + M_1 = I\).

Detect Elitzur-Vaidman Bomb

Someone sends you a box, which is either a dud (empty box) or a bomb attached to a quantum device that will trigger the explosion if it measures a \(\ket{1}\) on the photon sent in. Your task is to detect if there is a bomb inside.

Classical Method

Not very interesting

Send in \(\ket{+}\), Measure Output in \(\ket{\pm}\)

Dud: No explosion, measures \(\ket{+}\).

Bomb: 50% explosion, 25% measures \(\ket{+}\), 25% measures \(\ket{-}\).

Elitzur-Vaidman Protocol:

a. Start with \(\ket{0}\), rotate \(\epsilon\) angle, and send it in.

Use \(R_y(\epsilon)\) to implement the \(\epsilon\) rotation.

b. Repeat \(N = O(1/\epsilon)\) times without measuring. c. Measure in the standard basis.

Dud: No explosion, measures \(\ket{1}\) if \(N = \pi/\epsilon\).

Bomb: \(O(\epsilon)\) probability of explosion, measures \(\ket{0}\).

\[\begin{equation*} \Pr(\text{explodes}) \le N \sin^2 \Bigl( \frac{\epsilon}{2} \Bigr) = O(\epsilon). \end{equation*} \]

This is known as the quantum Zeno effect.

Lecture 4: Entanglement and Mixed States

Recap

Bloch sphere
Rotate, collapse, bomb!

Entangle!

Entanglement is yet another peculiar quantum phenomenon along with superposition and measurement.

Two-Qubit States

Superpositions of 00, 01, 10, 11
\[\begin{equation*} \ket{\psi_{AB}} = \alpha_{00} \ket{00} + \alpha_{01} \ket{01} + \alpha_{10} \ket{10} + \alpha_{11} \ket{11}. \end{equation*} \]
Tensor products \(\ket{ab} = \ket{a} \otimes \ket{b}, \text{ for } a, b \in \{0, 1\}\).
Tensor product of two matrices
Vector representation \((\alpha_{ij})\)
Normalization condition \(\sum_{a,b} \abs{\alpha_{ab}}^2 = 1\).

Two-Qubit Gates

CNOT, SWAP, Controlled-\(U\), …
\[\begin{equation*} \begin{split} \text{CNOT} :\; & \ket{a, b} \mapsto \ket{a, a\oplus b}, \\ \text{SWAP} :\; & \ket{a, b} \mapsto \ket{b, a}, \\ \text{Controlled-}U :\; & \ket{a, \psi} \mapsto \ket{a} \otimes U^a \ket{\psi}. \end{split} \end{equation*} \]
Circuit identities
Control vs. target
CNOT and single-qubit gates are all you need for universal quantum computing.

The Characteristic Trait

…, then they can no longer be described in the same way as before, viz. by endowing each of them with a representative of its own. I would not call that one but rather the characteristic trait of quantum mechanics, the one that enforces its entire departure from classical lines of thought. By the interaction the two representatives [the quantum states] have become entangled.

Bell States

Four Bell states
\[\begin{alignat*}{3} & \ket{\beta_{00}} & = \frac{\ket{00} + \ket{11}}{\sqrt{2}},\quad && \ket{\beta_{01}} & = \frac{\ket{01} + \ket{10}}{\sqrt{2}},\\ & \ket{\beta_{10}} & = \frac{\ket{00} - \ket{11}}{\sqrt{2}},\quad && \ket{\beta_{11}} & = \frac{\ket{01} - \ket{10}}{\sqrt{2}}. \end{alignat*} \]
The first index determines the sign; the second determines the parity.
Prepare the Bell states
The Bell states are also known as the EPR states, and we often use \(\ket{\text{EPR}}\) to denote \(\ket{\beta_{00}}\).
Singlet state \(\ket{\beta_{11}}\) is an anti-symmetric state.
The other three states are all symmetric and span the symmetric subspace of the two-qubit state space.

Entanglement

It is easy to verify that this state is not a product of single-qubit states!

A picture of entanglement.

Basic rules of quantum mechanics imply the existence of EPR, arguably what troubled Einstein the most in quantum mechanics! It is what he called the 'spooky action at a distance'.

EPR Paradox

Consider two players Alice and Bob sharing an EPR state. Alice brings her particle to the moon while Bob stays on earth.

Fact: If Alice measures her particle, she instantaneously knows whether Bob's state is \(\ket{0}\) or \(\ket{1}\) (which Bob can confirm if he performs the standard basis measurement).
Is it a big deal?

It is not very different from a pair of two perfectly correlated classical bits.

A pair of gloves in two boxes.
EPR considers other possibilities: Alice can measure in a different basis.

There is another way to open the box.

We have measured \(Z\) basis, what about \(X\) measurement? That is, what happens if Alice measures \(\ket{\pm}\)?
1. Method I. \(M_0 = \ket{+}\bra{+} \otimes I\) and \(M_1 = \ket{-}\bra{-} \otimes I\).
2. Method II. Write EPR state
  \[\begin{equation*} \ket{\text{EPR}} = \frac{\ket{{+}{+}} + \ket{{-}{-}}}{\sqrt{2}}. \end{equation*} \]
The conclusion is that Alice will instantaneously know whether Bob's post-measurement state is \(\ket{+}\) or \(\ket{-}\).

Superluminal Communication? No!

Consider two situations. In the first, Alice measures in the standard basis, she will obtain \(a=0,1\) with equal probability and Bob's qubit collapses to \(\ket{a}\). In the second, she measures in the Hadamard basis, she will obtain \(a={+},{-}\), with equal probability and Bob's qubit collapses to \(\ket{a}\).

It looks as if Alice's measurement choice affects Bob's final state! Would it allow Alice to send information instantaneously to Bob?

What are the states of Bob in the two situations?

Bob's view and Alice's view.

The mixed state framework will help us to answer the question and see why this won't allow superluminal communication.

Mixed States

Ensembles

In the first situation, Bob's state is in an equal distribution over \(\ket{0}, \ket{1}\). In the second, it is an equal distribution over \(\ket{+}\) and \(\ket{-}\). Can we distinguish these two cases in quantum mechanics?

Let's generalize a little bit and consider a distribution of pure states described by an ensemble \(\{(p_i, \ket{\psi_i})\}\). It means that our knowledge of the quantum system says that it is in state \(\ket{\psi_i}\) with probability \(p_i\).

Consider a quantum measurement \(\{M_0, M_1\}\) measuring the state ensemble. The probability that \(a\) occurs is

\[\begin{equation*} \Pr(a) = \sum_i p_i \bra{\psi_i} M_a^\dagger M_a \ket{\psi_i}. \end{equation*} \]

We can rewrite it using cyclic property of the trace as

\[\begin{equation*} \Pr(a) = \tr \Bigl(\Bigl(\sum_i p_i \ket{\psi_i}\bra{\psi_i} \Bigr) \, M_a^\dagger M_a \Bigr). \end{equation*} \]

We call

\[\begin{equation*} \rho = \sum_i p_i \ket{\psi_i}\bra{\psi_i} \end{equation*} \]

the density matrix of the ensemble. The probability is determined by the density matrix, so it makes sense to use the density matrix to describe the state of the system.

Bob's Mixed State

We now go back to the EPR paradox. When Alice measures \(\ket{0}, \ket{1}\), Bob's state is also \(\ket{0}\) or \(\ket{1}\), which Alice knows for sure as it will be the same as her measurement outcome. But Bob does not know the measurement outcome, so before Alice send the outcome to him, his knowledge of the system is that it is \(\ket{0}\) or \(\ket{1}\) with equal probability. The density matrix is

\[\begin{equation*} \rho = \frac{1}{2} (\ket{0}\bra{0} + \ket{1}\bra{1}) = \frac{I}{2}. \end{equation*} \]

If Alice measured \(\ket{+}, \ket{-}\). Bob's qubit is \(\ket{+}\) or \(\ket{-}\) with equal probability. Hence, the density matrix is

\[\begin{equation*} \rho' = \frac{1}{2} (\ket{+}\bra{+} + \ket{-}\bra{-}) = \frac{I}{2}. \end{equation*} \]

So even though the two ensembles look very different, the density matrices in the two situations are the same. It means that no measurement on Bob's system can tell the two cases apart. This makes perfect sense as the state of Bob should not change when Bob has done nothing to his system.

It explains why density matrix is more fundamental then ensemble of quantum states.

No superluminal communication is possible using the strategy above!

Density Matrix

Mathematically, a density matrix is positive semidefinite and of trace \(1\) (iff there is a corresponding ensemble). It is a description of the state of a quantum system that is more general than state vectors for pure states.

Any pure state \(\ket{\psi}\) has a density matrix representation \(\rho = \ket{\psi}\bra{\psi}\).

Example: The density matrix for \(\ket{+}\) is

\[\begin{equation*} \ket{+}\bra{+} = \frac{1}{2} \begin{bmatrix} 1 & 1\\ 1 & 1 \end{bmatrix}. \end{equation*} \]

The spectral decomposition of a density matrix

\[\begin{equation*} \rho = \sum_j \lambda_j \ket{\psi_j}\bra{\psi_j} \end{equation*} \]

gives a possible ensemble \(\{(\lambda_j, \ket{\psi_j})\}\) for the density matrix.

Probability can be considered a special case where the density matrix is diagonal.

Quantum Postulates (Mixed States)

The state is now a density matrix.
Evolution: \(\rho \mapsto U\rho U^\dagger\).
Measurement: \(\Pr(m) = \tr \bigl( M_m^\dagger M_m \rho \bigr)\) and the post-measurement state is proportional to
\[\begin{equation*} M_m \rho M_m^\dagger. \end{equation*} \]
Tensor: \(\rho = \rho_1 \otimes \cdots \otimes \rho_n\).

Mixed States of a Qubit

Expansion in the Pauli Operators

Define an inner product on the \(2\) by \(2\) matrix space
\[\begin{equation*} (A, B) = \frac{\tr(A^\dagger B)}{2}. \end{equation*} \]
Together with \(I\), Pauli matrices form an orthonormal basis!
For all \(\rho\), we can write
\[\begin{equation*} \rho = \frac{I + \vec{n} \cdot \vec{\sigma}}{2}, \text{ where } n_a = \tr(\rho \sigma_a). \end{equation*} \]
The positive semidefinite condition is equivalent to \(\norm{\vec{n}} \le 1\).

Mixed States on the Bloch sphere

The Bloch sphere has all valid single-qubit states.
What are the states of the north pole, the origin, and \((1, 0, 0)\)?
From coordinate \(\vec{n}\), we can compute the state as
\[\begin{equation*} \rho = \frac{I + \vec{n} \cdot \vec{\sigma}}{2}. \end{equation*} \]
Conversely, given the density matrix \(\rho\), the expansion in the Pauli operators basis gives a vector \(\vec{n} = (n_x, n_y, n_z)\)
\[\begin{equation*} n_a = \tr(\rho \sigma_a), \text{ for } a = x, y, z. \end{equation*} \]
Classical probability over a bit can be represented as diagonal density matrices corresponding to the line segment between the north and south poles.
Summarize: Things that can be identified with point \(\vec{n}\) on the sphere.

Measurement of Mixed Qubit State

When we measure \(\rho = (I + \vec{n} \cdot \vec{\sigma})/2\):
- With probability \(\tr(\rho\, \ket{0}\bra{0}) = (1 + n_z) / 2\), the measurement outcome is \(0\) and the state collapse to \(\ket{0}\bra{0}\);
- With probability \(\tr(\rho\, \ket{1}\bra{1}) = (1 - n_z) / 2\), the measurement outcome is \(1\) and the state collapse to \(\ket{1}\bra{1}\).

Partial Trace

Partial Trace and Purification

What is the state of \(A\) given the state of \(AB\)? We cannot answer the question in the pure state framework, and we can now with density matrices.

Partial Trace

Let \(\rho_{AB}\) the joint state of \(A\) and \(B\) systems. What is the state of \(B\) only? Is there a density matrix associated with system \(B\) given that we know the state of both \(A\) and \(B\)?

We need an effective representation of the state on \(B\) so that measurement only depends on this reduced state (not the joint state).

Let \(M\) be a measurement operator; the probability its outcome occurs is

\[\begin{equation*} \tr \bigl( \rho_{AB} (I_A \otimes M^\dagger M) \bigr), \end{equation*} \]

which is a linear functional of \(M^\dagger M\). So there is a \(\rho_B\) such that

\[\begin{equation*} \tr \bigl( \rho_{AB} (I_A \otimes M^\dagger M) \bigr) = \tr (\rho_B\, M^\dagger M). \end{equation*} \]

The partial trace is defined by

\[\begin{equation*} \tr_A : \ket{i,j}\bra{k,l}_{A,B} = \tr(\ket{i}\bra{k}_A) \ket{j}\bra{l}_B. \end{equation*} \]

Extend by linearity.

\[\begin{equation*} \tr \bigl( \rho_{AB} (I_A \otimes M^\dagger M) \bigr) = \tr \bigl( \tr_A(\rho_{AB}) M^\dagger M \bigr). \end{equation*} \]

Reduced Density Matrix

The reduced density matrix \(\rho_B = \tr_A (\rho_{AB})\).

What is the partial trace of the EPR state?

The reduced density matrices of \(\rho_{AB} = \rho_A \otimes \rho_B\) is \(\rho_A\) and \(\rho_B\). The converse is usually not true!

Purification

For any mixed state \(\rho_B\), there is a system \(A\) of the same dimension as \(B\) and a pure state \(\ket{\psi}_{AB}\) such that

\[\begin{equation*} \tr_A \bigl( \ket{\psi}\bra{\psi}_{AB} \bigr) = \rho_B. \end{equation*} \]

Use the spectral decomposition theorem again!

Example: The EPR state is a purification of \(I/2\).

Mixed states arise when we lose control of a system!

Lecture 5: Quantum Information

Review of Quantum Information Basics

Homework I
1. Problem 3 about computing
  \[\begin{equation*} \bra{x} H^{\otimes n} \ket{y}. \end{equation*} \]
2. Problem 4
Spectral decomposition

Consider a special case where \(A\) is Hermitian.

Let \(\lambda_1\) be an eigenvalue of \(A\) and \(\ket{e_1}\) a corresponding eigenvalue.
\[\begin{equation*} \begin{split} & \lambda_1 (\ket{e_1}, \ket{e_1}) = (\ket{e_1}, A\ket{e_1}) \\ = \, & (A\ket{e_1}, \ket{e_1}) = \lambda_1^* (\ket{e_1}, \ket{e_1}). \end{split} \end{equation*} \]
This means that \(\lambda_1\) is real.

Consider subspace \(K\) spanned by all vectors orthogonal to \(\ket{e_1}\).

We prove that \(K\) is an invariant subspace of \(A\). Suppose \(\ket{\psi} \in K\), we have \(\bra{e_1} A \ket{\psi} = \lambda_1 \langle e_1 | \psi \rangle = 0\). Use induction to complete the proof.

In standard textbooks, it says that \(A = UDU^\dagger\). Using Dirac notation, we write it as
\[\begin{equation*} A = U \bigl( \sum_j \lambda_j \ket{j}\bra{j} \bigr) U^\dagger. \end{equation*} \]
In this form, \(\ket{e_j} = U\ket{j}\) will be the eigenvectors of \(A\), and we write
\[\begin{equation*} A = \sum_j \lambda_j \ket{e_j}\bra{e_j}, \end{equation*} \]
and \(\ket{e_j}\)'s are orthonormal.
Measurement
1. Projective measurements
  
  Measurement operators are projectors such as \(\ket{0}\bra{0}\) and \(\ket{1}\bra{1}\).
  
  Note that \(\ket{\psi}\bra{\psi}\) is the projector to the span of \(\ket{\psi}\).
  
  A set of projectors \(\{P_j\}\) form a projective measurement if \(\sum_j P_j = I\).
2. General measurement satisfies \(\sum_j M_j^\dagger M_j = I\)
3. When we don't care about the post-measurement state, we consider measurement operators \(N_j = M_j^\dagger M_j \ge 0\).
  
  The completeness condition is \(\sum_j N_j = I\).
  
  The probability that \(j\) occurs is \(\tr(N_j \rho)\).
  
  They are called positive-operator valued measure (POVM).

Entanglement Revisit

Schmidt Decomposition

(Schmidt Decomposition). For a bipartite pure quantum state \(\ket{\psi_{AB}}\), we can write

\[\begin{equation*} \ket{\psi_{AB}} = \sum_{j=1}^k \sqrt{\lambda_j} \, \ket{\psi_A^{(i)}} \otimes \ket{\psi_B^{(i)}}, \end{equation*} \]

where \(k \le \min(\dim_A, \dim_B)\), \(\lambda_j > 0\), and both \(\{\ket{\psi_A^{(i)}}\}\) and \(\{\ket{\psi_B^{(i)}}\}\) are orthonormal.

The meaning of the theorem says that, up to a change of local basis, an entangled state can be written as
\[\begin{equation*} \sum_j \sqrt{\lambda_j} \ket{j}_A \otimes \ket{j}_B. \end{equation*} \]
Hint: Consider the singular value decomposition of matrix \((\psi_{j,k})\).
EPR as a mirror!

For any matrix \(U\):
\[\begin{equation*} (U \otimes I ) \sum_j \ket{j,j} = (I \otimes U^T) \sum_j \ket{j,j} = \sum_{i,j} U_{i,j} \ket{i,j} \end{equation*} \]
\(\lambda_j\)'s are called the Schmidt coefficients
What are the Schmidt coefficients of the Bell states?

How about product states?
Thanks to this theorem, the theory of pure bipartite entanglement is well established.

Nielsen's Theorem

Local operation and classical communication (LOCC)
Nielsen's Theorem

Let \(\ket{\psi}\) and \(\ket{\phi}\) be two entangled states on systems \(A,B\). Their Schmidt coefficients are \(\lambda = (\lambda_j)\) and \(\gamma = (\gamma_j)\) respectively.

Theorem (Nielsen). There is an LOCC transforming \(\ket{\psi}\) to \(\ket{\phi}\) if and only if the majorization condition \(\lambda \prec \gamma\) holds.

The condition \(\lambda \prec \gamma\) is that for all \(k\)
\[\begin{equation*} \sum_{j=1}^k \lambda_j^{\downarrow} \le \sum_{j=1}^k \gamma_j^{\downarrow}. \end{equation*} \]
A simple corollary is that there is an LOCC transforming an EPR pair to an arbitrary entangled pair of qubits.

EPR: Maximally entangled states.

Multiple-Qubit Entanglement

Superpositions of 000, 001, 010, \(\ldots\), 111
GHZ state
\[\begin{equation*} \ket{\text{GHZ}} = \frac{\ket{000} + \ket{111}}{\sqrt{2}}. \end{equation*} \]
W state
\[\begin{equation*} \ket{\text{W}} = \frac{\ket{001} + \ket{010} + \ket{100}}{\sqrt{3}}. \end{equation*} \]

Entanglement in Mixed States?

Entanglement in Use

Entanglement is a resource for many quantum information processing tasks

Now utilized for good at a distance of at least 1200km (Mozi satellite).

Quantum entanglement—physics at its strangest—has moved out of this world and into space. In a study that shows China's growing mastery of both the quantum world and space science, a team of physicists reports that it sent eerily intertwined quantum particles from a satellite to ground stations separated by 1200 kilometers, smashing the previous world record. (Science, 15 JUN 2017)

What is Information?

A qubit is a unit of quantum information storing the quantum state of a two-level physical system.

Quantum information is more general than classical information. We can store a bit in a qubit. But to store a qubit, it may require an infinite number of bits.

Distingish (Mixed) Quantum States

Consider an encoding \(a \mapsto \rho_a\) for \(a=0, 1\).

Promise: The system is in one of the two states \(\rho_0\) and \(\rho_1\),
Problem: Decide which is the case.
Measurement is the only bridge between the quantum and classical worlds. So we use a measurement with measurement operators \(M_0\) and \(M_1\)
\[\begin{equation*} \begin{split} \text{Minimize: } & \Tr(M_0 \rho_1) + \Tr(M_1 \rho_0)\\ \text{Subject to: } & M_0 \ge 0, M_1 \ge 0, M_0 + M_1 = I. \end{split} \end{equation*} \]
The optimal \(M_0\) minimizes \(\Tr(M_0 (\rho_1 - \rho_0))\)
In the qubit case, we write \(\rho_b = (I + \vec{v}_b \cdot \vec{\sigma}) / 2\), and \(\rho_0 - \rho_1 = \vec{v} \cdot \vec{\sigma}\) for \(\vec{v} = (\vec{v}_0 - \vec{v}_1)/2\). The optimal \(M_0\) should be
\[\begin{equation*} \frac{1}{2} \left( I + \frac{\vec{v}}{\lVert \vec{v} \rVert} \cdot \vec{\sigma} \right) \end{equation*} \]
The minimal error probability is therefore \((2 - \lVert \vec{v}_0 - \vec{v}_1 \rVert )/ 4\).
1. If two states \(\ket{u}\) and \(\ket{v}\) of a qubit are orthogonal, there is a projective measurement to tell them apart.
2. If two states are co-linear, they are indistinguishable.

The amount of information one can store and reliably read from a qubit is only a single bit.

The states \(\ket{0}\) and \(\ket{+}\) are different but not perfectly distinguishable.

Non-orthogonality is the key issue.

What is information? What is quantum information?

We can store the whole Internet in a single qubit, but we cannot reliably read information out.

Holevo Bound

For a quantum state \(\rho\), its von Neumann entropy is \(S(\rho) = -\tr (\rho \log(\rho))\).

The eigenvalues of \(\rho\) form a probability distribution whose Shannon entropy equals the von Neumann entropy \(S(\rho)\).

Mutual information \(H(X{\,:\,}Y) = H(X) + H(Y) - H(XY)\).

Theorem. Let \(\{(p_x, \rho_x)\}\) be an ensemble of states. Alice prepares \(\rho_x\) with probability \(p_x\) and sends it to Bob. Bob measures the state and gets outcome \(y\). Then the mutual information (accessible information) between \(X\) and \(Y\) satisfies

\[\begin{equation*} H(X{\,:\,}Y) \le S \Bigl( \sum_x p_x \rho_x \Bigr) - \sum_x p_x S(\rho_x). \end{equation*} \]

The right-hand side is called Holevo \(\chi\).

We will prove a poor man's version of Holevo's theorem.

Let \(X\) be a uniformly distributed random variable taking values in \([N]\). Consider an encoding of classical information \(x \mapsto \rho_x\) into quantum systems of \(n\) levels. For all measurement (POVM) \(\{M_m\}\), the probability that the measurement correctly guesses \(x\) is

\[\begin{equation*} p_x = \tr(M_x \rho_x) \le \tr(M_x). \end{equation*} \]

In expectation the success probability is therefore

\[\begin{equation*} \E_x p_x \le \frac{1}{N} \sum_x \tr(M_x) = \frac{n}{N}. \end{equation*} \]

Quantum One-Time Pad

Let \(\rho\) be an arbitrary qubit state, say \(\ket{0}\bra{0}\).

The quantum one-time pad works as follows:

Choose two random bits \(a,b\) and,
Apply \(U_{a,b} = X^a Z^b\) to the qubit.

For a player who does not know \(a,b\), the state is

\[\begin{equation*} \frac{1}{4} \sum_{a,b} U_{a,b} \rho U_{a,b}^\dagger = \frac{I}{2}. \end{equation*} \]

No-Cloning Theorem

Unknown quantum states cannot be copied.

There is no unitary transformation \(U\) that can map any \(\ket{\psi} \ket{0}\) to \(\ket{\psi} \ket{\psi}\).

Compute the inner product.

Note: \(\ket{0}\) and \(\ket{1}\) can be cloned using the CNOT gate.

Tomography

State Tomography

Given a large number of i.i.d samples of quantum state \(\rho\), find the density matrix for \(\rho\).
In the single-qubit case, \(I\), Pauli \(X\), \(Y\), \(Z\) form an orthonormal basis, so
\[\begin{equation*} \rho = \frac{\tr(\rho)I + \tr(X \rho) X + \tr(Y \rho) Y + \tr(Z \rho) Z}{2}. \end{equation*} \]
For example, to estimate \(\tr(Z \rho)\), we need to measure \(\rho\) in the \(Z\) basis multiple times, obtaining \(z_1, z_2, \ldots, z_m \in \{\pm 1\}\). Then \((\sum_i z_i) / m\) is an empirical estimate of \(\tr(Z \rho)\).
Note that the standard deviation is \(1/\sqrt{m}\) and so we need to repeat about \(O(1/\eps^2)\) times to have \(\eps\) precision.

State Tomography for Multiple Qubits

Multiple qubit case is similar:
\[\begin{equation*} \rho = \frac{\sum_{P} \tr(\rho P) P}{2^n}, \end{equation*} \]
where the summation goes over \(n\)-qubit Pauli operators.
Check that \(I\) and Pauli operators form a basis also in the multiple qubit case.
But it's a summation of \(4^n\) terms!

In the lab, it would take days to tomography a system of more than 10 qubits.

Theorem. The sample complexity for tomography is \(\tilde{\Omega}(d^2/\eps^2)\), where \(d\) is the dimension and \(\eps\) is the precision in the so called trace distance.
To make things worse, each term needs high precision for the triangle inequality to give meaningful bounds.

There are \(4^n\) small real numbers \(\tr(\rho P)\), and we have to ensure that their error sum is still controlled.
How do we even know the quantum device works as expected when tomography of its state is impractical?

Example: Tomography of the Singlet State

We need to estimate 15 real numbers \(\mu_P = \tr(\ket{\beta_{11}}\bra{\beta_{11}} P)\) for non-identity two-qubit Pauli operators \(P\).

I X Y Z

I 0 0 0

X 0 -1 0 0

Y 0 0 -1 0

Z 0 0 0 -1
1. For \(P = XX, YY, ZZ\), \(\mu_P = -1\),
2. For all other \(P\), \(\mu_P = 0\).
Note that in this simple case, it suffices to check the \(XX, YY, ZZ\) average values in the first case to certify that the state is the singlet state.

The singlet state is the unique state stabilized by \(-XX\) and \(-ZZ\).

	I	X	Y	Z
I		0	0	0
X	0	-1	0	0
Y	0	0	-1	0
Z	0	0	0	-1

Lecture 6: Teleportation

Teleport!

The Setup

We have two players, Alice and Bob.
Alice and Bob share an EPR state.
Alice also has an input qubit given in some unknown state \(\ket{\psi}\).

Classically, there are two methods.
1. Physically transfer
2. Copy and send
She wants to transfer this qubit to Bob using classical communication only.
Even if Alice knows the description of the state \(\ket{\psi}\), for example, maybe she prepared the state using a circuit on her own, the task is still really challenging.

To describe the state, we need to specify two real numbers!
The teleportation protocol needs to transfer two bits from Alice to Bob by consuming the shared EPR state.

The Protocol

We name the qubit in the shared EPR state between Alice and Bob and the input qubit as A, B, and C, respectively.
1. Alice measures the projective measurement defined by the four Bell states on A and C, let \(a, b\) be the outcome bits;
2. Alice sends the classical outcome \(a, b\) to Bob;
3. Bob applies \(Z^a X^b\) to B.
The Bell states form an orthonormal basis for the two-qubit system \(AC\).

How Did the Authors Come Up With This?

The protocol is simple but very hard to find and only discovered in 90's (quantum mechanics was completely established in the 30's).
Teleporting an unknown quantum state via dual classical and Einstein-Podolsky-Rosen channels

[C. Bennett, G. Brassard, C. Crépeau, R. Jozsa, A. Peres and W. Wootters, '93]
The Origins of Quantum Teleportation - Charles Bennett

Bennett on the origins of teleportation (Video)

What is it that they don't have that they need that would enable them to do as well if they were in separate locations than if they were in the same location.

… So then we realized that by sharing entanglement between the two observers and by permitting them to communicate, you could simulate being in the same place.

Analysis

Analysis I: Direct

\[\begin{alignat*}{3} & \ket{\beta_{00}} & = \frac{\ket{00} + \ket{11}}{\sqrt{2}},\quad && \ket{\beta_{01}} & = \frac{\ket{01} + \ket{10}}{\sqrt{2}},\\ & \ket{\beta_{10}} & = \frac{\ket{00} - \ket{11}}{\sqrt{2}},\quad && \ket{\beta_{11}} & = \frac{\ket{01} - \ket{10}}{\sqrt{2}}.\\ \end{alignat*} \]

Before Alice's measurement, the state is
\[\begin{equation*} \ket{\text{Init}} = \frac{1}{\sqrt{2}} {(\alpha\ket{0} + \beta\ket{1})}_C \otimes (\ket{00} + \ket{11})_{AB}. \end{equation*} \]
Expand the state using the Bell basis, we have
\[\begin{equation*} \begin{split} \ket{\text{Init}} \propto \, & \ket{\beta_{00}} \otimes {(\alpha\ket{0}+\beta\ket{1})}_B + \ket{\beta_{01}} \otimes {(\alpha\ket{1}+\beta\ket{0})}_B +\\ & \ket{\beta_{10}} \otimes {(\alpha\ket{0}-\beta\ket{1})}_B + \ket{\beta_{11}} \otimes {(\alpha\ket{1}-\beta\ket{0})}_B. \end{split} \end{equation*} \]

If it is the first you see this, you may feel a bit confused about how we get the above expansion. The key is to use linearity to do this on product state terms and take the sum. Alternatively, you may consider multiplying
\[\begin{equation*} I = \sum_{i,j} \ket{\beta_{ij}}\bra{\beta_{ij}}_{AC} \otimes I_B \end{equation*} \]
If the measurement has outcome \(a, b\), the state on \(B\) after correction is
\[\begin{equation*} \alpha \ket{0} + \beta \ket{1}. \end{equation*} \]

Analysis II: Delayed Measurement (Skip)

Measurements can be delayed to the end.

Proof: Use controlled gates to simulate classical control.
For \(a = 0, 1\),
\[\begin{equation*} \begin{split} & \ket{a} (\ket{00} + \ket{11}) \\ \mapsto\; & \ket{aa0} + \ket{a \bar{a} 1} \\ \mapsto\; & \ket{aaa} + \ket{a\bar{a}a}\\ \mapsto\; & \ket{0aa} + \ket{0\bar{a}a}\\ =\,\; & \sqrt{2} \ket{0} \ket{+} \ket{a}. \end{split} \end{equation*} \]

Analysis III: Tensor (Skip)

Tensors are generalizations of matrices.
- Matrix \(A\) maps a pair of indices \((i,j)\) to a scaler \(A_{i,j}\).
- A rank-k tensor \(T\) maps a \(k\)-tuple \((i_1, i_2, \ldots, i_k)\) to a scaler \(T_{i_1, i_2, \ldots, i_k}\).
Tensor diagrams.
- We use a circle (or other shapes) to represent the tensor and \(k\) open edges connected to it to represent the \(k\) indices.
- The dimension of each index is the number of different values it takes.
Tensor contraction generalizes matrix multiplication.

If two indices have the same dimension, the contraction operation identifies them and sums them over.
\[\begin{equation*} \sum_k A_{ijk} B_{lkm}. \end{equation*} \]
Trace in tensor network!

Analysis III: Tensor (cont.)

States, gates, and measurements are all tensors

Quantum circuits are tensor networks!
Teleportation as a tensor contraction:
- The EPR state is a wire in the tensor picture;
- We can move the tensor node freely along the wires.
Bob gets the one-time pad of Alice input qubit as \(U_{ab} = Z^a X^b\).

Analysis IV: Stabilizer (Skip)

Specifying quantum states using operators:

\[\begin{alignat}{2} & XX & \ket{\text{EPR}} & = \ket{\text{EPR}}\\ & ZZ & \ket{\text{EPR}} & = \ket{\text{EPR}} \end{alignat} \]

\(\ket{\text{EPR}}\) is stabilized by \(XX\) and \(ZZ\).

\(XX\) and \(ZZ\) commute.
\(XX\) and \(ZZ\) as observables:

Measure both qubits in the Hadamard basis (\(X\) measurement), and output the parity.
\[\begin{equation*} \frac{I + XX}{2} = \frac{I+X}{2}\otimes \frac{I+X}{2} + \frac{I-X}{2} \otimes \frac{I-X}{2} \end{equation*} \]
The Bell measurement is the same as measuring the \(XX\) and \(ZZ\) observables.
- We can measure both as they are compatible.
- Back propagation of operators
Teleportation is explained in the stabilizer framework.
- It is enough to make sure \(X\) and \(Z\) measurements are recovered.

Discussions

Understanding Teleportation

What if the input state on C is part of a larger system \(CD\) and the state on \(CD\) is \(\ket{\psi}_{CD}\)?
After the teleportation, the state on \(BD\) is \(\ket{\psi}\). That is, the correlation (entanglement) between \(CD\) is preserved and teleported to \(BD\)!
A special case is when \(\ket{\psi} = \ket{\text{EPR}}\), another EPR pair.

It is known as the Entanglement Swapping protocol.
Would teleportation allow faster-than-light quantum transmission?

Bob has one-time padded quantum information without the two bits.
Would teleportation allow the cloning of quantum information?
Did the Bell measurement learn anything about the input state?

Superdense Coding

A dual problem of teleportation

Alice and Bob share an EPR state;
Alice gets two bits \(a, b\) as input;
Alice applies \(X^a Z^b\) on her half of the EPR state and sends her qubit to Bob;
Bob performs Bell measurement to get \(a, b\).

The four Bell states can be transformed to each other locally.
The four Bell states are orthogonal to each other and are therefore perfectly distinguishable.
A scheme for sharing classical secrets (2 bits) among two players (each holding one qubit).

Question: is it possible to share quantum secrets?

Entanglement as a Resource

Teleportation, superdense coding, and many other protocols use EPR and entanglement as resource, without which the protocols cannot be carried out.
They motivated the study of entanglement as a new type of resource.

How do we quantify the amount of entanglement needed for a certain task?

Mixed State Entanglement (Skip)

A mixed state \(\rho_{AB}\) is said to be separable if there is an ensemble of the form \(\bigl\{ \bigl( p_i, \rho_{A}^{(i)} \otimes \rho_{B}^{(i)} \bigr) \bigr\}_{i=1}^k\) such that
\[\begin{equation*} \rho_{AB} = \sum_{i=1}^k\, p_i\, \rho_{A}^{(i)} \otimes \rho_{B}^{(i)}. \end{equation*} \]
Separable states are preparable using LOCC only
Wide range of applications

[M. Ying, L. Zhou, and Y. Li, Reasoning about Parallel Quantum Programs, '18], [C. Piveteau and D. Sutter, Circuit Knitting with classical communication, '22]

Peres PPT Condition (Skip)

How to check if a mixed state is separable or not?
Partial transpose
\[\begin{equation*} (\cdot)^{T_B}: \ket{ij}\bra{kl} \mapsto \ket{il} \bra{kj}. \end{equation*} \]
(PPT condition). If \(\rho\) is a separable state, then \(\rho^{T_B}\) is positive semidefinite.
\[\begin{equation*} \sum_{i=1}^k p_i\, \rho_A^{(i)} \otimes \rho_B^{(i)} \mapsto \sum_{i=1}^k p_i\, \rho_A^{(i)} \otimes {\bigl( \rho_B^{(i)} \bigr)}^{T_B}. \end{equation*} \]
Check that the \(\ket{\text{EPR}}\) state is not PPT.
The PPT condition is a necessary and sufficient condition for separability only for \(2\) by \(2\) and \(2\) by \(3\) systems.
In general, it is NP-hard even to approximate with \(1/\poly\) error!

[S. Gharibian, Strong NP-Hardness of the Quantum Separability Problem, '08]

Entanglement Distillation and Cost

Distillable entanglement and entanglement cost

\(E_{\text{D}} \le E_{\text{C}}\).
- Distillable entanglement \(E_{\text{D}}(\rho_{AB})\) is the rate at which we can generate the EPR state using LOCC.
- Entanglement cost \(E_{\text{C}}(\rho_{AB})\) is the rate at which EPR can be transformed to the state \(\rho_{AB}\).
If \(\rho_{AB}\) is a pure state,
\[\begin{equation*} E_{\text{D}}(\rho_{AB}) = E_{\text{C}}(\rho_{AB}) = S(\rho_A) = S(\rho_B). \end{equation*} \]
Bound entanglement: entangled states which cannot be distilled exist!

[Horodecki\(^3\), Mixed-state entanglement and distillation, 1998]

Applications and Extensions

Quantum networks

Teleportation empowers entanglement swapping and quantum repeaters.
Gate teleportation

[Aliferis and Leung, Computation by measurements: a unifying picture, 2014]
Port-based teleportation

Lecture 7: Quantum Circuits and Early Quantum Algorithms

Quantum Circuits and Universality

Toffoli and Fredkin

We have seen many quantum gates such as \(H\) and CNOT. We now introduce two three-qubit gates called Toffoli and Fredkin.

Toffoli
\[\begin{equation*} \mathrm{Toffoli}: \ket{a, b, c} \mapsto \ket{a, b, c \oplus a b}. \end{equation*} \]
Fredkin
\[\begin{equation*} \mathrm{Fredkin}: \ket{a, b, c} \mapsto \begin{cases} \ket{a, b, c} & \text{ if } a = 0,\\ \ket{a, c, b} & \text{ if } a = 1. \end{cases} \end{equation*} \]
Both Toffoli and Fredkin are classically universal.

Simulate \(\mathrm{NAND}\) using Toffoli.

Reversible Computation

Reversible classical computation and Landauer's principle

Energy consumption and irreversibility in computation

In the scenario where a computer erases bit of information, it results in the dissipation of energy into its surrounding environment. The minimum amount of energy that is released is proportional to \(k_B T \ln 2\), where \(k_B\) is Boltzmann's constant and \(T\) denotes the temperature of the environment.

Does computation have to consume energy?!

It's possible to perform classical computation reversibly (using Toffoli or Fredkin).
Uncompute technique

Clean the computational garbage!

Universal Gate Sets

A set of quantum gates is universal if all other unitary operators can be generated exactly (or approximately) from that set.
1. All gates are in \(\mathrm{SU}(d)\)
2. For any \(U \in \mathrm{SU}(d)\), there is product of elements in the set that approximates \(U\).
Quantum computers are digital.

Approximation in the sense of the following distance
\[\begin{equation*} d(U, V) = \norm{U - V} = \sup_{\ket{\psi}} \norm{(U-V) \ket{\psi}}. \end{equation*} \]
Example:
1. CNOT and single-qubit gates (exact);
2. CNOT, H, T (approximate);
3. Toffoli, H (weakly universal).
  
  Complex numbers are essential for quantum mechanics but not for quantum computing.
  
  With an extra qubit, it is possible to map a circuit \(U\) with complex entries to a circuit \(V\) with real entries.
  
  Trick: \(a + bi \mapsto \begin{bmatrix}a & b \\ -b & a\end{bmatrix}\).

Solovay-Kitaev Theorem

According to Wikipedia, it was first announced by Robert M. Solovay in 1995 and independently proven by Alexei Kitaev in 1997.

Let \(G\) be a universal gate set for \(\mathrm{SU}(d)\). Then there is a constant \(c\) such that any unitary in \(\mathrm{SU}(d)\) can be approximated to error \(\eps\) using a product of elements in \(G\) of length \(\log^c(1/\eps)\).

Lemma. Let \([U,V] = UVU^{-1}V^{-1}\) be the group commutator of \(U, V\) and let \(S_\eps\) be \(\{ U \in \mathrm{SU}(2) : \norm{I - U} \le \eps \}\). If \(\Gamma\) is an \(\eps^2\)-net for \(S_\eps\), then \([\Gamma, \Gamma] = \{ [U, V] \mid U, V \in \Gamma \}\) is an \(\eps^3\)-net for \(S_{\eps^2}\).

Note that the lemma is not easy to use iteratively as the scaling of \(\eps\)-net and the size of the neighborhood of the identity do not match the initial condition.
The closedness condition under inverse is not needed anymore.

Subadditivity of Errors

Suppose \(U_i, V_i\) are unitary operators satisfying \(\norm{U_i - V_i} \le \eps\) for all \(i = 1, 2, \ldots, t\). Then

\[\begin{equation*} \norm{U_t \cdots U_2 U_1 - V_t \cdots V_2 V_1} \le t \eps. \end{equation*} \]

Proof: By the triangle inequality and unitary invariance of the operator norm,

\[\begin{equation*} \begin{split} & \norm{U_t \cdots U_2 U_1 - V_t \cdots V_2 V_1}\\ =\; & \norm{U_t \cdots U_2 U_1 - U_t V_{t-1} \cdots V_1 + U_t V_{t-1} \cdots V_1 - V_t \cdots V_2 V_1} \\ \le\; & \norm{U_t \cdots U_2 U_1 - U_t V_{t-1} \cdots V_1} + \norm{U_t V_{t-1} \cdots V_1 - V_t \cdots V_2 V_1} \\ =\; & \norm{U_{t-1}\cdots U_2 U_1 - V_{t-1} \cdots V_2 V_1} + \norm{U_t - V_t } \\ \le\; & t \eps. \end{split} \end{equation*} \]

This means that for a circuit of \(m\) gates using one gate set can be transformed to a circuit of size \(m \polylog(m/\eps)\) using one other gate where the total approximation error is \(\eps\).

Early Quantum Algorithms

Superposition of Function Calls

Let \(f : \X \rightarrow \Y\) be a function that we know how to efficiently implement classically.
This implies that we can efficiently perform the following transform
\[\begin{equation*} O_f : \ket{x}\ket{y} \mapsto \ket{x} \ket{y + f(x)}. \end{equation*} \]
The superposition principle then implies that we can efficiently prepare
\[\begin{equation*} \frac{1}{\sqrt{\abs{\X}}} \sum_{x\in \X} \ket{x} \ket{f(x)}. \end{equation*} \]
Use the uncompute technique.
Phase oracle: If \(\Y = \{0,1\}\), we can use the so-called "phase-kickback" technique to implement the transform
\[\begin{equation*} O_{f,\pm} : \ket{x} \mapsto (-1)^{f(x)} \ket{x}. \end{equation*} \]

We add a \(\ket{-}\) ancilla to implement the phase oracle using the standard oracle.

To implement the standard oracle, we may need a controlled phase oracle.

Deutsch-Jozsa Algorithm: Problem Statement

Let \(n\) be an integer and \(N = 2^n\).
We are given oracle access to \(f: \{0,1\}^n \rightarrow \{0,1\}\) with the promise that
1. \(f(x)\) is either always \(0\) or all \(1\) (constant) or
2. For exactly half of the inputs \(f(x) = 0\) (balanced).
Problem: find out which of the above two cases is true.
A classical algorithm will need \(\Omega(N)\) queries to \(f\) in the worst case.

Deutsch-Jozsa Algorithm

Apply \(H^{\otimes n}\) to \(\ket{0^n}\);
Call oracle \(O_{f,\pm}\) once in superposition;
Apply \(H^{\otimes n}\);
Measure all qubits in the \(Z\) basis, answer 'constant' if all outcome is \(0\) and answer 'balanced' otherwise.

Quantum parallelism and cancellation!

After Step 2, the state is
\[\begin{equation*} \frac{1}{\sqrt{\abs{\X}}} \sum_{x} (-1)^{f(x)} \ket{x}. \end{equation*} \]
The action of \(H^{\otimes n}\) is
\[\begin{equation*} H^{\otimes n} : \ket{x} \mapsto \frac{1}{\sqrt{\abs{\X}}} \sum_y (-1)^{x \cdot y} \ket{y}. \end{equation*} \]
So after Step 3, the state is proportional to
\[\begin{equation*} \frac{1}{\abs{\X}} \sum_{x, y} (-1)^{f(x) + x \cdot y}\, \ket{y}. \end{equation*} \]
The probability of observing \(y = 0\) is \(\left( \frac{1}{\abs{\X}} \sum_x (-1)^{f(x)} \right)^2\).

Bernstein-Vazirani Algorithm (Skip)

Bernstein-Vazirani algorithm.

Apply \(H^{\otimes n}\) to \(\ket{0^n}\);
Call oracle \(O_{f, \pm}\);
Apply \(H^{\otimes n}\);
Measure all qubits and answer.

Given oracle access to \(f(x) = s \cdot x\) where \(s\) is a hidden secret.
Problem: Find \(s\).
Analysis
- Similar to the analysis of the Deutsch-Jozsa algorithm.
- Before measurement, the state is
  \[\begin{equation*} \frac{1}{\abs{\X}} \sum_{x, y} (-1)^{f(x) + x \cdot y}\, \ket{y} = \frac{1}{\abs{\X}} \sum_{x, y} (-1)^{x \cdot (y + s)}\, \ket{y} = \ket{s}. \end{equation*} \]
Any classical algorithm (deterministic or randomized) needs \(n\) queries.

The final answer consists of \(n\) bits, and one classical query learns at most \(1\) bit of information.

Simon's Algorithm: Problem Statement

Simon's algorithm is the main inspiration for Shor's factoring quantum algorithm.
It shows that quantum computers are provably exponentially more efficient (in terms of the number of queries) than bounded-error randomized algorithms.
Simon's problem setup:

Oracle \(f: \{0,1\}^n \rightarrow \{0,1\}^n\) satisfies that \(f(x) = f(x')\) if and only if \(x = x'\) or \(s + x = x'\) for some hidden secret \(s \ne 0\).
Problem: Find \(s\).
Note that the function \(f\) outputs multiple bits now, and it is a \(2\)-to-\(1\) function.

Simon's Algorithm

For \(i=1, 2, \ldots, O(n)\) do:
1. Prepare state \(\sum_{x} \ket{x}_A \ket{f(x)}_B\);
2. Measure register \(B\);
3. Apply \(H^{\otimes n}\) and measure all qubits in \(A\) to get \(y^{(i)}\);
Solve \(s \cdot y^{(i)} = 0\) for all \(i\).

The state on \(A\) after the measurement is
\[\begin{equation*} \frac{1}{\sqrt{2}}(\ket{x} + \ket{x + s})_A, \end{equation*} \]
for some random \(x\).

It is the superposition of preimages of \(f\) for \(f(x)\).
After Step 1.3 each iteration, the state on \(A\) can be written as
\[\begin{equation*} \begin{split} & \frac{1}{\sqrt{2^{n+1}}} \sum_y \bigl( (-1)^{x \cdot y} + (-1)^{(x + s) \cdot y} \bigr) \ket{y} \\ = \; & \frac{1}{\sqrt{2^{n+1}}} \sum_y (-1)^{x \cdot y} \bigl(1 + (-1)^{s \cdot y} \bigr) \ket{y}. \end{split} \end{equation*} \]
The measurement will output a random \(y\) such that \(s \cdot y = 0\).
Step 1.2 is not necessary.

Simon's Algorithm: Analysis

Each \(y^{(i)}\) is sampled independently among those satisfying \(s \cdot y^{(i)} = 0\).
To determine \(s\), we need to have \(n-1\) linearly independent \(y^{(i)}\)'s.
If the dimension of \(\text{span}(y^{(1)}, y^{(2)}, \ldots, y^{(i-1)})\) has rank \(k < n-1\), then the probability that \(y^{(i)}\) is not in the span is \(1 - \frac{2^{k}}{2^{n-1}} \ge \frac{1}{2}\). This implies that the algorithm runs in \(O(n)\) iterations with high probability.

Simon's algorithm queries the oracle \(O(n)\) times.
Yet any classical algorithm needs \(\Omega(\sqrt{2^n})\) queries!

Simon's Algorithm: Classical Bounds

Upper bound: By the birthday paradox, we expect to find \(x, x'\) such that \(f(x) = f(x')\) after \(O(\sqrt{2^n})\) queries.
Lower bound: Any classical (randomized or deterministic) algorithm needs \(\Omega(\sqrt{2^n})\) queries.
- We say that a sequence \(x_1, x_2, \ldots, x_k\) is bad if there is no collision.
  
  This excludes at most \(k \choose 2\) possible \(s\)'s.
- For all \(k \ll \sqrt{2^n}\), the probability that \(x_1, x_2, \ldots, x_k\) is bad can be bounded as
  
  \[\begin{equation*} \begin{split} \Pr(x_1, \ldots, x_k \text{ is bad}) & = \prod_{j=2}^k \Pr(x_1, \ldots, x_j \text{ is bad} \,|\, x_1, \ldots, x_{j-1} \text{ is bad}) \\ & = \prod_{j=2}^k \left(1 - \frac{j-1}{2^n - {j-1 \choose 2}}\right)\\ & \ge 1 - \sum_{j=2}^k \frac{j-1}{2^n - {j-1 \choose 2}} \approx 1 - \frac{k^2}{2^{n+1}}. \end{split} \end{equation*} \]

Fourier Transform

Discrete Fourier Transform (DFT)

Let \(\X=\{0,1,\ldots,N-1\}\) be a finite set.
The discrete Fourier transform transforms a function
\[\begin{equation*} f:\X \rightarrow \complex \end{equation*} \]
to another function \(\hat{f}\) where
\[\begin{equation*} \hat{f}(y) = \frac{1}{\sqrt{N}}\sum_{x\in\X} \omega_N^{xy} f(x). \end{equation*} \]
Here, \(\omega_N = e^{2\pi i/N}\) is the \(N\)-th root of unity.

The normalization factor makes the map a unitary operator!
A transformation on a sequence of \(N\) complex numbers.
An important tool in classical signal processing.

Fast Fourier Transform (FFT)

Cooley–Tukey algorithm, from \(O(N^2)\) to \(O(N\log N)\):
\[\begin{equation*} \hat{f}(y) = \frac{1}{\sqrt{N}}\sum_{x\in\X} \omega_N^{xy} f(x). \end{equation*} \]
- Gauss (1805)!
- Assume for simplicity \(N=2^n\).
- Consider the even and odd parts of \(f\) defined on \(\{0, 1, \ldots, N/2 - 1\}\),
  \[\begin{equation*} f_e(x) = f(2x),\quad f_o(x) = f(2x+1). \end{equation*} \]
- Recursive structure: for \(0\le y < N/2\),
  \[\begin{equation*} \hat{f}(\overline{0y}) = \frac{1}{\sqrt{2}} \Bigl[ \hat{f_\text{e}}(y) + \omega_N^y \hat{f_{\text{o}}}(y) \Bigr]. \end{equation*} \]

Fast Fourier Transform (cont.)

What about the second half? For \(1y\), we have

\[\begin{equation*} \begin{split} \hat{f}(\overline{1y}) & = \frac{1}{\sqrt{N}} \sum_{0 \le x < N/2} \bigl(\omega_N^{2 x (y + N/2)} f_e (x) + \omega_N^{(2x + 1) (y + N/2)} f_o(x) \bigr)\\ & = \frac{1}{\sqrt{2}} (\hat{f}_e(y) - \omega_N^y \hat{f}_o(y)). \end{split} \end{equation*} \]
The recursion relation is

\[\begin{equation*} \begin{split} \hat{f}(\overline{0y}) & = \frac{1}{\sqrt{2}} \Bigl[ \hat{f_\text{e}}(y) + \omega_N^y \hat{f_{\text{o}}}(y) \Bigr],\\ \hat{f}(\overline{1y}) & = \frac{1}{\sqrt{2}} \Bigl[ \hat{f_\text{e}}(y) - \omega_N^y \hat{f_{\text{o}}}(y) \Bigr]. \end{split} \end{equation*} \]
FFT: compute the fast Fourier transform (recursively) on the even and odd parts respectively and combine the results using the above rule.

Quantum Fourier Transform

Quantum Fourier Transform (QFT)

For function \(f:\X \rightarrow \complex\), define quantum state.
\[\begin{equation*} \ket{f} = \sum_{x\in\X} f(x) \ket{x}. \end{equation*} \]
Quantum Fourier transform maps the superposition \(\ket{f}\) to \(\ket{\hat{f}}\)

Verify that \(F \ket{f} = \ket{\hat{f}}\).
Recall that quantum computation is about the processing of superposition states.

In QFT, we transform the amplitudes by DFT.
The QFT is a unitary transform:
\[\begin{equation*} F = \frac{1}{\sqrt{N}} \sum_{x,y\in\X} \omega_N^{xy}\ket{y}\bra{x}. \end{equation*} \]
Hadamard is the smallest Fourier transform (N=2)!

Hadamard and Toffoli are quantum universal.

QFT: Efficient Realization I

Use the recursion in a quantum way.
\[\begin{equation*} \begin{split} \hat{f}(\overline{0y}) & = \frac{1}{\sqrt{2}} \Bigl[ \hat{f_\text{e}}(y) + \omega_N^y \hat{f_{\text{o}}}(y) \Bigr]\\ \hat{f}(\overline{1y}) & = \frac{1}{\sqrt{2}} \Bigl[ \hat{f_\text{e}}(y) - \omega_N^y \hat{f_{\text{o}}}(y) \Bigr] \end{split} \end{equation*} \]
Apply QFT of \(n-1\) qubits to the first \(n-1\) qubits only once!!!

Then do some post-processing.
Cheating?! No!

QFT: Efficient Realization II

After the QFT on the first \(n-1\) qubits, we have
\[\begin{equation*} \ket{f} = \ket{f_{\text{e}}} \ket{0} + \ket{f_{\text{o}}} \ket{1} \mapsto \ket{\hat{f_{\text{e}}}} \ket{0} + \ket{\hat{f_{\text{o}}}} \ket{1}. \end{equation*} \]
Define function \(\hat{g}_{\text{o}}(y) = \omega_N^y \hat{f_{\text{o}}}(y)\), and apply \(n-1\) controlled phase gates, we have
\[\begin{equation*} \ket{\hat{f_{\text{e}}}} \ket{0} + \ket{\hat{g}_{\text{o}}} \ket{1}. \end{equation*} \]
Apply Hadamard to the last qubit, we get
\[\begin{equation*} \frac{\ket{\hat{f_{\text{e}}}} + \ket{\hat{g}_{\text{o}}}}{\sqrt{2}} \ket{0} + \frac{\ket{\hat{f_{\text{e}}}} - \ket{\hat{g}_{\text{o}}}}{\sqrt{2}} \ket{1}. \end{equation*} \]
Move the last qubit to the first by SWAP gates, and we have \(\ket{\hat{f}}\).

The last step is needed because we need \(\hat{f}(0y)\) and \(\hat{f}(1y)\).

Remarks on QFT

We have shown an efficient implementation of QFT.

Complexity is \(n^2\) or \(\log^2 N\).

A result shows that, if small rotations are omitted, QFT still works approximately and has complexity \(n \log n\).
Do not expect it to be useful in most problems of signal processing!

You cannot read out the values of the amplitudes.

Lecture 8: Quantum Algorithms II

Course Projects

Project Guidelines

You may select one of the provided topics and work in a group of up to three students. While not mandatory, students are strongly encouraged to pursue projects that contribute something novel, whether through a new perspective, innovative methods, or original algorithms.

Topic Selection Rules

Each topic can be chosen by at most two groups. Selection is on a first-come, first-served basis (先到先得).

After making your choice, please contact the teaching assistant via WeChat as soon as possible to secure your topic.
Contribution statement

The final submission must explicitly state each team member’s role, responsibilities, and specific contributions (e.g., theoretical analysis, implementation, experiments, writing, etc.).
Evaluation Criteria

Your project will be evaluated based on the following components:
1. Understanding
  - Demonstrate a thorough comprehension of the topic, including its background, key concepts, and relevant techniques.
  - (If applicable) Discuss recent advancements in the field, citing relevant literature.
2. Writing (Research Paper Submission) Your written submission should follow a structured research paper format, including:
  - Introduction – Provide context, problem motivation, and a high-level summary of your contributions.
  - Preliminaries – Define key notations, assumptions, and foundational concepts.
  - Main Technical Content – Present your core methodology, analysis, or findings with rigor.
  - Summary & Discussion – Conclude with key takeaways, limitations, and directions for future work.
  Submission Timeline:
  - A draft must be submitted before the presentation.
  - A final polished document is due before Week 18.
  In the final submission, please clearly state your individual contributions to the project.
3. Presentation (Week 16)
  - Prepare clear, well-organized slides to effectively communicate your work.
  - Emphasize key insights, methodology, and results.
  - Engage the audience with a logical flow and confident delivery.
4. Originality
  - Submissions that introduce novel ideas, techniques, or near-publishable results will be highly rewarded.
  - Creativity and innovation in problem-solving are strongly encouraged.

Quantum Theory Projects

T1: Quantum Algorithms and Sparse Optimization

HTML, PDF, Markdown
T2: IQP Circuits, Different Constructions and Noise Effects

HTML, PDF, Markdown
T3: Resource Estimation for Proof of Quantumness

HTML, PDF, Markdown
T4: Robust Combiners for Quantum Cryptographic Primitives

HTML, PDF, Markdown

Quantum Software Projects

S1: Classical simulation of quantum circuits

HTML, PDF, Markdown
S2: Birkhoff-von Neumann Quantum Logic

HTML, PDF, Markdown
S3: Quantum Circuits and Polynomials

HTML, PDF, Markdown
S4: Game Designed for Quantum Computing

HTML, PDF, Markdown

Quantum Error Correction Code Projects

C1: Quantum Error Correction Codes with Transversal Clifford Gates

HTML, PDF, Markdown
C2: Exploration of Error Correction Algorithms for Quantum Dynamic Codes

HTML, PDF, Markdown
C3: Gate Scheduling for Syndrome Measurements

HTML, PDF, Markdown

Quantum Phase Estimation

Phase Estimation: Introduction

In phase estimation, we have access to controlled-\(U^x\) for all \(x\in \X\) and also an eigenstate \(\ket{\phi}\) of \(U\) where
\[\begin{equation*} U \ket{\phi} = e^{i\phi} \ket{\phi}. \end{equation*} \]
The problem is to estimate the phase \(\phi \in [0, 2\pi)\) to some desired precision.
- Inverse Fourier transform
  \[\begin{equation*} F^{-1} = \frac{1}{\sqrt{N}} \sum_{x,y\in\X} \omega_N^{-xy}\ket{y}\bra{x}. \end{equation*} \]
Phase estimation is a powerful quantum algorithmic subroutine used in many other quantum algorithms.

Phase Estimation

To get an \(m\)-bit estimate of \(\phi\), the phase estimation procedure performs the following with \(n = m + O(\log(1/\eps))\),
1. Prepare state \(\displaystyle \frac{1}{\sqrt{N}} \sum_{x\in\X} \ket{x} \ket{\phi}\);
2. Apply unitary \(\displaystyle \sum_{x\in\X} \ket{x}\bra{x} \otimes U^x\), and the state becomes \(\displaystyle \frac{1}{\sqrt{N}} \sum_{x\in\X} e^{i\phi x} \ket{x}\ket{\phi}\);
3. Apply an inverse quantum Fourier transform on the first register.
If \(\phi = 2\pi y / 2^n\) for some \(y\in \X\), the state in the first register before the inverse Fourier transform is \(F \ket{y}\).
The first register is a good approximation of \(\phi/2\pi\).
The approximation will have \(m\) bits of precision with probability at least \(1-\eps\).

If \(n = m\), the outcome is the best \(n\)-bit approximation with probability at least \(4/\pi^2\).

Discrete Logarithm and Diffie-Hellman

2015 Turing Award

New directions in cryptography

Given a cyclic group \(G = \langle g_0 \rangle\) generated by \(g_0\) and an element \(h\in G\), the size \(N=\abs{G}\) is known.
The discrete logarithm \(\log_{g_0} h\) of \(h\) in \(G\) with respect to \(g_0\) is the smallest non-negative integer \(\alpha\) such that \(g_0^\alpha = h\).
Important for classical cryptography (Diffie-Hellman key exchange)!
We can completely break the protocol, if we can solve the discrete logarithm for \(A\) and \(B\), the two messages exchanged by the players,
Yet, no efficient classical algorithm is known.

Quantum Attack on Diffie-Hellman

We attack the protocol by giving a quantum algorithm for computing the discrete logarithm of a given \(h\).
Consider the following unitary.
\[\begin{equation*} U: \ket{g} \mapsto \ket{hg} \end{equation*} \]
It is possible to implement controlled-\(U^x\) operators for any \(x\) by repeated squaring.
Define states \(\displaystyle \ket{\phi_y} = \frac{1}{\sqrt{N}} \sum_{x\in\X} \omega_N^{xy} \ket{g_0^x}\), then
\[\begin{equation*} U \ket{\phi_y} = \frac{1}{\sqrt{N}} \sum_{x\in\X} \omega_N^{xy} \, \ket{g_0^{x+\log_{g_0}h}} = \omega_N^{-y\log_{g_0} h} \ket{\phi_y}. \end{equation*} \]
Use phase estimation!

Quantum Attack on Diffie-Hellman

How do we prepare the eigenstate \(\ket{\phi_y} = \frac{1}{\sqrt{N}} \sum_{x\in\X} \omega_N^{xy} \ket{g_0^x}\)?
- It is not easy as the phase \(\omega_N^{xy}\) depends on the discrete log of \(g_0^x\).
- It won't work if we prepare \(\frac{1}{\sqrt{N}} \sum_{x \in \X} \omega_N^{xy} \ket{x} \ket{g_0^x}\).
- Forgetting quantum information coherently (\(\ket{x}\) here) is hard!
We can sample a random pair of \((y, \ket{\phi_y})\).
1. Prepare \(\frac{1}{\sqrt{N}} \sum_{x \in \X} \ket{x} \ket{g_0^x}\).
2. Apply the QFT to the first register to get \(\frac{1}{N} \sum_{x,y \in \X} \omega_N^{xy} \ket{y} \ket{g_0^x}\).
3. Measure the first register.

Shor's Factoring Algorithm: Background

Factoring has been widely studied, and no classical algorithm is known.

The best known classical algorithm runs in time \(2^{(\log N)^c}\) for \(c = 1/2\) (or \(1/3\) depending on number theory conjectures).
People believe factoring is hard and use it to construct public-key cryptography.

Rivest, Shamir, and Adleman receive 2002 Turing Award for this.
Factoring is in BQP.

Shor's algorithm for factoring runs in \(\poly (\log N)\) quantum time.
A challenge to extended Church-Turing thesis:

Any efficient computation performed on a physically realistic model can be carried out efficiently on a Turing machine.
How did Peter Shor discover his algorithm?

Watch this video

From Factoring to Order-finding

Knowing the period of \(x^0, x^1, x^2, \ldots \mod N\) helps solving the factoring problem for \(N\).

The smallest non-zero \(r\) such that \(x^r \equiv 1 \pmod{N}\).
If \(r\) is even and \(x^{r/2}+1\) and \(x^{r/2}-1\) are not multiples of \(N\), we have
\[\begin{equation*} x^r - 1 = (x^{r/2}-1)(x^{r/2}+1) = kN. \end{equation*} \]
Then \(\mathrm{gcd}(x^{r/2}-1, N)\) and \(\mathrm{gcd}(x^{r/2}+1, N)\) will be non-trivial factors of \(N\).
Number theory: The condition above happens with probability \(\ge 1/2\) over random \(x\).
Find the period of \(f(a) = x^a \pmod{N}\).

The period can be huge, and the problem is believed to be classically hard.

Phase Estimation for Order-finding

Phase estimation applied to
\[\begin{equation*} U \ket{y} = \ket{x y\;\mathrm{mod}\;N}. \end{equation*} \]
Some eigenvectors of \(U\):
\[\begin{equation*} \begin{split} \ket{u_s} & = \frac{1}{\sqrt{r}} \sum_{k=0}^{r-1} \omega_r^{-sk} \ket{x^k\;\mathrm{mod}\;N}.\\ U \ket{u_s} & = \frac{1}{\sqrt{r}} \sum_{k=0}^{r-1} \omega_r^{-sk} \ket{x^{k+1}\;\mathrm{mod}\;N} = \omega_r^{s} \ket{u_s}. \end{split} \end{equation*} \]
\(U\) is unitary, and by using modular exponentiation, we can implement controlled-\(U^j\).
How do we prepare the eigenstate \(\ket{u_s}\) depending on \(r\)?

Eigenstate for Phase Estimation

We don't know how to prepare \(\ket{u_s}\) for given \(s\).
Yet, we have
\[\begin{equation*} \frac{1}{\sqrt{r}} \sum_{s=0}^{r-1} \ket{u_s} = \frac{1}{r} \sum_{s,k=0}^{r-1} \omega_r^{-sk} \ket{x^k\;\mathrm{mod}\; N} = \ket{1}. \end{equation*} \]
We can obtain high precision of the phase \(\varphi \approx s/r\) for some random \(s\).
Use continued fraction algorithm to find \(r\).

Suppose \(s/r\) is a rational number satisfying
\[\begin{equation*} \abs{\frac{s}{r} - \varphi} \le \frac{1}{2 r^2}. \end{equation*} \]
Then the continued fraction algorithm for \(\varphi\) can find \(s\) and \(r\) efficiently.
Some hidden detail: Need a minor fix when \(s\) and \(r\) are not coprime.

Hidden Subgroup Problems

The development of Simon's and Shor's algorithms motivated the quantum algorithmic framework called hidden subgroup problem (HSP).
It generalized the technique to a lot more problems with group (algebraic) structures.

Simons problem: \(\{0, s\}\) is a subgroup of \(\{0, 1\}^n\).

Theorem: any finite abelian group is isomorphic to a direct product of cyclic groups.
The abelian HSP problem is completely understood.
Most of the non-abelian cases are open with a few exceptions.
A famous open problem is the dihedral subgroup problem, for which the best quantum algorithm (Kuperberg's algorithm) runs in time \(O(2^{\sqrt{n}})\).

Shor's Algorithm and Public-key Cryptography

Shor's algorithm breaks the RSA public-key cryptosystem if large-scale quantum computers can be built.
Similar algorithms can break elliptic-curve based public-key cryptography.
Post-quantum cryptography is emerging.

NIST PQC standardization process announced the third-round candidates in July, 2020 and the candidates to be standardized in July, 2022.

Year	Public-Key encryption/KEMs	Digital signatures
2020	Classic McEliece (Code-based)	CRYSTALS-DILITHIUM (Lattice-based)
2020	CRYSTALS-KYBER (Lattice-based, LWE)	FALCON (Lattice-based)
2020	NTRU (Lattice-based)	Rainbow (Multivariate)
2020	SABER (Module Learning With Rounding)

2022	CRYSTALS-KYBER (Lattice-based, LWE)	CRYSTALS-DILITHIUM (Lattice-based)
2022		FALCON (Lattice-based)
2022		SPHINCS+ (Hash-based)

Grover's Algorithm

Both Shor and Grover were at Bell Labs at that time.

Unstructured Search Problem

Suppose you have a Boolean function \(f : \{0,1\}^n \to \{0,1\}\) to which we can make oracle access as before.
We know that there is one and only one input \(x\) such that \(f(x) = 1\).

We call this \(x\) the marked item.
How many queries to \(f\) do we need to find the marked item?
Classically \(\Theta(N)\) for \(N = 2^n\).
In 1996, Grover gave a quantum algorithm that makes \(O(\sqrt{N})\) queries only.

How?! It's not very different from Elitzur-Vaidmen.
This marks the beginning of an interesting line of research on quantum algorithmic techniques different from QFT and HSP.
Remark: Grover's algorithm requires the oracle or QRAM (for database search).

Grover's Algorithm

Prepare the equal superposition state \(\frac{1}{\sqrt{2^n}} \sum_x \ket{x}\).
Apply the Grover iterate \(k\) times.
Measure.

Define Grover iterate as
\[\begin{equation*} G = H^{\otimes n} R H^{\otimes n}\, O_{f,\pm}, \end{equation*} \]
where \(R\) is the reflection along \(\ket{0^n}\).
An example for \(n=2\) and marked item \(x=10\):
\[\begin{equation*} \frac{1}{2} \begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \end{pmatrix} \mapsto \frac{1}{2} \begin{pmatrix} 1 \\ 1 \\ -1 \\ 1 \end{pmatrix} = \frac{1}{4} \begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \end{pmatrix} + \frac{1}{4} \begin{pmatrix} 1 \\ 1 \\ -3 \\ 1 \end{pmatrix} \mapsto \begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix}. \end{equation*} \]

Lecture 9: Quantum Algorithms III

Grover's Algorithm

Prepare the equal superposition state \(\frac{1}{\sqrt{2^n}} \sum_x \ket{x}\).
Apply the Grover iterate \(k\) times.
Measure.

Define Grover iterate as
\[\begin{equation*} G = H^{\otimes n} R H^{\otimes n}\, O_{f,\pm}, \end{equation*} \]
where \(R\) is the reflection along \(\ket{0^n}\).
An example for \(n=2\) and marked item \(x=10\):
\[\begin{equation*} \frac{1}{2} \begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \end{pmatrix} \mapsto \frac{1}{2} \begin{pmatrix} 1 \\ 1 \\ -1 \\ 1 \end{pmatrix} = \frac{1}{4} \begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \end{pmatrix} + \frac{1}{4} \begin{pmatrix} 1 \\ 1 \\ -3 \\ 1 \end{pmatrix} \mapsto \begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix}. \end{equation*} \]

Grover's Algorithm: Analysis

Good state \(\ket{\mathrm{G}} = \ket{x}\) for \(f(x) = 1\) and bad state \(\ket{\mathrm{B}} = \frac{1}{\sqrt{N-1}} \sum_{x: f(x)=0} \ket{x}\).
The uniform superposition state \(\ket{\mathrm{U}}\) is in the span of \(\ket{\mathrm{G}}\) and \(\ket{\mathrm{B}}\),
\[\begin{equation*} \ket{\mathrm{U}} = \sin\theta\, \ket{\mathrm{G}} + \cos\theta\, \ket{\mathrm{B}}, \text{ for } \theta = \arcsin(1/\sqrt{N}). \end{equation*} \]
The Grover iterate consists of two reflections which generate a rotation of \(2\theta\).
After \(k\) iterations, the state is \(\sin\bigl((2k+1)\theta \bigr)\, \ket{\mathrm{G}} + \cos\bigl((2k+1)\theta \bigr)\, \ket{\mathrm{B}}\).
Choose \(k \approx \frac{\pi}{4\theta} = O(\sqrt{N})\) as \(\sin\theta \le \theta\).

Amplitude Amplification

Setup the starting state \(\ket{\mathrm{U}} = \mathcal{A} \ket{0}\);
Repeat the following for \(O(1/\sqrt{p})\) times:
1. Apply \(O_{f, \pm}\);
2. Apply \(\mathcal{A}R\mathcal{A}^{-1}\);
Measure.

A Boolean function \(f: X \to \{0,1\}\) partitions \(X\) into good and bad elements.
Suppose there is a quantum algorithm \(\mathcal{A}\) with no intermediate measurements starting from \(\ket{0}\) finds a good element with probability \(p\).
The amplitude amplification algorithm finds a good element with \(O(1/\sqrt{p})\) calls to \(\mathcal{A}\) and \(\mathcal{A}^{-1}\).
Grover's algorithm is the special case where \(\mathcal{A} = H^{\otimes n}\).

Oblivious Amplitude Amplification

Let \(U\) and \(V\) be unitary matrices and let \(\theta \in (0, \pi/2)\). Suppose that for an \(n\)-qubit state \(\ket{\psi}\),

\[\begin{equation*} U \ket{0^k}\ket{\psi} = \sin(\theta) \ket{0^k} V \ket{\psi} + \cos(\theta) \ket{\Phi^\perp}, \end{equation*} \]

where \(\ket{\Phi^\perp}\) satisfies \((\bra{0^k} \otimes I) \ket{\Phi^\perp} = 0\). Then there is a unitary \(W\) that uses \(U\), \(U^\dagger\) \(\ell\) times and a few simple gates,

\[\begin{equation*} W \ket{0^k}\ket{\psi} = \sin((2\ell+1) \theta) \ket{0^k} V \ket{\psi} + \cos((2\ell+1) \theta) \ket{\Phi^\perp}. \end{equation*} \]

Oblivious: We don't need to reflect through \(\ket{\psi}\) which is required in standard AA!

Quantum Hamiltonian Simulation

Motivation

… nature isn't classical, dammit, and if you want to make a simulation of nature, you'd better make it quantum mechanical, and by golly it's a wonderful problem, because it doesn't look so easy.

Simulating physics with computers (Feynman, 1981)
Analog simulation: Build a physical system whose Hamiltonian effectively models the desired target system.
Digital simulation: Build a universal, fault-tolerant quantum computer and run a quantum circuit/program to approximate the dynamics of the target system

Hamiltonian Simulation

The Schrödinger equation is
\[\begin{equation*} i\frac{\dif}{\dif t} \ket{\psi(t)} = H \ket{\psi(t)}, \end{equation*} \]
where \(H\), mathematically a Hermitian matrix of size \(2^n\) by \(2^n\), is called the Hamiltonian of the system and, as the equation says, it governs the dynamics of a system.

It characterizes the energy of the system. If the state of the system is \(\ket{\psi}\), the energy is \(\bra{\psi} H \ket{\psi}\).
We have \(\ket{\psi(t)} = U_t \ket{\psi(0)}\) for unitary operator \(\displaystyle U_t = e^{-i H t}\).
Give Hamiltonian \(H\) and time \(t\), find a quantum circuit \(U\) consisting of \(\poly(n,t,1/\epsilon)\) gates such that \(\| U - e^{-i H t} \| \le \epsilon\).
A (digital) quantum computer running quantum programs and quantum circuits can simulate nature's time evolution.

Product Formula

Lie-Suzuki-Trotter product formula

\[\begin{equation*} e^{A + B} = \lim_{N \to\infty} (e^{A/N} e^{B/N})^N \end{equation*} \]

If both \(H_1\) and \(H_2\) are efficiently simulated, then so is \(H_1 + H_2\).
Notice that \(e^{A+B}\) may differ from \(e^A e^B\) as \(A\) and \(B\) may not commute with each other!
Their difference can be roughly bounded as \(O(\norm{A} \norm{B})\).
If we alternate \(H_1\) and \(H_2\) for small time intervals \(r\) times,
\[\begin{equation*} \begin{split} U & = e^{iHt} = (e^{iHt/r})^r = (e^{iH_1 t/r}e^{iH_2 t/r} + E)^r\\ & \approx (e^{iH_1 t/r}e^{iH_2 t/r})^r. \end{split} \end{equation*} \]
Total error is bounded by \(r \norm{E} = \norm{H_1} \norm{H_2} \cdot t^2 / r\). To have error \(\le \eps\), take \(r = O(t^2/ \eps)\).
Local Hamiltonians can be efficiently simulated.

\(H = \sum_j H_j\) where each term \(H_j\) acts on at most \(k\) qubits \(H\) can be specified efficiently.

Sparse Hamiltonian Simulation

More general class of Hamiltonians (sparse Hamiltonians) can be simulated efficiently

[Aharonov and Ta-Shma, Adiabatic quantum state generation and statistical zero knowledge, 2003]
An \(N\times N\) Hermitian matrix is sparse if in any fixed row there are only \(s = \poly (\log N)\) nonzero entries.
We assume the following oracles access and their inverses:
- Entry queries of a given row and column
  \[\begin{equation*} O_A : \ket{j, k} \ket{0} \mapsto \ket{j, k} \ket{A_{j, k}}. \end{equation*} \]
- Queries of the position of the \(\ell\)-th nonzero entry in the \(k\)-th column
  \[\begin{equation*} O_{A, \mathrm{loc}} : \ket{k, \ell} \mapsto \ket{k, \nu(k, \ell)}. \end{equation*} \]

Sparse Hamiltonian Simulation: Diagonal Case

Simple case: \(H\) is diagonal.
It is not a small sum as the number of diagonal entries is exponential and entries may be all different.
Query the entry in superposition, rotate, and uncompute:
\[\begin{equation*} \begin{split} \ket{x,0} & \mapsto \ket{x,H_{x,x}}\\ & \mapsto e^{-it H_{x,x}} \ket{x,H_{x,x}}\\ & \mapsto e^{-it H_{x,x}} \ket{x,0}\\ & = \bigl( e^{-iHt} \ket{x} \bigr) \ket{0}. \end{split} \end{equation*} \]

Linear Combination of Unitaries (LCU)

Let \(M\) be a matrix acting on \(n\) qubits and \(\ket{\psi}\) be a state of \(n\) qubits.

The task is to apply \(M\) to state \(\ket{\psi}\) and get \(M\ket{\psi}/\norm{M\ket{\psi}}\).
We know that \(M\) is linear combination of unitaries
\[\begin{equation*} M = \sum_{j=1}^m \alpha_j V_j, \end{equation*} \]
where \(\alpha_j\)'s are nonnegative (after absorbing the phase in \(V_j\)).
Suppose \(V_j\) is easy and we can implement
\[\begin{equation*} V = \sum_j \ket{j}\bra{j} \otimes V_j, \end{equation*} \]
which is known as the select unitary.
Let \(W\) be a unitary acting on \(\log m\) qubits that maps
\[\begin{equation*} W : \ket{0} \mapsto \frac{1}{\sqrt{\norm{\alpha}_1}} \sum_j \sqrt{\alpha_j} \ket{j} \end{equation*} \]
The following will apply \(M\) on \(\ket{\psi}\) in the good case:
1. Start with \(\ket{0}\ket{\psi}\).
2. Apply \(W\) to the first register.
3. Apply \(V\).
4. Apply \(W^{-1}\).
By a simple calculation, we have
\[\begin{equation*} \frac{1}{\norm{\alpha}_1} \ket{0} M \ket{\psi} + \sqrt{1 - \frac{\norm{M\ket{\psi}}^2}{\norm{\alpha}_1^2}} \ket{\phi}, \end{equation*} \]
where \(\ket{\phi}\) is the garbage state we don't care and has no support on \(\ket{0}\) on the first register.
We could now apply (oblivious) amplitude amplification with \(p = \norm{M\ket{\psi}}^2 / \norm{\alpha}_1^2\).

So the number of rounds of (oblivious) AA is \(O(\norm{\alpha}_1 / \norm{M\ket{\psi}})\).

Hamiltonian Simulation via LCU

Write \(H = \sum_{j=1}^m \alpha_j V_j\) and assume \(\norm{H} \le 1\).

Example: If \(H\) is \(2\)-local Hamiltonian, it can be written as a sum of at most \(16m\) 2-local Pauli matrices.
Use the Taylor series \(e^x = \sum_{k=0}^\infty x^k/k!\), we have
\[\begin{equation*} \begin{split} e^{iHt} & = \sum_{k=0}^\infty \frac{(iHt)^k}{k!} = \sum_{k=0}^\infty \frac{(it)^k}{k!} \Bigl(\sum_j \alpha_j V_j)^k\\ & = \sum_{k=0}^\infty \frac{(it)^k}{k!} \sum_{j_1, j_2, \ldots, j_k} \alpha_{j_1} \cdots \alpha_{j_k} V_{j_1} \cdots V_{j_k}. \end{split} \end{equation*} \]
Truncate the Taylor series to degree \(d = O(t + \log 1/\eps)\).

There is a constant \(c\) independent of \(t\) and \(\eps\) such that
\[\begin{equation*} \abs{ e^{iHt} - \sum_{k=0}^{c(t+\log 1/\eps)} \frac{(iHt)^k}{k!} } \le \eps. \end{equation*} \]
It is a linear combination of unitaries with coefficients
\[\begin{equation*} \beta_{j_1, \ldots, j_k} = \frac{t^k}{k!} \alpha_{j_1} \cdots \alpha_{j_k}. \end{equation*} \]
These coefficients have \(1\)-norm
\[\begin{equation*} \norm{\beta}_1 \le \sum_{k=0}^{\infty} \frac{t^k}{k!} \sum_{j_1, \ldots, j_k} \alpha_{j_1} \cdots \alpha_{j_k} = \sum_k \frac{(t \norm{\alpha}_1)^k}{k!} = e^{t \norm{\alpha}_1}. \end{equation*} \]
It requires \(O(e^{t\norm{\alpha}_1})\) applications of
\[\begin{equation*} \hat{V} = \sum_k\sum_{j_1, \ldots, j_k} \ket{j_1, \ldots, j_k} \bra{j_1, \ldots, j_k} \otimes (i^k) V_{j_1} \cdots V_{j_k} \end{equation*} \]
and its inverse.
When \(t\) is large, the direct LCU method has bad dependence on \(t\).

The solution is simple, we divide \(t\) into \(r = t \norm{\alpha}_1\) small time slots of length \(\tau = 1/\norm{\alpha}_1\).

Run the above algorithm for time \(\tau\) and error \(\eps' = \eps/r\), which uses \(O(1)\) applications of \(\hat{V}\) and its inverse.

Each \(\hat{V}\) involves \(O(\tau + \log (1/\eps')) = O(\log (t \norm{\alpha}_1/\eps))\) applications of \(V_j\)'s.

This would give us almost linear in \(t\) and logarithmic in \(1/\eps\), both of which are better than the product formula method!

Block-encoding Method

\[\begin{equation*} U = \begin{pmatrix} A & \cdot \\ \cdot & \cdot \end{pmatrix} \end{equation*} \]

Suppose \(A\) is an operator acting on \(n\) qubits with \(\norm{A} \le 1\).
An \(a\)-qubit block-encoding of \(A\) is a unitary \(U\) on \(a+n\) qubits satisfying
\[\begin{equation*} (\bra{0^a} \otimes I) U (\ket{0^a} \otimes I) = A. \end{equation*} \]
Suppose \(f\) is a real function, the block-encoding method implements a \(V\) such that
\[\begin{equation*} V = \begin{pmatrix} f(A) & \cdot\\ \cdot & \cdot \end{pmatrix}. \end{equation*} \]
We only care about the top left part and want it to be \(f(A)\) using a small number of block-encodings of \(A\).

Block-encoding of Sparse Matrices

Let \(A\) be a \(2^n\) by \(2^n\) Hermitian matrix with operator norm \(\norm{A} \le 1\).
\(A\) is \(s\)-sparse if each row and column has at most \(s\) nonzero entries.
By one oracle access to \(O_{A, \mathrm{loc}}\) and a few other gates independent of \(A\), we can implement the following \(2n+1\)-qubit unitary operators
\[\begin{equation*} \begin{split} W_1: & \ket{0}\ket{0^n, k} \mapsto \frac{1}{\sqrt{s}} \ket{0} \sum_{j: A_{j,k} \ne 0} \ket{j, k},\\ W_3: & \ket{0}\ket{0^n, j} \mapsto \frac{1}{\sqrt{s}} \ket{0} \sum_{k: A_{j,k} \ne 0} \ket{j, k}. \end{split} \end{equation*} \]
By access to \(O_A\) and \(O_A^{-1}\) once each and a few other gates independent of \(A\), we can implement
\[\begin{equation*} W_2: \ket{0} \ket{j, k} \mapsto A_{j, k} \ket{0}\ket{j, k} + \sqrt{1 - \abs{A_{j, k}}^2} \ket{1} \ket{k, j}. \end{equation*} \]
Check that \(U = W_3^{-1} W_2 W_1\) is an \((n+a)\)-qubit block-encoding of \(A/s\).

Block-encoding Theorem

Let \(P: [-1,1] \to \{c \in \complex \mid \abs{c} \le 1/4 \}\) be a polynomial of degree \(d\). Let \(U\) be an \(a\)-qubit block-encoding of Hermitian matrix \(A\). We can implement an \(O(a)\)-qubit block-encoding \(V\) of \(P(A)\), using \(d\) applications of \(U\) and \(U^{-1}\), one controlled application of \(U\), and \(O(ad)\) other simple \(2\)-qubit gates.

[A. Gilyén, Y. Su, G. H. Low, and N. Wiebe, Quantum singular value transformation and beyond, 2019]

Hamiltonian Simulation via Block-Encoding

\[\begin{equation*} U = \begin{pmatrix} H/s & \cdot \\ \cdot & \cdot \end{pmatrix} \end{equation*} \]

Let \(H\) be an \(s\)-sparse Hamiltonian. Note that local Hamiltonians are sparse!
Let \(U\) be a block-encoding of \(H/s\).
Approximate \(f(x) = e^{ixt}\) using a polynomial \(P\).
The block-encoding theorem then implies that there is block-encoding unitary \(V\) of \(P(H) \approx e^{iHt}/4\), calling \(U\) and \(U^{-1}\) at most \(O(st + \log 1/\eps)\) times.
That is,
\[\begin{equation*} V : \ket{0}\ket{\psi} \mapsto \ket{0} P(H) \ket{\psi} + \ket{\phi}, \end{equation*} \]
where \(\ket{\phi}\) is orthogonal to all states starting with \(\ket{0}\).
Use oblivious amplitude amplification.

Quantum Linear System Problem

Solving large linear systems is an important subroutine widely used in various fields.

Linear System Problem. Given an \(N\) by \(N\) matrix \(A\) and a vector \(b\), find an \(N\) dimensional \(x\) such that \(Ax = b\).
The HHL algorithm solves a weaker version of the linear system problem.

Quantum Linear System Problem. Let \(A\) be a matrix to which we have sparse oracle access. The vector \(b\) is given as a quantum state \(\ket{b} = \frac{1}{\norm{b}}\sum_{i} b_i \ket{i}\). Define \(\ket{x} = \frac{1}{\norm{\vphantom{b}x}} \sum_i x_i \ket{i}\) for the solution \(x\) of \(Ax=b\). The goal is to find a quantum state \(\ket{\tilde{x}}\) such that \(\norm{\ket{\tilde{x}} - \ket{x}} \le \eps\).
We assume \(A\) is \(s\)-sparse and well-conditioned matrix with condition number \(\kappa\).
The output is a quantum state and may not be as useful as the written-down classical description. Yet, the quantum algorithm runs in time \(\poly(\log N, \kappa)\).

The HHL Algorithm

[Harrow, Hassidim, and Lloyd, Quantum algorithm for linear systems of equations, 2009]

Assume for simplicity that \(A\) is Hermitian, \(\norm{\ket{b}} = 1\), \(\norm{A} \le 1\) and \(\lambda_{\min}(\abs{A}) \ge 1/\kappa\).
Intuitions
- As \(x = A^{-1}b\), we want to apply \(A^{-1}\) to state \(\ket{b}\).
- Consider the spectrum decompose \(A = \sum_j \lambda_j a_j a_j^\dagger\), then the inverse map is the same as \(a_j \mapsto \lambda_j^{-1} a_j\).
- \(A\) and \(A^{-1}\) may be non-unitary and cannot be applied directly, but \(U = e^{iA}\) is always unitary and we can use Hamiltonian simulation to implement \(U\) and its powers.
- We can use phase estimation to estimate \(\lambda_j\)!

The HHL Algorithm (cont.)

Apply phase estimation \(\ket{a_j} \ket{0} \ket{0} \mapsto \ket{a_j} \ket{\lambda_j} \ket{0}\).
Conditioned on \(\lambda_j\) in the second register, apply the \(\lambda_j^{-1}\) scaling by rotating the last qubit
\[\begin{equation*} \ket{0} \mapsto \frac{1}{\kappa \lambda_j} \ket{0} + \sqrt{1 - \frac{1}{(\kappa \lambda_j)^2}} \ket{1}. \end{equation*} \]
Uncompute the phase estimation to get
\[\begin{equation*} \ket{a_j} \ket{0} \biggl( \frac{1}{\kappa \lambda_j} \ket{0} + \sqrt{1 - \frac{1}{(\kappa \lambda_j)^2}} \ket{1} \biggr). \end{equation*} \]
By lineartiy, write \(\ket{b} = \sum_j \beta_j \ket{a_j}\), the above unitary will map
\[\begin{equation*} \ket{b} \ket{0} \mapsto \frac{1}{\kappa} \sum_j \beta_j \frac{1}{\lambda_j} \ket{a_j} \ket{0} + \ket{\phi} \ket{1} = \frac{1}{\kappa} \ket{x} \ket{0} + \ket{\phi} \ket{1}. \end{equation*} \]
Apply oblivious amplitude amplification.