# Introduction to quantum states

## Quantum mechanics in a nutshell

I’m frustrated with the explanations of quantum mechanics that I received prior to graduate school. I don’t think there was any reason I couldn’t have understood quantum mechanics in high school, although I might have understood it better after taking undergraduate linear algebra. This post is an attempt to release some of that frustration. It will make more sense if you have some theoretical knowledge of linear algebra, but all you really need to follow the math is knowing how to multiply matrices.

The mechanics of quantum mechanics are pretty boring: they are just linear algebra. We only ever get one value when we measure some quantity, so we use the possible outcomes of measuring a system as a basis for a vector space whose vectors are states the system could be in, and we’re guaranteed that the basis is orthogonal (roughly, that none of our basis vectors contain any of the other basis vectors). To predict what we will measure given a non-basis vector corresponding to a state, we project that state onto the basis element corresponding to a possible measured value and square the magnitude of the projection (which is just a number) to get a weight that tells us the probability of measuring that value.

## Finite basis

Let’s turn this into math. Suppose objects can have a property “curviness” which takes one of three values: 0 wiggles, 1 wiggle, or 2 wiggles. Any two vector spaces over the same field and of the same dimension are isomorphic, so I might as well make my life easier by representing the vector space of reality with the more typical vector space of column matrices with 3 elements, matching the number of possible measurements in our system. Let us define a basis such that the states “measure 0 wiggles”, “measure 1 wiggle”, and “measure 2 wiggles” are represented by the vectors \(\ket{\text{0 wiggle}} \doteq \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}\), \(\ket{\text{1 wiggle}} \doteq \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}\), and \(\ket{\text{2 wiggle}}\doteq\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}\) respectively. In this notation, \(\ket{ \text{[state]} }\) is called a ket, and it represents a vector corresponding to a measured state. I use the \(\doteq\) symbol to emphasize that the column matrix of numbers is a non-unique representation of the state, useful for calculation but not “equal” to the vector.
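To make this concrete, here is a minimal Python sketch, with plain lists standing in for 3-element column matrices (the names `ket_0w` and `dot` are my own, not standard notation), verifying that this basis is orthonormal:

```python
# A 3-level "curviness" system: each basis ket is a plain list
# standing in for a 3-element column matrix.
ket_0w = [1, 0, 0]  # |0 wiggle>
ket_1w = [0, 1, 0]  # |1 wiggle>
ket_2w = [0, 0, 1]  # |2 wiggle>

def dot(u, v):
    """Inner product of two real vectors (a row matrix times a column matrix)."""
    return sum(a * b for a, b in zip(u, v))

# The basis is orthonormal: each ket overlaps fully with itself
# and not at all with the others.
for a in (ket_0w, ket_1w, ket_2w):
    for b in (ket_0w, ket_1w, ket_2w):
        assert dot(a, b) == (1 if a is b else 0)
```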

You can find the relative odds of measuring each of our basis states by inspection: pick out the number corresponding to each state, square it, and compare the squares. For example, given a state \(\ket\psi\doteq\begin{bmatrix} 3 \\ -4 \\ 5 \end{bmatrix}\), we have odds of \(3^2:(-4)^2:5^2=9:16:25\) of measuring [0 wiggle] to [1 wiggle] to [2 wiggle]. Equivalently, given the state represented by \(\ket\psi\), the probability of measuring 0 wiggles is \(9/50=0.18\), the probability of measuring 1 wiggle is \(16/50=0.32\), and the probability of measuring 2 wiggles is \(25/50=0.5\). This is a special case of finding the probability of measuring an arbitrary state given another arbitrary state, which generalizes as follows. Suppose you have a state represented by the column matrix \(V\) and you want to know the probability of measuring a state represented by the column matrix \(P\). The probability is \(\frac{P^\dagger V V^\dagger P}{P^\dagger P V^\dagger V}\), where the \(\dagger\) (dagger) technically represents a Hermitian conjugate, but for real numbers it’s just a matrix transpose. Note that this means the single-column matrices we’ve been using to represent states are “daggered” into single-row matrices, which collapse the column matrices down to single numbers by matrix multiplication. We said that the probability of measuring \(\ket{\text{2 wiggle}}\) given state \(\ket\psi\) was 0.5. We can show that here:

\[\frac{\begin{bmatrix} 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} 3 \\ -4 \\ 5 \end{bmatrix}\begin{bmatrix} 3 & -4 & 5 \end{bmatrix}\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}}{\begin{bmatrix} 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}\begin{bmatrix} 3 & -4 & 5 \end{bmatrix}\begin{bmatrix} 3 \\ -4 \\ 5 \end{bmatrix}}=\frac{5\cdot 5}{1\cdot(9+16+25)}=0.5\]

Multiplying a state by a constant does not change the state. The odds will not change at all if you are given \(\begin{bmatrix} 6 \\ -8 \\ 10 \end{bmatrix}\) instead of \(\begin{bmatrix} 3 \\ -4 \\ 5 \end{bmatrix}\). If you were instead given the state represented by \(\begin{bmatrix} 3 \\ 4 \\ 5 \end{bmatrix}\), that is not the same state, because there is no single number you can multiply by to get from one to the other. Even though the odds of measuring each possible outcome are the same, the two states might evolve into different new states under the same dynamics.

We have defined a basis for a vector space, but to do physics, we need to manipulate those states. We generally change states with linear operators, which are represented by square matrices when we represent our vector space of observable states with column matrices. For example, we can define an operator to perform the action “take a system with curviness 1 wiggle and turn it into a system with curviness 0 wiggle, but don’t do anything if the initial system has curviness 0 wiggle or 2 wiggle” and represent it with the matrix \(\begin{bmatrix} 1 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}\). You should verify that this matrix actually does what I say it does. For example, here is the matrix turning a “1 wiggle” state into a “0 wiggle” state: \(\begin{bmatrix} 1 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}=\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}\)
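The probability formula above is easy to check numerically. Here is a short Python sketch (the helper names `dot` and `prob` are mine, and real-valued vectors are assumed, so each dagger is just a transpose):

```python
def dot(u, v):
    """Inner product of two real vectors (row matrix times column matrix)."""
    return sum(a * b for a, b in zip(u, v))

def prob(p, v):
    """Probability of measuring state p given state v, for real vectors:
    (P^T V)(V^T P) / ((P^T P)(V^T V)). No normalization is required."""
    return dot(p, v) ** 2 / (dot(p, p) * dot(v, v))

psi = [3, -4, 5]
ket_2w = [0, 0, 1]
print(prob(ket_2w, psi))          # 0.5
print(prob(ket_2w, [6, -8, 10]))  # still 0.5: scaling doesn't change the state
```

Because the formula divides by both \(P^\dagger P\) and \(V^\dagger V\), rescaling either vector cancels out, which is exactly why a constant multiple represents the same state.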

We can now apply this operator to states which are not basis states and see what happens. We find that the state \(\begin{bmatrix} 3 \\ -4 \\ 5 \end{bmatrix}\) is taken to the state \(\begin{bmatrix} -1 \\ 0 \\ 5 \end{bmatrix}\) and the state \(\begin{bmatrix} 3 \\ 4 \\ 5 \end{bmatrix}\) is taken to the state \(\begin{bmatrix} 7 \\ 0 \\ 5 \end{bmatrix}\). These two states lead to different odds of measuring each basis state after being transformed (\(1:0:25\) versus \(49:0:25\)), even though the initial states had the same odds of measuring each state.
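You can verify this arithmetic with a few lines of Python (the `apply` helper is my own name for ordinary matrix–vector multiplication):

```python
def apply(m, v):
    """Multiply a 3x3 matrix (given as a list of rows) by a column vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

# "Send 1 wiggle to 0 wiggle, leave 0 wiggle and 2 wiggle alone."
op = [[1, 1, 0],
      [0, 0, 0],
      [0, 0, 1]]

print(apply(op, [3, -4, 5]))  # [-1, 0, 5]
print(apply(op, [3, 4, 5]))   # [7, 0, 5]
```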

## Infinite basis and normalization

Once you have intuition for a finite basis corresponding to a finite number of measurable states, you can consider an infinite basis. Sometimes it is countably infinite, such as the number of particles of a certain type in a system. Sometimes it has a higher cardinality, such as the basis “position of a point particle”, which can take values corresponding to real numbers over a continuous interval. In either case, it becomes impossible to represent operators or vectors as finite matrices, although physicists will still use the phrase “matrix elements” to describe how an operator sends one basis state to another basis state.

The probability equation I wrote above doesn’t make much sense for an infinite basis, but you can still think of probabilities as the squared modulus of the projection of one state onto another state. The way we write the projection of state \(\ket\psi\) onto state \(\ket\phi\) is using “braket” notation, where the state you project onto is written as the “bra” \(\bra{\text{[state]}}\), and the total projection is the “braket” \(\bra{\phi}\psi\rangle\). It’s not important for my purposes, but moving from ket space to bra space involves taking each of the complex numbers associated with the ket to its complex conjugate. If this were a textbook, I would spend a page showing that \(\bra{\phi}\psi\rangle\) is the complex conjugate of \(\bra\psi\phi\rangle\), but I will instead just tell you that it is. There’s a lot of annoying calculus you have to do to get meaningful numbers out of states with infinite continuous bases, but I’m hoping I can use the abstraction of braket notation to avoid it. For now, a braket is a complex number which tells you how much of one state is inside another state: two states whose braket has a smaller magnitude are less similar than two states whose braket has a larger magnitude.

In order for the math to work out to anything meaningful, physicists put a lot of effort into “normalizing” states so that they give one when projected onto themselves: \(\bra\psi\psi\rangle=1\). If you don’t do this, you can make the braket arbitrarily large or small by multiplying by a complex number that doesn’t actually change the state. If you do, you benefit from an elegant general probability rule: \(\text{Prob}(\psi \text{ given } \phi)=\bra\phi\psi\rangle\bra\psi\phi\rangle=\|\bra\psi\phi\rangle\|^2\).
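These last two ideas can be sketched in Python using complex numbers (the names `braket` and `normalize` are my own, and finite 3-element vectors stand in for what would really be an infinite basis):

```python
import math

def braket(phi, psi):
    """<phi|psi>: conjugate the bra's components, then sum the products."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

def normalize(v):
    """Rescale a state so that <v|v> = 1."""
    norm = math.sqrt(braket(v, v).real)
    return [a / norm for a in v]

psi = normalize([3, -4j, 5])
phi = normalize([0, 0, 1])

# Conjugate symmetry: <phi|psi> is the complex conjugate of <psi|phi>.
assert braket(phi, psi) == braket(psi, phi).conjugate()

# Normalized states project onto themselves with magnitude one, so the
# probability rule is just the squared modulus of the braket.
assert abs(braket(psi, psi) - 1) < 1e-12
print(round(abs(braket(psi, phi)) ** 2, 12))  # 0.5
```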