A new basis

In my previous post, we saw that the states you can measure form a basis for the vectors which represent states you can prepare. We defined a basis based on measuring some value of “curviness” which had the three observable basis states \(\ket{\text{0 wiggle}}\doteq\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}\), \(\ket{\text{1 wiggle}}\doteq\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}\), and \(\ket{\text{2 wiggle}}\doteq\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}\). A prepared state may contain multiple basis elements, so if you try to measure such a state, you have some probability of measuring any of the measurable values whose basis element makes a nonzero contribution to the state. If we represent a state as a column vector like \(\begin{bmatrix} \frac35 \\ 0 \\ \frac45 \end{bmatrix}\), we expect some probability of measuring the value associated with the top number in the column (0.36, as it turns out), no probability of measuring the value associated with the middle number, and a larger probability (0.64) of measuring the value associated with the bottom number.
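To make the arithmetic concrete, here is a minimal NumPy sketch (my own illustration; the post has no code of its own) computing those probabilities from the column vector:

```python
import numpy as np

# The prepared state from the example, written in the
# curviness basis (0 wiggle, 1 wiggle, 2 wiggle).
state = np.array([3/5, 0, 4/5])

# The probability of each measurement outcome is the squared
# magnitude of the corresponding component of the state vector.
probs = np.abs(state) ** 2
print(probs)  # approximately [0.36, 0, 0.64]; the probabilities sum to 1
```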

If you measure one thing and then measure it again, you tend to get the same answer both times. Once you measure a state, you have projected it onto the single basis element corresponding to whatever value you measured. As an example with numbers, if you start out with the (normalized) state \(\ket\psi\doteq\frac{\sqrt{2}}{10}\begin{bmatrix} 3 \\ 4 \\ 5 \end{bmatrix}\), and measure the object to have curviness 2 wiggle (you expect this to happen half the time), then you may now consider that object to be in state \(\ket{\text{2 wiggle}}\doteq\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}\), and you expect another measurement to give you 2 wiggle again. But why restrict yourself to one basis?
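The “half the time” claim can be checked the same way; a small sketch (again my own) of the projection arithmetic:

```python
import numpy as np

# The normalized state |psi> from the example.
psi = np.sqrt(2) / 10 * np.array([3, 4, 5])

# Probability of measuring 2 wiggle: the squared projection
# of |psi> onto the |2 wiggle> basis state.
wiggle2 = np.array([0, 0, 1])
p = np.abs(wiggle2 @ psi) ** 2
print(p)  # approximately 0.5
```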

We can define a new basis by measuring something else. Suppose there is some second property “bounciness” of my object above which takes the three values “0 flop”, “1 flop”, and “2 flop”. Suppose that an object measured to be in a “0 flop” state is always subsequently measured to be in a “0 wiggle” state and vice versa, but if you measure a “1 flop” state, you are then equally likely to measure 1 wiggle or 2 wiggle afterward. If we keep using the curviness basis above, this is compatible with \(\ket{\text{1 flop}}\doteq\frac{\sqrt{2}}{2}\begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}\). But bounciness should also be a good orthogonal basis! Because the “0 flop” and “0 wiggle” states seem interchangeable, we suppose \(\ket{\text{0 flop}}\doteq\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}\). We are then forced to expect that the “2 flop” state will be represented as \(\ket{\text{2 flop}}\doteq\frac{\sqrt{2}}{2}\begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix}\) up to a constant. Why? We need a state which does not “contain” any of the “0 flop” or “1 flop” states. Recall that we figure out how much of one state is in another with the projection operator, which is represented by “braket” notation. If \(\ket\psi\) is represented by the vector \(\psi\) and \(\ket\phi\) is represented by the vector \(\phi\), then the projection of \(\psi\) onto \(\phi\) is \(\bra\phi\psi\rangle=\phi^\dagger\psi\), where the \(\dagger\) sign means transpose the column vector into a row vector and turn all of the numbers into their complex conjugates (which are just the numbers themselves if the numbers are real).
We can confirm that the vector I wrote for \(\ket{\text{2 flop}}\) fulfills the necessary requirements of projecting to zero on any other observable state: \(\bra{\text{2 flop}}\text{0 flop}\rangle=\frac{\sqrt2}2\begin{bmatrix} 0 & 1 & -1 \end{bmatrix}\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}=0\) and \(\bra{\text{2 flop}}\text{1 flop}\rangle=\frac{\sqrt2}2\begin{bmatrix} 0 & 1 & -1 \end{bmatrix}\frac{\sqrt2}2\begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}=0\). From this representation, we expect a “2 flop” state to have even odds of measuring 1 wiggle or 2 wiggle just like the “1 flop” state.
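These inner products are easy to verify numerically; a NumPy sketch of the same checks, using the representations chosen above (for real vectors the complex conjugation does nothing, but I keep it for generality):

```python
import numpy as np

# The three bounciness states written in the curviness basis.
flop0 = np.array([1.0, 0.0, 0.0])
flop1 = np.array([0.0, 1.0, 1.0]) / np.sqrt(2)
flop2 = np.array([0.0, 1.0, -1.0]) / np.sqrt(2)

# <2 flop|0 flop> and <2 flop|1 flop> should both vanish,
# while each state overlaps with itself with magnitude 1.
assert np.isclose(flop2.conj() @ flop0, 0)
assert np.isclose(flop2.conj() @ flop1, 0)
assert np.isclose(flop2.conj() @ flop2, 1)
```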

Order of measurement matters

Measuring a thing forces it into a basis state in the basis corresponding to the measurement, but two different measurements might not use the same basis. If you measure property A followed by property B, you end with a state which will always yield the same value in future measurements of B; the first time you measure A after that, you have some probability of getting any of several values of A (assuming A and B do not share a basis). But if you measure B first and then A, future measurements of A will always return a set value, rather than the spread of values you would see after measuring A followed by B. This is a trivial example, but quantum mechanics textbooks make a big deal out of it, because prior to the development of quantum mechanics, it was taken for granted that you could define a state such that you knew everything it was possible to know about the state. Statistical mechanics allowed macrostates such as “bottle of hydrogen at 3 atmospheres and 296 kelvin” which were compatible with many microstates, but in principle you could define one of the microstates exactly.

It so happens that the universe does not actually work that way. The canonical example is that you cannot know both a particle’s momentum and position in a particular direction to arbitrary accuracy. If you shoot a bunch of electrons down a straight narrow pipe toward a hole in front of a screen, the ones which make it to the end of the pipe will have very little momentum in the sideways direction, because they need to have a momentum consistent with moving straight down the pipe. If the hole is small, you “measure” the sideways location of the electrons which make it through the hole very precisely, because they have to go through the hole. For some electrons which make it to the screen, there will be no line you can draw from the electron source through the hole to the screen. An electron which goes through the hole has a sideways location tightly constrained by the hole width, so its sideways momentum can no longer be constrained near zero the way it had to be to reach the hole, and there is a significant probability of finding that it has veered off at some angle when you measure it on the other side of the hole. In the language of linear algebra, a tightly constrained set of position basis elements is made up of a loosely constrained set of momentum basis elements and vice versa.

Operators for observables

In the previous post, we also saw how square matrices represent operators which change states into other states. One important type of operator in quantum mechanics is the observable. An observable operator maps any state representing a single measurable value to a multiple of itself, but genuinely changes states which mix several measurable values. When applied to a state which represents a single observation, the operator returns the original state multiplied by whatever value you would have observed had you measured that state. In the language of linear algebra, an observable operator is defined as the operator whose eigenvectors represent states which are always measured to have the same value and whose eigenvalues are those values. In the basis we have above, we can represent the curviness operator as \(\hat C\doteq\begin{bmatrix} \text{0 wiggle} & 0 & 0 \\ 0 & \text{1 wiggle} & 0 \\ 0 & 0 & \text{2 wiggle} \end{bmatrix}\). Notice that because we’re in the curviness basis (the one with wiggles), the matrix is diagonal. We clearly see that if we multiply this matrix by \(\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}\), \(\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}\), and \(\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}\), then we get \(\begin{bmatrix} \text{0 wiggle} \\ 0 \\ 0 \end{bmatrix}\), \(\begin{bmatrix} 0 \\ \text{1 wiggle} \\ 0 \end{bmatrix}\), and \(\begin{bmatrix} 0 \\ 0 \\ \text{2 wiggle} \end{bmatrix}\) respectively.
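Numerically, with the unit “wiggle” factored out, the curviness operator is just a diagonal matrix of its eigenvalues; a quick NumPy check of the matrix-times-basis-vector products above:

```python
import numpy as np

# Curviness operator in its own basis, in units of wiggles.
C = np.diag([0.0, 1.0, 2.0])

# Each basis state is an eigenvector whose eigenvalue is the
# measured curviness: C |n wiggle> = n |n wiggle>.
for n, basis_state in zip([0, 1, 2], np.eye(3)):
    assert np.allclose(C @ basis_state, n * basis_state)
```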

We get more interesting operators when we attempt to build observables for things outside of the basis we’re working with. The bounciness matrix looks like \(\hat B\doteq\text{(flop)}*\begin{bmatrix} 0 & 0 & 0 \\ 0 & \frac32 & -\frac12 \\ 0 & -\frac12 & \frac32 \end{bmatrix}\). Note that I have factored out the units and put them in front of the matrix. We can confirm that the eigenvalues of this matrix give us the correct measurements from above:

\[\hat B \ket{\text{0 flop}}\doteq\text{(flop)}*\begin{bmatrix} 0 & 0 & 0 \\ 0 & \frac32 & -\frac12 \\ 0 & -\frac12 & \frac32 \end{bmatrix}\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}=0\doteq\text{0 flop}\ket{\text{0 flop}}\] \[\hat B \ket{\text{1 flop}}\doteq\frac{\sqrt2}{2}\text{(flop)}*\begin{bmatrix} 0 & 0 & 0 \\ 0 & \frac32 & -\frac12 \\ 0 & -\frac12 & \frac32 \end{bmatrix}\begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}=\text{(1 flop)}\frac{\sqrt2}2\begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}\doteq\text{1 flop}\ket{\text{1 flop}}\] \[\hat B \ket{\text{2 flop}}\doteq\frac{\sqrt2}{2}\text{(flop)}*\begin{bmatrix} 0 & 0 & 0 \\ 0 & \frac32 & -\frac12 \\ 0 & -\frac12 & \frac32 \end{bmatrix}\begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix}=\text{(2 flop)}\frac{\sqrt2}2\begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix}\doteq\text{2 flop}\ket{\text{2 flop}}\]
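The same eigenvalue equations can be verified numerically; a NumPy sketch using the flop states from above (units of flops factored out):

```python
import numpy as np

# Bounciness operator written in the curviness basis.
B = np.array([[0.0,  0.0,  0.0],
              [0.0,  1.5, -0.5],
              [0.0, -0.5,  1.5]])

s = np.sqrt(2) / 2
flop_states = {0: np.array([1.0, 0.0, 0.0]),
               1: s * np.array([0.0, 1.0, 1.0]),
               2: s * np.array([0.0, 1.0, -1.0])}

# B |n flop> = n |n flop> for n = 0, 1, 2.
for n, state in flop_states.items():
    assert np.allclose(B @ state, n * state)
```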

Now, remember how I said that the order of measurement matters? Our observable matrices help us show this. If the order of measurement didn’t matter, we would expect that applying the matrices representing those measurements in either order gives the same result, which is to say that \(\hat B\hat C=\hat C \hat B\): the products of the matrices are equal in either order. We can rewrite this equation as \([\hat B,\hat C]=0\), where the brackets represent what is called the commutator: \([X,Y]=XY-YX\). Let’s find the actual commutator of \(\hat B\) and \(\hat C\) and see whether it equals zero:

\[[\hat B,\hat C]\doteq\text{(flop)}*\begin{bmatrix} 0 & 0 & 0 \\ 0 & \frac32 & -\frac12 \\ 0 & -\frac12 & \frac32 \end{bmatrix}\text{(wiggle)}\begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix}-\text{(wiggle)}\begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix}\text{(flop)}*\begin{bmatrix} 0 & 0 & 0 \\ 0 & \frac32 & -\frac12 \\ 0 & -\frac12 & \frac32 \end{bmatrix}\] \[=\text{(flop*wiggle)}\begin{bmatrix} 0 & 0 & 0 \\ 0 & \frac32 & -1 \\ 0 & -\frac12 & 3 \end{bmatrix}-\text{(wiggle*flop)}\begin{bmatrix} 0 & 0 & 0 \\ 0 & \frac32 & -\frac12 \\ 0 & -1 & 3 \end{bmatrix}=\text{(flop*wiggle)}\begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & -\frac12 \\ 0 & \frac12 & 0 \end{bmatrix}\]

The commutator is not equal to the zero matrix, so we expect the order of measurement to matter, and we expect that the basis for bounciness is not the same as the basis for curviness. When you have finite matrices like these, where you can clearly see that one matrix is diagonal and the other isn’t, these are trivial statements; but for an operator with an infinite basis which can’t be represented by an explicit matrix, it can be much less obvious, and the intuition we build from working with finite matrices remains helpful.
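For these finite matrices, the commutator is a one-liner to verify; a NumPy sketch (with the units of flop*wiggle omitted from the arrays):

```python
import numpy as np

B = np.array([[0.0,  0.0,  0.0],
              [0.0,  1.5, -0.5],
              [0.0, -0.5,  1.5]])  # bounciness, in flops
C = np.diag([0.0, 1.0, 2.0])       # curviness, in wiggles

# [B, C] = BC - CB; a nonzero result means the order of
# measurement matters.
commutator = B @ C - C @ B
assert not np.allclose(commutator, 0)
```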

Changing the representation

I have scrupulously used the same basis every time I write a column vector or a matrix, but as I repeatedly said above, bounciness should also be a good basis. What does that actually mean? It means that all of the physics must continue to work if I instead define \(\ket{\text{0 flop}}\doteq\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}\), \(\ket{\text{1 flop}}\doteq\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}\), and \(\ket{\text{2 flop}}\doteq\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}\). My bounciness operator is now diagonal: \(\hat B\doteq\text{flop}\begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix}\). My curviness basis states are now more complicated, and my curviness operator is no longer diagonal. This is why I have used the symbol \(\doteq\) when writing column/row vectors and matrices: the representations are not unique. We can use whatever basis we want to write our explicit representations. I could even go wild and define \(\ket{\text{1 flop}}\doteq\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}\), \(\ket{\text{0 flop}}\doteq\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}\), and \(\ket{\text{2 flop}}\doteq\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}\), and then my bounciness operator would become \(\hat B\doteq\text{flop}\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 2 \end{bmatrix}\). But in any case, my commutator for bounciness and curviness will be nonzero and my inner products (brakets) will have the same values. Quantum mechanics must work no matter which valid basis of observables you choose, as long as you’re consistent.
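The basis change itself can be written as a unitary matrix whose columns are the flop states expressed in the wiggle basis; conjugating \(\hat B\) by it should produce the diagonal form claimed above. A NumPy sketch (my own construction, not from the post):

```python
import numpy as np

s = np.sqrt(2) / 2
# Columns are |0 flop>, |1 flop>, |2 flop> written in the
# curviness (wiggle) basis; U is unitary.
U = np.array([[1.0, 0.0, 0.0],
              [0.0,   s,   s],
              [0.0,   s,  -s]])

B_wiggle = np.array([[0.0,  0.0,  0.0],
                     [0.0,  1.5, -0.5],
                     [0.0, -0.5,  1.5]])

# Changing basis: B in the flop basis is U† B U, which comes
# out diagonal with entries 0, 1, 2 (in flops).
B_flop = U.conj().T @ B_wiggle @ U
assert np.allclose(B_flop, np.diag([0.0, 1.0, 2.0]))
```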