Probability and Operators in Quantum Mechanics

In summary, we discussed the derivation of the expectation value in quantum mechanics using the example of the momentum operator. We started with the wave function solution of the time-dependent Schrödinger equation and used the statistical interpretation of the wave function to find the expectation value of position. We then derived the operator representation of momentum using the Schrödinger equation, and noted that classical observables can be written in terms of position and momentum. Finally, we mentioned the two foundational axioms of quantum mechanics and how the second axiom follows from the first, given non-contextuality, using Gleason's Theorem.
  • #1
Ackbach
Unfortunately, I can't find the thread (if someone finds it, please let me know, and I'll merge this post onto that thread), but someone asked why it is that in quantum mechanics, if you have an observable $B$, that the expectation value (average value) $\langle B \rangle$ is given by
$$\langle B \rangle = \int_{\mathbb{R}}\Psi^{*} \hat{B} \, \Psi\,dx,$$
where $\hat{B}$ is the operator "associated with" the observable $B$.
If I recall, the observable in question was the momentum operator
$$\hat{p}= -i \hbar\,\frac{\partial}{ \partial x}.$$
So, why is
$$ \langle p \rangle= \int_{\mathbb{R}}\Psi^{*} \left( -i \hbar\,\frac{\partial}{ \partial x} \right)\, \Psi\,dx?$$

The following derivation will follow Griffiths' Introduction to Quantum Mechanics, 1st Ed., pages 11-16.

Let's start with $\Psi$, which is the wave function solution of the time-dependent Schrödinger equation in one dimension:
$$i \hbar \frac{ \partial \Psi}{ \partial t}=- \frac{ \hbar^{2}}{2m}
\frac{ \partial^{2} \Psi}{\partial x^{2}}+V \Psi.$$
The statistical interpretation of the wave function tells us that $|\Psi(x,t)|^{2}$, the squared modulus of the wave function, is the probability density for finding the particle at point $x$ at time $t$. So if I want to find the expectation value of the observable $x$, I should compute
$$\langle x \rangle=\int_{\mathbb{R}}x|\Psi(x,t)|^{2}\,dx.$$
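As a quick numerical illustration of this formula, here is a minimal sketch that evaluates the normalisation integral and $\langle x \rangle$ for a Gaussian wave function at a fixed time. The state, its centre `x0`, its width `sigma`, and the helper `integrate` are assumptions chosen for the example; they are not part of Griffiths' derivation.

```python
import math

def psi(x, x0=1.5, sigma=0.7):
    """Normalised Gaussian wave function centred at x0 (hypothetical example state)."""
    prefactor = (2.0 * math.pi * sigma**2) ** -0.25
    return prefactor * math.exp(-(x - x0) ** 2 / (4.0 * sigma**2))

def integrate(f, a, b, n=4000):
    """Composite trapezoidal rule on [a, b] with n panels."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for k in range(1, n):
        total += f(a + k * h)
    return total * h

# |psi|^2 is the probability density, so it should integrate to 1
total_prob = integrate(lambda x: abs(psi(x)) ** 2, -20.0, 20.0)

# <x> is the mean of x under that density; for this state it should sit near x0
mean_x = integrate(lambda x: x * abs(psi(x)) ** 2, -20.0, 20.0)

print(total_prob, mean_x)
```

The integration limits stand in for $\pm\infty$; they only need to be wide enough that the density is negligible outside them.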
Now, if we want the observable $p$, we would like
$$p=mv=m\frac{dx}{dt}.$$
Shifting to expectation values, we would like
$$\langle p \rangle=m\frac{d \langle x \rangle}{dt}
=m\,\frac{d}{dt}\int_{\mathbb{R}}x|\Psi(x,t)|^{2}\,dx
=m\int_{\mathbb{R}}x \frac{ \partial}{ \partial t}| \Psi(x,t)|^{2}\,dx.$$
Now the Schrödinger equation tells us that
$$ \frac{ \partial \Psi}{\partial t}= \frac{i \hbar}{2m} \frac{ \partial^{2} \Psi}{ \partial x^{2}}- \frac{i}{ \hbar} V \Psi,$$
and hence, taking the complex conjugate (with $V$ real),
$$ \frac{ \partial \Psi^{*}}{\partial t}=- \frac{i \hbar}{2m} \frac{ \partial^{2} \Psi^{*}}{ \partial x^{2}}+ \frac{i}{ \hbar} V \Psi^{*}.$$
Thus,
$$\frac{ \partial}{ \partial t}| \Psi(x,t)|^{2}
=\frac{ \partial}{ \partial t}(\Psi^{*} \Psi)
= \Psi^{*} \frac{ \partial \Psi}{ \partial t}+ \Psi \frac{ \partial \Psi^{*}}{ \partial t}$$
$$=\Psi^{*} \left(\frac{i \hbar}{2m} \frac{ \partial^{2} \Psi}{ \partial x^{2}}- \frac{i}{ \hbar} V \Psi \right)+
\Psi \left( - \frac{i \hbar}{2m} \frac{ \partial^{2} \Psi^{*}}{ \partial x^{2}}+ \frac{i}{ \hbar} V \Psi^{*} \right)$$
$$= \frac{i \hbar}{2m} \left( \Psi^{*} \frac{ \partial^{2} \Psi}{ \partial x^{2}}- \Psi \frac{ \partial^{2} \Psi^{*}}{ \partial x^{2}} \right)$$
$$= \frac{i \hbar}{2m} \frac{ \partial}{ \partial x}\left( \Psi^{*} \frac{ \partial \Psi}{ \partial x}- \Psi \frac{ \partial \Psi^{*}}{ \partial x} \right).$$
Plugging this into our latest expression for $\langle p \rangle$ yields
$$\langle p \rangle=m\int_{\mathbb{R}}x \frac{ \partial}{ \partial t}| \Psi(x,t)|^{2}\,dx
=m\int_{\mathbb{R}}x \left( \frac{i \hbar}{2m} \frac{ \partial}{ \partial x}\left( \Psi^{*} \frac{ \partial \Psi}{ \partial x}- \Psi \frac{ \partial \Psi^{*}}{ \partial x} \right) \right)\,dx$$
$$=\frac{i \hbar}{2}\int_{\mathbb{R}}x \frac{ \partial}{ \partial x}\left( \Psi^{*} \frac{ \partial \Psi}{ \partial x}- \Psi \frac{ \partial \Psi^{*}}{ \partial x} \right) \,dx.$$
We can integrate this by parts, moving the derivative off the bracket and onto $x$:
$$ \langle p \rangle=-\frac{i \hbar}{2}\int_{\mathbb{R}} \left( \Psi^{*} \frac{ \partial \Psi}{ \partial x}- \Psi \frac{ \partial \Psi^{*}}{ \partial x} \right) \,dx.$$
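Here we have used the fact that $\Psi$ must go to zero as $x\to \pm \infty$ (fast enough to be normalizable) to eliminate the boundary term. Now, if we integrate the second term by parts, again dropping the boundary term, we get $\int_{\mathbb{R}} \Psi \, (\partial \Psi^{*}/\partial x)\,dx = -\int_{\mathbb{R}} \Psi^{*} \, (\partial \Psi/\partial x)\,dx$, so the two terms combine and we obtain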
$$ \langle p \rangle =-i \hbar \int_{ \mathbb{R}} \Psi^{*} \frac{ \partial \Psi}{ \partial x} \, dx
=\int_{\mathbb{R}}\Psi^{*} \left( -i \hbar\,\frac{\partial}{ \partial x} \right)\, \Psi\,dx.$$
As it turns out, all classical dynamical variables can be written in terms of position and momentum, so to build the corresponding operator, everywhere you see an $x$, "replace" it with multiplication by $x$, and everywhere you see a $p$, replace it with the operator $-i \hbar (\partial / \partial x)$.
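To see the operator form in action, here is a rough numerical sketch that computes $\langle p \rangle$ for a Gaussian envelope multiplied by the plane wave $e^{i k_0 x}$, for which the result should come out close to $\hbar k_0$. The test state, the value of `k0`, the choice $\hbar = 1$, and the finite-difference derivative are all assumptions made for the illustration. The sketch also evaluates the symmetric expression obtained after the first integration by parts, which should agree.

```python
import cmath
import math

HBAR = 1.0  # natural units (assumption)

def psi(x, x0=0.0, sigma=1.0, k0=2.0):
    """Gaussian envelope times the plane wave exp(i k0 x) (hypothetical test state)."""
    prefactor = (2.0 * math.pi * sigma**2) ** -0.25
    envelope = math.exp(-(x - x0) ** 2 / (4.0 * sigma**2))
    return prefactor * envelope * cmath.exp(1j * k0 * x)

def dpsi(x, step=1e-4):
    """Central finite-difference approximation to d(psi)/dx."""
    return (psi(x + step) - psi(x - step)) / (2.0 * step)

def integrate(f, a, b, n=4000):
    """Composite trapezoidal rule; also works for complex-valued integrands."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for k in range(1, n):
        total += f(a + k * h)
    return total * h

# Operator form: integral of psi* (-i hbar d/dx) psi
p_operator = integrate(
    lambda x: psi(x).conjugate() * (-1j * HBAR) * dpsi(x), -15.0, 15.0
)

# Symmetric form: -(i hbar / 2) * integral of (psi* psi' - psi psi*')
p_symmetric = integrate(
    lambda x: (-1j * HBAR / 2.0)
    * (psi(x).conjugate() * dpsi(x) - psi(x) * dpsi(x).conjugate()),
    -15.0,
    15.0,
)

print(p_operator, p_symmetric)
```

Both numbers should be close to $\hbar k_0 = 2$ with a negligible imaginary part, which is what one expects for the expectation value of a Hermitian operator.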

So, to sum up: $-i \hbar (\partial / \partial x)$ is the operator representation of $p$ because we want to impose the condition that $\langle p \rangle= m \, d \langle x \rangle/dt$. The Schrödinger equation and the derivation I showed above then give the desired representation.
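The condition $\langle p \rangle = m \, d\langle x \rangle/dt$ can be checked numerically on a state that actually moves. The sketch below uses an equal superposition of the two lowest stationary states of the infinite square well on $[0, L]$. This is an assumption made for the example (the derivation above works on all of $\mathbb{R}$, but here $\Psi$ vanishes at the walls, so the boundary terms drop out in the same way). Units with $\hbar = m = L = 1$ are also an assumption.

```python
import cmath
import math

HBAR = M = L = 1.0  # natural units and unit well width (assumptions)

def energy(n):
    """Energy of the n-th stationary state of the infinite square well."""
    return (n * math.pi * HBAR) ** 2 / (2.0 * M * L**2)

def phi(n, x):
    """n-th stationary state on [0, L]."""
    return math.sqrt(2.0 / L) * math.sin(n * math.pi * x / L)

def dphi(n, x):
    """Spatial derivative of phi(n, x)."""
    return math.sqrt(2.0 / L) * (n * math.pi / L) * math.cos(n * math.pi * x / L)

def psi(x, t):
    """Equal superposition of the n = 1 and n = 2 stationary states."""
    return sum(phi(n, x) * cmath.exp(-1j * energy(n) * t / HBAR) for n in (1, 2)) / math.sqrt(2.0)

def dpsi_dx(x, t):
    return sum(dphi(n, x) * cmath.exp(-1j * energy(n) * t / HBAR) for n in (1, 2)) / math.sqrt(2.0)

def integrate(f, n=2000):
    """Composite trapezoidal rule on [0, L]."""
    h = L / n
    total = 0.5 * (f(0.0) + f(L))
    for k in range(1, n):
        total += f(k * h)
    return total * h

def mean_x(t):
    return integrate(lambda x: x * abs(psi(x, t)) ** 2)

def mean_p(t):
    value = integrate(lambda x: psi(x, t).conjugate() * (-1j * HBAR) * dpsi_dx(x, t))
    return value.real  # discard the small imaginary discretisation error

t, dt = 0.1, 1e-5
lhs = M * (mean_x(t + dt) - mean_x(t - dt)) / (2.0 * dt)  # m d<x>/dt by central difference
rhs = mean_p(t)                                            # <p> from the operator form
print(lhs, rhs)
```

For this particular state both sides can also be worked out by hand as $(8\hbar/3L)\sin(\omega t)$ with $\omega = (E_2 - E_1)/\hbar$, which gives a third number to compare against.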
 
  • #2
Nice answer. A more complete analysis of these foundational issues in QM can be found in Ballentine:
https://www.amazon.com.au/dp/9814578584/.

In that standard textbook, you will find QM based on just two axioms. The first is that associated with any observation is a Hermitian operator, called the observation's observable, whose eigenvalues are the possible outcomes. The second is the so-called Born Rule: given an observable O, there exists a positive operator of unit trace, P, called the system's state, such that the expectation of the outcome of the observation associated with O is E(O) = Trace (PO). It is a surprising fact that, using a famous theorem called Gleason's Theorem, the second axiom actually follows from the first (there is an assumption called non-contextuality involved; its full implications are best left for another thread, but it will be pointed out when required here).

Initially it was a difficult theorem to prove, but in modern times it has been greatly simplified using the concept of a POVM. Here is the proof I came up with using POVMs.

Just for completeness, let's define a POVM. A POVM is a set of positive operators Ei with ∑ Ei = I, acting on, for QM, an assumed complex vector space.

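Elements of POVMs are called effects, and it's easy to see a positive operator E is an effect if and only if I - E is also positive, i.e., all eigenvalues of E lie between 0 and 1. In particular, any positive operator with Trace(E) <= 1 is an effect.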
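For a concrete picture, here is a small sketch that builds a three-outcome qubit POVM (often called a trine POVM) and checks the defining properties: each element is positive, each has trace at most 1, I - E is positive for each element, and the elements sum to the identity. The particular POVM and the helper names are assumptions chosen for illustration, and the positivity test is specific to real symmetric 2x2 matrices.

```python
import math

def outer(v):
    """Return the 2x2 matrix |v><v| for a real 2-component vector v."""
    return [[v[0] * v[0], v[0] * v[1]], [v[1] * v[0], v[1] * v[1]]]

def is_positive(m, tol=1e-12):
    """A real symmetric 2x2 matrix is positive iff its trace and determinant are >= 0."""
    trace = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return trace >= -tol and det >= -tol

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]

# E_k = (2/3) |v_k><v_k| with v_k = (cos(k pi/3), sin(k pi/3)), k = 0, 1, 2
effects = []
for k in range(3):
    angle = k * math.pi / 3.0
    projector = outer([math.cos(angle), math.sin(angle)])
    effects.append([[2.0 / 3.0 * entry for entry in row] for row in projector])

# Sum of all elements; should reproduce the identity
total = [[sum(e[r][c] for e in effects) for c in range(2)] for r in range(2)]

traces = [e[0][0] + e[1][1] for e in effects]
all_positive = all(is_positive(e) for e in effects)
complements_positive = all(
    is_positive([[IDENTITY[r][c] - e[r][c] for c in range(2)] for r in range(2)])
    for e in effects
)

print(total, traces, all_positive, complements_positive)
```

Note that none of the three elements is a projector, which is the sense in which a POVM generalises the resolution of the identity given by the eigenvectors of an observable.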

First, let's start with the foundational axiom the proof uses as its starting point. It is a generalisation of the first axiom of QM I gave, as can be seen by decomposing O = ∑ λi |ui><ui|, where λi is an eigenvalue and |ui> is the corresponding eigenvector; the projectors |ui><ui| then form a POVM. The last part of the axiom is actually the non-contextuality I mentioned before.

An observation/measurement with possible outcomes i = 1, 2, 3 ... is described by a POVM Ei such that the probability of outcome i is determined by Ei, and only by Ei; in particular, it does not depend on what POVM it is part of.

Only by Ei means that regardless of what POVM the Ei belongs to, the probability is the same. This assumption of non-contextuality is the well-known rock bottom essence of Born's rule via Gleason. The other assumption, not explicitly stated but used, is the strong law of superposition, i.e., in principle, any POVM corresponds to an observation/measurement.

I will let f(Ei) be the probability of Ei.

First, additivity of the measure for effects.

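Let E1 + E2 = E3 where E1, E2 and E3 are all effects. Then there exists an effect E such that E1 + E2 + E = E3 + E = I. Both {E1, E2, E} and {E3, E} are POVMs, so f(E1) + f(E2) + f(E) = 1 = f(E3) + f(E). Hence f(E1) + f(E2) = f(E3).

f(I) = 1 from the law of total probability. Since I + 0 = I, additivity gives f(0) = 0.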

Next, linearity wrt the rationals - it's the usual standard argument from additivity from linear algebra, but I will repeat it anyway.

f(E) = f(n E/n) = f(E/n + ... + E/n) = n f(E/n), or (1/n) f(E) = f(E/n). Likewise f(m E/n) = f(E/n + ... + E/n) = m f(E/n), or (m/n) f(E) = f((m/n) E), provided m <= n to ensure we are dealing with effects.

I will now extend the definition to any positive operator E. If E is a positive operator, an integer n and an effect E1 exist such that E = n E1, since any positive operator with trace <= 1 is an effect. f(E) is defined as n f(E1). To show this is well defined, suppose n E1 = m E2. Then (n/(n+m)) E1 = (m/(n+m)) E2, so f((n/(n+m)) E1) = f((m/(n+m)) E2). By linearity wrt the rationals, (n/(n+m)) f(E1) = (m/(n+m)) f(E2), so n f(E1) = m f(E2).

From the definition it's easy to see that for any positive operators E1, E2, f(E1 + E2) = f(E1) + f(E2). Then, similarly to effects, one shows that for any positive rational m/n, f((m/n) E) = (m/n) f(E).

Now we want to show continuity, so that the result also holds for the reals.

If E1 and E2 are positive operators, define E2 < E1 to mean that a positive operator E exists such that E1 = E2 + E. This means f(E2) <= f(E1), since f(E1) = f(E2) + f(E) and f(E) >= 0. Let r1n be an increasing sequence of rationals whose limit is the irrational number c. Let r2n be a decreasing sequence of rationals whose limit is also c. If E is any positive operator, r1n E < cE < r2n E. So r1n f(E) <= f(cE) <= r2n f(E). Thus, by the pinching theorem, f(cE) = c f(E).

Extending it to any Hermitian operator H.

H can be broken down to H = E1 - E2, where E1 and E2 are positive operators, by, for example, separating the positive and negative eigenvalues of H. Define f(H) = f(E1) - f(E2). To show this is well defined, if E1 - E2 = E3 - E4 then E1 + E4 = E3 + E2, so f(E1) + f(E4) = f(E3) + f(E2), and hence f(E1) - f(E2) = f(E3) - f(E4). Actually, there was no need to show this because I could have defined E1 and E2 as the positive operators from separating the eigenvalues, but what the heck - it's not hard to show.
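As an illustration of the "separate the positive and negative eigenvalues" step, here is a minimal sketch for a 2x2 Hermitian matrix. It uses the closed-form eigenvalues and spectral projectors of a 2x2 matrix, so it assumes the two eigenvalues are distinct; the function name `split_hermitian` and the example matrix are hypothetical.

```python
import math

def split_hermitian(h):
    """Split a 2x2 Hermitian matrix with distinct eigenvalues into (E1, E2), H = E1 - E2."""
    a = complex(h[0][0]).real
    d = complex(h[1][1]).real
    b = complex(h[0][1])
    half_gap = math.sqrt((a - d) ** 2 / 4.0 + abs(b) ** 2)  # must be non-zero
    lam_plus = (a + d) / 2.0 + half_gap
    lam_minus = (a + d) / 2.0 - half_gap
    identity = [[1.0, 0.0], [0.0, 1.0]]
    # Spectral projectors: P+ = (H - lam_minus I) / (lam_plus - lam_minus), P- = I - P+
    p_plus = [
        [(h[r][c] - lam_minus * identity[r][c]) / (2.0 * half_gap) for c in range(2)]
        for r in range(2)
    ]
    p_minus = [[identity[r][c] - p_plus[r][c] for c in range(2)] for r in range(2)]
    # E1 keeps the positive eigenvalues, E2 keeps minus the negative ones
    e1 = [
        [max(lam_plus, 0.0) * p_plus[r][c] + max(lam_minus, 0.0) * p_minus[r][c] for c in range(2)]
        for r in range(2)
    ]
    e2 = [
        [max(-lam_plus, 0.0) * p_plus[r][c] + max(-lam_minus, 0.0) * p_minus[r][c] for c in range(2)]
        for r in range(2)
    ]
    return e1, e2

H = [[1.0, 2.0], [2.0, -2.0]]  # eigenvalues 2 and -3
e1, e2 = split_hermitian(H)
difference = [[e1[r][c] - e2[r][c] for c in range(2)] for r in range(2)]
print(e1, e2, difference)
```

The decomposition is not unique: adding the same positive operator to both E1 and E2 gives another valid pair, which is why the argument checks that f(E1) - f(E2) does not depend on the choice.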

It's easy to show linearity wrt the reals under this extended definition.

It's pretty easy to see the pattern here, but just to complete it I will extend the definition to any operator O. O can be uniquely decomposed into O = H1 + i H2, where H1 and H2 are Hermitian. Define f(O) = f(H1) + i f(H2). Again, it's easy to show linearity wrt the reals under this new definition, and then extend it to linearity wrt the complex numbers.

Now the final bit. The hard bit - namely linearity wrt any operator - has been done by extending the f defined on effects. The well-known Von Neumann argument can now be used to derive Born's rule, but for completeness I will spell out the detail.

First, it's easy to check that, for an orthonormal basis |bi>, <bi|O|bj> = Trace (O |bj><bi|).

O = ∑ <bi|O|bj> |bi><bj| = ∑ Trace (O |bj><bi|) |bi><bj|
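Both statements are easy to check numerically. The sketch below picks an orthonormal basis and an operator (both arbitrary choices made for the example), compares <bi|O|bj> with Trace (O |bj><bi|) for every pair, and then rebuilds O from the expansion.

```python
import math

def inner(u, v):
    """Inner product <u|v> for vectors given as lists."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

def ket_bra(u, v):
    """Matrix |u><v|."""
    return [[a * complex(b).conjugate() for b in v] for a in u]

def matmul(a, b):
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)] for r in range(n)]

def apply(op, v):
    """Matrix-vector product op|v>."""
    return [sum(op[r][k] * v[k] for k in range(len(v))) for r in range(len(v))]

def trace(a):
    return sum(a[k][k] for k in range(len(a)))

s = 1.0 / math.sqrt(2.0)
basis = [[s, s], [s, -s]]               # orthonormal basis |b0>, |b1> (arbitrary choice)
O = [[1.0 + 2.0j, 0.5], [-1.0j, 3.0]]   # arbitrary operator, not necessarily Hermitian

rebuilt = [[0j, 0j], [0j, 0j]]
max_gap = 0.0
for bi in basis:
    for bj in basis:
        element = inner(bi, apply(O, bj))               # <bi|O|bj>
        via_trace = trace(matmul(O, ket_bra(bj, bi)))   # Trace(O |bj><bi|)
        max_gap = max(max_gap, abs(element - via_trace))
        term = ket_bra(bi, bj)                          # |bi><bj|
        for r in range(2):
            for c in range(2):
                rebuilt[r][c] += element * term[r][c]

print(max_gap, rebuilt)
```

The reconstruction works because the |bi><bi| sum to the identity, so inserting them on both sides of O leaves it unchanged.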

Now we use the linearity that the foregoing extensions of f have led to.

f(O) = ∑ Trace (O |bj><bi|) f(|bi><bj|) = Trace (O ∑ f(|bi><bj|)|bj><bi|)

Define P as ∑ f(|bi><bj|)|bj><bi| and we have f(O) = Trace (OP).

P, by definition, is called the state of the quantum system. The following are easily seen. Since f(I) = 1 and f(I) = Trace (IP), Trace (P) = 1. Thus P has unit trace. For any unit vector |u>, f(|u><u|) >= 0 since |u><u| is an effect. Thus Trace (|u><u| P) = <u|P|u> >= 0, so P is positive.

Hence a positive operator of unit trace P exists such that the probability of Ei occurring in the POVM E1, E2 ... is Trace (Ei P).
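The construction of P can be traced through on a small example. In the sketch below the linear functional f is simulated by f(O) = Trace (O RHO) for a fixed matrix `RHO`; that stand-in is purely an assumption made so that there is something concrete to evaluate, since the proof itself only assumes that f is linear. The code builds P from the values f(|bi><bj|) in the standard basis and checks that f(O) = Trace (OP) for an arbitrary operator.

```python
RHO = [[0.7, 0.2 - 0.1j], [0.2 + 0.1j, 0.3]]  # hypothetical state, used only to define f

def matmul(a, b):
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)] for r in range(n)]

def trace(a):
    return sum(a[k][k] for k in range(len(a)))

def matrix_unit(i, j):
    """|bi><bj| in the standard basis: a single 1 at row i, column j."""
    m = [[0.0, 0.0], [0.0, 0.0]]
    m[i][j] = 1.0
    return m

def f(op):
    """Stand-in for the extended linear functional f (assumption: f(O) = Trace(O RHO))."""
    return trace(matmul(op, RHO))

# P = sum over i, j of f(|bi><bj|) |bj><bi|
P = [[0j, 0j], [0j, 0j]]
for i in range(2):
    for j in range(2):
        # |bj><bi| has its single non-zero entry at row j, column i
        P[j][i] += f(matrix_unit(i, j))

O = [[1.0, 2.0 - 1.0j], [0.5j, -3.0]]  # arbitrary operator
lhs = f(O)                 # value assigned by the functional
rhs = trace(matmul(O, P))  # Trace(O P) with the constructed P
print(P, lhs, rhs)
```

In this toy setting P simply comes back as `RHO`, with unit trace, which is the expected outcome: the functional is fully encoded by the operator P that the construction recovers.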

Whew. Glad that's over with.

So at rock bottom, QM is modelling the outcomes of observations by a Hermitian operator. Why we would want to do that is a deep issue at the foundations of QM - possibly the deep issue:
https://www.physicsforums.com/insig...ciple-at-the-foundation-of-quantum-mechanics/

Thanks
Bill
 

FAQ: Probability and Operators in Quantum Mechanics

What is probability in quantum mechanics?

In quantum mechanics, probability refers to the likelihood of a particular outcome of a measurement on a quantum system. It is computed from a probability amplitude, a complex number whose squared absolute value gives the probability (or probability density) of finding the system in a specific state.

What are operators in quantum mechanics?

Operators in quantum mechanics are mathematical tools used to describe the physical properties and behaviors of quantum systems. They act on the wave function of a system and can be used to calculate the probability of a particular outcome or the expected value of a measurement.

How are probability and operators related in quantum mechanics?

In quantum mechanics, probability and operators are closely related. The probability of obtaining a particular measurement result is given by the square of the absolute value of the probability amplitude, which is calculated using operators. Operators also play a key role in determining the evolution of a quantum system over time.

What is the difference between classical and quantum probabilities?

Classical probabilities reflect incomplete knowledge of a system that nonetheless has definite properties. In contrast, quantum probabilities are given by the squared absolute values of probability amplitudes (the Born rule), and amplitudes for different alternatives can interfere. This means that in quantum mechanics we can, in general, only calculate the probability of obtaining a particular measurement result, rather than predict the exact outcome.

How does the concept of superposition relate to probability and operators in quantum mechanics?

The concept of superposition in quantum mechanics refers to the fact that any linear combination of allowed states of a quantum system is also an allowed state. Operators and the Born rule are used to calculate the probability of obtaining a particular outcome when a system is in a superposition of the eigenstates of the measured observable. Operators also play a crucial role in manipulating and measuring superposition states in quantum systems.
