MHB Probability and Operators in Quantum Mechanics

AI Thread Summary
In quantum mechanics, the expectation value of an observable, such as momentum, is expressed using operators associated with those observables. The momentum operator is defined as \(\hat{p} = -i \hbar \frac{\partial}{\partial x}\), leading to the expectation value \(\langle p \rangle = \int_{\mathbb{R}} \Psi^* \hat{p} \Psi \, dx\). The derivation involves the time-dependent Schrödinger equation, where the wave function's probability density informs the calculation of expectation values. The discussion also touches on foundational axioms of quantum mechanics, including the role of Hermitian operators and the Born Rule, which connects measurements to positive operators of unit trace. Overall, the thread emphasizes the mathematical framework that underpins quantum mechanics and its implications for observable outcomes.
Ackbach
Unfortunately, I can't find the thread (if someone finds it, please let me know, and I'll merge this post onto that thread), but someone asked why it is that in quantum mechanics, if you have an observable $B$, the expectation value (average value) $\langle B \rangle$ is given by
$$\langle B \rangle = \int_{\mathbb{R}}\Psi^{*} \hat{B} \, \Psi\,dx,$$
where $\hat{B}$ is the operator "associated with" the observable $B$.
If I recall correctly, the observable in question was momentum, whose operator is
$$\hat{p}= -i \hbar\,\frac{\partial}{ \partial x}.$$
So, why is
$$ \langle p \rangle= \int_{\mathbb{R}}\Psi^{*} \left( -i \hbar\,\frac{\partial}{ \partial x} \right)\, \Psi\,dx?$$

The following derivation will follow Griffiths' Introduction to Quantum Mechanics, 1st Ed., pages 11-16.

Let's start with $\Psi$, which is the wave function solution of the time-dependent Schrödinger equation in one dimension:
$$i \hbar \frac{ \partial \Psi}{ \partial t}=- \frac{ \hbar^{2}}{2m}
\frac{ \partial^{2} \Psi}{\partial x^{2}}+V \Psi.$$
The statistical interpretation of the wave function tells us that $|\Psi(x,t)|^{2}$ is the probability density for finding the particle at point $x$ at time $t$. So if I want to find the expectation value of the observable $x$, I should compute
$$\langle x \rangle=\int_{\mathbb{R}}x|\Psi(x,t)|^{2}\,dx.$$
Now, if we want the observable $p$, we would like
$$p=mv=m\frac{dx}{dt}.$$
Shifting to expectation values, we would like
$$\langle p \rangle=m\frac{d \langle x \rangle}{dt}
=m\,\frac{d}{dt}\int_{\mathbb{R}}x|\Psi(x,t)|^{2}\,dx
=m\int_{\mathbb{R}}x \frac{ \partial}{ \partial t}| \Psi(x,t)|^{2}\,dx.$$
Now the Schrödinger equation tells us that
$$ \frac{ \partial \Psi}{\partial t}= \frac{i \hbar}{2m} \frac{ \partial^{2} \Psi}{ \partial x^{2}}- \frac{i}{ \hbar} V \Psi,$$
and hence
$$ \frac{ \partial \Psi^{*}}{\partial t}=- \frac{i \hbar}{2m} \frac{ \partial^{2} \Psi^{*}}{ \partial x^{2}}+ \frac{i}{ \hbar} V \Psi^{*}.$$
Thus,
$$\frac{ \partial}{ \partial t}| \Psi(x,t)|^{2}
=\frac{ \partial}{ \partial t}(\Psi^{*} \Psi)
= \Psi^{*} \frac{ \partial \Psi}{ \partial t}+ \Psi \frac{ \partial \Psi^{*}}{ \partial t}$$
$$=\Psi^{*} \left(\frac{i \hbar}{2m} \frac{ \partial^{2} \Psi}{ \partial x^{2}}- \frac{i}{ \hbar} V \Psi \right)+
\Psi \left( - \frac{i \hbar}{2m} \frac{ \partial^{2} \Psi^{*}}{ \partial x^{2}}+ \frac{i}{ \hbar} V \Psi^{*} \right)$$
$$= \frac{i \hbar}{2m} \left( \Psi^{*} \frac{ \partial^{2} \Psi}{ \partial x^{2}}- \Psi \frac{ \partial^{2} \Psi^{*}}{ \partial x^{2}} \right)$$
$$= \frac{i \hbar}{2m} \frac{ \partial}{ \partial x}\left( \Psi^{*} \frac{ \partial \Psi}{ \partial x}- \Psi \frac{ \partial \Psi^{*}}{ \partial x} \right).$$
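To check the last step, just expand the $x$-derivative; the cross terms involving first derivatives cancel:
$$\frac{ \partial}{ \partial x}\left( \Psi^{*} \frac{ \partial \Psi}{ \partial x}- \Psi \frac{ \partial \Psi^{*}}{ \partial x} \right)
= \frac{ \partial \Psi^{*}}{ \partial x}\frac{ \partial \Psi}{ \partial x}+\Psi^{*} \frac{ \partial^{2} \Psi}{ \partial x^{2}}
- \frac{ \partial \Psi}{ \partial x}\frac{ \partial \Psi^{*}}{ \partial x}- \Psi \frac{ \partial^{2} \Psi^{*}}{ \partial x^{2}}
= \Psi^{*} \frac{ \partial^{2} \Psi}{ \partial x^{2}}- \Psi \frac{ \partial^{2} \Psi^{*}}{ \partial x^{2}}.$$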
Plugging this into our latest expression for $\langle p \rangle$ yields
$$\langle p \rangle=m\int_{\mathbb{R}}x \frac{ \partial}{ \partial t}| \Psi(x,t)|^{2}\,dx
=m\int_{\mathbb{R}}x \left( \frac{i \hbar}{2m} \frac{ \partial}{ \partial x}\left( \Psi^{*} \frac{ \partial \Psi}{ \partial x}- \Psi \frac{ \partial \Psi^{*}}{ \partial x} \right) \right)\,dx$$
$$=\frac{i \hbar}{2}\int_{\mathbb{R}}x \frac{ \partial}{ \partial x}\left( \Psi^{*} \frac{ \partial \Psi}{ \partial x}- \Psi \frac{ \partial \Psi^{*}}{ \partial x} \right) \,dx.$$
We can integrate this by parts:
$$ \langle p \rangle=-\frac{i \hbar}{2}\int_{\mathbb{R}} \left( \Psi^{*} \frac{ \partial \Psi}{ \partial x}- \Psi \frac{ \partial \Psi^{*}}{ \partial x} \right) \,dx.$$
Here we have used the fact that $\Psi$ must go to zero as $x\to \pm \infty$ to eliminate the boundary term. Now, integrating the second term by parts as well (the boundary term again vanishes) gives $\int_{\mathbb{R}} \Psi \frac{ \partial \Psi^{*}}{ \partial x}\,dx = -\int_{\mathbb{R}} \Psi^{*} \frac{ \partial \Psi}{ \partial x}\,dx$, so the two terms combine and we obtain
$$ \langle p \rangle =-i \hbar \int_{ \mathbb{R}} \Psi^{*} \frac{ \partial \Psi}{ \partial x} \, dx
=\int_{\mathbb{R}}\Psi^{*} \left( -i \hbar\,\frac{\partial}{ \partial x} \right)\, \Psi\,dx.$$
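If you want to see this in action numerically, here is a small sketch (not part of the original question; the Gaussian packet and the values of $x_0$, $k_0$, $\sigma$ are just choices for illustration) that evaluates $\langle x \rangle$ and $\langle p \rangle$ on a grid for the packet $\Psi(x) \propto e^{ik_0 x}e^{-(x-x_0)^2/(4\sigma^2)}$, for which the exact answers are $\langle x\rangle = x_0$ and $\langle p\rangle = \hbar k_0$:

```python
import numpy as np

hbar = 1.0                      # work in units where hbar = 1
x0, k0, sigma = 1.5, 2.0, 0.7   # packet centre, mean wavenumber, width (arbitrary choices)

x = np.linspace(-20, 20, 4001)
dx = x[1] - x[0]

# Normalized Gaussian wave packet Psi(x) ~ exp(i k0 x) exp(-(x - x0)^2 / (4 sigma^2))
psi = (2 * np.pi * sigma**2) ** (-0.25) * np.exp(1j * k0 * x - (x - x0) ** 2 / (4 * sigma**2))

# <x> = integral of x |Psi|^2 dx
exp_x = np.trapz(x * np.abs(psi) ** 2, dx=dx)

# <p> = integral of Psi* (-i hbar d/dx) Psi dx, with the derivative taken numerically
dpsi_dx = np.gradient(psi, dx)
exp_p = np.trapz(np.conj(psi) * (-1j * hbar) * dpsi_dx, dx=dx)

print(exp_x)          # approximately 1.5  (= x0)
print(exp_p.real)     # approximately 2.0  (= hbar * k0)
```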
As it turns out, all observables can be written in terms of position and momentum, so everywhere you see an $x$, "replace" it with multiplication by $x$, and everywhere you see a $p$, replace it with the operator $-i \hbar (\partial / \partial x)$.
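For example (a standard illustration), the classical kinetic energy is $T = p^{2}/(2m)$, so the corresponding operator and expectation value are
$$\hat{T} = \frac{\hat{p}^{2}}{2m} = -\frac{\hbar^{2}}{2m}\frac{\partial^{2}}{\partial x^{2}},
\qquad
\langle T \rangle = \int_{\mathbb{R}} \Psi^{*}\left(-\frac{\hbar^{2}}{2m}\frac{\partial^{2}\Psi}{\partial x^{2}}\right)dx,$$
which is exactly the kinetic term appearing in the Schrödinger equation above.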

So, to sum up: this is the operator representation of $p$, because we want to impose the condition that $\langle p \rangle= m d \langle x \rangle/dt$. And then, because of the Schrödinger equation and the derivation I showed above, you obtain the desired representation.
 
Nice answer. A more complete analysis of these foundational issues in QM can be found in Ballentine:
https://www.amazon.com.au/dp/9814578584/.

In that standard textbook, you will find QM based on just two axioms. The first is that a Hermitian operator is associated with any observation; its eigenvalues are the possible outcomes, and the operator is called the observable of the observation. The second is the so-called Born Rule: given an observable O, there exists a positive operator of unit trace, P, called the system's state, such that the expectation of the outcome of the observation associated with O is E(O) = Trace(PO). It is a surprising fact that, by a famous theorem called Gleason's Theorem, the second axiom actually follows from the first (there is an assumption called non-contextuality involved; its full implications are best left for another thread, but it will be pointed out here when required).
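As a concrete illustration of E(O) = Trace(PO) (my own small example, not from Ballentine; the particular state and observable below are arbitrary choices), for a single qubit the trace formula agrees with the eigenvalue-weighted outcome probabilities:

```python
import numpy as np

# Observable: Pauli Z, with eigenvalues +1 and -1
O = np.array([[1, 0], [0, -1]], dtype=complex)

# A state (positive operator of unit trace); a partly mixed qubit state, arbitrary example
P = np.array([[0.7, 0.2], [0.2, 0.3]], dtype=complex)

# Born rule expectation: E(O) = Trace(P O)
expectation_trace = np.trace(P @ O).real

# Same number computed from eigenvalues and outcome probabilities <u_i|P|u_i>
eigvals, eigvecs = np.linalg.eigh(O)
probs = [np.real(eigvecs[:, i].conj() @ P @ eigvecs[:, i]) for i in range(2)]
expectation_spectral = sum(l * p for l, p in zip(eigvals, probs))

print(expectation_trace, expectation_spectral)  # both approximately 0.4
```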

Initially it was a difficult theorem to prove, but in modern times it has been greatly simplified using the concept of a POVM. Here is the proof I came up with using POVMs.

Just for completeness, let's define a POVM. A POVM is a set of positive operators Ei with ∑ Ei = I, acting on (for QM) an assumed complex vector space.

Elements of POVMs are called effects, and it's easy to see that a positive operator E is an effect if and only if Trace(E) <= 1.
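To make the definition tangible, here is a small example of my own: the three "trine" effects for a qubit, Ei = (2/3)|ψi><ψi| with the |ψi> spaced 120° apart on the Bloch circle. Each is positive with trace 2/3 <= 1, and together they sum to the identity:

```python
import numpy as np

# Trine POVM for a qubit: E_k = (2/3) |psi_k><psi_k|, with |psi_k> at angles 0, 120, 240 degrees
# (a standard example, chosen here just for illustration)
angles = [0, 2 * np.pi / 3, 4 * np.pi / 3]
effects = []
for theta in angles:
    psi = np.array([np.cos(theta / 2), np.sin(theta / 2)], dtype=complex)
    effects.append((2 / 3) * np.outer(psi, psi.conj()))

# Each effect is positive with trace <= 1, and the effects sum to the identity
for E in effects:
    assert np.all(np.linalg.eigvalsh(E) >= -1e-12)
    assert np.trace(E).real <= 1 + 1e-12
print(np.round(sum(effects), 10))   # prints the 2x2 identity matrix
```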

First, let's state the foundational axiom the proof uses as its starting point. It is a generalisation of the first axiom of QM I gave, obtained by decomposing O = ∑ λi |ui><ui|, where λi is an eigenvalue and |ui> is the corresponding eigenvector. The last part of the statement below is actually the non-contextuality I mentioned before.

An observation/measurement with possible outcomes i = 1, 2, 3, ... is described by a POVM {Ei} such that the probability of outcome i is determined by Ei, and only by Ei; in particular, it does not depend on what POVM Ei is part of.

"Only by Ei" means that regardless of what POVM the Ei belongs to, the probability is the same. This assumption of non-contextuality is the well-known rock-bottom essence of Born's rule via Gleason. The other assumption, not explicitly stated but used, is the strong law of superposition, i.e. that in principle any POVM corresponds to an observation/measurement.

I will let f(Ei) be the probability of Ei.

First, additivity of the measure for effects.

Let E1 + E2 = E3, where E1, E2 and E3 are all effects. Then there exists an effect E with E1 + E2 + E = E3 + E = I. Since {E1, E2, E} and {E3, E} are both POVMs, f(E1) + f(E2) + f(E) = 1 = f(E3) + f(E), and hence f(E1) + f(E2) = f(E3).

f(I) = 1 from the law of total probability. Since I + 0 = I, f(0) = 0.

Next, linearity with respect to the rationals. It's the usual standard argument from additivity, familiar from linear algebra, but I will repeat it anyway.

f(E) = f(n(E/n)) = f(E/n + ... + E/n) = n f(E/n), so f(E/n) = (1/n) f(E). Then f((m/n)E) = f(E/n + ... + E/n) (m terms) = m f(E/n) = (m/n) f(E), taking m <= n to ensure we are still dealing with effects.

We now extend the definition to any positive operator E. If E is a positive operator, there exist an integer n and an effect E1 with E = n E1, as is easily seen from the fact that effects are positive operators with trace <= 1. Define f(E) = n f(E1). To show this is well defined, suppose n E1 = m E2. Then (n/(n+m)) E1 = (m/(n+m)) E2, so f((n/(n+m)) E1) = f((m/(n+m)) E2), i.e. (n/(n+m)) f(E1) = (m/(n+m)) f(E2), and hence n f(E1) = m f(E2).

From this definition it's easy to see that f(E1 + E2) = f(E1) + f(E2) for any positive operators E1, E2. Then, just as for effects, one shows f((m/n) E) = (m/n) f(E) for any rational m/n.

Now we want to use a continuity (squeezing) argument to show this holds for the reals.

If E1 and E2 are positive operators, define E2 <= E1 to mean that a positive operator E exists with E1 = E2 + E. This implies f(E2) <= f(E1). Let r1n be an increasing sequence of rationals whose limit is the irrational number c, and let r2n be a decreasing sequence of rationals whose limit is also c. If E is any positive operator, then r1n E <= c E <= r2n E, so r1n f(E) <= f(cE) <= r2n f(E). Thus, by the pinching (squeeze) theorem, f(cE) = c f(E).

Extending it to any Hermitian operator H.

H can be decomposed as H = E1 - E2, where E1 and E2 are positive operators, for example by separating the positive and negative eigenvalues of H. Define f(H) = f(E1) - f(E2). To show this is well defined: if E1 - E2 = E3 - E4, then E1 + E4 = E3 + E2, so f(E1) + f(E4) = f(E3) + f(E2), and hence f(E1) - f(E2) = f(E3) - f(E4). Actually, there was no need to show independence of the decomposition, because I could simply have defined E1 and E2 as the positive operators obtained by separating the eigenvalues, but what the heck - it's not hard to show.
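A quick numerical illustration of that decomposition (my own sketch; the matrix H below is arbitrary): separate the positive and negative eigenvalues of a Hermitian matrix and check that H = E1 - E2 with E1 and E2 positive.

```python
import numpy as np

# An arbitrary Hermitian matrix (example choice)
H = np.array([[1.0, 2.0 - 1.0j],
              [2.0 + 1.0j, -3.0]])

# Separate positive and negative eigenvalues: H = E1 - E2 with E1, E2 positive
eigvals, eigvecs = np.linalg.eigh(H)
E1 = eigvecs @ np.diag(np.clip(eigvals, 0, None)) @ eigvecs.conj().T    # positive part
E2 = eigvecs @ np.diag(np.clip(-eigvals, 0, None)) @ eigvecs.conj().T   # negative part, made positive

assert np.allclose(H, E1 - E2)
assert np.all(np.linalg.eigvalsh(E1) >= -1e-12)
assert np.all(np.linalg.eigvalsh(E2) >= -1e-12)
print("H = E1 - E2 with E1, E2 positive: OK")
```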

It's easy to show linearity with respect to the reals under this extended definition.

It's pretty easy to see the pattern here, but just to complete it, we extend the definition to any operator O. O can be uniquely decomposed as O = H1 + i H2, where H1 and H2 are Hermitian. Define f(O) = f(H1) + i f(H2). Again, it's easy to show linearity with respect to the reals under this new definition, and then to extend it to linearity with respect to complex numbers.

Now the final bit. The hard part - namely linearity with respect to any operator - has been done by extending the f defined on effects. The well-known von Neumann argument can now be used to derive Born's rule, but for completeness I will spell out the details.

First, it's easy to check that <bi|O|bj> = Trace(O |bj><bi|) for any orthonormal basis {|bi>}.

O = ∑ <bi|O|bj> |bi><bj| = ∑ Trace (O |bj><bi|) |bi><bj|
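If it helps, here is a quick numerical check of both identities in a fixed orthonormal basis (my own sketch; the random operator and the standard basis are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 3
O = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))   # an arbitrary operator
basis = np.eye(d, dtype=complex)                              # orthonormal basis |b_i>

# <b_i|O|b_j> = Trace(O |b_j><b_i|)
for i in range(d):
    for j in range(d):
        lhs = basis[:, i].conj() @ O @ basis[:, j]
        rhs = np.trace(O @ np.outer(basis[:, j], basis[:, i].conj()))
        assert np.isclose(lhs, rhs)

# O = sum_ij Trace(O |b_j><b_i|) |b_i><b_j|
O_rebuilt = sum(np.trace(O @ np.outer(basis[:, j], basis[:, i].conj()))
                * np.outer(basis[:, i], basis[:, j].conj())
                for i in range(d) for j in range(d))
assert np.allclose(O, O_rebuilt)
print("Both identities check out")
```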

Now we use the linearity that the foregoing extensions of f have established.

f(O) = ∑ Trace (O |bj><bi|) f(|bi><bj|) = Trace (O ∑ f(|bi><bj|)|bj><bi|)

Define P as ∑ f(|bi><bj|)|bj><bi| and we have f(O) = Trace (OP).

P, by definition, is called the state of the quantum system. The following are easily seen. Since f(I) = 1, Trace(P) = 1, so P has unit trace. And f(|u><u|) >= 0 for any unit vector |u>, since |u><u| is an effect; thus Trace(|u><u| P) = <u|P|u> >= 0, so P is positive.

Hence a positive operator of unit trace P exists such that the probability of Ei occurring in the POVM E1, E2 ... is Trace (Ei P).
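Putting the pieces together numerically (again my own sketch, reusing the trine POVM from above with an arbitrary state P): the numbers Trace(Ei P) are nonnegative and sum to 1, as probabilities should.

```python
import numpy as np

# A state: positive operator of unit trace (arbitrary choice)
P = np.array([[0.6, 0.1 + 0.2j],
              [0.1 - 0.2j, 0.4]])

# A POVM: the trine effects E_k = (2/3)|psi_k><psi_k| from the earlier example
effects = []
for theta in [0, 2 * np.pi / 3, 4 * np.pi / 3]:
    psi = np.array([np.cos(theta / 2), np.sin(theta / 2)], dtype=complex)
    effects.append((2 / 3) * np.outer(psi, psi.conj()))

# Probabilities of the outcomes: Trace(E_i P)
probs = [np.trace(E @ P).real for E in effects]
print(np.round(probs, 4))     # each entry >= 0
print(sum(probs))             # 1.0, since the effects sum to the identity
```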

Whew. Glad that's over with.

So at rock bottom, QM is modelling the outcomes of observations by a Hermitian operator. Why we would want to do that is a deep issue at the foundations of QM - possibly the deep issue:
https://www.physicsforums.com/insig...ciple-at-the-foundation-of-quantum-mechanics/

Thanks
Bill
 