
Brownian Motions and Quantifying Randomness in Physical Systems


Stochastic calculus has come a long way since Robert Brown described the motion of pollen through a microscope in 1827. It’s now a key player in data science, quant finance, and mathematical biology. This article is drawn from notes I wrote for an undergraduate statistical physics course a few months ago. There won’t be any mathematical rigor.

Brownian processes (and a Wetherspoons customer)

In 1d, the term Brownian motion is reserved for continuous functions ##W(t)## that satisfy three key conditions:

  1. The motion starts at zero: ##W(0) = 0##
  2. Its increments ##W(t_{k+1}) - W(t_k)## for ##0 \leq t_1 \leq \dots \leq t_n## are independent of each other
  3. Its increments are normally distributed, ##W(t_{k+1}) - W(t_k) \sim N(0, t_{k+1} - t_k)##, with variance equal to the time increment.

You might like to verify that the following well-known properties hold: (i) ##E(W(t)) = 0##, (ii) ##E(W(t)^2) = t##, (iii) ##W(t) \sim N(0,t)##, and (iv) the continuous-time Markov property:

$$P(X(t) = j | X(t_1) = k_1, \dots, X(t_n) = k_n) = P(X(t) = j | X(t_n) = k_n)$$

Also, (v) Brownian motions are martingales, which means that for any ##s \geq 0##:

$$E(W(t+s) | W(t)) = W(t)$$
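
These properties are easy to spot-check numerically. Below is a minimal sketch (assuming numpy is available; the path count, step count and end time are arbitrary choices) that builds Brownian paths directly from condition 3, by cumulatively summing independent ##N(0, \Delta t)## increments, and compares the sample mean and variance of ##W(T)## with the predictions ##0## and ##T##.

```python
import numpy as np

# Illustrative parameters (arbitrary): 10,000 paths, 1,000 steps up to T = 2.
rng = np.random.default_rng(0)
n_paths, n_steps, T = 10_000, 1_000, 2.0
dt = T / n_steps

# Condition 3: independent N(0, dt) increments; their cumulative sum is W(t).
increments = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
W = np.cumsum(increments, axis=1)                 # W(dt), W(2 dt), ..., W(T)

# Properties (i)-(ii): E[W(T)] should be close to 0 and Var[W(T)] close to T.
print("sample mean of W(T):    ", W[:, -1].mean())
print("sample variance of W(T):", W[:, -1].var())
```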

Martingales have cool properties. One is that a martingale stopped at a stopping time is still a martingale (a stopping time is a random time, such as the first time the process hits a given level, that can be determined from the history of the process so far).

Consider a random walk built from independent and identically distributed (IID) random variables ##X_i## which take the values ##\pm 1## with probabilities ##p_{\pm} = 1/2##. Define ##S_n = X_1 + \dots + X_n##. This ##S_n## is a symmetric random walk: notice ##E(S_n) = 0##.

The quantity ##S_n^2 - n## is also a martingale. If ##S_n## is known, then ##S_{n+1}## is either ##S_{n} + 1## or ##S_{n}-1## with equal probabilities of ##1/2##, so:

$$E(S_{n+1}^2 - (n+1) \mid S_n) = \frac{1}{2}\left[(S_{n}+1)^2 + (S_{n}-1)^2\right] - (n+1) = S_{n}^2 - n$$

This fact will be useful in the next example:

Example: A drunken martingale


To give an example of why martingale-ness is a useful property, consider the following problem. A drunk man starts at tile ##n## of a street whose pavement tiles are numbered from ##0## to ##100##. He moves either ##+1## or ##-1## tile each step, with probability ##1/2## of going forward or backward. What is the probability that he ends up at the ##100^{\mathrm{th}}## tile (as opposed to tile ##0##), and what is his expected number of steps?

You can solve this with a more direct probability argument, but the martingale approach is faster. Transform coordinates so that his starting position is tile zero; he then finishes at either tile ##-n## or tile ##100-n##. His path is now a symmetric random walk starting at zero and ending at one of these two points.

Let ##N## be the number of steps until he reaches one of the endpoints. ##N## is a stopping time, and ##S_n## and ##S_n^2 - N## (with ##n## replaced by the step count) are martingales, so the stopped processes are martingales too. Hence ##E(S_N) = S_0 = 0## and ##E(S_N^2 - N) = S_0^2 - 0 = 0##.

If ##p_0## and ##p_{100}## are the probabilities of arriving at tile 0 and 100 respectively, then you can also figure out directly from the definition of the expectation that

$$E(S_N) = -n p_{0} + (100-n) p_{100}= 0$$

and

$$E(S_N^2 - N) = (-n)^2 p_{0} + (100-n)^2 p_{100} - E(N) = 0$$

You can check that, after putting ##p_{0} + p_{100} = 1## and solving, you obtain the probability of reaching the ##100^{\mathrm{th}}## tile as ##p_{100} = n/100## and the expected number of steps ##E(N) = n(100-n)##.
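
A brute-force simulation agrees with the martingale answer. The sketch below (numpy assumed; the starting tile ##n = 30## and the trial count are arbitrary) runs many independent walks from tile ##n## until they hit tile ##0## or tile ##100##, then compares the hitting probability and the mean number of steps with ##n/100## and ##n(100-n)##.

```python
import numpy as np

rng = np.random.default_rng(1)
n, n_trials = 30, 2_000                    # arbitrary starting tile and trial count

hits_100 = 0
total_steps = 0
for _ in range(n_trials):
    pos, steps = n, 0
    while 0 < pos < 100:
        pos += 1 if rng.random() < 0.5 else -1   # symmetric +/-1 step
        steps += 1
    hits_100 += (pos == 100)
    total_steps += steps

print("P(reach tile 100):", hits_100 / n_trials, " theory:", n / 100)
print("E(steps):         ", total_steps / n_trials, " theory:", n * (100 - n))
```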

 

Stochastic differential equations (SDEs)

A typical stochastic differential equation for a random process ##X(t)## is

$$dX(t) = \mu(X, t)dt + \sigma(X,t) dW$$

In this equation ##\mu(X,t) dt## is a drift term, and ##\sigma(X,t) dW## is a diffusive term which captures the stochastic character of the process. Mathematicians will tell you that the SDE above is an informal statement of the integral equation

$$X(t+\Delta t) - X(t) = \int_t^{t+\Delta t} \mu(X(t'),t') dt' + \int_t^{t+\Delta t} \sigma(X(t'), t') dW(t')$$

Stochastic processes obey a calculus of their own (stochastic calculus). A famous result is the Itô Lemma, which describes how to take the differential of a function ##f(X)## of a stochastic process.

Kiyosi Itô

To arrive at the result, use a Taylor expansion:

$$df = f(X+dX, t+dt) - f(X,t) = \frac{\partial f}{\partial t} dt + \frac{\partial f}{\partial X} dX + \frac{1}{2} \frac{\partial^2 f}{\partial X^2} dX^2 + \dots$$

The ##dX^2## term is retained because, from the SDE, it is first order in ##dt## (because ##dW^2 = dt##, so ##dX^2 = \sigma^2 dt + \dots##). The Itô Lemma results from inserting the SDE into the Taylor expansion:

$$df = \left( \frac{\partial f}{\partial t} + \mu \frac{\partial f}{\partial X} + \frac{1}{2} \sigma^2 \frac{\partial^2 f}{\partial X^2} \right)dt + \sigma \frac{\partial f}{\partial X} dW$$
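
The heuristic ##dW^2 = dt## used above can be made plausible numerically: the quadratic variation ##\sum_j [W(t_{j+1}) - W(t_j)]^2## of a Brownian path over ##[0, T]## concentrates around ##T## as the partition is refined. A minimal sketch (numpy assumed; the end time and step counts are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
T = 1.5                                    # arbitrary end time

for n_steps in (100, 10_000, 1_000_000):
    dt = T / n_steps
    dW = rng.normal(0.0, np.sqrt(dt), size=n_steps)   # Brownian increments
    # Quadratic variation: sum of squared increments, which should approach T.
    print(f"{n_steps:>9} steps: quadratic variation = {np.sum(dW**2):.4f}  (T = {T})")
```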

Example: Geometric Brownian Motion (GBM)

A famous SDE is

$$dX = \mu X dt + \sigma X dW$$

for constants ##\mu## and ##\sigma##. The SDE can be solved by considering the transformation ##f(X) = \log X##, so that ##\partial f/\partial X = 1/X## and ##\partial^2 f/\partial X^2 = -1/X^2##. Obviously ##\partial f/\partial t=0##. The Itô Lemma gives the following result:

$$df = \left( \mu - \frac{1}{2} \sigma^2\right)dt + \sigma dW$$

Integrating and exponentiating shows that ##X(t)## grows exponentially, multiplied by a stochastic factor ##e^{\sigma W(t)}##:

$$X = X_0 \exp \left[ \left( \mu - \frac{1}{2} \sigma^2\right)t + \sigma W(t) \right]$$

Simulation of GBM for 20 random paths.
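
The closed-form solution also makes GBM easy to simulate: generate a Brownian path ##W(t)## and substitute it into the formula. A minimal sketch of how a figure like the one above could be produced (assuming numpy and matplotlib; the drift, volatility and path count are illustrative choices):

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(3)
mu, sigma, X0 = 0.5, 0.3, 1.0              # illustrative drift, volatility, start value
n_paths, n_steps, T = 20, 500, 2.0
dt = T / n_steps
t = np.linspace(dt, T, n_steps)

# Brownian paths from independent N(0, dt) increments.
W = np.cumsum(rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps)), axis=1)

# Exact GBM solution: X = X0 exp[(mu - sigma^2/2) t + sigma W(t)].
X = X0 * np.exp((mu - 0.5 * sigma**2) * t + sigma * W)

plt.plot(t, X.T, lw=0.8)
plt.xlabel("t")
plt.ylabel("X(t)")
plt.title("Geometric Brownian motion, 20 paths")
plt.show()
```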

Kramers-Moyal Expansion

Now for a change of perspective. Consider instead the full probability density ##P(X,t)## for a stochastic process to take the value ##X## at time ##t##. How can we calculate this density at time ##t + \Delta t##? The famous Chapman-Kolmogorov theorem is helpful:

$$P(X, t+\Delta t) = \int G(X, t+\Delta t | X', t) P(X', t) dX'$$

The function ##G## is the propagator (the conditional probability density of going from ##X'## at time ##t## to ##X## at time ##t+\Delta t##). It's a similar concept to the propagator in a path integral. Make a sneaky substitution ##\Delta X := X-X'##, and write

$$P(X, t+\Delta t) = \int G(X + \Delta X - \Delta X, t+\Delta t | X - \Delta X, t) P(X-\Delta X, t) d(\Delta X)$$

(Eagle-eyed readers will have noticed that the negative sign in ##d(-\Delta X)## has been canceled by an un-shown reversal of the limits of integration.) This form is convenient because you can do a Taylor expansion in ##-\Delta X##,

$$P(X, t+\Delta t) = \int \sum_{n=0}^{\infty} \frac{(-\Delta X)^n}{n!} \frac{\partial^n}{\partial X^n} \left[ G(X+\Delta X, t+\Delta t | X, t)P(X,t) \right] d(\Delta X)$$

You can make an even sneakier improvement by moving the ##\Delta X## integration (and the ##(\Delta X)^n## term) inside the partial derivative,

$$P(X, t+ \Delta t) = \sum_{n=0}^{\infty} \frac{(-1)^n}{n!} \frac{\partial^n}{\partial X^n} \left[ \int (\Delta X)^n G(X+\Delta X, t+\Delta t | X,t) d(\Delta X) \cdot P(X,t)\right]$$

But what is ##\int (\Delta X)^n G(X+\Delta X, t+\Delta t | X,t) d(\Delta X)##? It's the ##n##-th moment ##E((\Delta X)^n)## of the increment over the interval ##\Delta t##. What you end up with is called the Kramers-Moyal expansion for the probability,

$$P(X, t+\Delta t) - P(X,t) = \sum_{n=1}^{\infty} \frac{(-1)^n}{n!} \frac{\partial^n}{\partial X^n} \left[ E((\Delta X)^n) P(X,t) \right]$$

You can turn this into a PDE by dividing through by ##\Delta t## and taking the limit ##\Delta t \rightarrow 0##.

$$\frac{\partial P}{\partial t} = \sum_{n=1}^{\infty} (-1)^n \frac{\partial^n}{\partial X^n} \left[ D^{(n)}(X) P(X,t) \right]$$

This is the famous Kramers-Moyal expansion with Kramers-Moyal coefficients:

$$D^{(n)}(X) := \lim_{\Delta t\rightarrow 0} \frac{1}{n!} \frac{E((\Delta X)^n) }{\Delta t}$$
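
The definition of the coefficients also suggests how to estimate them from simulated (or measured) data: collect increments ##\Delta X## over a small ##\Delta t## and compute their moments. A minimal sketch for a process with constant drift and volatility (numpy assumed; the true values ##\mu = 1## and ##\sigma = 0.5## are arbitrary), where the estimates should approach ##D^{(1)} = \mu## and ##D^{(2)} = \tfrac{1}{2}\sigma^2##:

```python
import numpy as np

rng = np.random.default_rng(4)
mu, sigma = 1.0, 0.5                       # true drift and volatility (arbitrary)
dt, n_increments = 1e-3, 200_000

# Increments of dX = mu dt + sigma dW, each over one small time step dt.
dX = mu * dt + sigma * rng.normal(0.0, np.sqrt(dt), size=n_increments)

# Kramers-Moyal coefficients: D^(n) = (1/n!) E[(dX)^n] / dt in the small-dt limit.
D1 = np.mean(dX) / dt
D2 = np.mean(dX**2) / (2 * dt)

print("D^(1) estimate:", D1, " expected:", mu)
print("D^(2) estimate:", D2, " expected:", 0.5 * sigma**2)
```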

The Fokker-Planck equation

If you cut off the Kramers-Moyal expansion at ##n=2##, then the result is the Fokker-Planck equation:

$$\frac{\partial P}{\partial t} = -\frac{\partial}{\partial X} [D^{(1)}(X) P] + \frac{\partial^2}{\partial X^2}[D^{(2)}(X) P]$$

Retaining only the first two terms works well much of the time. It can fail when the process has large jumps, so that the higher moments of ##\Delta X## do not vanish quickly enough as ##\Delta t \rightarrow 0##.

Physicists are more familiar with ##\mu := D^{(1)}(X)## and ##\tfrac{1}{2} \sigma^2 = D = D^{(2)}(X)##, where ##D## is Einstein's diffusion constant:

$$\frac{\partial P}{\partial t} = – \frac{\partial}{\partial X}(\mu P) + \frac{1}{2} \frac{\partial^2}{\partial X^2} (\sigma^2 P)$$

Example: Force-free Brownian motion

What do you notice in the case of zero drift, ##\mu = 0##? The Fokker-Planck equation becomes the diffusion equation

$$\frac{\partial P}{\partial t} = D \frac{\partial^2 P}{\partial X^2}$$

with probability current ##J = -D \partial P/\partial X##.
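
A quick way to see this diffusion equation in action is an explicit finite-difference scheme (a minimal sketch assuming numpy; the grid, time step and ##D = 1## are arbitrary, with the time step kept below the stability limit ##\Delta X^2/(2D)##). An initially narrow Gaussian spreads out while the total probability stays fixed, exactly as the force-free Brownian picture suggests.

```python
import numpy as np

D = 1.0                                    # diffusion constant (arbitrary)
x = np.linspace(-10, 10, 401)
dx = x[1] - x[0]
dt = 0.4 * dx**2 / (2 * D)                 # below the explicit-scheme stability limit

# Start from a narrow Gaussian (a numerically friendly stand-in for a sharp spike).
var0 = 0.1
P = np.exp(-x**2 / (2 * var0)) / np.sqrt(2 * np.pi * var0)

for _ in range(2000):
    # Forward-Euler update of dP/dt = D d^2P/dX^2 (P pinned to zero at the ends).
    P[1:-1] += dt * D * (P[2:] - 2 * P[1:-1] + P[:-2]) / dx**2

print("total probability:", P.sum() * dx)            # stays close to 1
print("variance of P:    ", (x**2 * P).sum() * dx)   # grows like var0 + 2 D t
```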

Example: Constant-force Brownian motion

In 1d, a typical (Langevin) equation of motion of a particle in a potential and subject to thermal noise is

$$\ddot{X} = -\gamma \dot{X} + F(X) + \sqrt{2kT\gamma} \frac{dW}{dt}$$

In this equation, ##\gamma## is the friction constant and ##F(X)## is the force field. The stochastic forcing term ##\sqrt{2kT\gamma} \tfrac{dW}{dt}## is sometimes called Gaussian white noise. Restrict to the over-damped case, ##|\ddot{X}| \ll \gamma |\dot{X}|##, so that you can all but ignore the ##\ddot{X}## term.

A mathematician would rather write the SDE:

$$dX = \frac{1}{\gamma} F(X) dt + \sigma dW$$

Simulation of diffusion under a constant force for 20 random paths.

For thermal stochastic noise, the volatility parameter ##\sigma = \sqrt{2kT/\gamma}## has a simple temperature dependence. The Fokker-Planck equation for this system is

$$\frac{\partial P}{\partial t} = -\frac{\partial}{\partial X} \left[ \frac{1}{\gamma} F(X) P \right] + \frac{1}{2} \sigma^2 \frac{\partial^2 P}{\partial X^2}$$

For a constant force ##F(X) = F_0##, the solution is analytic: it’s a Gaussian with mean position ##\bar{x}(t) := E(X(t)) = (F_0/\gamma)t## and a variance ##\sigma^2 t## increasing linearly with time:

$$P(X,t) = \frac{1}{\sqrt{2\pi \sigma^2 t}} \exp\left[ -\frac{(X-\bar{x}(t))^2}{2\sigma^2 t}\right]$$

Imagine what this probability density looks like as time proceeds. The peak moves to the right with speed ##F_0/\gamma##, and its height decays like ##1/\sqrt{t}## as the distribution spreads out around the mean.
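
You can check this picture by integrating the over-damped SDE directly with the Euler-Maruyama scheme, ##X_{j+1} = X_j + (F_0/\gamma)\Delta t + \sigma\sqrt{\Delta t}\,\xi_j## with ##\xi_j \sim N(0,1)##. A minimal sketch (numpy assumed; ##F_0##, ##\gamma## and ##\sigma## are arbitrary) that compares the sample mean and variance at time ##T## with ##(F_0/\gamma)T## and ##\sigma^2 T##:

```python
import numpy as np

rng = np.random.default_rng(5)
F0, gamma, sigma = 2.0, 1.0, 0.7           # arbitrary force, friction, volatility
n_paths, n_steps, T = 20_000, 1_000, 1.0
dt = T / n_steps

X = np.zeros(n_paths)                      # all paths start at X(0) = 0
for _ in range(n_steps):
    # Euler-Maruyama step for dX = (F0/gamma) dt + sigma dW.
    X += (F0 / gamma) * dt + sigma * np.sqrt(dt) * rng.normal(size=n_paths)

print("sample mean:    ", X.mean(), " theory:", (F0 / gamma) * T)
print("sample variance:", X.var(),  " theory:", sigma**2 * T)
```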

Ornstein-Uhlenbeck processes

So much for constant force fields. What about a harmonic force ##F(X) = -k(X-X_0)##? The corresponding SDE represents an Ornstein-Uhlenbeck process

$$dX = -\frac{k}{\gamma} (X-X_0) dt + \sigma dW$$

Simulation of an Ornstein-Uhlenbeck process with 20 random paths.

You can imagine a particle attached to a spring, being pulled back towards its equilibrium position ##X_0## while being disturbed by some stochastic noise. The process’s mean will approach the equilibrium position asymptotically, but the variance approaches a non-zero constant value ##\sigma^2/(2k)##. Physically: the stochastic noise prevents the system from ever settling!

You can find the full probability density ##P(X,t)## with a neat trick. To keep the notation light, absorb the friction constant from here on by writing ##k## in place of ##k/\gamma##. Define a new stochastic variable ##Y := e^{kt} X##. If you write down ##dY## using the Itô Lemma and substitute the Ornstein-Uhlenbeck SDE for ##dX##, you get:

$$dY = kX_0 e^{kt} dt + e^{kt} \sigma dW$$

Integrate from ##0## to ##t## with the initial condition ##X(0) = 0##, then convert back to ##X##:

$$X = X_0(1-e^{-kt}) + \sigma \int_0^t e^{k(t'-t)} dW(t')$$

The expectation is ##E(X(t)) := \bar{x}(t) = X_0(1-e^{-kt})##, because ##E(dW) = 0##. The variance takes slightly more work, but the deterministic part cancels out of ##X - E(X)##, leaving

$$\mathrm{Var}(X) = E(X^2) - E(X)^2 = E \left[\left( \sigma \int_0^t e^{k(t'-t)} dW(t') \right)^2 \right]$$

It helps to re-write the right-hand side as a double integral

$$\mathrm{Var}(X) = \sigma^2 \int_0^t \int_0^t e^{k(t'-t)} e^{k(t''-t)} E(dW(t')\, dW(t''))$$

The useful relationship for Brownian increments ##E(dW(t')\, dW(t'')) = \delta(t'-t'')\, dt'\, dt''## means that this simplifies to

$$\mathrm{Var}(X) = \frac{\sigma^2}{2k}(1-e^{-2kt})$$

In analogy to the result obtained for constant forcing, the full probability density in a harmonic potential is also a Gaussian, but with a very different time dependence. The density ##P(X,t)## is given by

$$P(X,t) = \frac{1}{\sqrt{\pi \frac{\sigma^2}{k}(1-e^{-2kt})}} \exp \left(- \frac{(X-\bar{x}(t))^2}{\frac{\sigma^2}{k}(1-e^{-2kt})}\right)$$

with central position ##E(X(t)) := \bar{x}(t) = X_0(1-e^{-kt})##.
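
As a final check, the same Euler-Maruyama idea applied to the Ornstein-Uhlenbeck SDE reproduces the mean ##X_0(1-e^{-kt})## and variance ##\tfrac{\sigma^2}{2k}(1-e^{-2kt})## derived above. A minimal sketch (numpy assumed; ##k##, ##X_0## and ##\sigma## are arbitrary, and ##k## again stands for ##k/\gamma##):

```python
import numpy as np

rng = np.random.default_rng(6)
k, X0, sigma = 1.5, 2.0, 0.8               # arbitrary spring constant, target, volatility
n_paths, n_steps, T = 20_000, 2_000, 2.0
dt = T / n_steps

X = np.zeros(n_paths)                      # initial condition X(0) = 0
for _ in range(n_steps):
    # Euler-Maruyama step for dX = -k (X - X0) dt + sigma dW.
    X += -k * (X - X0) * dt + sigma * np.sqrt(dt) * rng.normal(size=n_paths)

print("sample mean:    ", X.mean(), " theory:", X0 * (1 - np.exp(-k * T)))
print("sample variance:", X.var(),  " theory:", sigma**2 / (2 * k) * (1 - np.exp(-2 * k * T)))
```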
