Gaussian process and climate model in Matlab

In summary, a group of climate scientists is running a climate model that outputs temperature data for every location on Earth for the years 2006 and 2100. The model is deterministic, and the challenge is to select parameters that give a realistic evolution in time. For this project, the focus is on the parameter "albedo of sea ice" (θ), and the goal is to choose the value for which the model output best matches the temperature observed on October 18, 2019. The scientists have spent a month running the model and have provided five evaluation points (θ, y(θ)). To model the relationship between parameter value and score, a Gaussian process model is used with constant mean 0.5 and a covariance built from the variance and correlation function given in the problem statement.
  • #1
LogarithmLuke
A group of climate scientists is running a climate model that outputs the temperature at every location on Earth for every 6-hour period in the years 2006 and 2100. The climate model is deterministic: given the atmospheric starting conditions and model parameters, you will always get the same result. The challenge is that the parameters of the climate model must be selected so that the output provides as realistic an evolution in time as possible. This is immensely difficult because running the model only once may require one month of computation time. For the sake of this project, assume that the only way to choose these parameters is to run the climate model for different parameter values and compare to observed temperatures.

We limit the focus to one parameter, “the albedo of sea ice”, which is a measure of how much sunlight is reflected by sea ice. We call this parameter θ, and we decide to choose this parameter so that the temperature observed on October 18, 2019, at 12:00–18:00 best matches the output of the climate model. The fit is measured through a score y(θ) calculated based on the model output generated with parameter value θ.

The group of climate scientists has spent the last month running the model in five computing centres and provides you with five evaluation points (θ, y(θ)): (0.30, 0.5), (0.35, 0.32), (0.39, 0.40), (0.41, 0.35), and (0.45, 0.60).

Use a Gaussian process model {Y(θ) : θ ∈ [0, 1]} to model the unknown relationship between the parameter value and the score. Use E[Y(θ)] ≡ 0.5, Var[Y(θ)] ≡ 0.5², and Corr[Y(θ₁), Y(θ₂)] = (1 + 15|θ₁ − θ₂|) exp(−15|θ₁ − θ₂|) for θ₁, θ₂ ∈ [0, 1].

a) Define a regular grid of parameter values from θ = 0.25 to θ = 0.50 with spacing 0.005 (n = 51 points). Construct the mean vector and the covariance matrix required to compute the conditional means and covariances of the process at the 51 points conditional on the five evaluation points. Display the prediction as a function of θ, along with 90% prediction intervals.

b) The scientists’ goal is to achieve y(θ) < 0.30. Use the predictions from a) to compute the conditional probability that y(θ) < 0.30 given the 5 evaluation points. Plot the probability as a function of θ.
I'll admit I am very new to Gaussian processes, but from what I know a Gaussian process is completely determined by a mean function E[Y(θ)] and a covariance function Cov[Y(θ1), Y(θ2)]. E[Y(θ)] is given, and we have the correlation, which is just the covariance divided by the product of the standard deviations, sqrt(Var[Y(θ1)] · Var[Y(θ2)]). I believe the dimension of the mean vector and covariance matrix is 51 (equal to the number of grid points where the score is unknown).
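For reference, here is a sketch of the standard multivariate normal conditioning result that part a) relies on. The labels A (the 51 grid points), B (the five evaluation points) and σ² are introduced here for illustration and are not notation from the assignment; the variance is read as 0.5² = 0.25.

```latex
\operatorname{Cov}[Y(\theta_1), Y(\theta_2)]
  = \sigma^2 \operatorname{Corr}[Y(\theta_1), Y(\theta_2)],
  \qquad \sigma^2 = 0.5^2

\begin{pmatrix} Y_A \\ Y_B \end{pmatrix}
  \sim N\!\left(
    \begin{pmatrix} \mu_A \\ \mu_B \end{pmatrix},
    \begin{pmatrix} \Sigma_{AA} & \Sigma_{AB} \\ \Sigma_{BA} & \Sigma_{BB} \end{pmatrix}
  \right)

\operatorname{E}[Y_A \mid Y_B = y_B]
  = \mu_A + \Sigma_{AB} \Sigma_{BB}^{-1} (y_B - \mu_B)

\operatorname{Cov}[Y_A \mid Y_B = y_B]
  = \Sigma_{AA} - \Sigma_{AB} \Sigma_{BB}^{-1} \Sigma_{BA}
```

Here μ_A and μ_B are constant vectors with every entry 0.5, Σ_AA is 51 × 51, Σ_AB is 51 × 5 and Σ_BB is 5 × 5, so the conditional mean has 51 entries and the conditional covariance is 51 × 51.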
So it seems I have everything I need, but I don't know how to go about computing these things in Matlab.
We were given the following algorithm by our lecturer:
Code:
calculate the Cholesky decomposition Σ = L Lᵀ
for i = 1, ..., n
    draw z_i ∼ N(0, 1)
end
set x⃗ = μ⃗ + L z⃗

Here x⃗ is a draw from N_n(μ⃗, Σ). However, I am not sure how to use this to calculate the information requested in a) and b).
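The lecturer's algorithm might look as follows in Matlab. This is a minimal sketch that draws one realization from the prior on the 51-point grid, reading the variance as 0.5² = 0.25; the variable names are my own. Note that the algorithm only produces samples. Parts a) and b) ask for conditional means, variances and probabilities, which can be computed in closed form without any sampling.

```matlab
% Grid from part a): 51 points from 0.25 to 0.50
n     = 51;
theta = linspace(0.25, 0.50, n)';

% Prior mean vector and covariance matrix on the grid
mu    = 0.5 * ones(n, 1);
d     = abs(theta - theta');                 % pairwise distances (implicit expansion, R2016b+)
Sigma = 0.5^2 * (1 + 15*d) .* exp(-15*d);    % variance times correlation

% Step 1: Cholesky decomposition Sigma = L*L'
L = chol(Sigma, 'lower');

% Steps 2-4: n independent standard normal draws (vectorized loop)
z = randn(n, 1);

% Step 5: x is one draw from N_n(mu, Sigma)
x = mu + L*z;
```

To draw from the process conditional on the five evaluation points instead, one would replace mu and Sigma by the conditional mean and covariance. That conditional covariance is singular at the observed grid points, so chol would probably need a small diagonal jitter there.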
 
  • #2
I think I need to use the Cholesky decomposition to calculate the conditional mean and covariance. I also need to use the conditional probability formula, but I am not sure how to incorporate the mean and covariance into that formula. I would appreciate any guidance on how to proceed with this problem.
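One possible way to carry out parts a) and b) in Matlab is sketched below. It applies the standard conditioning formula for a multivariate normal vector directly; the Cholesky factor is only needed if you want to simulate realizations. All variable names are my own, the variance is read as 0.5² = 0.25, the 0.95 standard normal quantile is hard-coded as 1.6449, and the normal CDF is written with erfc so that no toolbox is assumed.

```matlab
% Evaluation points (theta, y(theta)) from the problem statement
thetaObs = [0.30; 0.35; 0.39; 0.41; 0.45];
yObs     = [0.50; 0.32; 0.40; 0.35; 0.60];

% Prediction grid from part a)
thetaGrid = linspace(0.25, 0.50, 51)';

% Prior mean, variance and correlation function
mu0     = 0.5;
sigma2  = 0.5^2;
corrFun = @(d) (1 + 15*d) .* exp(-15*d);
covFun  = @(a, b) sigma2 * corrFun(abs(a - b'));  % implicit expansion, R2016b+

% Covariance blocks: A = grid points, B = evaluation points
SigmaAA = covFun(thetaGrid, thetaGrid);   % 51 x 51
SigmaAB = covFun(thetaGrid, thetaObs);    % 51 x 5
SigmaBB = covFun(thetaObs,  thetaObs);    % 5 x 5

% a) Conditional mean and covariance on the grid
muC    = mu0 + SigmaAB * (SigmaBB \ (yObs - mu0));
SigmaC = SigmaAA - SigmaAB * (SigmaBB \ SigmaAB');
sdC    = sqrt(max(diag(SigmaC), 0));      % clip tiny negative round-off

% 90% prediction interval: mean +/- 1.6449 standard deviations
z95  = 1.6449;
lo90 = muC - z95 * sdC;
hi90 = muC + z95 * sdC;

% b) P(Y(theta) < 0.30 | data) from the conditional normal distribution
stdNormCdf = @(z) 0.5 * erfc(-z / sqrt(2));
probBelow  = stdNormCdf((0.30 - muC) ./ sdC);
```

To display the results, something like `plot(thetaGrid, muC, thetaGrid, lo90, '--', thetaGrid, hi90, '--')` for a) and `plot(thetaGrid, probBelow)` for b) should work. Because the model is deterministic and the five evaluation points lie on the grid, the interval collapses to the observed score at those points and the probability there is essentially 0.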
 

FAQ: Gaussian process and climate model in Matlab

What is a Gaussian process?

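A Gaussian process is a collection of random variables, indexed by time, space, or a parameter, such that any finite subset of them has a joint multivariate normal distribution. It is fully specified by a mean function and a covariance function, and it is often used in machine learning and data analysis to model an unknown function from a limited number of observations.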

How is a Gaussian process used in climate modeling?

A Gaussian process can be used in climate modeling to predict climate variables, such as temperature and precipitation, from a limited set of data. As in the thread above, it can also act as a cheap statistical stand-in for an expensive simulator: the process is conditioned on a few model runs and then predicts, with uncertainty, the output at parameter values that were never run.

What is the role of Matlab in Gaussian process and climate modeling?

Matlab is a popular programming language and software environment used for scientific and numerical computing. It offers various tools and functions for implementing and analyzing Gaussian processes and climate models, making it a useful tool for researchers and scientists in this field.

What are the advantages of using a Gaussian process in climate modeling?

One of the main advantages of using a Gaussian process in climate modeling is its ability to handle non-linear relationships between variables. It also provides uncertainty estimates for the predicted values, which is crucial for understanding the reliability of the model's results.

Are there any limitations to using a Gaussian process in climate modeling?

While Gaussian processes have many advantages, they also have some limitations. Exact inference requires factorizing an n × n covariance matrix, which costs on the order of n³ operations for n data points, so they scale poorly to large datasets. The mean and covariance functions and their parameters must be chosen with care to avoid overfitting. Additionally, they may not be suitable for modeling extreme events or rare occurrences in climate data.
