Differentiation with matrices/vectors

In summary: the proof of the least squares estimator for ##\beta## applies the Leibniz (product) rule to differentiate ##S(\beta)=(y-X\beta)^\tau(y-X\beta)## with respect to ##\beta##, giving ##dS/d\beta = -2X^\tau y + 2X^\tau X\beta##, which is then evaluated at ##\beta=\hat\beta##. No assumption that ##X## is square is needed.
  • #1
mikeph
Hello,

I'm trying to understand this proof:

http://en.wikipedia.org/wiki/Proofs...st_squares#Least_squares_estimator_for_.CE.B2

Can someone quickly talk me through the differentiation step, bearing in mind I've never learned how to differentiate with respect to a vector?

Most confusing for me is:

1. why are they differentiating with respect to the transpose b' rather than just b?
2. where does the -2X'y term come from?
3. is there any assumption here that X is square?

Thanks for any help,
Mike
 
  • #2
It is easier with the Leibniz (product) rule: ##S(\beta)=(y-X\beta)^\tau(y-X\beta)##. (As for question 1: differentiating with respect to ##\beta## or with respect to ##\beta^\tau## is only a layout convention, column versus row, and the two results are transposes of each other.) Differentiating with respect to ##\beta## gives
\begin{align*}
S(\beta)'&=[(y-X\beta)^\tau]'\cdot (y-X\beta) + (y-X\beta)^\tau \cdot (y-X\beta)'\\
&=-X^\tau\cdot (y-X\beta) + (y-X\beta)^\tau\cdot (-X)\\
&=-X^\tau y + X^\tau X\beta -y^\tau X + \beta^\tau X^\tau X\\
&=-2X^\tau y +2X^\tau X\beta
\end{align*}
as the row-vector terms are just the transposes of the column-vector terms (matrix times column vector equals row vector times transposed matrix); writing the derivative of the scalar ##S## consistently as a column vector lets us combine them, which is where the factor ##2## and hence the ##-2X^\tau y## term come from. Nothing here requires ##X## to be square; only the products ##X^\tau X## and ##X^\tau y## need to be defined. At the evaluation point ##\beta=\hat \beta## we get
$$
\dfrac{dS}{d\beta}(\hat \beta) = S(\beta)'|_{\beta=\hat \beta} = -2X^\tau y +2X^\tau X \hat \beta
$$
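If you want to sanity-check this without worrying about layout conventions, here is a minimal numerical sketch (NumPy, with made-up random data; the names X, y, beta are illustrative, not from the thread) that compares the formula ##-2X^\tau y + 2X^\tau X\beta## against a finite-difference gradient of ##S##, using a deliberately non-square ##X##, and confirms that the derivative vanishes at ##\hat\beta##:

```python
import numpy as np

# Numerical check of dS/dbeta = -2 X^T y + 2 X^T X beta,
# with a deliberately non-square X (6 rows, 3 columns).
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))   # 6 observations, 3 parameters
y = rng.normal(size=6)
beta = rng.normal(size=3)

def S(b):
    r = y - X @ b
    return r @ r  # (y - Xb)^T (y - Xb), a scalar

analytic = -2 * X.T @ y + 2 * X.T @ X @ beta

# Central finite differences, one coordinate of beta at a time
h = 1e-6
numeric = np.array([(S(beta + h * e) - S(beta - h * e)) / (2 * h)
                    for e in np.eye(3)])
print(np.allclose(analytic, numeric, atol=1e-4))  # True

# Setting the derivative to zero gives the normal equations
# X^T X beta_hat = X^T y, so the gradient vanishes at beta_hat.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(np.allclose(X.T @ X @ beta_hat, X.T @ y))  # True
```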
 

FAQ: Differentiation with matrices/vectors

What is differentiation with matrices and vectors?

Differentiation with matrices and vectors means finding the rate of change of a matrix- or vector-valued expression with respect to one or more variables. It extends ordinary differentiation of scalar functions to functions whose inputs or outputs are vectors or matrices.
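For one concrete instance (a made-up minimal sketch, not from the thread): the gradient of the scalar function ##f(x) = x^\tau x## with respect to the vector ##x## collects the partials ##\partial f/\partial x_i## into a vector, which works out to ##2x##:

```python
import numpy as np

# Gradient of f(x) = x^T x with respect to the vector x is 2x;
# verified entry by entry with central finite differences.
def f(x):
    return x @ x

x = np.array([1.0, -2.0, 3.0])
h = 1e-6
grad = np.array([(f(x + h * e) - f(x - h * e)) / (2 * h)
                 for e in np.eye(3)])
print(np.allclose(grad, 2 * x))  # True
```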

Why is differentiation with matrices and vectors important?

Differentiation with matrices and vectors is important because it allows us to analyze and optimize systems that involve multiple variables, such as in economics, engineering, and physics. It also helps us understand the behavior of complex systems and make predictions based on their rate of change.

How is differentiation with matrices and vectors performed?

To differentiate a matrix- or vector-valued function, we take the partial derivative of each element with respect to the variable of interest while holding the other variables constant. When the variable is a scalar, the result is a matrix or vector with the same dimensions as the original; when a scalar is differentiated with respect to a vector, the partials are collected into a gradient vector, and differentiating a vector with respect to a vector yields a matrix of partials (the Jacobian).
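As a sketch of the element-wise rule just described (the function M(t) below is made up purely for illustration): differentiating a matrix-valued function of a scalar ##t## entry by entry gives a matrix of the same shape:

```python
import numpy as np

# Differentiate a 2x2 matrix-valued function M(t) of a scalar t
# entry by entry; the result has the same shape as M(t).
def M(t):
    return np.array([[t**2,      np.sin(t)],
                     [np.exp(t), 1.0      ]])

def dM(t, h=1e-6):
    return (M(t + h) - M(t - h)) / (2 * h)  # central differences

t = 0.5
exact = np.array([[2 * t,     np.cos(t)],   # d/dt of each entry
                  [np.exp(t), 0.0      ]])
print(np.allclose(dM(t), exact, atol=1e-5))  # True
```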

Can differentiation be applied to any type of matrix or vector?

Differentiation can be applied to matrices and vectors of any size, as long as their entries are differentiable functions of the variables of interest. This includes both real- and complex-valued matrices and vectors; entries that take only discrete values, however, do not vary continuously and so cannot be differentiated in the usual sense.

What are some real-world applications of differentiation with matrices and vectors?

Differentiation with matrices and vectors has a wide range of applications, such as in optimization problems, machine learning algorithms, and financial modeling. It is also used in physics to analyze motion and in economics to study supply and demand curves. Additionally, differentiation with matrices and vectors is used in computer graphics to create smooth and realistic animations.
