Gram-Schmidt Orthonormalization .... Garling Theorem 11.4.1 ....

  • #1
Math Amateur
I am reading D. J. H. Garling's book: "A Course in Mathematical Analysis: Volume II: Metric and Topological Spaces, Functions of a Vector Variable" ... ...

I am focused on Chapter 11: Metric Spaces and Normed Spaces ... ...

I need some help with an aspect of the proof of Theorem 11.4.1 ...

Garling's statement and proof of Theorem 11.4.1 reads as follows:
View attachment 7921

In the above proof by Garling we read the following:

" ... ... Let \(\displaystyle f_j = x_j - \sum_{ i = 1 }^{ j-1 } \langle x_j , e_i \rangle e_i\). Since\(\displaystyle x_j \notin W_{ j-1 }, f_j \neq 0\).

Let \(\displaystyle e_j = \frac{ f_j }{ \| f_j \| } \). Then \(\displaystyle \| e_j \| = 1\) and

\(\displaystyle \text{ span } ( e_1, \ ... \ ... \ e_j ) = \text{ span } ( W_{ j - 1 } , e_j ) = \text{ span }( W_{ j - 1 } , x_j ) = W_j \)

... ... "
Can someone please demonstrate rigorously how/why \(\displaystyle f_j = x_j - \sum_{ i = 1 }^{ j-1 } \langle x_j , e_i \rangle e_i \)

and

\(\displaystyle e_j = \frac{ f_j }{ \| f_j \| }\) imply that \(\displaystyle \text{ span } ( e_1, \ ... \ ... \ e_j ) = \text{ span } ( W_{ j - 1 } , e_j ) = \text{ span }( W_{ j - 1 } , x_j ) = W_j\)

Help will be much appreciated ...

Peter
 
  • #2
Reflecting on my post above I have formulated the following proof of Garling's statement ... ...

\(\displaystyle \text{ span } ( e_1, \ ... \ ... \ , e_j ) = \text{ span } ( W_{ j - 1 } , e_j ) = \text{ span }( W_{ j - 1 } , x_j ) = W_j\)

We have \(\displaystyle e_1 = \frac{ f_1 }{ \| f_1 \| }\) and we suppose that we have constructed \(\displaystyle e_1, \ ... \ ... \ , e_{ j - 1 }\) satisfying the conclusions of the theorem ...

Let \(\displaystyle f_j = x_j - \sum_{ i = 1 }^{ j-1 } \langle x_j , e_i \rangle e_i\)

Then \(\displaystyle e_j = \frac{ f_j }{ \| f_j \| } = \frac{ x_j - \sum_{ i = 1 }^{ j-1 } \langle x_j , e_i \rangle e_i }{ \| x_j - \sum_{ i = 1 }^{ j-1 } \langle x_j , e_i \rangle e_i \| }\)

So ...

\(\displaystyle e_j = \frac{ x_j - \langle x_j , e_1 \rangle e_1 - \langle x_j , e_2 \rangle e_2 - \ ... \ ... \ - \langle x_j , e_{ j - 1 } \rangle e_{ j - 1 } }{ \| x_j - \sum_{ i = 1 }^{ j-1 } \langle x_j , e_i \rangle e_i \| }\)

Therefore ...

\(\displaystyle x_j = \| x_j - \sum_{ i = 1 }^{ j-1 } \langle x_j , e_i \rangle e_i \| \ e_j + \langle x_j , e_1 \rangle e_1 + \langle x_j , e_2 \rangle e_2 + \ ... \ ... \ + \langle x_j , e_{ j - 1 } \rangle e_{ j - 1 }\)

Therefore \(\displaystyle x_j \in \text{ span } ( e_1, e_2, \ ... \ ... \ , e_j )\) ... ... ... ... ... (1)

But \(\displaystyle W_{j-1} = \text{ span } ( x_1, x_2, \ ... \ ... \ , x_{ j - 1 } ) = \text{ span } ( e_1, e_2, \ ... \ ... \ , e_{ j - 1} )\) ... ... ... ... ... (2)

Now (1) and (2) \(\displaystyle \Longrightarrow \text{ span } ( x_1, x_2, \ ... \ ... \ , x_j ) \subseteq \text{ span } ( e_1, e_2, \ ... \ ... \ , e_j )\)

But both spanning lists are linearly independent (the \(\displaystyle x_i\) by hypothesis and the \(\displaystyle e_i\) by orthonormality), so both spans have dimension \(\displaystyle j\) and hence they must be equal ...

That is \(\displaystyle \text{ span } ( x_1, x_2, \ ... \ ... \ , x_j ) = \text{ span } ( e_1, e_2, \ ... \ ... \ , e_j ) = W_j\)
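For completeness, the middle equality in Garling's chain, \(\displaystyle \text{ span } ( W_{ j - 1 } , e_j ) = \text{ span }( W_{ j - 1 } , x_j )\), can be read off from the same identity: since \(\displaystyle \sum_{ i = 1 }^{ j-1 } \langle x_j , e_i \rangle e_i \in W_{ j - 1 }\), the two equations

\(\displaystyle x_j = \| f_j \| \ e_j + \sum_{ i = 1 }^{ j-1 } \langle x_j , e_i \rangle e_i \ \ \ \text{ and } \ \ \ e_j = \frac{1}{ \| f_j \| } \Big( x_j - \sum_{ i = 1 }^{ j-1 } \langle x_j , e_i \rangle e_i \Big)\)

show that \(\displaystyle x_j \in \text{ span } ( W_{ j - 1 } , e_j )\) and \(\displaystyle e_j \in \text{ span } ( W_{ j - 1 } , x_j )\), so the two spans coincide ... and \(\displaystyle \text{ span }( W_{ j - 1 } , x_j ) = \text{ span } ( x_1, \ ... \ ... \ , x_j ) = W_j\) by definition ...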

Is that correct ...?

Can someone please critique the above proof, pointing out errors and/or shortcomings ...

Peter

*** EDIT ***

Above I claimed that the list of vectors \(\displaystyle e_1, e_2, \ ... \ ... \ , e_j\) was orthonormal ... and hence linearly independent ... but I needed to show that the list \(\displaystyle e_1, e_2, \ ... \ ... \ , e_j \) was orthonormal ... To show this let \(\displaystyle 1 \le k \lt j\) and calculate \(\displaystyle \langle e_j, e_k \rangle\) ... indeed it readily turns out that \(\displaystyle \langle e_j, e_k \rangle = 0\) for all \(\displaystyle k\) such that \(\displaystyle 1 \le k \lt j\), and so the list of vectors \(\displaystyle e_1, e_2, \ ... \ ... \ , e_j\) is orthonormal ...

Peter
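To record that computation explicitly: for \(\displaystyle 1 \le k \lt j\),

\(\displaystyle \langle f_j, e_k \rangle = \langle x_j, e_k \rangle - \sum_{ i = 1 }^{ j-1 } \langle x_j , e_i \rangle \langle e_i, e_k \rangle = \langle x_j, e_k \rangle - \langle x_j, e_k \rangle = 0\)

since \(\displaystyle \langle e_i, e_k \rangle = 0\) for \(\displaystyle i \neq k\) and \(\displaystyle \langle e_k, e_k \rangle = 1\) by the inductive hypothesis ... and therefore \(\displaystyle \langle e_j, e_k \rangle = \frac{ \langle f_j, e_k \rangle }{ \| f_j \| } = 0\) ...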
 
  • #3


Sure, I'd be happy to help explain this proof for you.

First, let's start with the definition of span. The span of a set of vectors is the set of all possible linear combinations of those vectors. In this case, we are dealing with the set of vectors \(\displaystyle e_1, \ ... \ , e_j\). So, the span of these vectors, denoted \(\displaystyle \text{ span } ( e_1, \ ... \ , e_j )\), is the set of all possible linear combinations of \(\displaystyle e_1, \ ... \ , e_j\).
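In symbols, this reads

\(\displaystyle \text{ span } ( e_1, \ ... \ , e_j ) = \{ \lambda_1 e_1 + \lambda_2 e_2 + \ ... \ + \lambda_j e_j \ : \ \lambda_1, \ ... \ , \lambda_j \text{ scalars} \}\)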

Now, let's look at the definition of \(\displaystyle f_j\). It is defined as \(\displaystyle x_j - \sum_{ i = 1 }^{ j-1 } \langle x_j , e_i \rangle e_i\). This means that \(\displaystyle f_j\) is a linear combination of \(\displaystyle x_j\) and \(\displaystyle e_1, \ ... \ , e_{j-1}\). Therefore, \(\displaystyle f_j\) is an element of the span of \(\displaystyle x_j\) and \(\displaystyle e_1, \ ... \ , e_{j-1}\). In other words, \(\displaystyle f_j \in \text{ span } ( x_j, e_1, \ ... \ , e_{j-1} )\).

Next, we are given that \(\displaystyle x_j \notin W_{j-1}\), i.e. \(\displaystyle x_j\) is not in the span of \(\displaystyle e_1, \ ... \ , e_{j-1}\). Therefore \(\displaystyle f_j \neq 0\): if \(\displaystyle f_j\) were zero, then \(\displaystyle x_j = \sum_{ i = 1 }^{ j-1 } \langle x_j , e_i \rangle e_i\) would lie in \(\displaystyle W_{j-1}\), contradicting our assumption. In other words, \(\displaystyle f_j\) is a non-zero vector.

Now, let's look at the definition of \(\displaystyle e_j\). It is defined as \(\displaystyle \frac{ f_j }{ \| f_j \| }\). The norm of a vector is its length or magnitude. So, the norm of \(\displaystyle f_j\), denoted \(\displaystyle \| f_j \|\), is the length or magnitude of \(\displaystyle f_j\). By dividing \(\displaystyle f_j\) by its norm, we are essentially normalizing it and making it a unit vector (a vector with length/magnitude 1). This is why \(\displaystyle \| e_j \| = 1\).
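Written out, the normalization step gives

\(\displaystyle \| e_j \| = \left\| \frac{ f_j }{ \| f_j \| } \right\| = \frac{ \| f_j \| }{ \| f_j \| } = 1\)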

Finally, we can see that \(\displaystyle e_j\) is a non-zero scalar multiple of \(\displaystyle f_j\): it points in the same direction as \(\displaystyle f_j\) but has magnitude 1 instead of \(\displaystyle \| f_j \|\). So \(\displaystyle e_j \in \text{ span } ( f_j )\) and, conversely, \(\displaystyle f_j \in \text{ span } ( e_j )\). But we also know that \(\displaystyle f_j \in \text{ span } ( x_j, e_1, \ ... \ , e_{j-1} )\), and rearranging its definition gives \(\displaystyle x_j \in \text{ span } ( f_j, e_1, \ ... \ , e_{j-1} )\). Since \(\displaystyle \text{ span } ( e_1, \ ... \ , e_{j-1} ) = W_{j-1}\), these membership relations combine to give exactly Garling's chain: \(\displaystyle \text{ span } ( e_1, \ ... \ , e_j ) = \text{ span } ( W_{j-1}, e_j ) = \text{ span } ( W_{j-1}, x_j ) = W_j\).
 

FAQ: Gram-Schmidt Orthonormalization .... Garling Theorem 11.4.1 ....

What is Gram-Schmidt Orthonormalization?

Gram-Schmidt Orthonormalization is a mathematical process used to transform a set of linearly independent vectors into a set of orthonormal vectors, which are vectors that are both orthogonal (perpendicular) and have a magnitude of 1.

Why is Gram-Schmidt Orthonormalization important?

This process is important because it replaces a set of linearly independent vectors by an orthonormal set with the same span. Orthonormal vectors are much easier to work with: coordinates with respect to an orthonormal basis are given by inner products, and orthogonal projections have simple explicit formulas, which simplifies many problems involving linearly independent vectors.

What is Garling Theorem 11.4.1?

Garling Theorem 11.4.1 states, in essence, that a linearly independent sequence of vectors x_1, x_2, ... in an inner product space can be converted, by the Gram-Schmidt Orthonormalization process, into an orthonormal sequence e_1, e_2, ... with the same successive spans, i.e. span(e_1, ..., e_j) = span(x_1, ..., x_j) = W_j for every j.

How does Gram-Schmidt Orthonormalization work?

The process works vector by vector. First, we take the first vector in the set and normalize it (divide it by its magnitude) to get a unit vector. Then, from the second vector we subtract its projection onto this first unit vector; the result is orthogonal to the first vector, and we normalize it to obtain the second unit vector. We repeat this for the remaining vectors in the set, subtracting from each vector its projections onto the previously constructed orthonormal vectors and normalizing what is left, until we have a set of orthonormal vectors.
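For concreteness, here is a minimal numerical sketch of this procedure (a NumPy illustration; the function name gram_schmidt and the example vectors are just for illustration and are not taken from Garling's text):

```python
import numpy as np

def gram_schmidt(vectors):
    """Return an orthonormal list e_1, ..., e_n with the same successive
    spans W_j as the linearly independent input list x_1, ..., x_n."""
    es = []
    for x in vectors:
        # f_j = x_j - sum_i <x_j, e_i> e_i : subtract the projections of x_j
        # onto the already-constructed orthonormal vectors
        f = np.array(x, dtype=float)
        for e in es:
            f -= np.dot(x, e) * e
        # x_j is assumed not to lie in W_{j-1}, so f is non-zero
        es.append(f / np.linalg.norm(f))   # e_j = f_j / ||f_j||
    return es

# Example: three linearly independent vectors in R^3
xs = [np.array([1.0, 1.0, 0.0]),
      np.array([1.0, 0.0, 1.0]),
      np.array([0.0, 1.0, 1.0])]
es = gram_schmidt(xs)

# The Gram matrix <e_i, e_k> should be (numerically) the 3x3 identity matrix
gram = np.array([[np.dot(a, b) for b in es] for a in es])
print(np.round(gram, 12))
```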

What are some applications of Gram-Schmidt Orthonormalization?

This process is commonly used in fields such as linear algebra, signal processing, and machine learning. It is also used in solving systems of linear equations, finding eigenvalues and eigenvectors, and constructing orthonormal bases for vector spaces.
