Yet Another Basic Question on Linear Transformations and Their Matrices

In summary, the thread discusses the basics of linear transformations and their matrix representations. Deveno points out that vectors are elements of a vector space and that coordinates alone do not always tell us which point an element refers to; two more pieces of information are needed in order to say which "point" a column matrix refers to: "relative to what (origin) point", and "using what coordinate system".
  • #1
Math Amateur
I am revising the basics of linear transformations and trying to get a thorough understanding of linear transformations and their matrices ... ...

At present I am working through examples and exercises in Seymour Lipschutz's book: Linear Algebra, Fourth Edition (Schaum's Outline Series) ... ...

At present I am focused on Chapter 6: Linear Mappings and Matrices ...

I need help with an aspect of Example 6.1 on page 196 ...

Example 6.1 reads as follows:

[Example 6.1 is shown in an attached image: attachment 5279.]

Now in Example 6.1 (a), (1) above, Lipschutz determines \(\displaystyle F(u_1)\) as follows:

\(\displaystyle F(u_1) = F \left( \begin{bmatrix} 1 \\ 2 \end{bmatrix} \right) = \begin{bmatrix} 8 \\ -6 \end{bmatrix}\)

and then Lipschutz goes on to find the coordinates \(\displaystyle x\) and \(\displaystyle y\) of \(\displaystyle \begin{bmatrix} 8 \\ -6 \end{bmatrix}\) relative to the basis \(\displaystyle \{ u_1, u_2 \}\) ...

... BUT ... what exactly is \(\displaystyle \begin{bmatrix} 8 \\ -6 \end{bmatrix}\)?

To answer my own question ... I suspect it is the coordinates of a point relative to the standard basis \(\displaystyle e_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, e_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}\) ... is that right? Have I described it correctly?

So, if I am right ...

\(\displaystyle \begin{bmatrix} 8 \\ -6 \end{bmatrix} = 8e_1 + (-6)e_2\)

Can someone please confirm that the above analysis of what is going on is correct ... or alternatively point out errors and shortcomings in what I have said ...
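For reference, the computation Lipschutz then carries out can be completed as follows. This is a sketch that assumes the second basis vector in Example 6.1 is \(\displaystyle u_2 = \begin{bmatrix} 2 \\ 5 \end{bmatrix}\) (the attachment is not reproduced here, so that value should be checked against the book). Solving \(\displaystyle x u_1 + y u_2 = \begin{bmatrix} 8 \\ -6 \end{bmatrix}\) gives the system

\(\displaystyle x + 2y = 8, \ \ 2x + 5y = -6 \ \ \Longrightarrow \ \ x = 52, \ y = -22\)

so that, relative to \(\displaystyle \{ u_1, u_2 \}\), the coordinates of \(\displaystyle F(u_1)\) are \(\displaystyle (52, -22)\), while relative to \(\displaystyle \{ e_1, e_2 \}\) they are \(\displaystyle (8, -6)\).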

Peter
 
  • #2
Hi Peter,

Lipschutz writes "For notational convenience, we use column vectors." This means that he'll use the column vector notation $\begin{bmatrix}x\\y\end{bmatrix}$ to denote the point $(x,y)$ in $\Bbb R^2$.
 
  • #3
Thanks Euge ...

I think what you are saying is that \(\displaystyle \begin{bmatrix} 8 \\ -6 \end{bmatrix}\) is a vector (a point in \(\displaystyle \mathbb{R}^2\)) expressed in the standard basis \(\displaystyle e_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, e_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}\) ...

Maybe I am blurring the distinction between points and vectors a bit ...

Peter
 
  • #4
Vectors are elements of a vector space. While that seems almost tautological, it's the most accurate description.

Some people think of vectors as "arrows" (so they START at a point, and "go somewhere for a distance"). This isn't quite accurate, for we can imagine two such arrows that start at *different* points, and then we have no way to ADD them.

However, we CAN add all such arrows that start at a GIVEN point, by a purely geometric process called the "parallelogram rule". This process of "choosing a point" turns our geometric space (more properly, an AFFINE space) into a vector space, if we agree to identify the chosen point as our "origin".

However, we may already have an origin; for example, we may already have a coordinate space used to describe a curve or a surface, and we wish to describe our chosen point in terms of the coordinate system we are describing our curve or surface in. So now we have TWO coordinate systems: one relative to our original origin, and one relative to our chosen point. This is, in somewhat loose terms, the difference between $\Bbb R^n$ and $(\Bbb R^n)_p$.

So if in our original coordinate system, we have $p = (c_1,c_2,c_3)$, then in our SECOND coordinate system (relative to $p$) we have: $p = (0,0,0)_p$.
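A small worked illustration, with hypothetical numbers: if, in the original coordinate system, $q = (4,6,8)$ and $p = (1,2,3)$, then relative to the origin $p$ the point $q$ has coordinates

$q - p = (4-1, 6-2, 8-3) = (3,4,5)_p, \qquad p - p = (0,0,0)_p.$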

I bring this up to emphasize that coordinates ALONE do not tell us "which point" (element of a vector space) we have. We need two more pieces of information: "relative to what (origin) point", and "using what coordinate system".

In their rush to give students some basic facility with vectors and matrices, many textbooks ignore these niceties lurking in the background. In other words, they ASSUME that the basis $\{(1,0,\dots,0),(0,1,\dots,0),\dots,(0,0,\dots,1)\} = \{e_i\}$ will be used relative to the "usual origin" (the 0-vector with all 0 coordinates), so that the $n$-tuple (of real numbers, for example) $(c_1,c_2,\dots,c_n)$ will MEAN $c_1e_1 + c_2e_2 + \cdots + c_ne_n$.

Now, there is a NATURAL isomorphism between $\Bbb R^2$, say, and the space $\text{Mat}_{2 \times 1}(\Bbb R)$, given by:

$(x,y) \mapsto \begin{bmatrix}x\\y\end{bmatrix}$.

It is this isomorphism Lipschutz is tacitly referring to when he says "for notational convenience, we will use column vectors". Mathematically speaking, it would be fine to have $\{e_1,e_2\}$ be ANY two orthonormal vectors; we would get the same set of $2 \times 1$ column vector representations, but unless we actually knew what coordinates $e_1,e_2$ had in our "base coordinate system" we would not "know" which "point" in $\Bbb R^2$ any given matrix REFERRED to.
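As a concrete numerical sketch of this identification, here is a minimal Python/numpy computation of coordinates relative to a basis. It again assumes $u_2 = (2,5)$ from Lipschutz's Example 6.1, which is not quoted in this thread:

import numpy as np

# Columns of P are the basis vectors in standard coordinates.
# u1 = (1, 2) is given in the thread; u2 = (2, 5) is assumed from
# Lipschutz's Example 6.1 and should be checked against the book.
P = np.array([[1.0, 2.0],
              [2.0, 5.0]])

# F(u1) in standard coordinates, as computed in the thread.
v = np.array([8.0, -6.0])

# Coordinates of v relative to {u1, u2}: solve P @ c = v for c.
c = np.linalg.solve(P, v)
print(c)  # -> [ 52. -22.]

# Sanity check: rebuilding v from the basis recovers (8, -6).
assert np.allclose(P @ c, v)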

There is yet a further wrinkle in all this: orthogonality and normality depend on a given inner product and a given norm. Authors of linear algebra texts often tacitly assume there is a natural reason to use the Euclidean inner product:

$\langle (x_1,y_1),(x_2,y_2)\rangle = x_1x_2 + y_1y_2$

but many, many inner products are possible.

Given an inner product, one can DEFINE a norm by: $\|(x,y)\| = \sqrt{\langle(x,y),(x,y)\rangle}$, which leads to the familiar formula:

$\|(x,y)\| = \sqrt{x^2 + y^2}$ when the Euclidean inner product is used. This, of course, is the usual "distance" formula that has its origins in Pythagoras' Theorem.
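A quick worked instance of this formula:

$\|(3,4)\| = \sqrt{\langle (3,4), (3,4) \rangle} = \sqrt{3 \cdot 3 + 4 \cdot 4} = \sqrt{25} = 5.$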

HOWEVER, one can define "distance" WITHOUT having first defined an inner product; for example, there is the "discrete distance function":

$d((x_1,y_1),(x_2,y_2)) = 1$ if $(x_1,y_1) \neq (x_2,y_2)$
$d((x_1,y_1),(x_1,y_1)) = 0$

which returns a distance of 1 if two points are different, and a distance of 0 if they are the same.
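One can check that this really is a metric; for instance, the triangle inequality $d(a,c) \leq d(a,b) + d(b,c)$ holds, because the right-hand side is $0$ only when $a = b$ and $b = c$, in which case $a = c$ and the left-hand side is $0$ as well; in every other case the right-hand side is at least $1 \geq d(a,c)$.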

The point is, in arenas of greater mathematical sophistication, "points" don't always have some of the "nice" properties we take for granted in a Euclidean plane.

There is a slight danger in saying the column vector $\begin{bmatrix}x\\y\end{bmatrix}$ IS the point $(x,y)$. What is more ACCURATE to say is that the column matrix is a REPRESENTATION of the point $(x,y)$. When, in mathematics, you see the word "representation", you should immediately think: "Oh, so there's a homomorphism of some kind involved".

Loosely speaking, however, you are correct, in the same sense that Lipschutz is. You (and he) are both blurring these fine distinctions (this is all very well and good until one has multiple coordinate systems to switch between, and then it pays to keep them straight in your mind).
 
  • #5
Thanks Deveno ... just working through your post now ...

Peter
 
  • #6
Well! ... indeed ... that was so helpful ...

Thanks!

Peter
 

FAQ: Yet Another Basic Question on Linear Transformations and Their Matrices

What is a linear transformation?

A linear transformation is a function that maps one vector space to another in a linear manner, meaning that it preserves the operations of vector addition and scalar multiplication.
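In symbols, a map $T : V \to W$ between vector spaces is linear when, for all vectors $u, v$ in $V$ and all scalars $c$:

$T(u + v) = T(u) + T(v), \qquad T(cu) = cT(u).$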

What is a matrix representation of a linear transformation?

A matrix representation is a way of expressing a linear transformation as a matrix. Each column of the matrix holds the coordinates of the image of an input basis vector (expressed in the output basis), and multiplying this matrix by the coordinate vector of an input vector gives the coordinate vector of its image under the transformation.
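For example, in two dimensions (a generic sketch, not Lipschutz's specific example): if $T(v_1) = a w_1 + c w_2$ and $T(v_2) = b w_1 + d w_2$ for an input basis $\{v_1, v_2\}$ and an output basis $\{w_1, w_2\}$, then

$[T] = \begin{bmatrix} a & b \\ c & d \end{bmatrix}, \qquad [T(x v_1 + y v_2)] = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}.$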

How do you determine the matrix representation of a linear transformation?

The matrix representation of a linear transformation can be determined by finding the images of the basis vectors of the input vector space, expressing each image in the output basis, and arranging these coordinate vectors as the columns of a matrix. The order of the columns must match the order of the basis vectors of the input space.
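A minimal numerical sketch of this recipe in Python/numpy. The map T below agrees with the values quoted in the thread ($T(1,2) = (8,-6)$), but its formula is an assumption about Lipschutz's example, used here only for illustration:

import numpy as np

# A hypothetical linear map on R^2; consistent with F(1,2) = (8,-6)
# from the thread, but the formula itself is assumed.
def T(v):
    x, y = v
    return np.array([2*x + 3*y, 4*x - 5*y])

# Standard basis of the input space.
e1, e2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])

# The matrix of T (standard bases on both sides) has the images of
# the basis vectors as its columns.
M = np.column_stack([T(e1), T(e2)])
print(M)  # -> [[ 2.  3.]
          #     [ 4. -5.]]

# Matrix-vector multiplication reproduces the transformation.
v = np.array([1.0, 2.0])
assert np.allclose(M @ v, T(v))  # both give (8, -6)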

Can a linear transformation have multiple matrix representations?

Yes, a linear transformation can have multiple matrix representations, depending on the choice of basis vectors for the input and output vector spaces. The different matrices still represent the same transformation; they are related to one another by change-of-basis (similarity) transformations.
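For an operator $T : V \to V$ with bases $B$ and $B'$, the standard change-of-basis relation makes this precise: if the columns of $P$ are the $B$-coordinates of the $B'$ basis vectors, then the two representations are similar:

$[T]_{B'} = P^{-1} [T]_B P.$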

What is the significance of the standard matrix of a linear transformation?

The standard matrix of a linear transformation is the unique matrix representation of the transformation with respect to the standard bases of the input and output vector spaces. It is significant because it allows the image of any vector under the transformation to be computed directly, and because composing linear transformations corresponds to multiplying their standard matrices.
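In particular, if $S$ and $T$ have standard matrices $[S]$ and $[T]$, then

$[S \circ T] = [S][T], \qquad (S \circ T)(v) = [S][T]v.$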
