First Isomorphism Theorem for Vector Spaces - Knapp, Theorem 2.27

In summary, the conversation discusses Theorem 2.27 (First Isomorphism Theorem) from Anthony W. Knapp's book, Basic Algebra, and its proof, and raises three questions about the proof and its implications. The first question asks why L being a function implies L (L^{-1} (T)) = T. The second asks how the two equations L (L^{-1} (T)) = T and L^{-1} (L (S)) = S allow us to conclude that L is one-to-one and hence S \cong L(S). The third question discusses the difference between Knapp's version of the First Isomorphism Theorem and the version found in other texts.
  • #1
Math Amateur
I am reading Chapter 2: Vector Spaces over \(\displaystyle \mathbb{Q}, \mathbb{R} \text{ and } \mathbb{C}\) of Anthony W. Knapp's book, Basic Algebra.

I need some help with some issues regarding Theorem 2.27 (First Isomorphism Theorem) on pages 57-58.

Theorem 2.27 and its proof read as follows:

https://www.physicsforums.com/attachments/2913
View attachment 2914

My questions/issues are as follows:

Question 1

In the third paragraph of the proof we read the following:

"Moreover, the vector subspace \(\displaystyle L^{-1} (T)\) contains \(\displaystyle L^{-1} (0) = U\). Therefore the inverse image under L of a vector space as in (b) is a vector space as in (a). Since L is a function, we have \(\displaystyle L ( L^{-1} (T)) = T \) ... ... ... "

Can someone explain exactly why L being a function implies that \(\displaystyle L ( L^{-1} (T)) = T \)?


Question 2


In the third paragraph Knapp shows that \(\displaystyle L ( L^{-1} (T)) = T \) while in the fourth paragraph Knapp shows that \(\displaystyle L^{-1} ( L (S)) = S \).

Can someone please explain how these two equations allow us to conclude that L is one-to-one ... and hence \(\displaystyle S \cong L(S)\)?

Question 3

Knapp's version of the First Isomorphism Theorem seems somewhat different from the statement of this theorem in other texts, where we are dealing with a conclusion that looks like the following: \(\displaystyle V/S \cong L(S)\) - or something like that ... can anyone explain what is going on ... has Knapp generalized the version of the other texts?

(Note: In fact Knapp's Corollary to his Proposition 2.25 looks very like what I am used to as the First Isomorphism Theorem.)

Hope someone can help?

Peter
***NOTE*** I have referred in the above post to Knapp Proposition 2.25 and Corollary 2.26 and so, for the interest of MHB members following this post, I am providing the Proposition and Corollary as follows:

https://www.physicsforums.com/attachments/2915
https://www.physicsforums.com/attachments/2916
 
  • #2
For any function $f:A \to B$, and for any subset $U \subseteq f(A)$**, we have:

$f(f^{-1}(U)) = U$.

Two other facts we will need:

1. If $X \subseteq A$, then $f(X) \subseteq f(A)$.

2. If $Y \subseteq B$, then $f^{-1}(Y) \subseteq f^{-1}(B)$.

Proof of 1: let $y \in f(X)$ be arbitrary. So $y = f(x)$ for some $x \in X$. Since $X \subseteq A$, we have $x \in A$, so that:

$y = f(x)$ for some $x \in A$, that is: $y \in f(A)$, which shows $f(X) \subseteq f(A)$.

Proof of 2: Let $x \in f^{-1}(Y)$ be arbitrary. Thus $f(x) = y$ for some $y \in Y$. Since $Y \subseteq B$, we have $y \in B$, so that:

$f(x) = y$ for some $y \in B$, that is: $x \in f^{-1}(B)$, which shows $f^{-1}(Y) \subseteq f^{-1}(B)$.

*******

Now, suppose we have $y \in f(f^{-1}(U))$. This means $y = f(a)$ for some $a \in f^{-1}(U)$.

Since $a \in f^{-1}(U)$ we have $f(a) \in U$. But $y = f(a)$, so $y \in U$. Thus $f(f^{-1}(U)) \subseteq U$.

On the other hand, suppose $u \in U$. Thus $\{u\} \subseteq U$, so that $f^{-1}(u) = f^{-1}(\{u\}) \subseteq f^{-1}(U)$, by (2) (the same argument works with $U$ in place of $B$).

Now choose any $a \in f^{-1}(u)$ (we can do so, because $u \in f(A)$).

Thus $\{f(a)\} = f(\{a\}) \subseteq f(f^{-1}(\{u\})) \subseteq f(f^{-1}(U))$ (by 1, applied the same way),

so $u \in \{u\} = f(\{a\}) \subseteq f(f^{-1}(U))$, so $u \in f(f^{-1}(U))$.

Hence the two sets are equal.

I'm afraid this all obscures the real meat of the matter: pre-images may "blow up" a set (they are not functions), but taking the image of a pre-image, "shrinks it back down". The reverse is NOT true, in general:

$f^{-1}(f(V)) \neq V$, as one can see by letting $f: \Bbb Z \to \Bbb Z$ be given by $f(k) = k^2$, and taking:

$V = \{1,2\}$, in which case $f(V) = \{1,4\}$ and $f^{-1}(f(V)) = \{-2,-1,1,2\} \neq V$.
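For anyone who wants to poke at this by machine, here is a minimal Python sketch of the two compositions. The helper names `image` and `preimage` are made up for this illustration, and a finite window of integers stands in for $\Bbb Z$, which is an assumption of the sketch and not part of the argument above.

```python
def image(f, xs):
    """Image f(X) = {f(x) : x in X}."""
    return {f(x) for x in xs}

def preimage(f, domain, ys):
    """Pre-image f^{-1}(Y) = {x in domain : f(x) in Y}."""
    return {x for x in domain if f(x) in ys}

def square(k):
    return k * k

# Finite stand-in for Z (an assumption of this sketch).
A = set(range(-5, 6))

# U is contained in f(A), so the image of the pre-image gives U back.
U = {1, 4}
assert image(square, preimage(square, A, U)) == U

# The reverse composition can grow the set.
V = {1, 2}
print(sorted(image(square, V)))                        # [1, 4]
print(sorted(preimage(square, A, image(square, V))))   # [-2, -1, 1, 2]
```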

*******

**Note: the hypothesis $U \subseteq f(A)$ is crucial, because the equality fails for any set $Y \subseteq B$ that is not contained in the image of $A$. In the extreme case, a non-null set $Y$ lying entirely outside the image has null pre-image, and thus has null "image of the pre-image", even though $Y$ itself is non-null. This is why $L$ is presumed to be ONTO in the theorem.
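A small Python sketch of how the equality breaks once the set is no longer inside the image. As before, the helpers `image` and `preimage` and the finite window standing in for $\Bbb Z$ are assumptions made only for this illustration.

```python
def image(f, xs):
    return {f(x) for x in xs}

def preimage(f, domain, ys):
    return {x for x in domain if f(x) in ys}

def square(k):
    return k * k

A = set(range(-5, 6))        # finite stand-in for Z (assumption)

# A set entirely outside the image: no integer squares to 2 or 3.
outside = {2, 3}
assert preimage(square, A, outside) == set()
assert image(square, preimage(square, A, outside)) == set()

# A set only partly inside the image: just the part in f(A) survives.
partly = {2, 4}
round_trip = image(square, preimage(square, A, partly))
print(sorted(round_trip))    # [4]
```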

*******

As I remarked above, it is generally NOT true that $f^{-1}(f(V)) = V$. If $f$ is one-to-one, it clearly is, because every element has a UNIQUE pre-image.

Now in Knapp's proof, one should not think of $L^{-1}$ as a function $W \to V$, but rather as a function from the POWER SET of $W$ to the power set of $V$. Indeed, we are even restricting the domain to those subsets of $W$ which are vector spaces. We probably should use other symbols (like $\hat{L}$ and $\hat{L}^{-1}$) to make this distinction clear. It is $\hat{L}$ that is the actual bijection.

The point is that $\hat{L} \circ \hat{L}^{-1} = \text{id}$ and $\hat{L}^{-1} \circ \hat{L} = \text{id}$ where the two identity functions are on the set of subspaces of $W$, and the set of subspaces of $V$ containing $U$, respectively.
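If it helps to see the two identities checked exhaustively, here is a hedged Python sketch. It swaps Knapp's fields for the two-element field $\Bbb F_2$, purely so that every subspace can be listed; that swap, the choice $L(x,y,z) = (x,y)$, and the names `L_hat` and `L_hat_inv` are all assumptions of this illustration.

```python
from itertools import combinations, product

def add(u, v):
    return tuple((a + b) % 2 for a, b in zip(u, v))

def subspaces(vectors):
    """Subsets containing 0 and closed under addition (enough over F_2)."""
    vectors = list(vectors)
    zero = tuple(0 for _ in vectors[0])
    found = []
    for r in range(1, len(vectors) + 1):
        for subset in combinations(vectors, r):
            s = frozenset(subset)
            if zero in s and all(add(u, v) in s for u in s for v in s):
                found.append(s)
    return found

def L(v):
    return v[:2]                      # projection onto the first two coordinates

V = list(product((0, 1), repeat=3))
W = list(product((0, 1), repeat=2))
U = frozenset(v for v in V if L(v) == (0, 0))   # kernel of L

def L_hat(S):
    """Subspace of V -> its image, a subspace of W."""
    return frozenset(L(v) for v in S)

def L_hat_inv(T):
    """Subspace of W -> its pre-image, a subspace of V containing U."""
    return frozenset(v for v in V if L(v) in T)

above_U = [S for S in subspaces(V) if U <= S]
subs_W = subspaces(W)

assert all(L_hat_inv(L_hat(S)) == S for S in above_U)   # identity on subspaces containing U
assert all(L_hat(L_hat_inv(T)) == T for T in subs_W)    # identity on subspaces of W
assert {L_hat(S) for S in above_U} == set(subs_W)       # so L_hat is a bijection
print(len(above_U), len(subs_W))                        # 5 5
```

Note that the first assertion would fail if we dropped the condition that $S$ contains $U$, which is exactly why the theorem restricts to those subspaces.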

*******

As to your last question, Knapp's use of "First Isomorphism Theorem" IS at variance with what you would expect (Proposition 2.25). This theorem (Theorem 2.27) is also called, variously, the "Lattice Isomorphism Theorem", "Fourth Isomorphism Theorem" and "Correspondence Theorem"; the terminology is not standardized. See, for example:

Lattice theorem - Wikipedia, the free encyclopedia
 
  • #3
Thanks for the help Deveno ... just working carefully through your post now ...

Appreciate your help ...

Peter
 
  • #4
Thanks again for your help Deveno ... but just a clarifying question ...

Does Knapp showing that \(\displaystyle S = L^{-1} (L(S) ) \) then imply that L is one-to-one ... and is this the basis of the proof that \(\displaystyle L|_S \) is one-to-one ...

(You have pointed out that if L is one-to-one then \(\displaystyle S = L^{-1} (L(S) ) \) but I guess I am checking that the converse is true and whether this is the basis of the proof that \(\displaystyle L|_S \) is one-to-one ... ... )

Hope you can help ...

Peter
 
  • #5
I thought I would illustrate this with an example, to show you how it works out "in practice":

Let $V = \Bbb R^3$, and $W = \Bbb R^2$, with $L : V \to W$ the mapping $L(x,y,z) = (x,y)$.

The kernel of the map is:

$U = \{(0,0,z) \in \Bbb R^3: z \in \Bbb R\}$ (the $z$-axis, $L$ is just the projection of a point onto the $xy$-plane).

Now a subspace of $V$ containing $U$ that is neither $U$ nor $V$ must be 2-dimensional, and is of the form:

$S = \langle v\rangle + U = \{\alpha v + \beta(0,0,1)\}$ for some vector $v \in \Bbb R^3$ such that $\{v,(0,0,1)\}$ is linearly independent.

Say $v = (x_0,y_0,z_0)$. Then a typical element $s \in S$ is: $\alpha(x_0,y_0,z_0) + \beta(0,0,1) = (\alpha x_0,\alpha y_0,\alpha z_0 + \beta)$.

Hence $L(s) = (\alpha x_0,\alpha y_0) = \alpha(x_0,y_0)$.

That is to say: $L(\langle v\rangle + U) = \langle L(v)\rangle$.

The lattice of subspaces of $V$ that contain $U$ looks like "an infinite fan" spreading out from $U$, topped by a "cone" from each of these subspaces up to $V$. The actual subspaces form (planar) slices of space through the $z$-axis.

The intersection of any such plane with the $xy$-plane determines a UNIQUE line in the $xy$-plane; this line is the image (under $L$) of our plane containing $U$ (the $z$-axis). As we rotate the plane about the $z$-axis, the image lines rotate about the origin, tit-for-tat.

So, given any plane that passes through the $z$-axis, we have a unique corresponding line in the $xy$-plane passing through the origin. Given any line passing through the origin, we can recover the original plane, by only using the $x$ and $y$ coordinate-pairs that occur in our line, and letting the $z$-coordinate be arbitrary (in effect, forcing our plane to pass through the $z$-axis, since $(0,0)$ lies on our line).
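The plane-to-line correspondence can be spot-checked numerically. The sketch below picks one hypothetical vector $v = (1,2,5)$ and tests integer grid points only; both choices are assumptions of the illustration, and the determinant tests are just one convenient way to decide membership in a span.

```python
from itertools import product

v = (1, 2, 5)      # hypothetical v; any v with (x0, y0) != (0, 0) would do
e3 = (0, 0, 1)     # spans U, the z-axis

def L(p):
    return (p[0], p[1])

def det3(a, b, c):
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))

def in_S(p):
    """p is in S = span{v, e3} exactly when v, e3, p are linearly dependent."""
    return det3(v, e3, p) == 0

def on_line(q):
    """q is in span{L(v)} exactly when q is parallel to L(v)."""
    lx, ly = L(v)
    return lx * q[1] - ly * q[0] == 0

# Elements alpha*v + beta*e3 of S are sent to alpha*L(v).
for alpha, beta in product(range(-3, 4), repeat=2):
    s = tuple(alpha * vi + beta * ei for vi, ei in zip(v, e3))
    assert in_S(s) and L(s) == (alpha * v[0], alpha * v[1])

# On a grid of integer points, p is in S iff L(p) is on the image line,
# which is the statement S = L^{-1}(L(S)) restricted to the grid.
grid = product(range(-4, 5), repeat=3)
assert all(in_S(p) == on_line(L(p)) for p in grid)
```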

**********

It should be noted that this construction (and theorem) holds for any "group-like" structure, the key thing is that the structure possesses "operation-preserving morphisms" and that it possesses kernels. Here, by kernel, I mean something different than the pre-image of some identity element (although that is what it typically turns out to be). What I mean is:

1) The structure possesses a ZERO object, an object Z such that there are UNIQUE maps:

$Z \to A$
$A \to Z$ <--composites $A \to Z \to B$ of these maps are what are typically called zero maps.

for any object $A$. In vector spaces, a vector space consisting of only a zero-vector is such an object. In groups, any trivial group works.

2) We say $k: K \to A$ is a kernel for $f: A \to B$ if:

a) $f \circ k$ is a zero map
b) if $k':K' \to A$ is any other map such that $f \circ k'$ is a zero-map, there is a unique map:

$u: K' \to K$ with $k \circ u = k'$.

In most of the structures you will encounter for which this is true, $K$ can be regarded as a sub-structure of $A$, and $k$ the inclusion morphism, and this says that $K$ is the LARGEST sub-structure that $f$ kills, since if $K'$ is another, we have an inclusion $K' \to K$.

That may seem like a lot to digest: the thing is, you've seen (and proved) this theorem BEFORE, and the proof always goes the same way. The main players in this drama have just "changed costumes".

*****************

$L$ is not one-to-one (unless $U = \{0\}$). $L|_S$ is STILL not one-to-one in general: for example, if $S = U = \text{ker }L$, then $L|_S$ is just about as non-injective as you can get, since it maps EVERYTHING to the 0-vector of $W$.

What IS one-to-one is the map $\hat{L}$, which maps subspaces to subspaces (and not vectors to vectors). Unfortunately, it is common for the notation not to indicate this, you sort of have to "infer" it, by what is "within the parentheses".

$L(S)$ is an image SET, namely $\{w \in W: w = L(s)\text{ for some } s\in S\}$. This mapping from subsets of $V$ to subsets of $W$ really ought to have a different name, since its domain and co-domain differ from that of $L$.
 
  • #6
Thanks Deveno ... I will be working through your example very carefully ... your examples are EXTREMELY illustrative and helpful ...

... ... but I have skim read your remarks beginning "$L$ is not one-to-one. $L|_S$ is STILL not one-to-one, ... ... " and ... yes, indeed ... you are, of course, correct ...

Can you confirm, then, that if $\hat{L}$ is the map that maps subspaces to subspaces (and not vectors to vectors), that Knapp showing that \(\displaystyle S = \hat{L}^{-1} (\hat{L}(S) ) \) then implies that \(\displaystyle \hat{L} \) is one-to-one ... and is this the lynchpin of the proof that \(\displaystyle \hat{L} \) is one-to-one ... ... ?

Hope you can help again ...

Sorry for being a bit slow on this matter ... ...

Peter
 
  • #7
It is fairly standard in mathematics to show that a function $f: A\to B$ is one-to-one by showing that it has a left-inverse: in other words, if we are given:

$a \mapsto f(a) = b$, then given $b$, we can recover $a$.

Clearly, if $f$ is injective, then for every image $f(a)$ we can define a function:

$g: f(A) \to A$ by $g(f(a)) = a$, and it follows that $g \circ f = 1_A$.

If $f$ is not injective, then for some $f(a)$, the $g$ we want fails to be a function (we don't know which pre-image to choose).

Naively, if $f$ takes every $a$ to a UNIQUE image $f(a)$, then to define a left-inverse, all we have to do is "reel it back in again".

This can fail to be a "true" (two-sided) inverse, because if we start at $B$ (the co-domain of $f$), there may not BE any $a$ that maps to a given $b$ (if $f$ is not onto). When we start at such a $b$, we have to pick "some" value in $A$ for $g(b)$, and all of the elements of $A$ are already spoken for via $f$, so when we come back:

$(f \circ g)(b) \in f(A)$, and this cannot be $b$, since $b$ lies outside the image set.

In a similar vein, it is standard to show a function is onto, by showing it has a right-inverse. This usually isn't a "true" inverse, either, because it isn't unique (unless $f$ is 1-1).
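For finite sets, both constructions can be written out directly. This is only a sketch: the function names `left_inverse` and `right_inverse` and the use of dictionaries to hold $g$ and $h$ are choices made for this illustration.

```python
def left_inverse(f, A):
    """For injective f on finite A, return g on f(A) with g[f(a)] == a."""
    g = {}
    for a in A:
        b = f(a)
        if b in g:
            # Two pre-images for b: we would not know which one to choose.
            raise ValueError(f"not injective: {g[b]} and {a} both map to {b}")
        g[b] = a
    return g

def right_inverse(f, A, B):
    """For f mapping A onto B, pick one pre-image per b, so f(h[b]) == b."""
    h = {}
    for a in A:
        h.setdefault(f(a), a)     # keeps the first pre-image found; any would do
    missing = set(B) - set(h)
    if missing:
        raise ValueError(f"not onto: nothing maps to {sorted(missing)}")
    return h

A = [0, 1, 2, 3]

double = lambda a: 2 * a          # injective, so a left-inverse exists
g = left_inverse(double, A)
assert all(g[double(a)] == a for a in A)

parity = lambda a: a % 2          # onto {0, 1}, so a right-inverse exists
h = right_inverse(parity, A, [0, 1])
assert all(parity(h[b]) == b for b in [0, 1])
```

The `setdefault` line is where the non-uniqueness of a right-inverse shows up: a different rule for choosing the pre-image would give a different, equally valid $h$.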

These matters are worth thinking deeply about, they pervade a LOT of mathematical thought. Onto maps come in a lot of flavors, and have different names in different fields: coverings, partitions, projections, and other similar terms. One-to-one maps also come in a variety of guises, and are often "almost transparent", they can be called: inclusions, identifications, correspondences or transfers. Bijections, are "the best of both worlds", they are often the maps we wish all the other maps would be, because they preserve something "faithfully" and "fully" (terms you may encounter in a technical sense later).
 
  • #8
Thanks Deveno ... but just to ensure that I am following you and understanding the situation fully ... a further confirmation/clarification ...

... ... the central element of Knapp's proof that the mapping \(\displaystyle \hat{L} \) is one-to-one is as follows:

\(\displaystyle S = \hat{L}^{-1} (\hat{L}(S) ) \Longrightarrow \) L has a left inverse

\(\displaystyle \Longrightarrow \) L is one-to-one

OR in terms of your notation where:

\(\displaystyle f : \ A \longrightarrow B \) and \(\displaystyle X \subseteq A \) , \(\displaystyle f^{-1}\) is the inverse image of \(\displaystyle f\), and \(\displaystyle {f_l}^{-1}\) is the left inverse of \(\displaystyle f\),

we have that:

\(\displaystyle f^{-1} (f(X)) = X \Longrightarrow f \) has a left inverse \(\displaystyle {f_l}^{-1}\), that is \(\displaystyle {f_l}^{-1} \circ f = 1_A \) ...

Is that correct?

Now if the above is correct, can you please help with a proof of \(\displaystyle f^{-1} (f(X)) = X \Longrightarrow f \) has a left inverse \(\displaystyle {f_l}^{-1}\), that is \(\displaystyle {f_l}^{-1} \circ f = 1_A \) ... I have so far failed to frame a proof ...

Hope you can help ...

Peter

***EDIT*** Can you recommend a book or a set of online notes that deals with the topic of sets and functions in enough detail to cover the issues regarding left and right inverses, inverse images, etc.?
 
  • #9
I'm sorry I wasn't clearer earlier, this is the same type of confusion Knapp is engendering.

We have TWO functions, unfortunately we are calling both "$f$". The first, is an ordinary function:

$f:A \to B$, with $a \mapsto f(a)$.

The second is this function:

$f:\mathcal{P}(A) \to \mathcal{P}(B)$ with $X \mapsto f(X)$. We should probably call it $\hat{f}$ to make the distinction clearer.

We also have a third function:

$f^{-1}:\mathcal{P}(B) \to \mathcal{P}(A)$ with $Y \mapsto f^{-1}(Y)$.

If for any subset $X \subseteq A$, we have: $f^{-1}(\hat{f}(X)) = X$, it is $\hat{f}$ that has the left-inverse.

In Knapp's proof, he is proving $\hat{L}$ is injective, NOT $L$.

We have a 3-tiered stratum here:

Elements
Sets
Power set (set of subsets)

A function that maps elements to elements (whose domain and co-domain are sets) INDUCES a function that maps sets to sets (whose domain and co-domain are power sets). If the "set-function" (as opposed to the "element-function") has a left-inverse, then we have a "one-to-one correspondence of sets".

So if $f^{-1}(\hat{f}(X)) = X$, for any $X \subseteq A$, then we have $f^{-1}\circ \hat{f} = 1_{\mathcal{P}(A)}$

However, it IS true that if $f^{-1}(\hat{f}(X)) = X$ for *every* subset $X \subseteq A$, then $f$ is injective as well.

The reason is that we could take $X = \{a\}$, a singleton subset. Then $\hat{f}(\{a\}) = \{f(a)\}$.

If the pre-image of $\{f(a)\}$ is the same singleton $\{a\}$, it follows that $f(a)$ has a UNIQUE pre-image ($a$), and so $f$ (the "ordinary function") is injective.

THIS IS NOT TRUE FOR $L$, in Knapp's proof. For example, if $v \neq 0$ is an element of $U = \text{ker }L$, then:

$\hat{L}(\{v\}) = \{0_W\}$ whereas: $L^{-1}(\{0_W\}) = U \neq \{v\}$.
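Here is a rough Python check of the singleton argument on a small finite set. The names `f_hat`, `f_inv` and `round_trip_is_identity` are invented for this sketch, and the two sample functions are just convenient examples.

```python
from itertools import chain, combinations

def powerset(A):
    A = list(A)
    return chain.from_iterable(combinations(A, r) for r in range(len(A) + 1))

def f_hat(f, X):
    """The induced set-function: X -> f(X)."""
    return frozenset(f(x) for x in X)

def f_inv(f, A, Y):
    """The pre-image set-function: Y -> f^{-1}(Y)."""
    return frozenset(a for a in A if f(a) in Y)

def round_trip_is_identity(f, A):
    """True when f^{-1}(f_hat(X)) == X for EVERY subset X of A."""
    return all(f_inv(f, A, f_hat(f, X)) == frozenset(X) for X in powerset(A))

def is_injective(f, A):
    return len({f(a) for a in A}) == len(set(A))

A = range(-2, 3)
shift = lambda a: a + 1       # injective on A
square = lambda a: a * a      # not injective on A

assert round_trip_is_identity(shift, A) and is_injective(shift, A)
assert not round_trip_is_identity(square, A) and not is_injective(square, A)

# The singleton {1} already witnesses the failure for squaring.
print(sorted(f_inv(square, A, f_hat(square, {1}))))   # [-1, 1]
```

For Knapp's $L$ the round trip is only claimed for subspaces containing $U$, not for every subset, which is why no conclusion about $L$ itself being injective follows.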

********************

I'll try to illustrate the analogous theorem for groups, so it's clearer. Suppose that $f:G \to H$ is a group homomorphism, with $K = \text{ker }f$.

Then there is a bijection between subgroups of $f(G)$ (this is actually $\hat{f}(G)$) and subgroups of $G$ containing $K$. The proof is about the same, and I'm going to omit it here. Instead, let's look at a group with a finite lattice of subgroups, and a homomorphic image of our first group.

For $G$, we'll use the group of quaternion units:

$G = \{1,-1,i,-i,j,-j,k,-k\}$, where we have: $i^2 = j^2 = k^2 = -1$ and $ij = k, ji = -k$, and $1,-1$ work the usual way.

(So for example, $ki = (ij)i = -(ji)i = -j(i^2) = (-j)(-1) = j$).

For $H$ we'll use the group $\{e,a,b,ab\}$ with $a^2 = b^2 = e$ and $ab = ba$. For our (onto) homomorphism $f$ we will define it on generators (since $i$ and $j$ generate $G$) by $f(i) = a,\ f(j) = b$. So:

$f(1) = e$
$f(-1) = f(i^2) = f(i)f(i) = a^2 = e$
$f(i) = a$
$f(-i) = f(-1)f(i) = ea = a$
$f(j) = b$
$f(-j) = eb = b$
$f(k) = f(ij) = f(i)f(j) = ab$
$f(-k) = f(-1)f(k) = eab = ab$

As you can see, the kernel $K = \{1,-1\}$. The subgroups of $G$ containing $K$, and their homomorphic images are:

$G \leftrightarrow H$
$\{1,-1,i,-i\} \leftrightarrow \{e,a\}$
$\{1,-1,j,-j\} \leftrightarrow \{e,b\}$
$\{1,-1,k,-k\} \leftrightarrow \{e,ab\}$
$\{1,-1\} \leftrightarrow \{e\}$.
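The table above can be reproduced by brute force. In this sketch the quaternion units are encoded as (sign, letter) pairs and $H$ is modelled as pairs of bits under addition mod 2; both encodings, and the helper names, are assumptions chosen for the illustration.

```python
from itertools import combinations

# Products of the positive units: TABLE[(x, y)] = (sign, unit) of x*y.
TABLE = {
    ("1", "1"): (1, "1"), ("1", "i"): (1, "i"), ("1", "j"): (1, "j"), ("1", "k"): (1, "k"),
    ("i", "1"): (1, "i"), ("i", "i"): (-1, "1"), ("i", "j"): (1, "k"), ("i", "k"): (-1, "j"),
    ("j", "1"): (1, "j"), ("j", "i"): (-1, "k"), ("j", "j"): (-1, "1"), ("j", "k"): (1, "i"),
    ("k", "1"): (1, "k"), ("k", "i"): (1, "j"), ("k", "j"): (-1, "i"), ("k", "k"): (-1, "1"),
}

G = [(s, u) for s in (1, -1) for u in "1ijk"]

def g_mul(x, y):
    sign, unit = TABLE[(x[1], y[1])]
    return (x[0] * y[0] * sign, unit)

# H = {e, a, b, ab} modelled as bit pairs: e=(0,0), a=(1,0), b=(0,1), ab=(1,1).
H = [(0, 0), (1, 0), (0, 1), (1, 1)]

def h_mul(p, q):
    return ((p[0] + q[0]) % 2, (p[1] + q[1]) % 2)

F_ON_UNITS = {"1": (0, 0), "i": (1, 0), "j": (0, 1), "k": (1, 1)}

def f(x):
    return F_ON_UNITS[x[1]]       # the sign is forgotten, so f(-1) = e

def subgroups(elements, mul, identity):
    """Subsets containing the identity and closed under mul (enough for finite groups)."""
    found = []
    for r in range(1, len(elements) + 1):
        for subset in combinations(elements, r):
            s = frozenset(subset)
            if identity in s and all(mul(x, y) in s for x in s for y in s):
                found.append(s)
    return found

# f really is a homomorphism.
assert all(f(g_mul(x, y)) == h_mul(f(x), f(y)) for x in G for y in G)

K = frozenset(x for x in G if f(x) == (0, 0))
above_K = [S for S in subgroups(G, g_mul, (1, "1")) if K <= S]
subs_H = subgroups(H, h_mul, (0, 0))

def image(S):
    return frozenset(f(x) for x in S)

def preimage(T):
    return frozenset(x for x in G if f(x) in T)

assert all(preimage(image(S)) == S for S in above_K)
assert all(image(preimage(T)) == T for T in subs_H)
assert {image(S) for S in above_K} == set(subs_H)
print(len(above_K), len(subs_H))   # 5 5
```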

********

I'm sorry, I can't think of a good set of on-line notes that deals with these basic properties of functions.
 
  • #10

Thanks for that help Deveno ... Just read the first section of your post ... shortly I will work through your example (many thanks for the example!) ...

BUT ... just clarifying ... you write: " ... ... $f:\mathcal{P}(A) \to \mathcal{P}(B)$ with $X \mapsto f(X)$. We should probably call it $\hat{f}$ to make the distinction clearer.

We also have a third function:

$f^{-1}:\mathcal{P}(B) \to \mathcal{P}(A)$ with $Y \mapsto f^{-1}(Y)$.

If for any subset $X \subseteq A$, we have: $f^{-1}(\hat{f}(X)) = X$, it is $\hat{f}$ that has the left-inverse. ... ... "


So when I asked for a proof, I should have been dealing with the situation where we define \(\displaystyle \hat{f}\) as follows:

$\hat{f}:\mathcal{P}(A) \to \mathcal{P}(B)$

Then we can say:

\(\displaystyle f^{-1}(\hat{f}(X)) = X \Longrightarrow \hat{f}\) has a left inverse \(\displaystyle f^{-1}\); that is \(\displaystyle f^{-1} \circ \hat{f} = 1_{\mathcal{P}(A)} \) ... ...

BUT it seems to me that saying $f^{-1}(\hat{f}(X)) = X$ for all \(\displaystyle X\) where \(\displaystyle X \subseteq A\) (or for all \(\displaystyle X \in \mathcal{P}(A)\) ) is simply a restatement of the definition of \(\displaystyle f^{-1}\) : visually \(\displaystyle f^{-1} \circ \hat{f} = 1_{\mathcal{P}(A)}\) ... ...

... ... so there is no need for a 'proof' since the statement follows directly from the definition ...

... ... indeed, we cannot have \(\displaystyle \hat{f} (X) = X\) and \(\displaystyle \hat{f} (X) = Y\) if \(\displaystyle f^{-1}(\hat{f}(X)) = X\) ... ...

Can you confirm (or otherwise) that the above analysis is correct?

Peter
 
  • #11

It need not be the case that $\hat{f}$ is injective. Let's look at 2 "toy examples". Suppose $A = \{a,b,c,d\}$ and $B = \{x,y,z\}$. We define $f: A \to B$ as follows:

$f(a) = y$
$f(b) = x$
$f(c) = y$
$f(d) = z$

Consider the subset $X = \{b,c\}$ of $A$. We have $\hat{f}(X) = \{x,y\}$. Now:

$f^{-1}(\hat{f}(X)) = \{a,b,c\} \neq X$, since $a \not\in X$.

The equality $f^{-1}(\hat{f}(X)) = X$ fails here because $y$ has two pre-images ($a$ and $c$), and one of these was not in our original set.
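The same computation can be done mechanically. This is a minimal sketch in which the function $f$ is stored as a Python dictionary; the helper names `image` and `preimage` are mine.

```python
def image(func, X):
    """Image set of X under a function stored as a dict."""
    return {func[x] for x in X}

def preimage(func, Y):
    """Pre-image set of Y: every element of the domain that lands in Y."""
    return {x for x in func if func[x] in Y}

f = {"a": "y", "b": "x", "c": "y", "d": "z"}
X = {"b", "c"}

fX = image(f, X)          # the set {x, y}
back = preimage(f, fX)    # the set {a, b, c}

print(sorted(fX))         # ['x', 'y']
print(sorted(back))       # ['a', 'b', 'c']
print(back == X)          # False
```

The extra element `a` appears because `preimage` collects everything that maps into the image set, not just the elements we started with.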

The pre-image set of an image set of a function is NOT the image set under the inverse function of the image set of the function (although these two distinct objects are often written exactly alike).

The pre-image sets of singleton subsets of the co-domain are usually NOT singletons. Since they are not singletons, the assignment of a set $f^{-1}(y)$ to the set $\{y\}$ does not define a function on elements. Functions must have UNIQUE images, that is the essence of the definition of a function.

For example:

$f(x) = x \pm 1$ is NOT A FUNCTION. What is the image of 1? Is it 2, or 0? We don't know.

Ok, now let's look at another function $g: C \to D$ where $C = \{k,m,n\}$ and $D = \{s,t,u,v\}$, and:

$g(k) = v$
$g(m) = s$
$g(n) = t$.

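Suppose $U = C$. Then $\hat{g}(C) = \{s,t,v\}$, and we have: $g^{-1}(\hat{g}(C)) = \{k,m,n\} = C$. For the whole domain this equality holds for any function; what injectivity of $g$ gives us is that $g^{-1}(\hat{g}(X)) = X$ for every subset $X \subseteq C$.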
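To compare the two toy examples across all subsets at once, here is a short sketch that lists the subsets on which the round trip fails. The dictionaries encode $f$ and $g$ as defined in this post; the helper names are my own.

```python
from itertools import combinations

def image(func, X):
    """Image set of X under a function stored as a dict."""
    return {func[x] for x in X}

def preimage(func, Y):
    """Pre-image set of Y: every element of the domain that lands in Y."""
    return {x for x in func if func[x] in Y}

def all_subsets(domain):
    """Every subset of the domain, including the empty set."""
    items = sorted(domain)
    return [set(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def round_trip_failures(func):
    """Subsets X of the domain with preimage(image(X)) != X."""
    return [X for X in all_subsets(func) if preimage(func, image(func, X)) != X]

f = {"a": "y", "b": "x", "c": "y", "d": "z"}   # not injective: a and c collide
g = {"k": "v", "m": "s", "n": "t"}             # injective

print(len(round_trip_failures(g)))   # 0
print(len(round_trip_failures(f)))   # 8
```

For $f$, the failing subsets are exactly those containing one of $a$, $c$ but not the other, which is 8 of the 16 subsets of $A$.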

Functions have a "flow": source to target. This flow often "merges" different sources to the same target. As such, when we try to "back-track", we can wind up somewhere different than where we started from. For example:

$\sqrt{x^2} \neq x$, because if $x < 0$, we wind up with "the wrong sign".

It turns out that $f(x) = x^2$ and $g(x) = \sqrt{x}$ are both group homomorphisms under multiplication (the first has domain $\Bbb R^{\ast}$, and the second has domain $\Bbb R^+$), but they are not inverses.

$f$ is a left-inverse for $g$, and $g$ is a right-inverse for $f$. It IS true that $(\sqrt{x})^2 = x$.

Note that $h(x) = -\sqrt{x}$ is ALSO a right-inverse for $f$, although $h$ is not a group homomorphism (negative times negative is not negative).
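A quick numerical sanity check of these claims, as a sketch only: it tests a handful of sample points (perfect squares, to keep floating-point noise out of the way), so it illustrates the statements and does not prove them.

```python
import math

def f(x):
    return x * x              # squaring, on the nonzero reals

def g(x):
    return math.sqrt(x)       # positive square root, on the positive reals

def h(x):
    return -math.sqrt(x)      # negative square root, on the positive reals

samples = [0.25, 1.0, 4.0, 9.0]

# f after g and f after h both give back x: g and h are right-inverses of f.
right_inverse_g = all(math.isclose(f(g(x)), x) for x in samples)
right_inverse_h = all(math.isclose(f(h(x)), x) for x in samples)

# g after f does not give back a negative input: we end up with the wrong sign.
back_track = g(f(-3.0))

# g respects products, h does not: h(4 * 9) = -6 but h(4) * h(9) = 6.
g_multiplicative = math.isclose(g(4.0 * 9.0), g(4.0) * g(9.0))
h_multiplicative = math.isclose(h(4.0 * 9.0), h(4.0) * h(9.0))

print(right_inverse_g, right_inverse_h, back_track, g_multiplicative, h_multiplicative)
```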
 

FAQ: First Isomorphism Theorem for Vector Spaces - Knapp, Theorem 2.27

What is the First Isomorphism Theorem for Vector Spaces?

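In Knapp's Basic Algebra, Theorem 2.27 (First Isomorphism Theorem), as discussed in this thread, concerns a linear map $L: V \to W$ with kernel $U$: the assignment $S \mapsto L(S)$ is a one-to-one correspondence between the vector subspaces $S$ of $V$ that contain $U$ and the vector subspaces of the image of $L$ (all of $W$ when $L$ is onto). Many other texts give the name "First Isomorphism Theorem" to a different statement, namely that the quotient space $V/\text{ker }L$ is isomorphic to the image of $L$.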

What is an isomorphism?

An isomorphism of vector spaces is a linear transformation that is one-to-one and onto. Being linear, it preserves vector addition and scalar multiplication; being a bijection, it has an inverse, which is again linear. Two isomorphic vector spaces therefore have the same vector space structure.

What is the significance of Knapp's Theorem 2.27?

Knapp's Theorem 2.27 is significant because it describes how the subspaces of the image of a linear map are mirrored in the domain: each one corresponds to exactly one subspace of $V$ containing the kernel. Questions about subspaces of the image can then be translated into questions about subspaces of $V$, and the same pattern appears for other structures, as the quaternion group example in this thread shows.

How is Knapp's Theorem 2.27 proved?

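The proof discussed in this thread works directly with images and inverse images of subspaces, not with matrices. For a subspace $T$ of the image, $L^{-1}(T)$ is a subspace of $V$ containing $L^{-1}(0) = U$, and $L(L^{-1}(T)) = T$. For a subspace $S$ of $V$ containing $U$, one shows that $L^{-1}(L(S)) = S$. These two identities say that $S \mapsto L(S)$ and $T \mapsto L^{-1}(T)$ are inverse to each other, so the correspondence is one-to-one and onto. Note that it is the induced map on subspaces that is shown to be injective, not $L$ itself.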

Can Knapp's Theorem 2.27 be applied to non-vector spaces?

As stated, Knapp's Theorem 2.27 is about vector spaces and linear maps. The same kind of result holds for other algebraic structures, though. For a group homomorphism $f: G \to H$ with kernel $K$, there is a bijection between the subgroups of $G$ containing $K$ and the subgroups of $f(G)$, as illustrated in this thread with the group of quaternion units.
