Is it invalid to redefine the sgn function in this way in a proof?

In summary: the definition of ##\textrm{sgn}## is reworked because the usual notation does not make explicit that the entries of a permutation of ##S_n## are distinct, which matters in some proofs about determinants. The motivating example is the fact that replacing a row of a matrix with itself plus a multiple of another row does not change the determinant. The revised definition interprets the entries of a permutation as integer-valued functions of the row vectors rather than as bare integers, so that identical rows receive the same column index and ##\textrm{sgn}## evaluates the resulting list to zero, yielding the conclusion ##\det A=\det A'##.
  • #1
Eclair_de_XII
TL;DR Summary
Let ##A## be an ##(n,n)## matrix. If ##A'## is the matrix obtained from adding a multiple of one row to another, then ##\det A=\det A'##.
Let ##S_n## denote ##\{1,\ldots,n\}##, where ##n\in\mathbb{N}##.

Recall that the ##\textrm{sgn}## function maps a permutation of ##S_n## to an element in ##\{1,0,-1\}##.

We want to rework the definition of ##\textrm{sgn}## because it is not sufficient for some proofs about determinants. For example, it is known that replacing one row of a given matrix with the sum of itself and a multiple of another row does not change the value of the determinant. The proof for this fact could start out in the following way.

Let ##A## be a matrix and let ##A'## be the matrix obtained from replacing the ##n\textrm{-th}## row with itself added to the first row multiplied by ##\alpha##, where ##\alpha## is a constant. In theory, replacing any row with itself added to a multiple of another would yield the same result, but these two rows are chosen specifically to economize on space. We proceed by computing ##\det A'##.

\begin{eqnarray*}
\det A'&=&\sum_{\textrm{n-lists}} \textrm{sgn}(j_1,\ldots,j_n) a_{1j_1}\cdot\ldots\cdot (\alpha\cdot a_{1j_n}+a_{nj_n})\\
&=&\alpha \sum_{\textrm{n-lists}} \textrm{sgn}(j_1,\ldots,j_n) a_{1j_1}\cdot\ldots\cdot a_{1j_n}\\
&+&\sum_{\textrm{n-lists}} \textrm{sgn}(j_1,\ldots,j_n) a_{1j_1}\cdot\ldots\cdot a_{nj_n}\\
&=&\alpha \sum_{\textrm{n-lists}} \textrm{sgn}(j_1,\ldots,j_n) a_{1j_1}\cdot\ldots\cdot a_{1j_n}+\det A
\end{eqnarray*}

The conclusion of the proof is that ##\det A'=\det A##, which means that the remaining summation (the one multiplied by ##\alpha##) must be identically zero. This holds if each summand is zero, a fact cited by some linear-algebra texts in order to prove this theorem.
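That leftover sum can be checked by brute force for a small ##n## (a minimal sketch, assuming the "n-lists" range over the permutations of ##(1,\ldots,n)##; the inversion-count helper and the random test entries are purely illustrative):

```python
from itertools import permutations
import random

def sgn(p):
    # sign of a permutation (tuple of distinct indices) via its inversion count
    inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inv % 2 else 1

n = 4
a = [[random.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]

# Leibniz-style sum in which the last factor is taken from row 1 again,
# i.e. the determinant of a matrix whose first and last rows coincide.
total = 0.0
for p in permutations(range(n)):
    prod = 1.0
    for row in range(n - 1):
        prod *= a[row][p[row]]
    prod *= a[0][p[n - 1]]          # repeated first row in the last slot
    total += sgn(p) * prod

print(total)   # ~0 up to floating-point error
```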

Assuming each summand is indeed zero, either there is an identically zero row in the matrix described by the first determinant term, or there are two identical column indices ##j_i,j_l## in the permutation of ##S_n##. In the latter scenario, ##\textrm{sgn}## evaluates to zero. On the other hand, these elements ##j_i## are meant to represent distinct numbers in the set ##S_n##, but the notation does not reflect this. The revised definition of ##\textrm{sgn}## that follows aims to correct this shortcoming.

It is slightly inaccurate to think of the elements in each permutation as integers; it would be more natural to think of them as integer-valued functions. We base this on the following idea: each distinct row vector must be assigned a unique column index in each summand of the determinant expansion. By construction, if any of the rows are identical, they are assigned the same column index, which means that ##\textrm{sgn}## evaluates the list of these indices to zero.

Let ##V_n## be a list of ##n## row vectors in ##\mathbb{F}^n##, not necessarily distinct.

Define ##f:S_n\longrightarrow V_n##. In other words, ##f## maps each integer in ##S_n## to some row vector of length ##n##. Generally speaking, we can interpret any matrix as an array of such mappings. Alternatively, ##f## can be interpreted as a function that determines the row vector in some row index of a given matrix.

Let ##T_n## denote a subset of ##S_n##, where ##T_n## has as many elements as there are distinct row vectors in the image of ##f##.

Now define ##g:V_n\longrightarrow T_n## as the function assigning each distinct row vector a unique column index. This function is one-to-one on the distinct row vectors.

The composition ##g\circ f## acts on an element ##k\in S_n## in two steps: first, ##f## maps ##k## to some row vector of the matrix; then ##g## maps this row vector to the column index ##(g\circ f)(k)##.

We redefine the argument of ##\textrm{sgn}## to be a permutation of the list obtained by evaluating ##h## on ##S_n##, where ##h## is defined below.

\begin{equation}
h(S_n)=\left((g\circ f)(1),\ldots,(g\circ f)(n)\right)
\end{equation}
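To make the construction concrete, here is a hypothetical toy implementation of ##f##, ##g##, and ##h## for a ##3\times 3## matrix whose first and last rows coincide; the names `f`, `g`, `h`, `extended_sgn`, and the convention that the extended ##\textrm{sgn}## returns ##0## on repeated entries simply mirror the definitions above:

```python
def extended_sgn(indices):
    # proposed convention: 0 if any index repeats, otherwise the usual parity sign
    if len(set(indices)) < len(indices):
        return 0
    inv = sum(1 for i in range(len(indices))
              for j in range(i + 1, len(indices)) if indices[i] > indices[j])
    return -1 if inv % 2 else 1

# Toy matrix whose first and last rows coincide (rows stored as tuples).
rows = [(1, 2, 3), (4, 5, 6), (1, 2, 3)]
n = len(rows)

f = {k: rows[k - 1] for k in range(1, n + 1)}           # f: row index -> row vector
distinct = list(dict.fromkeys(f.values()))              # distinct row vectors, in order
g = {v: i for i, v in enumerate(distinct, start=1)}     # g: row vector -> unique index in T_n

h = tuple(g[f[k]] for k in range(1, n + 1))             # h(S_n) = ((g∘f)(1), ..., (g∘f)(n))
print(h)                   # (1, 2, 1): identical rows receive the same index
print(extended_sgn(h))     # 0, as intended
```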

Going back to the proof from earlier, we have a matrix with two identical rows, one in the first row and the other in the last. Let ##a_1## denote the row vector in the first row of the matrix ##A'##.

\begin{eqnarray*}
f(1)=a_1=f(n)\\
g(f(1))=k=g(f(n))
\end{eqnarray*}

where ##k\in T_n##.

Let ##(i_1,\ldots,i_n)## be a permutation of ##h(S_n)##. At least two of its elements must be identical, so ##\textrm{sgn}## evaluates it to zero, and hence each summand in the sum derived earlier is identically zero.

This gives our desired result: ##\det A=\det A'##.
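The statement itself is also easy to spot-check numerically (a minimal sketch, assuming NumPy is available; the matrix size, the seed, and ##\alpha=3## are arbitrary illustrative choices):

```python
import numpy as np

# Build a random 4x4 matrix and a modified copy A' whose last row is
# (last row of A) + alpha * (first row of A).
rng = np.random.default_rng(0)
A = rng.integers(-5, 6, size=(4, 4)).astype(float)
alpha = 3.0

A_prime = A.copy()
A_prime[-1] += alpha * A[0]

# The two determinants agree up to floating-point rounding.
print(np.linalg.det(A), np.linalg.det(A_prime))
```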
 
Last edited:
  • #2
Aside from the fact that this might confuse readers, you will have to provide a new definition every time you invoke sgn in a new proof - just to avoid more confusion. I fail to see why you really need this.

A mathematician should weigh in on this topic I think. @fresh_42 might have something to say.
 
  • #3
Eclair_de_XII said:
Summary:: Let ##A## be an ##(n,n)## matrix. If ##A'## is the matrix obtained from adding a multiple of one row to another, then ##\det A=\det A'##.

Let ##S_n## denote ##\{1,\ldots,n\}##, where ##n\in\mathbb{N}##.

Recall that the ##\textrm{sgn}## function maps a permutation of ##S_n## to an element in ##\{1,0,-1\}##.
The signature function has only two values, ##\pm 1.##
Eclair_de_XII said:
We want to rework the definition of ##\textrm{sgn}## because it is not sufficient for some proofs about determinants. For example, it is known that replacing one row of a given matrix with the sum of itself and a multiple of another row does not change the value of the determinant. The proof for this fact could start out in the following way.

Let ##A## be a matrix and let ##A'## be the matrix obtained from replacing the ##n\textrm{-th}## row with itself added to the first row multiplied by ##\alpha##, where ##\alpha## is a constant. In theory, replacing any row with itself added to a multiple of another would yield the same result, but these two rows are chosen specifically to economize on space. We proceed by computing ##\det A'##.

\begin{eqnarray*}
\det A'&=&\sum_{\textrm{n-lists}} \textrm{sgn}(j_1,\ldots,j_n) a_{1j_1}\cdot\ldots\cdot (\alpha\cdot a_{1j_n}+a_{nj_n})\\
&=&\alpha \sum_{\textrm{n-lists}} \textrm{sgn}(j_1,\ldots,j_n) a_{1j_1}\cdot\ldots\cdot a_{1j_n}\\
&+&\sum_{\textrm{n-lists}} \textrm{sgn}(j_1,\ldots,j_n) a_{1j_1}\cdot\ldots\cdot a_{nj_n}\\
&=&\alpha \sum_{\textrm{n-lists}} \textrm{sgn}(j_1,\ldots,j_n) a_{1j_1}\cdot\ldots\cdot a_{1j_n}+\det A
\end{eqnarray*}

The conclusion of the proof is that ##\det A'=\det A##, which means that the remaining summation (the one multiplied by ##\alpha##) must be identically zero. This holds if each summand is zero, a fact cited by some linear-algebra texts in order to prove this theorem.
I doubt it very much because it is wrong. The sum equals ##0## because there are as many even permutations of ##S_n## as odd ones yielding the same product value, since two factors of the product are equal.

 
  • #4
The signature or signum function is defined as follows:

If ##N## is the number of permutations we need to bring an arbitrary order of ##(a_1,a_2,\ldots,a_n)## into the order ##(1,2, \ldots,n)## then ##\operatorname{sgn}((a_1,a_2,\ldots,a_n))=(-1)^N.## Here we have ##\{a_1,a_2,\ldots,a_n\}=\{1,2,\ldots,n\}=S_n## as sets. E.g.
\begin{align*}
(4,6,1,2,5,3) &\longrightarrow_1 (4,1,6,2,5,3) \longrightarrow_2 (1,4,6,2,5,3)\longrightarrow_3 (1,4,2,6,5,3)\\
&\longrightarrow_4 (1,2,4,6,5,3)\longrightarrow_5 (1,2,4,6,3,5)\longrightarrow_6 (1,2,4,3,6,5)\\
&\longrightarrow_7 (1,2,3,4,6,5)\longrightarrow_8 (1,2,3,4,5,6)\\ &\Longrightarrow \operatorname{sgn}((4,6,1,2,5,3))=(-1)^8=+1
\end{align*}
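For reference, this counting rule is easy to sketch in code (a minimal illustration; `sgn_by_swaps` counts the adjacent swaps a bubble sort needs to reach ##(1,2,\ldots,n)##):

```python
def sgn_by_swaps(seq):
    # count the adjacent swaps a bubble sort needs to sort the sequence,
    # then return (-1)**N
    s = list(seq)
    swaps = 0
    for i in range(len(s)):
        for j in range(len(s) - 1 - i):
            if s[j] > s[j + 1]:
                s[j], s[j + 1] = s[j + 1], s[j]
                swaps += 1
    return (-1) ** swaps

print(sgn_by_swaps((4, 6, 1, 2, 5, 3)))   # +1, matching the eight swaps above
```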
 
  • #5
Assume we sum over all permutations of length ##4##. Then we have to sum over all of the index lists
\begin{align*}
1234 (+)&\;;\; 1243 (-) \;;\;1423 (+)\;;\;1432 (-)\;;\;4132 (+)\;;\;4123 (-)\\
4213 (+)&\;;\; 4231 (-) \;;\;4321 (+)\;;\;4312 (-)\;;\;3412 (+)\;;\;3421 (-)\\
3241 (+)&\;;\; 3214 (-) \;;\;2314 (+)\;;\;2341 (-)\;;\;2431 (+)\;;\;2413 (-)\\
2143 (+)&\;;\; 2134 (-) \;;\;1342 (+)\;;\;1324 (-)\;;\;3124 (+)\;;\;3142 (-)
\end{align*}
Now let us consider the case where the first and the last row are equal, so that the index ##4## is replaced by ##1##. This means
\begin{align*}
1231 (+)&\;;\; 1213 (-) \;;\;1123 (+)\;;\;1132 (-)\;;\;1132 (+)\;;\;1123 (-)\\
1213 (+)&\;;\; 1231 (-) \;;\;1321 (+)\;;\;1312 (-)\;;\;3112 (+)\;;\;3121 (-)\\
3211 (+)&\;;\; 3211 (-) \;;\;2311 (+)\;;\;2311 (-)\;;\;2131 (+)\;;\;2113 (-)\\
2113 (+)&\;;\; 2131 (-) \;;\;1312 (+)\;;\;1321 (-)\;;\;3121 (+)\;;\;3112 (-)
\end{align*}
and if I made no mistake, we have every index combination twice with different signs. This is what happens in your formula, and why the sum adds up to ##0.##
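That pairwise cancellation can also be verified mechanically (a minimal sketch; the grouping key replaces the index ##4## by ##1##, exactly as in the table above):

```python
from itertools import permutations
from collections import defaultdict

def sgn(p):
    inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inv % 2 else 1

# Group the 24 permutations of (1,2,3,4) by the index list obtained after
# identifying 4 with 1, and add up the signs within each group.
groups = defaultdict(int)
for p in permutations((1, 2, 3, 4)):
    key = tuple(1 if x == 4 else x for x in p)
    groups[key] += sgn(p)

print(all(total == 0 for total in groups.values()))   # True: every combination cancels
```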
 
  • #6
fresh, in line 2 of post #4, i think you meant "transposition", instead of "permutation".
 
  • #7
Thanks for going through all the trouble of explaining all this, @fresh_42 . I appreciate the clarification.
 
  • #8
mathwonk said:
fresh, in line 2 of post #4, i think you meant "transposition", instead of "permutation".
I meant what common language uses if two elements are swapped. I looked up the German word for it in the dictionary and got permutation. I tried to avoid technical terms.
 
  • #9
i think that makes it hard to have this conversation. in english a permutation of a set of n elements is a bijection of that set with itself. A transposition is a special permutation in which only two elements are interchanged. It is a theorem that every permutation can be written as a composition of a sequence of transpositions, but not uniquely; however, any two different sequences of transpositions which express the same permutation have the same parity. Then each permutation has a sign, which is defined as 1 or -1, according as the parity of the number of transpositions needed to express that permutation is even or odd.
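For small ##n## this parity statement can be spot-checked (a minimal sketch; the sign is computed once from a decomposition into transpositions via cycles and once from the inversion count, and the two agree):

```python
from itertools import permutations

def sign_by_transpositions(p):
    # a cycle of length L is a product of L-1 transpositions,
    # so the sign is (-1) raised to the total number of such transpositions
    n, seen, transpositions = len(p), [False] * len(p), 0
    for start in range(n):
        i, length = start, 0
        while not seen[i]:
            seen[i] = True
            i = p[i]
            length += 1
        if length:
            transpositions += length - 1
    return (-1) ** transpositions

def sign_by_inversions(p):
    inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return (-1) ** inv

# both definitions agree on every permutation of a small set
print(all(sign_by_transpositions(p) == sign_by_inversions(p)
          for p in permutations(range(5))))   # True
```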
 
  • #10
mathwonk said:
i think that makes it hard to have this conversation. in english a permutation of a set of n elements is a bijection of that set with itself. A transposition is a special permutation in which only two elements are interchanged. It is a theorem that every permutation can be written as a composition of a sequence of transpositions, but not uniquely; however, any two different sequences of transpositions which express the same permutation have the same parity. Then each permutation has a sign, which is defined as 1 or -1, according as the parity of the number of transpositions needed to express that permutation is even or odd.
Sure, permutation means the same thing here. But these are technical terms and ##\operatorname{Sym}(n)## isn't the main topic here, so there is no need to talk about the group. The example explains it better than any formulas in my opinion. Yes, we have to sum over ##\operatorname{Sym}(n)## and ##\operatorname{Sym}(n)/\operatorname{A}_n## plays an essential role; however, I didn't want to lecture on the determinant. I just wanted to demonstrate why this specific sum is zero. If I had used transposition, then I would have had to define it and to explain which kind of transposition I meant, plus it is confusing in this context.

The English word I was looking for was the translation of "vertauschen". Unfortunately, there is no exact English equivalent. "Exchange" comes close, but there is no "ex" in the action. "Swap" might have worked, but this is too general. As mentioned, the dictionary said "Vertauschung = permutation". And since every transposition is a permutation, I don't see why it should have been wrong.

After all, the point is to help, not to brag about what a smart guy I am who knows so much about permutation groups.
 
  • #11
sorry for not being helpful. sounds as if you worked hard at making it clear to your audience, i hereby butt out. i just missed your point.
 
  • #12
mathwonk said:
sorry for not being helpful. sounds as if you worked hard at making it clear to your audience, i hereby butt out. i just missed your point.
Sorry for being a bit harsh.

To be honest, I often hope that a dialogue will grow out of first answers, where I could address all the things I also have in mind: why I wrote ##(1234)## and ##(1231)## instead of ##a_1a_2a_3a_4## and ##a_1a_2a_3a_1##, the necessary pairing of permutations to get the desired result zero, transpositions, wedge products of row vectors, the bad notation ##S_n=\{1,\ldots,n\}## instead of ##S_n=\operatorname{Sym}(n)## which I only used because the OP had already burnt ##S_n##, the unclear use of "n-lists" in the OP, the remark that it doesn't work if the first row is replaced by ##\alpha r_1+r_n## which looks very similar at first glance, the characteristic polynomial, and so on. There's a lot that can be said about determinants, but it depends on the OP to get a real chance to do so. If the topic weren't already exhaustively discussed on so many websites, and weren't so full of formulas and indices, I'd have written an Insights article about it.
 
Last edited:
  • #13
I am impressed by your work at making your answer suited to your questioner. That is the sign of a good teacher in my opinion, and you seem to be one. The rigorous definition of determinants is something that has always been a bit sticky for me, and I think an insights article might be quite welcome.

By the way, since words always fascinate me, what do you think of the english words "swap", or "interchange", as translations of vertauschen?
 
  • #14
"Swap if two neighboring numbers are in the wrong order" is probably the best description. Translations are difficult, regardless of which direction. Words always carry certain interpretations that do not translate well. My favorite examples are "schweigen" for being silent, but actively, not passive as in its English use, and "sophisticated". There are many possible translations but neither hits the nail.
 
  • #15
IMHO either "swap" or "exchange" is better than "permutation" here. Both swap and exchange imply an operation that affects exactly two elements, while a permutation may affect any number of elements.
 
  • #16
interesting!
say, in reference to wedge products, are you interested in koszul homology? I just got a boost for a result i think interesting, from vanishing of 1st koszul homology for a "regular sequence" of elements of a ring. (as you probably know, regularity seems to be a generalization of two elements being relatively prime.) If you are not into this, 1st koszul homology seems to be an aspect of "wedge 2", or 2x2 determinants, that illuminates relations between entries of a sequence of elements of a ring.

I have always been shy of this subject as well, but a friend pointed out to me how it provides an answer to a question i find of interest in algebraic geometry. namely vanishing of just the 1st koszul homology implies that if the tangent cones to a sequence of m hypersurfaces intersect in codimension m, then their intersection is the tangent cone to the intersection variety of the original hypersurfaces.

In general as you probably know, the intersection of the tangent cones may not be the tangent cone to the intersection. anyway, i am taking this thread away from its purpose. apologies.

but my point was that if you are up for it, an insights article on determinants, wedge products, and perhaps koszul homology would be a gift to many. (I still don't know what the vanishing of higher koszul homology means.)
 

