What is the meaning of the sigma notation in this matrix A?

In summary, the thread discusses the use of sigma notation in a computer science paper and the confusion it caused in the context of polynomial regression with multi-dimensional inputs. The discussion clarifies what the sigma notation means and how it fits into the overall calculation, concluding that the notation is shorthand for summing over certain values and that the surrounding context is needed to interpret it correctly.
  • #1
dodo21
I've been clawing at my mind for a while regarding this, I pray somebody can help me.

I was implementing a method from a computer science paper when I came across this:

http://www.mattkent.eu/challenge.png

Bearing in mind that x is a vector and epsilon is a vector of integers (raising x to epsilon is performed component-wise), I can't tell what exactly the sigma is summing.

Firstly, what does a sigma with index i mean in the matrix A? What exactly would be summed there? Is it the 'sum of the vector components for a given i', or the 'sum over all vectors, where each has been raised to epsilon'?

And secondly, in the definition of the b vector, what is the sigma doing? There's absolutely no index or limit associated with it. The t value is essentially a constant, so is this saying 'sum up the vector's components (once multiplied by t)'?

I'm at a loss. Please help super mathematicians.
 
  • #2
Assuming that x is a vector of dimension d, the subscript in x_i indicates that i is an index running from 1 to d. This sigma notation is shorthand:
[tex] \Sigma_j[/tex]
means "sum over the values of j", where the range of j should be understood from context.

Thus for example if you want the dot product of x and y:
[itex]\Sigma_i \mathbf{x}_i \mathbf{y}_i[/itex]

You might also sometimes see:
[tex]\sum_{a \in S}[/tex]
where S is a set.
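In code, that implied index just becomes an explicit loop. A minimal sketch of the dot-product example in plain Python (my own illustration, not from the paper):

```python
# Sigma_i x_i y_i with the range of i left implicit: here it is
# "obviously" the components of the vectors, i = 0..d-1.
x = [1.0, 2.0, 3.0]
y = [4.0, 5.0, 6.0]

dot = sum(x[i] * y[i] for i in range(len(x)))
print(dot)  # 1*4 + 2*5 + 3*6 = 32.0
```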
 
  • #3
Thanks for the quick response.

jambaugh said:
Assuming that x is a vector of dimension d then the subscript x_i indicates i is the index running from 1 to d.

Earlier in that section, the report in fact defines x_i (where i = 1, 2, ..., m) to be a vector from a set of m vectors, rather than a component of the vector x.

It then defines a second subscript (i.e. x_i_j) to indicate the jth component of the ith vector (where j = 1, 2, ..., d and d is the dimension of the vector).
 
  • #4
dodo21 said:
Thanks for the quick response.

Earlier in that section, the report in fact defines x_i (where i = 1, 2, ..., m) to be a vector from a set of m vectors, rather than a component of the vector x.

It then defines a second subscript (i.e. x_i_j) to indicate the jth component of the ith vector (where j = 1, 2, ..., d and d is the dimension of the vector).

OK, that is a useful detail; I should have read more carefully.
I would guess, then, that i indexes the set of vectors and that is what is being summed over. The definition of b looks to have simply dropped the i. The author appears to have gotten quite terse with the notation.

Parse this in stages.

You know what a matrix is so look at one entry, it is a sum.

You know what summing is (and you'll figure what i ranges over from context) so look at the terms of the sum.

You see that each term involves one of a set of vectors indexed by i (so i indexes the elements of that set), and you say the power notation is explained and should then yield a vector. You can add vectors, so that's not a problem; if you can evaluate the powers, then that also shouldn't be a problem.

Note: Since sums of matrices equal sums of corresponding entries you can probably rewrite A as:

[tex] \mathbf{A} = \sum_i \left[ \begin{array}{ccc} \mathbf{x}_i^{\epsilon_1+\epsilon_1}& \cdots & \mathbf{x}_i^{\epsilon_1+\epsilon_0}\\
\vdots & \ddots & \vdots \\
\mathbf{x}_i^{\epsilon_0+\epsilon_1}& \cdots & \mathbf{x}_i^{\epsilon_0+\epsilon_0}
\end{array}\right][/tex]

What I'm confused about is the "..." in the matrices. The dots should indicate an obvious pattern, but at face value the subscripts suggest only 2x2 entries.

I might be able to understand better if you give a little context. What's the paper, and what are A, b, and gamma supposed to be?
 
  • #5
The matrix is of variable dimension; the author chose the letter 'o' to signify 'order', so the matrix is o-by-o, depending on what o is.

I've taken this from the appendix of "Generalizing Surrogate-assisted Evolutionary Computation" by Lim et al. It's within the author's description of polynomial regression with an input of multiple dimensions.

In the context of polynomial regression (with a multi-dimensional input vector), x_i is an input vector of dimension d, for i = 1, 2, ..., m.

Epsilon_j is an 'exponent vector' for j = 1, 2, ..., o, where o is the order of the polynomial being regressed to, plus one. Each epsilon vector contains d integers (the exponent values).

W.r.t. the vector b: the t_i value is the output of the function being modelled, associated with the ith x vector.

Finally the gamma vector contains vectors of constants; C_i for all i = 1,2...o.

So the coefficient vector of vectors (does that necessarily result in a matrix if all nested vectors are the same length?) can be found:

gamma[transposed] = (A[inverse]*b[transposed])

I was speaking with a colleague this afternoon and we both came to the conclusion that in the matrix the sigma is indicating "sum for a given i"; similarly for the b vector. But I don't understand why the author would specifically note "for a given i" in the matrix and then not do so in the vector.
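For what it's worth, here is how I'd sketch that reading ("the sigma sums over the data index i") in numpy. The data, exponent vectors, and names below are made up for illustration, not taken from the paper:

```python
import numpy as np

# Toy data: m = 3 input vectors of dimension d = 2, with outputs t_i.
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 3.0]])
t = np.array([5.0, 4.0, 13.0])

# Exponent vectors epsilon_1..epsilon_o, each with d integer entries.
# o = 3 here: the terms 1, x, and y (exponents [0,0], [1,0], [0,1]).
eps = np.array([[0, 0], [1, 0], [0, 1]])
o = len(eps)

def power(v, e):
    # Component-wise power, then product: (x, y)^[a, b] = x^a * y^b.
    return np.prod(v ** e)

# A[j][k] = sum over i of x_i^(eps_j + eps_k)
A = np.array([[sum(power(x, eps[j] + eps[k]) for x in X) for k in range(o)]
              for j in range(o)])
# b[j] = sum over i of t_i * x_i^(eps_j)
b = np.array([sum(ti * power(x, eps[j]) for x, ti in zip(X, t))
              for j in range(o)])

gamma = np.linalg.solve(A, b)  # coefficient vector; exact fit here: [-4, 7/3, 10/3]
print(gamma)
```

Read this way, solving A gamma = b is exactly the least-squares normal equations, which is what polynomial regression boils down to.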
 
  • #6
dodo21 said:
The Matrix is of variable dimension, the author chose the letter 'o' to signify 'order' and so the matrix is o-by-o, depending on what 'o' is.
Ok, that makes sense.
dodo21 said:
I've taken this from the appendix of "Generalizing Surrogate-assisted Evolutionary Computation", by Lim et al. It's within the authors description of Polynomial Regression with an input of multiple dimensions.
Ahhh! That (the need to sum vector powers) makes sense in the context of regression calculations.

dodo21 said:
Epsilon_j is an 'exponent vector' for j = 1, 2, ... o. While o is the order of the polynomial+1 being regressed to. Each Epsilon vector contains d integers (the exponent values).
Then the vector to vector power gives you a term in the polynomial. I presume then that you don't get a vector but rather the product of the X components raised to the epsilon powers. Like:
[tex] x^2 y^3 z = (x,y,z)^{[2,3,1]}[/tex]
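In Python terms (just restating that identity, not anything from the paper):

```python
# (x, y, z)^[2, 3, 1] read as a multi-index: raise component-wise,
# then multiply the results together, yielding a scalar polynomial term.
def multi_power(v, e):
    result = 1.0
    for base, exp in zip(v, e):
        result *= base ** exp
    return result

print(multi_power((2.0, 3.0, 5.0), (2, 3, 1)))  # 2^2 * 3^3 * 5^1 = 540.0
```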

dodo21 said:
I was speaking with a colleague this afternoon and we both came to the conclusion that in the matrix the sigma is indicating "sum for a given i"; similarly for the b vector. But I don't understand why the author would specifically note "for a given i" in the matrix and then not do so in the vector.

It was probably a matter of neglect. With such involved formulas one tends to ignore "obvious" details such as over what one is summing.

EDIT: PS, it may be quite helpful to review some other references on polynomial regression in one and many variables to compare formulas. Likely part of the author's neglect was that he assumes the reader is somewhat familiar with polynomial regression.
 
  • #7
The author referenced an article where I assume he got this definition: a 1956 article by F. H. Lesh. The notation there is no different (aside from the swapping of several letters).

That said, I shall search around the world wide web for further definitions to see if any enlightenment is offered.

Thanks for your input jambaugh. It's genuinely appreciated.
 

