Math terminology in my Taylor Series expansion?

  • #1
hotvette
Homework Helper
I have another dilemma with terminology that is puzzling and would appreciate some advice.

Consider the following truncated Taylor Series:
$$\begin{equation*}
f(\vec{z}_{k+1}) \approx f(\vec{z}_k)
+ \frac{\partial f(\vec{z}_k)}{\partial x} \Delta x
+ \frac{\partial f(\vec{z}_k)}{\partial \beta_1} \Delta \beta_1
+ \frac{\partial f(\vec{z}_k)}{\partial \beta_2} \Delta \beta_2 + \dots
+ \frac{\partial f(\vec{z}_k)}{\partial \beta_n} \Delta \beta_n
\end{equation*}$$
where:
$$\begin{align*} f &= f(\vec{z}\,) &
\vec{z} &= (x;\vec{\beta}\,) \\
\vec{z}_k &= (x;\vec{\beta}\,)_k &
\vec{z}_{k+1} &= (x;\vec{\beta}\,)_{k+1} \\
\Delta x &= x_{k+1} - x_k &
\Delta \beta_j &= (\beta_j)_{k+1} - (\beta_j)_k \\
\vec{\beta} &= (\beta_1, \beta_2, \dots, \beta_n)
\end{align*}$$
Then form the following function with the truncated Taylor Series:
$$\begin{equation*}
L = \sum_{i=0}^m \left[ f_i
+ \frac{\partial f_i}{\partial x_i} \Delta x_i
+ \frac{\partial f_i}{\partial \beta_1} \Delta \beta_1
+ \frac{\partial f_i}{\partial \beta_2} \Delta \beta_2 + \dots
+ \frac{\partial f_i}{\partial \beta_n} \Delta \beta_n - y_i
\right]^2 \end{equation*}$$
where:
$$\begin{align*}
f_i &= f(\vec{z}_k) &
\vec{z}_k &= (x_i; \vec{\beta}\,)_k \\
\Delta x_i &= (x_i)_{k+1} - (x_i)_k &
\Delta \beta_j &= (\beta_j)_{k+1} - (\beta_j)_k \\
\vec{x} &= (x_1, x_2, \dots, x_m) &
\vec{y} &= (y_1, y_2, \dots, y_m)
\end{align*}$$
The ##\partial x_i## in the second term of the sum matches the ##\Delta x_i##, but it doesn't seem right because it implies ##f=f(\vec{x};\vec{\beta})##, which isn't true. Using ##\partial x## instead doesn't seem right either, because then it doesn't match ##\Delta x_i##. How should this be handled?
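For concreteness, here is a minimal numerical sketch of how the sum ##L## might be evaluated. The model ##f(x;\beta_1,\beta_2)=\beta_1 e^{\beta_2 x}## and all of the numbers below are made up purely for illustration, not part of the actual problem. The point of interest is that in code the derivative is a function of the single variable ##x## that is merely evaluated at each value ##x_i##.
[CODE=python]
# Hedged sketch: a toy model f(x; b1, b2) = b1*exp(b2*x) with made-up data.
# df/dx is a function of the single variable x; it is only *evaluated* at the
# values x_i when L is formed.
import numpy as np

def f(x, b):
    return b[0] * np.exp(b[1] * x)

def df_dx(x, b):                      # partial f / partial x
    return b[0] * b[1] * np.exp(b[1] * x)

def df_dbeta(x, b):                   # gradient of f w.r.t. (beta_1, beta_2)
    return np.array([np.exp(b[1] * x), b[0] * x * np.exp(b[1] * x)])

x  = np.array([0.0, 0.5, 1.0, 1.5])   # values x_1 .. x_m (hypothetical)
y  = np.array([1.0, 1.6, 2.8, 4.4])   # data y_1 .. y_m (hypothetical)
b  = np.array([1.0, 1.0])             # beta at iterate k
dx = np.full_like(x, 0.01)            # Delta x_i
db = np.array([0.05, -0.02])          # Delta beta_j

# L = sum_i [ f(x_i;b) - y_i + (df/dx)(x_i;b)*Dx_i + sum_j (df/db_j)(x_i;b)*Db_j ]^2
L = sum((f(xi, b) - yi + df_dx(xi, b) * dxi + df_dbeta(xi, b) @ db) ** 2
        for xi, yi, dxi in zip(x, y, dx))
print(L)
[/CODE]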
 
  • #2
I guess that L would be:
[attached image: handwritten expression for L]

Does it make sense?
 
  • #3
Thanks for your comment (and for catching the error on the sum index). If I wanted to make the expression for L more compact, it would be:
$$\begin{equation*}
L = \sum_{i=1}^m \left[ f_i - y_i + \frac{\partial f_i}{\partial x_i} \Delta x_i
+ \sum_{k=1}^n \frac{\partial f_i}{\partial \beta_k} \Delta \beta_k \right]^2
\end{equation*}$$
but the dilemma remains with ##\partial x_i##. It's a strange situation; ##f## has only one ##x## as an independent variable but the sum involves ##m## values of ##x##. I'm tempted to use:
$$\frac{\partial f_i}{\partial x} \Delta x_i$$ but it just doesn't look right.
 
  • #4
In my handwritten version I introduced ##j## as another dummy index for the sum, as you did with ##k##. Does that help?
 
  • #5
I don't think so. The dilemma really shows up when L is expanded, where we let ##r_i = (f_i - y_i)##:
$$\begin{align*}
L &= \left[ r_1 + \frac{\partial f_1}{\partial x} \Delta x_1
+ \sum_{k=1}^n \frac{\partial f_1}{\partial \beta_k} \Delta \beta_k \right]^2
+ \left[ r_2 + \frac{\partial f_2}{\partial x} \Delta x_2
+ \sum_{k=1}^n \frac{\partial f_2}{\partial \beta_k} \Delta \beta_k \right]^2
\\
&+ \dots
+ \left[ r_m + \frac{\partial f_m}{\partial x} \Delta x_m
+ \sum_{k=1}^n \frac{\partial f_m}{\partial \beta_k} \Delta \beta_k \right]^2
\end{align*}$$
or (?)
$$\begin{align*}
L &= \left[ r_1 + \frac{\partial f_1}{\partial x_1} \Delta x_1
+ \sum_{k=1}^n \frac{\partial f_1}{\partial \beta_k} \Delta \beta_k \right]^2
+ \left[ r_2 + \frac{\partial f_2}{\partial x_2} \Delta x_2
+ \sum_{k=1}^n \frac{\partial f_2}{\partial \beta_k} \Delta \beta_k \right]^2
\\
&+ \dots
+ \left[ r_m + \frac{\partial f_m}{\partial x_m} \Delta x_m
+ \sum_{k=1}^n \frac{\partial f_m}{\partial \beta_k} \Delta \beta_k \right]^2
\end{align*}$$
I think the first one is mathematically correct because ##f_i=f(x_i;\vec{\beta})=f(x;\vec{\beta})|_{x_i}##, but it seems to violate the usual nomenclature for a Taylor Series. Maybe that's OK?
 
  • #6
Let me say what I think I understand.
##f_i## is an (n+1)-variable function ##f_i(x_i;\beta_1,\beta_2,...,\beta_n)##, and the partial derivatives are taken keeping the other n variables constant, i.e.
[tex]\frac{\partial }{\partial x_i}|_{\beta_1,\beta_2,...\beta_n}[/tex]
where the usual
[tex] \frac{\partial }{\partial x_1}|_{x_2,x_3,...x_m}[/tex]
does not apply, and
[tex]\frac{\partial }{\partial \beta_1}|_{x_i,\beta_2,...\beta_n} := (\frac{\partial }{\partial \beta_1})_i[/tex]
and so on. I introduced the right-hand side notation to make it clear that these partial derivative operators are different for different ##i##.

In full details
[tex]L=\sum_{i=1}^m [\ r_i+[\ \triangle x_i \frac{\partial }{\partial x_i}|_{\beta_1,\beta_2,..,\beta_n}+\sum_{j=1}^n \triangle \beta_j \frac{\partial }{\partial \beta_j}|_{x_i,\beta_1,...,\beta_{j-1},\beta_{j+1} ..,\beta_n} \ ]\ f_i\ \ ]^2[/tex]
Using the above said convention
[tex]L=\sum_{i=1}^m [\ r_i+[\ \triangle x_i \frac{\partial}{\partial x_i}+\sum_{j=1}^n \triangle \beta_j (\frac{\partial}{\partial \beta_j})_i\ ]\ f_i\ \ ]^2[/tex]

[EDIT]
Let us introduce the (m+n)-dimensional vector
[tex]\mathbf{q}=(q_1,q_2,...q_{m+n})=(x_1,x_2,..,x_m,\ \beta_1,\beta_2,...,\beta_n)[/tex]
[tex]L=\sum_{i=1}^m [\ r_i+ \sum_{k=1}^{m+n} \triangle q_k \frac{\partial f_i}{\partial q_k}\ ]^2[/tex]
[tex]=\sum_{i=1}^m [\ r_i+ \triangle \mathbf{q} \cdot \frac{\partial }{\partial \mathbf{q}}f_i\ ]^2[/tex]
We get a simple formula at the expense of many zeros in the sum, because ##f_i## is a function of ##x_i## and not of ##x_k## for ##k \neq i##, so
[tex]\frac {\partial f_i}{\partial x_k}=0 \qquad (k \neq i)[/tex]
regardless of which variables are held constant.

This formula remains useful in the case where ##f_i## is a function of several of the ##x##'s, i.e.
[tex]f_i(x_1,x_2,..,x_m,\ \beta_1,\beta_2,...,\beta_n)[/tex]
which is an extension of the original
[tex]f_i(x_i,\ \beta_1,\beta_2,...,\beta_n)[/tex]
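A small numerical sketch of this ##\mathbf{q}##-vector formulation (same toy model ##f(x;\beta_1,\beta_2)=\beta_1 e^{\beta_2 x}## and made-up numbers as before; nothing here is from the actual problem): each row of the Jacobian-like matrix has exactly one nonzero ##x##-column because ##\partial f_i/\partial x_k=0## for ##k\neq i##.
[CODE=python]
# Sketch (toy model and data are assumptions, not from the thread) of the
# q = (x_1..x_m, beta_1..beta_n) formulation.
import numpy as np

def f(x, b):        return b[0] * np.exp(b[1] * x)
def df_dx(x, b):    return b[0] * b[1] * np.exp(b[1] * x)
def df_dbeta(x, b): return np.array([np.exp(b[1] * x), b[0] * x * np.exp(b[1] * x)])

x = np.array([0.0, 0.5, 1.0, 1.5])
y = np.array([1.0, 1.6, 2.8, 4.4])
b = np.array([1.0, 1.0])
m, n = len(x), len(b)

J = np.zeros((m, m + n))              # columns ordered as q = (x_1..x_m, beta_1..beta_n)
for i in range(m):
    J[i, i]  = df_dx(x[i], b)         # the single nonzero x-column in row i
    J[i, m:] = df_dbeta(x[i], b)      # the beta-columns

dq = np.concatenate([np.full(m, 0.01), [0.05, -0.02]])   # (Delta x, Delta beta)
r  = f(x, b) - y                                          # r_i = f_i - y_i
L  = np.sum((r + J @ dq) ** 2)
print(L)
[/CODE]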
 
  • #7
I decided to simplify and go back to fundamentals as a way to try to sort this out. Below is the result. Start with the following definition of truncated Taylor Series:
$$\begin{equation*} f(x) \approx f(x_0) + \frac{d f(x_0)}{dx} (x-x_0) \end{equation*}$$
then the following should be valid (where each ##x_i## is a distinct point and ##x'_i## is near ##x_i##):
$$\begin{align*}
&f(x'_0) \approx f(x_0) + \frac{d f(x_0)}{dx} (x'_0-x_0) \\
&f(x'_1) \approx f(x_1) + \frac{d f(x_1)}{dx} (x'_1-x_1) \\
&f(x'_2) \approx f(x_2) + \frac{d f(x_2)}{dx} (x'_2-x_2) \\
&\qquad \vdots \\
&f(x'_m) \approx f(x_m) + \frac{d f(x_m)}{dx} (x'_m-x_m)
\end{align*}$$
Form the sum:
$$\begin{align*}
&L = \left[ f(x_1)-y_1 + \frac{df(x_1)}{dx} (x'_1-x_1) \right]^2
+ \left[ f(x_2)-y_2 + \frac{df(x_2)}{dx} (x'_2-x_2) \right]^2 \\
&+ \dots + \left[ f(x_m) - y_m + \frac{d f(x_m)}{dx} (x'_m-x_m) \right]^2
\\\\
&L = \sum_{i=1}^m \left[ f(x_i) - y_i + \frac{d f(x_i)}{dx} (x'_i-x_i) \right]^2 \\
&L = \sum_{i=1}^m \left[ f_i - y_i + \frac{df_i}{dx} \Delta x_i \right]^2
\end{align*}$$
where we let ##f(x_i) = f_i## and ##(x'_i-x_i) = \Delta x_i##. Is there any flaw in the above? I can't see any.
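As a sanity check, here is a quick numerical sketch (using a toy ##f(x)=x^2## and made-up numbers, purely for illustration) comparing the linearized sum against the same sum built with the exact ##f(x'_i)##:
[CODE=python]
# Numerical check of the one-variable linearization, with an assumed toy f.
import numpy as np

f  = lambda x: x ** 2                 # toy function (an assumption)
df = lambda x: 2 * x                  # its exact derivative

x  = np.array([1.0, 2.0, 3.0])        # x_1 .. x_m
dx = np.array([0.01, -0.02, 0.015])   # Delta x_i = x'_i - x_i
y  = np.array([1.1, 3.9, 9.2])        # hypothetical data y_i

# L = sum_i [ f(x_i) - y_i + f'(x_i) * Delta x_i ]^2   (the linearized form)
L_linear = np.sum((f(x) - y + df(x) * dx) ** 2)

# Same sum using the exact f(x'_i) instead of the linearization
L_exact = np.sum((f(x + dx) - y) ** 2)
print(L_linear, L_exact)              # nearly equal, since the Delta x_i are small
[/CODE]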
 
  • #8
With regard to the last formula, I would like to propose
[tex]L=\sum_{i=1}^m[f_i-y_i+f'_i\triangle x_i]^2[/tex]
where
[tex]f'_i=f'(x_i)=\frac{df}{dx}(x_i)[/tex]
to avoid possible confusion of ##x## or ##x_i## in the derivative denominator.

With the m sets of values ##\{x_i,\triangle x_i\}## given, and the functions ##f(x)##, ##f'(x)##, and ##y(x)## in
[tex]y_i=y(x_i)[/tex]
known, I can calculate L.

Now I notice that I misunderstood you in the previous posts.
I took ##x_i## to be one of the variables ##x_1,x_2,\dots##; now I see that ##x_i## is a value of the single variable ##x##.
I would like to amend the formula for L to
[tex]L=\sum_{i=1}^m [\ r_i+\ \triangle x_i \frac{\partial f}{\partial x}(x_i;\beta_1,\beta_2,..,\beta_n)|_{\beta_1,\beta_2,..,\beta_n}+\sum_{j=1}^n \triangle \beta_j \frac{\partial f}{\partial \beta_j}(x_i;\beta_1,\beta_2,..,\beta_n)|_{x,\beta_1,...,\beta_{j-1},\beta_{j+1} ..,\beta_n} \ \ ]^2[/tex]
We may write it more briefly as
[tex]=\sum_{i=1}^m [\ r_i+\ \triangle x_i f_{x\ i}+\sum_{j=1}^n \triangle \beta_j f_{\beta_j\ i}\ ]^2[/tex]
with the convention for the function
[tex]\frac{\partial f}{\partial \alpha}:=f_\alpha[/tex]
and for its value at the input ##x_i##
[tex]\frac{\partial f}{\partial \alpha}(x_i):=f_{\alpha\ i}[/tex]
omitting mention of the ##\beta##'s.

The notations
[tex]\frac{df}{dx_i},\frac{df_i}{dx},\frac{df_i}{dx_i}[/tex]
are misleading, because the subscripted symbols are not variables but values, which makes these expressions as inappropriate as
[tex]\frac{df}{d\,2},\frac{d\,10}{dx},\frac{d\,10}{d\,2}[/tex] are.
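The variable-versus-value distinction can also be put in programming terms (an analogy only, with a toy ##f(x)=x^2## assumed): ##df/dx## is a function of the variable ##x##, while ##f'_i## is the number obtained by evaluating that function at the value ##x_i##.
[CODE=python]
# Analogy in code (toy example, not from the thread): df_dx is a *function* of
# the variable x; fp_i is the *value* obtained by evaluating it at x_i.
df_dx = lambda x: 2.0 * x   # derivative of the toy f(x) = x**2
x_i = 3.0                   # a specific value of x
fp_i = df_dx(x_i)           # f'_i = f'(x_i): just a number
print(fp_i)                 # 6.0 -- "d f_i / d x_i" would make no sense here
[/CODE]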
 
  • #9
Does the following work?
\begin{equation*}
L = \sum_{i=1}^m \left[ f_i - y_i + \Delta x_i \cdot \frac{\partial f}{\partial x} (x_i; \vec{\beta})
+ \sum_{k=1}^n \Delta \beta_k \cdot \frac{\partial f}{\partial \beta_k} (x_i; \vec{\beta}) \right]^2
\end{equation*}
with the clarification that:
\begin{equation*}
\frac{\partial f}{\partial x} (x_i; \vec{\beta}) = \frac{\partial f}{\partial x} |_{x_i; \vec{\beta}}
\end{equation*}

I had a completely different line of thought. The following is directly from my Advanced Calculus book (Kaplan, fourth edition 1957), p370:
\begin{equation*}
f(x,y) = f(x_1, y_1) + \left[ \frac{\partial f}{\partial x}(x-x_1) + \frac{\partial f}{\partial y}(y-y_1) \right] + \dots
\end{equation*}
all derivatives evaluated at ##(x_1, \,y_1)##.

But, what if I wanted to know ##f(x, y)## for specific values of ##x## and ##y##, say ##x=3## and ##y=4##. Isn't the following valid (as long as ##3## is close to ##x_1## and ##4## is close to ##y_1##)?

\begin{equation*}
f(3,4) = f(x_1, y_1) + \left[ \frac{\partial f}{\partial x}(3-x_1) + \frac{\partial f}{\partial y}(4-y_1) \right] + \dots
\end{equation*}
all derivatives evaluated at ##(x_1, \,y_1)##.
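A quick numerical check of this, with a made-up ##f(x,y)=xy+y^2## and an expansion point chosen near ##(3,4)##; the function and numbers are assumptions for illustration only:
[CODE=python]
# First-order expansion of an assumed f(x, y) about a nearby point (x1, y1).
f  = lambda x, y: x * y + y ** 2      # made-up f(x, y)
fx = lambda x, y: y                   # partial f / partial x
fy = lambda x, y: x + 2 * y           # partial f / partial y

x1, y1 = 2.9, 4.2                     # expansion point near (3, 4)
approx = f(x1, y1) + fx(x1, y1) * (3 - x1) + fy(x1, y1) * (4 - y1)
print(approx, f(3, 4))                # ~27.98 vs the exact 28.0
[/CODE]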
 
  • #10
Yes, it seems to work. Possible concerns are:
##\beta_k## is used both as a variable in the partial derivative and as a specific value in ##\vec{\beta}##. (Below I used a \boldsymbol font for the variable, but it does not make a big difference in appearance.)
##(\triangle x)_i## might be better than ##\triangle x_i##, to avoid a possible misinterpretation of ##\triangle## multiplied by ##x_i##.
[tex]\frac{\partial f}{\partial x}(x_i;\vec{\beta}):=\frac{\partial f}{\partial x}|_{\vec{\beta}}(x_i)[/tex]
[tex]\frac{\partial f}{\partial \boldsymbol{\beta}_k}(x_i;\vec{\beta}):=\frac{\partial f}{\partial\boldsymbol{\beta}_k}|_{x_i,\beta_1,...\beta_{k-1},\beta_{k+1},...,\beta_n}(\beta_k)[/tex]
 
  • #11
hotvette said:
But, what if I wanted to know ##f(x,y)## for specific values of ##x## and ##y##, say ##x=3## and ##y=4##. Isn't the following valid (as long as ##3## is close to ##x_1## and ##4## is close to ##y_1##)?

$$f(3,4)=f(x_1,y_1)+\left[\frac{\partial f}{\partial x}(3-x_1)+\frac{\partial f}{\partial y}(4-y_1)\right]+\dots$$
all derivatives evaluated at ##(x_1,\,y_1)##.
In order to avoid possible misinterpretation I would like to write it as
[tex]f(3,4)=f(x_1,y_1)+(3-x_1)\frac{\partial f}{\partial x}(x_1,y_1)+(4-y_1)\frac{\partial f}{\partial y}(x_1,y_1)+...[/tex]
where ##(A)g(B)## means ##(A)## multiplied by ##g(B)##, ##B## being the variable(s) of the function ##g##.
 
  • #12
Thanks for the discussion. It was very helpful.
 
  • #13
I think I found a solution to the dilemma. A caution was posted earlier not to confuse variables and values, but the point didn't really hit me until now. I was using ##\beta_k## as a variable but ##x_i## as a specific value of ##x##. I think what resolves the inconsistent use of terminology is to define a set of functions ##f_i = f_i(x_i; \vec{\beta})##, where ##x_i## and ##\beta_k## are variables, ##\vec{\beta} = (\beta_1, \beta_2, \dots, \beta_{k-1}, \beta_k, \beta_{k+1}, \dots, \beta_n)##, and ##\vec{x} = (x_1, x_2, \dots, x_{i-1}, x_i, x_{i+1}, \dots, x_m)##. Letting subscripts ##p## and ##p+1## represent iterates of specific values of the variables, the Truncated Taylor Series expansions of each ##f_i## might be:
\begin{align*}
f_1 &\approx f_1(x_{1p};\vec{\beta}_p) + \Delta x_1 \cdot \frac{\partial f_1}{\partial x_1} (x_{1p};\vec{\beta}_p)
+ \sum_{k=1}^n \Delta \beta_k \cdot \frac{\partial f_1}{\partial \beta_k} (x_{1p};\vec{\beta}_p)
\\
f_2 &\approx f_2 (x_{2p};\vec{\beta}_p) + \Delta x_2 \cdot \frac{\partial f_2}{\partial x_2} (x_{2p};\vec{\beta}_p)
+ \sum_{k=1}^n \Delta \beta_k \cdot \frac{\partial f_2}{\partial \beta_k} (x_{2p};\vec{\beta}_p)
\\ \vdots \\
f_m &\approx f_m (x_{mp};\vec{\beta}_p) + \Delta x_m \cdot \frac{\partial f_m}{\partial x_m} (x_{mp};\vec{\beta}_p)
+ \sum_{k=1}^n \Delta \beta_k \cdot \frac{\partial f_m}{\partial \beta_k} (x_{mp};\vec{\beta}_p)
\end{align*}
Also:
\begin{align*}
\vec{x}_p &= (x_1, x_2, \dots, x_m)_p = (x_{1p}, x_{2p}, \dots, x_{mp} ) \\
\vec{\beta}_p &= (\beta_1, \beta_2, \dots, \beta_n)_p = (\beta_{1p}, \beta_{2p}, \dots, \beta_{np} ) \\
\Delta x_i &= (x_i)_{p+1} - (x_i)_p \\
\Delta \beta_k &= (\beta_k)_{p+1} - (\beta_k)_p \\
\Delta \vec{x} &= (\Delta x_1, \Delta x_2, \dots, \Delta x_m) \\
\Delta \vec{\beta} &= (\Delta \beta_1, \Delta \beta_2, \dots, \Delta \beta_n) \\
\vec{x}_{p+1} &= \vec{x}_{p} + \Delta \vec{x} \\
\vec{\beta}_{p+1} &= \vec{\beta}_{p} + \Delta \vec{\beta}
\end{align*}
The function ##L## then becomes:
\begin{equation*}
L = \sum_{i=1}^m \left[ f_i (x_{ip};\vec{\beta}_p) + \Delta x_i \cdot \frac{\partial f_i}{\partial x_i} (x_{ip};\vec{\beta}_p)
+ \sum_{k=1}^n \Delta \beta_k \cdot \frac{\partial f_i}{\partial \beta_k} (x_{ip};\vec{\beta}_p) - \widetilde{y}_i \right]^2
\end{equation*}
where ##\widetilde{y}_i## are constants. I plan to work out a way to get rid of the double subscripts. Per the last reply on my other post, I will also consider ##x_i^{(p)}## and ##\beta_k^{(p)}## to denote the values.
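A minimal sketch of evaluating this final form of ##L##, assuming a toy model ##f_i(x_i;\vec{\beta}) = \beta_1 e^{\beta_2 x_i}## and made-up numbers (only the bookkeeping of the iterate ##p## and the ##\Delta## steps is the point here):
[CODE=python]
# Sketch (toy model and numbers are assumptions, not from the thread).
# Everything is evaluated at iterate p; the Deltas carry the step to p+1.
import numpy as np

def f(x, b):        return b[0] * np.exp(b[1] * x)
def df_dx(x, b):    return b[0] * b[1] * np.exp(b[1] * x)
def df_dbeta(x, b): return np.array([np.exp(b[1] * x), b[0] * x * np.exp(b[1] * x)])

x_p = np.array([0.0, 0.5, 1.0, 1.5])   # x_{1p} .. x_{mp}
b_p = np.array([1.0, 1.0])             # beta_p = (beta_{1p}, beta_{2p})
y_t = np.array([1.0, 1.6, 2.8, 4.4])   # the constants y~_i
dx  = np.full_like(x_p, 0.01)          # Delta x_i = (x_i)_{p+1} - (x_i)_p
db  = np.array([0.05, -0.02])          # Delta beta_k

L = sum((f(xi, b_p) - yi + df_dx(xi, b_p) * dxi + df_dbeta(xi, b_p) @ db) ** 2
        for xi, yi, dxi in zip(x_p, y_t, dx))
print(L)

# The next iterate's values would then be
x_next = x_p + dx                      # vec x_{p+1} = vec x_p + Delta vec x
b_next = b_p + db                      # vec beta_{p+1} = vec beta_p + Delta vec beta
[/CODE]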
 