How can hyperreal numbers make infinitesimals logically sound in calculus?

bhobba · May 9, 2023

When I learned calculus, the intuitive idea of an infinitesimal was used. These are numbers so small (say 1/trillion to the power of a trillion) that, for all practical purposes, they can be taken as zero but are not. That way, when defining the derivative, you do not run into 0/0, but when required, you can neglect them as being zero for all practical purposes.

This is fine for applied mathematicians, physicists, actuaries, etc., who just want it as a tool to use in their work. But mathematicians, while conceding it is OK to start that way, eventually need to replace the handwaving with something logically sound. In calculus, that is sometimes called doing your ‘epsilonics’. This is code for studying what is called real analysis:
http://ramanujan.math.trinity.edu/wtrench/texts/TRENCH_REAL_ANALYSIS.PDF

I posted the above link in case the reader does not know real analysis (since this is a beginner-level thread, I don't expect them to) and wants to see what it is about. Just peek at it - I will not be using it. Any analysis ideas I will explicitly state. Instead, I will be making the idea of infinitesimal logically sound. About 1960, mathematicians (notably Abraham Robinson) did something nifty. They created hyperreal numbers, which have real numbers plus actual infinitesimals. These are numbers y with a very strange property: if x is any positive real number, then -x<y<x. Normally 0 is the only number with that property - but in the hyperreals, there are actual positive numbers, not equal to zero, that are less than any positive real number. That way, the infinitesimal approach can be justified without logical issues. It's also more in line with how many are likely to do calculus in practice. Even though I know real analysis, I hardly ever use it - instead I use infinitesimals. After reading this, you can continue doing it knowing it is logically sound. I could give some links, and it would be advisable to read some texts or lecture notes later, but I will instead explain hyperreals in this post. Also, many books introduce, IMHO, unnecessary ideas, such as ultrafilters, making understanding them more complex than necessary.

Since the reader is not expected to have done real analysis, probably nobody has defined the real numbers precisely for them. For mathematicians, that is a no-no; you run into the problem of talking about something that may not exist. There are several ways of defining them. I will use sequences because that is what is used for hyperreals. However, we need a special kind of sequence called a Cauchy sequence.

A Cauchy sequence is simply a sequence of numbers Xn such that for any e > 0, an N can be found such that if m and n are greater than N, then |Xm-Xn| < e. It is an idea from real analysis, but it is all we need. Intuitively, the terms of the sequence eventually get closer and closer together with no constraint on how close they get. You would likely say for n very large, the terms are so close to each other that they are, for all practical purposes, equal, i.e. the sequence converges to a number. And that is the general idea - except we are careful in the definition and don't have to specify that number because we are using it to define the real number it converges to.
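To make the definition concrete, here is a minimal Python sketch (my own illustration, not part of any text). It builds a sequence of rationals heading towards sqrt(2) by Newton's method and measures how far apart the terms beyond index N can be. The names newton_sqrt2 and max_gap_after are hypothetical helpers, and a finite sample can only illustrate the Cauchy property, not prove it.

```python
from fractions import Fraction

def newton_sqrt2(k):
    """Return the first k rational Newton iterates for sqrt(2), starting from 1."""
    x = Fraction(1)
    terms = []
    for _ in range(k):
        terms.append(x)
        x = (x + 2 / x) / 2  # each iterate is again an exact rational
    return terms

def max_gap_after(terms, N):
    """Largest |Xm - Xn| over all sampled indices m, n >= N."""
    tail = terms[N:]
    return max(tail) - min(tail)

terms = newton_sqrt2(8)
for N in range(6):
    # The gap shrinks rapidly as N grows, as the Cauchy condition demands.
    print(N, float(max_gap_after(terms, N)))
```

Every term is rational, yet no rational number is the limit. That is exactly why the sequence itself is used to define the real number.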

The reals are all the Cauchy sequences of rational numbers.

Two reals, A and B, are equal if An - Bn converges to zero. This means in analysis lingo for any e > 0 an N can be found such that for any n > N |An - Bn| < e.

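We define A < B if A and B are not equal and an N can be found such that for any m > N, Am < Bm. Similarly, for A > B. (The requirement that A and B are not equal matters: 0 < 1/n for every n, yet the sequence 1/n is equal to 0 as a real.)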

We can easily define the usual operations of addition, multiplication, etc. A + B = An + Bn. A*B = An*Bn. Similarly, for subtraction. But not division. 1/Xn may not be Cauchy. We need to ensure 1/Xn is Cauchy first, which I show below for those interested provided the sequence does not converge to 0. Hence if Bn does not converge to zero division is defined by An/Bn.

The hyperreals are constructed similarly to the reals but use any sequence of reals, with a different definition of equality, > and <.

The hyperreals are all the sequences of real numbers.

If F(X) is a function defined on the reals, then that can easily be extended to the hyperreals by F(X) = F(Xn).

A + B = An + Bn. A*B = An*Bn. Similar definitions for subtraction and division without worrying about the division issue in constructing reals.

Here is the other difference from the construction of the reals. Two hyperreals, A and B, are equal if An = Bn except for a finite number of terms.

We define A < B as Am < Bm except for a finite number of terms. Similarly, for A > B.

If X is a real number, then the sequence Xn = X is the hyperreal of the real number X. Any sequence equal to X is obviously also the real number X.

Now we can show that the hyperreals contain infinitesimals. Let X be any positive real number. Let b be the hyperreal bn = 1/n. Then regardless of what value X is, we can find an N such that 1/n < X for any n > N. Hence, by the definition of < in the hyperreals, b<X for any positive real number, hence b is infinitesimal.
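The "we can find an N" step can be made explicit. The sketch below is only an illustration with a hypothetical helper name, threshold_index. For a positive rational X it returns the smallest N such that 1/n < X for every n > N.

```python
from fractions import Fraction
import math

def threshold_index(X):
    """Smallest N such that 1/n < X for every integer n > N (X a positive rational)."""
    # 1/n < X is the same as n > 1/X, so N = floor(1/X) is the smallest valid choice.
    return math.floor(1 / X)

for X in (Fraction(5), Fraction(3, 7), Fraction(1, 1000)):
    N = threshold_index(X)
    print(X, N, Fraction(1, N + 1) < X)
```

However small X is chosen, only finitely many terms of 1/n fail to be below it, which is what b < X means under the definition of < used here.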

This implies some rather strange things. We have for the reals .9999999999999…… = 1. But what about the hyperreals? .9999999….. is the sequence .9 .99 .999 ………. But every term is less than 1. Thus .99999999….. < 1. However, 1 - .99999999999...... is the sequence = .1 .01 .001 ……. = a1 a2 … an …. Hence for any positive real number X, we can find N such that for n > N then an < X. Hence .9999999…. differs infinitesimally from 1.
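Here is a small exact-arithmetic sketch of that sequence in Python, as an illustration only; nines is a made-up helper name. It confirms that every term is strictly below 1 and that the gap to 1 is exactly 1/10^n.

```python
from fractions import Fraction

def nines(n):
    """n-th term of the sequence 0.9, 0.99, 0.999, ... as an exact fraction."""
    return 1 - Fraction(1, 10**n)

# The gaps 1 - 0.9, 1 - 0.99, ... are 1/10, 1/100, ...
gaps = [1 - nines(n) for n in range(1, 8)]
print(gaps)

# Every sampled term is strictly less than 1.
print(all(nines(n) < 1 for n in range(1, 50)))
```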

Also, we have infinitesimals smaller than other infinitesimals, eg 1/n^2 < 1/n, except when n = 1.

Hyperreals also contain infinite numbers larger than any real number. Let A be the sequence An = n. If X is any real number, there is an N such that for all n > N, n > X. Hence A > X for every real X. Again, we have infinitely large numbers greater than other infinitely large numbers because, except for n = 1, n^2 > n.

If a is a nonzero infinitesimal, 1/a is infinitely large. This follows from a/a = (1/a)*a = 1. If 1/a were infinitesimal, then multiplying it by the infinitesimal a would give another infinitesimal, not 1. If 1/a were a real number, multiplying it by an infinitesimal would still give an infinitesimal, not 1. This means 1/a must be infinite. And conversely, if a is infinite, 1/a is infinitesimal.

Let's see how it is used in calculus. The derivative is simply (f(x + dx) - f(x))/dx - dx infinitesimal, nothing more is needed except to say we can, of course, neglect the infinitesimal part of the answer when working in the reals which would normally be done. Why? If a is infinitesimal and x is any positive real number then -x<a<x. But in the reals, only one number has that property - 0.
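A worked instance may help. The Python sketch below is illustrative only, with my own choice of f(x) = x^3, x = 2 and dx the sequence 1/n. It computes the difference quotient term by term in exact arithmetic. Each term comes out as 12 + 6/n + 1/n^2: the real part 12 = 3x^2 plus an infinitesimal part that is neglected.

```python
from fractions import Fraction

def difference_quotient(f, x, dx):
    """(f(x + dx) - f(x)) / dx for a nonzero dx."""
    return (f(x + dx) - f(x)) / dx

def cube(t):
    return t ** 3

x = Fraction(2)
# dx is represented by the sequence 1, 1/2, 1/3, ...
quotients = [difference_quotient(cube, x, Fraction(1, n)) for n in range(1, 6)]
for n, q in enumerate(quotients, start=1):
    # q - 12 is the leftover infinitesimal part 6/n + 1/n^2
    print(n, q, q - 12)
```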

For the integral, let A be the area under f(x) from a to b. A = sigma (from a to b) f(x)*delta(x) + e(delta(x)) with e an error term that becomes 0 when delta(x) is zero. Of course, delta(x) can't be zero, however, it can be infinitesimally close to zero in which case the error should be infinitesimally close to zero. A = sigma (from a to b) f(x)*dx + e(dx) where dx is the infinitesimal (delta(x), delta(x)/2, delta(x)/3 .........). Neglecting infinitesimals, we have A = integral (a to b) f(x)dx.
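The same idea can be checked on a simple case. This is a sketch under my own choices, not a general method: f(x) = x^2 on [0, 1], right-endpoint sums, and riemann_sum as a hypothetical helper. With n strips of width 1/n, the sum is exactly 1/3 + 1/(2n) + 1/(6n^2), so the error term is the sequence 1/(2n) + 1/(6n^2), which is infinitesimal.

```python
from fractions import Fraction

def riemann_sum(f, a, b, n):
    """Right-endpoint Riemann sum of f over [a, b] with n equal strips (a, b integers)."""
    dx = Fraction(b - a, n)
    return sum(f(a + k * dx) * dx for k in range(1, n + 1))

def square(t):
    return t * t

for n in (1, 10, 100, 1000):
    s = riemann_sum(square, 0, 1, n)
    # s - 1/3 is the error term for this n
    print(n, s, s - Fraction(1, 3))
```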

The reader may find it instructive and fun to go through a usual handwavy infinitesimal calculus treatment and apply the hyperreals to it.

mathwonk · May 9, 2023

Thank you. This raises some interesting points. One question: Is it possible when you say "Two reals, A and B, are equal if An - Bn is Cauchy. Intuitively this means (An - Bn) converges to 0, i.e. they converge to the same number." that you really meant to say explicitly that An-Bn converges to zero, and not just that it is Cauchy. Apparently An-Bn is always Cauchy when An and Bn are, but it need not converge to zero. I.e. An-Bn being always Cauchy is needed for your definition of subtraction.

I.e. perhaps you meant: "Two reals, A and B, are equal iff (An - Bn) converges to 0, i.e. they converge to the same number." ?

fresh_42 · May 9, 2023

The real numbers are representatives of the equivalence classes of rational Cauchy sequences modulo zero-sequences.

bhobba · May 9, 2023

mathwonk said:

One question: Is it possible when you say "Two reals, A and B, are equal if An - Bn is Cauchy. Intuitively this means (An - Bn) converges to 0, i.e. they converge to the same number." that you really meant to say explicitly that An-Bn converges to zero, and not just that it is Cauchy. Apparently An-Bn is always Cauchy when An and Bn are, but it need not converge to zero. I.e. An-Bn being always Cauchy is needed for your definition of subtraction.

I.e. perhaps you meant: "Two reals, A and B, are equal iff (An - Bn) converges to 0, i.e. they converge to the same number." ?

I see your point. If An - Bn is Cauchy then it can converge to a number other than 0. I need to explicitly state it converges to zero. Will fix. Thanks for picking up the error.

Thanks
Bill

bhobba · May 9, 2023

fresh_42 said:

The real numbers are representatives of the equivalence classes of rational Cauchy sequences modulo zero-sequences.

Hi Fresh

Thanks for reading my post. It has been a while since I have done this analysis stuff but I thought defining equality was the same thing as defining an equivalence class. Or have I goofed? I simply did it that way rather than equivalence classes because it's similar to defining the hyperreals. Would it be better to say they are equivalent instead of equal? I thought it was the same thing or is there a difference?

I simply included it so the reader can see how the reals can be defined in a way similar to hyperreals.

Thanks
Bill

fresh_42 · May 9, 2023

bhobba said:

Hi Fresh

Thanks for reading my post. It has been a while since I have done this analysis stuff but I thought defining equality was the same thing as defining an equivalence class. Or have I goofed. I simply did it that way rather than equivalence classes because its similar to defining the hyperreals.

Thanks
Bill

I would have to look it up for the technical details. What I wrote was what I had in mind from Hewitt, Stromberg. I learned about the reals the hard way, by Dedekind cuts. I was surprised by how intuitive the approach via Cauchy sequences is. Hewitt, Stromberg define
$$
\mathfrak{N}=\{\text{zeros (null)}\}\subsetneq \mathfrak{C}=\{\text{Cauchy}\}\subsetneq\mathfrak{B}=\{\text{bounded}\}
$$
and introduce the completion field as ##\boldsymbol{\bar{F}}=\mathfrak{C}/\mathfrak{N}.## That's the idea. The details take a couple of pages.

bhobba · May 9, 2023

fresh_42 said:

I was surprised by how intuitive the approach via Cauchy sequences is. Hewitt, Stromberg define

My analysis prof did it by Cauchy sequences all those years ago - he mentioned he thought it the most intuitive. The hardest part was proving the LUB property. When a prof says that you can bet it will be on the final. It wasn't - it was on the mid-term. :DD

I refreshed my memory from here (and even then, I made a goof picked up by Mathwonk):
https://en.wikipedia.org/wiki/Construction_of_the_real_numbers

Tarski is the most interesting, if not the most transparent. The reason is it evades Godel - something many don't seem aware of:
https://plato.stanford.edu/entries/goedel-incompleteness/
'On the other hand, not all theories of arithmetic are incomplete. The theory of only addition of natural numbers but without multiplication (often called “Presburger arithmetic”), for example, is complete (and decidable) (Presburger 1929), as is the theory of multiplication of the positive integers (Skolem 1930). These theories are, though, very weak. But in any case, at least a theory which deals with both addition and multiplication is needed. More interestingly, the natural first-order theory of arithmetic of real numbers (with both addition and multiplication), the so-called theory of real closed fields (RCF) is both complete and decidable, as was shown by Tarski (1948); he also demonstrated that the first-order theory of Euclidean geometry is complete and decidable. Thus, one should keep in mind that there are some non-trivial and interesting theories to which Gödel’s theorems do not apply.'

I remember when I found this out, it was a conundrum. It took a while and some investigation to understand. It means the reals can't be used to define arithmetic, but arithmetic can be used to define the reals. What that means is still a conundrum I may understand better someday.

Another is defining the reals from the hyperrationals. That might be interesting to investigate.

Thanks
Bill

bhobba · May 9, 2023

One thing I need to do is prove 1/Xn is Cauchy if Xn does not converge to 0. I did not do it in the main write-up as it really is a bit of a tangent into deeper real analysis but will correct it for those interested. Suppose Xn is Cauchy and does not converge to zero. Then there is a positive number M and an N such that for all n > N, |Xn| > M. If not, the terms would come arbitrarily close to 0 infinitely often and, since the sequence is Cauchy, it would converge to 0, contradicting our assumption. This means 1/|Xn| < 1/M. Given any e > 0, since Xn is Cauchy we can also take N large enough that |Xn - Xm| < e*M^2 whenever m and n > N. Then

|1/Xn - 1/Xm| = |Xn - Xm|/|Xn*Xm| < (e*M^2)/M^2 = e.

Hence given any e, we can find an N such that if m,n > N then |1/Xn - 1/Xm| < e ie the reciprocal is Cauchy.

Hence 1/Xn is Cauchy, provided Xn does not converge to 0.
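For anyone who wants to see the estimate in action, here is a minimal Python check on one example of my own choosing: Xn = 2 + 1/n, so |Xn| > M with M = 2. It verifies on a finite range that the gap between reciprocals is at most the gap between the terms divided by M^2. The function names are illustrative.

```python
from fractions import Fraction

M = 2  # lower bound: |X(n)| > M for every n >= 1

def X(n):
    """Example Cauchy sequence of rationals that does not converge to 0."""
    return 2 + Fraction(1, n)

def reciprocal_gap_ok(m, n):
    """Check |1/Xn - 1/Xm| <= |Xn - Xm| / M^2 for one pair of indices."""
    lhs = abs(1 / X(n) - 1 / X(m))
    rhs = abs(X(n) - X(m)) / M**2
    return lhs <= rhs

print(all(reciprocal_gap_ok(m, n) for m in range(1, 31) for n in range(1, 31)))
```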

Thanks
Bill

haushofer · May 10, 2023

A related question: I never understood what those infinitesimals have to do with the basis vectors for (one)-forms. But if that's offtopic here, never mind :P

martinbn · May 10, 2023

haushofer said:

A related question: I never understood what those infinitesimals have to do with the basis vectors for (one)-forms. But if that's offtopic here, never mind :P

I think differentials of functions are supposed to be infinitesimals at least nonrigorously. Not sure if non-standard analysis deals with that in a logically consistent way.

mathwonk · May 10, 2023

Thanks bhobba! This sent me on a tour of a couple of sources, after which I could understand your post better than the first time. As a traditional mathematician, I did come away unconvinced of the advantages of the approach, but greatly appreciate learning more about it. Here is the result of my adventure.

Hyperreal numbers: (a very naive account, certainly including errors)

R* is an ordered field containing the reals R, but also containing infinitesimal (infinitely small) and infinite (infinitely large) “hyperreal” numbers.

Definition: A hyperreal number e is infinitesimal if |e| < r for all positive real numbers r,
is infinite if r < |e| for all positive real numbers r,
and is finite if |e| < r for some positive real number r.

Note we have to allow infinitely large numbers since we have a field and we have infinitely small numbers. Then one proves that infinitesimals are a subring of hyperreals, and an ideal in the ring of finite hyperreals. I.e. the sum, difference, and product of two infinitesimals is infinitesimal, and the product of an infinitesimal by a finite number is again infinitesimal. Also the sum of a finite and an infinite number is infinite, as is the sum and product of two infinite numbers (but not the difference), and the reciprocal of an infinite number is a (non zero) infinitesimal.

In particular, all infinitesimal numbers are finite, as also is the sum of an infinitesimal and a real. In fact those are exactly all the finite hyperreals. I.e.

Defn: Two hyperreal numbers are infinitely near iff they differ by an infinitesimal.

Theorem: Every finite hyperreal number x is infinitely near exactly one real number, namely the real least upper bound of all reals less than x.

Cor: The finite hyperreals form a local ring, with unique maximal ideal given by the infinitesimals, and the quotient field of this ring by its maximal ideal is the usual real field.
proof: The finite hyperreals are obtained by removing the infinitely large hyperreals from the hyperreal field, i.e. we remove exactly the inverses of non zero infinitesimals. Hence the set of infinitesimals is exactly the non units (non invertible elements) of the finite hyperreals. But this says the finite hyperreals are a ring in which the non units form an ideal. That ideal is thus the unique maximal ideal in the ring of finite hyperreals, which is thus a "local" ring. Modding out this local ring by the maximal ideal of infinitesimals gives exactly the real field. I.e. no non zero real is infinitesimal, so the quotient map is injective on the underlying real field. Since every finite hyperreal x is infinitely close to some real, every finite hyperreal has the same image in the quotient field as some real, so the map is also surjective. qed.

Defn: The unique real number infinitely near a finite hyperreal is called the “standard part” of that (possibly non standard) finite hyperreal number. I.e. the standard part of a finite hyperreal number x is its image in the quotient ring after modding out the finite hyperreals by the maximal ideal of infinitesimals.

It follows that taking the standard part is a ring homomorphism, i.e. preserves sums and products.

So the finite hyperreal numbers consist of all numbers of form r+e, where r is real and e is infinitesimal, and the infinitesimals are precisely those hyperreals that are infinitely near zero.

Now let me pause for a remark that may orient you within this new world of strange objects. Namely, when I looked at the actual construction of the hyperreals, it seems that a hyperreal is just a sequence of reals, and it is infinitesimal if and only if the sequence converges to zero. So basically an infinitesimal is just a sequence of ordinary reals that converges to zero. There is however an equivalence relation but a pretty weak one.

I.e. we know how to construct the reals from the rationals by taking equivalence classes of Cauchy sequences of rationals: two Cauchy sequences of rationals are equivalent iff their difference converges to zero. Then a real number is just an equivalence class of Cauchy sequences of rationals i.e. the real numbers are the quotient of the ring of Cauchy sequences, modulo the maximal ideal of “null sequences”.

To construct hyperreals, we start from all sequences of reals, restrict to a certain subset of all such sequences, and then define an equivalence relation. The whole construction is given by some exotic notion called an ultrafilter, but it seems to equate two sequences essentially if they differ only in a finite number of terms. So two sequences are equivalent iff they are eventually the same. [This is one key place where my naive version of this topic is likely to introduce errors, but hopefully it still contains the essence of the concept.]

This is one reason we need to use sequences of reals and not just rationals. I.e. we need the reals to be a subfield of hyperreals, and a sequence representing a real number must not only converge to it but eventually equal it, so we need to allow sequences of reals. Thus a hyperreal represents a real number essentially if it is eventually equal to that real number. Note that by making the equivalence relation weaker, we get more equivalence classes, i.e. more hyperreal than real numbers.

There are various ways to choose the data that give a construction of hyperreals, e.g. the precise equivalence relation, but it is apparently a theorem that a sequence of reals represents an infinitesimal in all of them, if and only if it converges to zero, and that a sequence represents a finite hyperreal with standard part equal to r, if and only if the sequence converges to r, and a sequence represents a (positive) infinite hyperreal if and only if it diverges to +infinity.

[From my naive discussion, it seems as if I need sequences restricted to ones that either converge or diverge to infinity or minus infinity, and not ones that oscillate. But I have not understood the abstract construction via ultrafilters; i.e. my eyes glazed over at a certain point. It is also true that one can choose the ultrafilter data so that a sequence is infinitely near r iff it has a subsequence converging to r.]

Thus it seems to me that a hyperreal number is essentially just a sum r + en, where r is a real number and en is a sequence of reals that converges to zero.
In particular, the proofs of the properties of infinitesimals, using the definitions above, look exactly like the usual epsilon - delta proofs of convergence statements for sequences.

The “standard part” of the hyperreal number r + en is just the real number r, which is just the limit of the sequence r+en, so taking the “standard part” corresponds exactly to taking the limit.

Now let us compare a non standard argument, to a standard limit argument.

Definition: A function f is continuous at r iff whenever x is infinitely close to r, then f(x) is infinitely close to f(r).

This then is the same as saying that if xn converges to r, then f(xn) converges to f(r).

E.g. to prove f(x) = x^2 is continuous at r, the non standard argument is that if x = r+e is infinitely close to r, then (r+e)^2 = r^2 + 2er + e^2. Since the product of 2r and the infinitesimal e is again infinitesimal, as is the product e^2, this is infinitely close to r^2.

The standard argument is that if xn = r + en converges to r, then (xn)^2 = (r+en)^2 = r^2 + 2r.en + (en)^2 converges to r^2, since both 2r.en and (en)^2 converge to zero.

For differentiation, it is apparently not quite correct to say that the derivative is equal to [f(x+e)-f(x)]/e, where e is an infinitesimal; rather one must show that this expression is always a finite hyperreal for every infinitesimal e, and always has the same standard part, and then one takes as derivative, the common “standard part” of this expression. I.e. we do not recover the language of Leibniz, where the derivative is equal to the difference quotient with an infinitesimal in the denominator, rather one takes the “standard part” of that expression.

Ah yes, bhobba said that in passing: “Let's see how it is used in calculus. The derivative is simply (f(x + dx) - f(x))/dx - dx infinitesimal, nothing more is needed except to say we can, of course, neglect the infinitesimal part of the answer when working in the reals which would normally be done. "

E.g. the derivative of x^2 is computed in the hyperreals by looking at [(x+e)^2 - x^2]/e = 2x +e, whose standard part is then 2x, since e is infinitesimal.

In standard calculus, one has the remarkably similar calculation that if en—>0, then
[(x+en)^2 - x^2]/en = (2x + en) —>2x.

I myself don’t see much difference in these arguments, and in particular no significant advantage to the non standard version. In fact, without the actual construction of the hyperreals, which the elementary calculus books using it seem to omit or finesse, there is to me a great loss of clarity in the non standard version, due to fuzzy concepts. I.e. the book I saw by Keisler, “Elementary Calculus”, uses axioms rather than a construction for hyperreals, and the axioms were quite difficult for me to make precise sense of. In particular I could not understand the “extension” principle, or “function” axiom, by which every expression involving reals was supposed to extend to a corresponding hyperreal version. E.g. I could not see why every function defined on the reals yields a unique function naturally defined for all hyperreals. This seems to be a place where the key notion may be a logical discussion of how functions are defined by “first order sentences”, a concept that Keisler thinks better to avoid. The result for me was just confusion. 

AHA! bhobba clarifies this by saying: “If F(X) is a function defined on the reals, then that can easily be extended to the hyperreals by F(X) = F(Xn).”

Here are Keisler’s axioms (not the ones in his “Elementary Calculus”, but those in his “Foundations of Infinitesimal Calculus”).

“The following axioms describe a hyperreal number system as a triple (∗, R, R∗), where R is called the field of real numbers, R∗ the field of hyperreal numbers, and ∗ the natural extension mapping.

Axiom A
R is a complete ordered field.

Axiom B
R∗ is an ordered field extension of R.

Axiom C
R∗ has a positive infinitesimal.

Axiom D (Function Axiom)
For each real function f of n variables there is a corresponding hyperreal function f∗ of n variables, called the natural extension of f. The field operations of R∗ are the natural extensions of the field operations of R.

By a hyperreal solution of a system of formulas S with the variables x1,... ,xn we mean an n-tuple (c1,... ,cn) of hyperreal numbers such that all the formulas in S are true when each function is replaced by its natural extension and each xi is replaced by ci.

Axiom E (Transfer Axiom)
Given two systems of formulas S,T with the same variables, if every real solution of S is a solution of T, then every hyperreal solution of S is a solution of T.”
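
A small worked instance may make the Transfer Axiom more concrete. This is my own illustration, assuming a "formula" here means an equation or inequality between terms; it is not taken from Keisler's text. Take the two systems in the variables x, y:
$$
S=\{\,y = x\cdot x\,\},\qquad T=\{\,y \ge 0\,\}.
$$
Every real solution (x, y) of S is a solution of T, because real squares are nonnegative. Transfer then says every hyperreal solution of S is a solution of T, i.e. the square of any hyperreal number, infinitesimal or infinite included, is also nonnegative.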

This monograph is available free online at
https://people.math.wisc.edu/~hkeisler/foundations.pdf

Please take a look, and do not take my ignorant version as gospel, but I have done my best to make some sense of it for myself. And now that I have worked on it, bhobba's post is even more clear and useful. Thank you bhobba! Thanks to you I now have some idea of this previously mysterious topic.

bhobba · May 10, 2023

haushofer said:

A related question: I never understood what those infinitesimals have to do with the basis vectors for (one)-forms. But if that's offtopic here, never mind :P

Not off-topic. But I have no idea. There is a book I may get called Applied Non-Standard Analysis which has a more sophisticated treatment than I used. It may answer that. I know it does something 'weird' and develops infinite dimensional spaces from finite linear algebra.

This post was just a surface treatment for students like me who were worried about the handwavey treatment of usual introductory calculus generally done at HS here in Aus. It is fixed in Real Analysis courses, but I just wanted to give a more straightforward way of doing it for those that have not done real analysis yet. For example, I often recommend Boas to people after a beginner's course in calculus, delaying real analysis. You learn all the math you need in the physical sciences and can move on with science studies and learn real analysis a bit later. It is not how I did it. Calculus was required for admission. If you had not done it, there was a subject you had to pass before full admission, but everyone I knew had done it in HS. We just had a review subject to ensure everyone was up to speed, then real analysis, which nearly everyone except nuts like me hated. We had one person doing it for the third time - in the end, he got a pass conceded.

Non-Standard analysis has many more applications than I posted.

Thanks
Bill

bhobba · May 10, 2023

martinbn said:

I think differentials of functions are supposed to be infinitesimals at least nonrigorously. Not sure if non-standard analysis deals with that in a logically consistent way.

They are infinitesimals. I think there are theorems that show it is as consistent as rigorous analysis. My favourite use is in the way I explain complex numbers. Consider the real line as the x-axis in the plane. We know -1 is an operator that rotates a real number by pi. I generalise that to F(x) being an operator that rotates a real number by an angle x in the plane. Of course F(x+y) = F(x)F(y). We define F(pi/2) as i and note i^2 = -1 ie a rotation by pi. i = sqrt(-1) hence i is the imaginary number i. But here, it is not imaginary at all - but is a simple process of rotation. This, of course, leads to complex numbers. A point (x,y) = (x,0) + (0,y). Since (x,0) and (y,0) lie on the x axis which can be taken as the real line then (x,0) = x and (y,0) = y so (x,y) = x + (0,y) = x + i*(y,0) = x + i*y and the y axis becomes the complex line. Of course, in this form points in the plane become complex numbers.

F'(x) = (F(x + dx) - F(x))/dx = ((F(dx) - 1)/dx)*F(x) = i'*F(x) where the operator i' = (F(dx) - 1)/dx. F(delta(x)) is approximately 1 + F(pi/2)*delta(x), as can be seen by rotating a line through a small angle delta(x). F(delta(x)) = 1 + i*delta(x) + e(delta(x)) where e(delta(x)) is an error term that when delta(x) is infinitesimally close to 0 can be taken as 0. So F(dx) = 1 + i*dx. i' = (1 + i*dx - 1)/dx = i. Hence F'(x) = dF/dx = i*F(x). Solving, we have dF/F = i*dx. Integrating both sides, ln(F) = i*x + C. Exponentiating, F(x) = C'*e^(i*x) with C' = e^C. At x=0, F(0) = 1 (no rotation), so C' = 1 and F(x) = e^(i*x). This is formally based on the properties of ln(x) for reals so is just suggestive where F(x) is a rotation operator. And as everyone knows ln(z) where z is complex is not single-valued. But it is reasonable justification to define e^(i*x) as F(x). Certainly (e^(i*x))' = i*e^(i*x) and e^(i*(x+y)) = e^(i*x)*e^(i*y) as would be expected.

Now things move quickly. F(-x)(cos(x) + i*sin(x)) = 1, since cos(x) + i*sin(x) is the point on the unit circle at angle x and F(-x) rotates it back onto 1. Applying F(x) to both sides, F(x) = cos(x) + i*sin(x) = e^(i*x), which, of course, is Euler's famous relation. Using it, we can easily find the derivatives of the trig functions and cos(x+y) etc.
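As a quick numerical sanity check of the steps above, here is a minimal Python sketch. It models the rotation operator F(x) by the unit complex number cos(x) + i*sin(x), which is the identification the argument arrives at, and uses a small real dx as a stand-in for an infinitesimal. The helper names F and i_prime are mine, chosen for illustration.

```python
import cmath
import math

def F(x):
    """Rotation by angle x, modeled as the unit complex number cos x + i sin x."""
    return complex(math.cos(x), math.sin(x))

def i_prime(dx):
    """The quotient (F(dx) - 1)/dx for a small real dx."""
    return (F(dx) - 1) / dx

# A small real dx stands in for an infinitesimal; the quotient is close to i.
print(i_prime(1e-6))

# Rotations compose by adding angles: F(x + y) = F(x) * F(y).
x, y = 0.3, 0.4
print(abs(F(x + y) - F(x) * F(y)))

# F agrees with the complex exponential e^(i*x) up to rounding.
print(abs(F(1.2) - cmath.exp(1.2j)))

# The real part of F(x)*F(y) gives cos(x + y) = cos x cos y - sin x sin y.
print(math.cos(x + y), math.cos(x) * math.cos(y) - math.sin(x) * math.sin(y))
```

This is floating-point evidence only. The real dx never becomes infinitesimal, so the first printed value is close to i rather than equal to it.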

See the key trick used when going to infinitesimals. If e(x) is an error function that is zero when x=0, we assume it is also zero when x is infinitesimally close to 0. Formally since e(x) is real, e(dx) is also taken as real even though if dx is the sequence 1/n, then the sequence e(1/n) would be infinitesimal. So by definition, e(x) is the real part when working in the hyperreals. Naturally, for x real, it is real, so it makes no difference to the approximation formula but allows it to be 0 when x is infinitesimal.

Thanks
Bill

bhobba · May 10, 2023

mathwonk said:

As a traditional mathematician, I did come away unconvinced of the advantages of the approach, but greatly appreciate learning more about it. Here is the result of my adventure.

Thanks for your very nice reply - much appreciated. And again, for picking up the error I initially made about equality. I had read the link you gave before, but it is not my favourite because it just lists properties rather than actually constructing the hyperreals. The book I am thinking of getting for a more advanced treatment is Applied Nonstandard Analysis:

https://www.amazon.com.au/dp/0486442292/

What the heck - it's cheap enough, so I just bought it for my ever-growing library - this time in paper form - I tend to get e-books frequently these days but find books I learn from are better in paper form.

Your view as a traditional mathematician is by far the most common. The converts to nonstandard analysis tend to be those into mathematical logic and foundations, such as Tarski and Godel (when they were alive). Godel said when he was introduced to it in a seminar by Robinson:

'I would like to point out a fact that was not explicitly mentioned by Professor Robinson but seems quite important to me; namely that non-standard analysis frequently simplifies substantially the proofs, not only of elementary theorems, but also of deep results. This is true, e.g., also for the proof of the existence of invariant subspaces for compact operators, disregarding the improvement of the result, and it is true to an even higher degree in other cases. This state of affairs should prevent a rather common misinterpretation of non-standard analysis, namely the idea that it is some kind of extravagance or fad of mathematical logicians. Nothing could be further from the truth. Rather there are good reasons to believe that non-standard analysis, in some version or other, will be the analysis of the future.'

I don't like to disagree with someone of Godel's stature, but I think history has shown it has not caught on like wildfire - most mathematicians stick to traditional analysis. A few are 'converts' and have solved problems that baffled the usual methods. So it is good they are around.

I think its value is pedagogical. I would not teach beginner calculus using non-standard analysis but just using the simple idea of infinitesimals being numbers so small that, for all practical purposes, they can be neglected when you want to but don't run into the dreaded 0/0. In a final chapter - or appendix - I would give a bit more detail. Gardner does this in Calculus Made Easy. I would, however, include the detail in my first post - I have never liked 'it can be shown'. Gardner just states the results. I much prefer at least heuristic derivations.

Thanks
Bill

mathwonk · May 11, 2023

Dear Bill: I could be wrong, but it seems to me that Keisler does construct the hyperreals on pages 23-31 of that linked monograph, chapter 1G, using Zorn's lemma to prove existence of free ultrafilters. Here is what I gathered from trying to read it.

Ok, it is a little more complicated than I thought. I.e. at least in the version described by Keisler, we do take all real sequences, as bhobba said, but the equivalence relation is more complicated, (since not all real sequences converge or diverge to infinity). It is not enough to equate only sequences that eventually are the same. e.g. the sequences (1, 1/2, 1/3, 1/4, 1/5, 1/6, 1/7, 1/8, 1/9, ….) and (1/2, 1/2, 1/6, 1/4, 1/10, 1/6, 1/14, 1/8, 1/18,…) both converge to zero, hence are infinitesimal hyperreal numbers. But their quotient sequence is (2, 1, 2, 1, 2, 1, 2, 1,……) hence does not converge to anything. So what is it equivalent to?
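For anyone who wants to check the quotient computation, here is a minimal sketch in Python. The closed forms used for the two sequences (1/n for the first; 1/(2n) at odd n and 1/n at even n for the second) are my reading of the listed terms, not something taken from Keisler.

```python
from fractions import Fraction

def a(n):
    # first sequence: 1, 1/2, 1/3, 1/4, ...
    return Fraction(1, n)

def b(n):
    # second sequence: 1/2, 1/2, 1/6, 1/4, 1/10, ...
    # (assumed closed form: 1/(2n) at odd n, 1/n at even n)
    return Fraction(1, 2 * n) if n % 2 == 1 else Fraction(1, n)

first_b = [b(n) for n in range(1, 10)]
quotient = [a(n) / b(n) for n in range(1, 10)]
print(quotient)
```

Both sequences tend to zero, yet the termwise quotient keeps alternating, which is why "eventually equal" alone cannot decide what the quotient represents.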

It turns out there is a condition in the equivalence relation that equates this sequence with a convergent sequence. However, it can be equated to either (1,1,1,1,1…) or to (2,2,2,2,2…..) depending on your choice of the expanded equivalence relation.

I.e. the equivalence relation depends on a choice of a family U of infinite subsets of the positive integers, and two sequences are equivalent iff the set of indices where they agree belongs to U. Now it is true that U contains all “cofinite” subsets of the positive integers, but it also contains far more. In fact, for every infinite subset S whose complement is also infinite, either S or its complement must belong to U, but not both.

Hence in my example quotient sequence above, where the odd indexed entries are all 2 and the even indexed entries are all 1, either the set of all odd positive integers or the set of all even positive integers must belong to U. Hence that sequence must be equated either to 2 or to 1, (but not to both). i.e. there is a system of hyperreals in which that sequence equals 1, and another in which it equals 2. (I have not read chapter 15, where Keisler says he uses the axiom of "saturation" to define a single distinguished hyperreal field.)
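A rough illustration of the "agreement set" idea, in Python. This only looks at a finite window of indices (the cutoff N = 20 and the helper name agreement_set are my own choices), so it cannot exhibit an ultrafilter; it just shows which index sets U would have to choose between.

```python
def agreement_set(x, y, n_max):
    """Indices 1..n_max at which the sequences x and y take the same value."""
    return {n for n in range(1, n_max + 1) if x(n) == y(n)}

def q(n):
    # the quotient sequence (2, 1, 2, 1, ...): 2 at odd indices, 1 at even indices
    return 2 if n % 2 == 1 else 1

def one(n):
    return 1

def two(n):
    return 2

N = 20  # finite window, purely for illustration
agrees_with_one = agreement_set(q, one, N)
agrees_with_two = agreement_set(q, two, N)
print(sorted(agrees_with_one))
print(sorted(agrees_with_two))
```

The two agreement sets are the evens and the odds, which are complementary. Exactly one of them lies in U, and that choice decides whether the sequence is equated to 1 or to 2.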

I.e. a family of subsets of positive integers that is closed under taking larger subsets, and forming finite intersections, is called a filter. If it also has the property that it contains all cofinite subsets, and whenever both S and not-S are infinite it contains exactly one of them, it is called a free ultrafilter. One uses Zorn’s lemma to prove the existence of these and works to prove the resulting object is an ordered field. (The collection of all cofinite subsets is a filter but not an ultrafilter.)

Now note that every real sequence is either bounded, hence contains a convergent subsequence, or unbounded, hence contains a subsequence that diverges to infinity or to minus infinity. The ultrafilter will apparently single out exactly one of those convergent subsequences! Or rather, one family of convergent subsequences, all with the same limit. Note e.g. that if U contains the indices of one convergent subsequence, by the definition of an ultrafilter, it cannot contain the complementary set of indices, hence cannot contain any infinite subset of that complement; hence it cannot contain the indices corresponding to another convergent subsequence with a different limit. So the ultrafilter may distinguish many convergent subsequences, but they must all have the same limit.

Thus an ultrafilter specifies a huge number of subsequences of a given sequence. Some of those may be convergent and some not, but all distinguished subsequences which do converge, must have the same limit. Presumably it is a (compactness type) theorem that at least one of the distinguished subsequences does converge (or diverge to ± infinity). Then that convergent subsequence, if its limit is finite, determines the "standard" (real) part of the hyperreal number defined by the original sequence.

Thus in practice one cannot tell from looking at a non convergent sequence, just what hyperreal, or real, number it represents. Since (1, 1/2, 2, 1, 1/3, 3, 1, 1/4, 4, 1, 1/5, 5, 1, 1/6, 6,….) is an admissible sequence, one cannot even tell whether the number represented is infinitely large, finite, or infinitesimal!

(One can see, however, that this sequence must represent either an infinitesimal, 1, or an infinitely large number. I.e. let A be the subset of indices congruent to 1 mod 3, let B be those congruent to 2, and C the subset congruent to 0. Then either the ultrafilter contains A, and this sequence represents 1, or if not, it must contain B ∪ C. Similarly it must either contain B, whence the sequence is infinitesimal, or not, whence it contains A ∪ C. But if it does not contain either A or B, it must contain both B ∪ C and A ∪ C, hence also their intersection, which is C, and the sequence represents an infinitely large positive number.)
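To see the three residue classes concretely, here is a small Python sketch. The closed form in s(n) is an assumption reconstructed from the listed terms (1, 1/2, 2, 1, 1/3, 3, ...).

```python
from fractions import Fraction

def s(n):
    # assumed closed form for (1, 1/2, 2, 1, 1/3, 3, 1, 1/4, 4, ...)
    k = (n - 1) // 3 + 1          # block number 1, 2, 3, ...
    r = n % 3
    if r == 1:
        return Fraction(1)        # class A: constant 1
    if r == 2:
        return Fraction(1, k + 1) # class B: 1/2, 1/3, 1/4, ... -> 0
    return Fraction(k + 1)        # class C: 2, 3, 4, ... -> infinity

terms = [s(n) for n in range(1, 16)]
A = [s(n) for n in range(1, 31) if n % 3 == 1]
B = [s(n) for n in range(1, 31) if n % 3 == 2]
C = [s(n) for n in range(1, 31) if n % 3 == 0]
print(A[:4], B[:4], C[:4])
```

Whichever of the index classes the ultrafilter contains, the sequence is equated with the behaviour along that class: constant 1, tending to 0, or growing without bound.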

However, this ambiguity exists only for non convergent sequences. I.e. every convergent sequence does represent (i.e. is infinitely near) its ordinary limit, in every choice of ultrafilter. So an ultrafilter is a way of making every sequence look either convergent, or divergent to (plus or minus) infinity. If you are worried about dividing by zero, as I was, note that every sequence with an infinite number of zeroes, and an infinite number of non zeroes, is either equated to a sequence of all zeroes, hence you don't have to divide by it, or to a sequence with all non zeroes, hence no problem!

Interesting stuff.

Thanks again!

mathwonk · May 12, 2023

Well this continues to be quite interesting. Since ultrafilters define an equivalence relation on sequences of reals that respects the laws of addition and multiplication, and the equivalence classes form a field, I should have noticed much sooner that they must be essentially given by a maximal ideal in the ring S = ∏R, the product of infinitely many copies of the reals R, the product being indexed by the natural numbers N.

Indeed, if F is a filter consisting of subsets of N, and if we define a corresponding subset J = J(F), of this ring S of sequences, by saying that x is in J precisely when the subset of indices where x=0 is a set in our filter F, then the filter properties make J an ideal, and J is a prime ideal precisely when F is an ultrafilter. Moreover all proper prime ideals of this ring S are maximal, so an ultrafilter F gives us a maximal ideal J of S, and hence a field S/J as quotient ring.

Now one maximal ideal in this ring is of form {0} x ∏R, product indexed from 2 to infinity, and the quotient field is just R, the first copy of the reals in the product, and we can do this for any one copy of R, thus getting a countable family of maximal ideals of S each with quotient R. But this first maximal ideal is defined by the ultrafilter F consisting of all subsets of N that contain the integer 1, and this is not a "free" ultrafilter. In particular this filter does not contain all cofinite subsets of N. The same is true for all examples of this simple type.
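As a sanity check on the "simple type" of ultrafilter, here is a brute-force sketch in Python on a finite index set. The set X = {0, 1, 2} is a toy stand-in for N (my assumption, chosen so everything can be enumerated). On a finite set there are no free ultrafilters at all, so the search should find only the ones consisting of all subsets containing a fixed point.

```python
from itertools import combinations

X = frozenset({0, 1, 2})  # toy index set standing in for N
subsets = [frozenset(c) for r in range(len(X) + 1)
           for c in combinations(sorted(X), r)]

def is_filter(F):
    # nonempty, omits the empty set, closed under supersets and intersections
    if not F or frozenset() in F:
        return False
    for S in F:
        if any(S <= T and T not in F for T in subsets):
            return False
        if any((S & T) not in F for T in F):
            return False
    return True

def is_ultrafilter(F):
    # a filter containing exactly one of S and its complement, for every S
    return is_filter(F) and all((S in F) != ((X - S) in F) for S in subsets)

# all 2^8 families of subsets of X
families = [frozenset(c) for r in range(len(subsets) + 1)
            for c in combinations(subsets, r)]
ultrafilters = [F for F in families if is_ultrafilter(F)]

# the "simple type": all subsets containing a fixed point k
principal = [frozenset(S for S in subsets if k in S) for k in sorted(X)]
print(len(ultrafilters))
```

The free ultrafilters only appear on an infinite index set, and there they cannot be written down explicitly, which is where Zorn's lemma comes in.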

If we take instead the filter consisting just of cofinite subsets of N, we get an ideal but not a maximal ideal. By Zorn's lemma however, this ideal is contained in a maximal ideal m, and we get a more exotic quotient field S/m = R* of the sort we have been discussing in posts above. This maximal ideal m in fact comes from an ultrafilter containing the cofinite subsets.

I.e. the correspondence between filters and ideals is reversible: if we take J to be any ideal of S, we can define a family F of subsets of N by saying a subset U of N, belongs to F if and only if there is an element x of the ideal J which has zeroes at precisely the entries defined by the subsequence U. Then F is a filter, and does not contain the empty set iff J is a proper ideal. Moreover, if J is a prime ideal then F is an ultrafilter. So from the construction above of a maximal ideal containing the ideal of sequences corresponding to cofinite subsets, we get a maximal ideal m corresponding to a free ultrafilter, and hence a construction of S/m = R* exactly as discussed in previous posts above.

This also can be used to prove that there is a very large family of "hyperreal" numbers R*, since they correspond to all exotic maximal ideals of S. In fact the family of all maximal ideals of S has a topology making the whole collection homeomorphic to the Stone-Cech compactification of N, i.e. of the natural numbers! The Wikipedia page on the Stone-Cech compactification of N discusses its construction as the family of all ultrafilters on N, and we have seen here that this family is equivalent to the family of all maximal ideals of S. So this elucidates for me, after over 50 years, Mumford's cryptic comment in his little Red Book of algebraic geometry, in the section "spec(R)" on affine schemes, example G, that spec(∏k) can be shown to be the Stone-Cech compactification of N, by people who know about "ultrafilters and other far out mysteries"!

[Here is a conjectural sketch of an argument based on information jogged in memory by the introduction to a paper of Jerison et al. on a theorem of Gelfand and Kolmogoroff. Namely, for a nice compact space X, like a closed bounded real interval, the space itself is recovered as the space of maximal ideals in its ring C(X) of continuous real valued functions. Now the ring ∏R consists of continuous real valued functions on N, and every such function extends uniquely to a continuous function on the Stone-Cech compactification X of N, hence ∏R = C(N) is actually isomorphic to the ring C(X) of continuous real valued functions on X, hence the maximal ideals in ∏R are the same as the maximal ideals in C(X), which (since X is compact) is the same space as X, the Stone-Cech compactification of N. I.e. spec(∏R) ≈ Stone-Cech compactification of N! In particular ultrafilters are not needed for this. For this to work you do need the theorem of Gelfand-Kolmogoroff that the maximal ideals of the ring of continuous functions on X are bijectively equivalent with the maximal ideals of the subring of bounded functions.]

wow! cool. Thanks again bhobba. You opened a door for me that was very long closed.
It is still a little surprising to me that all these quotient fields have a natural ordering, as I know very little about ordered fields. Maybe after all this time I will read those sections of Van der Waerden. I do not see, e.g., why the existence of infinitesimals follows obviously from the ring theory. Well if the field is ordered and larger than the reals, then those elements have to fit in somewhere, so the existence of the linear ordering seems basic.

.......Well it seems that any field extension L of an ordered field K, in which -1 is not a linear combination, with positive coefficients from K, of squares in L, has an ordering extending that of K, essentially taking (numbers equivalent to) non zero squares as positive. This seems to hold here.

mathwonk · May 13, 2023

Actually my concern as to just why these exotic ordered field extensions of the reals should happen to have the properties we want, of containing infinitesimals, is misplaced. I.e. ANY ordered field extension properly larger than the reals must have infinitesimals. All we need to know is that our new fields are not archimedean. But by the least upper bound property, the reals are a maximal archimedean ordered field. Any ordered (proper) field extension of them is thus non archimedean, and hence contains infinitely large elements, and then their inverses are infinitesimal. So the problem was just to find any ordered field which is a proper extension of the reals, and then it was guaranteed to be one with the infinitesimals we want. There are lots of these out there, e.g. presumably the one analogous to that given in Van der Waerden, namely the fraction field of the polynomial ring R[t], where a polynomial is positive iff its leading coefficient is so. Then t is infinitely large and 1/t is an infinitesimal. For some reason then, the hyperreals constructed above from real sequences are preferred by logicians. Of course a rational function has a Laurent expansion apparently displaying its infinitesimal, finite, and infinite parts as well as roughly the relative "sizes" of these parts: i.e. 1/t + 3 + t + 2t^2. But maybe the hyperreals above are complete? (in the Cauchy sense)? (no, apparently not.)
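The ordering on the fraction field of R[t] can be tried out directly. Below is a minimal Python sketch, assuming polynomials are stored as coefficient lists (lowest degree first) and a rational function as a (numerator, denominator) pair; all the helper names are my own, and rational coefficients stand in for real ones.

```python
from fractions import Fraction

def trim(p):
    # drop trailing zero coefficients so p[-1] is the leading coefficient
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def mul(p, q):
    if not p or not q:
        return []
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def sub(p, q):
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return trim([a - b for a, b in zip(p, q)])

def sign(num, den):
    # a polynomial is positive iff its leading coefficient is positive;
    # a quotient is positive iff numerator and denominator have the same sign
    num, den = trim(num), trim(den)
    if not num:
        return 0
    s = 1 if num[-1] > 0 else -1
    return s if den[-1] > 0 else -s

def less(f, g):
    # f < g iff g - f is positive
    (p1, q1), (p2, q2) = f, g
    return sign(sub(mul(p2, q1), mul(p1, q2)), mul(q1, q2)) > 0

def const(c):
    return ([Fraction(c)], [1])

zero = ([], [1])
t = ([0, 1], [1])        # the element t
inv_t = ([1], [0, 1])    # the element 1/t

print(all(less(const(n), t) for n in (1, 10, 10**6)))
print(all(less(inv_t, const(Fraction(1, n))) for n in (1, 10, 10**6)))
```

In this ordering t exceeds every constant tried and 1/t sits strictly between 0 and each 1/n, which is the non archimedean behaviour described above.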

Well I have just looked at the Wikipedia article on hyperreal numbers, and everything I had discovered for myself is laid out clearly there, and more. They point out e.g. that the usefulness of the hyperreals as constructed from sequences is that not only do they contain infinitesimals, but the "transfer principle" also holds. I have not understood that so well, but it seems that is key to relating properties of the reals to properties of the hyperreals. They had apparently also been introduced algebraically by the mathematician Hewitt, over a decade before the logician Robinson showed their usefulness for non standard analysis. Here is the link to the wiki article.
https://en.wikipedia.org/wiki/Hyperreal_number

The connection with the sort of standard (functional) analysis done by Hewitt is apparently that Hewitt looked at rings of continuous (real and complex valued) functions, and their maximal ideals and resulting quotient fields. Our example of the product ring ∏R is the special case of continuous real valued functions on the space N of natural numbers with the discrete topology. I.e. a sequence of reals {x(n)} is just a real valued function on the index set N. Hewitt apparently even coined the term "hyperreal" fields for the case where his resulting field properly contained the real numbers. Interestingly, and frustratingly, there is no way to actually construct an exotic maximal ideal in ∏R, nor an actual free ultrafilter, nor an actual hyperreal field. You just have to wave your hands and say the magic words "Zorn's lemma", i.e. axiom of choice.

Oh yes remark: it is also stated there that the hyperreal construction does not depend, up to isomorphism, on the choice of ultrafilter, i.e. of the choice of exotic maximal ideal in ∏R; i.e. if you assume the continuum hypothesis, all the hyperreal fields obtained are isomorphic.

SSequence · May 13, 2023

bhobba said:

More interestingly, the natural first-order theory of arithmetic of real numbers (with both addition and multiplication), the so-called theory of real closed fields (RCF) is both complete and decidable, as was shown by Tarski (1948); he also demonstrated that the first-order theory of Euclidean geometry is complete and decidable. Thus, one should keep in mind that there are some non-trivial and interesting theories to which Gödel’s theorems do not apply.'

I remember when I found this out, it was a conundrum. It took a while and some investigation to understand. It means the reals can't be used to define arithmetic, but arithmetic can be used to define the reals. What that means is still a conundrum I may understand better someday.

I don't know much either what the theory of reals entails actually (meaning the kind of questions it can pose). Here is a somewhat relevant thread that I stumbled upon many months back:
https://math.stackexchange.com/ques...in-the-first-order-theory-of-the-real-numbers

I do feel that it is interesting when a problem that looks (or is) quite difficult can be asked in a decidable theory.

bhobba · May 15, 2023

Hi Everyone.
The comments have been excellent. I have been working on an insight article that expands on what I wrote, and it is nearly finished. I hope it does justice to some of the beautiful comments. I managed to link it more to limits and to how it is related to the hyperrationals, which lack one important property that requires completeness. Namely, completeness is needed to decompose a hypernumber into an element of the set the hypernumbers are sequences of, plus an infinitesimal. The hyperrationals contain the reals but are only infinitesimally close to them. To get the number a hypernumber is infinitesimally close to, completeness is necessary.

However, the reals as Cauchy sequences are contained in the hyperrationals, which is how I construct the hyperreals. Also, I isolated the critical thing that is going on. If An and Bn converge to the same number, there is a very large N such that for n>N, |An-Bn| < e, where e is infinitesimal in the usual sense of it being so small it can be neglected. This leads to taking 'an N can be found such that An = Bn for n>N' as the definition of convergence, which is the definition of equality in the hyperreals. The sequences indeed converge in that case. But in other cases, no such N exists. However, an N exists such that, for n>N, An and Bn are infinitesimally close. This is the intuitive idea and carries over to rigorous detail. More in my insight article when finished.
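A small Python sketch of the distinction between "equal from some N on" and "within e from some N on". The sequences A, B, C, the tolerance 1/100 and the window of 1000 terms are illustrative assumptions, and a finite window can only suggest what happens in the tail, not prove it.

```python
from fractions import Fraction

def eventually_equal(x, y, start, stop):
    # finite-window check: do x and y agree at every index in [start, stop)?
    return all(x(n) == y(n) for n in range(start, stop))

def first_index_within(x, y, eps, n_max):
    # smallest N such that |x(n) - y(n)| < eps for every n with N <= n <= n_max
    # (None if even the last index in the window fails)
    N = None
    for n in range(n_max, 0, -1):
        if abs(x(n) - y(n)) < eps:
            N = n
        else:
            break
    return N

def A(n):
    return Fraction(1, n)                         # tends to 0, never equals 0

def B(n):
    return Fraction(0)                            # constant 0

def C(n):
    return Fraction(1, n) if n < 50 else Fraction(0)  # equals 0 from n = 50 on

print(eventually_equal(C, B, 50, 1000))
print(eventually_equal(A, B, 1, 1000))
print(first_index_within(A, B, Fraction(1, 100), 1000))
```

C and B agree from index 50 onward, so they give the same hypernumber. A and B never agree, yet for any fixed tolerance they are within it from some N on, which is the "infinitesimally close" case.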

SammyS · May 16, 2023

bhobba said:

Hi Everyone.
The comments have been excellent. I have been working on an insight article that expands on what I wrote, and it is nearly finished. I hope it does justice to some of the beautiful comments.
...

I'll be looking forward to it.

bhobba · May 21, 2023

SammyS said:

I'll be looking forward to it.

I will submit it tomorrow.

Thanks
Bill

bhobba · May 24, 2023

Submitted.

bhobba · Jun 2, 2023

Hi All

This has proven surprisingly addictive. I have discovered something very interesting. The bounded hyperrationals are nothing but all the rational Cauchy Sequences hence are just the reals. I am working on updating my article to incorporate this surprising result and its implications.

The trouble is it has grown beyond what I envisioned it to be.

Thanks
Bill

bhobba · Jun 3, 2023

Update done.

bhobba · Jun 4, 2023

As explained in the previous posts the insights article has grown beyond what I intended. To rectify this I have done a simplified version. There are now two versions - an advanced version and a simplified version. The advanced version goes deeper into real analysis and is best read after a book on Calculus. The simplified version can be read before an infinitesimal-based text like Full Frontal Calculus or Calculus Made Even Easier.
