Should I Become a Mathematician?

In summary, to become a mathematician, you should read books by the greatest mathematicians, try to solve as many problems as possible, and understand how proofs are made and what ideas are used over and over.
  • #106
i think you can prove it with only the intermediate value theorem. i.e. let f be a continuous function on [a,b] with derivative non zero on (a,b), and prove first that the image of f is an interval [c,d], and that f is strictly monotone from [a,b] to [c,d].


this will be easy for you. then use IVT to prove that f^(-1) is continuous from [c,d] to [a,b].


you can do this easily too, with some thought, and it may convince you that the things concealed from you are as easy as those shown you. the great thing is to begin to see that the subject is not a mystery, contained within book covers, but is open to all who use their "3rd eye", i.e. their reasoning powers.
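
for a concrete taste of the IVT at work, here is a small numeric sketch (my own illustration, not part of the exercise; the sample f is an arbitrary strictly increasing choice). bisection evaluates f^(-1) pointwise, using exactly the intermediate value property:

```python
# A sketch: for a continuous, strictly increasing f on [a, b], the IVT says
# f(x) = y has a solution for any y between f(a) and f(b), and bisection
# finds it -- i.e. it evaluates the inverse function f^(-1).

def inverse_by_bisection(f, a, b, y, tol=1e-12):
    """Assumes f continuous, strictly increasing, and f(a) <= y <= f(b)."""
    lo, hi = a, b
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < y:
            lo = mid          # by the IVT, a solution lies in [mid, hi]
        else:
            hi = mid          # a solution lies in [lo, mid]
    return (lo + hi) / 2

f = lambda x: x**3 + x        # f'(x) = 3x^2 + 1 > 0, so f is strictly monotone
x = inverse_by_bisection(f, 0.0, 2.0, 5.0)
print(x, f(x))                # f(x) is (numerically) 5.0
```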
 
  • #107
hint: draw a picture of the graph.
 
  • #108
to prove the only solutions of (D-a)(D-b)y = 0 are of form ce^(ax) + de^(bx), show that if L is linear and Lf = Lg = h, then L(f-g) = 0. then note that (D-a)(e^(bx)) = (b-a)e^(bx), hence (D-a)(e^(bx)/[b-a]) = e^(bx).

Thus (D-a)(D-b)(e^(bx)/[b-a]) = 0, so all solutions of (D-a)(D-b)y = 0 are of form y = ce^(ax) + de^(bx).


is this right? i have lots of problems with arithmetic.:biggrin:
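
for what it's worth, the arithmetic above checks out symbolically; a quick sympy sketch:

```python
# Symbolic check of the computation above (a sketch using sympy).
import sympy as sp

x, a, b, c, d = sp.symbols('x a b c d')
op = lambda k: (lambda g: sp.diff(g, x) - k*g)   # the operator (D - k)

y = sp.exp(b*x) / (b - a)
print(sp.simplify(op(a)(y)))                 # exp(b*x), as claimed
print(sp.simplify(op(a)(op(b)(y))))          # 0

y_gen = c*sp.exp(a*x) + d*sp.exp(b*x)
print(sp.simplify(op(a)(op(b)(y_gen))))      # 0: every such y is a solution (when a != b)
```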
 
  • #109
mathwonk said:
now can you refine it to give the full IVT for derivatives? I.e. assume f'(a) = c and f'(b) = d, and c<e<d. prove f'(x) = e has a solution too.

Well, I would do it like this:
Let the function g be defined by g(x) = f(x) - ex. f is differentiable, and obviously ex is also differentiable, hence g is differentiable (a difference of two differentiable functions is differentiable).
g'(x) = f'(x) - e
g'(a) = f'(a) - e = c - e < 0
g'(b) = f'(b) - e = d - e > 0

Since g is continuous on the compact interval [a,b], it attains a minimum there; g'(a) < 0 and g'(b) > 0 rule out the endpoints, so the minimum occurs at some interior point x_0, where g'(x_0) = f'(x_0) - e = 0 => f'(x_0) = e.
Q.E.D.
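
a quick numeric sanity check of this argument (my own sketch; f and e below are arbitrary choices with f'(a) < e < f'(b)):

```python
# Sketch: g(x) = f(x) - e*x has g'(a) < 0 < g'(b), so g has an interior
# minimum, and there g' = 0, i.e. f' = e.
import numpy as np

f = lambda x: x**2           # on [-1, 2]: f'(-1) = -2, f'(2) = 4
e = 1.0                      # -2 < e < 4
g = lambda x: f(x) - e*x

xs = np.linspace(-1, 2, 10**6)
x0 = xs[np.argmin(g(xs))]    # approximate interior minimizer of g
print(x0, 2*x0)              # x0 is near 0.5, where f'(x0) = 2*x0 = 1 = e
```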

What I have realized when learning calculus and also doing this exercise is that many of the proofs use only a few very similar ideas.
At first (when I started with calculus) I didn't really understand why we proved some theorem in a certain way - I understood it only formally, as some series of expressions. But as we proceeded I saw that there is something behind those formulas, some basic ideas which repeat pretty often, and I started to understand it more intuitively (if something like that can be said about math :smile: ).

TD said:
We then used it to prove the implicit function theorem for f : R²->R, which was a rather technical proof (more than we were used to at least).

In my textbook the proof of this theorem takes two pages. Anyway I think that it is a very elegant proof, when you see what's happening there - I drew a picture and I was really surprised how easy it was.
 
  • #110
mathwonk said:
Jbusc, topology is such a basic foundational subject that it does not depend on much else, whereas differential geometry is at the other end of the spectrum. still there are introductions to differential geometry that only use calculus of several variables (and topology and linear algebra). Try Shifrin's notes on his webpage: http://www.math.uga.edu/~shifrin/

I had forgotten that I had posted here, so sorry for bringing it up again.

Thanks for that resource, his notes are exactly what I was looking for. I am reading Hartle's General Relativity, and while it is an excellent book the math is quite watered down, so I am looking for some readings on differential geometry.

As well, there are some graduate students in electrical engineering here whose research problems lead them to ask questions about things such as maps between spaces with different metrics and topologies, and they could use some resources too, since those topics are not addressed in their education.

I have one other question, from looking around it seems that you (and many others) are quite fond of Apostol as a Calculus textbook. Now without being too egotistical I would rank my knowledge of multivariable calc at "very proficient", but I would like to improve to "extremely proficient". I am running low on my textbook budget however - should I really start on volume I or is beginning on volume II adequate?
 
  • #111
nice proof r4nd0m! clean as a whistle. I must say when I read your arguments they look more succinct and clear than my own.

i am really happy to hear what you say about proofs beginning to look more similar, based on a few ideas. like in this case subtraction! this trivial sounding device is very basic, as it reduces the consideration of arbitrary numbers to consideration of the case of zero! and zero is often an easier case to reason about.

jbusc,

you can begin wherever you want, but you may find that volume 2 of apostol uses more familiarity with rigorous arguments and proofs from volume 1 than you are used to. but since you are reading on your own, you can always slow down and go back for a refresher.

i think i found some much cheaper copies of apostol than the usual ones, and listed some of them above. try abebooks.com or one of the other used book sites.
 
  • #112
another good choice for multivariable calc, for people already knowing some calculus, some linear algebra, and some limit theory, is spivak's "calculus on manifolds". this excellent short book is a great bridge from undergraduate to graduate level preparation in calculus.
 
  • #113
remark on implicit, inverse functions: i recall the 2 variable inverse function theorem is used to prove the implicit function theorem from R^2-->R.

As opposed to the one variable inverse function theorem, the 2 variable version is topologically interesting and requires (or usually uses) some new ideas.

for instance, one must prove that a smooth function f:R^2-->R^2 taking (0,0) to (0,0), and with derivative matrix at (0,0) equal to the 2 by 2 identity matrix, maps a small nbhd of (0,0) onto a small nbhd of (0,0).

try picturing this, and see if you can think of an intuitive argument that it should be true.
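
one way to make the picture concrete (a numeric sketch of mine, not a proof): since the derivative at the origin is the identity, Newton's method finds a preimage of any target point near (0,0), which is exactly the "onto a small nbhd" claim in action.

```python
# Sketch: f(0,0) = (0,0) and f'(0,0) = identity; Newton iteration solves
# f(p) = q for targets q near the origin, so f hits a whole nbhd of (0,0).
import numpy as np

def f(p):
    x, y = p
    return np.array([x + x*y, y + x**2])   # an arbitrary example with f'(0,0) = I

def jac(p):                                # its Jacobian matrix
    x, y = p
    return np.array([[1 + y, x], [2*x, 1.0]])

q = np.array([0.01, -0.02])                # a target near (0,0)
p = np.zeros(2)
for _ in range(20):                        # Newton: p <- p - f'(p)^(-1) (f(p) - q)
    p = p - np.linalg.solve(jac(p), f(p) - q)
print(p, f(p))                             # f(p) matches q to machine precision
```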

it is also useful to restudy the argument deriving the implicit function theorem from this result as r4nd0m has done.
 
  • #114
as to solving linear ode's with constant coeffs, the hard part is the case already solved above for n = 1, i.e. that (D-a)y = 0 iff y = ce^(at).

the higher cases follow from that one by induction.

i.e. prove that (D-a)(D-b)y = 0 if and only if z = (D-b)y satisfies (D-a)z = 0. thus z must equal ce^(at), and y must solve (D-b)y = z. so it suffices to
i) find one solution y1 of (D-b)y = e^(at), and then
ii) show that if y2 is another solution, then (y1-y2) solves (D-b)y = 0. and this tells you what y1-y2 is. since you already know y1, this tells you all possible solutions y2.

try this. it shows how to use "linear algebra" to ratchet up the calculus solution for the case n=1, in order to solve all higher cases.
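
if you want to check the answer against a computer algebra system, here is a minimal sympy sketch, with the arbitrary concrete choices a = 2, b = 3:

```python
# Sketch: (D-2)(D-3)y = y'' - 5y' + 6y, and dsolve confirms the two-exponential form.
import sympy as sp

t = sp.symbols('t')
y = sp.Function('y')
ode = sp.Eq(sp.diff(y(t), t, 2) - 5*sp.diff(y(t), t) + 6*y(t), 0)
print(sp.dsolve(ode, y(t)))   # y(t) = C1*exp(2*t) + C2*exp(3*t) (however sympy groups it)
```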
 
  • #115
for people trying to graduate to a higher level view of calculus, here is a little of an old lecture on background for beginning students in differential topology:

Math 4220/6220, lecture 0,
Review and summary of background information

Introduction: The most fundamental concepts used in this course are those of continuity and differentiability (hence linearity), and integration.

Continuity
Continuity is at bottom the idea of approximation, since a continuous function is one for which f(x) approximates f(a) well whenever x approximates a well enough. The precise version of this is couched in terms of “neighborhoods” of a point. In that language we say f is continuous at a, if whenever a neighborhood V of f(a) is specified, there exists a corresponding neighborhood U of a, such that every point x lying in U has f(x) lying in V.
Then the intuitive statement “if x is close enough to a, then f(x) is as close as desired to f(a)”, becomes the statement: “for every neighborhood V of f(a), there exists a neighborhood U of a, such that if x is in U, then f(x) is in V”.
Neighborhoods in turn are often defined in terms of distances, for example an “r neighborhood” of a, consists of all points x having distance less than r from a. In the language of distances, continuity of f at a becomes: “if a distance r > 0 is given, there is a corresponding distance s > 0, such that if dist(x,a) < s, (and f is defined at x) then dist(f(x),f(a)) < r”.
More generally we say f(x) has limit L as x approaches a, if for every nbhd V of L, there is a nbhd U of a such that for every point of U except possibly a, we have f(x) in V. Notice that the value f(a) plays no role in the definition of the limit of f at a. Then f is continuous at a iff f(x) has limit equal to f(a) as x approaches a.
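
a tiny sympy sketch of that last point (my own illustration): the value of f at a plays no role in the limit.

```python
# Sketch: sin(x)/x is undefined at x = 0, yet its limit there is 1,
# so defining f(0) = 1 extends it to a function continuous at 0.
import sympy as sp

x = sp.symbols('x')
print(sp.limit(sp.sin(x)/x, x, 0))   # 1
```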

Differentiability
Differentiability is the approximation of non linear functions by linear ones. Thus making use of differentiability requires one to know how to calculate the linear function which approximates a given differentiable one, to know the properties of the approximating linear function, and how to translate these into analogous properties of the original non linear function. Hence a prerequisite for understanding differentiability is understanding linear functions and the linear spaces on which they are defined.

Linearity
Linear spaces capture the idea of flatness, and allow the concept of dimension. A line with a specified point of origin is a good model of a one dimensional linear space. A Euclidean plane with an origin is a good model of a two dimensional linear space. Every point in a linear space is thought of as equivalent to the arrow drawn to it from the specified origin. This makes it possible to add points in a linear space by adding their position vectors via the parallelogram law, and to "scale" points by real numbers or "scalars", by stretching the arrows by this scale factor, (reversing the direction if the scalar is negative).
We often call the points of a linear space "vectors" and the space itself a "vector space". A linear function, or linear map, is a function from one linear space to another which commutes with these operations, i.e. f is linear if f(v+w) = f(v)+f(w) and f(cv) = cf(v), for all scalars c, and all vectors v,w.
The standard model of a finite dimensional linear space is R^n. A fundamental example of an infinite dimensional linear space is the space of all infinitely differentiable functions on R.
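
as a quick symbolic check (a sketch), sympy confirms that differentiation on this infinite dimensional space is a linear map:

```python
# Sketch: D(v + w) = Dv + Dw and D(cv) = c*Dv for differentiation D.
import sympy as sp

x, c = sp.symbols('x c')
f, g = sp.Function('f'), sp.Function('g')
D = lambda h: sp.diff(h, x)

print(sp.simplify(D(f(x) + g(x)) - (D(f(x)) + D(g(x)))))   # 0
print(sp.simplify(D(c*f(x)) - c*D(f(x))))                  # 0
```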

Linear Dimension
This is an algebraic version of the geometric idea of dimension. A line is one dimensional. This means given any point except the origin, the resulting non zero vector can be scaled to give any other vector on the line. Thus a linear space is one dimensional if it contains a non zero vector v such that given any other vector x, there is a real number c such that x = cv. We say then v spans the line.
A plane has the two dimensional property that if we pick two distinct points both different from the origin, and not collinear with the origin, then every point of the plane is the vector sum of multiples of the two corresponding vectors. Thus a linear space S is two dimensional if it contains two non zero vectors v,w, such that w is not a multiple of v, but every vector in S has form av+bw for some real numbers a,b. We say the set {v,w} spans the plane S.
In general a set of vectors {vi} spans a space S if every vector in S has form Σ aivi where the sum is finite. The space is finite dimensional if the set {vi} can be taken to be finite. A space has dimension r if it can be spanned by a set of r vectors but not by any set of fewer than r vectors. If S is inside T, and both are finite dimensional linear spaces of the same dimension, then S = T.
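
numerically, the dimension of the span of a finite set of vectors is just a matrix rank; a small numpy sketch (the vectors are arbitrary choices of mine):

```python
# Sketch: the span of these three vectors in R^3 is only 2 dimensional,
# since v3 = v1 + v2 contributes nothing new.
import numpy as np

v1 = [1, 0, 1]
v2 = [0, 1, 1]
v3 = [1, 1, 2]                                          # v3 = v1 + v2
print(np.linalg.matrix_rank(np.array([v1, v2, v3])))    # 2: the span is a plane
```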

Linear maps
Unlike continuous maps, linear maps cannot raise dimension, and bijective linear maps preserve dimension. More precisely, if f:S-->T is a surjective linear map, then dim(T) <= dim(S), whereas if f:S-->T is an injective linear map, then dim(T) >= dim(S). Still more precisely, if ker(f) = f^(-1)(0), and im(f) = {f(v): v is in S}, then ker(f) and im(f) are both linear spaces [contained in S,T respectively], and dim(ker(f)) + dim(im(f)) = dim(S). This is the most fundamental and important property of dimension. This is often stated as follows. The rank of a linear map f:S-->T is the dimension of im(f) and the nullity is the dimension of ker(f). Then for f:S-->T, we have rank(f) + nullity(f) = dim(S).
It follows that f is injective if and only if ker(f) = {0}, and surjective if dimT = dim(im(f)) is finite. A linear map f:S-->T with a linear inverse is called an isomorphism. A linear map is an isomorphism if and only if it is bijective. If dimS = dimT is finite, a linear map f:S-->T is bijective if and only if f is injective, if and only if f is surjective. A simple and important example of a linear map is projection R^nxR^m-->R^n, taking (v,w) to v. This map is trivially surjective with kernel {0}xR^m.
The theory of dimension gives a strong criterion for proving the existence of solutions of linear equations f(x) = w in finite dimensional spaces. Assume dimS = dimT finite, f:S-->T linear, and f(x) = 0 only if x = 0. Then for every w in T, the equation f(x) = w has a unique solution.
More generally, if S,T are finite dimensional, f:S-->T linear, and dim(ker(f)) = dim(S) - dim(T) = r, then every equation f(x) = w has an r dimensional set of solutions. We describe the set of solutions more precisely below.
Differentiation D(f) = f' is a linear map from the space of infinitely differentiable functions on R to itself. The mean value theorem implies the kernel of D is the one dimensional space of constant functions, and the fundamental theorem of calculus implies D is surjective.
More generally, for every constant c the differential operator
(D-c) is surjective with kernel the one dimensional space of multiples of e^(ct), hence a composition of n such operators has n dimensional kernel. One can deduce that a linear combination Σ cjD^j, 0 <= j <= n, with constant coefficients cj and cn not 0 (i.e. a constant coefficient operator of maximum order n), has n dimensional kernel.
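
rank + nullity = dim(S) is easy to check numerically; a small sketch (needs numpy and scipy, and the matrix is an arbitrary choice of mine):

```python
# Sketch: for a linear map R^3 --> R^2 given by a matrix A,
# rank(A) + nullity(A) = 3 = dimension of the domain.
import numpy as np
from scipy.linalg import null_space

A = np.array([[1., 2., 3.],
              [2., 4., 6.]])              # rank 1: second row is twice the first
rank    = np.linalg.matrix_rank(A)
nullity = null_space(A).shape[1]          # dimension of ker(A)
print(rank, nullity, rank + nullity)      # 1 2 3
```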

Geometry of linear maps.
If f:S-->T is a linear surjection of finite dimensional spaces, then ker(f) = f^(-1)(0) is a linear space of dimension r = dim(S)-dim(T), and for every w in T, the set f^(-1)(w) is similar to a linear space of dimension r, except it has no specified origin. I.e. if v is any solution of f(v) = w, then the translation taking x-->x+v is a bijection from f^(-1)(0) to f^(-1)(w). Hence the choice of v as "origin" in f^(-1)(w) allows us to define a unique structure of linear space making f^(-1)(w) isomorphic to f^(-1)(0). Thus f^(-1)(w) is a translate of an r dimensional linear space.
In this way, f "fibers" or "partitions" the space S into the disjoint union of the "affine linear sets" f^(-1)(w). There is one fiber f^(-1)(w) for each w in T, each such fiber being a translate of the linear space ker(f) = f^(-1)(0). If
f:S-->T is surjective and linear, and dimT = dimS - 1, then the fibers of f are all one dimensional, so f fibers S into a family of parallel lines, one line over each point of T. If f:S-->T is surjective (and linear), but dimT = dimS - r with r > 0, then f fibers S into a family of parallel affine linear sets f^(-1)(w), each of dimension r.

The matrix of a linear map R^n-->R^m
If S, T are linear spaces of dimension n and m, and {v1,...,vn}, {w1,...,wm} are minimal spanning sets for S,T respectively, then for every v in S, and every w in T, the scalar coefficients ai, bj in the expressions v = Σ aivi and w = Σ bjwj are unique. Then given these minimal spanning sets, a linear map f:S-->T determines and is determined by the "m by n matrix" [cij] of scalars where f(vj) = Σi cijwi, for all j = 1,...,n. If S = T = R^n, we may take vi = wi = (0,...,0,1,0,...,0) = ei = the "ith unit vector", where the 1 occurs in the ith place.
If S is a linear space of dimension n and {v1,...,vn} is a minimal spanning set, we call {v1,...,vn} a basis for S. Then there is a unique isomorphism S-->R^n that takes vi to ei, where the set of unit vectors {e1,...,en} is called the "standard" basis of R^n. Conversely, under any isomorphism S-->R^n, the vectors in S corresponding to the set {e1,...,en} in R^n form a basis for S. Thus a basis for an n dimensional linear space S is equivalent to an isomorphism of S with R^n. Since every linear space has a basis, after choosing one, a finite dimensional vector space can be regarded as essentially equal to some R^n.
In the context of the previous sentence, every linear map can be regarded as a map f:R^n-->R^m. The matrix of such a map, with respect to the standard bases, is the m by n matrix whose jth column is the coordinate vector f(ej) in R^m.
If f:S-->T is any linear surjection of finite dimensional spaces, a careful choice of bases for S,T can greatly simplify the matrix of the corresponding map R^n-->R^m. In fact there are bases for S,T such that under the corresponding isomorphisms, f is equivalent to a projection
R^(n-m)xR^m-->R^m. I.e., up to linear isomorphism, every linear surjection is equivalent to the simplest example, a projection.
This illustrates the geometry of a linear surjection as in the previous subsection. I.e. a projection f:R^nxR^m-->R^m fibers the domain space R^nxR^m into the family of disjoint parallel affine spaces f^(-1)(v) = R^nx{v}, with the affine space R^nx{v} lying over the vector v. Since every linear surjection is equivalent to a projection, every linear surjection fibers its domain into a family of disjoint affine spaces linearly isomorphic to this family. We will see that the implicit function theorem gives an analogous statement for differentiable functions.

The determinant of a linear map R^n-->R^n.
For each linear map f:R^n-->R^n there is an important associated number det(f) = det[cij] = Σp sgn(p) Πi ci,p(i), the sum being over all permutations p of the integers (1,2,...,n). det(f) is the oriented volume of the parallelepiped (i.e. block) spanned by the image of the ordered set of unit vectors f(e1),...,f(en). Then f is invertible iff det(f) is not 0. The intuition is that this block has non zero n dimensional volume iff the vectors f(e1),...,f(en) span R^n, iff f is surjective, iff f is invertible.:-p
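
a quick numpy sketch of that last point (the matrices are arbitrary choices of mine):

```python
# Sketch: nonzero determinant <=> invertible; zero determinant <=> the
# image collapses (here B squashes the plane onto a line).
import numpy as np

A = np.array([[2., 1.], [1., 1.]])
print(np.linalg.det(A))                  # 1.0 (nonzero)
print(np.linalg.inv(A) @ A)              # the 2 by 2 identity

B = np.array([[1., 2.], [2., 4.]])       # second row = twice the first
print(np.linalg.det(B))                  # 0.0: B is not invertible
```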
 
  • #116
summary of derivatives in several variables

Here is the other half of lecture zero for a course that intends to use calculus of several variables. i.e. this is what you need to know:

Derivatives: Approximating non linear functions by linear ones.
Ordinary Euclidean space R^n is a linear space in which an absolute value is defined, say by the Euclidean "norm", |v| = (x1^2+...+xn^2)^(1/2), where v = (x1,...,xn), hence also a distance is defined by dist(v,w) = |v-w|. The set of points x such that |x-a| < r, is called the open ball of radius r centered at a. An "open set" is any union of open balls, and an open neighborhood of the point a is an open set containing a. If f:R^n-->R^m is any map, then f(x) has limit L as x approaches a, iff the real valued function |f(x)-L| has limit 0 as x approaches a.
In a linear space with such an absolute value or norm we can define differentiability as follows. A function h is "tangent to zero" at a, if h(a) = 0 and the quotient |h(x)|/|x-a| has limit zero as x approaches a. I.e. if "rise" over "run" approaches zero in all directions. In particular then h(x) approaches zero as x approaches a. Two functions f,g are tangent at a, if the difference f-g is tangent to zero at a.
A function f defined on a nbhd of a, is differentiable at a if there is a linear function L such that L(v) is tangent to f(v+a)-f(a) at 0. Then L = f'(a) is unique and is called the derivative of f at a. I.e. f has derivative L = f'(a) at a, iff the quotient |(f(x)-f(a)-L(x-a))|/|x-a| has limit zero as x approaches a. If f is itself linear, then f'(a)(v) = f(v), for all a. I.e. then a-->f'(a) is a constant (linear map valued) function, with value f everywhere.
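
to see the defining quotient actually going to zero, here is a numeric sketch (f, the point a, and the direction are arbitrary choices of mine; L is the Jacobian of f at a, computed by hand):

```python
# Sketch: |f(a+v) - f(a) - L(v)| / |v| shrinks like |v| as v --> 0,
# for f(x,y) = (xy, x + y^2) at a = (1,1), with L = [[1,1],[1,2]].
import numpy as np

f = lambda p: np.array([p[0]*p[1], p[0] + p[1]**2])
a = np.array([1.0, 1.0])
L = np.array([[1.0, 1.0],
              [1.0, 2.0]])               # Jacobian of f at (1,1)

for h in [1e-1, 1e-2, 1e-3, 1e-4]:
    v = h * np.array([1.0, -2.0]) / np.sqrt(5.0)   # a vector of length h
    print(np.linalg.norm(f(a + v) - f(a) - L @ v) / np.linalg.norm(v))
```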

Chain Rule
The most important property of derivatives is the chain rule for the derivative of a composite function. If f is differentiable at a and g is differentiable at f(a), then gof is differentiable at a and (gof)'(a) = g'(f(a))of'(a). I.e. the derivative of the composition, is the composition (as linear functions) of the derivatives. Since the derivative of the identity map is the identity map, this says roughly "the derivative is a functor", i.e. it preserves compositions and identity maps.
As a corollary, if a differentiable function has a differentiable inverse, the derivative of the inverse function is the inverse linear function of the derivative. I.e. if f^(-1) exists and is differentiable, then (f^(-1))'(f(a)) = (f'(a))^(-1). In particular, since a linear function can be invertible only if the domain and range have the same dimension, the same holds for a differentiable function. E.g. a differentiable function f:R^2-->R cannot have a differentiable inverse. (Continuous invertible functions also preserve dimension, but this is harder to prove in general. It is easy in low dimensions however. Can you prove there is no continuous invertible function f:R^2-->R?)
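
the functoriality can be checked symbolically; a sympy sketch with two arbitrary maps of my own choosing:

```python
# Sketch: the Jacobian of g o f equals (Jacobian of g at f) times (Jacobian of f).
import sympy as sp

x, y, u, v = sp.symbols('x y u v')
f = sp.Matrix([x*y, x + y])                           # f : R^2 --> R^2
g = sp.Matrix([u**2, u - v])                          # g : R^2 --> R^2

Jf  = f.jacobian([x, y])
Jg  = g.jacobian([u, v]).subs({u: x*y, v: x + y})     # g'(f(x,y))
Jgf = g.subs({u: x*y, v: x + y}).jacobian([x, y])     # (g o f)'

print(sp.simplify(Jgf - Jg * Jf))                     # the zero matrix
```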

Calculating derivatives
The usual definition of the derivative of a one variable function from R to R agrees with that above, in the sense that if f'(a) is the usual derivative, i.e. the number lim(h-->0) [f(a+h)-f(a)]/h, then
f(a+h)-f(a) is tangent at zero to the linear function f'(a)h of the variable h. I.e. the usual derivative is the number occurring in the 1 by 1 matrix of the derivative thought of as a linear function. There is an analogous way to compute the matrix of the derivative in general.
A function f:R^n-->R^m is made up of m component functions g1,...,gm, and if in the ith component function gi, we hold all but the jth variable constant, and define the real valued function h(t) of one variable by h(t) = gi(a1,...,aj+t,...,an), we call h'(0) = dgi/dxj(a), the jth partial derivative of gi at a. If f is differentiable at a, then all partials of f exist at a, and the matrix of the derivative L = f'(a) of f at a is the "Jacobian" matrix of partials [dgi/dxj(a)].
It is useful to have a criterion for existence of a derivative that does not appeal to the definition. It is this: if all the partials of f exist not only at a but in a nbhd of a, and these partials are all continuous at a, then f is differentiable at a, and the derivative is given by the matrix of partials. We can then check the invertibility of f'(a), by computing the determinant of this Jacobian matrix.

Inverse function and implicit function theorems
The "inverse function theorem", is a criterion for f to have a local differentiable inverse as follows: If f is differentiable on a neighborhood of a, and the derivative f'(x) is a continuous function of x in that nbhd, (i.e. the entries in the matrix of f'(x) are continuous functions of x), and if f'(a) is invertible, then f is differentiably invertible when restricted to some nbhd U of a. I.e. f maps some open nbhd U of a bijectively onto an open nbhd V = f(U) of f(a), with f-1 defined and differentiable on V, and f-1(V) = U.
More generally, the implicit function theorem characterizes differentiable functions locally equivalent to projection maps, as follows. If f is differentiable on a neighborhood of a in R^n with values in R^m, if the derivative f'(x) is a continuous function of x, and if f'(a) is surjective, then on some nbhd U of a, f is differentiably isomorphic to a projection.
I.e. if f:R^n-->R^m is continuously differentiable near a with surjective derivative at a, then there are open sets U in R^n, W in
R^(n-m), V in R^m, with U a nbhd of a, V a nbhd of f(a), and a differentiable isomorphism h:U-->WxV, such that the composition
foh^(-1):WxV-->V is the projection map (x,y)-->y. Then the parallel flat sets Wx{y} which fiber the rectangle WxV are carried by h^(-1) into "parallel" curved sets which fiber the nbhd U of a. The fiber passing through a, suitably restricted, is the graph of a differentiable function, hence the name of the theorem.
I.e. one can take a smaller nbhd of a within U, of form XxY, with X in W, and the map XxY-->WxV to be of form (x,y)-->(x,f(x,y)). Then the flat set Xx{f(a)} pulls back by h^(-1) to some subset Z of XxY in which every point is determined by its "X-coordinate". I.e. given x in X, there is a unique point of form (x,f(a)), hence a unique point h^(-1)(x,f(a)) in the set Z = h^(-1)(Xx{f(a)}). Since on Z the Y coordinate of every point is determined by the X coordinate, and every x coordinate in X occurs, Z is the graph of a function X-->Y. This function is differentiable since it is a composite of differentiable functions: i.e. (projection) o (h^(-1)) o (id,f(a)). We are more interested in the simpler geometric interpretation, that the map fibers the domain into smooth parallel surfaces, than in the "implicit function" interpretation that each of these surfaces is a graph of a function.
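
the "graph of a function" content is easy to see numerically; a sketch (the circle is my arbitrary example, and scipy's brentq does the root finding):

```python
# Sketch: near (1,1), the level set x^2 + y^2 = 2 is the graph of a function
# y(x), recovered by solving f(x, y) = 0 for y at each fixed x.
import numpy as np
from scipy.optimize import brentq

f = lambda x, y: x**2 + y**2 - 2.0       # df/dy = 2y != 0 at (1,1)

for x in [0.9, 1.0, 1.1]:
    y = brentq(lambda t: f(x, t), 0.5, 1.5)   # root in y, for this x
    print(x, y, np.sqrt(2 - x**2))            # agrees with the explicit graph
```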

Compactness
In proving various results, we will often need the important ideas of connectedness and compactness from point set topology. In Euclidean space recall that an open set is a union of open balls. Compactness is a replacement for finiteness as follows: a set Z is called compact if whenever Z is "covered by" a collection of open sets (i.e. Z is contained in the union of those open sets), then a finite number of those same open sets already cover Z. A set is called "closed" if it is the complement of an open set.
A subset of R^n is compact if and only if it is closed and contained in some finite open ball, i.e. if and only if it is closed and "bounded". It follows that the product of two compact sets of Euclidean space is compact.
If f is a continuous function, and Z a compact subset of its domain, then f(Z) is also compact. Hence a real valued continuous function defined on a compact set Z assumes a global maximum there, namely the least upper bound of its values on Z. Likewise it assumes a global minimum on Z.
If Z is a compact subset of R^n then any open cover {Ui} of Z has a "Lebesgue number". I.e. given any collection of open sets {Ui} covering Z, there is a positive number r > 0, such that every open ball of radius r centered at any point of Z is wholly contained in some open set Ui of the given cover. This number is the minimum of the continuous function assigning to each point p of Z the least upper bound of its distances from the outside of all the sets Ui, i.e. the least upper bound of all r > 0 such that the open ball of radius r about p is contained in some set Ui. This function is positive valued since the sets Ui cover Z, hence it has a positive minimum.
A sequence contained in a compact set Z has a subsequence converging to a point of Z. In R^n this property implies in turn that Z is closed and bounded hence compact.

Connectedness
This is one of the most intuitive concepts in topology. Ask anyone, mathematician or not, which set is connected, the interval [0,1] or the two point set {0,1}, and they will always get it right. Fortunately it is also one of the most important and powerful concepts. A set Z is connected if whenever Z is contained in the union of two open sets A,B, then either some point of Z is in both A and B, or Z is entirely contained in one of the sets A or B. I.e. you cannot separate a connected Z into two non empty disjoint open parts (A intersect Z) and (B intersect Z). Either (A intersect Z) and (B intersect Z) have a common point, or one of them is empty.
The empty set is connected. Any one point set is connected. The only connected subsets of R are the intervals, either finite or infinite, open or closed, half open or half closed. The image of a connected set under any continuous map is again connected. Thus an integer valued continuous function on an interval is constant. If f is a continuous real valued function defined on an interval of R, the set of values of f is also an interval. In calculus this is called the intermediate value theorem. (Tip: For proving things about connectedness, the most efficient form of the definition is that a set S is connected if and only if every continuous map from S to the 2 point set {0,1} is constant.)
If f:S^1-->R^2 is a continuous injection from the circle to the plane, then R^2 - f(S^1) is a disjoint union of exactly two non empty connected open sets, the inside and the outside of the closed loop f(S^1). This, the "Jordan curve theorem", is famously hard to prove, but we will prove it easily when f is continuously differentiable.:smile:
 
  • #117
i have just summarized all of the basics of topology, linear algebra, and calculus of several variables. did i touch all the bases? does this help anyone?

do you recognize the content of your first 2 or 3 years of math in these 10 pages?:bugeye:
 
  • #118
mathwonk said:
do you recognize the content of your first 2 or 3 years of math in these 10 pages?:bugeye:
In general, most of it, but some of the stuff I hadn't seen before.

I was surprised because I had seen most of the stuff in your post on differential topology (I haven't studied topology yet). I also was familiar with connectedness. I had not seen the inverse function and implicit function theorems, but I'll be seeing them next semester. Also the local boundedness stuff was new.

Do you have any notes on algebra you can just copy/paste? I'm taking my first course on algebra next semester from an extremely difficult (yet amazing) professor, so I plan to start reading my book in advance. Any extra notes or pointers would be appreciated!
 
  • #119
I was just wondering, does anybody have an online notebook? In other words, I am thinking about creating a LaTeX journal that shows my work for all the problems that I do (right now working out of Apostol). What do you guys think about this?
 
  • #120
courtrigrad said:
I was just wondering, does anybody have an online notebook? In other words, I am thinking about creating a LaTeX journal that shows my work for all the problems that I do (right now working out of Apostol). What do you guys think about this?

Yes I have one. Currently I have four categories: Advanced Calculus, Linear Algebra, Complex Analysis, and Number Theory. It's actually really fun doing this; in a sense it's an "end result" to all your work. I mean sure there is a certain self satisfaction you get from proving something, but there is nothing concrete. It's also an amazing way to get organized. Currently I have 5-6 big binders full of math problems, all disorganized, so what I do is I read a section, do as many of the problems as I can, and then compare them to my previous solutions if any. A lot of times I find out my new solution ends up being much cleaner than my old one. Also I don't use latex, I just use a scanner; it's much quicker and I can focus on solving problems rather than on making them look pretty. I think it's a great idea, go for it.:smile:
 
  • #121
Just a thought...I think that this suggestion is bloody brilliant, and I was just wondering how much interest there would be in a section where people can post the more interesting problems they have solved, with working included, by category, so that others can discuss them, find and comment upon other ways of reaching the solutions, or be inspired by them...like I said, it was just a thought :smile:
 
  • #122
ircdan, my profile has my webpage address at school (something like roy at math dept UGA), where there is an entire algebra book, in pdf, downloadable. also a 14 page linear algebra primer. how's that?
 
  • #123
mathwonk said:
ircdan, my profile has my webpage address at school (something like roy at math dept UGA), where there is an entire algebra book, in pdf, downloadable. also a 14 page linear algebra primer. how's that?
Excellent thank you.:smile:
 
  • #124
it is a privilege to be of assistance.

I hope some will or has solved my second ode exercise as well. i realize i did not give enough help. there is an idea there, the idea of linearity.

i.e. the first step is to prove that (D-a)y = 0 iff y = ce^(at).

Then to solve (D-a)(D-b)y = 0, one needs a little preparation.

define the operator Ly = (D-a)(D-b)y. then show that L is linear i.e.
(i) L(cy) = cL(y) and
(ii) L(y+z) = L(y)+L(z).

and show also that L = (D-a)(D-b) = (D-b)(D-a).

then it follows that L(0) = 0. hence (D-a)y = 0 or (D-b)y=0 implies that also L(y) = 0.

when a and b are different this already gives as solutions at least all functions y = ce^(at) + de^(bt).

then we want to prove there are no others. try to get this far first.

notice we are introducing a concept, the concept of "linearity", into what was previously just a calculation. this distinction separates advanced math from elementary math.
 
  • #125
oh by the way my website actually has three algebra books: an elementary one that i teach from to our juniors, one i teach from to our grad students, and a linear algebra book i have never had the nerve to teach anyone from yet, since it covers the whole semester or more course in 14 pages.
[edit: (many years later) that 14 page book has been greatly expanded now into a longer version also on that webpage.]
 
  • #126
my preference is actually topology, differential topology, and complex analysis, or all of them combined in complex algebraic geometry. but because even the mathematical layperson thinks that anyone who calls himself an algebraic geometer must know some algebra, i have been called upon more often to teach algebra than complex analysis or topology. hence my books, which are really course notes, are almost all about algebra. it was good for me to have to learn the subject, but i hope someday they trust me to teach topology or complex analysis again, or even real analysis, so i can learn that too.

i did write some notes ages ago on sheaf theory, and serre's duality theorem proved by distribution theory (real and functional analysis) and complex riemann surfaces, but it was before the era of computers so i have no magnetic versions of those notes.
 
  • #127
mathwonk said:
it is a privilege to be of assistance.

I hope some will or has solved my second ode exercise as well. i realize i did not give enough help. there is an idea there, the idea of linearity.

i.e. the first step is to prove that (D-a)y = 0 iff y = ce^(at).

Then to solve (D-a)(D-b)y = 0, one needs a little preparation.

define the operator Ly = (D-a)(D-b)y. then show that L is linear i.e.
(i) L(cy) = cL(y) and
(ii) L(y+z) = L(y)+L(z).

and show also that L = (D-a)(D-b) = (D-b)(D-a).

then it follows that L(0) = 0. hence (D-a)y = 0 or (D-b)y=0 implies that also L(y) = 0.

when a and b are different this already gives as solutions at least all functions y = ce^(at) + de^(bt).

then we want to prove there are no others. try to get this far first.

notice we are introducing a concept, the concept of linearity, into what was previously just a calculation. this distinction separates advanced math from elementary math.

I already showed the first part in an earlier post I think. Well I showed that if (D - a)y = 0, then all solutions are of the form y = ce^(at). The other direction is just a calculation I assume. If y = ce^(at), then
(D - a)(ce^(at)) = D(ce^(at)) - ace^(at) = ace^(at) - ace^(at) = 0.

For the second part you just hinted at, I had been trying and couldn't get it, but I think I got it now (at least the direction you gave hints for); it just did not occur to me to define Ly = (D-a)(D-b)y and show it is linear, and then since L is linear, L(0) = 0. I think it's very nice to see that linearity can be used here. I studied linear algebra but never used it to solve differential equations. I think this works, but I'm not too sure it's correct.

First, to show L is linear. (I showed a lot of the steps, out of habit, but maybe that's not necessary.)

Define Ly = (D-a)(D-b)y.

L(cy) = (D - a)(D - b)(cy)
= [D^2 - (a + b)D + ab](cy)
= D^2(cy) - (a +b)D(cy) + ab(cy)
= cD^2(y) - c(a + b)D(y) + c(ab)y (by linearity of D)
= c(D^2(y) - (a + b)D(y) + aby)
= c[D^2 - (a + b)D + ab](y)
= c(D - a)(D - b)y
= cLy

L(z + y) = (D - a)(D - b)(z + y)
= [D^2 - (a + b)D + ab](z + y)
= D^2(z + y) -(a + b)D(z + y) + ab(z + y)
= D^2(z) + D^2(y) - (a + b)D(z) - (a + b)D(y) + abz + aby (by linearity of D)
= D^2(z) - (a + b)D(z) + abz + D^2(y) - (a + b)D(y) + aby
= [D^2 - (a + b)D + ab](z) + [D^2 - (a + b)D + ab](y)
= (D - a)(D - b)(z) + (D - a)(D - b)(y)
= Lz + Ly

Thus L is linear.

Also (D - a)(D - b) = [D^2 - (a + b)D + ab]
= [D^2 - (b + a)D + ba]
= (D - b)(D - a).

Also L(0) = 0. (this follows from the fact that L is linear, so a separate check is not really necessary, right?)

Hence (D - a)(y_1) = 0 and (D - b)(y_2) = 0 imply L(y_1 + y_2) = L(y_1) + L(y_2) = 0 + 0 = 0,
so y = y_1 + y_2 = ce^(at) + de^(bt) is a solution? (is that right?)
Edit: For the second part, does this work? (this doesn't use linear algebra, and I guess it isn't a proof since I didn't prove the method being used)

Suppose w is another solution to (D - a)(D-b)y =0, then
(D-a)(D-b)w = 0,
w'' - (a + b)w' + abw = 0, which has characteristic equation
r^2 - (a+b)r + ab = 0 => r = a or r = b, hence w = ce^(at) + de^(bt) = y.

I'm assuming there is a way it can be done with linear algebra, I'll try later, thanks for the tip.
 
  • #128
excellent. it all looks correct and exemplary. as to the final argument, you are again right, it is not a proof, since the word "hence" in the next to last line assumes exactly the uniqueness we are trying to prove.

the point is that in linear algebra if you can find all solutions to the equation fx = 0, and if you can find one solution to fx = b, then you can get all the other solutions to fx=b, by adding solutions of fx=0 to the one solution you have.

you also want to use the fact, true of all functions, linear or not, that if g(a) = b, and if f(c)=a, then g(f(c)) = b. i.e. to find solutions for a composite function (D-a)(D-b)y = 0, find z such that (D-a)z =0, then find y such that (D-b)y = z.

so use the step you already did to solve for z such that (D-a)z = 0, then by hook or crook (e.g. the characteristic equation) find one solution of (D-b)y = z, and then finally use linearity to find ALL solutions of (D-b)y = z, hence also by linearity all solutions of (D-a)(D-b)y = 0.

this is a completely self contained proof of the uniqueness step for these ode's, a step often left out of books, which instead quote the general existence and uniqueness theorem that many do not prove. but this proof is much easier than the general theorem, and uses the theory of linearity one has already studied in linear algebra.

In fact it is not too far a guess to imagine that most of linear algebra, such as jordan forms etc, was discovered by looking at differential equations, and was intended to be used in solving them. today's linear algebra classes that omit all mention of differential equations are hence absurd exercises in practicing the tedious and almost useless skill of multiplying and simplifying matrices. The idea of linearity, that L(f+g) = Lf + Lg, is never even mentioned in some courses on linear algebra, if you can believe it, and certainly not the fact that differentiation is a linear operator.
 
  • #129
mathwonk said:
today's linear algebra classes that omit all mention of differential equations are hence absurd exercises in practicing the tedious and almost useless skill of multiplying and simplifying matrices. The idea of linearity, that L(f+g) = Lf + Lg, is never even mentioned in some courses on linear algebra, if you can believe it, and certainly not the fact that differentiation is a linear operator.

Yea my first linear algebra class was very tedious! We mentioned linearity but I didn't really learn any of the nice properties of linear operators until my second course in linear algebra.

Anyways I think I got the second part thanks to your hints.

(D - a)(D - b)y = 0 implies (D - b)y = z and (D - a)z = 0 (this follows from the hint you gave about general functions).

Now (D - a)z = 0 iff z = ce^(at); take z = e^(at) first (a general c just scales the particular solution), so
(D - b)y = e^(at)
Let y_p = Ae^(at) for some A and note
(D - b)(Ae^(at)) = aAe^(at) - bAe^(at) = A(a - b)e^(at) = e^(at), hence A = 1/(a - b) so that y_p = e^(at)/(a - b) solves (D - b)y = e^(at).

Now suppose y_1 is any other solution to (D - b)y = e^(at).
Since (D - b) is linear,
(D - b)(y_1 - y_p) = (D - b)y_1 - (D - b)y_p = e^(at) - e^(at) = 0 and thus w = y_1 - y_p solves (D - b)y = 0, so w = de^(bt) for some d.

Again, since (D - b) is linear,
(D - b)(y_p + w) = (D - b)y_p + (D - b)w = e^(at) + 0 = e^(at), hence
y = y_p + w = e^(at)/(a - b) + de^(bt) solves (D - b)y = e^(at) and so y also solves (D - a)(D - b)y = 0; letting the multiple of e^(at) in z vary as well, all solutions have the form ce^(at) + de^(bt).

I notice this only works if a != b. If a = b, the y_p would be different, but the same proof would work, it seems.
 
  • #130
outstanding! and do you know the solution if a = b?
 
  • #131
mathwonk said:
outstanding! and do you know the solution if a = b?

Hey thanks for the help! Yea I just tried it now, and it didn't take me as long as that first one because it's almost the same. There are two differences in the proof. Originally I thought there would be only one difference until I tried it, so I'm glad I did it, it's good practice too.

(D - a)(D - a)y = 0 implies (D - a)y = z and (D - a)z = 0.

Now (D - a)z = 0 iff z = ce^(at), so
(D - a)y = ce^(at)
Let y_p = Ate^(at) for some A and note
(D - a)(Ate^(at)) = D(Ate^(at)) - aAte^(at) = Ae^(at) + aAte^(at) - aAte^(at) = Ae^(at) = ce^(at), hence A = c so that y_p = cte^(at) solves (D - a)y = ce^(at).

Now suppose y_1 is any other solution to (D - a)y = ce^(at).
Since (D - a) is linear,
(D - a)(y_1 - y_p) = (D - a)y_1 - (D - a)y_p = ce^(at) - ce^(at) = 0 and thus w = y_1 - y_p solves (D - a)y = 0, so w = de^(at) for some d.

Again, since (D - a) is linear,
(D - a)(y_p + w) = (D - a)y_p + (D - a)w = ce^(at) + 0 = ce^(at), hence
y = y_p + w = cte^(at) + de^(at) solves (D - a)y = ce^(at), and so y also solves (D - a)(D - a)y = 0, so all solutions have the form cte^(at) + de^(at).

I think that works. Thanks for all the tips!
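
for what it's worth, a short sympy check of that answer (a sketch):

```python
# Sketch: (D-a)^2 annihilates c*t*e^(a*t) + d*e^(a*t).
import sympy as sp

t, a, c, d = sp.symbols('t a c d')
Dm = lambda g: sp.diff(g, t) - a*g        # the operator (D - a)

y = c*t*sp.exp(a*t) + d*sp.exp(a*t)
print(sp.simplify(Dm(Dm(y))))             # 0
```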
 
  • #132
this method works, modified, on any problem which can be factored into first order operators, and where one can solve first order problems. another example is the so called Euler equation.
For Euler's equation, x^2y'' + (1-a-b)xy' + ab y = 0, with indicial equation

(r-a)(r-b) = 0, just factor x^2y'' + (1-a-b)xy' + ab y = (xD-a)(xD-b)y = 0,

and solve (xD-a)z = 0, and then (xD-b)y = z.

As above, this proves existence and uniqueness simultaneously, and also
handles the equal roots cases at the same time, with no guessing.

Here you have to use, I guess, integrating factors to solve the first order cases, and be careful when "multiplying" the non constant coefficient operators (xD-a), since you must use the leibniz rule.

these are usually done by power series methods, or just by stating that the indicial equation should be used, again without proving there are no other solutions. of course the interval of the solution must be specified, or else I believe the space of solutions is infinite dimensional.
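
a sympy sanity check on a concrete Euler equation (a sketch, with the arbitrary choices a = 1, b = 2, so the equation is x^2y'' - 2xy' + 2y = 0 on x > 0; sympy's dsolve knows Euler equations):

```python
# Sketch: the indicial equation (r-1)(r-2) = 0 predicts solutions x and x^2.
import sympy as sp

x = sp.symbols('x', positive=True)
y = sp.Function('y')
ode = sp.Eq(x**2*sp.diff(y(x), x, 2) - 2*x*sp.diff(y(x), x) + 2*y(x), 0)
print(sp.dsolve(ode, y(x)))   # y(x) = C1*x + C2*x**2 (however sympy groups it)
```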
 
  • #133
O.K., great thread guys. I got one for you.

I want to study Functional Analysis (Operator theory, Measure theory - Probability) and its applications in Quantum Physics, Statistical Mechanics and any other interesting part of Physics. After asking about 10 Professors on my campus (half in Physics, half in Math), I got the feeling that a department of Mathematics would be the best choice for me (among other things, mathematical rigor is something that's important to me).

Any insights on that, and also recommendations on what schools I should apply to?
 
  • #134
mathwonk said:
this method works, modified, on any problem which can be factored into first order operators, and where one can solve first order problems.
Neat. Now that I think about it, after reading your post, I remembered that I had seen something similar for pdes about a year ago, in particular, for the wave equation.

For notation, let u = u(x,t), and u_tt, u_xx denote partial derivatives. Then if we consider u_tt = c^2u_xx for -inf < x < inf,
u_tt - c^2u_xx = (d/dt - cd/dx)(d/dt + cd/dx)u = 0.
Let v = (d/dt + cd/dx)u; then we must have (d/dt - cd/dx)v = 0. Anyways these pdes are easy to solve, and you end up with u(x,t) = f(x + ct) + g(x - ct) using the same argument.

This result is stated in my book (without proof) and I had always wondered how they did it. I knew how to solve the two individual pdes, but I never knew how to prove that all solutions of the wave equation for x in (-inf, inf) had the form f(x + ct) + g(x - ct), but now I know how, thanks to working out the simpler ones like (D - a)(D-b)y = 0. Thanks a lot for the help.:smile:
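
and sympy confirms symbolically that every u of this form solves the wave equation (a sketch):

```python
# Sketch: u = f(x + c*t) + g(x - c*t) satisfies u_tt - c^2 u_xx = 0
# for ARBITRARY twice differentiable f and g.
import sympy as sp

x, t, c = sp.symbols('x t c')
f, g = sp.Function('f'), sp.Function('g')

u = f(x + c*t) + g(x - c*t)
print(sp.simplify(sp.diff(u, t, 2) - c**2*sp.diff(u, x, 2)))   # 0
```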
 
  • #135
Do most people major just in math? Or do they have a minor in something else? These days, it is hard to find a job if one just gets a PhD in pure math. What are some good combinations? I am considering a major in math and a minor in economics. Or majoring in math and minoring in biology. What are your thoughts and opinions about these options? Also, what is graduate school like? My father keeps telling me not to pursue a PhD right after undergraduate school. Would it be better to work for a few years, and then consider getting a PhD? That way, you would have experienced the real world. Could you guys please tell me your experiences of graduate school, and your opinions about the PhD degree?

Thanks a lot :smile:
 
  • #136
when i get time and inspiration, i mean to continue my thread of general advice by going further into the grad school experience, passing prelims, writing a thesis, passing orals, applying for jobs, and grants, and then teaching and maybe departmental politics, survival, and retirement planning. and then getting into math heaven, reincarnation as a fields medalist...
 
  • #137
Recently I encountered the book "Mathematical problem solving methods" written by L.C. Larson. There are many problems from the Putnam competition.
My question is: how important is it for a physicist (or mathematician) to be able to solve this kind of problem?
 
  • #138
well it is not essential, but it can't hurt. I myself have never solved a Putnam problem, and did not participate in the contest in college, but really bright, quick people who do well on them may also be outstanding mathematicians.

My feeling from reading a few of them is they do not much resemble real research problems, since they can presumably be done in a few hours as opposed to a few months or years.

E.g. the famous fermat problem was solved in several stages. first people tried a lot of special cases, i.e. special values of the exponent. None of these methods ever yielded enough insight to even prove it in an infinite number of cases.

Then finally Gerhard Frey thought of linking the problem with elliptic curves, by asking what kind of elliptic curve would arise from the usual equation y^2 = (x-a)(x-b)(x-c) if a,b,c were constructed in a simple way from three solutions to fermat's problem.

he conjectured that the elliptic curve could not be "modular". this was indeed proved by Ribet I believe, and then finally Andrew Wiles felt there was enough guidance and motivation there to be worth a long hard attempt on the problem via the question of modularity.

Then he succeeded finally, after a famous well publicized error, and some corrective help from a student, at solving the requisite modularity problem.

He had to invent and upgrade lots of new techniques for the task and it took him over 7 years.

I am guessing a Putnam problem is a complicated question that may, through sufficient cleverness, be solved by linking it with some simpler insight, but seldom requires any huge amount of theory.

However any practice at all in thinking up ways to simplify problems, apply old ideas to new situations, etc, or just compute hard quantities, is useful. I would do a few and see if they become fun. If not I would not punish myself.
 
  • #139
you could start a putnam thread here perhaps, if people want to talk about these problems and get some first hand knowledge. but in research the smartest people, although they often do best on these tests, do not always do the deepest research. that requires something else, like taste, courage, persistence, luck and inspiration.

One of my better results coincided with the birth of one of my children. Hironaka (a famous fields medalist) once told me, somewhat tongue in cheek, that others had noticed a correlation between making discoveries and getting married, and "some of them do this more than once for that reason."

I have noticed that success in research is, in the long run, related to long hard consistent work. I.e. if you keep at it faithfully, doing what you have noticed works, you will have some success. Don't be afraid to make mistakes, or to make lengthy calculations that may not go anywhere.

And talk to people about it. This can be embarrassing, but after giving a talk on work that was still somewhat half baked, I have usually finished it off satisfactorily.

Here is an example that may be relevant: Marilyn Vos Savant seems to be an intelligent person, who embarrassed many well educated mathematicians a few years back with a simple probability problem published in a magazine. But not only can she not do any research in the subject without further training, she does not even understand much of what she has read about mathematics. Still she has parlayed her fame into a newspaper column and some books.

The great Grothendieck, so deep a mathematician that his work discouraged fellow Fields medalist Rene Thom from even attempting to advance in algebraic geometry, once proposed 57 as an example of a prime number. This composite integer is now famous as "Grothendieck's prime".

But he made all of us begin to realize that to understand geometry, and also algebra, one must always study not just individual objects or spaces, but mappings between those objects. This is called category theory. Of course a study of Riemann's works reveals that he also focused as well on studying objects in families, i.e. mappings whose fibers are the objects of interest.

that is why the first few chapters of Hartshorne are about various types of maps: proper maps, finite maps, flat maps, etale maps, smooth maps, birational maps, generically finite maps, affine maps, etc...
 
  • #140
If someone wanted to get a Ph.D. in mathematical physics, should they pursue an undergrad degree in math or physics? I would eventually like to do research in M theory, but as a mathematical physicist. Thanks in advance for your reply.
 
