Understanding the recursion theorem

In summary: The recursion theorem states that if [itex]\phi: a^{\mathbb{N}} \rightarrow a^{\mathbb{N}}[/itex] satisfies [itex]\phi(f)(n) = \phi(g)(n)[/itex] whenever [itex]f[/itex] and [itex]g[/itex] agree below [itex]n[/itex], then [itex]\phi[/itex] has exactly one fixed point [itex]L_\phi[/itex]. Uniqueness holds because two fixed points would have to agree at the first place they could differ; existence follows by building [itex]L_\phi[/itex] one value at a time.
  • #1
martijnh
Hi all,

I'm doing some self-study in set theory, but got kind of stuck with a proof my textbook gives of the so-called recursion theorem:

Theorem:
Let [itex]a[/itex] be a set, and let [itex]\phi: a^{\mathbb{N}} \rightarrow a^{\mathbb{N}}[/itex] be a function such that for every natural number [itex]n[/itex], if [itex]f, g \in a^{\mathbb{N}}[/itex] are such that [itex]f|n = g|n[/itex], then [itex]\phi(f)(n) = \phi(g)(n)[/itex]. Then [itex]\phi[/itex] has a unique fixpoint [itex]L_\phi \in a^{\mathbb{N}}[/itex], which means that [itex]\phi(L_\phi) = L_\phi[/itex]. Consider the function [itex]\phi_n : a^{n} \rightarrow a[/itex] which evaluates to [itex]\phi_n(g) = \phi(g^{*})(n)[/itex]. Then we have

[tex]L_\phi(0) = \phi_0(!: 0 \rightarrow a)[/tex]

[tex]L_\phi(n^+) = \phi_{n^+}(L_\phi|n^+)[/tex]
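
As a concrete illustration (my own example, not from the textbook): take [itex]a = \mathbb{N}[/itex] and define

[tex]\phi(f)(0) = 1, \qquad \phi(f)(n^+) = n^+ \cdot f(n).[/tex]

Since [itex]\phi(f)(n)[/itex] depends only on the values of [itex]f[/itex] below [itex]n[/itex], the hypothesis is satisfied, and the unique fixpoint is the factorial sequence [itex]L_\phi(n) = n![/itex].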


What I get from this is:
Let [itex]\phi[/itex] be a function that maps the result of a function that maps natural numbers to the set a, to the result of another function that does the same.

If you'd pick two functions from the function set a[itex]^{\mathbb{N}}[/itex] and these return the same result for an element n, then the set of functions [itex]\phi[/itex] would yield the same given f or g using n as input.

They then continue to use this to show there is a so-called fixpoint in the set of functions [itex]\phi[/itex] which yields itself given itself as input to [itex]\phi[/itex]. But I don't see how this follows from the facts described above...


Then they go on to prove there can be only one such fixpoint, but this makes even less sense to me:

Proof:
There is at most one such fixpoint. In fact, let [itex]L[/itex] and [itex]M[/itex] be two such fixpoints, [itex]\phi(L) = L[/itex] and [itex]\phi(M) = M[/itex], and suppose that they are different. Then there is a smallest value [itex]n_0[/itex] such that [itex]L(n_0) \neq M(n_0)[/itex]. This means that [itex]L|n_0 = M|n_0[/itex]. But then [itex]\phi(L)(n_0) = \phi(M)(n_0)[/itex], a contradiction. So there is at most one such fixpoint.

For every [itex]n \in \mathbb{N}[/itex], let [itex]S(n) \subset a^n[/itex] be the set of those functions [itex]f: n \rightarrow a[/itex] such that for all [itex]m \in n[/itex], [itex]f(m) = \phi_m(f|m)[/itex]. Clearly, either [itex]S(n)[/itex] is empty or [itex]S(n)[/itex] contains precisely one function [itex]g_n[/itex]. The set [itex]S(0)[/itex] is not empty. Suppose some [itex]S(n)[/itex] were empty, and let [itex]N^+[/itex] be the smallest natural number such that [itex]S(N^+)[/itex] is empty (it is a successor, since [itex]S(0)[/itex] is not empty). We may define a function [itex]h: N^+ \rightarrow a[/itex] by [itex]h|N = g_N[/itex] and [itex]h(N) = \phi_N(h|N)[/itex]. But this is a function in [itex]S(N^+)[/itex], a contradiction. So every [itex]S(n)[/itex] is non-empty.

Now define [itex]L(n) = g_{n^+}(n)[/itex]. Clearly, this function is our candidate for a fixpoint: To begin with, if [itex]n < m[/itex], then evidently, by the uniqueness of the elements of [itex]S(n)[/itex], [itex]g_m|n = g_n[/itex]. Therefore, [itex]L|n = g_n[/itex] for all [itex]n[/itex]. And [itex]L[/itex] is a fixpoint, in fact: [itex]L(n) = g_{n^+}(n) = \phi_n(g_{n^+}|n) = \phi_n(g_n) = \phi(g^{*}_{n})(n) = \phi(L)(n)[/itex], since [itex]L|n = g_n = g^{*}_{n}|n[/itex]. The claimed formula then follows by construction.
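
To see the construction in less abstract terms, here is a small computational sketch (my own illustration, not from the textbook), assuming the nice [itex]\phi[/itex] is handed to us through the finite maps [itex]\phi_n[/itex]:

[code]
# Sketch (not from the textbook): build the fixed point L one value at a time,
# assuming phi is "nice" and is given through the maps phi_n, where phi_n
# takes the list [f(0), ..., f(n-1)] and returns phi(f)(n).

def build_fixed_point(phi_n, k):
    """Return [L(0), ..., L(k-1)] for the fixed point L of a nice phi."""
    L = []                      # L[:n] plays the role of g_n = L|n
    for n in range(k):
        L.append(phi_n(L[:n]))  # L(n) = phi_n(L|n), as in the proof
    return L

# Hypothetical example phi_n: the factorial-producing phi from the note above.
def factorial_phi_n(prefix):
    n = len(prefix)
    return 1 if n == 0 else n * prefix[-1]

print(build_fixed_point(factorial_phi_n, 6))  # [1, 1, 2, 6, 24, 120]
[/code]

Each step only uses the values already built, which is exactly why the sets [itex]S(n)[/itex] in the proof can never be empty.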


I was hoping someone could explain this theorem and proof in a somewhat less abstract fashion and maybe point out flaws in my reasoning?

Any help is greatly appreciated!

Thanks!

Martijn
 
  • #2
What I get from this is:
Let [itex]\phi[/itex] be a function that maps [STRIKE]the result of[/STRIKE] a function that maps natural numbers to the set a, to [STRIKE]the result of[/STRIKE] another function that does the same.

If you'd pick two functions, say f and g, from the function set a[itex]^{\mathbb{N}}[/itex] and these return the same results for [STRIKE]an element n[/STRIKE] all inputs 0 through n-1 inclusive (i.e. f(0) = g(0), ..., f(n-1) = g(n-1)), then the [STRIKE]set of[/STRIKE] functions [itex]\phi[/itex] would yield, namely [itex]\phi(f)[/itex] and [itex]\phi(g)[/itex], [STRIKE]the same given f or g using n as input.[/STRIKE] return the same result on input n (i.e. [itex]\phi(f)(n) = \phi(g)(n)[/itex]).

They then [STRIKE]continue to use this to show[/STRIKE] assume that if [itex]\phi[/itex] is some arbitrary map of functions satisfying the above property, then there is a so-called fixed point in the set of functions which yields itself given itself as input to [itex]\phi[/itex]. But I don't see how this follows from the facts described above.
They construct the fixed point of [itex]\phi[/itex]. They call this fixed point [itex]L_\phi[/itex], and they give you a recursive definition of this function, although admittedly their description is a bit confusing because it involves this intermediate definition of the functions [itex]\phi_n[/itex]. Let's make sure you understand the basics first before understanding the construction of [itex]L_\phi[/itex]. I.e. make sure you understand what I've written above, the corrections and additions to your statements. Be especially clear about the fact that [itex]a^{\mathbb{N}}[/itex] is a set of functions, and a function [itex]a^{\mathbb{N}}\to a^{\mathbb{N}}[/itex] is a function whose inputs and outputs are functions! Meaning whole functions are being treated as individual objects, which can serve as inputs and outputs of yet other functions. You should also be clear that the [itex]\phi[/itex] in the statement of the theorem is not a particular function [itex]\phi[/itex]. Rather, it says that if [itex]\phi[/itex] is any function of functions which has a particular nice property, then it has a unique fixed point.
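
A rough Python analogy (my own sketch, not from this post) of a function whose inputs and outputs are themselves functions:

[code]
# A "function of functions": phi takes a whole function f: N -> N
# and returns a brand new function phi(f): N -> N.
def phi(f):
    def new_f(n):
        # phi(f)(n) looks only at f(0), ..., f(n-1), so this phi has the nice property.
        return 1 + sum(f(m) for m in range(n))
    return new_f

square = lambda n: n * n
g = phi(square)   # g is itself a function
print(g(4))       # 1 + (0 + 1 + 4 + 9) = 15
[/code]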
 
  • #3
Thank you very much, that certainly helped!

So the corrected line of thinking should be: if there are two functions f and g in the set of functions which return the same results for all inputs up to (but not including) n, then [itex]\phi(f)[/itex] and [itex]\phi(g)[/itex] return the same result at n.

I think I get the statement of the theorem: given such a [itex]\phi[/itex] on the set of functions, there exists at most one input (a function) for which [itex]\phi[/itex] yields that same input (the same function) as output. And I think I understand the proof by contradiction they provide to show there can be only one such function.

But I'm still lost in the details:
[itex]L_\phi(n^+)[/itex] maps [itex]n^+[/itex] to [itex]a[/itex]
[itex]\phi_{n^+}[/itex] is the set of functions that map [itex]n^+[/itex] to [itex]a[/itex]
[itex]L_\phi|n^+[/itex] is the set of results of [itex]L_\phi[/itex] from 0 up to [itex]n[/itex]

So how can [itex]L_\phi(n^+)[/itex] be equal to [itex]\phi_{n^+}(L_\phi|n^+)[/itex]?

Thanks again!

Martijn
 
  • #4
martijnh said:
Thank you very much, that certainly helped!

So the corrected line of thinking should be: if there are two functions f and g in the set of functions which return the same results for all inputs up to (but not including) n, then [itex]\phi(f)[/itex] and [itex]\phi(g)[/itex] return the same result at n.

You might find it helpful to visualize a function from N->X as a sequence of elements of X. That lets you see more clearly what they're getting at.
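
For instance (my own illustration, not from the post), listing the first few values turns the function into the sequence it represents:

[code]
# A function f: N -> X viewed as the sequence f(0), f(1), f(2), ...
f = lambda n: 2 * n + 1
print([f(n) for n in range(6)])   # the sequence 1, 3, 5, 7, 9, 11
[/code]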
 
  • #5
SteveL27 said:
You might find it helpful to visualize a function from N->X as a sequence of elements of X. That lets you see more clearly what they're getting at.

Hmmm if I'd simplify the case and substitute the set of functions by a function that maps [itex]\mathbb{N}[/itex] to [itex]\mathbb{N}[/itex], and the function inputs f & g by elements from [itex]\mathbb{N}[/itex]... it would basically say something very trivial; a function maps each input to one output, so if the input equals the output, that would be the fixpoint??
 
  • #6
Hmmm, what you're saying definitely isn't right. You're not mentioning [itex]\phi[/itex] at all.

We've got our set of natural numbers [itex]\mathbb{N} = \{ 0, 1, 2, \dots \}[/itex]. And then we've got some other random set [itex]a[/itex]. Now we can talk about functions from [itex]\mathbb{N}[/itex] to [itex]a[/itex]. We can imagine the collection of all such functions, and we call said collection [itex]a^{\mathbb{N}}[/itex]. If [itex]f\in a^{\mathbb{N}}[/itex], then [itex]f[/itex] inputs natural numbers [itex]n[/itex], and outputs something [itex]f(n) \in a[/itex].

Next, we can think about these sort of monster functions, which eat up entire functions as input (i.e. their inputs are not merely numbers, but entire functions), and spit out entire functions. We're interested now in talking about monsters [itex]\phi[/itex] which input elements in [itex]a^{\mathbb{N}}[/itex] and output elements in [itex]a^{\mathbb{N}}[/itex].

[tex]\phi \in \left(a^{\mathbb{N}}\right)^{a^{\mathbb{N}}}[/tex]

or, equivalently:

[tex]\phi : a^{\mathbb{N}} \to a^{\mathbb{N}}[/tex]

But we're not interested in any old [itex]\phi[/itex]. We're only interested in a certain kind of [itex]\phi[/itex], those that have a certain nice property. The Recursion Theorem says that if [itex]\phi[/itex] is any monster which has some particular nice property, then [itex]\phi[/itex] has a "fixed point," namely some input [itex]f[/itex] such that [itex]\phi(f) = f[/itex].

[tex]\forall \phi \in \left(a^{\mathbb{N}}\right)^{a^{\mathbb{N}}}\ \ \left(\phi\mbox{ nice }\Rightarrow \exists f \in a^{\mathbb{N}} : \phi(f) = f\right)[/tex]

The first question is, what is this "nice property?" We'll say [itex]\phi[/itex] is nice if for any two functions [itex]f,g\in a^{\mathbb{N}}[/itex], if it so happens that for some [itex]n\in\mathbb{N}[/itex]

[tex]f(0) = g(0), f(1) = g(1), \dots, f(n-1) = g(n-1)[/tex]

(i.e. [itex]f[/itex] and [itex]g[/itex] "agree up to [itex]n[/itex]") then when the monster eats up [itex]f[/itex] and [itex]g[/itex], the resulting functions it spits out "agree at [itex]n[/itex]", i.e. [itex]\phi(f) (n) = \phi(g) (n)[/itex].

Definition: For [itex]\phi : a^{\mathbb{N}} \to a^{\mathbb{N}}[/itex], we say [itex]\phi[/itex] is "nice" iff:

[tex]\forall f,g\in a^{\mathbb{N}} \forall n \in \mathbb{N} \left [\left(f(0) = g(0), \dots , f(n-1) = g(n-1)\right) \Rightarrow \phi(f) (n) = \phi(g) (n)\right][/tex]

There will be infinitely many nice monsters, and infinitely many not-nice monsters (assuming [itex]|a|>1[/itex]).

Again, let me stop here and make sure you've understood all this correctly before I go on and explain the proof of the theorem. There's a lot more to say about better ways to understand functions [itex]\mathbb{N} \to a[/itex], better ways to understand nice monsters [itex]a^{\mathbb{N}} \to a^{\mathbb{N}}[/itex], and understanding how the Recursion Theorem justifies the process of "defining functions by recursion" (e.g. you could define the Fibonacci function by [itex]f(0) = 0, f(1) = 1, f(n+2) = f(n+1) + f(n)[/itex], but how do you know there exists a unique function satisfying this definition? It seems obvious, but the rigorous answer is: The Recursion Theorem). But let's make sure we're on the same page with the basics first.
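
To tie this to the Fibonacci remark, here is a small Python sketch (my own illustration, not part of the original post) of a nice monster and its fixed point:

[code]
# Sketch: the Fibonacci "monster" phi. It eats a whole function f: N -> N
# and spits out a new function. Since phi(f)(n) depends only on
# f(0), ..., f(n-1), phi is nice, and its unique fixed point is the
# Fibonacci function.
def phi(f):
    def new_f(n):
        if n == 0:
            return 0
        if n == 1:
            return 1
        return f(n - 1) + f(n - 2)
    return new_f

# Feeding phi its own output over and over pins down more and more values:
# after k applications, the values at 0, ..., k no longer change.
f = lambda n: 0                    # any starting function will do
for _ in range(10):
    f = phi(f)
print([f(n) for n in range(10)])   # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
[/code]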
 
  • #7
Thank you very much for this lucid explanation, I wish my textbook was that clear!

I still wonder about the proof and the practical usefulness of the theorem though!

PS: I did not mention [itex]\phi[/itex] in my previous post as I thought Steve suggested I should look at it in a simplified way, where [itex]\phi[/itex] would be analogous to a function [itex]\mathbb{N} \rightarrow \mathbb{N}[/itex] and the input functions f & g analogous to elements of [itex]\mathbb{N}[/itex] (so comparing the construct to a regular function). But I now understand the gist of the recursion theorem to be about a property of functions.
 

FAQ: Understanding the recursion theorem

1. What is the recursion theorem?

The recursion theorem is a mathematical theorem that guarantees that a function defined by recursion, i.e. in terms of its own values at earlier arguments, exists and is unique. In practice this means that a function can be defined in terms of itself, allowing for the creation of complex and dynamic algorithms.

2. How does the recursion theorem work?

The recursion theorem works by breaking a problem down into smaller, simpler versions of itself. This is achieved by using a base case, which is the simplest form of the problem, and a recursive case, which uses the solution to the smaller version of the problem to solve the current version.
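
For example (an illustrative sketch, not from the original FAQ), the classic factorial function:

[code]
def factorial(n):
    if n == 0:                     # base case: the simplest form of the problem
        return 1
    return n * factorial(n - 1)    # recursive case: reuse the smaller solution

print(factorial(5))   # 120
[/code]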

3. What is the difference between recursion and iteration?

Recursion and iteration are both methods of solving problems by breaking them down into smaller parts. The main difference is that recursion involves calling a function within itself, while iteration involves using a loop to repeatedly execute a section of code.
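
For comparison (again just an illustrative sketch), the same computation written iteratively with a loop:

[code]
def factorial_iterative(n):
    result = 1
    for k in range(2, n + 1):   # a loop replaces the self-call
        result *= k
    return result

print(factorial_iterative(5))   # 120
[/code]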

4. What are the advantages of using recursion?

One advantage of using recursion is that it allows for a more concise and elegant solution to certain problems. It can also be easier to understand and implement in some cases. Note, however, that recursion typically uses more memory than iteration, since each recursive call adds a frame to the call stack.

5. What are some common applications of the recursion theorem?

The recursion theorem has many applications in mathematics and computer science. It is commonly used in algorithms for sorting, searching, and graph traversal. It is also used in the creation of fractal patterns and in the modeling of natural phenomena such as the growth of plants and the behavior of certain animals.
