Understanding the recursion theorem

In summary: The recursion theorem states that if [itex]\phi: a^{\mathbb{N}} \rightarrow a^{\mathbb{N}}[/itex] satisfies [itex]\phi(f)(n) = \phi(g)(n)[/itex] whenever [itex]f[/itex] and [itex]g[/itex] agree below [itex]n[/itex], then [itex]\phi[/itex] has exactly one fixed point [itex]L_\phi[/itex]. Uniqueness holds because two fixed points would have to agree at the first place they could differ; existence follows by building [itex]L_\phi[/itex] one value at a time.
  • #1
martijnh
Hi all,

I'm doing some self-study in set theory, but got kind of stuck with a proof my textbook gives of the so-called recursion theorem:

Theorem:
Let [itex]a[/itex] be a set, and let [itex]\phi: a^{\mathbb{N}} \rightarrow a^{\mathbb{N}}[/itex] be a function such that for every natural number [itex]n[/itex], if [itex]f, g \in a^{\mathbb{N}}[/itex] are such that [itex]f|n = g|n[/itex], then [itex]\phi(f)(n) = \phi(g)(n)[/itex]. Then [itex]\phi[/itex] has a unique fixpoint [itex]L_\phi \in a^{\mathbb{N}}[/itex], which means that [itex]\phi(L_\phi) = L_\phi[/itex]. Consider the function [itex]\phi_n : a^{n} \rightarrow a[/itex] which evaluates to [itex]\phi_n(g) = \phi(g^{*})(n)[/itex]. Then we have

[tex]L_\phi(0) = \phi_0(!: 0 \rightarrow a)[/tex]

[tex]L_\phi(n^+) = \phi_{n^+}(L_\phi|n^+)[/tex]
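
As a concrete illustration (my own example, not from the textbook): take [itex]a = \mathbb{N}[/itex] and define

[tex]\phi(f)(0) = 1, \qquad \phi(f)(n^+) = n^+ \cdot f(n).[/tex]

Since [itex]\phi(f)(n)[/itex] depends only on the values of [itex]f[/itex] below [itex]n[/itex], the hypothesis is satisfied, and the unique fixpoint is the factorial sequence [itex]L_\phi(n) = n![/itex].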


What I get from this is:
Let [itex]\phi[/itex] be a function that maps the result of a function that maps natural numbers to the set a, to the result of another function that does the same.

If you'd pick two functions from the function set a[itex]^{\mathbb{N}}[/itex] and these return the same result for an element n, then the set of functions [itex]\phi[/itex] would yield the same given f or g using n as input.

They then continue to use this to show there is a so-called fixpoint in the set of functions [itex]\phi[/itex] which yields itself given itself as input to [itex]\phi[/itex]. But I don't see how this follows from the facts described above...


Then they go on to prove there can be only one such fixpoint, but this makes even less sense to me:

Proof:
There is at most one such fixpoint. In fact, let [itex]L[/itex] and [itex]M[/itex] be two such fixpoints, [itex]\phi(L) = L[/itex] and [itex]\phi(M) = M[/itex], and suppose that they are different. Then there is a smallest value [itex]n_0[/itex] such that [itex]L(n_0) \neq M(n_0)[/itex]. This means that [itex]L|n_0 = M|n_0[/itex]. But then [itex]\phi(L)(n_0) = \phi(M)(n_0)[/itex], a contradiction. So there is at most one such fixpoint.

For every [itex]n \in \mathbb{N}[/itex], let [itex]S(n) \subset a^n[/itex] be the set of those functions [itex]f: n \rightarrow a[/itex] such that for all [itex]m \in n[/itex], [itex]f(m) = \phi_m(f|m)[/itex]. Clearly, either [itex]S(n)[/itex] is empty or [itex]S(n)[/itex] contains precisely one function [itex]g_n[/itex]. The set [itex]S(0)[/itex] is not empty. Suppose some [itex]S(n)[/itex] were empty, and let [itex]N^+[/itex] be the smallest natural number such that [itex]S(N^+)[/itex] is empty (it is a successor, since [itex]S(0)[/itex] is not empty). We may define a function [itex]h: N^+ \rightarrow a[/itex] by [itex]h|N = g_N[/itex] and [itex]h(N) = \phi_N(h|N)[/itex]. But this is a function in [itex]S(N^+)[/itex], a contradiction. So every [itex]S(n)[/itex] is non-empty.

Now define [itex]L(n) = g_{n^+}(n)[/itex]. Clearly, this function is our candidate for a fixpoint: To begin with, if [itex]n < m[/itex], then evidently, by the uniqueness of the elements of [itex]S(n)[/itex], [itex]g_m|n = g_n[/itex]. Therefore, [itex]L|n = g_n[/itex] for all [itex]n[/itex]. And [itex]L[/itex] is a fixpoint, in fact: [itex]L(n) = g_{n^+}(n) = \phi_n(g_{n^+}|n) = \phi_n(g_n) = \phi(g^{*}_{n})(n) = \phi(L)(n)[/itex], since [itex]L|n = g_n = g^{*}_{n}|n[/itex]. The claimed formula then follows by construction.
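
To see the construction in less abstract terms, here is a small computational sketch (my own illustration, not from the textbook), assuming the nice [itex]\phi[/itex] is handed to us through the finite maps [itex]\phi_n[/itex]:

[code]
# Sketch (not from the textbook): build the fixed point L one value at a time,
# assuming phi is "nice" and is given through the maps phi_n, where phi_n
# takes the list [f(0), ..., f(n-1)] and returns phi(f)(n).

def build_fixed_point(phi_n, k):
    """Return [L(0), ..., L(k-1)] for the fixed point L of a nice phi."""
    L = []                      # L[:n] plays the role of g_n = L|n
    for n in range(k):
        L.append(phi_n(L[:n]))  # L(n) = phi_n(L|n), as in the proof
    return L

# Hypothetical example phi_n: the factorial-producing phi from the note above.
def factorial_phi_n(prefix):
    n = len(prefix)
    return 1 if n == 0 else n * prefix[-1]

print(build_fixed_point(factorial_phi_n, 6))  # [1, 1, 2, 6, 24, 120]
[/code]

Each step only uses the values already built, which is exactly why the sets [itex]S(n)[/itex] in the proof can never be empty.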


I was hoping someone could explain this theorem and proof in a somewhat less abstract fashion and maybe point out flaws in my reasoning?

Any help is greatly appreciated!

Thanks!

Martijn
 
  • #2
What I get from this is:
Let [itex]\phi[/itex] be a function that maps [STRIKE]the result of[/STRIKE] a function that maps natural numbers to the set a, to [STRIKE]the result of[/STRIKE] another function that does the same.

If you'd pick two functions, say f and g, from the function set a[itex]^{\mathbb{N}}[/itex] and these return the same results for [STRIKE]an element n[/STRIKE] all inputs 0 through n-1 inclusive (i.e. f(0) = g(0), ..., f(n-1) = g(n-1)), then the [STRIKE]set of[/STRIKE] functions [itex]\phi[/itex] would yield, namely [itex]\phi(f)[/itex] and [itex]\phi(g)[/itex], [STRIKE]the same given f or g using n as input.[/STRIKE] return the same result on input n (i.e. [itex]\phi(f)(n) = \phi(g)(n)[/itex]).

They then [STRIKE]continue to use this to show[/STRIKE] assume that if [itex]\phi[/itex] is some arbitrary map of functions satisfying the above property, then there is a so-called fixed point in the set of functions which yields itself given itself as input to [itex]\phi[/itex]. But I don't see how this follows from the facts described above.
They construct the fixed point of [itex]\phi[/itex]. They call this fixed point [itex]L_\phi[/itex], and they give you a recursive definition of this function, although admittedly their description is a bit confusing because it involves this intermediate definition of the functions [itex]\phi_n[/itex]. Let's make sure you understand the basics first before understanding the construction of [itex]L_\phi[/itex]. I.e. make sure you understand what I've written above, the corrections and additions to your statements. Be especially clear about the fact that [itex]a^{\mathbb{N}}[/itex] is a set of functions, and a function [itex]a^{\mathbb{N}}\to a^{\mathbb{N}}[/itex] is a function whose inputs and outputs are functions! Meaning whole functions are being treated as individual objects, which can serve as inputs and outputs of yet other functions. You should also be clear that the [itex]\phi[/itex] in the statement of the theorem is not a particular function [itex]\phi[/itex]. Rather, it says that if [itex]\phi[/itex] is any function of functions which has a particular nice property, then it has a unique fixed point.
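
A rough Python analogy (my own sketch, not from this post) of a function whose inputs and outputs are themselves functions:

[code]
# A "function of functions": phi takes a whole function f: N -> N
# and returns a brand new function phi(f): N -> N.
def phi(f):
    def new_f(n):
        # phi(f)(n) looks only at f(0), ..., f(n-1), so this phi has the nice property.
        return 1 + sum(f(m) for m in range(n))
    return new_f

square = lambda n: n * n
g = phi(square)   # g is itself a function
print(g(4))       # 1 + (0 + 1 + 4 + 9) = 15
[/code]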
 
  • #3
Thank you very much, that certainly helped!

So the corrected line of thinking should be: if there are two functions f and g in the set of functions which return the same results for all inputs up to (but not including) n, then [itex]\phi(f)[/itex] and [itex]\phi(g)[/itex] return the same result at n.

I think I get the statement of the theorem: given such a [itex]\phi[/itex] on the set of functions, there exists at most one input (a function) for which [itex]\phi[/itex] yields that same input (the same function) as output. And I think I understand the proof by contradiction they provide to show there can be only one such function.

But I'm still lost in the details:
[itex]L_\phi(n^+)[/itex] maps [itex]n^+[/itex] to [itex]a[/itex]
[itex]\phi_{n^+}[/itex] is the set of functions that map [itex]n^+[/itex] to [itex]a[/itex]
[itex]L_\phi|n^+[/itex] is the set of results of [itex]L_\phi[/itex] from 0 up to [itex]n[/itex]

So how can [itex]L_\phi(n^+)[/itex] be equal to [itex]\phi_{n^+}(L_\phi|n^+)[/itex]?

Thanks again!

Martijn
 
  • #4
martijnh said:
Thank you very much, that certainly helped!

So the corrected line of thinking should be: if there are two functions f and g in the set of functions which return the same results for all inputs up to (but not including) n, then [itex]\phi(f)[/itex] and [itex]\phi(g)[/itex] return the same result at n.

You might find it helpful to visualize a function from N->X as a sequence of elements of X. That lets you see more clearly what they're getting at.
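
For instance (my own illustration, not from the post), listing the first few values turns the function into the sequence it represents:

[code]
# A function f: N -> X viewed as the sequence f(0), f(1), f(2), ...
f = lambda n: 2 * n + 1
print([f(n) for n in range(6)])   # the sequence 1, 3, 5, 7, 9, 11
[/code]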
 
  • #5
SteveL27 said:
You might find it helpful to visualize a function from N->X as a sequence of elements of X. That lets you see more clearly what they're getting at.

Hmmm if I'd simplify the case and substitute the set of functions by a function that maps [itex]\mathbb{N}[/itex] to [itex]\mathbb{N}[/itex], and the function inputs f & g by elements from [itex]\mathbb{N}[/itex]... it would basically say something very trivial; a function maps each input to one output, so if the input equals the output, that would be the fixpoint??
 
  • #6
Hmmm, what you're saying definitely isn't right. You're not mentioning [itex]\phi[/itex] at all.

We've got our set of natural numbers [itex]\mathbb{N} = \{ 0, 1, 2, \dots \}[/itex]. And then we've got some other random set [itex]a[/itex]. Now we can talk about functions from [itex]\mathbb{N}[/itex] to [itex]a[/itex]. We can imagine the collection of all such functions, and we call said collection [itex]a^{\mathbb{N}}[/itex]. If [itex]f\in a^{\mathbb{N}}[/itex], then [itex]f[/itex] inputs natural numbers [itex]n[/itex], and outputs something [itex]f(n) \in a[/itex].

Next, we can think about these sort of monster functions, which eat up entire functions as input (i.e. their inputs are not merely numbers, but entire functions), and spit out entire functions. We're interested now in talking about monsters [itex]\phi[/itex] which input elements in [itex]a^{\mathbb{N}}[/itex] and output elements in [itex]a^{\mathbb{N}}[/itex].

[tex]\phi \in \left(a^{\mathbb{N}}\right)^{a^{\mathbb{N}}}[/tex]

or, equivalently:

[tex]\phi : a^{\mathbb{N}} \to a^{\mathbb{N}}[/tex]

But we're not interested in any old [itex]\phi[/itex]. We're only interested in a certain kind of [itex]\phi[/itex], those that have a certain nice property. The Recursion Theorem says that if [itex]\phi[/itex] is any monster which has some particular nice property, then [itex]\phi[/itex] has a "fixed point," namely some input [itex]f[/itex] such that [itex]\phi(f) = f[/itex].

[tex]\forall \phi \in \left(a^{\mathbb{N}}\right)^{a^{\mathbb{N}}}\ \ \left(\phi\mbox{ nice }\Rightarrow \exists f \in a^{\mathbb{N}} : \phi(f) = f\right)[/tex]

The first question is, what is this "nice property?" We'll say [itex]\phi[/itex] is nice if for any two functions [itex]f,g\in a^{\mathbb{N}}[/itex], if it so happens that for some [itex]n\in\mathbb{N}[/itex]

[tex]f(0) = g(0), f(1) = g(1), \dots, f(n-1) = g(n-1)[/tex]

(i.e. [itex]f[/itex] and [itex]g[/itex] "agree up to [itex]n[/itex]") then when the monster eats up [itex]f[/itex] and [itex]g[/itex], the resulting functions it spits out "agree at [itex]n[/itex]", i.e. [itex]\phi(f) (n) = \phi(g) (n)[/itex].

Definition: For [itex]\phi : a^{\mathbb{N}} \to a^{\mathbb{N}}[/itex], we say [itex]\phi[/itex] is "nice" iff:

[tex]\forall f,g\in a^{\mathbb{N}} \forall n \in \mathbb{N} \left [\left(f(0) = g(0), \dots , f(n-1) = g(n-1)\right) \Rightarrow \phi(f) (n) = \phi(g) (n)\right][/tex]

There will be infinitely many nice monsters, and infinitely many not-nice monsters (assuming [itex]|a|>1[/itex]).

Again, let me stop here and make sure you've understood all this correctly before I go on and explain the proof of the theorem. There's a lot more to say about better ways to understand functions [itex]\mathbb{N} \to a[/itex], better ways to understand nice monsters [itex]a^{\mathbb{N}} \to a^{\mathbb{N}}[/itex], and understanding how the Recursion Theorem justifies the process of "defining functions by recursion" (e.g. you could define the Fibonacci function by [itex]f(0) = 0, f(1) = 1, f(n+2) = f(n+1) + f(n)[/itex], but how do you know there exists a unique function satisfying this definition? It seems obvious, but the rigorous answer is: The Recursion Theorem). But let's make sure we're on the same page with the basics first.
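
To tie this to the Fibonacci remark, here is a small Python sketch (my own illustration, not part of the original post) of a nice monster and its fixed point:

[code]
# Sketch: the Fibonacci "monster" phi. It eats a whole function f: N -> N
# and spits out a new function. Since phi(f)(n) depends only on
# f(0), ..., f(n-1), phi is nice, and its unique fixed point is the
# Fibonacci function.
def phi(f):
    def new_f(n):
        if n == 0:
            return 0
        if n == 1:
            return 1
        return f(n - 1) + f(n - 2)
    return new_f

# Feeding phi its own output over and over pins down more and more values:
# after k applications, the values at 0, ..., k no longer change.
f = lambda n: 0                    # any starting function will do
for _ in range(10):
    f = phi(f)
print([f(n) for n in range(10)])   # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
[/code]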
 
  • #7
Thank you very much for this lucid explanation, I wish my textbook was that clear!

I still wonder about the proof and the practical usefulness of the theorem though!

PS: I did not mention [itex]\phi[/itex] in my previous post as I thought Steve suggested I should look at it in a simplified way, where [itex]\phi[/itex] would be analogous to a function [itex]\mathbb{N} \rightarrow \mathbb{N}[/itex] and the input functions f & g analogous to elements of [itex]\mathbb{N}[/itex] (so comparing the construct to a regular function). But I now understand the gist of the recursion theorem to be about a property of functions.
 

FAQ: Understanding the recursion theorem

1. What is the recursion theorem?

The recursion theorem is a mathematical theorem that guarantees that a function defined by recursion, i.e. in terms of its own values at earlier arguments, exists and is unique. In practice this means that a function can be defined in terms of itself, allowing for the creation of complex and dynamic algorithms.

2. How does the recursion theorem work?

The recursion theorem works by breaking a problem down into smaller, simpler versions of itself. This is achieved by using a base case, which is the simplest form of the problem, and a recursive case, which uses the solution to the smaller version of the problem to solve the current version.
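
For example (an illustrative sketch, not from the original FAQ), the classic factorial function:

[code]
def factorial(n):
    if n == 0:                     # base case: the simplest form of the problem
        return 1
    return n * factorial(n - 1)    # recursive case: reuse the smaller solution

print(factorial(5))   # 120
[/code]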

3. What is the difference between recursion and iteration?

Recursion and iteration are both methods of solving problems by breaking them down into smaller parts. The main difference is that recursion involves calling a function within itself, while iteration involves using a loop to repeatedly execute a section of code.
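
For comparison (again just an illustrative sketch), the same computation written iteratively with a loop:

[code]
def factorial_iterative(n):
    result = 1
    for k in range(2, n + 1):   # a loop replaces the self-call
        result *= k
    return result

print(factorial_iterative(5))   # 120
[/code]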

4. What are the advantages of using recursion?

One advantage of using recursion is that it allows for a more concise and elegant solution to certain problems. It can also be easier to understand and implement in some cases. Note, however, that recursion typically uses more memory than iteration, since each recursive call adds a frame to the call stack.

5. What are some common applications of the recursion theorem?

The recursion theorem has many applications in mathematics and computer science. It is commonly used in algorithms for sorting, searching, and graph traversal. It is also used in the creation of fractal patterns and in the modeling of natural phenomena such as the growth of plants and the behavior of certain animals.
