One thing I don't understand about Cantor's diagonal argument

  • B
  • Thread starter Warp
  • Start date
  • Tags
    Argument
In summary: X: 111111...which indeed is not in our set... because it's not a number at all. A base-2 number with an infinite amount of 1-digits is just infinity, which isn't a number (natural or real) by definition. Thus the diagonal argument didn't actually create a new number that's not in the set, because it merely created infinity, an invalid "number"."But", you might argue, "the argument is about proving that the real numbers between 0 and 1 are uncountable!" That doesn't really make a difference because we can simply make those be binary values between 0.0 and 1.0:0:
  • #1
Warp
131
13
Cantor's diagonal argument, essentially, proves (or demonstrates, as I'm not exactly sure if it's considered a mathematically rigorous proof) that the set of all real numbers is uncountable, ie. essentially larger than the set of natural numbers.

It does this by, essentially, proof by contradiction: We start with the assumption that the set of all real numbers is countable, we list all these numbers (which should be possible if they are countable), and we proceed to construct a number that's not in the set. Thus our original assumption that we listed all the numbers was proven incorrect, and thus the premise that the set of real numbers is countable is incorrect.

But thinking about the argument, I can't help but get the feeling that it's circular: It assumes what it's trying to prove.

What do I mean by this?

If we start with the assumption that the set of all real numbers is countable, that means that we can assume that we can create a bijection between this set and the set of natural numbers. In other words, that we can "remap" all the numbers to the natural numbers (which would essentially be their position in the list), and this shouldn't make a difference. If the set is countable, then it shouldn't really matter how we represent the values, as long as they are unique (ie. it's a true bijection.)

Thus we can represent every number in the set with consecutive natural numbers. We can do it eg. in base-2, starting with the least-significant digit first, like:

0: 000000...
1: 100000...
2: 010000...
3: 110000...
4: 001000...
5: 101000...

and so on and so forth. But now if we construct a new number by taking the first digit of the first number and inverting it, the second digit of the second number and inverting it and so on, we get a new number like:

X: 111111...

which indeed is not in our set... because it's not a number at all. A base-2 number with an infinite amount of 1-digits is just infinity, which isn't a number (natural or real) by definition. Thus the diagonal argument didn't actually create a new number that's not in the set, because it merely created infinity, an invalid "number".

"But", you might argue, "the argument is about proving that the real numbers between 0 and 1 are uncountable!" That doesn't really make a difference because we can simply make those be binary values between 0.0 and 1.0:

0: 1.000000...
1: 0.000000...
2: 0.100000...
3: 0.010000...
4: 0.110000...
5: 0.001000...

and now the diagonal argument constructs the number:

X: 0.111111...

which is just 1.0, which already exists in our original set! The diagonal argument failed to create a new number that's not in the set.

"But", you may once again argue, "you can't represent irrational numbers like pi or square root of 2 like that because they have an infinite representation and they can't be mapped to a natural number because... oh..."

Which is precisely where the circularity of the argument arises: It already assumes that there are numbers that cannot be enumerated like this... and proceeds to prove that there are numbers that cannot be enumerated like this.
 
Physics news on Phys.org
  • #2
Warp said:
Cantor's diagonal argument, essentially, proves (or demonstrates, as I'm not exactly sure if it's considered a mathematically rigorous proof) that the set of all real numbers is uncountable, ie. essentially larger than the set of natural numbers.

It does this by, essentially, proof by contradiction: We start with the assumption that the set of all real numbers is countable, we list all these numbers (which should be possible if they are countable), and we proceed to construct a number that's not in the set. Thus our original assumption that we listed all the numbers was proven incorrect, and thus the premise that the set of real numbers is countable is incorrect.

But thinking about the argument, I can't help but get the feeling that it's circular: It assumes what it's trying to prove.

What do I mean by this?

If we start with the assumption that the set of all real numbers is countable, that means that we can assume that we can create a bijection between this set and the set of natural numbers. In other words, that we can "remap" all the numbers to the natural numbers (which would essentially be their position in the list), and this shouldn't make a difference. If the set is countable, then it shouldn't really matter how we represent the values, as long as they are unique (ie. it's a true bijection.)

Thus we can represent every number in the set with consecutive natural numbers. We can do it eg. in base-2, starting with the least-significant digit first, like:

0: 000000...
1: 100000...
2: 010000...
3: 110000...
4: 001000...
5: 101000...

and so on and so forth. But now if we construct a new number by taking the first digit of the first number and inverting it, the second digit of the second number and inverting it and so on, we get a new number like:

X: 111111...

which indeed is not in our set... because it's not a number at all. A base-2 number with an infinite amount of 1-digits is just infinity, which isn't a number (natural or real) by definition. Thus the diagonal argument didn't actually create a new number that's not in the set, because it merely created infinity, an invalid "number".

"But", you might argue, "the argument is about proving that the real numbers between 0 and 1 are uncountable!" That doesn't really make a difference because we can simply make those be binary values between 0.0 and 1.0:

0: 1.000000...
1: 0.000000...
2: 0.100000...
3: 0.010000...
4: 0.110000...
5: 0.001000...

and now the diagonal argument constructs the number:

X: 0.111111...

which is just 1.0, which already exists in our original set! The diagonal argument failed to create a new number that's not in the set.

"But", you may once again argue, "you can't represent irrational numbers like pi or square root of 2 like that because they have an infinite representation and they can't be mapped to a natural number because... oh..."

Which is precisely where the circularity of the argument arises: It already assumes that there are numbers that cannot be enumerated like this... and proceeds to prove that there are numbers that cannot be enumerated like this.
The diagonal argument doesn't quite work in base 2. You need to represent the numbers in another base.

All you have is a failed attempt at a proof.
 
  • #4
Warp said:
Cantor's diagonal argument, essentially, proves (or demonstrates, as I'm not exactly sure if it's considered a mathematically rigorous proof) that the set of all real numbers is uncountable, ie. essentially larger than the set of natural numbers.

Cantor's diagonal argument is a mathematically rigorous proof, but not of quite the proposition you state. It is a mathematically rigorous proof that the set of all infinite sequences of binary digits is uncountable. That set is not the same as the set of all real numbers.

It can be proven that there is a bijection between the set of all infinite sequence of binary digits and the set of all real numbers, which, combined with Cantor's diagonal argument, shows that the set of all real numbers is uncountable. But AFAIK Cantor never actually produced any proof that such a bijection exists.

Cantor did publish a different proof that the set of real numbers is uncountable, not using the diagonal argument:

https://en.wikipedia.org/wiki/Cantor's_first_set_theory_article
 
  • Like
Likes WWGD and etotheipi
  • #5
Warp said:
But thinking about the argument, I can't help but get the feeling that it's circular: It assumes what it's trying to prove.

What do I mean by this?

If we start with the assumption that the set of all real numbers is countable,
A circular argument is where you assume what you are trying to prove. That is invalid. A proof by contradiction is where you assume what you are trying to disprove. That is completely different and it is valid.
The Cantor diagonal proof is a valid proof by contradiction.
 
  • #6
FactChecker said:
A circular argument is where you assume what you are trying to prove. That is invalid. A proof by contradiction is where you assume what you are trying to disprove. That is completely different and it is valid.
The Cantor diagonal proof is a valid proof by contradiction.
I don't think you understood my post. I did not say that the argument is circular because it starts with the assumption that the set of real numbers is countable. The reason why I think it's circular is stated at the end of my post, not the part which you quoted.
 
  • #7
PeterDonis said:
Cantor's diagonal argument is a mathematically rigorous proof, but not of quite the proposition you state. It is a mathematically rigorous proof that the set of all infinite sequences of binary digits is uncountable. That set is not the same as the set of all real numbers.
Technically speaking you are completely right, as it indeed deals with infinite strings of digits (without taking a stance on whether the set of them has a bijection to the set of real numbers). But still, I don't think that changes the essence of my question.

If we start with the assumption that the set of all possible infinite strings is countable (which is the assumption we are trying to disprove), then it ought to make no difference if we remap every element in this set to the natural numbers. This is because if you argue that you can't do this kind of remapping without changing something essential about the situation, you are arguing that there is no such bijection between the set of all infinite strings and the set of natural numbers... which is the very thing we are trying to prove. By saying "you can't do this remapping" you would be already assuming what you are trying to prove, which would be a circular argument.

I'm certain that I'm missing something here, that I have misunderstood something, but I'm not sure what.
 
  • #8
Warp said:
Technically speaking you are completely right, as it indeed deals with infinite strings of digits (without taking a stance on whether the set of them has a bijection to the set of real numbers). But still, I don't think that changes the essence of my question.

If we start with the assumption that the set of all possible infinite strings is countable (which is the assumption we are trying to disprove), then it ought to make no difference if we remap every element in this set to the natural numbers. This is because if you argue that you can't do this kind of remapping without changing something essential about the situation, you are arguing that there is no such bijection between the set of all infinite strings and the set of natural numbers... which is the very thing we are trying to prove. By saying "you can't do this remapping" you would be already assuming what you are trying to prove, which would be a circular argument.

I'm certain that I'm missing something here, that I have misunderstood something, but I'm not sure what.

Cantor's proof has a subtlety. Many rationals can be expressed in two ways. E.g. ##0.4999 \dots = 0.500 \dots##. The proof in base 10 needs to take account of this and avoid the possibility of getting a string of nines and/or a string of zeros. This is easy to do. For example, you simply never pick 0 or 9, as there are always other digits to choose from.

If you try in base 2, the problem of avoiding this gets harder, which is what you found.

You might be able to repair the proof using base 2, but better to stick to base 10. A proof is a proof.
 
  • Like
Likes FactChecker and etotheipi
  • #9
PeroK said:
If you try in base 2, the problem of avoiding this gets harder, which is what you found.
Yeah, I suppose that if I used my remapping trickery in base 3 (or higher) rather than base 2, then it would be more obvious that the argument is sound, as it's now easy to create a new number that doesn't exist in the set (and isn't just infinity).

But then... how do I know that the generated number wasn't actually in the original set that got remapped, and this diagonalization technique didn't simply replicate one of those numbers? Or does this even matter?

The argument, as I have changed it, is now essentially saying "if you remap the set of all infinite strings of digits to the set of all finite strings of digits, there are no infinite strings of digits in this new set", which seems to be self-evidently true, almost tautological?

I think there's something fundamentally wrong with my idea of remapping the supposedly-countable set to the natural numbers, but my very limited mathematical knowledge doesn't allow me to fully understand the basic problem in it.
 
  • #10
Warp said:
But then... how do I know that the generated number wasn't actually in the original set that got remapped, and this diagonalization technique didn't simply replicate one of those numbers? Or does this even matter?

In the diagonal argument it is important that a) the string is not in the list and b) the string is not equal to another (different) string in the list when the two strings are mapped to real numbers. a) is easy; b) is the subtlety.

There is no need for any further trickery!
 
  • #11
Warp said:
I don't think you understood my post. I did not say that the argument is circular because it starts with the assumption that the set of real numbers is countable. The reason why I think it's circular is stated at the end of my post, not the part which you quoted.
The issue with multiple representations of a number is not an assumption that creates a circular argument. It is a different problem that is easily avoided by using the base-10 representation. As @PeroK indicated, that gives you much more freedom to generate unlisted numbers while avoiding the problem of multiple representations. It is not "trickery"; it is a more natural approach. Another thing I like about using the decimal representation is that it gives you so much freedom to create vastly more unlisted numbers than listed numbers. So it prepares you for the introduction of transfinite numbers beyond ##\aleph_0##.
 
Last edited:
  • #12
Warp said:
The argument, as I have changed it, is now essentially saying "if you remap the set of all infinite strings of digits to the set of all finite strings of digits, there are no infinite strings of digits in this new set", which seems to be self-evidently true, almost tautological?
No, you have that the wrong way round; the argument is that if you try to remap the set of all infinite strings of digits to the set of all finite strings of digits I can always show that you have not succeeded by creating an infinite string of digits which you have not yet mapped.
 
  • #13
pbuk said:
No, you have that the wrong way round; the argument is that if you try to remap the set of all infinite strings of digits to the set of all finite strings of digits I can always show that you have not succeeded by creating an infinite string of digits which you have not yet mapped.
Actually no, because that was precisely one of the questions I asked in that particular post: "How do we know that this new generated number wasn't actually in the original set, which we remapped to the natural numbers?"

Anyway, I may have expressed myself poorly in that paragraph which you are quoting. I was not trying to say "there's a flaw in Cantor's diagonal argument because (the reason I described)", but instead, "there's probably a flaw in my argument because (the reason I described), I just don't know exactly what this flaw is."

What I'm trying to say is that if I make the assumption that the set of all infinite strings is countable, and using this assumption assume that I can "remap" all these strings to the natural numbers, we don't even need the diagonalization operation to create a number that's not in this new set. We simply need any infinite string that contains an infinite amount of 1-digits (in base 3 or higher), and we have a string that doesn't exist in this new set.

However, something doesn't feel right here. We have no way of knowing if this string didn't already exist in the original set, and was simply remapped to some natural number. Thus the argument would just say, like "we remap all the original numbers, like 1/3, to the natural numbers... and now there's no 1/3 in this new set!" Which is completely tautological. Of course there isn't, because we just remapped it to a natural number. But it isn't any sort of proof either. It doesn't prove anything.

There must be some fundamental mistake I'm doing in this whole thing that I'm not grasping.
 
  • #14
Warp said:
Of course there isn't, because we just remapped it to a natural number.

There must be some fundamental mistake I'm doing in this whole thing that I'm not grasping.

Are you imagining that a typical item is the list is:
$$0.n_1n_2n_3 \dots$$
Where ##n_1n_2n_3 \dots## is a natural number?
 
  • #15
Here is a point that might simplify things. It is perfectly ok to have numbers on the list multiple times in different representations. If that list does not contain all numbers, then clearly a cleaner list, without repeats, would also not contain all numbers. Where you want to be careful is when you generate the unlisted number. Then you want to avoid generating any number that might have other representations which might be on the list. That is easy to avoid in base 10. Just generate a number that does not end in infinite 0's or 9's. Throw in other digits 1,2,..,8, (also different from the digit on the list in that position) periodically.
 
Last edited:
  • Like
Likes pbuk
  • #16
Warp said:
If we start with the assumption that the set of all possible infinite strings is countable

That's not the assumption we start with.

We start with the assumption that we have a sequence of infinite strings of binary digits. We do not assume that this sequence contains all possible infinite strings of binary digits. We simply ask the question: can we, given any such sequence, construct an infinite string of binary digits that cannot be in the sequence? The diagonal argument proves, by construction, that the answer to this question is yes.

This argument, by itself, does not prove that the set of infinite strings of binary digits is uncountable. To prove that, we need an additional argument as follows:

Premise #1: If the set of infinite strings of binary digits is countable, then there must exist some sequence of infinite strings of binary digits that contains all of them.

Premise #2: There does not exist any such sequence (by the argument above).

Conclusion: The set of infinite strings of binary digits is not countable.

Premise #1 does contain (as the first clause of a hypothetical) the assumption that the set of infinite strings of binary digits is countable--but Premise #1, as noted above, is not the premise used in the diagonalization construction. That construction--the proof that the answer to the question asked above is yes--makes no such assumption. So there is no circularity anywhere.
 
  • Like
Likes pbuk
  • #17
Warp said:
it ought to make no difference if we remap every element in this set to the natural numbers

The easiest way to avoid any issues with "mapping" (which, as others have noted, can be avoided by using some other representation for numbers than sequences of binary digits to do the mapping, but that involves a more complicated argument) is to note that there is an independent proof of the theorem that there is a bijection between infinite sequences of binary digits and the real numbers (not the natural numbers). So once we have proven that the set of infinite sequences of binary digits is uncountable, we can use that theorem about the existence of a bijection to prove that the set of real numbers is uncountable, without ever having to deal with any issues of how exactly we "map" infinite sequences of binary digits to numbers.
 
  • #18
PeterDonis said:
If we start with the assumption that the set of all possible infinite strings is countable
That's not the assumption we start with.

We start with the assumption that we have a sequence of infinite strings of binary digits. We do not assume that this sequence contains all possible infinite strings of binary digits.
But we have to assume that the list is countable, and in fact remappable to the natural numbers, else we wouldn't be able to list it, and most importantly we would not be able to say "take the first digit of the first string, the second digit of the second string, etc." The very fact that we are talking about "the first string", "the second string" and so on requires that we assume the list of strings is countable, ie. enumerable, ie. mappable to the natural numbers.
 
  • #19
Warp said:
we have to assume that the list is countable

We have to assume that we have a countable list of infinite binary strings, yes, since that's the definition of a "sequence". But we do not have to assume that that list contains all infinite binary strings. So we are not assuming that the set of all infinite binary strings is countable. We are only assuming that the set of infinite binary strings that appears in our list is countable. In other words, we are assuming that there exists a countable set of infinite binary strings. But this assumption is obviously true, since we know the set of all possible infinite binary strings is not finite, and any infinite set contains a countable subset.
 
  • #20
One thing that would invalidate the proof is if we assumed what we want to prove. We want to prove that the countable list does not contain all the reals. We have never assumed that -- we proved it.
 
  • #21
Warp said:
But we have to assume that the list is countable, and in fact remappable to the natural numbers, else we wouldn't be able to list it, and most importantly we would not be able to say "take the first digit of the first string, the second digit of the second string, etc." The very fact that we are talking about "the first string", "the second string" and so on requires that we assume the list of strings is countable, ie. enumerable, ie. mappable to the natural numbers.

Here we assume the opposite (negation) of what we want to prove. We assume the reals are countable; obtain a contradiction and conclude they must be uncountable.

This is called a proof by contradiction. For example, you can prove there are infinitely many primes using a proof by contradiction:

1) Assume there is only a finite number of primes.
2) Produce a contradiction (something you already know to be false, or that contradicts the assumption 1).
3) Conclude that the number of primes cannot be finite.
 
  • #22
@PeroK
There are two ways (both closely related) to look at it. The one you have described is one way. The other is described at the beginning of post#16:
PeterDonis said:
We start with the assumption that we have a sequence of infinite strings of binary digits. We do not assume that this sequence contains all possible infinite strings of binary digits. We simply ask the question: can we, given any such sequence, construct an infinite string of binary digits that cannot be in the sequence? The diagonal argument proves, by construction, that the answer to this question is yes.
P.S. pointing this out for benefit of OP
 
Last edited:
  • #23
In answer to the original question: Cantor's proof is indeed a proof and it is entirely rigorous.

In outline, it shows that IF there existed a one-to-one correspondence between the positive integers Z+ and the real numbers, then it is always possible to find a real number that no integer in Z+ corresponds to.

Which instantly shows that the IF clause is not possible. That's what it would mean for the set of real numbers and the set of integers to have "the same size". Further, since the integers form a subset of the real numbers, this shows that the set of real numbers is of a "larger size" than the set of integers.
 
  • #24
PeroK said:
The diagonal argument doesn't quite work in base 2. You need to represent the numbers in another base.

All you have is a failed attempt at a proof.
Cantor used base 2 in his 1891 paper.

From: https://en.wikipedia.org/wiki/Cantor's_diagonal_argument#
1596923922053.png


An illustration of Cantor's diagonal argument (in base 2) for the existence of uncountable sets. The sequence at the bottom cannot occur anywhere in the enumeration of sequences above.
 
  • Like
Likes PeroK
  • #25
zinq said:
In outline, it shows that IF there existed a one-to-one correspondence between the positive integers Z+ and the real numbers, then it is always possible to find a real number that no integer in Z+ corresponds to.

As already noted, the diagonal argument does not work directly with real numbers; it works with infinite sequences of binary digits. That avoids a number of complications which have been discussed in this thread.
 
  • #26
sysprog said:
Cantor used base 2

More precisely, he used the set of infinite sequences of binary digits, not the set of real numbers. As has been noted, it is possible to show that there is a bijection between those two sets, but Cantor did not do so and did not make use of any such bijection in his argument. His purpose was not to show that the real numbers, specifically, were uncountable, but that there must exist some set that was uncountable--the set he actually showed was uncountable was the set of infinite sequences of binary digits.
 
  • Like
Likes pbuk, FactChecker and sysprog
  • #27
PeroK said:
This is called a proof by contradiction.
I don't really understand why you are explaining this. You might want to read the second paragraph of my original post.
 
  • #28
I didn't read the entire thread, but maybe the following is helpful:

One way to get around the issue of multiple decimal representations is to, say, work only with the real numbers in ##[0,1]## admitting a decimal representation only using the digits ##7## and ##8##. The diagonalization argument shows that there are uncountably many sequences of ##7## and ##8##, which correspond to uncountably many real numbers because no two decimals containing only ##7## and ##8## are equal.
 
  • Like
Likes PeroK
  • #29
Warp said:
I don't really understand why you are explaining this. You might want to read the second paragraph of my original post.

You said:

Warp said:
It does this by, essentially, proof by contradiction: We start with the assumption that the set of all real numbers is countable, we list all these numbers (which should be possible if they are countable), and we proceed to construct a number that's not in the set. Thus our original assumption that we listed all the numbers was proven incorrect, and thus the premise that the set of real numbers is countable is incorrect.

But thinking about the argument, I can't help but get the feeling that it's circular: It assumes what it's trying to prove.

Is your doubt that proof by contradiction, in general, is an invalid approach to a proof?
 
Last edited:
  • Like
Likes pbuk
  • #30
PeroK said:
Is your doubt that proof by contradiction, in general, is an invalid approach to a proof?
No. What I said is that it looks to me like this argument is circular, which a proof, even a proof by contradiction, shouldn't be.
 
  • #31
Warp said:
What I said is that it looks to me like this argument is circular

It has been explained why it isn't. Do you have further questions about the explanations that have been provided?
 
  • #32
zinq said:
That makes no difference to me.

This thread is not your thread. And given this attitude, you have now been banned from further posting in it.
 
  • #33
@Warp Your confusion about the argument appearing circular seems to come from the fact that real numbers can have multiple decimal representations. Could you say if my post 28 answers your question, or if it's still not clear?
 
  • #35
Infrared said:
@Warp Your confusion about the argument appearing circular seems to come from the fact that real numbers can have multiple decimal representations. Could you say if my post 28 answers your question, or if it's still not clear?
I commented on that in post #13 in this tread. I suppose my difficulty in understanding the argument changed from "this argument looks circular to me, when it's done using base 2" to, essentially, "I don't know what's wrong in my logic when I do this remapping." To recapitulate this latter point:

The argument (which is a proof by contradiction) starts by making the assumption that the set of all infinite strings is countable, ie. enumerable, ie. possible to list in order (it has to make that assumption because else it would not be able to say "take the first digit of the first string, the second digit of the second string, etc.")

My thinking is (and where I'm probably mistaken, although I don't know the details) that if we assume the set is countable, ie. enumerable, it shouldn't make any difference if we replace every element in the list with a natural number. From the perspective of the proof it should make no difference. (Or should it?)

However, if we replace every element in the list with a natural number, eg. its position in the list, we have now converted it to a list of finite strings, and the only thing that the diagonal argument is now saying is that there are no infinite strings in the list. Which is self-evident to the point of being tautological. (Of course there are no infinite strings in the list, because we just remapped them to the natural numbers. This is not something that requires proof.)

In addition, the new number generated by the diagonalization is just a rational number (because it has an infinite expansion of eg. '1' digits, or whatever you decide to replace the '0' digits with, which makes it a rational number), and ostensibly all the rational numbers were in our original list, so it didn't actually create a new number. It just created one that we already remapped to a natural number in the first step above.

I suppose the conclusion is that we can't do that remapping, but it's not really the diagonalization argument that proves that we can't.
 
Back
Top