Algorithm for checking if a set of digits are all distinct

In summary, the conversation discusses different approaches to checking whether a set of 9 digits contains all digits from 1 to 9. The participants suggest using the sum and product of the digits as well as checking for duplicate digits. One person shares a quick script that solves the problem using bitwise operations, while another suggests using a temporary set to check for duplicates. However, it is mentioned that these methods may not guarantee that all digits are present. One person runs a test and finds an exception to the methods, while another shares a longer code that takes longer to run. In the end, it is noted that the best solution may be a combination of these methods.
  • #1
etotheipi
If I give you 9 digits ##u_1, u_2, \dots, u_9##, is there an operation/set of operations you can perform to check whether all the digits from 1 to 9 are represented in that set? Just asking because my solution was a boring brute force check.

I don't think anything useful becomes of the product ##u_1 u_2 \dots u_9## since, for instance, there exist other sets of 9 digits that multiply to ##9!## (e.g. 653346274).

I wondered whether anyone knew if a nicer method is possible?
 
Last edited by a moderator:
  • Like
Likes JD_PM and member 587159
Mathematics news on Phys.org
  • #2
etotheipi said:
If I give you 9 digits ##u_1, u_2, \dots, u_9##, is there an operation/set of operations you can perform to check whether all the digits from 1 to 9 are represented in that set? Just asking because my solution was a boring brute force check.

I don't think anything useful becomes of the product ##u_1 u_2 \dots u_9## since, for instance, there exist other sets of 9 digits that multiply to ##9!## (e.g. 653346274).

I wondered whether anyone knew if a nicer method is possible?
If the 9 digits are distinct, their sum will be 45. If their product also is 362,880 (9!), that would suggest that they are distinct. The conditions that their sum is 45 and their product is 362,880 are necessary, but I haven't convinced myself that they are sufficient conditions.
 
  • Like
Likes etotheipi and member 587159
  • #3
Mark44 said:
but I haven't convinced myself that they are sufficient conditions.

This can be brute-forced by a computer (didn't check myself, but it sounds plausible).
 
  • Like
Likes etotheipi
  • #4
I just wrote quick script to test this and it turns out there is one exception. The other one it threw up was 124445799.
Code:
found = ["123456789"]

for i in range(100000000,1000000000):
    num = i
    product = 1
    sum = 0

    while(i>=1):
        product *= i%10
        sum += i%10
        i//=10

    if(product == 362880 and sum == 45):
        ascending = "".join(sorted(str(num)))
        new = True
        for j in found:
            if(ascending == j):
                new = False
        if(new):
            found.append(ascending)

print(found)
OUTPUT:
['123456789', '124445799']
 
Last edited by a moderator:
  • Like
Likes member 587159
  • #5
Here's another check: 9! = ##2^7 \cdot 3^4 \cdot 5 \cdot 7##
If the product is divisible by any higher powers of the four factors above, at least one digit is repeated. In other words, the product can't be evenly divisible by ##2^8## or ##3^5## or ##5^2## or ##7^2##. And I'm assuming that the sum of the 9 digits is 45.
 
  • #6
etotheipi said:
I just wrote quick script to test this and it turns out there are a few exceptions. The first one it threw up was
124445799.

Code:
for i in range(100000000,1000000000):
    num = i
    product = 1
    sum = 0

    while(i>=1):
        product *= i%10
        sum += i%10
        i//=10

    if(product == 362880 and sum == 45):
        ascending = "".join(sorted(str(num)))
        if(ascending != "123456789"):
            print(num)
Why are you checking numbers in the range 100,000,000 to 999,999,999? If the product is larger than 362,880 (9!), then there have to be some repeated digits.

I think I understand your reasoning -- If all 9 digits are 9, the product is 387,420,489, which is within your range, but that's brute force with a bludgeon.
 
  • Like
Likes Janosh89 and etotheipi
  • #7
Mark44 said:
I think I understand your reasoning -- If all 9 digits are 9, the product is 387,420,489, which is within your range, but that's brute force with a bludgeon.

Whoops you're quite right, I wasn't thinking too much about that part 😂, a sledgehammer approach indeed. Luckily modern computing power makes fairly short work of my sloppiness anyway!
 
  • Haha
Likes member 587159
  • #8
Start off with a 16 bit integer set to 0
For each digit, set the corresponding bit
Compare the integer with 511
 
  • Like
Likes etotheipi
  • #9
Merlin3189 said:
Start off with a 16 bit integer set to 0
For each digit, set the corresponding bit
Compare the integer with 511

This is fairly similar to my initial approach but your method is much snappier; I just scanned through the result and checked for no zeroes but you're right it would be quicker to just compare with ##2^9 -1##. I think this is probably the best solution, since a number theory approach is perhaps overly complicated. Thanks!
 
  • #10
etotheipi said:
If I give you 9 digits ##u_1, u_2, \dots, u_9##, is there an operation/set of operations you can perform to check whether all the digits from 1 to 9 are represented in that set?
{...snip...}
I wondered whether anyone knew if a nicer method is possible?
If the solution does not have to include math operations, then
if a digit repeats, then not all digits from 1 to 9 are represented.

create a temporary set ##temp[1:9]##; zero ##temp##​
for each ##u_i##​
if (##u_i ∈ temp##) then return (false)​
let ##temp_i = u_i##​
end for​
return (true)​

For brevity, I left out the details for checking if digits repeat.
{Edit: removed check for zero since already zeroed ##temp##.}
 
  • Like
Likes etotheipi
  • #11
Setting digits is the quickest option. It's also one that comes with minimal code:
Code:
sum=0
for k in u:
  sum+=1<<k
return sum==1022
This is valid python code. It assumes "u" is a set of integers, it doesn't check that. If they might not be integers this needs to be verified as well.
 
  • Love
Likes etotheipi
  • #12
Here's an implementation of what I mentioned earlier.
Python:
from time import perf_counter
t1 = perf_counter()
count = 0
list = [1, 2, 3, 5, 5, 5, 7, 8,9]
sum = 0
prod = 1
for i in range (0, 9):
    sum += list[i]
    prod *= list[i]
if (sum == 45):
    if (prod % 256 == 0):
        print ("Duplicate digit! Too many 2s")
    elif (prod % 81 == 0):
        print ("Duplicate digit! Too many 3s")
    elif (prod % 25 == 0):
        print ("Duplicate digit! Too many 5s")
    elif (prod % 49 == 0):
        print ("Duplicate digit! Too many 7s")
    else:
        print("No duplicate digits...")
else:
    print("Sum is incorrect")
t2 = perf_counter()
print("t2: ", t2, " t1: ", t1)
print("Elapsed time: ", (t2-t1), " seconds")
@etothepi, I tried out your version, and it took longer to run than I wanted to wait -- maybe about 5 minutes before I CTRL-C-ed out.
The version above runs in 0.0002833 seconds, or just under 3 milliseconds, and prints the message that there are too many 5s.
Edit: just under 0.3 msec.
 
Last edited:
  • Like
Likes etotheipi
  • #13
Mark44 said:
The version above runs in 0.0002833 seconds, or just under 3 milliseconds, and prints the message that there are too many 5s.
0.3 milliseconds?
I ran my function a million times and it finished in 2-3 seconds. The runtime doesn't depend on the input, so I didn't vary that.

As discussed before, checking the powers of the primes isn't a guarantee that every digit occurs. 124445799 has the same digit sum and digit product as 123456789. Your 25 line code takes 100 times as long as the 4 line code and has a bug.

etotheipi's script was designed to look for a bug in your script. It's not a surprise that it takes longer because it does a vastly more complicated task than checking if every digit occurs once in a set of 9 digits.
 
  • Like
Likes etotheipi
  • #14
Mark44 said:
@etothepi, I tried out your version, and it took longer to run than I wanted to wait -- maybe about 5 minutes before I CTRL-C-ed out.
The version above runs in 0.0002833 seconds, or just under 3 milliseconds, and prints the message that there are too many 5s.
Although I guess this is mainly because you are checking one list whilst my one was checking about 900000000 of them :wink:
mfb said:
Setting digits is the quickest option. It's also one that comes with minimal code:
This is valid python code. It assumes "u" is a set of integers, it doesn't check that. If they might not be integers this needs to be verified as well.

This looks like magic... very neat!
 
  • Like
Likes mfb
  • #15
It's possible to make that code more pythonic and less portable, of course:
Code:
return sum(1<<k for k in u)==1022
 
  • Informative
Likes etotheipi
  • #16
mfb said:
As discussed before, checking the powers of the primes isn't a guarantee that every digit occurs. 124445799 has the same digit sum and digit product as 123456789. Your 25 line code takes 100 times as long as the 4 line code and has a bug.
My code catches 124445799, saying that it has too many 3s.

I ran my code 1,000,000 times as well, but commented out the line where it prints "too many 3's," because I/O is inherently slow. I don't dispute that your algorithm is faster, but running my code a million times took only 7.6 sec, which is less than 3 times as slow, not 100 times as slow. Your claim that my code takes 100 times as long doesn't take into account that your code was running from cache most of the time, whereas my one time through wasn't.

And what's the bug in my code? It successfully noted that 124445799 has at least one duplicate digit.
 
  • #17
Mark44 said:
My code catches 124445799, saying that it has too many 3s.
Uh, it's worse than I expected. Another bug was hiding the more fundamental flaw of the program logic. It also complains about too many 3s for correct sets. You would need to check for 243, not 81. But if you do that then 124445799 will pass. Of course it does: It has the same sum and product as the correct set.
 
  • #18
I'm lazier than that:
PHP:
<?php
echo strlen($u) == strlen(count_chars($u, 3));
?>
Or:
Python:
u = str(u);
print len(u) == len(set(u));
As a bonus, it works with any other set of characters - not just digits - of any length!
 
  • Like
  • Love
Likes Merlin3189 and etotheipi
  • #19
I admit I miscounted the exponent on 3, and that even with that correction, my algorithm misclassifies 124445799. I'll take a harder look at my algorithm.
 
  • #20
Fixed the miscount on the exponent on 3. This code checks that the sum of the squares of the digits is correct. It produces the correct answer for 123456789 and 124445799. Are there any that it miscategorizes?
Python:
from time import perf_counter
t1 = perf_counter()
count = 0
list = [1, 2, 4, 4, 4, 5, 7, 9, 9]
sum = 0
prod = 1
sum_sq = 0
for i in range (0, 9):
sum += list[i]
prod *= list[i]
   sum_sq += list[i]*list[i]
if (sum != 45):
   print("Sum is incorrect")
elif (prod != 362880):
   print("Product is incorrect")
elif (sum_sq != 285):
   print("Sum of squares is incorrect")
else:
   if (prod % 256 == 0):
      print ("Duplicate digit! More than 7 factors of 2")
   elif (prod % 243 == 0):
      print ("Duplicate digit! More than 4 factors of 3")
   elif (prod % 25 == 0):
      print ("Duplicate digit! More than 2 factors of 5")
   elif (prod % 49 == 0):
      print ("Duplicate digit! More than 2 factors of 7")
   else:
      print("No duplicate digits...")
t2 = perf_counter()
print("t2: ", t2, " t1: ", t1)
print("Elapsed time: ", (t2-t1), " seconds")
 
  • #21
mfb said:
It's possible to make that code more pythonic and less portable, of course:
Code:
return sum(1<<k for k in u)==1022
When you're setting flag bits, I would call this more assembly-like. Although it's not intuitively obvious at first glance, yours is a very elegant solution.
Here's a more-or-less equivalent version in ARMv7:
ARMv7:
@ Constants
    .eqv PR_STR, 0x69
    .eqv HALT, 11
    .eqv LIST_SIZE, 9
    .eqv STD_OUT, 1

    .data
List:              .word 1, 2, 4, 4, 4, 5, 7, 9, 9
MagicNo:    .word 0x3FE                     @ Bit pattern 11 1111 1110, or decimal 1022
NoDupe:     .asciz "No duplicate digits"
Dupe:          .asciz "Duplicate digits"

    .text 
main:
    mov r0, #0                   @ Accumulator
    ldr r1, =List
    mov r2, #LIST_SIZE    @ Loop counter
    mov r3, #1
    ldr r4, =MagicNo
    ldr r4, [r4]

Loop:
    ldr r5, [r1]         @ Get a number from List
    lsl r5, r3, r5       @ r5 <- 1 << r5
    add r0, r0, r5    @ Add to sum
    add r1, r1, #4   @ Increment address in List
    subs r2, r2, #1 @ Decrement loop counter
    bne Loop 
  
    subs r2, r0, r4           @ r2 <- sum - MagicNo
    beq NoDupeDigit    @ If equal, no duplicate digits
    mov r0, #STD_OUT  @ Otherwise, there are duplicates
    ldr r1, =Dupe
    swi PR_STR
    b Done 

NoDupeDigit:
    ldr r1, =NoDupe
    swi PR_STR
Done:
    swi HALT
 
  • #22
I would do some basic checks, sort the digits in ascending order and compare the result to "123456789".
 
  • Like
Likes etotheipi
  • #23
Mark44 said:
Are there any that it miscategorizes?
124445799 is the only case where you get the same sum and product as 123456789, so once you catch that there is nothing left that is misclassified. Note that checking for the product is sufficient, you don't need to check divisibility any more. If the product is 362880 then your divisibility checks have to pass.
I still don't see the point of the approach. It takes longer, it is harder to understand, harder to generalize, and most importantly it relies on an advance check of every single set of 9 digits to make sure there are no further counterexamples because there is no theorem guaranteeing the uniqueness of properties you check.
 
  • #24
I would accumulate a frequency array of all the digits. Freq( "digit" ) += 1.
I would verify that all digits had a frequency of one. Maybe check the product of all frequencies = 1.
 
  • #25
Baluncore said:
I would accumulate a frequency array of all the digits. Freq( "digit" ) += 1.
I would verify that all digits had a frequency of one. Maybe check the product of all frequencies = 1.

Assuming you did initialise all frequencies to 0, if you use = rather than +=, then each frequency ends up as 0 or 1, so you can check by adding rather than multiplying, which used to be faster, even for integers.

And it's probably faster than my method using bits rather than words to store the frequencies.
 
  • #26
Merlin3189 said:
Assuming you did initialise all frequencies to 0, ...
How could it be frequency, or quickly done otherwise?
Merlin3189 said:
... if you use = rather than +=, then each frequency ends up as 0 or 1, ...
But then it is presence and not frequency, and you must assume the correct number of digits were input.
Multiplication was slower than addition once, but now it is faster than indexing and fetching data.
 

FAQ: Algorithm for checking if a set of digits are all distinct

1. What is an algorithm for checking if a set of digits are all distinct?

An algorithm is a step-by-step procedure for solving a problem or accomplishing a task. In this case, the algorithm for checking if a set of digits are all distinct involves comparing each digit in the set to every other digit to ensure that there are no duplicates.

2. Why is it important to check for distinct digits in a set?

Checking for distinct digits ensures that each digit in the set is unique and not repeated. This is important in various applications, such as in data analysis, to avoid counting duplicates and getting inaccurate results.

3. How does the algorithm for checking distinct digits work?

The algorithm works by using a loop to compare each digit in the set to every other digit. If a duplicate is found, the algorithm will stop and return a false value. If no duplicates are found, the algorithm will continue until all digits have been checked and will return a true value.

4. Can this algorithm be used for sets of any size?

Yes, this algorithm can be used for sets of any size. It will simply require more iterations to check for duplicates in larger sets.

5. Are there any limitations to this algorithm?

One limitation of this algorithm is that it can be time-consuming for very large sets, as it requires comparing each digit to every other digit. Additionally, this algorithm only checks for duplicate digits and does not take into account other factors, such as the order of the digits.

Similar threads

Replies
1
Views
1K
Replies
1
Views
1K
Replies
17
Views
1K
Replies
3
Views
3K
Replies
8
Views
1K
Replies
18
Views
7K
Replies
125
Views
18K
Back
Top