Trying to determine the number of coin tosses for HHTH

  • Python
  • Thread starter member 428835
  • Start date
In summary, a user on a forum is discussing the expected number of coin tosses needed to get the sequence HHTH. They believe the answer is 30, but wanted to program it in python and ended up with an average of 18 tosses. Another user used a Perl program to simulate the sequence and also got an average of 18 tosses. The original user questions if their calculation is wrong and the simulations are more reliable.
  • #1
member 428835
Hi PF!

This is related to a forum I just posted here: how many coin tosses are expected to get the sequence HHTH. I believe the answer is 30, explained in the OP. However, I thought I'd try and program it in python. Below is my code:
Python:
import numpy as np
import statistics
# EXPECTED NUMBER OF TOSSES FOR COIN WITH T WEIGHTED p FOR SEQUENCE HHTH
class coin:
    def approx(self, n, p):
        lst = []
        for _ in range(n):
            toss = 0
            count = 0
            while count < 4:
                # THE TOSS
                x = np.random.rand()
                toss += 1
                # LET x < p INDICATE TAILS
                # LET x > p INDICATE HEADS
                ## AFTER HHT, WE SEEK H
                if count == 3:
                    if x > p:       # IF HEADS, APPEND lst WITH NUM OF TOSSES
                        lst.append(toss)
                        count = 4   # MAKE COUNT 4 TO BREAK WHILE LOOP
                    else:
                        count = 0   # SEQUENCE IS HHTT WHICH IS T, SO COUNT RESTARTS
                ## AFTER HH, WE SEEK T
                if count == 2:
                    if x > p:
                        count = 2   # IF HEADS, WE HAVE HH
                    else:
                        count = 3   # IF TAILS, WE HAVE HHT
                # FIRST TWO HH
                if count < 2:
                    if x < p: # IF TAILS, RESTART COUNT
                        count = 0
                    else:   # IF HEADS, WE HAVE H OR HH
                        count += 1
        return statistics.mean(lst)
         
if __name__ == "__main__":
    n = 10000
    p = 0.5
    print(coin().approx(n, p))

The results are giving me 18 coin tosses on average. Is my programming wrong or is my calculation in the other post wrong (or maybe both are)?
 
Last edited by a moderator:
Technology news on Phys.org
  • #2
I used a completely independent program to do a few simulations , each of 10 million strings, and got an average of 18 (+-0.005), counting the characters in the termination string, "HHTH".

I used this Perl program:

Perl simulation of 10 million strings to get average length to HHTH:
$numTrials=10000000;
@chars = ("H", "T");
foreach $i (1..$numTrials){
    $string='';
    $count=0;
    while( $string !~ /HHTH/){
        $ran = int(rand(2));
        $char = $chars[$ran];
        $string = "$string$char";
        $count++;
    }
    #    print "count=$count string=$string\n";
    $totCount += $count;
}
$avgCount = $totCount/$numTrials;
print "avgCount=$avgCount\n";

PS. In my opinion, it is not unusual for simulations to get answers much more easily and reliably than by using analysis. So I would go back and check your analysis.
 
Last edited:
  • Like
  • Haha
Likes Jarvis323, member 428835 and jedishrfu
  • #3
FactChecker said:
I used a completely independent program to do a few simulations , each of 10 million strings, and got an average of 18 (+-0.005), counting the characters in the termination string, "HHTH".

I used this Perl program:

Perl simulation of 10 million strings to get average length to HHTH:
$numTrials=10000000;
@chars = ("H", "T");
foreach $i (1..$numTrials){
    $string='';
    $count=0;
    while( $string !~ /HHTH/){
        $ran = int(rand(2));
        $char = $chars[$ran];
        $string = "$string$char";
        $count++;
    }
    #    print "count=$count string=$string\n";
    $totCount += $count;
}
$avgCount = $totCount/$numTrials;
print "avgCount=$avgCount\n";

PS. In my opinion, it is not unusual for simulations to get answers much more easily and reliably than by using analysis. So I would go back and check your analysis.
Nice way to code this! Thanks for the reinforcement!
 
  • #4
Here's a non-Monte-Carlo method. I keep track of the likelihood of each state after every coin toss.
I iterate for 1000 tosses:
C++:
   {
      int     i, toss;
      double  HHTH, expected, states[8], newstates[8], cum;
      typedef enum {
         TTT=0, TTH, THT, THH, HTT, HTH, HHT, HHH
      } Toss_t;

      //
      expected = 0.0;
      printf(
         "  # of   percent    Cummulative\n"
         " tosses occurence Expected value\n"
      );
      //
      // After 3 tosses, we will be in one of 8 states:
      for(i=0;i<8;i++) states[i] = 0.125;
      //
      // From toss number 4 on, the states will change as follows:
      for(toss=4;toss<=1000;toss++) {
         //
         //  If we were HHT before, there is a 50% chance we will
         // finish on this toss.
         HHTH = 0.5 * states[HHT];
         //
         // TTT and TTH will each be half of TTT and HTT:
         newstates[TTT] = (states[TTT]+states[HTT])/2;
         newstates[TTH] = (states[TTT]+states[HTT])/2;
         //
         // THT and THH will each be half of TTH and HTH:
         newstates[THT] = (states[TTH]+states[HTH])/2;
         newstates[THH] = (states[TTH]+states[HTH])/2;
         //
         // HTT will be half of THT and HHT,
         // But HTH will be half of THT:
         newstates[HTT] = (states[THT]+states[HHT])/2;
         newstates[HTH] = (states[THT])/2;
         //
         // HHT and HHH will each be half of THH and HHH:
         newstates[HHT] = (states[THH]+states[HHH])/2;
         newstates[HHH] = (states[THH]+states[HHH])/2;
         //
         for(cum=0.0, i=0;i<8;i++) {
            cum += states[i] = newstates[i];
         }
         expected += toss * HHTH;
         //
         // Report the first 10 tosses and then every 20th.
         if( (toss<=10) || ((toss%20)==0) ) {
            printf(
               " %4d  %9.5f%% %12.8f\n",
               toss, 100*HHTH, expected + ((toss+1)*cum)
            );
         }
      }
   }

And here are the results:
Code:
  # of   percent    Cummulative
 tosses occurence Expected value
    4    6.25000%   4.93750000
    5    6.25000%   5.81250000
    6    6.25000%   6.62500000
    7    5.46875%   7.38281250
    8    5.07812%   8.08984375
    9    4.68750%   8.75000000
   10    4.39453%   9.36621094
   20    2.20737%  13.66702461
   40    0.55596%  16.90866547
   60    0.14003%  17.72512859
   80    0.03527%  17.93076890
  100    0.00888%  17.98256295
  120    0.00224%  17.99560818
  140    0.00056%  17.99889384
  160    0.00014%  17.99972140
  180    0.00004%  17.99992983
  200    0.00001%  17.99998233
  220    0.00000%  17.99999555
  240    0.00000%  17.99999888
  260    0.00000%  17.99999972
  280    0.00000%  17.99999993
  300    0.00000%  17.99999998
  320    0.00000%  18.00000000
  340    0.00000%  18.00000000
  360    0.00000%  18.00000000
  380    0.00000%  18.00000000
  400    0.00000%  18.00000000
  420    0.00000%  18.00000000
  440    0.00000%  18.00000000
  460    0.00000%  18.00000000
  480    0.00000%  18.00000000
  500    0.00000%  18.00000000
  520    0.00000%  18.00000000
  540    0.00000%  18.00000000
  560    0.00000%  18.00000000
  580    0.00000%  18.00000000
  600    0.00000%  18.00000000
  620    0.00000%  18.00000000
  640    0.00000%  18.00000000
  660    0.00000%  18.00000000
  680    0.00000%  18.00000000
  700    0.00000%  18.00000000
  720    0.00000%  18.00000000
  740    0.00000%  18.00000000
  760    0.00000%  18.00000000
  780    0.00000%  18.00000000
  800    0.00000%  18.00000000
  820    0.00000%  18.00000000
  840    0.00000%  18.00000000
  860    0.00000%  18.00000000
  880    0.00000%  18.00000000
  900    0.00000%  18.00000000
  920    0.00000%  18.00000000
  940    0.00000%  18.00000000
  960    0.00000%  18.00000000
  980    0.00000%  18.00000000
 1000    0.00000%  18.00000000
 
Last edited by a moderator:
  • Like
  • Informative
Likes Jarvis323, WWGD, PeroK and 1 other person
  • #5
Here's my plodding Python code:

Python:
#
# Calculate the expected number of coin tosses to get the sequence HHTH:
# State 0 is the starting point; state 1 is H; state 2 is HH; state 3 is HHT
#
import random
#
# Choose the number of games in the simulation
games = 1_000
#
# Counter to keep the total number of tosses across all games
tot_count = 0
#
for k in range(0, games):
    state = 0
    toss_count = 0
    game_over = False
#
    while not game_over:
        toss_count += 1
        toss = random.randint(0, 1)
#
# 0 is tails, 1 is heads
#
        if state == 0:
            if toss == 0:
                state = 0
                continue
            else:
                state = 1
                continue
#
        if state == 1:
            if toss == 0:
                state = 0
                continue
            else:
                state = 2
                continue
#
        if state == 2:
            if toss == 0:
                state = 3
                continue
            else:
                state = 2
                continue
#
        if state == 3:
            if toss == 0:
                state = 0
                continue
            else:
                game_over = True
#
    tot_count += toss_count
#    print(toss_count)
#
# Output the average number of tosses per game
print(tot_count/games)
 
Last edited:
  • Like
Likes WWGD
  • #6
The "other thread" is interested in a variation of this program that handles unfair coin tosses.
For a probability of head being 0.1 (instead of 0.5), the expected number of tosses climbs above 1121.

C++:
   {
      int     i, toss;
      double  p, HHTH, expected, states[8], newstates[8], cum;

      //
      //  Change the constants below for different heads/tails
      // probabilities.
      //
      // For a fair coin:
      //  const double pH = 0.5;
      //  #define TOSS_LIMIT  (500)
      //  #define REPORT_INTERVAL (20)
      //
      // For a very biased coin:
      const double pH = 0.1;
      #define TOSS_LIMIT   (32000)
      #define REPORT_INTERVAL  (1000)

      const double pT = (1.0-pH);

      typedef enum {
         TTT=0, TTH, THT, THH, HTT, HTH, HHT, HHH
      } Toss_t;

      //
      expected = 0.0;
      //
      // After 3 tosses, we will be in one of 8 states:
      for(i=0;i<8;i++) {
         p = (i&1) ? pH : pT;
         p *= (i&2) ? pH : pT;
         states[i] = p * ((i&4) ? pH : pT);
      }
      //
      // From toss number 4 on, the states will change as follows:
      printf(
         "     # of   percent       Cumulative\n"
         "    tosses occurrence   Expected value\n"
      );
      for(toss=4;toss<=TOSS_LIMIT;toss++) {
         //
         //  If we were HHT before, there is a 50% chance we will
         // finish on this toss.
         HHTH = pH * states[HHT];
         //
         // TTT and TTH will each be based on TTT and HTT:
         newstates[TTT] = pT * (states[TTT]+states[HTT]);
         newstates[TTH] = pH * (states[TTT]+states[HTT]);
         //
         // THT and THH will each be based on TTH and HTH:
         newstates[THT] = pT * (states[TTH]+states[HTH]);
         newstates[THH] = pH * (states[TTH]+states[HTH]);
         //
         // HTT will be based on THT and HHT,
         // But HHT will be contribute to HTH:
         newstates[HTT] = pT * (states[THT]+states[HHT]);
         newstates[HTH] = pH * (states[THT]);
         //
         // HHT and HHH will each be based on THH and HHH:
         newstates[HHT] = pT * (states[THH]+states[HHH]);
         newstates[HHH] = pH * (states[THH]+states[HHH]);
         //
         for(cum=0.0, i=0;i<8;i++) {
            cum += states[i] = newstates[i];
         }
         expected += toss * HHTH;
         //
         // Report the first 10 tosses and then every 20th.
         if( (toss<=10) || ((toss%REPORT_INTERVAL)==0) ) {
            printf(
               " %7d  %9.5f%% %15.8f\n",
               toss, 100*HHTH, expected + ((toss+1)*cum)
            );
         }
      }
   }

Code:
     # of   percent       Cumulative
    tosses occurrence   Expected value
       4    0.09000%      4.99910000
       5    0.09000%      5.99730000
       6    0.09000%      6.99460000
       7    0.08919%      7.99100810
       8    0.08911%      8.98652511
       9    0.08903%      9.98115184
      10    0.08895%     10.97488903
    1000    0.03668%    663.30899519
    2000    0.01499%    934.00262938
    3000    0.00613%   1044.63793041
    4000    0.00250%   1089.85572754
    5000    0.00102%   1108.33671108
    6000    0.00042%   1115.89008108
    7000    0.00017%   1118.97722189
    8000    0.00007%   1120.23896841
    9000    0.00003%   1120.75465733
   10000    0.00001%   1120.96542475
   11000    0.00000%   1121.05156758
   12000    0.00000%   1121.08677505
   13000    0.00000%   1121.10116471
   14000    0.00000%   1121.10704591
   15000    0.00000%   1121.10944962
   16000    0.00000%   1121.11043204
   17000    0.00000%   1121.11083357
   18000    0.00000%   1121.11099768
   19000    0.00000%   1121.11106475
   20000    0.00000%   1121.11109216
   21000    0.00000%   1121.11110337
   22000    0.00000%   1121.11110795
   23000    0.00000%   1121.11110982
   24000    0.00000%   1121.11111058
   25000    0.00000%   1121.11111090
   26000    0.00000%   1121.11111102
   27000    0.00000%   1121.11111108
   28000    0.00000%   1121.11111110
   29000    0.00000%   1121.11111111
   30000    0.00000%   1121.11111111
   31000    0.00000%   1121.11111111
   32000    0.00000%   1121.11111111
 
Last edited:
  • #7
You might make sure that you are checking each toss as being the potential start of the string. It is easy to accidentally look at groups of four instead, which would ignore strings that occur outside such a group, such as THHT, HTTH, ect...
 
  • Skeptical
Likes PeroK
  • #8
Algr said:
You might make sure that you are checking each toss as being the potential start of the string. It is easy to accidentally look at groups of four instead, which would ignore strings that occur outside such a group, such as THHT, HTTH, ect...
We all seem to be getting the same answer and I know that at least one (post #2) does not have that problem.
 
  • #9
Algr said:
You might make sure that you are checking each toss as being the potential start of the string. It is easy to accidentally look at groups of four instead, which would ignore strings that occur outside such a group, such as THHT, HTTH, ect...
In the code I provided (post 4 and 6 above), the check is done on the last four tosses on every coin toss after the first three. ... so it's handled.
 
  • #10
Since the various code analysis were generating lower (and consistent) values, I was suggesting that the original summation (that thought the answer was 30) was the one with that error.
 
  • Like
Likes FactChecker and .Scott
  • #11
The other thread was locked, but I calculated it as a Markov process like @FactChecker suggested based on this link.

https://mpaldridge.github.io/math2750/S08-hitting-times.html

Here is the Markov process I came up with.

Screenshot_2022-05-06_17-23-07.png

The hitting time from state ##e## to state ##1101## is

\begin{split}
\mu_{e,1101} &= 1 + 0.5 \mu_{e,1101} + 0.5 \mu_{1,1101} \\
&= 1 + 0.5 \mu_{e,1101} + 0.5 ( 1 + 0.5 \mu_{e,1101} + 0.5 \mu_{11,1101} ) \\
\end{split}

Need to calculate ##\mu_{11,1101}## in terms of ##\mu_{e,1101}## to continue.

\begin{split}
\mu_{11,1101} &=1 + 0.5 \mu_{11,1101} + 0.5 \mu_{110,1101} \\
&= 1 + 0.5 \mu_{11,1101} + 0.5 ( 1 + 0.5 \mu_{e,1101} ) \\ &= 3 + 0.5 \mu_{e,1101} \\
\end{split}

Now substitute.

\begin{split}
\mu_{e,1101} &= 1 + 0.5 \mu_{e,1101} + 0.5 ( 1 + 0.5 \mu_{e,1101} + 0.5 ( 3 + 0.5 \mu_{e,1101} ) ) \\
&= 2.25 + 0.875 \mu_{e,1101} \\
&= 18 \\
\end{split}
 
Last edited:
  • Like
Likes FactChecker

FAQ: Trying to determine the number of coin tosses for HHTH

How do you determine the number of coin tosses for HHTH?

The number of coin tosses needed to get a specific sequence, such as HHTH, can be calculated using the formula 2^(n+1)-1, where n is the number of consecutive outcomes desired. In this case, n = 4 (HHTH), so the number of tosses needed is 2^(4+1)-1 = 31.

Why is the formula for determining the number of coin tosses 2^(n+1)-1?

This formula is derived from the concept of a binary tree, where each branch represents a possible outcome of a coin toss. The number of branches at each level of the tree is equal to the number of tosses, and the total number of branches is 2^n. However, the root of the tree should not be counted, hence the +1. The -1 at the end accounts for the fact that the final outcome of the tree is the desired sequence, so it does not need to be included in the total number of tosses.

Is the formula for determining the number of coin tosses always accurate?

Yes, the formula is always accurate for determining the minimum number of coin tosses needed to get a specific sequence. However, it is important to note that this is the minimum number, and it is possible to get the desired sequence in more tosses, depending on chance.

Can this formula be used for any sequence, or only for HHTH?

The formula can be used for any sequence. Simply substitute the desired sequence for HHTH in the formula and calculate the number of tosses needed. However, the formula assumes a fair coin and independent outcomes, so it may not be accurate for certain biased or dependent sequences.

Is there a faster way to determine the number of coin tosses for a specific sequence?

There are other methods, such as using probability and statistics, to estimate the number of coin tosses needed for a specific sequence. However, the 2^(n+1)-1 formula is the most accurate and efficient method for determining the minimum number of tosses needed.

Similar threads

Back
Top