```
import os
import time
import numpy as np
42) np.random.seed(
```

This problem and discussion herein was adapted from Computational Thinking course offered by MIT. Link to course page

**Question from First Course in Probability (Sheldon Ross):**

Given 20 people, what is the probability that, among the 12 months in the year, there are 4 months containing exactly 2 birthdays and 4 containing exactly 3 birthdays? Link to problem

### Problem setup

Rather than using a formula to estimate the probability we can setup an experiment using for-loops to mimic the trials and then count the actual number of cases where the condition is met.

```
= np.arange(1, 13) # array of months
month_number = 20 #Number of people considered
number_of_people = np.random.choice(month_number, size=number_of_people, replace=True, ) sample
```

Here `sample`

is the randomized set of 20 birthdays. Each entry is a month. Length of sample is same as number of people in the experiment, 20 in this case.

**Highlight cases which satisfy the constraints:** 1. 4 months have exactly 2 birthdays 2. 4 months have exactly 3 birthdays

For that first create an array with frequency of birthdays in each month:

```
= np.zeros(len(month_number)) #This is zeros frequency array
frequency for _entry in sample:
= _entry - 1
_index = frequency[_index] + 1 frequency[_index]
```

` sample`

```
array([ 7, 4, 11, 8, 5, 7, 10, 3, 7, 11, 11, 8, 5, 4, 8, 8, 3,
6, 5, 2])
```

` frequency`

`array([0., 1., 2., 2., 3., 1., 3., 4., 0., 1., 3., 0.])`

`frequency`

array shows how many birthday are there in a month (Jan - Dec).

Once we have that, now we have to consider the conditions given in the problem. We can either do this using `for/if`

statement or using `numpy.argwhere`

, where the number of months satisfying the conditions is estimated directly.

`len(np.argwhere(frequency == 2)) `

`2`

Now putting it all together and doing multiple trials.

In this case, I do 1,500,000 trials and count the ratio of successes to the total trials. While this process takes time, it is better than remembering permutation and combination formulae ;P

This is a variant of monte-carlo simulation.

```
%%time
# putting it all togther
= 1_500_000
_number_of_trials = 0
_success
for _ in range(_number_of_trials):
= np.random.choice(month_number, size=number_of_people, replace=True)
sample
= np.zeros( len(month_number) )
frequency for _entry in sample:
if _entry in month_number:
= _entry - 1
_index = frequency[_index] + 1
frequency[_index]
if ( len(np.argwhere(frequency == 2)) == 4 ) and ( len(np.argwhere(frequency == 3)) == 4 ):
= _success + 1
_success
print('Probability of success = {}'.format(_success / _number_of_trials))
```

```
Probability of success = 0.00106
CPU times: user 3min 8s, sys: 2.08 s, total: 3min 10s
Wall time: 3min 10s
```

This answer is very close you’d get from using probablity counting and formula!