faqts : Science : Mathematics : Probability

+ Search
Add Entry AlertManage Folder Edit Entry Add page to http://del.icio.us/
Did You Find This Entry Useful?

8 of 16 people (50%) answered Yes
Recently 4 of 10 people (40%) answered Yes

Entry

Math:Probability:Distribution:Binomial:Can you describe Bernouilli formula C(k,N). p^k. (1-p)^(N-k)?

Jan 16th, 2005 13:35
Knud van Eeden,


----------------------------------------------------------------------
--- Knud van Eeden --- 22 December 2004 - 01:00 am -------------------

Math:Probability:Distribution:Binomial:Can you describe Bernouilli 
formula C(k,N). p^k. (1-p)^(N-k)?

---

To describe in a loosely manner, using pattern recognition and
induction to get the idea of the formula for the binomial distribution,
also called Bernouilli trial.

---

Note:

'bi'nomial means here 2 possibilities, e.g. head and tail, or
probability p and (1-p), are involved, the generalization is
'multi'nomial distribution, meaning more than 2 are involved)

---

Given:

   N   k        (N - k)
  C  . p  . (1-p)
   k

---

To possibly understand this formula (which basically is a rewriting of
counting the total of certain possibilities, using the rules of
combinatorics, and laws of probability and algebra) first look at
throwing a coin N times.

You see a certain structure emerging.

The probability structure follows exactly the same pattern, as it goes
hand in hand with each new throw.

So you can basically replace what is there.

You then show that the individual terms, after summing,
are represented by this general formula.

---

So toss a coin, in a serial serie of throws.

---

Let H be heads.

Let T be tails.

---

If you throw a coin 1 times,
you have as possibilities:

1. H

2. T

---

If you throw a coin 2 times,
you have as possibilities

1. HH
2. HT

3. TH
4. TT

---

If you throw a coin 3 times,
you have as possibilities

1. HHH
2. HHT

3. HTH
4. HTT

5. THH
6. THT

7. TTH
8. TTT

---

If you throw a coin 4 times,
you have as possibilities

 1. HHHH
 2. HHHT

 3. HHTH
 4. HHTT

 5. HTHH
 6. HTHT

 7. HTTH
 8. HTTT

 9. THHH
10. THHT

11. THTH
12. THTT

13. TTHH
14. TTHT

15. TTTH
16. TTTT

---

and so on.

---

Note:
At each throw the total amount of rows doubles, because for each row
there are 2 new entries, one for head and one for tail. So overall
it doubles.

---

Now sorting this rows for the amount of occurences of H
(0, 1, 2, 3, ...)
and counting them:

---

If you throw a coin 1 times,
you have as possibilities

1. H

2. T

or thus adding all the rows with
a same total amount of Hs together
you get:

1 row of 0 H (and thus 1 T)
1 row of 1 H (and thus 0 T)

---

If you throw a coin 2 times,
you have as possibilities

4. TT

2. HT
3. TH

1. HH

or thus adding all the rows with
a same total amount of Hs together
you get:

1 row  of 0 H (and thus 2 T)
2 rows of 1 H (and thus 1 T)
1 row  of 2 H (and thus 0 T)

---

If you throw a coin 3 times,
you have as possibilities

8. TTT

4. HTT
6. THT
7. TTH

2. HHT
3. HTH
5. THH

1. HHH

or thus adding all the rows with
a same total amount of Hs together
you get:

1 row  of 0 H (and thus 3 T)
3 rows of 1 H (and thus 2 T)
3 rows of 2 H (and thus 1 T)
1 rows of 3 H (and thus 0 T)

---

If you throw a coin 4 times,
you have as possibilities

16. TTTT

 8. HTTT
12. THTT
14. TTHT
15. TTTH

 4. HHTT
 6. HTHT
 7. HTTH
10. THHT
11. THTH
13. TTHH

 2. HHHT
 3. HHTH
 5. HTHH
 9. THHH

 1. HHHH

or thus adding all the rows with
a same total amount of Hs together
you get:

1 row  of 0 H (and thus 4 T)
4 rows of 1 H (and thus 3 T)
6 rows of 2 H (and thus 2 T)
4 rows of 3 H (and thus 1 T)
1 rows of 4 H (and thus 0 T)

---

Now you might recognize that the total amount of rows of a given amount
of H is found by using the triangle of Pascal numbers.

        1
      1   1
    1   2   1
  1   3   3   1
1   4   6   4   1
...


So written vertically

1
1

1
2
1

1
3
3
1

1
4
6
4
1

...

---

This numbers happen to be the combinatorial numbers

 N
C
 k

---

To derive e.g. the value of that 4 rows of 1 'H', after 4 times 
throwing
the coin.

You have thus 1 'H' (and 3 'T'), and you want to look at the total 
amount
of different possibilities of putting these together in a line.

---

Now given that 'TTT' is already there, you can put the 'H' inside on 4
different positions (at the first position, second position, third 
position
and last position, thus totally four positions):

 HTTT
 or
 THTT
 or
 TTHT
 or
 TTTH

---

Another better explanation to get this combinatorial number here is,
suppose that you have 4 different objects, which you can put in 4
positions in a line.

The first object has 4 positions to be put in
(the first, second, third and fourth position)

The second object has 4 - 1, or 3 positions to be put in
(as one slot is already occupied by the first object)

The third object has 4 - 2, or 2 positions to be put in
(as two slots are already occupied by the first and second object)

The fourth object has 4 - 3, or 1 positions to be put in
(as three slots are already occupied by the first, second and third
object)

Thus altogether 4 . (4 - 1) . (4 - 2) . (4 - 3) possibilities.

Or thus 4 . 3 . 2 . 1 possibilities.

Or thus 4! possibilities.

In case some objects are equal (as there are 1 'H' and 3 'T' here), you
have to divide this total amount by their permutations (as any 
permutation,
or thus rotation of 3 equal objects 'TTT' again gives 'TTT' so you can
not really see any difference),
thus here 3 . 2 . 1, or thus 3! for this 3 'T', and 1 or thus 1! for 
this
single 'H' respectively.

So you get here for the total amount of possibilities,

 4!
-----
3! 1!


If you generalize this idea, you get thus similarly:

               (total amount of objects)!
-------------------------------------------------------------------
(total amount of equal objects1)! (total amount of equal objects2)!

or thus here similarly:

                 N!
-----------------------------------------
(total amount of H)! (total amount of T)!

or thus abbreviated:

      N!
---------------
 k! . (N - k)!

(as you know that total amount of H and total amount of T together
must be N, so to get T you subtract the total amount of H, from N)

and that is exactly this combinatorial number, which you can calculate
or derive otherwise for small numbers also from the triangle of Pascal.

---
---

To finally get the Bernouilli trial formula, you can replace
the H and T by p and (1 - p) respectively.

The probability structure follows exactly the same pattern,
as it goes hand in hand at each throw of the coin.
So you can basically replace what is there.

This because you get the probability per definition by dividing the
quantities by the total of quantities.

And dividing or thus multiplying by a certain number (1 / ...) is just
a special case of scaling. So you only make the vertical y values
smaller or larger, but thus the overall shape of the curve remains the
same.

---

Each time you throw a coin, the corresponding probability
for head or thus H is p, and the corresponding probability for
tail or thus T is (1 - p), and remains so.

---

At each new throw, you calculate the product of the probabilities.

To understand that a product is what you should have, think of the
following.

If you can choose random from 10 roads, and only 1 leads to the goal,
followed by that you after that have to choose random from 20 roads, 
where
only one leads to the goal, you can imagine that only once in 10 . 20 
or
1 in 200 times you get to your goal.

Or thus translated in probabilities, you calculate the product of the
probabilities 1/10 and 1/20.

Similarly for throwing a coin, with say a probability of once in 2
throws throwing a H, or thus a probability of H of 1/2.

---

So you get e.g. for 4 throws the following
possibilities (products):

 1. H . H . H . H
 2. H . H . H . T

 3. H . H . T . H
 4. H . H . T . T

 5. H . T . H . H
 6. H . T . H . T

 7. H . T . T . H
 8. H . T . T . T

 9. T . H . H . H
10. T . H . H . T

11. T . H . T . H
12. T . H . T . T

13. T . T . H . H
14. T . T . H . T

15. T . T . T . H
16. T . T . T . T

---

Sorting again on amount of 'H' you get thus:

16. T . T . T . T

 8. H . T . T . T
 9. T . H . H . H
12. T . H . T . T
14. T . T . H . T

 4. H . H . T . T
 6. H . T . H . T
 7. H . T . T . H
10. T . H . H . T
11. T . H . T . H
13. T . T . H . H

 2. H . H . H . T
 3. H . H . T . H
 5. H . T . H . H
15. T . T . T . H

 1. H . H . H . H

---

You can add all products with the same amount of 'H' together
(probability sum rule), and permutation of a product does not change
its value, so that terms are all equal.

---

So you get

1 . T . T . T . T

4 . H . T . T . T

6 . H . H . T . T

4 . H . H . H . T

1 . H . H . H . H

---

Note:

So you see that the combinatorial number

 N
C
 k

tells howmany 'equal' rows can be added together.
(as in a product the order of the factors is not of
 importance for the end result.
 For example H . T . T . T equals thus T . H . T . T, so
 can be added together to
 (H . T . T . T) + (T . H . T . T)
 or thus
 2 . H . T . T . T
 In the example above there are thus 4 such 'equal' rows which can be
 added together.

---

Writing this with exponents you get

     0   4
1 . H . T

     1   3
4 . H . T

     2   2
6 . H . T

     3   1
4 . H . T

     4   0
1 . H . T

---

Note:

Because you have 4 throws, the number of H and T together
must also equal to 4.
Thus their exponents must added together also be 4.
Or in general their exponents added together must be
N.

---

Now replacing H by p, and T by (1 - p).

---

Note:

You can use (1-p) here because of the fact that the total probability
of H and T must be equal to 1 (basically saying that you can not have
more than 100% together).
Thus the probability of T is 1 minus the probability of H.

---

So you get:

     0       4
1 . p . (1-p)

     1       3
4 . p . (1-p)

     2       2
6 . p . (1-p)

     3       1
4 . p . (1-p)

     4       0
1 . p . (1-p)

---

where thus each of this terms is a special case of the respective
Bernouilli trial formula:


   N!         k         (N-k)
---------- . p   . (1-p)
k! (N-k)!

---

So if you ask yourself for example, what is the probability
to find 3 heads 'H' after 4 times throwing a coin, you have thus to
select the term

     3       1
4 . p . (1-p)

or if you fill it in in the general formula, you get the result
immediately:

   4!         3         (4-3)
---------- . p   . (1-p)
3! (4-3)!

or thus also

     3        1
4 . p  . (1-p)

---
---

Note:

The expression 'Binomial' distribution, which points to a connection 
with the
binomial theorem

        N
 (a + b)


or thus here the identity, by replacing a by p, and b by (1-p)

              N
 (p + (1 - p))  = 1

comes from the fact that you get a similar structure if you work this 
out.

---

e.g. if you choose N equal to 2, thus 2 throws with a coin, you get:

              2
 (p + (1 - p))  = 1

or thus:

      2                                2
 1 . p  + 2 . p . (1 - p) + 1 . (1 - p)  = 1

---
---

Note:

To get the cumulative Binomial distribution, you can just add the 
corresponding terms.
E.g. what is the probability to find 3 or fewer heads, after throwing 
4 times a coin?

Thus you add all the rows containing 0, 1, 2 or 3 'H' together.
Thus you get a total of

16. T . T . T . T

 8. H . T . T . T
 9. T . H . H . H
12. T . H . T . T
14. T . T . H . T

 4. H . H . T . T
 6. H . T . H . T
 7. H . T . T . H
10. T . H . H . T
11. T . H . T . H
13. T . T . H . H

 2. H . H . H . T
 3. H . H . T . H
 5. H . T . H . H
15. T . T . T . H

---

or thus translating this in probabilities:

     0       4
1 . p . (1-p)

     1       3
4 . p . (1-p)

     2       2
6 . p . (1-p)

     3       1
4 . p . (1-p)

---

or thus using the formula

                          N!         k         (N-k)
SUM( k = 0 to 3 ) of ( ---------- . p   . (1-p)       )
                       k! (N-k)!

---
---

Note:

To more clearly see how the binomial distribution is build,
use the most straight forward representation, by
putting the possibilities vertically:

If you throw a coin 1 times,
you have as possibilities

1  T  H
 -------
   0  1   -> total amount of heads H

---

If you throw a coin 2 times,
you have as possibilities

2    TH
1 TT HT HH
 ----------
   0  1  2   -> total amount of heads H

 -------

---

If you throw a coin 3 times,
you have as possibilities

3     TTH HHT
      THT HTH
1 TTT HTT THH HHH
 ------------------
    0   1   2   3   -> total amount of heads H

---

If you throw a coin 4 times,
you have as possibilities

6           HHTT
            HTHT
4      TTTH HTTH HHHT
       TTHT THHT HHTH
       THTT THTH HTHH
1 TTTT HTTT TTHH THHH HHHH
 ---------------------------
     0    1    2    3    4  -> total amount of heads H

---

If you throw a coin 5 times,
you have as possibilities

10            HTTTH THHHT
              THTTH HTHHT
              HTTHT THHTH
              TTHTH HHTHT
              THTHT HTHTH
5       TTTTH HTHTT THTHH HHHHT
        TTTHT TTTHH HHHTT HHHTH
        TTHTT TTHHT HHTTH HHTHH
        THTTT THHTT HTTHH HTHHH
1 TTTTT HTTTT HHTTT TTHHH THHHH HHHHH
 ----------------------------------------
      0     1     2     3     4     5

               -> total amount of heads H

---

If you throw a coin 6 times,
you have as possibilities


20                    THHHTT
                      THHTTH
                      THTTHH
                      TTHHHT
                      TTHHTH
15             HTTTTH TTHTHH THHHHT
               THTTTH HHHTTT HTHHHT
               HTTTHT HHTTTH THHHTH
               TTHTTH HTTTHH HHTHHT
               THTTHT TTTHHH HTHHTH
               HTTHTT HTTTHH THHTHH
               TTTHTH HTTHHT HHHTHT
               TTHTHT HTHHTT HHTHTH
               THTHTT HHTTTH HTHTHH
6       TTTTTH HTHTTT HHTTHT THTHHH HHHHHT
        TTTTHT TTTTHH HHTHTT HHHHTT HHHHTH
        TTTHTT TTTHHT TTTHHH HHHTTH HHHTHH
        TTHTTT TTHHTT TTHHHT HHTTHH HHTHHH
        THTTTT THHTTT THHHTT HTTHHH HTHHHH
1 TTTTT HTTTTT HHTTTT HHHTTT TTHHHH THHHHH HHHHHH
 -------------------------------------------------
      0      1      2      3      4      5      6

                        -> total amount of heads H

and so on...

---
---

Note:

In the case that the probability of head equals the probability
of tails (e.g. both are 1/2):

Mirror symmetry (with respect to the vertical axis)

You see further that if there is an odd number of throws (1, 3, 5, 7,
9, ... throws) that because of symmetry (amount of heads versus tails
which are each others complement (e.g. 2 heads + 4 tails on the left, 
versus
4 heads + 2 tails on the right, so basically a mirror situation)

Or thus:

0 heads and 6 tails on the left, 6 heads and 0 tails on the right,
1 head  and 5 tails on the left, 5 heads and 1 tails on the right,
2 heads and 4 tails on the left, 4 heads and 2 tails on the right,
3 heads and 3 tails on the left, 3 heads and 3 tails on the right
(this is thus 1 peak)
4 heads and 2 tails on the left, 2 heads and 4 tails on the right
5 heads and 1 tail  on the left, 1 head  and 5 tails on the right
6 heads and 0 tails on the left, 0 heads and 6 tails on the right.

Only the names (=head or tail) differ, but the quantities are the
same, so the result should be the same, left and right. Thus the
vertical height is the same. Thus symmetry.

In general
N-m heads and m tails on the left, m heads and N-m tails on the right
will give equal vertical results.

You see further that the maximum amount of possibilities
occurs when the number of heads equals the number of
tails, or thus if

 N-m = m

or thus

 N = m + m

or thus

 N = 2 . m

or thus

 m = N/2

(e.g. here 3 heads and 3 tails).

That will thus in general only occur when in the case of an even
total amount of throws, thus N is even.
(thus 1 head and 1 tail, 2 heads and 2 tails,
3 heads and 3 tails, or thus totally 1+1=2, 2+2=4, 3+3=6, 4+4=8,
5+5=10, ... throws).
Thus an even amount of throws will show 1 single peak.

In case of an odd amount of throws totally, there are 2 peaks,
because there is no single peak, because there is no ( N-m = m )
situation, because N is here per definition not even, thus there
is no single peak. But there is still mirror symmetry.
Thus peaks at N-m is equal in height as at m.
Thus there are 2 equally heigh peaks, also around the center,
in case of an odd amount of throws.

---
---

Note:

I created this barcharts manually systematically
by first letting a fixed group of
HH... traveling to from the left to the right
then a fixed group
H.H... traveling from the left to the right
then
H..H.. traveling from the left to the right
and so on until the last
H....H

---

If I saw there was a symmetry,
I swapped the Hs and the Ts, using my
text editor.

---
---

Note:

If the occurrence (thus probability) of say head is much smaller than 
the probability
of tail then there will the height of the heads vertically will be 
much smaller.
While on the other hand the vertical height of tails will be much 
heigher.
So you get an asymmetric graph (low on the head side, and high on the 
tail side).

---
---

Book: see also:

---

[book: Kramer, Edna Erestine - the nature and growth of modern 
mathematics - p. 312 'From dice to quantum theory and quality control']

---

[book: Spiegel, Murray R. - Probability and statistics - McGraw-Hill 
(Schaum series) - p. 119 'The Binomial distribution']

---
---

Internet: see also:

---

Binomial Distribution
http://mathworld.wolfram.com/BinomialDistribution.html

---

Binomial Distribution
http://en.wikipedia.org/wiki/Binomial_distribution

---

Math: Probability: Distribution: Link: Overview: Can you give an 
overview of links?
http://www.faqts.com/knowledge_base/view.phtml/aid/32917/fid/815

----------------------------------------------------------------------