Combinatorics & Probability
[Utility]
(2002-02-05) The Monty Hall Paradox
In a game show,
the contestant wins if he guesses correctly which one of three doors hides the (only) prize.
The rules of the game state that the contestant makes a tentative guess and is then
shown one wrong choice among the two doors which he did not pick.
He is then given the opportunity to change his guess.
Should he do so?
Yes, always! The chances of winning are doubled by switching this way.
The reason is that the original guess is right with probability 1/3, whereas one of the
other two choices is correct with probability 2/3.
The rules are actually a clever disguise for a simple choice between two alternatives:
Would you rather win if the
prize is behind the door you first picked (1/3) or win if it's not (2/3)?
Before making national news in 1991 (see below), this puzzle appeared under other
guises, including The Three Prisoner Problem,
which Martin Gardner presented in 1959, in his Scientific American column
(and again in his 1961 book More Mathematical Puzzles).
The popularity of Monty Hall's
TV show Let's Make a Deal revived interest
in the problem, about which Steve Selvin wrote a letter to the editor
entitled "A Problem in Probability", published
in the February 1975 issue of The American Statistician
(with a follow-up in August 1975).
Most notably, the problem was posed by [a reader of] the famous puzzle columnist
Marilyn vos Savant
(Parade Magazine, September 9, 1990),
whose correct answer started a flood of objections.
The thing became known as the
Monty Hall Problem, in honor of
the popular host of the TV show Let's Make a Deal (which ran 4500 times
from 1963 to 1990). The craze culminated in a front-page article of the
Sunday New York Times, on
July 21, 1991.
A number of people just could not believe the above answer was correct (they were wrong).
Fewer people remarked that the above rules were not applicable to the
situation(s) in Let's Make a Deal (they are right).
Actually, it's quite interesting to investigate how different "rules" affect the conclusion:
- What if the show's host is not obligated to reveal a wrong choice?
We can't begin to address the issue if we don't have a clue about the host's obligations.
For example, the producers may ask him to flip a fair coin in order to decide whether
or not to proceed as above (in such a case the contestant should still take the
opportunity to switch whenever available, but his overall probability of winning is
reduced from 2/3 to 1/2).
It may also be known to the contestant that the "hostile" host only gives the opportunity
to switch when the initial guess was right (in which case the contestant should clearly
always decline any such "opportunity" and will win with probability 1/3).
Alternately, more interesting assumptions may be made, along the following lines:
- From watching many previous shows, the contestant knows that he will be given
an opportunity to switch with probability x if his original guess was right
and with probability y otherwise.
Should he switch when given the opportunity to do so?
That's more interesting! Let's compute the contestant's probability of winning if
he decides to switch whenever he has he chance:
- He was right with probability 1/3.
In this case he wins, with probability (1-x),
only when the host does not give him a chance to switch.
- He was wrong with probability 2/3.
In this case he wins, with probability y, only when given the chance to switch.
A contestant who always switches will thus win with probability (1-x)/3 + 2y/3.
Of course, a contestant who never switches still wins with probability 1/3.
A contestant should switch whenever the former probability is
larger than the latter, namely when 2y is at least equal to x.
A host may want to keep his show interesting by having (clever) contestants
react differently (or indifferently) to the switching opportunity.
The above demonstrate that such a host should allow good guesses to
be changed twice as often as he does for bad guesses (so that x = 2y).
This makes the contestant win with probability 1/3 whether he decides to switch or not.
(2002-04-01) The Three Prisoners Paradox
[abridged]
Of three prisoners (A, B and C), only the one secretly pardoned by the governor
will escape execution.
The governor picked at random and told the warden.
The warden refuses to tell the fate of A,
but agrees to name another prisoner who's doomed;
he reveals that B was not pardoned.
The survivor will thus be A or C.
What's the probability that it will be A?
Before the warden's revelation, the probability that A would survive was clearly 1/3,
as the governor's "random" choice was presumably not biased in favor of anyone.
The confirmation of the execution of either one of the other two prisoners does not
changes that.
It's irrelevant to the fate of A, whose probability of survival remains 1/3.
However, things are looking brighter for C who is now known to survive with probability 2/3,
since he escapes execution whenever A does not.
This is another version of the infamous "paradox" discussed in the
previous article.
However, this presentation of the so-called Three Prisoners Paradox predates the
equivalent Monty Hall Paradox by several decades,
as it was introduced by Martin Gardner in the October 1959 issue of
Scientific American.
A very recent account may be found in Gardner's own 2001 book:
The Colossal Book of Mathematics, W.W. Norton & Company, New York,
ISBN 0-393-02023-1.
( J. S. of Parkville, MD.
2000-05-03)
In how many ways can 6 children be seated at a circular table?
-
120 ways.
This is the factorial of 5
(namely, 5! = 5´4´3´2´1);
the number of ways the 5 other children may be placed to the left of
some arbitrarily chosen kid.
5 ways to choose the kid placed first to the left, 4 ways to chose the second,
3 ways for the third, 2 ways for the fourth, 1 way for the last...
(2002-02-15)
A "Bachet square" is a 4 by 4 layout of the 16 court cards in a deck
(aces and face cards) which is such that every suit and every value appears
once and only once in every row or column and in both diagonals.
How many different Bachet squares are they?
Well, there are only two ways to arrange 4 letters in a square
(once the first row is given) so that each letter appears once in each row,
in each column, and in either diagonal:
A | B | C | D | |
A | B | C | D | |
C | D | A | B |
D | C | B | A |
D | C | B | A |
B | A | D | C |
B | A | D | C |
C | D | A | B |
There are 576 = (4!)2 choices for the first row in a Bachet square,
namely, 4! = 24 ways to choose the order of the suits and
4! = 24 choices for the sequence of values.
For each such first row, we obtain two Bachet squares,
by using either the first pattern for suits and the second one for values,
or vice versa. There are thus 1152 Bachet squares.
Since there are 16! = 20922789888000 possible squares, the probabilty that a random
square arrangement of court cards is a Bachet square is only one in 18162144000.
These rare arrangements are named after
Claude Gaspar Bachet de Méziriac (1581-1638)
who first published this puzzle in 1624.
(by popular request)
Choice Numbers
What does C(n,p) mean?
[ We don't recommend the notation nCp. ]
C(n,p) is usually pronounced "n choose p".
It is simply the number of ways there are to pick
p objects among n different ones.
It's so commonly used when counting things that we need
a special notation for this.
In handwriting or in print, you will often encounter C(n,p)
written as a pair of parentheses enclosing an n above a p.
For example C(10,5) may be written as a 10 above a 5 between parentheses,
without anything else within the parentheses:
C(10,5) = |
ì10 î 5 |
ü þ |
= 252 |
How do we obtain this value of C(10,5)?
Well, let's first count how many ways you could pick the
5 elements in order:
You have 10 choices for the first element,
but then there are only 9 unpicked elements left, so you only have 9 choices for
the second element, 8 for the third, 7 for the fourth and 6 for the fifth.
All told, you have
10´9´8´7´6 = 30240
ways to choose an ordered sequence of
5 objects if you have 10 to choose from.
This is not quite the answer we want, because we're after
the number of ways you could get a given set of 5 elements
irrespective of the order in which you pick them... Read on.
Let's count the number of ways to order 5 things.
It looks very much like what we did before:
You have 5 ways to pick the first,
4 ways to pick the second, 3 ways to pick the third, 2 ways to pick the fourth,
and just 1 possibility left for the fifth and last.
All told, you have 1´2´3´4´5=120
ways to order 5 elements.
Each of your 30240 sequences of 5 objects is thus part of
one (and only one) collection of 120 sequences
in which the same objects appear, in some order.
The number of such collections is clearly 30240/120, or 252.
Therefore, there are 252 different ways to pick 5 unordered elements among 10: C(10,5)=252.
Make sure you understand the above example before going any further...
Instead of writing
1´2´3´4´...´(n-1)´n,
you may write n! (it's called a factorial,
but there's nothing to it; it's just shorthand).
The factorial notation makes it easy to express what we just told you about:
What's the number of ways you can order p elements?
Well, p!, of course...
The number of ways there are to pick an ordered
sequence of p elements from n possible ones,
is obtained with the same method used in the above example:
n (n-1) (n-2) ... (n-p+1) = n! / (n-p)!
Think about it:
The left-hand side has p factors and the right-hand side is
a product of n factors divided by (n-p) factors so there are only p of them left...
In terms of factorials, C(n,p) equals the above divided by (p!). Therefore:
C(n,p) = n! / [(n-p)!p!]
I hope I did not lose you.
If all these things are new to you, it's normal to get confused at first.
Study the above and everything will be clear, eventually...
The choice numbers C(n,p)
are also called binomial coefficients and they
will be useful to you for the rest of your mathematical life, in many situations...
Pascal's Formula and Pascal's Triangle:
C(n+1,p+1) = C(n,p) + C(n,p+1)
This could be shown
by using the above expression for C(n,p), but there's an easier approach...
Just notice that you can pick (p+1) objects among (n+1) in one of two ways:
Either you pick the last object or you don't!
You've C(n,p) ways to do the first thing and C(n,p+1) ways to do the second.
With this formula, you can easily build (without any multiplications)
what's called "Pascal's triangle", which is simply what you have when you
write the coefficients C(n,p) in a table where n is the number of the line
(lowest on top) and p the number of the column (lowest to the left).
Each coefficient is the sum of the two that are above it
(one is directly on top, the other is to the left of that one).
You may want to notice that C(n,n) is 1, C(n,0) is also 1
(there's only one way to pick no object at all), C(n,p)=C(n,n-p)
(there are as many ways to pick p objects as there are to leave p objects), etc.
Build your own Pascal's Triangle... Frame it.
0 1 2 3 4 5 6 7 8 9 10 11 12 ...
0: 1
1: 1 1
2: 1 2 1
3: 1 3 3 1
4: 1 4 6 4 1
5: 1 5 10 10 5 1
6: 1 6 15 20 15 6 1
7: 1 7 21 35 35 21 7 1
8: 1 8 28 56 70 56 28 8 1
9: 1 9 36 84 126 126 84 36 9 1
10: 1 10 45 120 210 252 210 120 45 10 1
11: 1 11 55 165 330 462 462 330 165 55 11 1
12: 1 12 66 220 495 792 924 792 495 229 66 12 1
13: 1 13 78 286 715 ...
calcuqueen (Linda of Hot Springs, SD.
2000-10-09)
A few days ago
[2000-10-06],
I asked how many 3-scoop sundaes I could make with 7 flavors of ice cream.
At first, I believed it was
a C(7,3) = 35 problem, but then I discovered this was not the case.
The combination does not allow for repeated use of the same flavor!
I found the answer by complete enumeration -- and someone posted
a very nice solution.
The answer is 84, which is 7 + (6)(7) + C(7, 3).
I find it interesting that the answer can also be found by:
(1)(7) + (2)(6) + (3)(5) + (4)(4) + (5)(3) + (6)(2) + (7)(1) = 84.
Does anyone recognize this pattern? It is not a combination OR a permutation,
but seems related [...]
(Taylor of Portland, OR.
2000-10-22)
If I have 31 different flavors of ice cream , how many triple scoop cones
can be made without duplicating combinations?
Well, it turns out that the solution to your problem (with N flavors) is an
ordinary combination number, namely C(N+2,3), which is N(N+1)(N+2)/6.
That's 84 if N = 7, or 5456 if N = 31...
To see this in concrete terms, consider that your list of flavors is given in some order
on the menu (alphabetical, say).
A weird waitress insists that you may not name any flavor more than once but are allowed
to use the words FIRST and LAST (at most once) to refer either to the first or the last
flavor in the menu list (among those you named).
Each type of sundae can be so ordered in a unique way.
This is equivalent to picking 3 different "flavors" among N+2
(adding the two bogus "flavors" FIRST and LAST), which can be done in C(N+2,3)
different ways.
Now, about Linda's pattern (which is indeed pretty):
It is fairly easy to prove its correctness using, for example,
the formulas for sums of consecutive integers and squares of consecutive integers.
However, there's a better way, using straight combinatorics
(which Linda may or may not have used to discover the pattern):
To order a sundae, you may first choose what will be your "middle" flavor on the menu's
ordered list.
You are then allowed to pick one flavor below the middle one (including itself)
and one flavor above it (including itself), for a total of 3 flavors, allowing any
repetitions.
You have K(N-K+1) choices if your middle choice is Kth on the list
(K choices for the lower flavor, (N-K+1) for the upper one).
The sum over all possible choices (from K=1 to K=N) is the total number of possible sundaes.
That's also Linda's "pattern".
Matthew LeBaron (2002-03-15; e-mail)
[...] Shirts come in 3 sizes (S, M, and L) and 5 colours (Red, Blue, Green, Black,
and White). They are sold in packages containing 4 randomly selected shirts.
[Order within a package is irrelevant and duplicates ares allowed.]
How many unique packages are there?
What's the formula to determine this, given any set of numbers? [...]
There are C(15+3,4) = 3060 distinct packages.
You have n = 15 different types of shirts and must choose p = 4 of these
(allowing duplicates); the generalized question may thus be cast as follows:
-
Allowing duplicates, how many [unordered] ways are they to choose
p items of n possible types?
Well, it could be helpful to look up the previous article
(which corresponds to the special case p = 3)
in order to get the idea which leads quickly to the surprisingly simple answer:
There are C(n+p-1, p) such choices...
The reason is that there is a one-to-one correspondence between a choice of
p items among n+p-1 whithout repetitions and a
choice of p items among n types allowing repetitions.
This can be seen by introducing p-1 bogus types in the former case (call them
FIRST, SECOND, ... LAST to be consistent with the nomenclature introduced in the
previous article).
We assume the regular types (there are n of them) are listed in some
fixed order (lexical, say).
Whenever present in the no-repetition choice,
the m-th bogus type simply means a repetition of the m-th type listed
in the duplicates-allowed choice
once lower bogus types have been similarly "translated".
For example the [unordered] combination {A, B, C, SECOND, THIRD}
corresponds to (A, B, B, B, C).
It should be clear that this correspondence is indeed one-to-one,
so that the total number of choices is the same in either case.
Just as in the previous article, there are several ways to count the same thing,
and this may lead to interesting identities. For example, we may build a
duplicates-allowed choice of p items of n possible types by first choosing a certain
number q+1 of different types [for some q between 0 and p-1], which can be
done in C(n,q+1) ways. Once this is done, we have C(p-1,q) possibilities to
cast the chosen types on the p items. (HINT: You may do so by choosing at which q
"intervals" in an ordered lineup of p items the item type is seen to "increase".)
All told, this demonstrates that C(n+p-1,p) is equal to the sum of the p terms
C(n,q+1)C(p-1,q) when q ranges from 0 to p-1, which may not have been
otherwise so obvious... (Check it!)
(2004-02-14)
Leterrier's Problem
In general, n points of the plane define C(n,2) straight lines, of which
no two are parallel and no three meet outside of the n original points.
How many new points of intersections are there?
I am dedicating this to my high-school teacher,
Mr. Leterrier, who posed this nice
question to my class 31 or 32 years ago (time flies).
I still remember finding the answer through some awkward method,
just like the rest of the class did.
Only after simplifying the result did it occur to me that there should be an easy
solution.
Give it a fair shot before peeking!
(J. F. of Los Angeles, CA.
2000-10-29)
Suppose 1.5 percent of the plastic spacers produced by a high speed mold
injection machine are defective.
For a random sample of 200 spacers, find the probability that:
A. None of the spacers are defective.
B. Three or more of the spacers are defective
With p=0.015 and q=1-p=0.985, exactly k spacers are defective with probability
C(200,k) pk´q200-k.
In particular:
No spacer is defective (k=0) with probability q200, which is about 4.86683%.
Exactly one is defective (k=1) with probability
200´p´q199,
or about 14.823%
Exactly 2 are defective with probability
19900´p2´q198,
or about 22.46%.
Adding the above three results, we see that 2 or less spacers are defective
with a probability of about 42.15%.
Therefore, 3 or more are defective with a probability of about 57.85%.
Incidentally, the most likely number of defective spacers is k=3
(with a probability of about 22.574%), after which the probabilities decrease rapidly:
16.93% for k=4, 10.1% for k=5, 5% for k=6, 2.11% fo k=7,
0.76% for k=8, 0.25% for k=9, 0.073% for k=10, etc.
In a sample of n spacers, the average number of bad spacers is n times the probability
of a bad spacer. Here, that means 200´0.015, which is also 3.
goldda (2001-03-14)
What is the probability [that] 2 children in a family of 5 have the same birthday?
As usual, observe
that "2 children have the same birthday" when 3, 4, or 5 do...
Let's first suppose that we are told that no one is born on February 29th:
If we assume each kid has one of 365 equiprobable birthdays,
it's easier to compute the probability that they all have different birthdays:
The second kid has a birthday different from the first one with probability (364/365).
Knowing that the first two birthdays are different,
the third birthday is different from these two with probability (363/365), etc.
All told, the probability of having 5 different birthdays is
364×363×362×361/3654.
The probability that (at least) two kids share the same birthday is therefore
1 - 364×363×362×361/3654,
or 481626601/17748900625, which is about 2.71% (0.0271355737).
Now, let's bring leap years and February 29 into the picture.
We may even afford the luxury of using the full Gregorian calendar
(century years are not leap years except when divisible by 400; 2000 was a leap year,
1900 was not).
This is appropriate only when we do not know at what time the family lived
(for a 20th or 21st century family the Julian odds of exactly one leap year in 4
are more appropriate since 2000 was a leap year).
The Julian probability of having one's birthday on Feb. 29 is q = 1/1461
(1 leap year in 4 years).
The corresponding Gregorian probability is q = 97/146097
(97 leap years in 400 years).
Using either value for q,
we may state that none of the 5 kids were born on February 29
with a probability (1-q)5 for which the above analysis applies,
so two kids have an identical birthday other than Feb. 29 with a probability
(481626601/17748900625)(1-q)5.
To this, we should add the probability that at least 2 kids were born on Feb. 29,
which is 1-(1-q)5.
All told, this gives a probability of
180043909073061/6656578833083301
or about 2.7047512782 % in the Julian case (limited to 20th/21st century),
whereas the Gregorian case (from recent history to far into the future) corresponds
to a probability of about 2.7050013288 %.
(That's exactly 1800420599383851642470257 chances in 66558954341646251642470257.)
To summarize, the desired probability is about:
- 2.705001 % based on 97 leap years in 400 years.
- 2.704751 % assuming every 4th year is a leap year (1901-2099)
- 2.713557 % if we know no one's born on Feb. 29.
Does this result apply to 5 siblings?
No, it does not.
Heck, it doesn't even apply to a random group of real people,
because maternity wards are notoriously busier at certain times of the year...
Various additional statistical biases apply to siblings.
One major observation is that twins are not so rare,
especially among large families.
A minor observation is that the same woman cannot give birth in
March and June of the same year...
For completeness, two people born on the same day of the same year may have the
same mother, even if they're not twins.
(Guess how.)
By itself, the possibility of twins (in about 2% of live births)
overshadows and invalidates the above result when applied to siblings.
deweydog (2001-03-24)
If you have a binomial distribution with n=200 and p=0.47,
what would be the variance? How do you work this [out]?
If n independent random samples are either equal to 1 (with probability p)
or to 0 (with probability 1-p), the sum of their values is a random variable which
is said to have a binomial distribution;
its average is np and its variance is np(1-p).
Here, this means 200´0.47´0.53,
or 49.82.
(The standard deviation is the square root of this, roughly 7.0583.)
The explanation for this simple formula repays study. Here it is:
To obtain a binomial distribution (both in theory and in practice),
you may consider the sum Y of n independent random variables (X1, X2, ... Xn),
each of which being equal to 1 with probability p and to 0 with probability q = 1-p.
The sum is equal to k with probability
C(n,k)(1-p)n-k pk .
The "binomial" distribution derives its name from the
binomial coefficients
C(n,k) which appear in this expression.
The average ("expectation" or "expected value") E(Y) of a sum being the sum of the averages,
we have simply E(Y)=np.
(The expectation E is a linear function.)
With n=200 and p=47%, the sum Y will have an expected value of E(Y)=94.
The variance V(Y) of Y is the expectation of the random variable
(Y-E(Y)) 2,
namely
V(Y)=E([Y-E(Y)] 2=E(Y 2-2E(Y)Y+E(Y) 2)=E(Y 2)-E(Y) 2.
(The last equality in this relation is known as Koenig's theorem.
It is obtained simply by noticing that E(Y) is just a constant number.
The expectation of a constant number is itself and the expectation
of a random variable multiplied by any constant number is simply that constant
number multiplied by the expectation of the random variable.)
For two independent random variables A and B,
the relation E(AB)=E(A)E(B) holds
(which is not generally the case for dependent variables).
You may consider the variance of the sum A+B:
V(A+B)=E([A+B] 2-[E(A+B)] 2.
V(A+B) may thus be expressed as
E(A 2+B 2+2AB)-[E(A)+E(B)] 2.
Since we have E(AB)=E(A)E(B) for statistically independent variables,
this boils down to
E(A 2)+E(B 2)-E(A) 2-E(B) 2,
which you may recognize as equal to V(A)+V(B).
In other words, if A and B are independent, V(A+B)=V(A)+V(B).
Therefore, the variance of the sum Y is n times the variance of each independent term X.
What is the variance of X? Well, as X is either 0 or 1,
X 2=X
and
E(X 2)=E(X)=p
so that
V(X)=E(X 2)-E(X) 2=p-p 2=pq.
All told, V(Y) = npq , as advertised.
(Seetharaman of United Kingdom.
2000-10-31)
In standard deviation calculations, why are we using N-1 instead of N?
There are some books where I find N being used instead of N-1.
Both formulas are correct but they apply to different situations.
Basically, you use N when you know the theoretical average of the sample
(for example, you may know it's zero because of some symmetry in the the phenomenon),
whereas you use N-1 when you are working with an estimate of the average,
obtained by averaging the sample at hand.
This latter use of (N-1) comes from the fact that you want what's called an
"unbiased" estimator of the standard deviation,
namely you want the calculations to yield a result which does not
have a systematic built-in error...
Consider n samples X1, X2, ... Xn
obtained from an unknown random variable X.
The expected value of the sum is the sum of the expected values.
Therefore, if you want to estimate the mean of X, your best shot is to divide by n
the sum of the samples. Call this the "sample average".
It's an "unbiased" estimate of the mean in the sense that the expected value
of the "sample average" equals the true mean of the random variable X. So far so good.
Now, what happens if you want to estimate the standard deviation but do not know
the true mean of X?
Well, you're gonna have to use the best estimate of the mean at your disposal,
namely the sample average.
The problem is that this sample average is obtained from the sample itself
and is therefore not completely independent of each individual sample.
Let's see what the expected value is for the square of the deviation of the sample
X1 from the sample average
(X1+...+Xn)/n:
That quantity is
(X1-(X1+...+Xn)/n)2
and its expected value may be computed from the fact that the samples are independent:
We obtain
(1-1/n)2 ´ E(X2)
+ (n-1)/n2 ´ E(X2)
- 2(1-1/n)(n-1)/n ´ E(X)2
+ (n-1)(n-2)/n2 ´ E(X)2
which is equal to
(n-1)/n ´ [(1-1/n)E(X2)
+ 1/n ´ E(X2)
- 2(1-1/n)E(X)2
+ (1-2/n)E(X2)]
and which boils down to (n-1)/n ´
[E(X2)-E(X)2].
Now, the quantity E(X2)-E(X)2 equals the square
of the standard deviation s.
In other words, the square of the deviation of X1 from the sample average
is expected to be (n-1)/ns 2 for each sample.
The same thing applies to X2,...,Xn and,
therefore, the sum of n such quantities
(one for each sample) has an expected value of
(n-1)s 2.
This explains why you have to divide the sum by (n-1) instead of n when you're working
with the sample average because the true mean is unknown.
When the true mean is used, there are no such complications from interdependencies
of the quantities involved and the divisor n should be used.
In other words, we may estimate the standard deviation s
of a random variable X from
a set of n samples X1, ..., Xn
using either of the following unbiased formulas:
- s 2 =
[ X12 + ... + Xn2 -
n ´ m2 ] / n
- s 2 =
[ X12 + ... + Xn2 -
n ´ A2 ] / (n-1)
m is the true mean of X, whereas A = (X1+...+Xn)/n
is an estimate of m based on the sample at hand.
MIMI192002 (2002-02-24)
Inclusion-Exclusion Enumeration
What is the probability P(AUBUC) of the union of 3 events A,B,C?
-
P(AUBUC) = P(A) + P(B) + P(C) - P(AÇB)
- P(BÇC)
- P(CÇA)
+ P(AÇBÇC)
You may prove this using the relation
P(AUE) = P(A) + P(E) - P(AÇE), with E=BUC,
namely: P(AUBUC) = P(A) + P(BUC)
- P((AÇB)U(AÇC)).
The last two terms may be expanded using the formula for the probalitity of a union again:
P(AUBUC) = P(A) + P(B) + P(C)
- P(BÇC)
- P(AÇB)
- P(AÇC)
+ P((AÇB)Ç(AÇC)).
The last term of this expression is clearly
P(AÇBÇC).
This identity is a special case of a relation which is worth committing to memory:
The probability of a union is equal to:
- the sum of the probabilities of the individual events,
- MINUS the sum of the probabilities of the intersections of 2 events,
- PLUS the sum of probabilities of the intersections of 3 events,
- MINUS the sum of probabilities of the intersections of 4 events,
- PLUS etc.
This is easily proved by induction; the formula for n+1 events is derived from the
formula for n events just as we went from 2 to 3 events in the above.
This type of computation is often called an inclusion-exclusion
enumeration.
It holds either for the probability of events, or for the number of elements in sets.
WiteoutKing (Lowell, MA.
2002-07-17) [ Cf. abridged answer. ]
What are the odds in favor of being dealt a given poker hand?
There are C(52,5) = 2598960 different poker hands and each of them is dealt with the
same same probability.
[The choice number C(52,5) is pronounced "52 choose 5" and
is equal to the number of ways 5 objects can be picked among 52 when the order is irrelevant,
as explained above.]
Therefore, it's only a matter of counting how many such hands belong to a given type
to determine the probability of that type
(namely, the ratio of that number to 2598960, the total number of possible hands).
When the probability of an event is the fraction P = x / (x + y) ,
its so-called odds are said to be either x to y in favor
or y to x against.
We pay tribute to this popular way of expressing probabilities in the list below
(which assumes some familiarity with poker hands).
Note that there are normally 10 different "heights" for a straight and
that the ace (A) belongs to the lowest (A,2,3,4,5) and the highest (10,J,Q,K,A),
which is traditionally called a Royal Flush if all cards belong to the same suit.
- Royal Flush: C(4,1)C(1,1) = 4; P = 1/649740
("1 to 649739 in favor")
- Straight Flush: C(4,1)C(9,1) = 36; P = 3/216580
("3 to 216577")
- 4 of a Kind: C(13,1)C(48,1) = 624; P = 1/4165
("1 to 4164")
- Full House: C(13,1)C(4,3)C(12,1)C(4,2) = 3744; P = 6/4165
("6 to 4159")
- Flush: C(4,1) [C(13,5) - 10] = 5108; P = 1277/649740
("1277 to 648463")
- Straight: C(10,1) (45-4) = 10200; P = 5/1274 ("5 to 1269")
- 3 of a Kind: C(13,1)C(4,3)C(12,2) 42 = 54912; P = 88/4165 ("88 to 4077")
- Two Pairs: C(13,2)C(4,2)2C(44,1) = 123552; P = 198/4165 ("198 to 3967")
- Pair: C(13,1)C(4,2)C(12,3) 43 = 1098240; P = 352/833 ("352 to 481")
- High Card: (C(13,5)-10) (45-4) = 1302540; P = 1277/2548 ("1277 to 1271")
-
A word of explanation for the last count.
"High Card" means that the hand is "none of the above"
(two such hands would be compared highest card first to decide who wins).
These hands are uniquely determined from two independent choices:
First, pick 5 distinct values in one of the
[C(13,5)-10] ways which aren't straights,
then assign suits in one of the
[45-4] ways that aren't flushes.
The total of all of the above is 2598960 = C(52,5), because
every hand is of one and only one of the 10 types listed.
You may want to check this...
Should your own local rules disallow the 10th straight sequence
(A,2,3,4,5), the above counts for straights and/or flushes should be changed
(and the "High Card" count should be modified as well),
replacing 4, 36, 5108, 10200 and 1302540 respectively
by 4, 32, 5112, 9180 and 1303560 = (C(13,5)-9)(45-4).
Don Anglen
(2004-01-26)
7-Card Stud Poker
I'm told there are 41584 seven-card hands [out of C(52,7)] containing one of the
40 five-card straight flushes. Why not 40 C(47,2) = 43240 ?
Let's first count how many ways there are to get a royal flush
(A,K,Q,J,10 in the same suit) when you are dealt q cards.
Among the C(52,q) hands of q cards, the following number contain a royal flush
(recall that C(n,p) is 0 for a negative p):
C(4,1)C(52-5,q-5) - C(4,2)C(52-10,q-10) + C(4,3)C(52-15,q-15) - C(4,4)C(52-20,q-20)
The first term corresponds to a choice of one of the 4 possible royal flushes
followed by q-5 cards among the 47 remaining cards.
This would count more than once any hand with more than one royal flush in it,
so a negative adjustment must be made...
What you must do is a proper "inclusion-exclusion" enumeration
(as presented above) which is where the 3 additional terms come from.
The same thing goes for straight flushes (let's consider a royal flush to be a particular case
of a straight flush) except the alternating terms are more complex, and more numerous
[as many as 20 straight flushes could coexist in a hand of q = 26 cards].
Clearly, the "naive" 40 C(47,2) count (for q = 7)
tallies more than once any hand containing more than one straight flush...
Here's a correct enumeration,
where the n-th square bracket corresponds to the total number of hands
containing n straight flushes.
The bracket contains up to 4n-3 terms, each corresponding
to a given total number of cards used by the
n straight flushes under consideration
(this number can be anything from 4+n to 5n).
The coefficients in front of the binomial numbers add up to C(40,n):
[ 40 C(52-5,q-5) ]
- [ 36 C(52-6,q-6) + 32 C(52-7,q-7) + 28 C(52-8,q-8) +
24 C(52-9,q-9) + 660 C(52-10,q-10) ]
+ [ 32 C(52-7,q-7) + 56 C(52-8,q-8) + ... ]
- [ 28 C(52-8,q-8) + ... ]
+ ...
For example, the second of the above brackets corresponds to two given
("featured") simultaneous straight flushes (there may be more, we don't care):
Among the C(40,2) = 780 such featured pairs, 36 use 6 cards,
32 use 7 cards, 28 use 8 cards, 24 use 9 cards, and 660 pairs are disjoint
and use 10 cards. Many of the terms cancel out (we're
missing a shortcut here) so the formula can be greatly simplified,
but there's no need for sophistication in the case where
q = 7, since the terms already given explicitely
include all the nonzero ones.
For q = 7, the thing boils down to the advertised result:
40 C(47,2) - 36 C(46,1) = 41584
Don Anglen
(2004-01-26)
26-Card Stud Poker & q-Card Poker
What's the probability of getting a straight [or royal] flush with 26 cards?
The enumeration presented in the previous article is only easy for small hands
or very large ones:
For example, in a "hand" of 
q = 45 cards there's always at least one straight flush.
Only one hand of 44 cards is without a straight flush.
However, no such shortcut applies to the case q = 26.
We'll use a simple computer program to do the actual
counting fairly efficiently, in two steps:
Step 1:
First, let's consider the C(13,q) hands consisting of q cards of the same suit.
Let's call N(q) the number of these which contain at least one straight.
There are only 8192 different such hands for q between 0 and 13 (that's 2 to the power of 13)
so it's easy to tabulate N(q) in a fraction of a second with a computer.
To do so, we assign each hand a pattern of 13 bits
(corresponding to the integers from 0 to 8191) with the most significant bit
set if we hold an Ace, the next one for a King, and so on,
down to the least significant bit for a deuce.
The number 31 (i.e, 5 consecutive bits set to 1) multiplied by
any power of two, from 1 to 256, gives a so-called mask
corresponding to one of the 9 regular straights,
while the mask for the tenth straight (A,2,3,4,5) is 4096+15 = 4111.
The tiny table below may thus be built by counting the number q of nonzero bits in each
of the 8192 patterns whose "bitwise and" with at least one of the 10 acceptable
masks is equal to that particular mask.
Here's a piece of UBASIC code to do just that: dim N(13)
for I=0 to 8191
Q=bitcount(I)
if 4111=bitand(4111,I) then *TALLY
M=31
while M<8192
if M=bitand(M,I) then *TALLY
M=2*M
wend: goto *SKIP
*TALLY: N(Q)=N(Q)+1
*SKIP: next I Number of combinations of q cards from the same suit (of 13
cards) which contain at least one straight flush:
q | < 5 | 5 |
6 | 7 | 8 | 9 |
10 | 11 | 12 | 13 |
> 13 |
N(q) | 0 | 10 | 71 |
217 | 371 | 384 |
234 | 77 | 13 | 1 | 0 |
Step 2 (using several methods):
Now, for the entire pack of 52 cards, we may consider the 4 nonexclusive
possibilities of having straight clubs, straight hearts, straight diamonds,
or straight spades...
Thus, a straightforward inclusion-exclusion enumeration gives
the desired result as the sum of the following expressions [of four terms]
over all the nonnegative indices
q1 q2 q3 q4
whose sum is equal to q:
C(4,1) N(q1) C(13,q2) C(13,q3) C(13,q4)
- C(4,2) N(q1) N(q2) C(13,q3) C(13,q4)
+ C(4,3) N(q1) N(q2) N(q3) C(13,q4)
- C(4,4) N(q1) N(q2) N(q3) N(q4)
Such terms are zero when an index is more than 13 or when the function N appears with an
argument below 5, so there are relatively few nonzero terms to add (or subtract):
For q = 26,
the 4 above types of terms are nonzero in 1209, 709, 334 and 84 cases, respectively.
A grand total of 2336 nonzero terms...
A priori, C(q+3,3) ordered sets of indices of sum q,
and 4 terms to each set could have meant 14616 terms for q = 26.
All told, 255774304012724 hands contain a
straight flush out of 495918532948104 possible hands of 26 cards.
Thus, the probability to have a straight [or royal] flush among 26 cards is:
9134796571883 / 17711376176718 = 0.5157587124082935...
Number
of combinations of q cards containing at least one straight flush,
and percentage of that to the total number of combinations
C(52,q) :
q= (%)
4 0 0
5 40 0.001539077
6 1844 0.009057633
7 41584 0.031082810
8 611340 0.081237077
9 6588116 0.179069883
10 55482100 0.350708060
11 380126920 0.629310354
12 2177910310 1.055294393
13 10644616240 1.676281723
14 45049914588 2.546680140
15 167011924492 3.726795587
16 547315800984 5.281342552
17 1597026077496 7.277207845
18 4173458163098 9.780325218
19 9813490226056 12.851547041
20 20841357619302 16.541465273
21 40096048882028 20.884249209
22 70043238025686 25.890741583
23 111299858400784 31.541291220
24 161084365419139 37.779089838 |
q= (%)
25 212534592698472 44.505092157
26 255774304012724 51.575871241
27 280828262657348 58.805898612
28 281310102228657 65.975612201
29 257052287198104 72.846103901
30 214209339496536 79.180215104
31 162748148903084 84.768275047
32 112708315365631 89.454857917
33 71138289660672 93.161256080
34 40921413681848 95.897626995
35 21453967174040 97.759817712
36 10250154666092 98.909218234
37 4460732889232 99.539237677
38 1766089532348 99.837373263
39 634724725868 99.954515344
40 206360120020 99.990654664
41 60402956200 99.998720874
42 15820006672 99.999889077
43 3679075192 99.999994346
44 752538149 99.999999867
45 133784560 100 |
The above summation can be simplified into the following expression,
which saves multiplications. It involves only 508 nonzero terms for
q = 26, and never more than
1195 nonzero terms for any q (this maximum is reached for q = 35):
åi N(i)
[ 4 C(39,q-i)
- åj N(j)
[ 6 C(26,q-i-j)
- åk N(k)
[ 4 C(13,q-i-j-k) - N(q-i-j-k) ]
] ]
We could also use the elegant approach presented below and
introduce the following polynomial S(x) which encodes the result of the above "first step":
10 x5 + 71 x6 + 217 x7 +
371 x8 + 384 x9 + 234 x10 +
77 x11 + 13 x12 + x13
= x5 (1+x) (10 + 61 x + 156 x2 + 215 x3
+ 169 x4 + 65 x5 + 12 x6 + x7 )
The number of combinations of q cards containing at least one straight [or royal]
flush is then equal to the coefficient of xq
in the following expression:
4 S(x) (1+x)39
- 6 S(x)2 (1+x)26
+ 4 S(x)3 (1+x)13
- S(x)4
This expression may be rewritten in a much more compact form:
(1+x)52
-
[ (1+x)13 - S(x) ] 4
The hands without a straight flush are thus enumerated by
[ (1+x)13 - S(x) ] 4.
This is easy to prove: The inside of the bracket counts the
combinations of cards of the same suit without a straight.
A combination from the whole pack without a straight flush is just
a juxtaposition of four such combinations; one in each suit.
Similarly, the formula given above for the number of
combinations of q cards containing at least one royal flush
is equal to the coefficient of xq in:
(1+x)52
-
[ (1+x)13 - x5 (1 + x)8 ] 4
So, the following difference counts hands
with straight flushes but no royal ones:
[ (1+x)13 - x5 (1 + x)8 ] 4
-
[ (1+x)13 - S(x) ] 4
This is equal to
36 x5 + 1656 x6 + 37260 x7 + 546480 x8 +
5874656 x9 + 49346350 x10 +
337176880x11 + ...
+ 490000 x47+ 25000 x47 + 625 x48.
These coefficients appear on the second line of the table in the next article...
(2004-02-06)
What's the probability of a poker hand appearing highest among q cards?
Note the careful wording of the question.
In the previous article(s) we counted royal flushes and straight flushes,
which are naturally the highest hands whenever they appear.
Lower hands are not counted if there is a higher combination among the q-cards,
just like 4-of-a-kind is not counted as two pairs in a regular 5-card hand.
For example, with 8 cards or more, we may have both a straight flush
and 4-of-a-kind; in which case the latter is dominated and is discarded...
Some of the data tabulated below may thus be surprising at first.
For example, with 8 cards or more, a full house is more likely than just three-of-a-kind...
The probability is the ratio of the tabulated number
to the total on the last line.
q-Card Stud | q = 5 | q = 6 | q = 7 |
q = 8 | q = 9 |
Royal Flush |
4 | 188 | 4324 | 64860 | 713460 |
---|
Straight Flush |
36 | 1656 | 37260 | 546480 | 5874656 |
---|
4 of a Kind | 624 |
14664 | 224848 |
2529262 | 22247616 |
---|
Full House | 3744 |
165984 | 3473184 |
44553888 | 410715240 |
---|
Flush | 5108 |
205792 | 4047644 |
50850320 | 453468584 |
---|
Straight | 10200 |
361620 | 6180020 |
67072620 | 509170720 |
---|
3 of a Kind | 54912 |
732160 | 6461620 |
39591240 | 164363844 |
---|
Two Pairs | 123552 |
2532816 | 31433400 |
257760900 | 1442570040 |
---|
Pair | 1098240 |
9730740 | 58627800 |
236092500 | 600163200 |
---|
High Card | 1302540 |
6612900 | 23294460 |
53476080 | 69788040 |
---|
TOTAL | 2598960 |
20358520 | 133784560 |
752538150 | 3679075400 |
---|
With 11 cards, what's really difficult is to avoid getting at least a pair,
for you must hold AKQJ9876432 (the only set of 11 different values which does not
contain a straight) without holding more than 4 cards of any suit.
There are as many ways to do this as there are arrangements of 11 suits where no suit
appears more than 4 times.
As explained elsewhere on this page, this amounts to:
The coefficient of x11 / 11! in
( 1 + x2 / 2! + x3 / 3!
+ x4 / 4! ) 4
This coefficient is 11! times 25/432, which is 2310000
(whereas there are 411, or 4194304 unrestricted arrangements).
All told, there are thus 2310000 combinations of 11 cards that do not include a
pair or better. Out of a total of C(52,11) = 60403728840
equiprobable combinations, this is a probability of only 0.000038242672...
With 12 cards, this probability would be zero,
since you would always hold either a straight or several cards of the same kind.
The Wizard of Odds
|
Poker Odds
| Excel format:
Huckster
Odds &
Dead
Man's Hand
(Lizzie of United Kingdom.
2000-10-29)
There are 3632428800 different possible arrangements of the letters
in the word "CONSTANTINOPLE".
How can I show that there are 457228800 arrangements of the letters
so that no two vowels are next to each other?
Let's first count the unrestricted arrangement of the letters in
CONSTANTINOPLE.
There are 14 letters
(3 N's, 2 O's, 2 T's, and a single copy of each of the 7 letters C,S,A,I,P,L,E).
If all these letters were different (say the N's are colored red white or blue, whereas both
the O's and the T's are either red or blue), you would have 14! different arrangements.
To each actual arrangement of the (uncolored) letters corresponds exactly 3!2!2! arrangements
of the colored letter. Therefore, there are 14!/(3!2!2!) or 3632428800 possible arrangements
of the (uncolored) letter in CONSTANTINOPLE. So far so good.
What happens when you don't allow adjacent vowels?
You have 5 vowels and 9 consonants.
In how many ways can you put P vowels in 2P-1+K positions?
Well, if K is negative you can't do it.
If K is zero, you've got only one choice (think about it).
In general, you have C(K+P,P) choices,
because that's the number of ways you could place the vowels in P+K positions
if you did not have any restriction.
With the restriction, each vowel is in effect occupying two spaces
(itself and the position to its right) except for the last one which is only occupying
one space. So, you may as well mentally ignore the P-1 "dead spaces"
and see the exact correspondence between the two situations.
Here, you have P=5 and K=5, which means C(10,5) = 252 ways of choosing the positions
of the vowels. Once this is done, you have 9!/(3!2!) choices for placing the consonants
(since there are 3 identical N's and 2 identical T's)
and 5!/2! choices for the vowels (there are two O's).
All told, you have
252 ´ 9! ´ 5! / 24
= 457228800 possibilities, as advertised.
Incidentally, the probability that no two vowels are adjacent in an arrangement
of P vowels among N letters (N = 2P-1+K) depends only on N and P,
because it's the same as the probability of having no adjacent marks
when P squares are randomly marked in a row of N squares.
(Putting equiprobable arrangements of vowels in the marked squares and
equiprobable arrangements of consonants in the unmarked ones won't change the odds.)
This probability is:
C(N-P+1,P) / C(N,P)
With P=5 and N=14, this ratio is simply 252/2002, or 18/143 (about 12.59%) which is
an easier way to find the ratio of the above pair of large numbers...
(Lizzie of United Kingdom.
2000-10-31)
From the 9 letters of the word "POSSESSES",
show that there are...- 12 possible selections of 4 letters, and
- 115 different arrangements of 4 letters in a straight line.
The 9 letters include: 5 S's, 2 E's, 1 P, 1 O.
Rather than count things tediously,
we'll present a very interesting approach...
To make a selection,
you have to chose a certain number of S's from 0 to the maximum allowed, which is 5
(the fact that you will eventually use at most 4 S's is irrelevant at this point),
then you choose the number of E's from 0 to 2,
and the numbers of P's and O's each from 0 to 1.
The four numbers you pick must add up to 4.
Well, the coefficient of x4 in the following expression is obtained with
exactly the same breakdown, so both quantities are equal (think about it)!
(1 + x + x2 + x3 + x4 + x5)
(1 + x + x2) (1+x) (1+x)
Expand this polynomial and you've got your answer,
not only for selections of 4 letters but for any number p from 0 to 9
(the result is the coefficient of xp ).
Here's the expansion:
1 + 4x + 8x2 +
11x3 + 12x4 + 12x5 +
11x6 + 8x7 + 4x8 + x9
In particular, the coefficient of x4 is 12,
that's the answer to your first question.
Now, let's try to count arrangements with the same polynomial idea.
Well, if all the letters were different,
you could rearrange each of the above basic selections in 4! different ways.
They are not different, though, and that means that for each set
of k identical letters in a basic selection, you've overestimated the true count
of arrangements by a factor of k!
All told, you'll see that the number of possible arrangements
is 4! times the coefficient of x4 in the following polynomial
(we find this to be more conveniently described as "the coefficient of x4/4!"):
(1 + x + x2/2! + x3/3! + x4/4! + x5/5!)
(1 + x + x2/2!) (1+x) (1+x)
Now, expand this polynomial and you get:
1 + 4x + 14x2/2! + 43x3/3! +
115x4/4! + 266x5/5! + 543x6/6! +
987x7/7! + 1512x8/8! + 1512x9/9!
In particular, the coefficient of x4/4! is 115
and that's the answer to your second question.
Incidentally, notice that, if you had an unlimited supply of the 4 letters,
the above polynomial would become a series equal to the product of 4 identical series,
each equal to exp(x).
Therefore, the whole thing would simply be the series for exp(4x), namely:
1 + (4x) + (4x)2/2! + ... + (4x)n/n! + ...
This is one (devious) way to prove that there are 4n arrangements of
n letters from an "alphabet" of only 4.
(4 choices for the 1st position, 4 for the 2nd, 4 for the 3rd, 4 for the 4th, etc.)
I encourage you to do the same thing to count basic selections (not arrangements)
of n letters of 4 possible differents kinds (whose supply is unlimited).
What, then, is the coefficient of xn
in 1/(1-x)4 ?
[Answer: C(n+3,3)]
Anonymous Query via Google (2004-11-10) :
How many positive integers less than 1000000 have the sum of their digits equal to 19?
The answer is 30492. The approach introduced above
shows that this is the coefficient of x19 in the expansion of
( 1 + x + x2 + x3 + ... + x9 ) 6.
(2004-07-17)
Polynacci Numbers
How many sequences of n bits are there without p consecutives zeroes?
Equivalently, you could ask about the number of ways to flip a coin n times
without ever getting p consecutive tails. Same problem.
Let's call a bit sequence [of zeroes and ones] without p consecutive zeroes
"satisfactory" and let F(n) be
the number of such satisfactory strings of length n.
Since all sequences of length less than p are satisfactory, we
have F(n) = 2n,
for n<p. On the other hand, if n is at least equal to p,
a satisfactory sequence starts with exactly k zeroes (k = 0, 1, ... p-1),
followed by a "one" bit and then any satisfactory sequence of length (n-k-1).
Therefore, the following relation holds, which defines F(n) recursively:
F(n) = F(n-1) + F(n-2) + ... + F(n-p)
In particular, for p=2, we're counting the number of flip sequences where tails
never occurs twice in a row, and obtain as F(n) the Fibonacci number of rank (n+2).
For p=3, it's the so-called "Tribonacci" number of rank (n+3).
For p=4, we have "Tetranacci" numbers, for p=5 "Pentanacci" numbers, etc.
Note that, for p=1, F(n) = 1, because there's only one string of n bits
without zeroes...
Number of strings
of n coin flips without p consecutive tails :
n = | 0 | 1 | 2 | 3 | 4 |
5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
Sloane's |
p = 1 |
1 | 1 | 1 | 1 | 1 | 1 |
1 | 1 | 1 | 1 | 1 | 1 | 1 |
A000012 |
---|
p = 2 |
1 | 2 |
3 | 5 | 8 | 13 | 21 | 34 | 55 |
89 | 144 | 233 | 377 |
A000045 |
---|
p = 3 |
1 | 2 | 4 |
7 | 13 | 24 |
44 | 81 | 149 | 274 | 504 | 927 | 1705 |
A000073 |
---|
p = 4 |
1 | 2 | 4 |
8 | 15 | 29 |
56 | 108 | 208 | 401 | 773 | 1490 | 2872 |
A000078 |
---|
p = 5 |
1 | 2 | 4 |
8 | 16 | 31 |
61 | 120 | 236 | 464 | 912 | 1793 | 3525 |
A001591 |
---|
p = 6 |
1 | 2 | 4 |
8 | 16 | 32 |
63 | 125 | 248 | 492 | 976 | 1936 | 3840 |
A001592 |
---|
¥ |
1 | 2 | 4 | 8 | 16 | 32 | 64 | 128 |
256 | 512 | 1024 | 2048 | 4096 |
A000079 |
---|
As there are 1024 equiprobable ways to flip a fair coin 10 times,
the probability you'll do so without ever getting tails
twice in a row is
144/1024 (about 14 %).
( Brad of Schaumburg, IL.
2000-11-03)
How many ways can you make change for a dollar?
If you only allow quarters, dimes, nickels and pennies, the answer is 242.
If you also allow dollar coins and half-dollars, the answer is 293.
Note that the number of ways you can obtain a sum of 100 by adding 25,10,5,
and 1 is the coefficient of x100 in the expression:
(1 + x25 + x50 +...)
(1 + x10 + x20 + x30 +...)
(1 + x5 + x10 + x15 +...)
(1 + x + x2 + x3 + x4 +...)
also equal to:
1 / [ (1-x25)(1-x10)(1-x5)(1-x) ]
The practical way to do it is via a small computer program.
In BASIC, this may be something as simple as the following
(which prints "242"):
s = 0
FOR q = 100 TO 0 STEP -25
FOR d = q TO 0 STEP -10
FOR n = d TO 0 STEP -5
s = s + 1
NEXT
NEXT
NEXT
PRINT s
I'll leave it up to you to change the above to account for half-dollars
(the result is 292)
and add the unique way corresponding to a single dollar coin (or bill)
for a total of 293.
The above simple-minded program can be made far more efficient if we
notice that there are 1+floor(p/5) ways to make change for an amount of
p cents with nickel and pennies only (that's what the inner loop computes).
The same idea is easy to extend to get rid of the next innermost loop by
counting the ways
to make change with pennies, nickels and dimes; there
are (1+d)(1-d+n) such ways, where n=floor(p/5) and d=floor(p/10).
The following program is indeed much faster for large values of p (well beyond p=100):
INPUT "Amount in cents p=";p
s=0
FOR j = p TO 0 STEP -25
s = s + (1 + INT(j/5))*(1 - INT(j/5) + INT(j/10))
NEXT j
PRINT s;"ways; with pennies, nickels, dimes and quarters."
It's more difficult to get rid of this "quarter counting"
loop with a closed formula, but it can be done (see below).
When we are faced with a large number of weird denominations
it may not be practical to derive such closed form formulas.
Since programs like the above tend to be too slow
[their running time is roughly proportional to the result,
which grows exponentially with the number of denominations]
we may consider a more efficient approach,
based on the actual computation of the product of the polynomials described above.
This may be done with a number of additions which is roughly proportional to
the number of denominations, and also to the square of the amount p...
Given below is a BASIC implementation of that approach.
As is, the program is written for pennies, nickels, dimes and quarters, but you may
have different and/or additional denominations by changing only the first line
of the program to define other coins (in terms of the number of pennies they stand for).
Two polynomials are used in the form of arrays (A and B).
For each new coin denomination, Polynomial B is updated to reflect the
number of ways to make change with such coins and/or the previously considered ones,
knowing that the (previously computed) Polynomial A
tells how many ways change could be made without the new denomination...
At the end, Array B contains all the answers for any amount p, up to the built-in
limit M defined at the beginning of the program (M=200 is just an example here).
It's then a simple matter to display interactively the answer B(p)
for any desired amount p:
N=3: DIM c(N): c(1) = 5: c(2) = 10: c(3) = 25
M=200: DIM A(M),B(M)
FOR i = 0 TO M: B(i) = 1: NEXT i
FOR d = 1 TO N
FOR i = 0 TO M: A(i) = B(i): B(i) = 0: NEXT i
FOR j = 0 TO M STEP c(d)
FOR k = 0 TO M-j
B(j+k) = B(j+k) + A(k)
NEXT k
NEXT j
NEXT d
Main:
INPUT "Amount in cents p=";p
PRINT B(p); "ways!"
GOTO Main Number of
ways to make change with given coin denominations.
Amount | 1,5 | 1,5,10 | 1,5,10,25 |
1,5,10,25,50 | 1,5,10,25,50,100 |
100 | 21 | 121 | 242 | 292 | 293 |
200 | 41 | 441 | 1463 | 2435 | 2728 |
500 | 101 | 2601 | 19006 | 59576 | 98411 |
1000 | 201 | 10201 | 142511 | 801451 | 2103596 |
2000 | 401 | 40401 | 1103021 | 11712101 | 53995291 |
5000 | 1001 | 251001 | 16892551 |
432699251 | 4587118976 |
10000 | 2001 | 1002001 | 134235101 |
6794128501 | 139946140451 |
p | (n+1) | (n-d+1) (d+1) | See formulas below... |
Making change with US coins:
Explicit formulas.
Let p be the amount in cents (¢) for which we have to make change.;
Let n be floor(p/5) which is the number of whole nickels
(5¢) contained in that.
Similarly, we'll call d,q,h and w the numbers of dimes (10¢)
quarters (25¢) half-dollars (50¢) and whole dollars (100¢)
in the amount p.
The number of ways to make change using only pennies and nickels is
simply n+1,
because all you have to do is decide on a number of nickels from 0 to n.
If you're allowed dimes as well,
you first choose a number i of dimes from 0 to d, and are left with (p-10i) cents
for which you must give change using only pennies and nickels.
Since floor((p-10i)/5) is n-2i, you have n-2i+1 ways to do that.
The total number of ways to make change is thus the sum of (n-2i+1) when i goes from 1 to d,
which works out to be:
(n-d+1)(d+1)
ways to give change for p cents using pennies, nickels & dimes,
where n = floor(p/5) and d = floor(p/10)
The situation is slightly more complicated when quarters are allowed as well...
If we use an even number 2j of quarters, we have to make change for p-50j cents
without quarters and have (n-d-5j+1)(d-5j+1) ways to do so...
With an odd number 2j+1 of quarters, we're left with (p-50j-25) pennies.
Since floor((p-50j-25)/10) can be shown to be equal to (n-d-5j-3), we have
(d-5j-1)(n-d-5j-2) ways to complete the transaction.
When q = 2h, we sum up the first expression for j going from 0 to h and the second one
for j between 0 and h-1. 
When q = 2h+1, on the other hand, we sum the second one also up to
j = h, which gives formally an additional term equal to
(d-5h-1)(n-d-5h-2) .
Thus, we obtain an expression valid when q is either 2h or 2h+1 by adding
(q-2h)(d-5h-1)(n-d-5h-2) to the expression
obtained for q = 2h.
The result is a mouthful, which shows that there are exactly
133333423333351000001 ways to make change for a million dollars
in pennies, nickels, dimes and quarters:
(h / 6) [ 100 h 2
+ 6 d (2n-2d-1)
- 15 h (2n-1)
- 7 ]
+
(n-d+1) (d+1)
+
(q-2h)
(d-5h-1)
(n-d-5h-2)
Eliminating h [via q = 2h + (q mod 2) ] we obtain the following expression
(whose last term is 0 or ±1/8 and may be discarded in
a rounded result):
(q / 24) [ 50 q 2
+ 12 d (2n-2d-1)
- 15 q (2n-1)
- 14 ]
+
(n-d+1) (d+1)
+
(q mod 2)
[ 2(n-2d) - 1 ] / 8
NOTE: This formula was worked out and posted here on February 8, 2004.
If you're aware of an earlier source, please
tell us,
and we'll give proper credit...
Otherwise, please give us credit:
The permanent link to (the latter part of) this article is
http://www.numericana.com/answer/counting.htm#uschange. Thanks.
(L. S. of Canada.
2000-11-20)
How many 5-digit numbers have their digits in a decreasing sequence?
There are 252 decreasing sequences of 5 digits (and 2002 nonincreasing ones).
To build such a sequence, you must choose 5 digits among the 10 possible ones.
Once you've done this, you've got no choice but to lay them out in decreasing order.
Therefore, this can be done in C(10,5)=252 ways.
If you were interested in a nonincreasing sequence instead,
you would allow several adjacent copies of the same digit.
This is only slightly more difficult to count; there are C(14,5)=2002 possibilities.
Why is that?
Well, consider a row of 9+5=14 squares of which 5 are "occupied"
and 9 are "empty".
In each occupied square, just write the total number of empty squares to its right.
The 5 numbers you wrote are between 0 and 9 and their sequence is nonincreasing.
Conversely, each nonincreasing sequence of 5 digits may occupy
(under the above rules) 5 of 14 such squares in a unique way;
place them right to left to see that you have no choice in the matter.
Therefore, there are just as many nonincreasing sequences of 5 digits
as there are ways to pick 5 squares among 5+9.
That's C(14,5)=2002.
(C.C. of Baton Rouge, LA.
2000-10-08)
1. A deck of 52 playing cards are divided into 4 piles of 13 cards each.
Find the probability that each pile has exactly one ace.
2. If a box contains 75 good IC (integrated circuit) chips and 25 bad chips,
and 10 are selected at random, find the probability that there are exactly 2 defective
chips in the 10 selected.
3. An airline overbooks its flight with a capacity of 350 by 20
(i.e. it makes 370 reservations).
If the probability of a passenger not showing up is 5 percent, what are the chances
that not every passenger that shows up will get a seat.
1. There is a total of C(52,4) equiprobable ways of choosing 4 positions for the aces in the
(ordered) piles and 134 ways of choosing one position in each pile.
The probability that each pile has one ace is thus 134/C(52,4)=2197/20825
or about 0.105498199... just under 10.55%.
2. C(100,10) ways of picking 10 ICs and C(25,2)C(75,8) ways of picking 2 bad ones and
8 good ones, for a probability of C(25,2)C(75,8)/C(100,10)=11002861125/37631107514
or about 29.24%.
3. The probability of exactly k passengers showing up is
C(370,k)(0.95)k(0.05)370-k.
Add these up from k=351 to k=370 and you'll get the probability that more than
350 passengers will show up, which is just under 60.7264% (0.6072639447).
( Erin of Edinboro, PA.
2000-10-24)
Find the probability of getting two face cards (jack, queen, or king) when two cards
are drawn from the deck without replacement.
11 / 221 is your answer:
In a 52 card deck, there are 12 face cards.
There are C(52,2) equiprobable combinations of 2 cards among which only C(12,2)
are pairs of face cards.
The probability of getting a pair of face cards is thus
C(12,2)/C(52,2) which is
12´11/(52´51)
= 11/221, or about 0.04977...
In other words, there are "11 chances in 221" (exact result);
a probability of about 4.977%.
If you were to put the first card back in the deck before the second draw,
the probability of getting a second face card would be higher,
since you'd improve the chances of getting a "good" card on the second draw
whenever you were successful the first time around.
(The result would be the probability of getting a face card twice in two
independent events, namely (3/13)2,
or about 5.325%.)
(J.O. of United Kingdom.
2000-29-09)
How many rectangles can be found in a square grid of 14 x 14 squares?
11025 for a square grid of side N=14. This has been asked/answered many times for other
values of N (N=8 for a chessboard).
My solution is to count the number of possible diagonals of such rectangles:
Each diagonal is uniquely determined by a pair of extremities.
The number of such pairs is the product of (N+1)2 (choices for the first point)
and N2 (choices for the second point not on the row or the column of the first)
divided by 2 (because there are two ordered pairs for every unordered pair).
As there are 2 diagonals per rectangle, the above product is simply twice the number
of rectangles.
Thus, the number of rectangles is N2(N+1)2/4, which is 11025 for N=14
(and 1296 for N=8, by the way).
- (Katie of Morton, PA.
2000-10-15)
What is the total numbers of squares that are in a checkerboard?
I'm not looking for the answer 64, but more than that.
The answer is 204 for a regular 8 by 8 checkerboard.
It's N(N+1)(2N+1)/6 for an N by N checkerboard.
Here's one way to prove this:
A square of size P is uniquely determined by its lower left corner, which may be picked
anywhere on the (N-P+1) by (N-P+1) grid at the lower left of the chessboard.
The total number of squares is therefore
N2+(N-1)2+(N-2)2+...+22+12,
which is equal to N(N+1)(2N+1)/6.
Namely:
- 204 for a regular chessboard (N=8),
- 285 for a shogi board (N=9),
- 385 for the board in international checkers (N=10),
- 1240 for a Scrabble(tm) board (N=15),
- 2109 for a go board (19 lines in either direction; N=18).
(2001-12-25)
What's the average distance between two random points on a segment?
In all questions that involve random variables which may assume a continuous range of values,
we must carefully specify the probability distribution involved.
This is often more delicate than in the discrete case.
Here, it suffices to say that if 0<a<b<1, a "random number" between 0 and 1 falls
between a and b with probability b-a.
This is not a statement to be proven, it's a description of the Lebesgue probability
measure on the unit segment (the so-called uniform probability distribution),
which we choose to assume.
With another choice of distribution, we would obtain other results.
That's all.
Well, with this distribution, the average distance from
a random point in a segment to either of its extremities
is half the length of the segment.
[The reader is encouraged to prove this.]
If we consider a fixed point x on the unit segment [0,1], a random point y will be
less than x with probability x, and it will be more than x with probability (1-x).
The distance between x and y averages x/2 in the former case,
and (1-x)/2 in the latter.
The overall average distance to point x is obtained as the weighted average of the two cases
(each case is weighted according to its own probability).
It is thus equal to:
x2/2 + (1-x)2/2 = x2 - x + 1/2
= d/dx [ x3/3 - x2/2 + x/2 ]
If x is a random number uniformly distributed between 0 and 1, the
average of the above quantity is equal to its
integral from 0 to 1, namely 1/3. Therefore:
The average distance between two points [uniformly] picked at
random on a segment is one third the length of the segment.
A nice introduction to continuous random variables and/or elementary calculus...
It's much more difficult to work out the average distance between two
random points on a unit square.
The answer (posted 2001-12-11
by Mark R. Diamond)
is [2 + Ö2 + 5 ln(1+Ö2) ] / 15,
which is about 0.52140543316472067833...
(Gérard
Michon.
2000-10-15)
What would you say is the "probability" that a
randomly
chosen integer has an even number of digits?
Is it possible to define a "probability" over the set of natural integers?
Well, if we require that finite sets of integers have zero "probability",
such a "probability" cannot be a measure in the usual sense of the term,
as required of probabilities defined over uncountable sets,
since a measure must be countably subadditive (that is,
the measure of a countable union is no greater than the sum of the measures of
its components).
If such a function is zero for singletons, it's also zero for any set of integers
(necessarily a countable union of singletons).
However, we may investigate the possibility of
a "probability" function that would have all the properties of a measure
except that it would be merely finitely subadditive
(only the "probability" of a union of finitely many sets does not
exceed the sum of their probabilities).
We may define the density of a set E of integers as the limit (if it exists) of 1/n
times the number of elements of E that are no greater that n (a ratio which we may call a
"partial density").
We would require the "probability" of E to be equal to this density
whenever it exists.
For example, the "probability" of the multiples of P must be 1/P.
Some sets, however,
do not have such a well-defined density.
If E is the set of integers having an even
number of digits, the above "partial density" ratio keeps oscillating between
1/11 and 10/11 and has no limit.
Another example is the set L(10,D) of integers whose leading decimal digit is D;
for this one we would probably like the newly defined probability to verify
"Benford's law":
P( L(10,D) ) = ln(1+1/D) / ln(10)
It seems 3 fundamental properties are required:
- The "probability" of a set with a well-defined density is this density.
- The "probability" of the union of two disjoint sets is the sum of their "probabilities".
- Every set has a "probability".
Once these 3 conditions are met, it would seem reasonable to drop the quotes around the word
"probability"...
To satisfy the third requirement, one could propose to define, say, a "probability" as the
superior limit of partial densities. As partial densities are bounded (they are between 0
and 1) such a superior limit always exists. It is equal to 10/11 in the case of the set
"EVEN" of integers with an even number of digits. This, however, won't meet the second
requirement.
Just split EVEN into two disjoint sets A and B, where A consists of all
integers of EVEN whose leading digit is 1 (one).
A={10,11...19,1000,1001...1999,100000...199999,10000000...}
B={20,21...99,2000,2001...9999,200000...999999,20000000...}
Now, partial densities for A are highest when n is 19, 1999, 199999, etc. as there are
respectively 10, 1010, 101010, etc. members of A no larger than n. The limsup of partial
densities is therefore 50/99 for A.
On the other hand, partial densities for B are highest when n is 99, 9999, 999999, etc. as
there are respectively 80, 8080, 808080, etc. members of B no greater than n. The limsup of
partial densities is therefore 80/99 for B.
Now 50/99+80/99 id 130/99 which is clearly not
the limsup of partial densities for EVEN (namely 10/11). Heck, this sum of probabilities
of two disjoint sets does not even have the decency to be less than 1...
A similar problem is encountered with inferior limits of partial densities, which are lowest
for A when n is 9, 999, 99999, 9999999 etc. (with respectively 0,10,1010,101010, etc.
members of A).
The liminf for A is 1/99.
For B, lowest partial densities are achieved when n is 19, 1999, 199999, 19999999, etc.
(for a count of 0, 80, 8080, 808080 etc.) and the liminf is 4/99.
Again, 1/99+4/99=5/99
is well below the corresponding liminf for the union EVEN of A and B (namely 1/11, or 9/99).
To provide a satisfactory answer, we have to ensure that additivity occurs with whatever
limiting process we use. Partial densities are additive and their limits (if they exist) are
thus also additive.
Similarly, the sequence of the averages of the partial
densities whose rank is n or less will also be additive, so will its limit if it exists.
We may iterate this process by considering the sequence of averages of the previous sequence.
It is not difficult to prove that in such an iteration if a sequence has a limit so do the
following ones and the limits are all equal...
|