How can I easily determine the distribution of outcomes for multiple dice?


21

I want to calculate the probability distribution for the total of a combination of dice.

I remember that the probability is the number of combinations that total that number over the total number of combinations (assuming the dice have a uniform distribution).

What are the formulas for

  • The total number of combinations
  • The number of combinations that total a certain number

1
I think you should treat $(X_1=1, X_2=2)$ and $(X_1=2, X_2=1)$ as different events.
Deep North

Answers:


15

Exact solutions

The number of combinations in $n$ throws is of course $6^n$.

These calculations are most readily done using the probability generating function for one die,

$$p(x) = x + x^2 + x^3 + x^4 + x^5 + x^6 = x \frac{1-x^6}{1-x}.$$

(Actually this is $6$ times the pgf; I'll take care of the factor of $6$ at the end.)

The pgf for $n$ rolls is $p(x)^n$. We can calculate this pretty directly (it's not a closed form, but it's a useful one) using the Binomial Theorem:

$$p(x)^n = x^n (1-x^6)^n (1-x)^{-n}$$

$$= x^n \left(\sum_{k=0}^{n} \binom{n}{k} (-1)^k x^{6k}\right) \left(\sum_{j=0}^{\infty} \binom{-n}{j} (-1)^j x^j\right).$$

The number of ways to obtain a sum equal to $m$ on the dice is the coefficient of $x^m$ in this product, which we can isolate as

$$\sum_{6k+j=m-n} \binom{n}{k} \binom{-n}{j} (-1)^{k+j}.$$

The sum is over all nonnegative $k$ and $j$ for which $6k+j=m-n$; it therefore is finite and has only about $(m-n)/6$ terms. For example, the number of ways to total $m=14$ in $n=3$ throws is a sum of just two terms, because $11=14-3$ can be written only as $6\cdot 0+11$ and $6\cdot 1+5$:

$$-\binom{3}{0}\binom{-3}{11} + \binom{3}{1}\binom{-3}{5}$$

$$= -1\cdot\frac{(-3)(-4)\cdots(-13)}{11!} + 3\cdot\frac{(-3)(-4)\cdots(-7)}{5!}$$

$$= \frac{1}{2}\cdot 12\cdot 13 - \frac{3}{2}\cdot 6\cdot 7 = 15.$$

(You can also be clever and note that the answer will be the same for $m=7$ by the symmetry 1 <--> 6, 2 <--> 5, and 3 <--> 4, and there's only one way to expand $7-3$ as $6k+j$; namely, with $k=0$ and $j=4$, giving $\binom{3}{0}\binom{-3}{4} = 15$.)

The probability therefore equals $15/6^3 = 5/72$, about 7%.

By the time this gets painful, the Central Limit Theorem provides good approximations (at least to the central terms where $m$ is between $\frac{7n}{2} - 3\sqrt{n}$ and $\frac{7n}{2} + 3\sqrt{n}$: on a relative basis, the approximations it affords for the tail values get worse and worse as $n$ grows large).

I see that this formula is given in the Wikipedia article Srikant references but no justification is supplied nor are examples given. If perchance this approach looks too abstract, fire up your favorite computer algebra system and ask it to expand the $n$th power of $x + x^2 + \cdots + x^6$: you can read the whole set of values right off. E.g., a Mathematica one-liner is

With[{n=3}, CoefficientList[Expand[(x + x^2 + x^3 + x^4 + x^5 + x^6)^n], x]]
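
For readers without a computer algebra system, the same expansion can be sketched in plain Python by multiplying coefficient lists. This is only an illustration; the helper names `poly_mul` and `dice_sum_counts` are made up for this sketch and are not part of any library.

```python
def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (index = exponent)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def dice_sum_counts(n, sides=6):
    """Coefficients of (x + x^2 + ... + x^sides)^n; entry m counts the ways to total m."""
    die = [0] + [1] * sides      # 0*x^0 + x^1 + ... + x^sides
    result = [1]                 # the constant polynomial 1 (zero dice)
    for _ in range(n):
        result = poly_mul(result, die)
    return result


counts = dice_sum_counts(3)
# ways to total 14, ways to total 7, and the total number of outcomes 6^3
print(counts[14], counts[7], sum(counts))   # 15 15 216
```

Dividing each entry by `sides ** n` turns the counts into probabilities.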

Will that mathematica code work with wolfram alpha?

1
That works. I tried your earlier version but could not make any sense of the output.

2
@Srikant: Expand[Sum[x^i,{i,1,6}]^3] also works in WolframAlpha

1
@A.Wilson I believe many of those references provide a clear path to the generalization, which in this example is $(x + x^2 + \cdots + x^6)(x + x^2 + x^3 + x^4)^3$. If you would like R code to compute these things, see stats.stackexchange.com/a/116913 for a fully implemented system. As another example, the Mathematica code is Clear[x, d]; d[n_, x_] := Sum[x^i, {i, 1, n}]; d[6, x] d[4, x]^3 // Expand
whuber

1
Note that @whuber's clarification is for 1d6 + 3d4, and that should get you there. For an arbitrary wdn + vdm (w dice with n sides plus v dice with m sides), use (x + x^2 + ... + x^n)^w (x + x^2 + ... + x^m)^v. Additional terms are polynomials constructed and multiplied with the product in the same way.
A. Wilson

8

Yet another way to quickly compute the probability distribution of a dice roll would be to use a specialized calculator designed just for that purpose.

Torben Mogensen, a CS professor at DIKU has an excellent dice roller called Troll.

The Troll dice roller and probability calculator prints out the probability distribution (pmf, histogram, and optionally cdf or ccdf), mean, spread, and mean deviation for a variety of complicated dice roll mechanisms. Here are a few examples that show off Troll's dice roll language:

Roll 3 6-sided dice and sum them: sum 3d6.

Roll 4 6-sided dice, keep the highest 3 and sum them: sum largest 3 4d6.

Roll an "exploding" 6-sided die (i.e., any time a "6" comes up, add 6 to your total and roll again): sum (accumulate y:=d6 while y=6).
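
To see what a calculator of this kind has to compute, here is a brute-force Python sketch of the second example (roll 4d6, keep the highest 3). It is not Troll code and says nothing about how Troll works internally; the function name `keep_highest_pmf` is made up for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def keep_highest_pmf(num_dice=4, keep=3, sides=6):
    """Exact pmf of the sum of the `keep` largest faces among `num_dice` dice."""
    counts = Counter()
    for roll in product(range(1, sides + 1), repeat=num_dice):
        counts[sum(sorted(roll)[-keep:])] += 1
    total = sides ** num_dice
    return {s: Fraction(c, total) for s, c in sorted(counts.items())}


pmf = keep_highest_pmf()
# A total of 18 needs three or four sixes: 21 of the 1296 rolls.
print(pmf[18])   # 7/432
```

Plain enumeration like this only works for mechanisms with finitely many outcomes, so it would not handle the "exploding" die without truncating the rerolls.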

Troll's SML source code is available, if you want to see how it's implemented.

Professor Mogensen also has a 29-page paper, "Dice Rolling Mechanisms in RPGs," in which he discusses many of the dice rolling mechanisms implemented by Troll and some of the mathematics behind them.

A similar piece of free, open-source software is Dicelab, which works on both Linux and Windows.


7

Let the first die be red and the second be black. Then there are 36 possible results:

red \ black      1          2          3          4          5          6
     1        1,1 (2)    1,2 (3)    1,3 (4)    1,4 (5)    1,5 (6)    1,6 (7)
     2        2,1 (3)    2,2 (4)    2,3 (5)    2,4 (6)    2,5 (7)    2,6 (8)
     3        3,1 (4)    3,2 (5)    3,3 (6)    3,4 (7)    3,5 (8)    3,6 (9)
     4        4,1 (5)    4,2 (6)    4,3 (7)    4,4 (8)    4,5 (9)    4,6 (10)
     5        5,1 (6)    5,2 (7)    5,3 (8)    5,4 (9)    5,5 (10)   5,6 (11)
     6        6,1 (7)    6,2 (8)    6,3 (9)    6,4 (10)   6,5 (11)   6,6 (12)

Each of these 36 (red,black) results is equally likely.

When you sum the numbers on the faces (the total is shown in parentheses), several of the (red,black) results end up with the same total -- you can see this with the table in your question.

So for example there's only one way to get a total of 2 (i.e. only the event (1,1)), but there's two ways to get 3 (i.e. the elementary events (2,1) and (1,2)). So a total of 3 is twice as likely to come up as 2. Similarly there's three ways of getting 4, four ways of getting 5 and so on.

Now since you have 36 possible (red,black) results, the total number of ways of getting all the different totals is also 36, so you should divide by 36 at the end. Your total probability will be 1, as it should be.
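
The counting argument above can be checked with a few lines of Python that enumerate all 36 ordered results. This is a minimal sketch; the variable names are chosen only for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

faces = range(1, 7)
# Count how many ordered (red, black) results give each total.
ways = Counter(red + black for red, black in product(faces, faces))

for total in sorted(ways):
    # total, number of ways, probability (reduced fraction)
    print(total, ways[total], Fraction(ways[total], 36))
```

The counts run 1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1 for totals 2 through 12 and add up to 36.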


Wow, the table is beautiful!
Deep North

Very pretty indeed
wolfies

6

There's a very neat way of computing the combinations or probabilities in a spreadsheet (such as excel) that computes the convolutions directly.

I'll do it in terms of probabilities and illustrate it for six sided dice but you can do it for dice with any number of sides (including adding different ones).

(btw it's also easy in something like R or matlab that will do convolutions)

Start with a clean sheet, in a few columns, and move down a bunch of rows from the top (more than 6).

  1. put the value 1 in a cell. That's the probabilities associated with 0 dice. put a 0 to its left; that's the value column - continue down from there with 1,2,3 down as far as you need.

  2. move one column to the right and down a row from the '1'. enter the formula "=sum(" then left-arrow up-arrow (to highlight the cell with 1 in it), hit ":" (to start entering a range) and then up-arrow 5 times, followed by ")/6" and press Enter - so you end up with a formula like =sum(c4:c9)/6 (where here C9 is the cell with the 1 in it).


    Then copy the formula and paste it to the 5 cells below it. They should each contain 0.16667 (ish).


    Don't type anything into the empty cells these formulas refer to!

  3. move down 1 and to the right 1 from the top of that column of values and paste ...


    ... a total of another 11 values. These will be the probabilities for two dice.


    It doesn't matter if you paste a few too many, you'll just get zeroes.

  4. repeat step 3 for the next column for three dice, and again for four, five, etc dice.


    We see here that the probability of rolling 12 on 4d6 is 0.096451 (if you multiply by 6^4 = 1296 you'll be able to write it as an exact fraction).

If you're adept with Excel - things like copying a formula from a cell and pasting into many cells in a column, you can generate all tables up to say 10d6 in about a minute or so (possibly faster if you've done it a few times).


If you want combination counts instead of probabilities, don't divide by 6.

If you want dice with different numbers of faces, you can sum k (rather than 6) cells and then divide by k. You can mix dice across columns (e.g. do a column for d6 and one for d8 to get the probability function for d6+d8):
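
For anyone who prefers code to a spreadsheet, the column-by-column procedure can be sketched in Python as follows. The function name `add_die` is invented for this sketch, and exact fractions are used instead of the spreadsheet's decimals.

```python
from fractions import Fraction


def add_die(column, sides=6):
    """Given the pmf `column` (index = total), return the pmf after adding one fair die.

    Each new cell is the sum of the `sides` cells of the previous column that
    sit just above it, divided by `sides`, like the spreadsheet formula.
    """
    new = [Fraction(0)] * (len(column) + sides)
    for total in range(1, len(new)):
        window = column[max(0, total - sides):total]   # cells total-sides .. total-1
        new[total] = sum(window, Fraction(0)) / sides
    return new


column = [Fraction(1)]          # zero dice: total 0 with probability 1
for _ in range(4):              # build up to 4d6
    column = add_die(column)

print(column[12], float(column[12]))   # 125/1296, about 0.096451
```

Mixing dice works the same way: `add_die(add_die([Fraction(1)], 6), 8)` would give the pmf of d6+d8.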



5

Approximate Solution

I explained the exact solution earlier (see below). I will now offer an approximate solution which may suit your needs better.

Let:

$X_i$ be the outcome of a roll of an $s$-faced die, where $i=1,\ldots,n$.

$S$ be the total of all $n$ dice.

$\bar{X}$ be the sample average.

By definition, we have:

$$\bar{X} = \frac{\sum_i X_i}{n}$$

In other words,

$$\bar{X} = \frac{S}{n}$$

The idea now is to visualize the process of observing $X_i$ as the outcome of throwing the same die $n$ times instead of as the outcome of throwing $n$ dice. Thus, invoking the central limit theorem (ignoring technicalities associated with going from a discrete distribution to a continuous one), we have as $n \to \infty$:

$$\bar{X} \sim N(\mu, \sigma^2/n)$$

where,

$\mu = (s+1)/2$ is the mean of the roll of a single die and

$\sigma^2 = (s^2-1)/12$ is the associated variance.

The above is obviously an approximation as the underlying distribution of $X_i$ has discrete support.

But,

$$S = n\bar{X}.$$

Thus, we have:

$$S \sim N(n\mu, n\sigma^2).$$
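
To get a feel for how good the approximation is, the sketch below compares the exact probabilities for the sum of 10 six-sided dice with the normal density evaluated at the same totals. The function names are illustrative, and using the density at an integer total as a stand-in for the point probability is itself an assumption (reasonable here because the totals are one unit apart).

```python
import math


def exact_pmf(n, s=6):
    """Exact pmf of the sum of n fair s-sided dice, as a list indexed by total."""
    pmf = [1.0]
    for _ in range(n):
        new = [0.0] * (len(pmf) + s)
        for total, p in enumerate(pmf):
            for face in range(1, s + 1):
                new[total + face] += p / s
        pmf = new
    return pmf


def normal_approx(m, n, s=6):
    """Normal density with mean n*mu and variance n*sigma^2, evaluated at total m."""
    mu = (s + 1) / 2
    var = (s * s - 1) / 12
    return math.exp(-(m - n * mu) ** 2 / (2 * n * var)) / math.sqrt(2 * math.pi * n * var)


n = 10
exact = exact_pmf(n)
for m in (35, 30, 20, 12):
    # total, exact probability, normal approximation
    print(m, round(exact[m], 5), round(normal_approx(m, n), 5))
```

Near the mean the two columns should agree closely; in the tails the relative error grows.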

Exact Solution

Wikipedia has a brief explanation as how to calculate the required probabilities. I will elaborate a bit more as to why the explanation there makes sense. To the extent possible I have used similar notation to the Wikipedia article.

Suppose that you have $n$ dice, each with $s$ faces, and you want to compute the probability that the total on a single roll of all $n$ dice adds up to $k$. The approach is as follows:

Define:

$F_{s,n}(k)$: Probability that you get a total of $k$ on a single roll of $n$ dice with $s$ faces.

By definition, we have:

$$F_{s,1}(k) = \frac{1}{s}$$

The above states that if you just have one die with $s$ faces, the probability of obtaining a total $k$ between 1 and $s$ is the familiar $\frac{1}{s}$ (and $F_{s,1}(k) = 0$ for any other $k$).

Consider the situation when you roll two dice: You can obtain a sum of $k$ as follows: The first roll is between 1 and $k-1$ and the corresponding roll for the second one is between $k-1$ and 1. Thus, we have:

$$F_{s,2}(k) = \sum_{i=1}^{k-1} F_{s,1}(i)\, F_{s,1}(k-i)$$

Now consider a roll of three dice: You can get a sum of $k$ if you roll a 1 to $k-2$ on the first die and the sum on the remaining two dice is between $k-1$ and 2. Thus,

$$F_{s,3}(k) = \sum_{i=1}^{k-2} F_{s,1}(i)\, F_{s,2}(k-i)$$

Continuing the above logic, we get the recursion equation:

$$F_{s,n}(k) = \sum_{i=1}^{k-n+1} F_{s,1}(i)\, F_{s,n-1}(k-i)$$

See the Wikipedia link for more details.
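
The recursion above translates almost line for line into Python. This is a minimal sketch using memoization so that each $F_{s,n}(k)$ is computed once; the function name `F` simply mirrors the notation, and the convention that a single die has probability 0 outside 1..s is made explicit.

```python
from fractions import Fraction
from functools import lru_cache


@lru_cache(maxsize=None)
def F(s, n, k):
    """Probability that n fair s-sided dice total k, via the recursion above."""
    if n == 1:
        return Fraction(1, s) if 1 <= k <= s else Fraction(0)
    # The first die shows i; the remaining n-1 dice must total k - i.
    return sum((F(s, 1, i) * F(s, n - 1, k - i) for i in range(1, k - n + 2)),
               Fraction(0))


print(F(6, 2, 7), F(6, 3, 14))   # 1/6 5/72
```

The cache plays the role of the lookup table suggested in the comments, so small $n$ and $s$ are cheap.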


@Srikant Excellent answer, but does that function resolve to something arithmetic (ie: not recursive)?
C. Ross

@C. Ross Unfortunately I do not think so. But, I suspect that the recursion should not be that hard as long as you are dealing with reasonably small n and small s. You could just build up a lookup table and use that repeatedly as needed.

1
The wikipedia page you linked has a simple nonrecursive formula which is a single sum. One derivation is in whuber's answer.
Douglas Zare

The wiki link anchor is dead, do you know of a replacement?
Midnighter

4

Characteristic functions can make computations involving the sums and differences of random variables really easy. Mathematica has lots of functions to work with statistical distributions, including a builtin to transform a distribution into its characteristic function.

I'd like to illustrate this with two concrete examples: (1) Suppose you wanted to determine the results of rolling a collection of dice with differing numbers of sides, e.g., roll two six-sided dice plus one eight-sided die (i.e., 2d6+d8)? Or (2) suppose you wanted to find the difference of two dice rolls (e.g., d6-d6)?

An easy way to do this would be to use the characteristic functions of the underlying discrete uniform distributions. If a random variable $X$ has a probability mass function $f$, then its characteristic function $\varphi_X(t)$ is just the discrete Fourier Transform of $f$, i.e., $\varphi_X(t) = \mathcal{F}\{f\}(t) = E[e^{itX}]$. A theorem tells us:

If the independent random variables $X$ and $Y$ have corresponding probability mass functions $f$ and $g$, then the pmf $h$ of the sum $X+Y$ of these RVs is the convolution of their pmfs: $h(n) = (f*g)(n) = \sum_{m=-\infty}^{\infty} f(m)\, g(n-m)$.

We can use the convolution property of Fourier Transforms to restate this more simply in terms of characteristic functions:

The characteristic function $\varphi_{X+Y}(t)$ of the sum of independent random variables $X$ and $Y$ equals the product of their characteristic functions $\varphi_X(t)\,\varphi_Y(t)$.

This Mathematica function will make the characteristic function for an s-sided die:

MakeCf[s_] := 
 Module[{Cf}, 
  Cf := CharacteristicFunction[DiscreteUniformDistribution[{1, s}], 
    t];
  Cf]

The pmf of a distribution can be recovered from its characteristic function, because Fourier Transforms are invertible. Here is the Mathematica code to do it:

RecoverPmf[Cf_] := 
  Module[{F}, 
    F[y_] := SeriesCoefficient[Cf /. t -> -I*Log[x], {x, 0, y}];
    F]

Continuing our example, let F be the pmf that results from 2d6+d8.

F := RecoverPmf[MakeCf[6]^2 MakeCf[8]]

There are $6^2 \cdot 8 = 288$ outcomes. The domain of support of F is $S=\{3,\ldots,20\}$. Three is the min because you're rolling three dice. And twenty is the max because $20 = 2\cdot 6 + 8$. If you want to see the image of F, compute

In:= F /@ Range[3, 20]

Out= {1/288, 1/96, 1/48, 5/144, 5/96, 7/96, 13/144, 5/48, 1/9, 1/9, \
5/48, 13/144, 7/96, 5/96, 5/144, 1/48, 1/96, 1/288}

If you want to know the number of outcomes that sum to 10, compute

In:= 6^2 8 F[10]

Out= 30

If the independent random variables $X$ and $Y$ have corresponding probability mass functions $f$ and $g$, then the pmf $h$ of the difference $X-Y$ of these RVs is the cross-correlation of their pmfs: $h(n) = (g \star f)(n) = \sum_{m=-\infty}^{\infty} g(m)\, f(n+m)$.

We can use the cross-correlation property of Fourier Transforms to restate this more simply in terms of characteristic functions:

The characteristic function $\varphi_{X-Y}(t)$ of the difference of two independent random variables $X, Y$ equals the product of the characteristic functions $\varphi_X(t)$ and $\varphi_Y(-t)$ (N.B. the negative sign in front of the variable $t$ in the second characteristic function).

So, using Mathematica to find the pmf G of d6-d6:

G := RecoverPmf[MakeCf[6] (MakeCf[6] /. t -> -t)]

There are $6^2 = 36$ outcomes. The domain of support of G is $S=\{-5,\ldots,5\}$. $-5$ is the min because $-5 = 1-6$. And 5 is the max because $6-1 = 5$. If you want to see the image of G, compute

In:= G /@ Range[-5, 5]

Out= {1/36, 1/18, 1/12, 1/9, 5/36, 1/6, 5/36, 1/9, 1/12, 1/18, 1/36}
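
Outside Mathematica, the same idea can be sketched numerically: evaluate the characteristic function on a grid of frequencies and invert it with a discrete Fourier sum. The Python below is an illustration only; `die_cf` and `recover_pmf` are hypothetical helper names, the caller is assumed to know the support `lo..hi`, and the results are floating-point rather than exact fractions.

```python
import cmath


def die_cf(s, t):
    """Characteristic function E[exp(i t X)] of a fair s-sided die."""
    return sum(cmath.exp(1j * t * x) for x in range(1, s + 1)) / s


def recover_pmf(cf, lo, hi):
    """Invert the characteristic function of an integer variable supported on lo..hi."""
    size = hi - lo + 1                      # grid size; must cover the whole support
    ts = [2 * cmath.pi * j / size for j in range(size)]
    values = [cf(t) for t in ts]
    pmf = {}
    for n in range(lo, hi + 1):
        total = sum(v * cmath.exp(-1j * t * n) for v, t in zip(values, ts))
        pmf[n] = (total / size).real
    return pmf


# 2d6 + d8: product of the characteristic functions
F = recover_pmf(lambda t: die_cf(6, t) ** 2 * die_cf(8, t), 3, 20)
# d6 - d6: the second factor is evaluated at -t
G = recover_pmf(lambda t: die_cf(6, t) * die_cf(6, -t), -5, 5)

print(round(F[10] * 288), round(G[0] * 36))   # 30 6
```

If the grid were smaller than the support, distinct totals would alias onto each other, which is why `size` is tied to `hi - lo + 1`.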

1
Of course, for discrete distributions, including distributions of finite support (like those in question here), the cf is just the probability generating function evaluated at x = exp(i t), making it a more complicated way of encoding the same information.
whuber

2
@whuber: As you say, the cf, mgf, and pgf are more-or-less the same and easily transformable into one another, however Mathematica has a cf builtin that works with all the probability distributions it knows about, whereas it doesn't have a pgf builtin. This makes the Mathematica code for working with sums (and differences) of dice using cfs particularly elegant to construct, regardless of the complexity of dice expression as I hope I demonstrated above. Plus, it doesn't hurt to know how cfs, FTs, convolutions, and cross-correlations can help solve problems like this.

1
@Elisha: Good points, all of them. I guess what I wonder about the most is whether your ten or so lines of Mathematica code are really more "elegant" or efficient than the single line I proposed earlier (or the even shorter line Srikant fed to Wolfram Alpha). I suspect the internal manipulations with characteristic functions are more arduous than the simple convolutions needed to multiply polynomials. Certainly the latter are easier to implement in most other software environments, as Glen_b's answer indicates. The advantage of your approach is its greater generality.
whuber

4

Here's another way to calculate the probability distribution of the sum of two dice by hand using convolutions.

To keep the example really simple, we're going to calculate the probability distribution of the sum of a three-sided die (d3) whose random variable we will call X and a two-sided die (d2) whose random variable we'll call Y.

You're going to make a table. Across the top row, write the probability distribution of X (outcomes of rolling a fair d3). Down the left column, write the probability distribution of Y (outcomes of rolling a fair d2).

You're going to construct the outer product of the top row of probabilities with the left column of probabilities. For example, the lower-right cell will be the product of Pr[X=3]=1/3 times Pr[Y=2]=1/2 as shown in the accompanying figure. In our simplistic example, all the cells equal 1/6.

Next, you're going to sum along the oblique lines of the outer-product matrix as shown in the accompanying diagram. Each oblique line passes through one-or-more cells which I've colored the same: The top line passes through one blue cell, the next line passes through two red cells, and so on.


Each of the sums along the obliques represents a probability in the resulting distribution. For example, the sum of the red cells equals the probability of the two dice summing to 3. These probabilities are shown down the right side of the accompanying diagram.

This technique can be used with any two discrete distributions with finite support. And you can apply it iteratively. For example, if you want to know the distribution of three six-sided dice (3d6), you can first calculate 2d6=d6+d6; then 3d6=d6+2d6.

There is a free (but closed license) programming language called J. It's an array-based language with its roots in APL. It has builtin operators to perform outer products and sums along the obliques in matrices, making the technique I illustrated quite simple to implement.

In the following J code, I define two verbs. First the verb d constructs an array representing the pmf of an s-sided die. For example, d 6 is the pmf of a 6-sided die. Second, the verb conv finds the outer product of two arrays and sums along the oblique lines. So conv~ d 6 prints out the pmf of 2d6:

d=:$%
conv=:+//.@(*/)
|:(2+i.11),:conv~d 6
 2 0.0277778
 3 0.0555556
 4 0.0833333
 5  0.111111
 6  0.138889
 7  0.166667
 8  0.138889
 9  0.111111
10 0.0833333
11 0.0555556
12 0.0277778

As you can see, J is cryptic, but terse.
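
For comparison, here is a rough Python sketch of the same outer-product-and-obliques technique, applied to the d3 + d2 example worked by hand above. It is not a translation of the J verbs, just the same idea spelled out; the names `die` and `conv` are chosen for illustration.

```python
from fractions import Fraction


def die(s):
    """pmf of a fair s-sided die as a list: entry i is Pr[face = i + 1]."""
    return [Fraction(1, s)] * s


def conv(p, q):
    """Outer product of two pmfs, summed along the anti-diagonals."""
    outer = [[pi * qj for qj in q] for pi in p]
    sums = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, row in enumerate(outer):
        for j, cell in enumerate(row):
            sums[i + j] += cell          # cells with equal i + j share an oblique line
    return sums


d3_plus_d2 = conv(die(3), die(2))
for total, prob in enumerate(d3_plus_d2, start=2):   # smallest total is 1 + 1 = 2
    print(total, prob)
```

This prints 1/6, 1/3, 1/3, 1/6 for totals 2 through 5. Applying `conv` repeatedly, e.g. `conv(die(6), conv(die(6), die(6)))`, gives 3d6 with the smallest total shifted accordingly.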


3

This is actually a surprisingly complicated question. Luckily for you, there exists an exact solution which is very well explained here:

http://mathworld.wolfram.com/Dice.html

The probability you are looking for is given by equation (10): "The probability of obtaining p points (a roll of p) on n s-sided dice".

In your case: p = the observed score (sum of all dice), n = the number of dice, s = 6 (6-sided dice). This gives you the following probability mass function:

$$P(X_n = p) = \frac{1}{s^n} \sum_{k=0}^{\lfloor (p-n)/6 \rfloor} (-1)^k \binom{n}{k} \binom{p-6k-1}{n-1}$$
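
As a quick sanity check, the formula can be evaluated directly in Python. This sketch assumes Python 3.8+ for `math.comb`; the function name `dice_sum_probability` is made up, and the number of faces is left as a parameter `s` (with 6 as the default) rather than hard-coded.

```python
from fractions import Fraction
from math import comb


def dice_sum_probability(p, n, s=6):
    """P(sum of n fair s-sided dice = p) from the alternating-sum formula."""
    if p < n or p > n * s:
        return Fraction(0)
    ways = sum((-1) ** k * comb(n, k) * comb(p - s * k - 1, n - 1)
               for k in range((p - n) // s + 1))
    return Fraction(ways, s ** n)


print(dice_sum_probability(14, 3), dice_sum_probability(12, 4))   # 5/72 125/1296
```

Both values agree with the other answers in this thread: 15 ways out of 216 for a total of 14 on 3d6, and 125 ways out of 1296 for a total of 12 on 4d6.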

Welcome to our site, Felix!
whuber

1

Love the username! Well done :)

The outcomes you should count are the dice rolls, all 6×6=36 of them as shown in your table.

For example, $\frac{1}{36}$ of the time the sum is 2, and $\frac{2}{36}$ of the time the sum is 3, and $\frac{3}{36}$ of the time the sum is 4, and so on.


I'm really confused by this. I answered a very recent newbie question from someone called die_hard, who apparently no longer exists, then found my answer attached to this ancient thread!
Creosote

Your answer to the question at stats.stackexchange.com/questions/173434/… was merged with the answers to this duplicate.
whuber

1

You can solve this with a recursive formula. In that case the probabilities of the rolls with $n$ dice are calculated from the rolls with $n-1$ dice.

$$a_n(l) = \sum_{\substack{l-6 \le k \le l-1 \\ n-1 \le k \le 6(n-1)}} a_{n-1}(k)$$

The first limit for $k$ in the summation covers the six preceding numbers. E.g., if you want to roll 13 with 3 dice, then you can do this if your first two dice roll between 7 and 12.

The second limit for $k$ in the summation is the range of what you can roll with $n-1$ dice.

The outcome:

1 1 1  1  1   1
1 2 3  4  5   6   5  4   3   2   1
1 3 6  10 15  21  25 27  27  25  21  15  10  6    3   1
1 4 10 20 35  56  80 104 125 140 146 140 125 104  80  56  35  20  10   4   1
1 5 15 35 70 126 205 305 420 540 651 735 780 780 735 651 540 420 305 205 126 70 35 15 5 1  

edit: The above answer was an answer from another question that was merged into the question by C.Ross

The code below shows how the calculations for that answer (to the question asking for 5 dice) were performed in R. They are similar to the summations performed in Excel in the answer by Glen B.

# recursive formula
nextdice <- function(n,a,l) {
  x = 0
  for (i in 1:6) {
    if ((l-i >= n-1) & (l-i<=6*(n-1))) {
      x = x+a[l-i-(n-2)]
    }
  }
  return(x)  
}  

# generating combinations for rolling with up to 5 dices
a_1 <- rep(1,6)
a_2 <- sapply(2:12,FUN = function(x) {nextdice(2,a_1,x)})
a_3 <- sapply(3:18,FUN = function(x) {nextdice(3,a_2,x)})
a_4 <- sapply(4:24,FUN = function(x) {nextdice(4,a_3,x)})
a_5 <- sapply(5:30,FUN = function(x) {nextdice(5,a_4,x)})

@user67275 your question got merged into this question. But I wonder what your idea was behind your formula: "I used the formula: no of ways to get 8: 5_H_2 = 6_C_2 = 15" ?
Sextus Empiricus

1

One approach is to say that the probability that $X_n=k$ is the coefficient of $x^k$ in the expansion of the generating function

$$\left(\frac{x^6+x^5+x^4+x^3+x^2+x^1}{6}\right)^n = \left(\frac{x(1-x^6)}{6(1-x)}\right)^n$$

So for example with six dice and a target of $k=22$, you will find $P(X_6=22) = \frac{4221}{6^6}$, about 9%. That link (to a math.stackexchange question) gives other approaches too.

Licensed under cc by-sa 3.0 with attribution required.