I have written a program that runs 1000 simulations of randomly drawing N balls (of various colors) from, say, 20 balls. The program also counts how many times a particular desired event (e.g. all N balls are the same color) occurs across those simulations. I can also adjust it to run 2000, 3000, ..., 10,000 simulations and compare how often the desired event occurs. My problem is how to use those figures in a graph to show that the probability of the event is equal, or approximately equal, to X%. I have no trouble writing the program itself, but any suggestions for the graph (i.e. what to put on the x and y axes) would be appreciated.
What kind of graph (or distribution function) should I use to show that the probability of an event is X%?
-
I guess you could run your program for various numbers of simulations and plot the resulting ratios (desired outcomes / all outcomes) to show graphically that they approach a limit as the number of simulations tends to infinity. That limit would be the desired probability. For the plot, I mean: number of simulations on the X-axis and ratios on the Y-axis. – 2017-01-24
1 Answer
Here's an example I put together, simulating a basic fair coin toss. The X-axis shows the number of simulations and the Y-axis shows the success rate (here, the proportion of times we got heads), which should approximate the probability. (The plot was done in R, and the number of simulations ranges from 100 to 10000.)
From the picture, you can see that there is, of course, some variance in the data, since the experiment is random. However, as we run more simulations, the variance gets smaller. Note also that this plot is a close-up view: the values on the Y-axis are mostly within $0.5\pm0.02$ (the error is smaller near the end), so we have a good approximation. In the next plot, I rescaled the Y-axis so that it shows values from $0$ to $1$. There the limit is easy to see, and it is, of course, $0.5$.
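For readers who want to reproduce this kind of data without R, here is a minimal Python sketch of the same idea. It is not the code behind the plots above; the function name `success_rate`, the fixed seed, and the use of only the standard library are assumptions made for illustration. It also computes the theoretical standard error $\sqrt{p(1-p)/n}$ of a proportion, which is why the spread shrinks as the number of simulations grows.

```python
import math
import random


def success_rate(n, p=0.5):
    """Proportion of successes in n independent trials with success probability p."""
    return sum(random.random() < p for _ in range(n)) / n


random.seed(0)  # fixed seed only so that the sketch is reproducible

# Numbers of simulations: 100, 200, ..., 10000 (the X-axis values).
sizes = range(100, 10001, 100)

# One independent experiment per size (the Y-axis values).
rates = [success_rate(n) for n in sizes]

# Theoretical standard error of the proportion for a fair coin (p = 0.5).
std_errors = [math.sqrt(0.5 * 0.5 / n) for n in sizes]

# Print a few rows; plotting sizes against rates gives the graph described above.
for n, rate, se in list(zip(sizes, rates, std_errors))[::25]:
    print(f"n = {n:5d}   rate = {rate:.4f}   standard error = {se:.4f}")
```

With a plotting library such as matplotlib, `sizes` would go on the X-axis and `rates` on the Y-axis. Drawing the band $0.5 \pm 2\cdot$`std_errors` around the limit is one way to show how fast the estimates are expected to tighten.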
-
How did you come up with that graph? I've been trying to do something similar in Python, but it's taking very long (I was using a for loop to create the array of probabilities). – 2017-01-26
-
Well, I used standard functions in R; the code was more or less mean(replicate(n, simulation())). I don't know whether these functions include any optimizations, but it's not unlikely. And it did take some time, maybe a minute or so. – 2017-01-26
-
That code was for simulating n times. To generate a sequence of simulations, I used sapply. Maybe there is a similar function in Python. I basically applied the above code to a sequence of n values from 100 to 10000 with a step of 100. – 2017-01-26
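In Python, one possible way to avoid a slow loop is sketched below for the ball-drawing problem from the question. Everything specific here is an assumption chosen for illustration: the urn composition (8 red, 7 blue, 5 green), drawing 2 balls, and the helper name `simulation`. Unlike the R approach described above, which reruns the experiment for every n, this sketch runs 10000 draws once and reads off the running ratio at n = 100, 200, ..., 10000, so the total work is a single pass.

```python
import random
from itertools import accumulate
from math import comb

# Hypothetical urn with 20 balls; the real color counts are not given in the question.
URN = ["red"] * 8 + ["blue"] * 7 + ["green"] * 5
N_DRAWN = 2  # assumed number of balls drawn per simulation


def simulation():
    """Draw N_DRAWN balls without replacement; True if they all share one color."""
    return len(set(random.sample(URN, N_DRAWN))) == 1


random.seed(1)  # fixed seed only so that the sketch is reproducible

# Run all simulations once, then take cumulative counts of successes.
outcomes = [simulation() for _ in range(10_000)]
running_hits = list(accumulate(outcomes))  # True counts as 1, False as 0

# Running ratio after n simulations, for n = 100, 200, ..., 10000.
ns = range(100, 10_001, 100)
ratios = [running_hits[n - 1] / n for n in ns]

# Exact probability for this assumed urn, for comparison with the simulated limit.
exact = (comb(8, 2) + comb(7, 2) + comb(5, 2)) / comb(20, 2)

print(f"estimate after 10000 simulations: {ratios[-1]:.4f}")
print(f"exact probability:                {exact:.4f}")
```

Plotting `ns` against `ratios`, with a horizontal line at `exact`, gives the same kind of picture as in the answer. The points along one such curve are not independent of each other, because each ratio reuses the earlier draws; if independent points are wanted, the list comprehension can call a fresh batch of simulations for each n instead, at a higher cost.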

