-1
$\begingroup$

I was going to post this question on SO but I suspect it needs mathematical treatment. I need to make a decision(True or False) while running a simulation and I decided that this particular decision should be random.
At first I compared a random value with a static value (0.5) but I soon realized that I am getting 50% True, and 50% False. (So I can use 0.75 if I want 75% True and 25% False)
But I didn't want this particular distribution. So I wrote a function which compared a random value (between 0 and 1) with another random value.
[I have taken care of seeding the random generator with microseconds so IMO 'seed'ing part is not the problem.]
The function looks like

getResponse() {     rand1 = rand();     rand2 = rand();     if ( rand1 > rand2 )         return True;     else         return False; } 

My question is how random is the outcome of this function?
Here is the outcome (1=True 0=False) for 100 calls to function
0001101001010100110011100100011001010111101101100110010100100001110111110110001101101110000111010100
Do you see any pattern? Should there be any pattern? What happens if I reseed the random generator after say every 3rd call?
If this approach doesn't look right, can you suggest any other approach?


EDIT

I have realized my mistake of confusing pattern and distribution. I apologize for the same. I must understand first what distribution I want . Unless that is taken care of, I can not make any progress.
After going through comments and suggestions, I feel that a functional approach will either require keeping track of past results or decide on the distribution of outcome.
On second thoughts I decided neither is necessary if we get a constantly changing variable into the picture.
Enter microtime() which returns microseconds after current second.
Too bad I can't answer till next 8 hours, so I am excusing myself to type it here.

getResponse() {     return ( microtime()*1000000 % 2 ) ? True : False; } 
  • 4
    The probability of getting True is still 1/2.2011-08-25
  • 6
    What distribution DO you want?2011-08-25
  • 0
    The function `rand()` spits out numbers according to a (uniform) distribution, so we can associate it to a random variable $X$. By symmetry, $P(X (assuming two samples are almost surely unequal, which is close enough to the case here). You should explain what distribution you *do* want - we can help devise a computational way to realize it if it's simple enough.2011-08-25
  • 0
    Florian : How come? @Ricky : Err, I want random distribution? i.e. the probability of getting True or False should not depend on anything, perhaps I am getting this wrong. Let me put it in this way, I want getResponse() to return True or False (truly) randomly.2011-08-25
  • 0
    Reading this a second time, I think that the question is about quality of pseudo random number generators rather than about probabilities.2011-08-25
  • 3
    So, the question should be: Does the quality improve if one compares two random numbers instead of comparing one random number to 1/2. I can't think of any reason why this should be true, but I can think of one why the quality can get worse: the cycle length decreases to one half of the original one.2011-08-25
  • 0
    @Florian : perhaps, but I feel that my approach is wrong in the sense that I don't know what distribution I want. In fact, on a second thought, I do NOT want a particular distribution at all ! Every time I run the simulation, I do not want a particular distribution of True/False (say 50/50 in this case)2011-08-25
  • 3
    @Sudhi - what do you mean by random then? I'm unclear what you are trying to achieve.2011-08-25
  • 0
    What I want is the distribution of True(1) and False(0) for consecutive calls to the function should not be determined. e.g. if `1100001110001010101` (almost 50/50) is one outcome, then another call to function should not return `1010101000111000011` (reversed, just to make a point) which is again almost 50/50 . The ratio of True/False should be random for a given sample output of calls to the function.2011-08-25
  • 0
    Sudhi, do you want the ratio of True to False's in a fixed number of calls to be uniformly distributed? If the probability of returning True and False is fixed for each call I don't think is possible.2011-08-25
  • 2
    Maybe you want the probability of getting True to be fixed the start of each simulation. So, at the start, somehow determine a number `p`, and getResponse becomes `{ return rand() <= p; }` Is this what you want? I'm just guessing wildly.2011-08-25
  • 0
    @anon : no I do not want it to be fixed. See the thing is, I am running a simulation, and I do not want the overall distribution of True/False (whatever be the number of calls during simulation) to be a fixed ratio (which in this case is 50/50)2011-08-25
  • 0
    which is where my answer comes in.2011-08-25
  • 0
    Sudhi: If you want the ratio to not settle down over time then you must make the chance of an outcome of True/False at every juncture to be dependent on the past sequence of Trues and Falses.2011-08-25
  • 0
    Kindly look at the updated question and comment. Thanks for your time.2011-08-25
  • 1
    Despite the long question and the many comments asking for clarification, it's still not clear that you even know what you want. At the end you've settled for precisely what you started with: a fixed 50/50 distribution.2011-08-25
  • 0
    You just have a 50/50 distribution as before, but with a far worse source of randomness - measuring time might be suitable for a single random number (used as a seed for example) but not at all for a sequence of numbers. I'm under the impression that you actually want a 50/50 distribution, you just don't realize it yet. So just use `{ return rand() < 0.5; }` in your function.2011-08-25
  • 0
    thanks for the clarification, I have realized my mistake, I think I confused distribution with pattern, and I must go back to my simulation as problem lies there, not in this function. After evaluating the needs, I should decide on the distribution. Sorry for this mess :( . Thank you all for your help. I hope to report back here (if I use the answer given below) with my findings.2011-08-25

1 Answers 1

1

Use http://www.random.org/clients/ or
http://en.wikipedia.org/wiki/Hardware_random_number_generator.

When there have been m falses and n trues,
return false with probability (n+1)/(m+1+n+1) and true with probability (m+1)/(m+1+n+1).

  • 0
    thanks for your answer, it does seem a good option but I am not so keen on it as it entails keeping track of past results, which I don't want. How about this (see my answer below)2011-08-25
  • 0
    darn! cant answer, here is the short of it. `microtime()` returns time in microseconds after the current second. So `getResponse() { return ( microtime()*1000000 % 2 ); }`. What do you think about it?2011-08-25
  • 0
    That would not be "(truly) randomly".2011-08-25
  • 0
    can you please elaborate on it?2011-08-25
  • 0
    For any time, it is guaranteed what the output will be.2011-08-25
  • 0
    indeed, but the time itself is changing, so I dont know what is the **exact** time, hence the randomness. Or am I thinking totally wrong?2011-08-25
  • 0
    The "overall distribution of True/False" would still be 50/50.2011-08-25
  • 0
    hmmm, partially agreed, let `r=numOf(True)/numOf(False)` then `r~=1` (~ means nearly) for very large `num`. However, during my simulation, I observed `for num < 3000 ; 0.6 < r < 1.3`. I also discovered that changing `%2` to `%3` results in similar variance around `r=2`. I suspect that I will have to follow in your suggestions. When I do, I will report back here. Thanks.2011-08-25
  • 0
    Sudhi: that is correct, and is absolutely as it should be. A coin flipped a hundred times will show heads 50 times _on average_, and in fact the most likely outcome is that it shows 50 heads, but the likelihood that it shows exactly 50 heads is much, much smaller - slightly less than 8 percent. Indeed, the standard deviation of n coin tosses is 1/2 sqrt(n) - for isntance, flipping 100 coins will result in being more than 5 flips away from 'even' (further than 45-55) roughly a third of the time.2011-08-25
  • 1
    Also, using a RNG like 'time mod 2' is a dangerously bad idea. Most notably, many systems will not return those sorts of values with as much precision as you'd expect; I've seen multiple systems where the 'time in microseconds' value is always a multiple of 2 - in fact, always a multiple of 10! I strongly discourage using anything like this, and encourage some further study of what randomness _means_, because I think you're still somewhat confused on the core principles.2011-08-25
  • 0
    Thats a very good point @Steven . Thank you very much for your insight and explanation. I must admit my lack of knowledge of what _randomness means_ (as I expressed in my last update of question) and if one thing this exercise has taught me, that for every function that returns a bounded value, there will be _some_ (overall) distribution in hiding. And it is this outcome that I must look/think for before getting into what function I should use. I thank you all again for your help.2011-08-25