2
$\begingroup$

What is probability of having a String with random length with random characters ? For example I want to generate 100 Strings with random length k (between 1 to 100) and generate k numbers of characters (between 1-256 ASCII characters). What's the probability to get a duplicate (exact same string with same length and characters) ?

Here's the psuedocode

    for i=0 to 100           int size = random_int_range(1,100)           str = ""                          for j=0 to size               str += (char) random_int_range(1,256)     

random_int_range is uniformly distributed.

  • 0
    You need to tell us the distributions. Are they each uniform? Or are all strings equally likely? Are the $k$s the same for all 100 strings?2012-10-03
  • 0
    @Henry they are uniformly distributed, and different k for all 100 strings2012-10-03
  • 0
    @Henry I dont think he is thinking about pdfs, I would simply guess they are equally likely2012-10-03
  • 0
    Duplicate means equal length and equal content, right?2012-10-03
  • 0
    @Sasha that's correct2012-10-03
  • 0
    It's a little cumbersome to get an exact expression, but for practical purposes an approximation (via Poissonization) gives a probabily of $1- \approx \exp(-1/510) \approx 0.001958 \approx 2/1000$2012-10-03
  • 0
    Can you explain step by step by in words ? I think i can get it without exact expression.2012-10-03

1 Answers 1