3
$\begingroup$

I'm working on a little neural network for the first time. I have a number that represents how likely is that a gen should be mutated called mutation rate. I wrote a small function that tells me whether I should mutate or not the gen according to the mutation rate.

What it does is, every time is called, it generates a random number between 0 and 1 and if the number is smaller than the mutation rate then it tells me I should mutate the gen.

Now, I'll have probably thousands of gens, and calling this function for every single gen is probably a waste of time. There must be (well, there is for sure) some formula that given a mutation rate, the amount of gens and some random number, tells me how many gens I should mutate. So what would this formula be? Does it involve calculus?

  • 0
    well is 1000 per chromosome and I might have thousands of chromosomes. This should be as fast as possible anyway. If I know how many gens to mutate I can easily make some other random function that will tell me which of them to mutate2011-04-28

2 Answers 2

3

Each single gene mutates with probability $p$, independently on the others, and you consider $n$ genes. Thus the number of genes $X$ which mutate is binomial $(n,p)$, that is, $ P(X=k)={n\choose k}p^k(1-p)^{n-k}. $ The regime you are interested in seems to be when $n$ is large and $p$ small. Then the arch classical approximation of binomial $(n,p)$ is Poisson with parameter $np$, that is, $ P(X=k)\approx\mathrm{e}^{-\lambda}\lambda^k/k!,\qquad\lambda=np. $ Some relevant keywords here are Poisson approximation or the law of rare events. You could also read this page and/or describe more precisely the kind of approximation you are interested in.

  • 1
    I have a question though: how is it that *the number of mutating genes* at a given step is enough for your purposes? I would have thought that as soon as you were considering several steps of this evolution, to know *which genes* did mutate was important.2011-04-28
2

My reading of the question:

You are wanting to run the code

 for i = 0 to num_gens   if (rand() < mutation_rate)     mutate_gen() 

but you don't want to run this loop for the sake of efficiency. You are interested in $X$ where $X$ is the expected number of times mutate_gen() is called above.

$X$ here has a binomial distribution with parameters $n,p$ with $n$ being num_gens and $p$ being mutation_rate. You can compute $P(X=k)$ for any $k$ explicitly using the binomial distribution formula (see Wikipedia). The mean is $np$ and variance is $np(1-p)$. When $p$ is small and $n$ is sufficiently large, a binomial random variable is well-approximated by a Poisson random variable.

  • 0
    Your reading is perfectly right. I'll check that article. Hope I can manage to plug it with your explanation here.2011-04-28