Given:
- a website has $n$ concurrent users on average
- each user makes on average $s$ searches per minute while on the site
- the searches are distributed evenly (round-robin style) over $v$ servers
There is a threshold of $x$ searches per second per server; exceeding it causes strongly undesirable things to happen.
If you know $n$ and $s$, what is the formula for the minimum number of servers $v$ that keeps the per-server searches per second below the threshold $x$ at a given confidence level (e.g. 90% certainty versus 99% certainty)?
Is this even enough information? Or do you need to know something about the shape of the Gaussian(?) distributions for $n$ and $s$, say $\sigma_n$ and $\sigma_s$?
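To make the baseline concrete, here is a rough Python sketch of the calculation I have in mind: the deterministic minimum $v \ge \lceil ns/(60x) \rceil$ from the average rate, plus a hypothetical variant that models each server's per-second count as Poisson with mean $ns/(60v)$ and picks the smallest $v$ meeting a confidence target. The Poisson assumption and the helper names are just placeholders for illustration, not something I'm sure is the right model:

```python
from math import ceil
from scipy.stats import poisson

def min_servers_deterministic(n, s, x):
    """Smallest v so the *average* per-server rate n*s/(60*v) stays at or below x."""
    return ceil(n * s / (60 * x))

def min_servers_poisson(n, s, x, confidence=0.99, v_max=1000):
    """Hypothetical refinement: smallest v such that a Poisson count with mean
    n*s/(60*v) stays at or below x searches/sec with the given probability."""
    for v in range(1, v_max + 1):
        if poisson.cdf(x, n * s / (60 * v)) >= confidence:
            return v
    raise ValueError("no v <= v_max meets the confidence target")

# Example: 2000 concurrent users, 3 searches/min each, threshold 10 searches/sec/server
print(min_servers_deterministic(2000, 3, 10))   # baseline from the average rate
print(min_servers_poisson(2000, 3, 10, 0.99))   # with 99% certainty under the Poisson model
```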
Real World Application
For the curious, my company is planning on proxying Google Search API requests through a local server. However, Google has automatic DoS detection that kicks in if the same public IP makes "too many" requests per second. If this happens, the IP gets blacklisted and search stops working on the site until humans intervene and remove the blacklisting. So, while we don't know $n$ or $s$ for sure, we need to estimate them and pick a large enough $v$ to ensure that a spike in traffic is unlikely to take search down.