
Sorry if this question sounds basic.

Basically I'm creating a P2P network and need to know how long it takes for data to be distributed through the entire network.

This has been my thought process so far:

Each node has its own list of peers and, every five minutes, randomly selects five of them to share data with. This means that, when distributing new data, a node could randomly select a peer that already has it.
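The protocol described above can be sketched as a small simulation. This is only an illustration under assumptions that are not in the question: every node's peer list contains every other node, only nodes that already hold the data send it, each sender picks five distinct peers uniformly at random, and all nodes act in synchronized cycles. The function name `simulate_rounds` and its parameters are made up for this sketch.

```python
import random

def simulate_rounds(num_peers, fanout=5, seed=None):
    """Count cycles until every node has the data (hypothetical push model).

    Assumes num_peers > fanout and that node 0 starts with the data.
    """
    rng = random.Random(seed)
    informed = {0}
    rounds = 0
    while len(informed) < num_peers:
        newly_reached = set()
        for node in informed:
            # Sample `fanout` distinct indices from the other num_peers - 1
            # nodes, then shift indices >= node so the sender skips itself.
            for t in rng.sample(range(num_peers - 1), fanout):
                newly_reached.add(t + 1 if t >= node else t)
        informed |= newly_reached
        rounds += 1
    return rounds

if __name__ == "__main__":
    trials = [simulate_rounds(100, seed=s) for s in range(1000)]
    print("average cycles for 100 peers:", sum(trials) / len(trials))
```

Multiply the cycle count by five minutes to get a time estimate. Because the result is random, it makes sense to look at the spread over many runs rather than at a single run.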

To calculate how many nodes the data has reached so far, I used:

$$\sum_{i=0}^n 5^i \left(\frac{p-5^{i-1}}{p}\right)$$

Here $n$ is the number of five-minute cycles,

and $p$ is the number of peers on the network.

With my pre-calc level (11th grade) math skills, I can't seem to figure out how to tell, from just the number of peers, how long it takes for the data to be completely distributed through the network. Currently I have to know the number of cycles, which isn't exactly useful. Any help would be appreciated.

1 Answer


To get a rough idea, let your network have $n$ peers and suppose that at some point $m$ of them have the data. In the next step, $5m$ messages are sent out. Each peer that does not have the data is targeted by a given peer that does with probability $\frac 5{n-1}$. The chance that this peer does not receive the data from any of the $m$ senders is then $\left(\frac {n-6}{n-1}\right)^m$, and the expected number of peers that still do not have the data is $(n-m)\left(\frac {n-6}{n-1}\right)^m$.
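The expression above can be iterated round by round. Here is a minimal sketch that does so, assuming one peer starts with the data and that the expected count from one round can be plugged back in as $m$ for the next (an approximation, since $m$ is really random). The function name `expected_uninformed` and its parameters are just for illustration.

```python
def expected_uninformed(n, rounds, fanout=5, start=1):
    """Iterate (n - m) * ((n - 1 - fanout) / (n - 1)) ** m round by round.

    Returns the expected number of peers still without the data after
    each round, treating the expected value as the next round's m.
    """
    miss = (n - 1 - fanout) / (n - 1)  # chance one sender misses a given peer
    m = float(start)                   # peers that currently have the data
    history = []
    for _ in range(rounds):
        uninformed = (n - m) * miss ** m
        history.append(uninformed)
        m = n - uninformed
    return history

if __name__ == "__main__":
    for r, u in enumerate(expected_uninformed(100, 4), start=1):
        print(f"round {r}: about {u:.3f} peers without the data")
```

With fanout $5$ the factor `miss` is exactly $\frac{n-6}{n-1}$, matching the formula in the text.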

I made a spreadsheet to compute the expected number. For $100$ peers you have about one chance in six that some peer doesn't have the data after four rounds. For $10{,}000$ peers you almost certainly have it distributed in eight rounds. For a million peers you want twelve rounds. A rule of thumb seems to be about $2 \log_{10}n$ rounds for a rather high probability that everybody has the data.
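If you would rather not use a spreadsheet, the same calculation can be sketched in a few lines. It answers the original question directly: given only the number of peers, count rounds until the expected number of peers without the data drops below a cutoff. The cutoff of $0.5$, the single starting peer, and the name `rounds_needed` are my own assumptions for this sketch, not part of the derivation.

```python
import math

def rounds_needed(n, fanout=5, threshold=0.5, start=1):
    """Rounds until the expected number of uninformed peers is below threshold.

    Uses the same expected-value iteration as the formula
    (n - m) * ((n - 1 - fanout) / (n - 1)) ** m, starting from `start` peers.
    """
    miss = (n - 1 - fanout) / (n - 1)
    m = float(start)
    rounds = 0
    while n - m >= threshold:
        m = n - (n - m) * miss ** m  # expected informed peers after this round
        rounds += 1
    return rounds

for n in (100, 10_000, 1_000_000):
    # Compare the iterated estimate with the 2 * log10(n) rule of thumb.
    print(n, rounds_needed(n), 2 * math.log10(n))
```

Multiplying the result by five minutes gives a rough time to full distribution. A stricter cutoff gives a higher confidence that nobody was missed, at the cost of a round or so more.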