12

A flock of sheep is walking along a path in one direction. There are 30 sheep. Initially they all have different random speeds.

The main feature is that, while walking, if a faster sheep bumps into a slower sheep, then the faster sheep starts moving at the speed of the slower one.

Obviously, after a sufficiently long time the flock will be divided into groups, each moving at a constant speed. The task is to find the expected number of groups.

Edit: Let's consider a uniform distribution of speeds: each sheep is given a uniform random speed in (0,1) independently.

PS. Can you look at my answer and check my logic?

  • 0
    Interesting question! How are the initial speeds of the sheep distributed? Is each sheep given a uniform random speed in (0,1) independently? 2017-02-19
  • 0
    @JRichey Oh, I forgot. Yes, let's consider a uniform distribution. 2017-02-19
  • 0
    Can I ask you where you found this interesting problem? 2017-02-20
  • 1
    Possible duplicate of [Probability problem: cars on the road](http://math.stackexchange.com/questions/201807/probability-problem-cars-on-the-road) 2017-02-20
  • 0
    @JeanMarie, this is a classical problem about the number of records in a random permutation, which is, by a simple bijection, distributed as the number of cycles, which in turn is given by the cyclic Stirling numbers. 2017-02-20
  • 0
    @zhoraster Thank you very much. 2017-02-20
  • 0
    @zhoraster Using your remark, I looked for references on the web and found two items that may be of interest to people wanting to know more: 1) **About records**, this pedagogical paper from the American Mathematical Monthly, Vol. 85, January 1978, by Glick (http://www.thp.uni-koeln.de/krug/teaching-Dateien/WS2011/Glick1978.pdf) and 2) **About cycles** in a random permutation, this question (http://math.stackexchange.com/q/1276590) 2017-02-20

3 Answers

5

Provided that all orders of sheep speeds are equally likely and no two sheep have the same speed, the $n$th sheep (counting from the front) is the slowest of its group if and only if all $n-1$ sheep in front of it are faster than it is, which has probability $\dfrac{1}{n}$.

Each group has precisely one slowest sheep, so with $30$ sheep the expected number of groups is $$\displaystyle \sum_{n=1}^{30} \dfrac1n \approx 3.995$$
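The counting argument above can be checked by brute force for small flocks. The sketch below is only an illustration: the function names are made up for this example, and speeds are replaced by their ranks (only the order matters). A sheep leads a group exactly when it is slower than every sheep in front of it.

```python
from fractions import Fraction
from itertools import permutations

def count_groups(speeds):
    """Number of final groups; speeds[0] belongs to the front sheep."""
    groups = 0
    slowest_ahead = float("inf")
    for v in speeds:
        if v < slowest_ahead:      # nobody in front is slower: a new group leader
            groups += 1
            slowest_ahead = v
    return groups

def expected_groups_bruteforce(n):
    """Average of count_groups over all n! equally likely speed orders."""
    perms = list(permutations(range(n)))
    return Fraction(sum(count_groups(p) for p in perms), len(perms))

def harmonic(n):
    """Exact harmonic number H_n = 1 + 1/2 + ... + 1/n."""
    return sum(Fraction(1, k) for k in range(1, n + 1))

print(expected_groups_bruteforce(3))   # 11/6
print(float(harmonic(30)))             # approximately 3.995
```

Enumerating all orders is only feasible for small $n$; for 30 sheep the harmonic sum is evaluated directly.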

  • 0
    Can you look at my answer and say where I went wrong? I tried to find the exact answer for a flock of 5 sheep without calculations, just by drawing the groups, and it gave me exactly 3. My formula gives exactly 3, and yours doesn't. 2017-02-19
  • 1
    It might be easier with $3$ sheep: your formula gives $2$, while mine gives $\frac{11}{6}$. The six possible equally-likely patterns are: (a) fastest first, then mid-speed then slowest, giving $3$ groups; (b) fastest first, then slowest then mid-speed, giving $2$ groups; (c) mid-speed first, then slowest then fastest, giving $1$ group; (d) mid-speed first, then fastest then slowest, giving $2$ groups; (e) slowest first, then fastest then mid-speed, giving $2$ groups; (f) slowest first, then mid-speed then fastest, giving $1$ group. Expected number of groups $\dfrac{3+2+1+2+2+1}{6}$ 2017-02-19
  • 0
    Now I see that some of my groups include several of your groups... Hm, pretty interesting. 2017-02-19
  • 0
    I mean my group "<>" is d+e, "><" is b+c, ">>" is f and "<<" is a. 2017-02-19
  • 0
    I confirm that Henry's result $\sum_{k=1}^N 1/k$ is the right one, theoretically as well as on the basis of simulation, where the agreement is excellent, not only for $N=30$ sheep but even for an armada of $N=200$ sheep... 2017-02-19
  • 0
    Henry's answer seems to be solving a different problem... if the $n$th sheep from the front is the slowest, that doesn't mean there are $1/n$ clusters: there may (and often will) be more clusters, if the sheep in front of the $n$th sheep are also divided into smaller clusters. 2017-02-19
  • 1
    Henry's comment about the 3 sheep case is incorrect, for example: his case (c) yields 2 groups, not 1. Also case (e) is wrong: if the slowest sheep is in front, there is always 1 group. 2017-02-19
  • 0
    @JRichey: With your corrections (thank you) I should have said: The six possible equally-likely patterns are: (a) fastest first, then mid-speed then slowest, giving 3 groups; (b) fastest first, then slowest then mid-speed, giving 2 groups; (c) mid-speed first, then slowest then fastest, giving 2 groups; (d) mid-speed first, then fastest then slowest, giving 2 groups; (e) slowest first, then fastest then mid-speed, giving 1 group; (f) slowest first, then mid-speed then fastest, giving 1 group. Expected number of groups $\dfrac{3+2+2+2+1+1}{6}=\dfrac{11}{6}$, still not $\dfrac{3+1}{2}$. 2017-02-19
  • 0
    Is there a connection with the coupon collector problem, which also deals with harmonic numbers? 2017-03-24
  • 0
    @Henry, in my answer I added another derivation proving your assertion that the expected number of groups is the harmonic number, as well as the relevant probability distribution. I would like to know your opinion about it. 2017-03-25
  • 0
    Why is the probability that all $n-1$ sheep in front of it are faster equal to $1/n$? 2018-04-11
  • 0
    @user428487 With $n$ sheep, the probability that any particular sheep is the slowest is $\frac1n$. An alternative approach is to say that, of the $n!$ orders of speeds, $(n-1)!$ orders have that sheep as the slowest. 2018-04-11
3

Two parts:

  • 1) A visualization (see below) of the flock's evolution, on a time-position graph displaying the aggregation process.

[Figure: time-position graph of the flock's evolution]

In this case, we end up with 4 groups.

Moreover, this representation has been obtained by a simple programming "trick" (Matlab program below) that may interest people who would like to adapt it for other purposes.

Matlab program

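clear all; close all; hold on; box on;
n = 30; s = rand(1,n);             % n sheep with independent uniform speeds in (0,1)
a = 100; b = 60; axis([0,a,0,b]);  % time axis [0,a], position axis [0,b]
fill([0,a,a,0],[b,b,0,0],'c');     % background
for k = 1:n
    % region above the free trajectory of sheep k (initial position k, speed s(k));
    % it hides the parts of previously drawn trajectories that lie above it
    fill([0,a,a,0],[b,b,k+a*s(k),k],'c');
end;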

Comment: This program draws trapezoidal shapes stacked one on top of another; each newly drawn shape may partially occlude the shapes drawn before it.
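The occlusion trick amounts to taking, for each sheep, the minimum of the free-running positions of that sheep and of all sheep ahead of it. Here is a minimal Python sketch of that idea, not a port of the Matlab program: it assumes point-like sheep, uses the same convention as above (sheep $k$ starts at position $k$, so higher indices are further ahead), and all function names are made up for this illustration.

```python
import random

def positions_at(t, start, speed):
    """Positions of all sheep at time t; index 0 is the rearmost sheep.

    A sheep can never pass the one ahead of it, so its position is the
    minimum of the free-running positions of itself and all sheep ahead.
    """
    pos = [0.0] * len(start)
    ahead = float("inf")
    for k in range(len(start) - 1, -1, -1):   # walk from the front sheep backwards
        ahead = min(ahead, start[k] + speed[k] * t)
        pos[k] = ahead
    return pos

def count_groups(speed):
    """Final number of groups: sheep that are slower than everyone ahead."""
    groups, slowest = 0, float("inf")
    for v in reversed(speed):                 # front sheep first
        if v < slowest:
            groups += 1
            slowest = v
    return groups

def mean_groups(n, trials, seed=0):
    """Monte Carlo estimate of the expected number of groups for n sheep."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        total += count_groups([rng.random() for _ in range(n)])
    return total / trials

# Rear sheep (speed 0.9) catches the middle one (0.2); the front one (0.5) escapes.
print(positions_at(10.0, [1, 2, 3], [0.9, 0.2, 0.5]))
print(count_groups([0.9, 0.2, 0.5]))          # 2
print(mean_groups(30, 20000))                 # close to 4
```

After a long enough time, the number of distinct positions returned by `positions_at` equals `count_groups`, which is what the plot shows as separate visible lines.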

  • 2) A simple proof of the result given by Henry.

Let $G_k$ be the Random Variable equal to the number of groups for a flock of $k$ sheep.

Suppose that an $n$th sheep is added at the rear of a flock of $n-1$ sheep. Then

$$ \begin{cases}\text{either}&G_n=G_{n-1}+0 \\ \text{or}&G_n=G_{n-1}+1 \end{cases}$$

The latter occurs only in the exceptional case where the newly added animal is slower than all the others, so that it never catches up with anyone (this happens with probability $\tfrac{1}{n}$, since each of the $n$ sheep is equally likely to be the slowest). Thus, the natural model in terms of random variables is:

$$\tag{1}G_n=G_{n-1}+B$$

where $B$ is a Bernoulli RV with parameter $\tfrac{1}{n}$.

Taking expectation on both sides of (1),

$$E(G_n)=E(G_{n-1})+\tfrac{1}{n}.$$

Using the fact that $G_1=1 \implies E(G_1)=1$ (a single sheep constitutes a group in itself), we finally deduce, by an immediate induction, that: $$E(G_n)=1+\tfrac{1}{2}+\tfrac{1}{3}+\cdots+\tfrac{1}{n}=H_n\approx\ln(n)+\gamma,$$

where $H_n$ is the $n$th harmonic number and $\gamma$ is the Euler-Mascheroni constant.
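As a quick numerical illustration, the recurrence $E(G_n)=E(G_{n-1})+\tfrac1n$ can be iterated exactly and compared with $\ln(n)+\gamma$. This is only a sketch: the function name is made up here and the value of $\gamma$ is hard-coded. The gap between the two is known to behave like $\tfrac{1}{2n}$, so the logarithmic formula is an approximation, not an identity.

```python
import math
from fractions import Fraction

EULER_GAMMA = 0.5772156649015329   # Euler-Mascheroni constant (hard-coded)

def expected_groups(n):
    """Iterate E(G_k) = E(G_{k-1}) + 1/k starting from E(G_1) = 1."""
    e = Fraction(1)
    for k in range(2, n + 1):
        e += Fraction(1, k)
    return e

for n in (5, 30, 200):
    exact = float(expected_groups(n))
    approx = math.log(n) + EULER_GAMMA
    # the last column is close to 1/(2n)
    print(n, exact, approx, exact - approx)
```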

  • 0
    Wow, I really appreciate that! Thanks! Hm, but it is pretty interesting that J Richey and I have different answers. Now I don't fully understand who is wrong. 2017-02-19
3

The answers already provided to this post, and to the equivalent post about cars in a queue, deal with the expected number of groups, which is what the post asks for.
Since the problem is quite interesting, I got curious about the underlying probability distribution, but I did not manage to find a satisfactory hint about it (admittedly, I might have overlooked something).
However, I tried to develop a different approach, which shows what the probability distribution behind it is.

Let us "quantize" the speed into $n$ classes.
Then we can represent the $q$ sheep in the queue at time $0$ on a speed vs. position diagram, as in the sketch.

[Figure: sketch of speed vs. position of the sheep at time $0$]

It is clear that the first sheep will block all the following ones with higher or equal speed, that is, all those that come before the first one (n. $4$ in the sketch) with a speed lower than that of n. $1$.
That one in turn will block the following ones with the same or higher speed, and so on.
Note that we identify the resulting groups as the blocks that, as time goes on, get separated by an ever larger distance. So in the sketch n. $1$ and n. $3$ are in the same group.

Now the total number of possible ways of arranging the diagram is $T(n,q)=n^q$.

The number of ways to arrange a group, with leader speed $v$ and $m$ members, is: $$ G(v,m) = \left( {n - v + 1} \right)^{\,m-1} $$ Therefore the number of ways $N_{1}(n,q)$ to arrange the sheep in such a way that they finally make up only one group will be: $$ \begin{gathered} N_{\,1} (n,q) = \sum\limits_{1\, \leqslant \,\,v_{\,1} \, \leqslant \,n} {\left( {n - v_{\,1} + 1} \right)^{\,q - 1} } = \sum\limits_{1\, \leqslant \,\,k\, \leqslant \,n} {k^{\,q - 1} } = \hfill \\ = \sum\limits_{\left( {0\, \leqslant } \right)\,j\,\left( { \leqslant \,q - 1} \right)} {\left\langle \begin{gathered} q - 1 \\ j \\ \end{gathered} \right\rangle \left( \begin{gathered} n + 1 + j \\ q \\ \end{gathered} \right)} = \sum\limits_{\left( {0\, \leqslant } \right)\,j\,\left( { \leqslant \,q - 1} \right)} {\;j!\;\left\{ \begin{gathered} q - 1 \\ j \\ \end{gathered} \right\}\left( \begin{gathered} n + 1 \\ j + 1 \\ \end{gathered} \right)} = \hfill \\ = \frac{1} {q}\sum\limits_{0\, \leqslant \,j\, \leqslant \,q - 1} {\left( \begin{gathered} q \\ j \\ \end{gathered} \right)\;B_j \;\left( {n + 1} \right)^{\,q - j} } \quad \left| {\;1 \leqslant \text{integer }q,n} \right. \hfill \\ \end{gathered} $$ where $ \left\langle {} \right\rangle $ denotes the Eulerian numbers, $\left\{ {} \right\}$ the Stirling numbers of the 2nd kind, and $B_j$ the Bernoulli numbers (with $B_1=-\tfrac12$). Note that the three closed forms on the last two lines hold for $q \geqslant 2$; for $q=1$ the sum is simply $n$.
Then the number of ways to arrange them as to finally have two groups is $$ \begin{gathered} N_{\,2} (n,q) = \sum\limits_{\left\{ \begin{subarray}{l} 1\, \leqslant \,v_{\,2} \, < \,v_{\,1} \, \leqslant \,n \\ m_{\,2} \, + \,m_{\,1} \, = \,q\;\;\left| {\;1\, \leqslant \,m_{\,k} \,} \right. \end{subarray} \right.} {\left( {n - v_{\,1} + 1} \right)^{\,m_{\,1} - 1} \left( {n - v_{\,2} + 1} \right)^{\,m_{\,2} - 1} } = \hfill \\ = \sum\limits_{\left\{ \begin{subarray}{l} 1\, \leqslant \,k_{\,1} \, < \,k_{\,2} \, \leqslant \,n\, \\ m_{\,2} \, + \,m_{\,1} \, = \,q\;\;\left| {\;1\, \leqslant \,m_{\,k} \,} \right. \end{subarray} \right.} {k_{\,1} ^{\,m_{\,1} - 1} \;k_{\,2} ^{\,m_{\,2} - 1} } \hfill \\ \end{gathered} $$ to which corresponds a probability $$ P_{\,2} (n,q) = \frac{{N_{\,2} (n,q)}} {{n^{\,q} }} = \frac{1} {{n^{\,2} }}\sum\limits_{\left\{ \begin{subarray}{l} 1\, \leqslant \,k_{\,1} \, < \,k_{\,2} \, \leqslant \,n\, \\ m_{\,2} \, + \,m_{\,1} \, = \,q\;\;\left| {\;1\, \leqslant \,m_{\,k} \,} \right. \end{subarray} \right.} {\left( {\frac{{k_{\,1} }} {n}} \right)^{\,m_{\,1} - 1} \;\left( {\frac{{k_{\,2} }} {n}} \right)^{\,m_{\,2} - 1} } $$ and so forth.
I do not know whether the multiple summations can be reduced to a simpler form. (*)
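The counts $N_1(n,q)$ and $N_2(n,q)$ can be checked by brute force for small $n$ and $q$. The sketch below assumes the tie rule used above (a sheep with speed higher than or equal to the leader's joins its group) and lists speeds from the front of the queue; the function names are made up for this illustration.

```python
from itertools import product

def count_groups(speeds):
    """speeds[0] is the front sheep; equal speeds are blocked (join the group)."""
    groups, slowest = 0, float("inf")
    for v in speeds:
        if v < slowest:            # strictly slower than everyone ahead
            groups += 1
            slowest = v
    return groups

def brute_force(n, q, g):
    """Number of the n**q speed assignments that end up in exactly g groups."""
    return sum(1 for s in product(range(1, n + 1), repeat=q)
               if count_groups(s) == g)

def n1_formula(n, q):
    """N_1(n, q) = sum over k of k^(q-1)."""
    return sum(k ** (q - 1) for k in range(1, n + 1))

def n2_formula(n, q):
    """N_2(n, q): double sum over k1 < k2 and m1 + m2 = q with m1, m2 >= 1."""
    total = 0
    for k1 in range(1, n + 1):
        for k2 in range(k1 + 1, n + 1):
            for m1 in range(1, q):
                total += k1 ** (m1 - 1) * k2 ** (q - m1 - 1)
    return total

print(brute_force(3, 3, 1), n1_formula(3, 3))   # 14 14
print(brute_force(3, 3, 2), n2_formula(3, 3))   # 12 12
```

For $n=q=3$ the remaining arrangement (speeds $3,2,1$ from the front) gives three groups, so the counts add up to $3^3=27$.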

However, if we increase the speed granularity (i.e. $n$) up to the continuum, then we can replace the summations over the $k$'s in the expression for the probability with integrals
$$ \begin{gathered} P_{\,g} (q) = \sum\limits_{\begin{subarray}{l} \\ \\ \,m_{\,1} \, + \,m_{\,2} + \, \cdots \, + m_{\,g} \, = \,q\;\;\left| {\;1\, \leqslant \,m_{\,k} \,} \right. \end{subarray} } {\mathop {\int {} }\limits_{\begin{subarray}{l} \\ 0\, \leqslant \,x_{\,1} \, < \,x_{\,2} < \, \cdots \, < \,x_{\,g} \, \leqslant \,1 \end{subarray} } \prod\limits_{1\, \leqslant \,j\, \leqslant \,g\,} {x_{\,j} ^{\,m_{\,j} - 1} \;dx_{\,j} } } = \hfill \\ = \sum\limits_{\begin{subarray}{l} \\ \\ \,m_{\,1} \, + \,m_{\,2} + \, \cdots \, + m_{\,g} \, = \,q\;\;\left| {\;1\, \leqslant \,m_{\,k} \,} \right. \end{subarray} } {\frac{1} {{m_{\,1} }}\;\mathop {\int {} }\limits_{0\, \leqslant \,\,x_{\,2} < \, \cdots \, < \,x_{\,g} \, \leqslant \,1} x_{\,2} ^{\,m_{\,1} + m_{\,2} - 1} \;dx_{\,2} \prod\limits_{3\, \leqslant \,j\, \leqslant \,g\,} {x_{\,j} ^{\,m_{\,j} - 1} \;dx_{\,j} } } = \hfill \\ = \sum\limits_{\begin{subarray}{l} \\ \\ \,m_{\,1} \, + \,m_{\,2} + \, \cdots \, + m_{\,g} \, = \,q\;\;\left| {\;1\, \leqslant \,m_{\,k} \,} \right. \end{subarray} } {\frac{1} {{m_{\,1} }}\frac{1} {{m_{\,1} + m_{\,2} }}\, \cdots \,\frac{1} {{m_{\,1} + m_{\,2} + \, \cdots \, + m_{\,g} }}} \hfill \\ \end{gathered} $$
We can see that $P$ satisfies the following recursion $$ \bbox[lightyellow] { P_{\,g} (q) = \frac{1} {q}\;\sum\limits_{\left( {0\, \leqslant \,g - 1 \leqslant } \right)\,k\, \leqslant \,q - 1} {P_{\,g - 1} (k)} }$$

and for the initial conditions we can state that $0$ sheep can only be arranged into an empty group and, vice versa, that an empty group can gather only $0$ sheep. $$ \left\{ \begin{gathered} P_{\,g} (q) = 0\quad \left| {\;g < 0\; \vee \;q < 0} \right. \hfill \\ P_{\,g} (0) = \delta _{\,g,\,0} \hfill \\ P_{\,0} (q) = \delta _{\,q,\,0} \hfill \\ \end{gathered} \right. $$ Now, starting from a known identity about Stirling numbers of the 1st kind, we have $$ \begin{gathered} \left[ \begin{gathered} n + 1 \\ m + 1 \\ \end{gathered} \right] = \sum\limits_{0\, \leqslant \,k\, \leqslant \,n} {\left[ \begin{gathered} k \\ m \\ \end{gathered} \right]n^{\,\underline {\,n - k\,} } } = n!\sum\limits_{0\, \leqslant \,k\, \leqslant \,n} {\left[ \begin{gathered} k \\ m \\ \end{gathered} \right]\frac{1} {{k!}}} \quad \Rightarrow \hfill \\ \Rightarrow \quad \left[ \begin{gathered} n \\ m \\ \end{gathered} \right] = \left( {n - 1} \right)!\sum\limits_{0\, \leqslant \,k\, \leqslant \,n - 1} {\left[ \begin{gathered} k \\ m - 1 \\ \end{gathered} \right]\frac{1} {{k!}}} \quad \left| {\;1 \leqslant n,m} \right.\quad \Rightarrow \hfill \\ \Rightarrow \quad \left( {\frac{1} {{n!}}\left[ \begin{gathered} n \\ m \\ \end{gathered} \right]} \right) = \frac{1} {n}\;\sum\limits_{0\, \leqslant \,k\, \leqslant \,n - 1} {\left( {\frac{1} {{k!}}\left[ \begin{gathered} k \\ m - 1 \\ \end{gathered} \right]} \right)} \hfill \\ \end{gathered} $$
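The sum over compositions and the recursion for $P_g(q)$ can be compared numerically with exact rational arithmetic. This is a sketch under the conventions above ($m_k \geqslant 1$, $P_g(0)=\delta_{g,0}$, $P_0(q)=\delta_{q,0}$); the function names are invented for the illustration.

```python
from fractions import Fraction
from functools import lru_cache

def compositions(q, g):
    """Yield all tuples (m_1, ..., m_g) of positive integers summing to q."""
    if g == 0:
        if q == 0:
            yield ()
        return
    for first in range(1, q - g + 2):          # leave at least 1 for each other part
        for rest in compositions(q - first, g - 1):
            yield (first,) + rest

def p_composition(g, q):
    """Sum over compositions of 1 / (m1 (m1+m2) ... (m1+...+mg))."""
    total = Fraction(0)
    for ms in compositions(q, g):
        term, partial = Fraction(1), 0
        for m in ms:
            partial += m
            term /= partial
        total += term
    return total

@lru_cache(maxsize=None)
def p_recursive(g, q):
    """P_g(q) = (1/q) * sum_{k=g-1}^{q-1} P_{g-1}(k), with the stated initial conditions."""
    if g < 0 or q < 0:
        return Fraction(0)
    if g == 0 or q == 0:
        return Fraction(1 if g == q else 0)
    return Fraction(1, q) * sum(p_recursive(g - 1, k) for k in range(g - 1, q))

print(p_composition(2, 3), p_recursive(2, 3))   # 1/2 1/2
```

For three sheep this gives probability $\tfrac12$ of ending with two groups, in agreement with the enumeration of the six orderings discussed in the comments to Henry's answer.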

Finally, since both have the same recursion and the same initial conditions, we arrive at: $$ \bbox[lightyellow] { P_{\,g} (q) = \frac{1} {{q!}}\left[ \begin{gathered} q \\ g \\ \end{gathered} \right] }$$ and then it is known (**) that $$ \bbox[lightyellow] { \overline g (q) = \sum\limits_{\left( {0\, \leqslant } \right)\,g\,\left( { \leqslant \,q} \right)} {\frac{g} {{q!}}\left[ \begin{gathered} q \\ g \\ \end{gathered} \right]} = H(q) }$$
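The closed form can be evaluated directly. The sketch below builds the unsigned Stirling numbers of the 1st kind from the standard recurrence $\left[{n \atop g}\right]=(n-1)\left[{n-1 \atop g}\right]+\left[{n-1 \atop g-1}\right]$ and checks that the resulting distribution sums to $1$ and has mean $H(q)$; the function names are made up for this example.

```python
from fractions import Fraction
from math import factorial

def stirling_first_row(q):
    """Unsigned Stirling numbers of the first kind [q, g] for g = 0..q."""
    row = [1]                                   # [0, 0] = 1
    for n in range(1, q + 1):
        new = [0] * (n + 1)
        for g in range(n + 1):
            if g < n:
                new[g] += (n - 1) * row[g]      # (n-1) * [n-1, g]
            if g > 0:
                new[g] += row[g - 1]            # [n-1, g-1]
        row = new
    return row

def group_distribution(q):
    """P_g(q) = [q, g] / q! for g = 0..q, as exact fractions."""
    return [Fraction(c, factorial(q)) for c in stirling_first_row(q)]

def mean_groups(q):
    return sum(g * p for g, p in enumerate(group_distribution(q)))

print([str(p) for p in group_distribution(3)])  # ['0', '1/3', '1/2', '1/6']
print(float(mean_groups(30)))                   # approximately 3.995
```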

-----------
Note (*)
I have later found a closed form for $N_{g}(n,q)$, which is presented in this related post.

----------

Note (**)
because we have in fact $$ \begin{gathered} \frac{{\left( {x + 1} \right)^{\,\overline {\,n\,} } }} {{n!}} = \prod\limits_{0\, \leqslant \,k\, \leqslant \,n - 1} {\frac{{x + 1 + k}} {{1 + k}}} = \exp \left( {\sum\limits_{0\, \leqslant \,k\, \leqslant \,n - 1} {\ln \left( {1 + \frac{x} {{1 + k}}} \right)} } \right) = \sum\limits_{\left( {0\, \leqslant } \right)\,k} {\frac{1} {{n!}}\left[ \begin{gathered} n \\ k \\ \end{gathered} \right]\left( {x + 1} \right)^{\,k} } \hfill \\ \left( {x + 1} \right)\frac{d} {{d\,x}}\frac{{\left( {x + 1} \right)^{\,\overline {\,n\,} } }} {{n!}} = \exp \left( {\sum\limits_{0\, \leqslant \,k\, \leqslant \,n - 1} {\ln } \left( {1 + \frac{x} {{k + 1}}} \right)} \right)\sum\limits_{0\, \leqslant \,k\, \leqslant \,n - 1} {\frac{{x + 1}} {{x + k + 1}}} = \sum\limits_{\left( {0\, \leqslant } \right)\,k} {\frac{k} {{n!}}\left[ \begin{gathered} n \\ k \\ \end{gathered} \right]\left( {x + 1} \right)^{\,k} } \hfill \\ \left. {\left( {\left( {x + 1} \right)\frac{d} {{d\,x}}\frac{{\left( {x + 1} \right)^{\,\overline {\,n\,} } }} {{n!}} = \left( {x + 1} \right)\frac{d} {{d\,x}}\frac{{\Gamma \left( {x + 1 + n} \right)}} {{\Gamma \left( {x + 1} \right)\Gamma \left( {1 + n} \right)}}} \right)\;} \right|_{\;x\, = \,0} = \sum\limits_{0\, \leqslant \,k\, \leqslant \,n - 1} {\frac{1} {{k + 1}}} = \sum\limits_{\left( {0\, \leqslant } \right)\,k} {\frac{k} {{n!}}\left[ \begin{gathered} n \\ k \\ \end{gathered} \right]} \hfill \\ \end{gathered} $$

  • 0
    [+1] An interesting analysis. I didn't know the final connection between Stirling numbers and Harmonic numbers. For my own answer, I have decided to incorporate in it a short and rigorous proof for the mean value. 2017-03-24
  • 0
    @JeanMarie: honoured by your appreciation: the appeal of this site is in fact to learn from each other (and I learned a lot from so many of your answers!). But you are right, possibly the connection between Stirling numbers of the 1st kind and harmonic numbers is not so well known, so I added its derivation. 2017-03-25
  • 0
    @JeanMarie: since you expressed interest in this derivation, allow me to point out the further extension provided in the post mentioned in Note(\*) above. 2017-03-29