Question:
Consider the list $L[0:n]$ where $n = 2k - 1$. Calculate the average complexity $A(n)$ of Linear Search, where the following conditions all hold simultaneously: the probability that search element $X$ occurs in the list is $.7$; given that $X$ occurs in the list, it is twice as likely to occur in the first half $L[0:k - 1]$ as in the second half $L[k:n - 1]$; and for any given half, $X$ is equally likely to occur in any position of that half.
My first answer was intuitive and, I think, very close to correct, but it was slightly faulty because I was not accounting for the probability that the element is not in the list. I then saw [this](Involving probability, distinct integers and linear search algorithms... distinct-integers-and-linear-search-algorithms) question, which made a lot of sense to me, so I tried to extend its answer to produce the following, on which I'd like feedback. Thanks!
For the element to be twice as likely to occur in the first half as in the second half, we have to divide the probability $.7$ into thirds: $\frac{.7}{3} = .2\overline{333}$. This means the first half has probability $.2\overline{333}*2 = 0.4\overline{666}$, whereas the second half has probability of just $.2\overline{333}$.
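To avoid rounding drift in the decimals, the split above can be sketched with exact fractions. This is only an illustration; the function name `split_probabilities` and its parameters are my own, not part of the problem.

```python
from fractions import Fraction

def split_probabilities(p_in_list=Fraction(7, 10), ratio=2):
    """Split p_in_list between two halves so that the first half is
    `ratio` times as likely to contain X as the second half."""
    # The second half gets one share, the first half gets `ratio` shares.
    p_second = p_in_list / (ratio + 1)
    p_first = ratio * p_second
    return p_first, p_second

p_first, p_second = split_probabilities()
print(p_first, p_second)  # 7/15 7/30
```

The two values add back up to $\frac{7}{10}$, which matches the $.4\overline{666}$ and $.2\overline{333}$ used here.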
Let's assume the basic operation of linear search is an $O(1)$ comparison.
AVG number of basic operations incurred if element $\in$ first half = $\frac{n}{4}(.4667)$
AVG number of basic operations incurred if element $\in$ second half = $\frac{n}{4}(.2334)$
AVG number of basic operations incurred if element $\notin$ array = $.3n$
Therefore the average complexity:
$$\frac{n}{4}(.4667) + \frac{n}{4}(.2334) + .3n$$
which simplifies to:
$$\require{cancel}\cancel{.4667n + .2334n + 1.2n}$$
$$\frac{.4667n + .2334n + 1.2n}{4}$$
Thus:
$$\cancel{A(n) \approx1.9n}$$
$$A(n) \approx \frac{1.9n}{4} \approx .475n$$
Is this logic correct?
Edit
I think the above answer, and the current "answer" to this question are both wrong. Some of my new work indicating a possibly correct answer:
First let's define some things:
- Let the desired value to find be named $x$
- Let $p$ be the probability that $x$ is in the list
- Assume the list has a uniform probability distribution, i.e. the probability that any given element is $x$ equals $\frac{1}{n}$
The definition of average complexity with input of size $n$ is:
$$A(n) = E[\tau] = E[\tau ~|~ x \in L]*p + E[\tau ~|~ x \notin L]*(1 - p)$$
where
$$E[\tau ~|~ x \in L]*p = p\sum_{i = B(n)}^{W(n)}{\frac{i}{n}}$$
and
$$E[\tau ~|~ x \notin L]*(1 - p) = n(1 - p)$$
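As a sanity check of the uniform case, here is a minimal sketch that evaluates the definition by direct summation and compares it with a closed form. It assumes $B(n) = 1$ and $W(n) = n$ (finding $x$ at 1-based position $i$ costs $i$ comparisons, and an unsuccessful search costs $n$); the function names are hypothetical.

```python
from fractions import Fraction

def uniform_average(n, p):
    """Expected comparisons when x is in the list with probability p
    and, if present, is equally likely to be at any of the n positions."""
    # Successful search: position i (1-based) costs i comparisons,
    # each position having conditional probability 1/n.
    hit = p * sum(Fraction(i, n) for i in range(1, n + 1))
    # Unsuccessful search: every one of the n elements is compared.
    miss = (1 - p) * n
    return hit + miss

def uniform_closed_form(n, p):
    """Closed form of the same sum: p(n + 1)/2 + (1 - p)n."""
    return p * Fraction(n + 1, 2) + (1 - p) * n

p = Fraction(7, 10)
for n in (1, 5, 9):
    print(n, uniform_average(n, p), uniform_closed_form(n, p))
```

Under these assumptions the two functions agree for every $n$, since $\sum_{i=1}^{n} \frac{i}{n} = \frac{n+1}{2}$.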
The list in our example of course does not have a uniform distribution, but we can use the fact that each half has a uniform distribution to our advantage. Let's consider the average complexity of linear search in the first half, where $p = .4667$ and the length of the half is $\frac{n}{2}$:
$$A\Big(\frac{n}{2}\Big) = E[\tau] = .4667\sum_{i = 1}^{i = \frac{n}{2}}{\frac{i}{\frac{n}{2}}} = \frac{.9334}{n}\Big(\frac{n^2}{8} + \frac{n}{4}\Big) = .116675n + .2334$$
Let's consider the second half
$$A\Big(\frac{n}{2}\Big) = .2334\sum_{i = \frac{n}{2}}^{i = n}{\frac{i}{\frac{n}{2}}} + \frac{.2334n}{2} = \frac{.4667}{n}\Big(\frac{3n(n+2)}{8}\Big) + \frac{.2334n}{2} = .1750125n + .350025 + \frac{.2334n}{2}$$
And of course the contribution to $E[\tau]$ when $x \notin L$ is $.3n$.
Making the entire complexity:
$$A(n) = .116675n + .2334 + .1750125n + .350025 + \frac{.2334n}{2} + .3n$$
which simplifies to
$$A(n) \approx .7084n + .5834$$
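One way to test any candidate formula for $A(n)$ is to enumerate the expectation directly for concrete values of $k$. The sketch below is only that, a check under stated assumptions: it reads the halves literally as $L[0:k-1]$ ($k$ elements) and $L[k:n-1]$ ($k - 1$ elements), charges $i + 1$ comparisons for finding $X$ at 0-based index $i$, and charges $n$ comparisons for an unsuccessful search. The name `exact_average` is hypothetical.

```python
from fractions import Fraction

def exact_average(k, p=Fraction(7, 10)):
    """Expected comparisons of linear search on n = 2k - 1 elements
    under the problem's distribution (requires k >= 2)."""
    n = 2 * k - 1
    p_first = p * Fraction(2, 3)   # X is in the first half
    p_second = p * Fraction(1, 3)  # X is in the second half
    total = Fraction(0)
    # First half: indices 0 .. k-1, uniform over k positions.
    for i in range(0, k):
        total += (p_first / k) * (i + 1)
    # Second half: indices k .. n-1, uniform over k-1 positions.
    for i in range(k, n):
        total += (p_second / (k - 1)) * (i + 1)
    # X not in the list: all n elements are compared.
    total += (1 - p) * n
    return total

for k in (2, 5, 50):
    n = 2 * k - 1
    print(n, float(exact_average(k)))
```

Comparing these numbers with a proposed closed form at a few values of $n$ shows quickly whether the leading coefficient and the constant term are plausible.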