Question:

Consider the list $L[0:n]$ where $n = 2k - 1$. Calculate the average complexity $A(n)$ of Linear Search, where the following conditions all hold simultaneously: the probability that search element $X$ occurs in the list is $0.7$; given that $X$ occurs in the list, it is twice as likely to occur in the first half $L[0:k - 1]$ as in the second half $L[k:n - 1]$; and for any given half, $X$ is equally likely to occur in any position of that half.

My first answer was an intuitive one and, I think, close to correct, but it was slightly faulty because I did not account for the probability that the element is not in the list. I then saw [this](Involving probability, distinct integers and linear search algorithms... distinct-integers-and-linear-search-algorithms) question, which made a lot of sense to me, so I tried to extend its answer into the following, on which I'd like feedback. Thanks!

For the element to be twice as likely to be in the first half as in the second half, we divide the total probability into thirds: $\frac{0.7}{3} = 0.2\overline{3}$. This means the first half has probability $0.2\overline{3} \cdot 2 = 0.4\overline{6}$, whereas the second half has probability of just $0.2\overline{3}$.
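The split into thirds can be checked with exact fractions. This is a minimal sketch; the variable names are illustrative and not part of the original problem.

```python
from fractions import Fraction

p_in_list = Fraction(7, 10)   # P(X occurs in the list), from the question
p_second = p_in_list / 3      # one share goes to the second half
p_first = 2 * p_second        # two shares go to the first half

# The two halves must account for the whole 0.7.
assert p_first + p_second == p_in_list

print(p_first, p_second)      # 7/15 and 7/30, i.e. about 0.4667 and 0.2333
```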

Let's assume the basic operation of linear search is an $O(1)$ comparison.

  • Average number of basic operations incurred if the element is in the first half = $\frac{n}{4}(.4667)$

  • Average number of basic operations incurred if the element is in the second half = $\frac{n}{4}(.2334)$

  • Average number of basic operations incurred if the element is not in the array = $.3n$

Therefore the average complexity:

$$\frac{n}{4}(.4667) + \frac{n}{4}(.2334) + .3n$$

which simplifies to:

$$\require{cancel}\cancel{.4667n + .2334n + 1.2n}$$

$$\frac{.4667n + .2334n + 1.2n}{4}$$

Thus:

$$\cancel{A(n) \approx1.9n}$$

$$A(n) \approx \frac{1.9n}{4} \approx .475n$$

Is this logic correct?


Edit

I think the above answer, and the current "answer" to this question are both wrong. Some of my new work indicating a possibly correct answer:

First let's define some things:

  • Let the desired value to find be named $x$
  • Let $p$ be the probability that $x$ is in the list
  • Assume the list has a uniform probability distribution, i.e. given that $x$ is in the list, the probability that any given element is $x$ equals $\frac{1}{n}$

The definition of average complexity with input of size $n$ is:

$$A(n) = E[\tau] = E[\tau ~|~ x \in L]*p + E[\tau ~|~ x \notin L]*(1 - p)$$

where

$$E[\tau ~|~ x \in L]*p = p\sum_{i = B(n)}^{W(n)}{\frac{i}{n}}$$

and

$$E[\tau ~|~ x \notin L]*(1 - p) = n(1 - p)$$
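The definition above can be evaluated directly by enumerating positions. The following is a minimal sketch under stated assumptions: finding $x$ at 0-based index $i$ costs $i + 1$ comparisons, an unsuccessful search costs $n$ comparisons, and the helper names `expected_comparisons` and `uniform_probs` are hypothetical, introduced only for illustration.

```python
from fractions import Fraction

def expected_comparisons(position_probs, n):
    """E[tau] for linear search over a list of length n.

    position_probs[i] is P(x sits at 0-based index i); the entries may sum
    to less than 1, and the remainder is P(x not in the list).
    """
    p = sum(position_probs)
    # Successful searches: index i costs i + 1 comparisons.
    found = sum((i + 1) * q for i, q in enumerate(position_probs))
    # Unsuccessful searches: all n elements are compared.
    return found + (1 - p) * n

def uniform_probs(n, p):
    """Uniform case: each position holds x with probability p / n."""
    return [Fraction(p) / n] * n

n = 10
p = Fraction(7, 10)
# For the uniform case this should equal p * (n + 1) / 2 + (1 - p) * n.
print(expected_comparisons(uniform_probs(n, p), n))
```

Because `position_probs` is an arbitrary list, the same function can be fed a non-uniform distribution as well.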

The list in our example of course does not have a uniform distribution, but we can use the fact that each half has a uniform distribution to our advantage. Let's consider the average complexity of linear search in the first half. Here $p = .4667$ and the length of the half is $\frac{n}{2}$:

$$A\Big(\frac{n}{2}\Big) = E[\tau] = .4667\sum_{i = 1}^{\frac{n}{2}}{\frac{i}{\frac{n}{2}}} = \frac{.9334}{n}\Big(\frac{n^2}{8} + \frac{n}{4}\Big) = .116675n + .2334$$

Let's consider the second half:

$$A\Big(\frac{n}{2}\Big) = .2334\sum_{i = \frac{n}{2}}^{n}{\frac{i}{\frac{n}{2}}} + \frac{.2334n}{2} = \frac{.4667}{n}\Big(\frac{3n(n+2)}{8}\Big) + \frac{.2334n}{2} = .1750125n + .350025 + \frac{.2334n}{2}$$

And of course $E[\tau ~|~ x \notin L]*(1 - p) = .3n$.

Making the entire complexity:

$$A(n) = .116675n + .2334 + .1750125n + .350025 + \frac{.2334n}{2} + .3n$$

Simplifying to

$$A(n) = .7084n + .5834$$
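One way to sanity-check a closed form like this is to compute the three contributions exactly for a concrete $n$ and compare them term by term. The sketch below is only that, and it rests on assumptions not fixed by the question: $n$ is even, each half holds exactly $\frac{n}{2}$ items, finding $X$ at 1-based position $i$ costs $i$ comparisons, and a miss costs $n$ comparisons. The function name `two_half_terms` is hypothetical.

```python
from fractions import Fraction

def two_half_terms(n, p_in=Fraction(7, 10)):
    """Exact contributions to E[tau] from the first half, second half, and a miss.

    Assumes n is even, positions are 1-based, position i costs i comparisons,
    and an unsuccessful search costs n comparisons.
    """
    half = n // 2
    p_first = p_in * Fraction(2, 3)    # first half is twice as likely
    p_second = p_in * Fraction(1, 3)
    # Each half is uniform, so every position in it gets an equal share.
    first = sum(Fraction(i) * p_first / half for i in range(1, half + 1))
    second = sum(Fraction(i) * p_second / half for i in range(half + 1, n + 1))
    miss = (1 - p_in) * n
    return first, second, miss

first, second, miss = two_half_terms(100)
print(first, second, miss, first + second + miss)
```

Under these assumptions, $n = 100$ gives $\frac{119}{10} + \frac{1057}{60} + 30 = \frac{3571}{60} \approx 59.52$ comparisons, which can be set against the value a candidate formula predicts for the same $n$.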

1 Answer

No. I got the following.

n/4(.4666) + n/4(.2333) + .3(n) = .4749n

Which I believe makes more sense given that x is twice as likely to occur in the first half of the list of size n.

I was given this problem before, except that the probability that the item is in the list was 1, so the probability that it is not in the list was 0. The math would then be as follows:

.6666/4+.3333/4 == .25

  • Ahh, I multiplied by 4 and never ended up dividing by 4, I think. That makes more sense - stupid mistake! (2017-01-30)
  • I'm in your class right now @DomFarolino (2017-01-30)
  • Lmao really. Nice #algos (2017-01-30)
  • Consider the problem where the probability that the item is in the list = 1. The first half's probability = 0, and the second half's prob = 1. By the current logic, the average complexity would be (n/4)*(0) + (n/4)*(1), which would make the avg complexity = (n/4), which is clearly not true because in reality the best case = (n/2) and the worst case = (n). (2017-01-30)
  • I could be wrong - but I think instead of just taking into account the avg complexity of the item being in the second half, we also need to take into account the complexity of it not being in the first half. Example: Prob item exists in list = 1. First half = .5, second half = .5.... (n/4)*(.5) + (n/4)*(.5) = (n/4) avg complexity = wrong. I think we'd have to do (n/4)*(.5) + [(n/4)*(.5) + (n/2)*.5] to make the actual avg complexity = (n/2). Does this kinda make sense? (2017-01-30)
  • I do not think so. Because we are looking for overall probability, and overall average complexity. So I do not think the individual halves would matter. (2017-01-30)
  • Given your array where the probability of an element existing is twice as likely in the first half as it is in the second half (.666, .333) (total = 1), I cannot see how the A(n) = .25. By that logic, if you have the array whose halves have probability .00001 and .99999 respectively, the A(n) would also be .25, which is wrong. Instead A(n) = n*((.00001)/4 + (.99999)/4 + (.99999)/2) ≈ n*(.75) (2017-01-31)
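The limiting cases raised in the comments above can also be checked empirically by running an actual linear search many times. This is a rough Monte Carlo sketch, not a proof. It assumes an even `n`, two halves of equal size, a uniform position within each half, and a cost of `n` comparisons when the target is absent; the function names and the `trials` and `seed` parameters are hypothetical.

```python
import random

def linear_search_comparisons(lst, x):
    """Return how many comparisons linear search makes looking for x."""
    for count, item in enumerate(lst, start=1):
        if item == x:
            return count
    return len(lst)  # x absent: every element was compared

def simulate(n, p_first, p_second, trials=20_000, seed=0):
    """Estimate the average comparison count by sampling targets."""
    rng = random.Random(seed)
    lst = list(range(n))
    half = n // 2
    total = 0
    for _ in range(trials):
        u = rng.random()
        if u < p_first:
            x = rng.randrange(0, half)    # uniform in the first half
        elif u < p_first + p_second:
            x = rng.randrange(half, n)    # uniform in the second half
        else:
            x = -1                        # not in the list
        total += linear_search_comparisons(lst, x)
    return total / trials

n = 100
print(simulate(n, 0.5, 0.5))   # always present, halves equally likely
print(simulate(n, 0.0, 1.0))   # always present, always in the second half
```

With `n = 100`, the first case should land near `n/2` and the second near `3n/4`, which is a quick way to test whether a proposed formula accounts for the offset of the second half.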