Algorithm Analysis

Searching

Sorting

Reading: Main, pp. 19-28, 548-560, 590-604, 780-781
Good programs are those which:

run correctly

run efficiently

are easy to debug

are easy to maintain or to modify

But what does "good" mean?
Suppose you want to compare two algorithms, A and B, to find out which is better in terms of running time. You implement both A and B, run them, and discover that A took 2 min and B took 1 min 45 sec.

Does this prove that B is better than A?

What if you standardized the computer, OS, test data, etc.?

Instead of measuring time, computer scientists count the number of steps necessary to complete an algorithm. A step is a simple operation such as an assignment, a comparison, or a simple arithmetic operation.

Example problem: calculate the sum of all the integers from 1 to n.

Alg. #1: int sum = 0; for (int count = 1; count <= n; count++) sum += count;

Alg. #2: int sum = ((n + 1) * n) / 2;
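A runnable Java sketch of both algorithms (class and method names are my own): the loop in Alg. #1 performs roughly n steps, while Alg. #2 performs the same few steps no matter how large n is.

public class SumDemo {
    // Alg. #1: one pass through 1..n, so about n steps.
    static long sumLoop(long n) {
        long sum = 0;
        for (long count = 1; count <= n; count++) {
            sum += count;
        }
        return sum;
    }

    // Alg. #2: Gauss's closed-form formula, a constant number of steps.
    static long sumFormula(long n) {
        return ((n + 1) * n) / 2;
    }

    public static void main(String[] args) {
        System.out.println(sumLoop(1000));    // 500500
        System.out.println(sumFormula(1000)); // 500500, without the loop
    }
}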
The running time of a program depends on:

the input

the compiler

the machine

the algorithm

Most interesting algorithms depend on their input size, n.
Algorithms are analyzed by considering n as unbounded and looking at the function of n, i.e. expressing the number of algorithm steps as a function of n. As n becomes large, we can ignore any housekeeping activities of a program and consider only the constantly repeated tasks.

Algorithms are classified by the type of function related to them, using "Big Oh" notation. Big Oh gives a very general idea of the type of algorithm.
f(n) is the actual function associated with the algorithm's execution.

g(n) is the simplest function such that, for some constant c > 0, f(n) <= c g(n) for all n > n0. We then say f(n) is O(g(n)).
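A worked illustration (the numbers are my own): take f(n) = 3n^2 + 5n and g(n) = n^2. Choosing c = 4 and n0 = 5 works, because for n > 5 we have 5n < n^2, so f(n) = 3n^2 + 5n <= 3n^2 + n^2 = 4n^2 = c g(n). Hence f(n) is O(n^2).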
Name          Big Oh

Constant      O(1)
Logarithmic   O(log(n))
Linear        O(n)
n log(n)      O(n log(n))
Quadratic     O(n^2)
Cubic         O(n^3)
Exponential   O(2^n)
Another reason why programmers like Big Oh is that it can often be deduced from the control structures in a program. E.g.:

no loops or recursion - O(1)

a single counting loop, or recursion that shrinks the problem by a fixed amount - linear, O(n)

a doubly nested for loop - ?

recursion that divides the problem by a fixed constant - ?

(A sketch suggesting answers to the last two follows this list.)
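A Java sketch (method names are my own) for the two question marks above: the doubly nested loop does n * n units of work, i.e. O(n^2), and recursion that halves n bottoms out after about log2(n) calls, i.e. O(log(n)).

// Doubly nested for loop: the body runs n * n times -> O(n^2).
static int pairCount(int n) {
    int count = 0;
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            count++;
        }
    }
    return count;
}

// Recursion that divides n by a fixed constant: about log2(n) calls -> O(log(n)).
static int halvings(int n) {
    if (n <= 1) {
        return 0;
    }
    return 1 + halvings(n / 2);
}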
If a polynomial-time algorithm exists for a problem, the problem is generally considered to be 'well solved' (polynomial functions are functions whose powers are constants, e.g. n^5 + n^3). A polynomial solution means some method has been found to solve the problem that is better than just 'blind guessing' (technically known as exhaustive search).

Algorithms that 'just guess' typically have an exponential growth rate in n (e.g. 2^n). As n grows, such problems quickly become too large to solve. An example of such an algorithm would be to sort a list by randomly rearranging its elements and then testing the list to see if it is in order. If we had no insight into how to sort a list, this would be our best approach.
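A sketch of this 'guess and test' sort (often called bogosort; the code is my own illustration). Its expected running time grows so explosively that even small lists take far too long.

import java.util.Collections;
import java.util.List;

public class GuessSort {
    // Keep shuffling until we happen upon a sorted arrangement.
    static void guessSort(List<Integer> list) {
        while (!isSorted(list)) {
            Collections.shuffle(list); // randomly rearrange the elements
        }
    }

    static boolean isSorted(List<Integer> list) {
        for (int i = 1; i < list.size(); i++) {
            if (list.get(i - 1) > list.get(i)) {
                return false;
            }
        }
        return true;
    }
}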
Research in computer science has revealed several categories of 'hard' problem:

Undecidable problems (i.e. cannot be solved at all)

Intractable problems (provably cannot be solved in polynomial time)

'Nondeterministically' polynomial (NP) problems
The third category (NP problems) can be solved in polynomial time, but only assuming an unbounded number of processors. This means that no polynomial-time algorithm has yet been developed for these problems that works on a single processor - but no one can prove that such an algorithm is impossible.

NP problems are recognised because they can all be reduced to a common base problem called the satisfiability problem. The reason NP problems are studied is that a large number of practical problems have been shown to be NP-complete or NP-hard (NP-complete problems have a yes/no answer, whereas NP-hard problems have a more complex output).

The attempt to make computers 'intelligent' has hit the NP barrier. Most activities that we do naturally, like walking, talking, recognising objects, people and handwriting, etc., become NP-complete search problems when specified to a computer. Therefore much work has gone into developing algorithms that, while still exponential in theory, perform well in practice. The human brain is an excellent example of a computer that can solve NP problems efficiently.
Suppose we have a method whose worst running time, worstTime(n), is a given function of n. Determine the effect of tripling n on the estimate of worst time. That is, estimate worstTime(3n) in terms of worstTime(n) if:

1) worstTime(n) is linear in n

2) worstTime(n) is quadratic in n

3) worstTime(n) is constant
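A worked sketch of the answers: if worstTime(n) is roughly c*n, then worstTime(3n) is roughly c*(3n) = 3 * worstTime(n); if worstTime(n) is roughly c*n^2, then worstTime(3n) is roughly c*(3n)^2 = 9 * worstTime(n); and if worstTime(n) is constant, tripling n has no effect, so worstTime(3n) = worstTime(n).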
Sequential (Linear) Search

Numbers can be in any order.

Works for linked lists.
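A minimal sequential search sketch (my own code): it inspects elements one by one, so it needs no ordering and no random access, which is why it also works for linked lists. Worst case: O(n) comparisons.

// Return the index of target in a, or -1 if absent; O(n) worst case.
static int sequentialSearch(int[] a, int target) {
    for (int i = 0; i < a.length; i++) {
        if (a[i] == target) {
            return i;
        }
    }
    return -1;
}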
Binary Search

Needs:

the list to be sorted

the ability to do random accesses
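A minimal binary search sketch (my own code), assuming the array is sorted in ascending order. Each comparison halves the remaining range, giving O(log(n)) steps, but the method must jump straight to the middle element, hence the need for random access.

// Return the index of target in sorted array a, or -1 if absent; O(log(n)).
static int binarySearch(int[] a, int target) {
    int lo = 0;
    int hi = a.length - 1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2; // middle of the remaining range
        if (a[mid] == target) {
            return mid;
        } else if (a[mid] < target) {
            lo = mid + 1;  // discard the lower half
        } else {
            hi = mid - 1;  // discard the upper half
        }
    }
    return -1;
}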
Uses of the binary search idea:

finding where a function is zero

computing functions

tree data structures

data processing

debugging code
The Problem

To rearrange a set of items into order.

Applications

Aiding searches.

Finding duplicates.

Finding matching entries in different files.
There is no single best method of sorting. The simple sorts are:

the first ones people think of

the basis for generalizing to better ones

useful for a small number of items

useful for "almost sorted" items
Sorting

Bubble Sort (a minimal sketch follows this list)

Selection Sort

Insertion Sort
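As promised above, a minimal bubble sort sketch (my own code): repeatedly swap adjacent out-of-order neighbours; O(n^2) comparisons in the worst case.

// Bubble sort: the largest remaining element 'bubbles' to the end of each pass.
static void bubbleSort(int[] a) {
    for (int pass = a.length - 1; pass > 0; pass--) {
        for (int i = 0; i < pass; i++) {
            if (a[i] > a[i + 1]) {     // adjacent pair out of order?
                int tmp = a[i];        // swap them
                a[i] = a[i + 1];
                a[i + 1] = tmp;
            }
        }
    }
}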
Insertion Sort

Linear for almost-sorted arrays.

Linear for arrays where items are at most a constant distance from their final position.
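A minimal insertion sort sketch (my own code). The inner loop only moves each item past the elements it is out of order with, which is why the sort runs in linear time on almost-sorted arrays.

// Insertion sort: grow a sorted prefix, inserting each new item into place.
static void insertionSort(int[] a) {
    for (int i = 1; i < a.length; i++) {
        int item = a[i];
        int j = i - 1;
        while (j >= 0 && a[j] > item) { // shift larger items right
            a[j + 1] = a[j];
            j--;
        }
        a[j + 1] = item; // few shifts if the array is nearly sorted
    }
}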
Selection Sort

Useful for large records, because it performs at most n - 1 swaps (few expensive record moves).
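A minimal selection sort sketch (my own code). It always does O(n^2) comparisons but only n - 1 swaps, which is why it suits large records where moves are expensive.

// Selection sort: repeatedly select the smallest remaining item.
static void selectionSort(int[] a) {
    for (int i = 0; i < a.length - 1; i++) {
        int min = i;
        for (int j = i + 1; j < a.length; j++) {
            if (a[j] < a[min]) {
                min = j;  // remember the smallest so far
            }
        }
        int tmp = a[i];   // one swap per position: n - 1 swaps total
        a[i] = a[min];
        a[min] = tmp;
    }
}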