Last updated 2/11/05
Time efficiency - a measure of amount of time for an algorithm to execute.
Space efficiency - a measure of the amount of memory needed for an algorithm to execute.
Complexity theory - a study of algorithm performance based on cost functions of statement counts.
Function dominance - a comparison of cost functions. Note that cost functions are functions that always grow.
The cost functions are typically using the size of the data collection as its domain, n.
Cost functions are non-decreasing. They're almost always purely increasing.
This is a comparison of cost functions when n is large.
Formally, function g asymptotically dominates function f if there are two positive constants c and n0 such that
c*g(n) >= f(n) for all n>= n0
That is, "g asymptotically dominates f if c*g dominates f for all large values of n larger than n0".
Or, "c*g(n) is always bigger than f(n) for values larger than n0"
At some point, g(n) is always larger than f(n). We're always interested if n0 exists (otherwise g wouldn't dominate f). Sometimes we're interested in the value of n0. The value c is rather arbitrary and we're interested if we can find one. It's particular value is rather useless. Different values of c may determine different n0 .
We are trying to place functions into categories, so that when n is large the differences between the categories are striking, rather than almost linear.
The linear type of relationship between n and the cost is eliminated.
Defn: Given two nonnegative functions f and g, f = O(g) if and only if g asymptotically dominates f
We say "f is of the order g" or "The order of f is g" or "f is big-Oh of g"
Formally, if f(n) and g(n) are functions defined for positive
integers then f(n) = O(g(n))
if there exists a c such that |f(n)|
<=c|g(n)| for all sufficiently large positive integers n.
We want to show that n2 dominates the polynomial 3n2 + 3n + 10, or, in other words
3n2 + 3n + 10 = O(n2)
Proof. Let c = 4 and show that there exists an n0 in which the equation holds. There are lots of c's to choose from. We're picking a value that larger than the coefficient of the highest degree term in the case of a polynomial.
3n2 + 3n + 10 <=4n2
0 <= n2-3n-10
0 <=
(n-5)(n+2)
so for c=4 we found n >= 5. That is, we found a c and an n0
Therefore, we can say O(n2) = 3n2 + 3n + 10
Here is an example where n may have to become larger in order for g(n)=2n to dominate the polynomial f(n)=8n4. We'll simply let c=1.
Find an n when 2n > 8n4
n
log2 2 > log28 + 4 log2n
n >
3 + 4 log2 n
(n-3)/4 > log2 n
so n > 16
We can say O(2n)=8n4
Here are some simplifications that can be applied
f = O(f) That is, a function can be its own order.
O(k*f) = O(f) That is, constants in the function can be ignored.
O(f+g) = Max[ O(f), O(g) ] That is, terms of lower degree can be ignored.
O(f*g) = O(f) * O(g) if a function is a product then its order is the product of the orders of the factors.
O(f/g) = O(f) / O(g) same for a function that is a quotient
O(f) > O(g) if and only if f dominates g
Be careful with the rare functions that have subtraction:
If f(n) = O(h(n)) and g(n) =
O(h(n)) then
f(n)-g(n) is not equal
to O(h(n)) - O(h(n)) = 0
2n2 - n2 = n2
Some other rules
The powers of n are ordered according to the exponent na = O(nb) iff a <= b
The order of log n is independent of the base taken loga n = O(logb n) for all a, b > 1
Logarithms grow more slowly than any power of n log n = O(na) for any a>0 but na != O(log n)
na = O(bn) for all a, b>1 but bn != O(na) for b>1
Let X and Y denote variables and a,b,c, and d denote constants
In order of dominance (top is most dominant). Realize that algorithms of higher dominance is bad!
|
Function |
condition |
common name |
|
f(N) = NN |
|
|
|
f(N) = N! |
|
N factorial |
|
f(n) = an |
dominates bn if a > b |
exponential |
|
f(N) = Nc |
dominates Nd if c>d |
polynomial (cubic, quadratic, etc.) |
|
f(n) = n log n |
|
n log n |
|
f(n) = cn |
|
linear |
|
f(n) = c logb n |
dominates c loga n if b<a |
logarithmic (regardless of base) |
|
f(n) = 1 |
|
constant |
We strive to get algorithm cost functions as small as possible (better performance). A "dominant" cost function is not better!
Below are pairs of functions, determine which dominates the other. Also determine each function's big-O notation to give it order. [These exercises are borrowed from Headington/Riley, Data Abstraction and Structures Using C++, DC Heath, 1994. ]
| T1 | T1-big O | T2 | T2-big O |
|---|---|---|---|
| T1(X)=2X2 | T2(X) = 2X | ||
| T1(X)=5X2 | T2(X) = 2X3 | ||
| T1(X)= 66X15+13X4+7500 | T2(X) = X16+2 | ||
| T1(X)=(X +3)(X +5) | T2(X) = 250X | ||
| T1(X)=(X + 7)(X + 9) | T2(X) = 4X3 | ||
| T1(X) = 32X2 | T2(X) = X log X | ||
| T1(X) = 32X2 | T2(X) = X2 log X | ||
| T1(X) = log2 X | T2(X) = log10 X | ||
| T1(X) = 84 | T2(X) = log10 X | ||
| T1(X) = (35X+1) / X | T2(X) = 8X | ||
| T1(X) = (X4+X2+17)/X3 | T2(X) = X2 | ||
| T1(X) = X9 | T2(X) = 9X | ||
| T1(X) = X! | T2(X) = 57X | ||
| T1(X) = XX | T2(X) = 25X | ||
| T1(X,Y) = 2X3Y | T2(X,Y) = 3XY | ||
| T1(X,Y) = X2+75 | T2(X,Y) = 7Y | ||
| T1(X,Y) = Y4+Y+74 | T2(X,Y) = X+18 |

> log2 := loglogplot (log[2](x), x=1..lim, color=green):
> lin := loglogplot (x, x=1..lim, color=gold):
> nlogn := loglogplot (x*log[2](x), x=1..32, color=blue):
> quad :=loglogplot (x^2, x=1..32, color=yellow):
> cube := loglogplot (x^3, x=1..lim, color=violet):
> quart := loglogplot (x^4, =1..lim, color=maroon):
> expon := loglogplot (2^x, x=1..lim, color=red):
> fact := loglogplot (factorial(x), x=1..10,
color=orange):
|
Linear |
logarithmic |
n*log2N |
quadratic |
cubic |
exponential |
exponential |
factorial |
|
1 |
0 |
0 |
1 |
1 |
2 |
3 |
1 |
|
2 |
1 |
2 |
4 |
8 |
4 |
9 |
2 |
|
4 |
2 |
8 |
16 |
64 |
16 |
81 |
24 |
|
8 |
3 |
24 |
64 |
512 |
256 |
6561 |
40320 |
|
16 |
4 |
64 |
256 |
4096 |
65,536 |
43,046,721 |
2.09E+013 |
|
32 |
5 |
160 |
1024 |
322,768 |
4,294,967,296 |
1.85E+15 |
2.63E+035 |
|
64 |
6 |
384 |
4096 |
262,144 |
1.84E+17 |
3.43E+30 |
1.27E+089 |
|
128 |
7 |
896 |
16,384 |
2,097,152 |
3.4E+38 |
1.18E+61 |
3.86E+215 |
|
256 |
8 |
2048 |
65,536 |
1,677,216 |
1.16E+77 ??? |
1.39E+122 |
Find other calculator |
Note1: The value here is approximately the number of machine instructions executed by a 1 gigaflop computer in 5000 years, or 5 years on a current supercomputer (teraflop computer)
The common thought is that if an algorithm runs at a particular speed and you need to increase the amount of data to be processed, just get a faster computer equivalent to the increase factor in the data size.
This table shows for different algorithm complexities, just how much more data could be processed on a machine 1000 times faster.
|
Algorithm time function |
Size factor increase of a data set on a computer 1000 times faster |
|
N (linear algorithms) |
1000.00 X |
|
N*log2N (e.g. best sorting) |
140.22 X |
|
N2 (e.g. simple sorting) |
31.62 X |
|
N3 |
10.00 X |
|
2N |
9.97 X |
|
3N |
6.29 X |
|
4N |
4.98 X |
Rank the following big-O measures from greatest to least (worst to best performance). [These exercises are borrowed from Headington/Riley, Data Abstraction and Structures Using C++, DC Heath, 1994. ]
| big-O | rank | common name |
|---|---|---|
| a. O(N) | ||
| b. O(N3) | ||
| c. O(4N) | ||
| d. O(log2 N) | ||
| e. O(log8 N) | ||
| f. O(N2) | ||
| g. O(1) | ||
| h. O(N log2 N) | ||
| i. O(N2 log2 N) |
Classify each of the cost functions by name
| Cost function | Big-O name |
|---|---|
| T(N) = 4N2 + N | |
| T(N) = 88N3+77N2+99 | |
| T(N) = 2N*N2 | |
| T(N) = 7500 | |
| T(N) = log N + 46N | |
| T(N) = 2N(N+1) | |
| T(N) = (N log N) / (N +2) | |
| T(N) = (3N4+4N3) / (5N2+N) | |
| T(N) = 17N/N2 |
One method to measure running time of an algorithm against its input size is to place put timing points in the program. This is TimeTester.java (from .../Programs/Chapter4)
import java.util.*;
public class TimeTester{
public static void main(String[] args){
long problemSize = 1000000;
int work = 0;
for (int i = 1; i <= 5; i++){ //run 5 times for average
Date d1 = new Date();
//change the blue code for other algorithms to measure
for (int j = 1; j <= problemSize; j++){
work++;
work--;
}
Date d2 = new Date();
long msec = d2.getTime() - d1.getTime();
System.out.println("Problem size = " + problemSize + "\n" +
"Time in msec = " + msec);
problemSize = problemSize * 2;
}
}
}
|
Control structure or form of loop |
Running Time |
|
assignment statement |
O(1) "fixed cost" |
|
expression |
O(1) "fixed cost" |
//the sequence Stmt1; Stmt2; |
Max [ O(S1), O(S2) ] |
if (condition){
Stmt1
} else {
Stmt2
}
|
Max [ O(cond), O(S1), O(S2) ] |
for (i=1; i<=n; i++){
Stmt1
}
//while loops too, generally
|
O(n * S1) |
|
algorithms without looping or recursion |
O(1) |
for (i=a; i<=b; i++){
//body requiring constant time
}
|
O(1) |
for (i=a; i<=n; i++){
//body requiring constant time
}
|
O(n) -- linear time |
for (i=a; i<=n; i++){
for(j=b; j<=n; j++){
//body requiring constant time
}
}
|
O(n2) -- quadratic time |
for (i=a; i<=n; i++){
//body requiring O(M) time
}
|
O(n * M) |
//----------------------------------------------------------------------
// This program prints all Pythagorean triples up through some user-
// specified value. All output triples are in increasing order.
//----------------------------------------------------------------------
import java.io.*;
public class pythag {
public static void main()
{
int maxLen;
int small;
int next;
int last;
maxLen = Console.readInteger("Max. side length: ");
small = 1;
while (small <= maxLen) {
// INV (prior to test): All triples in the range
// (1..small-1, 1..maxLen, 1..maxLen) have been output && small<=maxLen+1
next = small;
while (next <= maxLen) {
// INV (prior to test): All triples in the range
// (small, 1..next-1, 1..maxLen) have been output && next<=maxLen+1
last = next;
while (last <= maxLen) {
// INV (prior to test): All triples in the range
// (small, next, 1..last-1) have been output && last<=maxLen+1
if (last*last == small*small + next*next)
System.out.println(small+", "+next+", "+last+"\n");
last++;
}
next++;
}
small++;
}
System.exit(0);
}
Note that our big-Oh n is the program's maxLen--it's what the statement count depends on.
O(1) O(n) O(n2) O(n3)