Counting operations
While much can be said about the efficiency of implementations and programming languages (e.g. Python vs Fortran…), it is interesting to be able to compare algorithms directly. One way of doing this is to look at the number of elementary operations each algorithm needs to run: in many cases, fewer operations will be faster. Of course, there are some caveats to this (e.g. what is an *elementary operation* exactly?), but it is a good starting point.
Suppose you must create a function that performs an operation on a vector of length \(n\) (e.g. sort a list of length \(n\)). For this, you write an algorithm (some code), which is effectively a function \(f\). When you execute \(f\), the computer must perform a certain number of elementary operations (add, multiply, assign…). Let \(T_f\) be this number of operations for \(f\). Typically, a fixed number of operations is performed regardless of the input (e.g. before a loop), plus a number that depends on the input length \(n\).
Let’s look at an example: suppose you want the sum of integers up to \(n\). You could do a loop, or you could remember your high-school formulas. Let’s look at it in Python:
```python
def sum1(n):
    # Loop over 1..n, accumulating the sum.
    tmp = 0
    for i in range(1, n + 1):
        tmp = tmp + i
    return tmp

def sum2(n):
    # Closed-form formula; integer division keeps the result an
    # integer, like sum1.
    return (n * (n + 1)) // 2
```
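A quick sanity check that the two definitions agree (assuming both functions above are in scope):

```python
# Both implementations must return the same value; 1 + 2 + … + 100 = 5050.
assert sum1(100) == sum2(100) == 5050
```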
`sum1` performs \(T_1(n) = 2 + 2n\) operations:

- 1 before the loop, to assign `tmp`
- 1 in the loop definition (computing `n + 1`)
- then 1 addition and 1 assignment, times \(n\) loops
`sum2`, however, performs only 1 addition, 1 multiplication and 1 division, for a total of 3 operations whatever the value of \(n\). This implies that \(T_2(n) = 3\); in fact, \(T_2\) does not depend on \(n\) at all. Unsurprisingly, the second algorithm is much more efficient than the first.
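You can check this difference empirically with a rough timing comparison. Here is a minimal sketch using the standard `timeit` module (the exact numbers will depend on your machine, but the gap should be dramatic):

```python
import timeit

# Time each function on a moderately large input; sum2 should be orders
# of magnitude faster, since it does a constant amount of work.
for f in ("sum1", "sum2"):
    t = timeit.timeit(f"{f}(100_000)", globals=globals(), number=100)
    print(f, t)
```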
Big-O notation
Math and computer-science people are often more concerned with orders of magnitude. When \(n\) is very small, the optimisation often doesn’t matter much on modern hardware (unless you are a diehard code optimiser). But when \(n\) becomes large, algorithms that don’t scale well can break down spectacularly.
Big-O notation evaluates the order of magnitude of the number of operations of a given algorithm with respect to the input size \(n\). As \(n\) becomes large, anything but the fastest growing part of the function \(T\) becomes gradually negligible. Moreover, two algorithms with the same order of magnitude are said to behave similarly, whatever the multiplicative coefficient in front.
Going back to our previous example: as \(n\) becomes large, \(T_1(n)\) becomes indistinguishable from \(2n\). What’s more, any scaling issue you would encounter with this algorithm would arise at the same order of magnitude even if you found a third algorithm with \(T_3(n) = n\). For this reason, algorithm 1 (the loop) is considered \(O(n)\).
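To make this concrete, a small numerical sketch shows how quickly the constant in \(T_1(n) = 2 + 2n\) becomes negligible relative to the dominant \(2n\) term:

```python
# Relative weight of the "+ 2" in T1(n) = 2 + 2n shrinks towards 0 as n grows.
for n in [10, 1_000, 1_000_000]:
    t1 = 2 + 2 * n
    print(n, t1, (t1 - 2 * n) / t1)
```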
These rules imply some simple comparison examples:
| Algo 1 | Algo 2 | Big-O 1 | Big-O 2 | Cheapest |
|---|---|---|---|---|
| \(2n + 1\) | \(n + 10\) | \(O(n)\) | \(O(n)\) | Same |
| \(n^2\) | \(1000n\) | \(O(n^2)\) | \(O(n)\) | N°2 |
| \(10n^2 + n + 5\) | \(n^2\) | \(O(n^2)\) | \(O(n^2)\) | Same |
| \(n \log n\) | \(n^2\) | \(O(n \log n)\) | \(O(n^2)\) | N°1 |
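A few example values make the last row concrete: even for moderate \(n\), \(n \log n\) grows far more slowly than \(n^2\) (the sketch below uses the natural logarithm; the base only changes a constant factor, which Big-O ignores):

```python
import math

# Compare the growth of n*log(n) and n**2 for increasing n.
for n in [10, 1_000, 1_000_000]:
    print(n, n * math.log(n), n ** 2)
```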