Part 10 - Complexity (2)

Posted on Dec 9, 2022
(Last updated: May 26, 2024)

We have actually covered everything in this course - in this part we’ll do some exercises!

Order of Growth of Functions

Let’s find out the complexity ($\mathcal{O}$) of: $$ T(n) = 5(3n^2 + 2n + 6)(4\ log_{10}(n) + 1) $$

Since we seek the growth rate - we can use our rules about complexity. We can remove all constants: $$ T(n) = (n^2 + n)(log_{10}(n)) $$

The next rule we can apply is, “the most dominating factor ‘wins’” as I like to call it. Therefore: $$ T(n) = (n^2)(log_{10}(n)) $$

Then we just multiply! $$ T(n) = n^2\ log_{10}(n) $$

And since we usually write $log$ when using Big-O notation: $$ T(n) = n^2\ log(n) $$

Now we can say that $T(n)$ has a $\mathcal{O}(n^2\ log(n))$ complexity! Since we are talking about $\mathcal{O}$, this only gives an upper bound. This means that $T(n)$ also has a complexity of $\mathcal{O}(n^3)$, for example.

So what we really mean is that $T(n)$ has a $\Theta(n^2\ log(n))$ complexity.
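If you want to convince yourself numerically, here is a small Python sketch (an illustration of mine, not part of the original exercise) that evaluates the ratio $T(n) / (n^2\ log(n))$ for growing $n$ - if the $\Theta(n^2\ log(n))$ claim is right, the ratio should level off at a constant (here it heads towards 60):

import math

# T(n) = 5(3n^2 + 2n + 6)(4 log10(n) + 1), as in the exercise above.
def T(n):
    return 5 * (3 * n**2 + 2 * n + 6) * (4 * math.log10(n) + 1)

# If T(n) is Theta(n^2 log n), this ratio should approach a constant.
for n in [10, 100, 1_000, 10_000, 100_000]:
    print(n, T(n) / (n**2 * math.log10(n)))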

Suppose an algorithm takes time t on an input of size n. How many times longer does it take on an input of size 10n if…

  • If the algorithm is $\Theta(n)$?
  • If the algorithm is $\Theta(n^2)$?
  • If the algorithm is $\Theta(n^3)$?
  • If the algorithm is $\Theta(n\ log(n))$?
  • If the algorithm is $\Theta(log(n))$?

This is quite easy! We just plug in our new $n$, and see how much $t$ grows - there’s a small sketch after the list that computes these ratios.

  • If the algorithm is $\Theta(n)$?
    • We get $10n \rightarrow 10t$!
  • If the algorithm is $\Theta(n^2)$?
    • We get $(10n)^2 = 100n^2 \rightarrow 100t$!
  • If the algorithm is $\Theta(n^3)$?
    • We get $(10n)^3 = 1000n^3 \rightarrow 1000t$!
  • If the algorithm is $\Theta(n\ log(n))$?
    • We get $10n\ \cdot log(10n)$!
    • This isn’t just 10 times more - it’s a little bit longer, since $log(10n) > log(n)$. This is the so-called “logarithmic linear” (linearithmic) growth.
  • If the algorithm is $\Theta(log(n))$?
    • We get $log(10n) = log(n) + log(10)$!
    • This means just a constant amount of extra time!
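To make these ratios concrete, here is a small Python sketch (again, just an illustration) that computes $f(10n)/f(n)$ for each growth rate at a fixed $n$:

import math

# Growth-rate functions from the list above.
growth_rates = {
    "n":       lambda n: n,
    "n^2":     lambda n: n**2,
    "n^3":     lambda n: n**3,
    "n log n": lambda n: n * math.log(n),
    "log n":   lambda n: math.log(n),
}

n = 1_000  # an arbitrary input size
for name, f in growth_rates.items():
    # How many times longer does an input of size 10n take?
    print(f"{name:>7}: {f(10 * n) / f(n):.2f}x")

For $n = 1000$ this prints roughly 10x, 100x, 1000x, 13.3x and 1.33x - matching the answers above.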

Complexity Analysis

Let’s analyze the following snippet of code and its complexity.

found = false
for x in list:
    for y in list:
        if x + y == 0:
            found = true

If we say that the length of list is $n$, the first (outer) loop has a complexity of $\mathcal{O}(n)$. The inner loop follows, and using our previous rule of ‘nested loops means multiplication’, our final program will have the complexity of $\mathcal{O}(n^2)$.

Now let’s look at this code snippet:

found = false
for i in 0 .. n - 1:
    for j in i .. n - 1:
        x = list[i]
        y = list[j]
        if x + y == 0:
            found = true

In this snippet the first loop iterates $n$ times - however, the second loop iterates over $(0, \dots, n - 1), (1, \dots, n - 1), \dots$, i.e. $n, n - 1, \dots, 1$ times.

This means that the number of times the second loop runs is between 1 and $n$ - using our definition of complexity. Let’s call the number of times our inner loop runs $m$; since $m \leq n$, $m$ has a complexity of $\mathcal{O}(n)$.

This finally means we have a total complexity of $\mathcal{O}(n^2)$.
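To be a bit more precise: the inner loop runs $n, n - 1, \dots, 1$ times, so the total number of iterations is $$ n + (n - 1) + \dots + 1 = \frac{n(n+1)}{2} $$ which is indeed $\mathcal{O}(n^2)$.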

Now let’s do the same, but for three numbers!

found = false
for i in 0 .. n - 1:
    for j in i .. n - 1:
        for k in j .. n - 1:
            x = list[i]
            y = list[j]
            z = list[k]

            if x + y + z == 0:
                found = true

Exactly the same logic applies as in the last question: each loop runs at most $n$ times, so each has a complexity of $\mathcal{O}(n)$.

Which gives the total complexity of $\mathcal{O}(n^3)$.

Now let’s look at a similar program:

pairs_list = []
for i in 0 .. n - 1:
    for j in i .. n - 1:
        x = list[i]
        y = list[j]
        pairs_list.add(x + y)

found = false
for xplusy in pairs_list:
    for zplusw in pairs_list:
        if xplusy + zplusw == 0:
            found = true

As we’ve stated above, the first part of the program will have a complexity of $\mathcal{O}(n^2)$. Note that the add() function takes (amortised) $\mathcal{O}(1)$ for dynamic arrays.

However, in the next block the new array has length $\mathcal{O}(n^2)$, since we have added every possible pair. So each of the two loops will now go through $\mathcal{O}(n^2)$ elements, which in total results in a complexity of $\mathcal{O}(n^4)$.

From our earlier rules, we ‘add’ blocks of code, so the complexity is $\mathcal{O}(n^2) + \mathcal{O}(n^4)$, which means the resulting complexity becomes $\mathcal{O}(n^4)$.

Data Structure Complexities

Let’s refresh our memory and state all the complexities for our data structures and their functions.

  • Dynamic Arrays:
    • Get/Set:
      • $\mathcal{O}(1)$
    • Add/Remove at end:
      • $\mathcal{O}(1)$
    • Add/remove elsewhere:
      • $\mathcal{O}(n)$
  • Stacks/queues:
    • Push/Pop:
      • $\mathcal{O}(1)$
    • Enqueue/Dequeue:
      • $\mathcal{O}(1)$
  • Binary Heaps:
    • Add/RemoveMin (or Max):
      • $\mathcal{O}(log(n))$
    • getMin (or Max):
      • $\mathcal{O}(1)$
  • BSTs:
    • Add/Remove/Search (worst case, e.g. when elements are added in sorted order and the tree becomes a ‘chain’):
      • $\mathcal{O}(n)$
    • Otherwise:
      • $\mathcal{O}(log(n))$
  • AVL Trees (balanced BSTs):
    • Add/Remove/Search (always!):
      • $\mathcal{O}(log(n))$
  • Hash Tables:
    • Add/Remove/Search (Given that the hash function is ‘good’):
      • $\mathcal{O}(1)$
  • General Tree:
    • If you walk down a tree, you’ll visit:
      • $\mathcal{O}(height)$ nodes
    • If you explore every node, you’ll visit:
      • $\mathcal{O}(n)$ nodes
    • A tree has $\mathcal{O}(n)$ height in the worst case.
    • A balanced tree always has $\mathcal{O}(log(n))$ height.

Analyzing more complexities

Let’s take a look at a program which uses different data structures now:

pairs_list = []
for i in 0 .. n - 1:
    for j in i .. n - 1:
        x = list[i]
        y = list[j]
        pairs_list.add(x + y)

merge_sort(pairs_list)

found = false
for xplusy in pairs_list:
    if binary_search(pairs_list, -xplusy):
        found = true

We’ve seen the first part before - we know it’s $\mathcal{O}(n^2)$. But now we also see a merge_sort() - sorting a list of length $m$ costs $\mathcal{O}(m\ log(m))$. As we stated before, the list after the first block has a length of $\mathcal{O}(n^2)$, which means merge_sort() will have a complexity of $\mathcal{O}(n^2\ log(n^2))$.

Sorting only rearranges the list, so its length doesn’t change.

In the next block, the for loop runs $\mathcal{O}(n^2)$ times, and each binary search has a complexity of $\mathcal{O}(log(n^2))$.

So this block will in total have a complexity of $\mathcal{O}(n^2\ log(n^2))$.

So if we add these blocks together and apply our rules we get a total complexity of $\mathcal{O}(n^2\ log(n^2))$. We can apply some log rules to this: $$ \mathcal{O}(n^2\ log(n^2)) = \mathcal{O}(n^2 \cdot 2\ log(n)) = \mathcal{O}(n^2\ log(n)) $$

So finally our answer is $\mathcal{O}(n^2\ log(n))$.
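Here is a rough, runnable Python sketch of the same approach - take it as one possible translation of the pseudocode, using the standard bisect module for the binary search and Python’s built-in sort in place of merge_sort():

from bisect import bisect_left

def two_pair_sums_cancel(lst):
    n = len(lst)

    # First block: build all pair sums - O(n^2) elements.
    pair_sums = []
    for i in range(n):
        for j in range(i, n):
            pair_sums.append(lst[i] + lst[j])

    # Sort the ~n^2 sums: O(n^2 log(n^2)) = O(n^2 log n).
    pair_sums.sort()

    # Second block: for each sum, binary search for its negation.
    # O(n^2) searches, each O(log(n^2)) -> O(n^2 log n) in total.
    for s in pair_sums:
        idx = bisect_left(pair_sums, -s)
        if idx < len(pair_sums) and pair_sums[idx] == -s:
            return True
    return False

print(two_pair_sums_cancel([3, -1, 4, -7]))  # True, e.g. (3 + -1) + (-1 + -1) == 0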

Let’s now look at a case using a tree:

pairs_set = empty AVL_tree
for i in 0 .. n - 1:
    for j in i .. n - 1:
        x = list[i]
        y = list[j]

        if pairs_set.contains(-(x+y)):
            return true

        pairs_set.add(x+y)

As we’ve seen before, the loops are $\mathcal{O}(n^2)$ - now the interesting part is the contains() call, which checks if an element is present. Checking whether an element is present in an AVL tree (which is always balanced) has a complexity of $\mathcal{O}(log(n))$. The add() for the AVL tree is also $\mathcal{O}(log(n))$, and the rest of the operations are constant, so we can ignore them.

So the final complexity is $\mathcal{O}(n^2\ log(n))$. Note that the set can grow to $\mathcal{O}(n^2)$ elements, but $log(n^2) = 2\ log(n)$ is still $\mathcal{O}(log(n))$, so this doesn’t change the answer :).

Now let’s look at a hash table:

pairs_set = empty Hash_table
for i in 0 .. n - 1:
    for j in i .. n - 1:
        x = list[i]
        y = list[j]

        if pairs_set.contains(-(x + y)):
            return true

        pairs_set.add(x + y)

As per usual, the loops together create a complexity of $\mathcal{O}(n^2)$. Now, a search in a hash table is $\mathcal{O}(1)$, if our hash function is ‘good’.

The rest of the operations are constant. Therefore, the overall complexity is $\mathcal{O}(n^2)$.
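Here is a small Python sketch of this version (one possible translation of the pseudocode), where the built-in set plays the role of Hash_table:

def pair_sums_cancel_hashed(lst):
    # pairs_seen plays the role of the pseudocode's Hash_table:
    # add and membership tests are O(1) on average for a Python set.
    pairs_seen = set()
    n = len(lst)
    for i in range(n):
        for j in range(i, n):
            s = lst[i] + lst[j]
            # Have we already seen a pair sum that cancels this one?
            if -s in pairs_seen:
                return True
            pairs_seen.add(s)
    return False

# O(n^2) pairs, each handled in O(1) -> O(n^2) overall.
print(pair_sums_cancel_hashed([3, -1, 4, -7]))  # True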

Now let’s look at a BST example:

pairs_set = empty BST
for i in 0 .. n - 1:
    for j in i .. n - 1:
        x = list[i]
        y = list[j]

        if pairs_set.contains(-(x + y)):
            return true

        pairs_set.add(x + y)

The usual $\mathcal{O}(n^2)$ loops :). Now the contains() is the interesting part. Since this is a plain BST, it can become unbalanced, so contains() costs $\mathcal{O}(m)$ in the worst case, where $m$ is the number of elements in the set - and since the set can grow to $\mathcal{O}(n^2)$ elements, that is $\mathcal{O}(n^2)$ per call.

The same applies to add(). The rest of the operations are constant, so the total worst-case complexity is $\mathcal{O}(n^2 \cdot n^2) = \mathcal{O}(n^4)$.
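To see why this version can degrade, here is a minimal unbalanced BST sketch in Python (my own illustration, not course code) - note how adding values in sorted order turns the tree into a chain, so add() and contains() both walk $\mathcal{O}(m)$ nodes:

class Node:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None

class BST:
    def __init__(self):
        self.root = None

    def add(self, value):
        # Walk down until we find a free spot - O(height) steps.
        if self.root is None:
            self.root = Node(value)
            return
        node = self.root
        while True:
            if value < node.value:
                if node.left is None:
                    node.left = Node(value)
                    return
                node = node.left
            elif value > node.value:
                if node.right is None:
                    node.right = Node(value)
                    return
                node = node.right
            else:
                return  # already present (set semantics)

    def contains(self, value):
        # Also O(height): O(log m) if balanced, O(m) if the tree is a chain.
        node = self.root
        while node is not None:
            if value < node.value:
                node = node.left
            elif value > node.value:
                node = node.right
            else:
                return True
        return False

# Sorted insertions produce a degenerate, list-like tree - the worst case.
tree = BST()
for v in range(10):
    tree.add(v)
print(tree.contains(7))  # True, but only after walking down 8 nodes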

Different kinds of complexities

We also need to consider the different cases:

  • Best-case
    • This is not useful.
  • Worst-case
    • This is the most useful.
  • Average-case
    • Can be useful sometimes, mostly gives us an ‘indicator’.

Let’s now talk about expected and amortised complexity.

Expected Complexity

This is useful for randomized algorithms! It’s the average over all possible random choices for a particular input.

For example, if we choose a random pivot, we turn quicksort from average-case $\mathcal{O}(n\ log(n))$ to expected $\mathcal{O}(n\ log(n))$.
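As a tiny illustration (a sketch of mine, not from the course), here is a quicksort with a randomly chosen pivot:

import random

def quicksort(lst):
    # Expected O(n log n): because the pivot is random, there is no
    # single 'bad' input - the O(n^2) case only occurs if the random
    # choices are repeatedly unlucky, which is very improbable.
    if len(lst) <= 1:
        return lst
    pivot = random.choice(lst)
    smaller = [x for x in lst if x < pivot]
    equal   = [x for x in lst if x == pivot]
    larger  = [x for x in lst if x > pivot]
    return quicksort(smaller) + equal + quicksort(larger)

print(quicksort([5, 2, 9, 1, 5, 6]))  # [1, 2, 5, 5, 6, 9]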

Amortised Complexity

Amortised complexity is the average over any sequence of operations - this is super useful!

For example, we use this to make dynamic arrays have an amortised complexity of $\mathcal{O}(1)$.
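As a rough sketch of why that works (simplified, my own code): the array doubles its capacity when it fills up, so most add() calls are cheap and the occasional expensive copy is ‘paid for’ by all the cheap ones before it.

class DynamicArray:
    # A simplified dynamic array where add() is amortised O(1).

    def __init__(self):
        self.capacity = 1
        self.size = 0
        self.items = [None] * self.capacity

    def add(self, value):
        if self.size == self.capacity:
            # Expensive step: copy everything into an array twice as big.
            # This costs O(n), but it only happens after ~n cheap adds,
            # so the average (amortised) cost per add stays O(1).
            self.capacity *= 2
            new_items = [None] * self.capacity
            for i in range(self.size):
                new_items[i] = self.items[i]
            self.items = new_items
        self.items[self.size] = value
        self.size += 1

arr = DynamicArray()
for i in range(10):
    arr.add(i)
print(arr.size, arr.capacity)  # 10 16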

However, when we’re calculating the total runtime of a program, it’s safe to forget about this amortised bit and just treat each operation as costing $\mathcal{O}(1)$.

Conclusion

This was it for this part - and the final part in this series. I really enjoyed this DSA course, super fun :).