Part 7 - Binary Heaps

Posted on Dec 4, 2022
(Last updated: May 26, 2024)

In this part we’ll cover priority queues, binary heaps, and lastly heaps as arrays. So let’s begin with priority queues!

Priority Queues

A priority queues is a collection of items which we can:

  • Add an item to the queue
  • Find the smallest item in the queue
  • Remove the smallest item in the queue

In this we’ll handle a priority queue as a min-priority queue, but there’s also a max-priority queue which gets/removes the maximum item in the queue as well. Both are useful though!

Applications

The priority queue data structure is a very natural one for example, we can use it for:

  • A printer queue where we prioritize the shortest document first!
  • The scheduler in a computer operating system
    • A lot of different parameters but for example, prioritize tasks which takes the minimum time.
  • Sorting a list
    • Start with an empty priority queue
    • Add each element of the input list to the queue
    • Remove the smallest element in the queue
    • You get all elements out in ascending order!

For this last application - say each operation takes $\mathcal{O}(log\ n)$, then this algorithm takes $\mathcal{O}(n\ log(n))$, which is good!

This algorithm would look something like:

sort(arr: list) -> list:
    prio_q = new min_prio_q()

    for x in arr:
        pq.add(x)

    result = []
    while not pq.empty():
        result.add(pq.remove_min())

    return result

There are loads more application, we’ll be looking into the graph searching and more AI applications.

Implementation of a Priority Queue

One’s first approach might be using a sorted/unsorted array, but the complexity becomes really bad quickly.

As we covered in the last part, a balanced BST seems like a good approach - and it is, however it will quickly become painful since we need to allow duplicates…

So the answer is, using a binary heap - it has the same complexity of a BST, but simpler to implement.

Binary (min-)heaps

A binary heap implements a priority queue as we said - using an ordinary binary tree, not a BST.

Let’s add some invariants to make it easy to find and remove the minimum element.

Invariant #1 - The Heap Property

The tree satisfies the so-called heap property if the value of each node is less than (or equal to) the value of its children:

With this the smallest node is always the root - therefore it takes $\mathcal{O}(1)$ to find this.

Invariant #2 - Completeness

The tree is complete if all levels are full expect the bottom level, which is filled from left to right.

This means only ‘NULL’ at the bottom level and no ‘NULL’ to the left of a node.

We can see that the first invariant makes it quick to find the minimum element and that the second invariant makes it so, the tree is balanced.

Let’s try to implement this!

Binary Heaps as Arrays

We can use an array as the underlying data structure for this - since we have invariant #2 or the ‘completeness’ we can easily index them correctly.

So, if we have node $i$, the child is at index $2i + 1$ and the right child at $2i + 2$. You can test this for yourself, but this just indexing left to right level by level.

We can also use that the parent of node i is at index $(i - 1) / 2$

Implementation

class min_prio_q[Item]:
    heap: list[Item] = []

    greater(i: int, j: int) -> boolean:
        return heap[i] > heap[j]

    get_min() -> Item:
        return heap[0]

    add(item: Item):
        // Add the item to the end of list (The end of the heap)
        heap.append(item)

        // Swap child with parent if the child is greater than the parent.
        // To achieve completeness that may have been lost
        while k > 0 and greater((k - 1) / 2, k):
            swap heap[(k - 1) / 2] and heap[k]
            k = (k - 1) / 2

    remove_min() -> Item:
        min: Item = heap[0]
        n: int = len(heap) - 1

        // Swap root with the end node to fix the invariant that may have been lost
        swap heap[0] and heap[n]

        k: int = 0
        // While child to the left is in tree
        while 2 * k + 1 <= n:
            j: int = 2 * k + 1
            // If right child < left child, set j to the right child instead
            if j < n and greater(j, j + 1):
                j += 1

            // If parent is larger than smallest child, swap and descend into j
            if not greater(k, j):
                break

            swap heap[k] and heap[j]
            k = j

        return min

So this is the implementation of a min-heap. If we instead want a max-heap we only change the sign of the greater function - that’s it!

We could also use the ‘heap’ mind-set to convert an array into a heap, using an in-place algorithm. Heaps are really powerful when it comes to sorting as well, heap sort for example.

Summary & Conclusion

So to summarize:

  • Priority Queues:
    • Min-priority queue: add(), get_min(), remove_min().
    • Max-priority queue: add(), get_max(), remove_max().
  • Binary Heaps:
    • Min-heaps implements min-priority queues
    • Max-heaps implements max-priority queues
    • Logarithmic performance!

This was it for this quite short part - in the next part(s) we’ll begin the final part of this course - graphs!