Knowledge/Algorithms/Introduction
24 Chapters
Big-O Throughout
Living Document
Free License
Engineering Publication · Syed Omar Ibn Feroj← Back to portfolio
Σ
Engineering Notes · Open Knowledge Repository

Algorithms — Engineering
Notes for Working Code

A working reference for the algorithms that show up in real systems — complexity first, then sorting, searching, graphs, dynamic programming and the design paradigms underneath them. Every chapter pairs the idea with a clean implementation and the cost you actually pay.

24
Chapters
Big-O
Throughout
Living
Document
Free
License
Ch. 01
Complexity & Big-O
Counting the work an algorithm does as a function of its input — the only fair way to compare two solutions.
1.1 — The asymptotic notations

Big-O describes an upper bound on growth: an algorithm is O(f(n)) if, past some input size, its running time never exceeds a constant multiple of f(n). We drop constants and lower-order terms because they stop mattering as n grows.

NotationBoundReads as
O(f)upper"grows no faster than f"
Ω(f)lower"grows at least as fast as f"
Θ(f)tight"grows exactly like f"
1.2 — The growth ladder

From cheapest to most expensive, the classes you meet daily:

Growth
# cheapest → most expensive
O(1)        # constant   — hash lookup
O(log n)    # logarithmic — binary search
O(n)        # linear     — scan an array
O(n log n)  # linearithmic — good sorts
O(n^2)      # quadratic  — naive nested loops
O(2^n)      # exponential — naive subsets
O(n!)       # factorial  — naive permutations
Rule of thumb at ~10^8 operations/second: O(n^2) is fine to ~10,000 elements, O(n log n) to ~10 million, O(n) to ~100 million. If n is large and you wrote a nested loop, that's usually the thing to fix first.
1.3 — Time vs space; worst vs amortised

Space complexity counts extra memory beyond the input. Amortised cost averages an expensive-but-rare operation over the cheap common ones — a dynamic array's push is O(1) amortised even though the occasional resize is O(n).

Ch. 02
Recursion
A function defined in terms of itself, plus the recurrence math that tells you what it costs.
2.1 — Base case and recursive case

Every correct recursion has a base case that stops it and a recursive case that moves strictly toward the base. Miss either and you get a stack overflow.

Python
def factorial(n):
    if n <= 1:          # base case
        return 1
    return n * factorial(n - 1)  # shrinks toward base
2.2 — Solving recurrences (Master Theorem)

For divide-and-conquer recurrences of the form T(n) = a·T(n/b) + O(n^d):

Master Theorem
if  d > log_b(a):   T(n) = O(n^d)
if  d = log_b(a):   T(n) = O(n^d log n)
if  d < log_b(a):   T(n) = O(n^(log_b a))

# merge sort: a=2, b=2, d=1 → 1 = log2(2) → O(n log n)
2.3 — Recursion vs iteration

Any recursion can be rewritten with an explicit stack. Recursion is clearer for tree/graph shapes; iteration avoids call-stack limits. Tail calls aren't optimised in most mainstream runtimes — assume each call costs a frame.

Ch. 03
Elementary Sorts
Bubble, selection and insertion — O(n^2), but worth knowing why insertion sort still earns its place.
3.1 — The three quadratic sorts
Python
def selection_sort(a):
    for i in range(len(a)):
        m = i
        for j in range(i+1, len(a)):
            if a[j] < a[m]: m = j
        a[i], a[m] = a[m], a[i]   # swap min into place

def insertion_sort(a):
    for i in range(1, len(a)):
        key, j = a[i], i - 1
        while j >= 0 and a[j] > key:
            a[j+1] = a[j]; j -= 1
        a[j+1] = key
Insertion sort is O(n) on already-mostly-sorted data and has tiny constants, which is why production sorts (Timsort, introsort) fall back to it for small or nearly-sorted subarrays.
Ch. 04
Merge Sort
Divide in half, sort each half, merge. Stable, predictable O(n log n), O(n) extra space.
4.1 — Implementation
Python
def merge_sort(a):
    if len(a) <= 1: return a
    mid = len(a) // 2
    L = merge_sort(a[:mid])
    R = merge_sort(a[mid:])
    return merge(L, R)

def merge(L, R):
    out, i, j = [], 0, 0
    while i < len(L) and j < len(R):
        if L[i] <= R[j]: out.append(L[i]); i += 1
        else: out.append(R[j]); j += 1
    return out + L[i:] + R[j:]
Merge sort is stable (equal elements keep their input order) and its O(n log n) holds in the worst case — the safe default when worst-case guarantees matter (external sorts, linked lists, stability-sensitive data).
Ch. 05
Quicksort
Partition around a pivot, recurse on each side. O(n log n) average, O(n^2) worst, in-place.
5.1 — Lomuto partition
Python
def quicksort(a, lo, hi):
    if lo >= hi: return
    p = partition(a, lo, hi)
    quicksort(a, lo, p-1)
    quicksort(a, p+1, hi)

def partition(a, lo, hi):
    pivot, i = a[hi], lo
    for j in range(lo, hi):
        if a[j] < pivot:
            a[i], a[j] = a[j], a[i]; i += 1
    a[i], a[hi] = a[hi], a[i]
    return i
The O(n^2) worst case hits on already-sorted input with a naive last-element pivot. Mitigations: randomised pivot, median-of-three, or switch to heapsort once recursion gets too deep (this hybrid is introsort, what most standard libraries ship).
Ch. 06
Heapsort
Build a max-heap, repeatedly extract the max. O(n log n) worst case, in-place, not stable.
6.1 — Sift-down and sort
Python
def heapify(a, n, i):
    big = i; l, r = 2*i+1, 2*i+2
    if l < n and a[l] > a[big]: big = l
    if r < n and a[r] > a[big]: big = r
    if big != i:
        a[i], a[big] = a[big], a[i]
        heapify(a, n, big)

def heapsort(a):
    n = len(a)
    for i in range(n//2-1, -1, -1): heapify(a, n, i)
    for i in range(n-1, 0, -1):
        a[0], a[i] = a[i], a[0]
        heapify(a, i, 0)

Building the heap is O(n) (not O(n log n) — a tighter analysis); the n extractions are O(n log n) total. Use heapsort when you need a worst-case guarantee with no extra memory.

Ch. 07
Linear-Time Sorts
Counting, radix and bucket sort beat the O(n log n) comparison lower bound — by not comparing.
7.1 — Counting sort

When keys are small integers in a known range [0, k], tally occurrences and rebuild. O(n + k), stable.

Python
def counting_sort(a, k):
    count = [0] * (k+1)
    for x in a: count[x] += 1
    out = []
    for v in range(k+1):
        out += [v] * count[v]
    return out
The O(n log n) lower bound only applies to comparison sorts. Counting/radix/bucket sidestep it by exploiting structure in the keys — but they're useless for arbitrary comparables (custom objects, strings with locale rules).
7.2 — Radix sort

Sort by least-significant digit first, using a stable counting sort per digit. O(d·(n + b)) for d digits in base b.

Ch. 08
Binary Search
Halve the search space each step. O(log n) — but the off-by-one bugs are legendary.
8.1 — The canonical form
Python
def binary_search(a, target):
    lo, hi = 0, len(a) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if a[mid] == target: return mid
        if a[mid] < target: lo = mid + 1
        else: hi = mid - 1
    return -1
8.2 — Binary search on the answer

The real power: when a predicate is monotonic (false…false, true…true), binary-search the boundary. "Smallest capacity that ships in D days", "minimum max-page split" — all the same template. The array doesn't have to exist; the answer space just has to be sorted by feasibility.

Two perennial bugs: integer overflow in (lo+hi)/2 (use lo + (hi-lo)//2 in fixed-width languages), and infinite loops from the wrong lo/hi update when searching for a boundary rather than an exact value.
Ch. 09
Two Pointers & Sliding Window
Two indices moving over a sequence — turns many O(n^2) scans into O(n).
9.1 — Opposite-end two pointers

On a sorted array, converge from both ends. Classic: does a pair sum to a target?

Python
def has_pair(a, target):  # a is sorted
    i, j = 0, len(a) - 1
    while i < j:
        s = a[i] + a[j]
        if s == target: return True
        if s < target: i += 1
        else: j -= 1
    return False
9.2 — Sliding window

Grow the window's right edge; when a constraint breaks, shrink from the left. Longest substring without repeats, minimum window covering a set, max sum of size-k — all this shape. Each element enters and leaves the window once → O(n).

Ch. 10
Divide & Conquer
Split into independent subproblems, solve recursively, combine. Merge sort and quicksort are instances; here's the general lens.
10.1 — The shape

Three steps: divide the problem into subproblems of the same kind, conquer them recursively, combine their answers. The combine step's cost drives the recurrence (see the Master Theorem in Ch. 02).

10.2 — Worked example: maximum subarray (crossing sum)
Python
def max_sub(a, lo, hi):
    if lo == hi: return a[lo]
    mid = (lo + hi) // 2
    left  = max_sub(a, lo, mid)
    right = max_sub(a, mid+1, hi)
    cross = cross_sum(a, lo, mid, hi)
    return max(left, right, cross)

O(n log n) here; Kadane's DP (Ch. 12) does the same in O(n). Divide-and-conquer isn't always optimal — but it's often the first correct thing you can reason about.

Ch. 11
Greedy
Make the locally optimal choice and never look back. Fast when it works — and provably wrong when it doesn't.
11.1 — When greedy is correct

Greedy is valid only when the problem has the greedy-choice property (a global optimum contains a locally optimal first step) and optimal substructure. Activity selection, Huffman coding, Kruskal's and Prim's MSTs, and Dijkstra are correct greedies. The 0/1 knapsack is the classic where greedy fails and you need DP.

Python
# Interval scheduling: pick the most activities
def max_activities(intervals):
    intervals.sort(key=lambda x: x[1])  # by end time
    count, end = 0, -float('inf')
    for s, e in intervals:
        if s >= end: count += 1; end = e
    return count
Never assume greedy is correct without proof (exchange argument or matroid theory). A greedy that passes your test cases but is subtly wrong is one of the hardest bugs to find later.
Ch. 12
Dynamic Programming
Overlapping subproblems + optimal substructure → cache the subproblem answers. The single highest-leverage technique here.
12.1 — Memoisation (top-down) vs tabulation (bottom-up)
Python
# Top-down: recursion + cache
from functools import lru_cache
@lru_cache(None)
def fib(n):
    return n if n < 2 else fib(n-1) + fib(n-2)

# Bottom-up: fill a table, O(1) space here
def fib_bu(n):
    a, b = 0, 1
    for _ in range(n): a, b = b, a + b
    return a
12.2 — 0/1 Knapsack
Python
def knapsack(wt, val, W):
    dp = [0] * (W+1)
    for i in range(len(wt)):
        for c in range(W, wt[i]-1, -1):  # reverse!
            dp[c] = max(dp[c], dp[c-wt[i]] + val[i])
    return dp[W]
The DP recipe: (1) define the state precisely, (2) write the transition (recurrence), (3) identify base cases, (4) decide iteration order so dependencies are computed first, (5) optionally compress dimensions for space. Step 1 is where most attempts fail — if you can't name the state, you don't have the recurrence yet.
Ch. 13
Backtracking
Build a candidate incrementally; abandon a branch the moment it can't lead to a solution.
13.1 — Permutations as a template
Python
def permute(nums):
    res, n = [], len(nums)
    def bt(path, used):
        if len(path) == n:
            res.append(path[:]); return
        for i in range(n):
            if used[i]: continue
            used[i] = True; path.append(nums[i])
            bt(path, used)
            path.pop(); used[i] = False   # undo
    bt([], [False]*n)
    return res

The whole technique is "choose → recurse → un-choose". Add a pruning test before recursing (constraint propagation, bounds) and exponential search spaces become tractable: N-Queens, Sudoku, subset-sum.

Ch. 14
Hashing
Average O(1) insert/lookup/delete by mapping keys to buckets. The default associative structure.
14.1 — Collisions and load factor

Two keys hashing to the same bucket is a collision. Two resolution families: chaining (each bucket holds a list) and open addressing (probe to the next free slot). Keep the load factor (n/buckets) under ~0.75 and resize when it exceeds — amortised O(1) holds, worst case degrades to O(n) on adversarial keys.

"O(1)" is amortised and assumes a good hash. Adversarial input can force collisions and turn a hash map into a linked list — the basis of hash-flooding DoS attacks. Production maps randomise the hash seed per process to defend against this.
14.2 — What hashing buys you

Most "is this a duplicate / have I seen this / count occurrences / group by" problems collapse to a hash set or map and one linear pass — turning the obvious O(n^2) into O(n).

Ch. 15
Heaps & Priority Queues
A binary heap gives O(log n) insert and extract-min, O(1) peek. The engine behind Dijkstra, Prim and top-k.
15.1 — Array-backed binary heap
Python
import heapq
pq = []
heapq.heappush(pq, (priority, item))
priority, item = heapq.heappop(pq)   # min by default

# Top-k largest in O(n log k):
def top_k(a, k):
    h = a[:k]; heapq.heapify(h)
    for x in a[k:]:
        if x > h[0]: heapq.heapreplace(h, x)
    return h

Parent of index i is (i-1)//2; children are 2i+1 and 2i+2. That index arithmetic is why a heap needs no pointers — it lives in a flat array.

Ch. 16
Binary Search Trees
Ordered structure with O(log n) operations — when balanced. The 'when balanced' is the whole story.
16.1 — BST property and the balance problem

For every node, all keys in the left subtree are smaller and all in the right are larger. Search/insert/delete are O(h) where h is height. A balanced tree has h = O(log n); insert sorted data into a naive BST and h = O(n) — it degenerates into a linked list.

Self-balancing variants fix the height guarantee: red-black trees (what most standard-library ordered maps use), AVL trees (stricter balance, faster lookups, more rotations), and B-trees (the database/filesystem workhorse, optimised for block storage).
16.2 — In-order traversal yields sorted order
Python
def inorder(node, out):
    if not node: return
    inorder(node.left, out)
    out.append(node.val)
    inorder(node.right, out)
Ch. 17
Union-Find
Disjoint Set Union — near-O(1) 'are these in the same group?' and 'merge these groups'.
17.1 — Path compression + union by rank
Python
parent = list(range(n)); rank = [0]*n

def find(x):
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path compression
        x = parent[x]
    return x

def union(a, b):
    ra, rb = find(a), find(b)
    if ra == rb: return
    if rank[ra] < rank[rb]: ra, rb = rb, ra
    parent[rb] = ra
    if rank[ra] == rank[rb]: rank[ra] += 1

With both optimisations, m operations run in O(m·α(n)) where α is the inverse Ackermann function — effectively constant. Powers Kruskal's MST, connected components, and cycle detection in undirected graphs.

Ch. 18
Graph Representations
Adjacency list vs matrix — the choice that silently sets every later algorithm's complexity.
18.1 — The trade-off
AspectAdjacency listAdjacency matrix
SpaceO(V + E)O(V²)
Edge exists?O(degree)O(1)
Iterate neighboursO(degree)O(V)
Best forsparse graphsdense / fast edge queries

Most real graphs are sparse (E ≪ V²), so adjacency list is the default. Reach for a matrix only when the graph is dense or you need constant-time edge existence checks (e.g. Floyd-Warshall).

Ch. 19
BFS & DFS
The two traversals every graph algorithm is built on. BFS = queue = shortest path in unweighted graphs. DFS = stack/recursion = structure.
19.1 — Both in one shape
Python
from collections import deque
def bfs(g, start):
    seen = {start}; q = deque([start])
    while q:
        u = q.popleft()
        for v in g[u]:
            if v not in seen:
                seen.add(v); q.append(v)

def dfs(g, u, seen):
    seen.add(u)
    for v in g[u]:
        if v not in seen: dfs(g, v, seen)

Both are O(V + E). BFS explores level by level — its tree gives shortest paths in unweighted graphs. DFS goes deep first — it reveals structure: cycles, bridges, articulation points, topological order, strongly connected components.

Ch. 20
Shortest Paths
Dijkstra, Bellman-Ford, Floyd-Warshall — pick by edge weights and how many sources you need.
20.1 — Dijkstra (non-negative weights, single source)
Python
import heapq
def dijkstra(g, src):
    dist = {src: 0}; pq = [(0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float('inf')): continue
        for v, w in g[u]:
            nd = d + w
            if nd < dist.get(v, float('inf')):
                dist[v] = nd; heapq.heappush(pq, (nd, v))
    return dist

O((V + E) log V) with a binary heap.

20.2 — Choosing the right one
AlgorithmUse whenCost
BFSunweighted, single sourceO(V+E)
Dijkstranon-negative weights, single sourceO((V+E) log V)
Bellman-Fordnegative weights / detect negative cycleO(V·E)
Floyd-Warshallall pairs, small dense graphO(V³)
Dijkstra is wrong on negative edges — it commits to a node's distance permanently when popped. If any edge can be negative, use Bellman-Ford (which also detects negative cycles).
Ch. 21
Minimum Spanning Tree
Cheapest set of edges connecting all vertices. Kruskal (sort + union-find) or Prim (grow + heap).
21.1 — Kruskal
Python
def kruskal(n, edges):  # edges: (w, u, v)
    edges.sort()
    mst, total = [], 0
    for w, u, v in edges:
        if find(u) != find(v):
            union(u, v)
            mst.append((u, v)); total += w
    return total, mst

Kruskal is O(E log E) — sort edges, add the cheapest that doesn't form a cycle (union-find detects the cycle). Prim grows one tree from a seed using a priority queue, O((V+E) log V); it wins on dense graphs.

Ch. 22
Topological Sort
Linear ordering of a DAG so every edge points forward. Build orders, task scheduling, dependency resolution.
22.1 — Kahn's algorithm (BFS on in-degrees)
Python
from collections import deque
def toposort(g, n):
    indeg = [0]*n
    for u in range(n):
        for v in g[u]: indeg[v] += 1
    q = deque([i for i in range(n) if indeg[i] == 0])
    order = []
    while q:
        u = q.popleft(); order.append(u)
        for v in g[u]:
            indeg[v] -= 1
            if indeg[v] == 0: q.append(v)
    return order if len(order) == n else None  # None ⇒ cycle

If the output is shorter than the vertex count, the graph had a cycle and no valid ordering exists — a free cycle detector.

Ch. 23
String Matching
Find a pattern in text faster than the naive O(n·m). KMP and Rabin-Karp are the two you'll actually use.
23.1 — KMP: never re-examine text characters

KMP precomputes a failure table — for each pattern prefix, the length of the longest proper prefix that's also a suffix. On a mismatch it jumps the pattern forward using that table instead of restarting. O(n + m), no backtracking over the text.

23.2 — Rabin-Karp: hash the window
Python
def rabin_karp(text, pat):
    n, m = len(text), len(pat)
    if m > n: return -1
    B, M = 256, 1_000_000_007
    hp = ht = 0; h = 1
    for _ in range(m-1): h = (h*B) % M
    for i in range(m):
        hp = (hp*B + ord(pat[i])) % M
        ht = (ht*B + ord(text[i])) % M
    for i in range(n-m+1):
        if hp == ht and text[i:i+m] == pat: return i
        if i < n-m:
            ht = ((ht - ord(text[i])*h)*B + ord(text[i+m])) % M
    return -1

The rolling hash makes each window O(1) to update → O(n + m) average. Rabin-Karp shines for multi-pattern search (hash all patterns once).

Ch. 24
Bits & Number Theory
Bit tricks and the handful of number-theoretic algorithms that recur in real problems.
24.1 — Bit manipulation
Python
x & (x - 1)        # clears the lowest set bit
x & -x             # isolates the lowest set bit
x & (x - 1) == 0   # is x a power of two? (x > 0)
x ^ y              # XOR — find the single non-paired value
(x >> i) & 1        # read bit i
x | (1 << i)        # set bit i
24.2 — Euclid's GCD, sieve, modular exponentiation
Python
def gcd(a, b):
    while b: a, b = b, a % b
    return a

def sieve(n):                # primes up to n, O(n log log n)
    is_p = [True]*(n+1); is_p[0:2] = [False, False]
    for i in range(2, int(n**.5)+1):
        if is_p[i]:
            for j in range(i*i, n+1, i): is_p[j] = False
    return [i for i, p in enumerate(is_p) if p]

pow(base, exp, mod)        # fast modular exponentiation, O(log exp)
REF
Complexity Cheatsheet
The numbers worth memorising. Pull this up before you optimise anything.
Sorting
AlgorithmBestAverageWorstSpaceStable
InsertionO(n)O(n²)O(n²)O(1)yes
MergeO(n log n)O(n log n)O(n log n)O(n)yes
QuickO(n log n)O(n log n)O(n²)O(log n)no
HeapO(n log n)O(n log n)O(n log n)O(1)no
CountingO(n+k)O(n+k)O(n+k)O(k)yes
Data structures (average)
StructureAccessSearchInsertDelete
ArrayO(1)O(n)O(n)O(n)
Hash tableO(1)O(1)O(1)
Balanced BSTO(log n)O(log n)O(log n)O(log n)
Binary heapO(n)O(log n)O(log n)
Graphs (V vertices, E edges)
AlgorithmComplexityNote
BFS / DFSO(V + E)traversal, connectivity
Dijkstra (heap)O((V+E) log V)non-negative weights
Bellman-FordO(V·E)handles negative edges
Floyd-WarshallO(V³)all pairs
Kruskal / PrimO(E log E)minimum spanning tree
Topological sortO(V + E)DAG only