Linear Structures
Linear structures store elements in a sequence. The trade-offs between them come down to where you spend time: at the ends, in the middle, or on memory locality.

| Structure | Access | Search | Insert (head/tail) | Insert (middle) | Delete | Space |
|---|---|---|---|---|---|---|
| Array | O(1) | O(n) | O(n) (shift) | O(n) | O(n) | O(n) |
| Linked list (singly) | O(n) | O(n) | O(1) (head) / O(n) (tail without tail ptr) | O(1) with ptr | O(1) with ptr | O(n) + pointer overhead |
| Doubly linked list | O(n) | O(n) | O(1) (head and tail) | O(1) with ptr | O(1) with ptr | O(n) + 2× pointer overhead |
| Stack | O(1) top | O(n) | O(1) push | — | O(1) pop | O(n) |
| Queue | O(1) front | O(n) | O(1) enqueue | — | O(1) dequeue | O(n) |
| Deque | O(1) ends | O(n) | O(1) both ends | — | O(1) both ends | O(n) |
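As a concrete illustration of the end-vs-middle trade-off, Python's collections.deque gives the O(1) double-ended operations from the table, whereas a plain list shifts every element for head operations:

```python
from collections import deque

# deque: O(1) at both ends. A Python list's insert(0, x) / pop(0)
# shift all elements, which is the O(n) head cost in the table above.
d = deque([2, 3])
d.appendleft(1)      # O(1) push at the head
d.append(4)          # O(1) push at the tail
front = d.popleft()  # O(1) pop from the head -> 1
back = d.pop()       # O(1) pop from the tail -> 4
```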
Trees
Trees are hierarchical structures where each node has zero or more children. The most important variants differ in their balance guarantees and therefore their worst-case performance.

Binary Tree
A tree where each node has at most two children (left and right). Without balance constraints, a binary tree degenerates to a linked list in the worst case (O(n) search).
- Full binary tree: every node has 0 or 2 children.
- Complete binary tree: all levels are fully filled except possibly the last, which is filled left to right.
- Perfect binary tree: all levels are fully filled.
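The "complete" shape can be checked with a single level-order pass; a minimal sketch (the Node class here is illustrative, not a library API):

```python
from collections import deque

class Node:
    def __init__(self, val, left=None, right=None):
        self.val, self.left, self.right = val, left, right

def is_complete(root):
    """Level-order scan: once a missing child (gap) is seen, no later
    node may exist; that enforces 'all levels full except the last,
    filled left to right'."""
    if root is None:
        return True
    q = deque([root])
    seen_gap = False
    while q:
        node = q.popleft()
        if node is None:
            seen_gap = True
        else:
            if seen_gap:  # a real node after a gap -> not complete
                return False
            q.append(node.left)
            q.append(node.right)
    return True

complete = Node(1, Node(2, Node(4)), Node(3))           # last level fills left first
not_complete = Node(1, Node(2), Node(3, None, Node(4))) # gap before node 4
```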
Binary Search Tree (BST)
A binary tree satisfying the BST property: for every node, all values in the left subtree are smaller and all values in the right subtree are larger.

| Operation | Average | Worst case |
|---|---|---|
| Search | O(log n) | O(n) |
| Insert | O(log n) | O(n) |
| Delete | O(log n) | O(n) |
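A minimal sketch of BST search and insert following the property above (the class and function names are illustrative):

```python
class BSTNode:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def insert(root, key):
    # Walk down comparing keys; attach a new leaf where the search ends.
    if root is None:
        return BSTNode(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root  # duplicates are ignored in this sketch

def search(root, key):
    while root is not None and root.key != key:
        root = root.left if key < root.key else root.right
    return root is not None

root = None
for k in [5, 3, 8, 1, 4]:
    root = insert(root, k)
```

On a degenerate (sorted-input) tree these same functions exhibit the O(n) worst case from the table.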
AVL Tree
An AVL tree maintains the balance factor (height of left subtree − height of right subtree) at every node within {−1, 0, +1}. After an insertion or deletion that violates this invariant, the tree performs rotations to restore balance.
- Guaranteed O(log n) height.
- Stricter balance than red-black trees → faster lookups.
- More rotations on insert/delete → slower writes.
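The rebalancing rotations can be sketched as follows; this shows only the right rotation for the left-left case, with illustrative helper names:

```python
class AVLNode:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None
        self.height = 1

def height(n):
    return n.height if n else 0

def balance_factor(n):
    return height(n.left) - height(n.right)

def update(n):
    n.height = 1 + max(height(n.left), height(n.right))

def rotate_right(y):
    # Fixes a left-heavy subtree (balance factor +2, left-left case):
    # the left child x becomes the new subtree root.
    x = y.left
    y.left = x.right
    x.right = y
    update(y)
    update(x)
    return x

# Left-left case: 3 -> 2 -> 1 rebalances to a tree rooted at 2.
a = AVLNode(3); a.left = AVLNode(2); a.left.left = AVLNode(1)
update(a.left); update(a)
root = rotate_right(a)
```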
Red-Black Tree
A red-black tree is a self-balancing BST where each node is colored red or black and four invariants are maintained:
- Every node is red or black.
- The root is black.
- No two adjacent red nodes (red node cannot have a red parent or red child).
- Every path from a node to its descendant null pointers has the same number of black nodes.
Red-black trees back std::map and std::set in the C++ STL, TreeMap in Java, and epoll's internal socket tracking in the Linux kernel.
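The invariants can be verified mechanically. A hedged sketch of a validator (the node representation is an assumption for illustration, not any particular library's layout):

```python
RED, BLACK = "red", "black"

class RBNode:
    def __init__(self, key, color, left=None, right=None):
        self.key, self.color = key, color
        self.left, self.right = left, right

def check(node):
    """Return the black-height of this subtree, raising on a violation.
    Null children count as black leaves."""
    if node is None:
        return 1
    if node.color == RED:
        for child in (node.left, node.right):
            if child is not None and child.color == RED:
                raise ValueError("red node with red child")
    lh, rh = check(node.left), check(node.right)
    if lh != rh:
        raise ValueError("unequal black-heights")
    return lh + (1 if node.color == BLACK else 0)

def is_valid_rb(root):
    if root is not None and root.color != BLACK:
        return False  # the root must be black
    try:
        check(root)
        return True
    except ValueError:
        return False

good = RBNode(2, BLACK, RBNode(1, RED), RBNode(3, RED))
bad = RBNode(2, BLACK, RBNode(1, RED, RBNode(0, RED)), RBNode(3, BLACK))
```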
B+ Tree
A B+ tree is a multi-way balanced tree where every node can hold multiple keys (up to order m). All data records are stored in leaf nodes, which are linked in a doubly-linked list for efficient range scans. Internal nodes store only keys (routing information). Properties:
- Tree height is O(log_m n), very shallow for large m.
- Each node fits in one disk page (typically 4 KB or 16 KB), minimizing I/O.
- Sequential range scans traverse the leaf linked list, not the tree.
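The leaf chain is what makes range scans cheap; a minimal sketch with a hypothetical Leaf class (a real B+ tree would first descend from the root to locate the starting leaf, which this sketch omits):

```python
import bisect

class Leaf:
    # Each leaf holds sorted keys plus a pointer to the next leaf,
    # so a range scan never re-enters the tree.
    def __init__(self, keys):
        self.keys = keys
        self.next = None

def range_scan(first_leaf, lo, hi):
    """Collect all keys in [lo, hi] by walking the leaf chain."""
    out = []
    leaf = first_leaf
    while leaf is not None:
        start = bisect.bisect_left(leaf.keys, lo)
        for k in leaf.keys[start:]:
            if k > hi:
                return out  # keys are globally sorted; we can stop
            out.append(k)
        leaf = leaf.next
    return out

# Three chained leaves standing in for the leaf level of a B+ tree.
a, b, c = Leaf([1, 3]), Leaf([5, 7]), Leaf([9, 11])
a.next, b.next = b, c
```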
Heaps and Priority Queues
A heap is a complete binary tree stored implicitly in an array that satisfies the heap property:
- Max-heap: every parent is ≥ its children. Root holds the maximum.
- Min-heap: every parent is ≤ its children. Root holds the minimum.
Array Representation
For a node at index i (1-indexed):
- Left child: 2i
- Right child: 2i+1
- Parent: ⌊i/2⌋
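These index formulas can be demonstrated on a small max-heap stored 1-indexed (slot 0 is left unused so the arithmetic matches the formulas exactly):

```python
# 1-indexed array view of a max-heap; index 0 is a placeholder.
#         9
#        / \
#       7   8
#      / \   \
#     3   5   6
heap = [None, 9, 7, 8, 3, 5, 6]

def left(i):   return 2 * i
def right(i):  return 2 * i + 1
def parent(i): return i // 2

i = 2                        # the node holding 7
assert heap[left(i)] == 3    # its children are 3 and 5
assert heap[right(i)] == 5
assert heap[parent(i)] == 9  # its parent is the root
```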
Operations
| Operation | Time |
|---|---|
| Find min/max (peek) | O(1) |
| Insert | O(log n) |
| Extract min/max | O(log n) |
| Build heap from array (heapify) | O(n) |
| Decrease/increase key | O(log n) |
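In Python these operations are provided by the standard heapq module, which maintains a min-heap on a plain list (a max-heap is commonly simulated by negating keys):

```python
import heapq

data = [5, 1, 9, 3, 7]
heapq.heapify(data)             # O(n) bottom-up build
assert data[0] == 1             # peek min in O(1)

heapq.heappush(data, 0)         # O(log n) insert
smallest = heapq.heappop(data)  # O(log n) extract-min -> 0

# Max-heap via negation: push -x, negate on the way out.
neg = [-x for x in [5, 1, 9]]
heapq.heapify(neg)
largest = -heapq.heappop(neg)   # 9
```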
Use Cases
- Priority queues: task schedulers, Dijkstra’s shortest path, Prim’s MST.
- Heap sort: in-place O(n log n) sort.
- K-th largest/smallest: maintain a min-heap of size k; O(n log k).
- Median maintenance: two heaps (max-heap for lower half, min-heap for upper half).
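The k-th largest pattern from the list above, as a short sketch:

```python
import heapq

def kth_largest(nums, k):
    """Keep a min-heap of the k largest values seen so far; its root
    is the k-th largest overall. O(n log k) time, O(k) space."""
    heap = []
    for x in nums:
        if len(heap) < k:
            heapq.heappush(heap, x)
        elif x > heap[0]:
            heapq.heapreplace(heap, x)  # pop root and push x in one step
    return heap[0]
```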
Hash Tables
A hash table maps keys to values in O(1) average time by using a hash function to compute an array index from a key.

Hash Functions
A good hash function distributes keys uniformly across the bucket array, minimizing collisions. Common approaches include polynomial rolling hashes for strings, and bit-mixing algorithms (FNV-1a, MurmurHash3, xxHash) for general data. Cryptographic hash functions (SHA-256) are too slow for hash tables.

Collision Resolution
Separate chaining: each bucket holds a linked list (or another structure) of entries that hash to it. Lookup walks the list. Works well at high load factors but adds pointer overhead.

Open addressing: all entries live in the array itself. On a collision, probe alternative slots:
- Linear probing: check i+1, i+2, …; simple but suffers clustering.
- Quadratic probing: check i+1, i+4, i+9, …; reduces primary clustering.
- Double hashing: use a second hash function for the step size; best distribution.
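A minimal open-addressing sketch with linear probing and tombstone deletion (the class and its names are illustrative, not any library's API; resizing is omitted):

```python
_TOMBSTONE = object()  # marks a deleted slot so probe chains stay intact

class LinearProbingMap:
    def __init__(self, capacity=8):
        self.slots = [None] * capacity  # each slot: None, tombstone, or (key, value)

    def _probe(self, key):
        # Linear probing: yield i, i+1, i+2, ... wrapping around.
        i = hash(key) % len(self.slots)
        for _ in range(len(self.slots)):
            yield i
            i = (i + 1) % len(self.slots)

    def put(self, key, value):
        first_free = None
        for i in self._probe(key):
            slot = self.slots[i]
            if slot is _TOMBSTONE:
                if first_free is None:
                    first_free = i      # remember earliest reusable slot
            elif slot is None:
                # Key absent: insert at the earliest reusable slot.
                self.slots[first_free if first_free is not None else i] = (key, value)
                return
            elif slot[0] == key:
                self.slots[i] = (key, value)  # update in place
                return
        if first_free is not None:
            self.slots[first_free] = (key, value)
            return
        raise RuntimeError("table full; a real table would resize here")

    def get(self, key):
        for i in self._probe(key):
            slot = self.slots[i]
            if slot is None:
                return None  # an empty slot ends the probe chain
            if slot is not _TOMBSTONE and slot[0] == key:
                return slot[1]
        return None

    def delete(self, key):
        for i in self._probe(key):
            slot = self.slots[i]
            if slot is None:
                return
            if slot is not _TOMBSTONE and slot[0] == key:
                self.slots[i] = _TOMBSTONE  # cannot set None: would break chains
                return
```

The tombstone is what makes deletion awkward here: writing None instead would terminate probe chains early and make later keys unreachable.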
Load Factor
Load factor: α = n / m, where n is the number of entries and m is the number of buckets.
- At α < 0.7, separate chaining and open addressing both perform well.
- Above α ≈ 0.75 (Java HashMap's default threshold), performance degrades. The table then rehashes by doubling capacity and reinserting all entries: an O(n) operation that amortizes to O(1) per insertion.
| Aspect | Separate Chaining | Open Addressing |
|---|---|---|
| Memory per entry | Key + value + pointer | Key + value only |
| Cache friendliness | Poor (pointer chasing) | Good (array locality) |
| Deletion | Easy (unlink node) | Requires tombstone markers |
| High load factor | Degrades gracefully | Degrades quickly |
C++'s std::unordered_map uses separate chaining. Rust's HashMap uses SwissTable-style open addressing (via the hashbrown crate; earlier versions used Robin Hood hashing). Python's dict uses a custom open-addressing scheme.
Graphs
A graph G = (V, E) consists of a set of vertices V and edges E. Graphs may be directed or undirected, weighted or unweighted.

Representation
| Representation | Space | Edge lookup | Enumerate neighbors | Best for |
|---|---|---|---|---|
| Adjacency matrix | O(V²) | O(1) | O(V) | Dense graphs |
| Adjacency list | O(V + E) | O(degree) | O(degree) | Sparse graphs |
Adjacency lists (e.g. vector<vector<pair<int,int>>> or a linked forward star) are the standard choice in competitive programming and production graph libraries.
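A Python analogue of that adjacency-list layout, as a sketch (build_adj_list is an illustrative helper name):

```python
def build_adj_list(n, edges, directed=False):
    """edges: list of (u, v, w) triples. Returns adj where adj[u] is a
    list of (neighbor, weight) pairs, the Python counterpart of
    vector<vector<pair<int,int>>>."""
    adj = [[] for _ in range(n)]
    for u, v, w in edges:
        adj[u].append((v, w))
        if not directed:
            adj[v].append((u, w))  # undirected: store both directions
    return adj

adj = build_adj_list(4, [(0, 1, 5), (1, 2, 3), (0, 3, 2)])
# Enumerating neighbors of u costs O(degree(u)):
neighbors_of_0 = [v for v, w in adj[0]]
```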
BFS and DFS
BFS (Breadth-First Search) explores layer by layer using a queue. It finds the shortest path (by hop count) in an unweighted graph and is the basis for multi-source shortest path and bipartite checking.

DFS (Depth-First Search) explores as deep as possible before backtracking, using a stack (implicit via recursion). It underlies topological sort, cycle detection, SCC (Tarjan's / Kosaraju's), and many tree-based DP patterns.

Both run in O(V + E).

Shortest Path Algorithms
| Algorithm | Graph type | Time | Notes |
|---|---|---|---|
| BFS | Unweighted | O(V + E) | Optimal hop count |
| Dijkstra (binary heap) | Non-negative weights | O((V + E) log V) | Greedy; fails on negative edges |
| Bellman-Ford | Any weights | O(VE) | Detects negative cycles |
| SPFA | Any weights | O(VE) worst, often faster | Bellman-Ford with queue optimization |
| Floyd-Warshall | Any weights, all pairs | O(V³) | DP over intermediate nodes |
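Dijkstra with a binary heap, as summarized in the table, can be sketched with heapq using the common lazy-deletion variant (stale heap entries are skipped rather than decreased in place):

```python
import heapq

def dijkstra(adj, src):
    """adj[u] = list of (v, w) pairs with w >= 0.
    O((V + E) log V) with a binary heap."""
    INF = float("inf")
    dist = [INF] * len(adj)
    dist[src] = 0
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale entry: a shorter path to u was already found
        for v, w in adj[u]:
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

# 0 -1- 1, 1 -2- 3, 0 -4- 2, 2 -1- 3 (undirected, edge weights shown)
adj = [[(1, 1), (2, 4)],
       [(0, 1), (3, 2)],
       [(0, 4), (3, 1)],
       [(1, 2), (2, 1)]]
```

The greedy step pops the closest unsettled vertex; with a negative edge that vertex might later get a shorter path, which is why Dijkstra fails there.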
Topological Sort
Topological sort orders vertices of a DAG (directed acyclic graph) such that for every edge u → v, u appears before v. It is the basis for build systems, dependency resolution, and course prerequisite ordering. The standard BFS-based approach (Kahn's algorithm):
- Compute the in-degree of every vertex.
- Enqueue all vertices with in-degree 0.
- Dequeue a vertex, output it, decrement the in-degree of its neighbors, enqueue any neighbor whose in-degree reaches 0.
- If fewer than V vertices were output, the graph contains a cycle.
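The steps above can be sketched directly (topo_sort is an illustrative name):

```python
from collections import deque

def topo_sort(n, edges):
    """Kahn's algorithm. Returns one topological order of a DAG,
    or None if the graph contains a cycle."""
    adj = [[] for _ in range(n)]
    indeg = [0] * n
    for u, v in edges:              # step 1: compute in-degrees
        adj[u].append(v)
        indeg[v] += 1
    q = deque(i for i in range(n) if indeg[i] == 0)  # step 2
    order = []
    while q:                        # step 3: peel off zero in-degree vertices
        u = q.popleft()
        order.append(u)
        for v in adj[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                q.append(v)
    return order if len(order) == n else None        # step 4: cycle check
```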