Binary Search Trees

An ordered dictionary ADT is a dictionary in which the keys have a total order, just like in a heap. You can insert, find, and remove elements, just as with a hash table. But unlike a hash table, you can also quickly find the element with minimum or maximum key, or the entry nearest another entry in the total order. An ordered dictionary does anything a dictionary or binary heap can do and more, albeit more slowly.
A binary search tree (BST) is a simple way to implement an ordered dictionary.
In fact, the STL (multi)set and (multi)map use a variation of a BST (a red-black tree) as their internal data structure.

Representing Binary Trees

Recall that a binary tree is a rooted tree wherein no node has more than two children. Additionally, every child is either a left child or a right child of its parent, even if it is its parent’s only child. In the most popular binary tree representation, each tree node has three pointers: a parent pointer, a left pointer and a right pointer. Each node also stores a key (its value).
In a binary search tree, elements are maintained in a (somewhat) sorted order.
A binary search tree satisfies the binary-search-tree property (also called the binary-search-tree invariant):

Let $x$ be a node in a binary search tree. If $y$ is a node in the left subtree of $x$ , then $y.key \leq x.key$ . If $y$ is a node in the right subtree of $x$ , then $y.key \geq x.key$ .
When a node has only one child, that child is either a left child or a right child, depending on whether its key is smaller or larger than its parent’s key. A key equal to the parent’s key can go into either subtree.
Below is an example of a binary search tree.
```
       18
      /  \
    12    25
   / \    / \
  4  15  25  30
 /  /  \    /
1  13  17  28
 \  \       \
  3  14      29
```
You can verify this is a binary search tree: for instance, the root is 18, its left subtree (rooted at 12) contains numbers from 1 to 17, and its right subtree (rooted at 25) contains numbers from 25 to 30.
An in-order traversal of a binary search tree visits the nodes in sorted order. In this sense, a search tree maintains a sorted list of its elements.

For example, the in-order traversal of the above tree is

$1, 3, 4, 12, 13, 14, 15, 17, 18, 25, 25, 28, 29, 30$

A pre-order and post-order of a BST do not have any meaning.

Operations on BSTs

Operations on a search tree are usually more efficient than the same operations on a sorted linked list.

Find

Depending on how we define the data structure, the find method can return the key, the node that stores that key or a boolean value.

Pseudocode:

find(root, x):
    node = root
    while node ≠ nil:
        if x < node.key:
            node = node.left
        else if x > node.key
            node = node.right
        else:
            return node
    return nil

We can also easily modify this code so that it returns the largest key $\leq x$ or the smallest key $\geq x$ in the tree.

Minimum and Maximum

minimum and maximum are procedures that allow us to find the smallest key and the largest key stored in the BST. These can be easily implemented in a BST since the node with the smallest key is the leftmost node in the tree and the node with the largest key is the rightmost node in the tree.
Depending on how we define the data structure, these methods can return the key or the node that stores that key.

Pseudocode for minimum:

minimum(root, x):
    node = root
    while node.left ≠ nil:
        node = node.left
    return node

Pseudocode for maximum:

minimum(root, x):
    node = root
    while node.right ≠ nil:
        node = node.right
    return node

Insert

The insert method inserts a new node into the tree.
We start by following the same path through the tree as find. When we reaches a null pointer (missing node), replace that node with a new node storing the new key.
Duplicate keys are allowed. If insert finds a node that already has the same key, we’ll put the new node in the right subtree of the older one. (We could just as easily choose the left subtree; it doesn’t matter.)

Pseudocode:

insert(tree, newKey):
    newNode = Node()
    newNode.left = newNode.right = nil
    newNode.key = newKey

    nodeToCheck = tree.root
    parentOfNewNode = nil
    while nodeToCheck ≠ nil:
        parentOfNewNode = nodeToCheck
        if newNode.key < nodeToCheck.key:
            nodeToCheck = nodeToCheck.left
        else:
            nodeToCheck = nodeToCheck.right
    newNode.parent = parentOfNewNode
    if parentOfNewNode == nil:
        // tree was empty
        tree.root = newNode
    else if newNode.key < parentOfNewNode.key:
        parentOfNewNode.left = newNode
    else:
        parentOfNewNode.right = newNode

Remove

The remove method removes a node with the specified key from the tree. If there are multiple nodes with that key, we remove only one node (the first one that we happen to find).
We start by following the same path through the tree as find. If the key is not in the tree, we are done. Otherwise, let n be the first node with that key.
- If n has no children, we easily detach it from its parent.
- If n has one child, move n's child up to take n's place. n's child becomes the child of n's parent and n's parent becomes the parent of n's child.
If n has two children, however, we have to be a bit more clever. Let successorNode be the node in n's right subtree with the smallest key. We detach successorNode from its parent and replace replace n with successorNode in the tree. Since successorNode's key is the next smallest key after the key we wanted to remove, the binary-search-tree property still holds.

In order to move subtrees around within the binary search tree, we’ll define a subroutine transplant, which replaces one subtree as a child of its parent with another subtree:

transplant(tree, oldChild, newChild):
    if oldChild.parent == nil:
        tree.root = newChild
    else if oldChild == oldChild.parent.left:
        oldChild.parent.left = newChild
    else:
        oldChild.parent.right = newChild
    if newChild ≠ nil:
        newChild.parent = oldChild.parent

We can then write this pseudocode for remove:

remove(tree, key):
    nodeToRemove = find(key)
    if nodeToRemove ≠ nil:
        if nodeToRemove.left == nil:
            transplant(tree, nodeToRemove, nodeToRemove.right)
        else if nodeToRemove.right == nil:
            transplant(tree, nodeToRemove, nodeToRemove.left)
        else:
            successorNode = minimum(nodeToRemove.right)
            if successorNode.parent ≠ nodeToRemove:
                // detach successorNode from its parent
                transplant(tree, successorNode, successorNode.right)
                successorNode.right = nodeToRemove.right
                successorNode.right.parent = successorNode
            // replace nodeToRemove with successorNode
            transplant(tree, nodeToRemove, successorNode)
            successorNode.left = nodeToRemove.left
            successorNode.left.parent = successorNode

Successor and Predecessor

The successor of a node $x$ is the node with the smallest key $< x.key$ if all keys are distinct and $\leq x.key$ if duplicate keys are allowed.
- We break the algorithm for finding the successor into two cases:
  1. If the right subtree of $x$ is nonempty, then the successor of $x$ is the leftmost node in the right subtree of $x$ .
  2. If the right subtree of $x$ is empty and $x$ has a successor $y$ , then $y$ is the lowest ancestor of $x$ whose left child is also an ancestor of $x$ . To find $y$ , we simply go up the tree from $x$ until we encounter a node that is the left child of its parent.
- Pseudocode:
  successor(node): if node.right ≠ nil: return minimum(x.right) else: y = node.parent while y ≠ nil and node == y.right node = y y = y.parent return y
The predecessor of a node $x$ is the node with the smallest key $> x.key$ if all keys are distinct and $\geq x.key$ if duplicate keys are allowed.
- We break the algorithm for finding the successor into two cases:
  1. If the left subtree of $x$ is nonempty, then the predecessor of $x$ is the rightmost node in the left subtree of $x$ ..
  2. If the left subtree of $x$ is empty and $x$ has a predecessor $y$ , then $y$ is the lowest ancestor of $x$ whose right child is also an ancestor of $x$ . To find $y$ , we simply go up the tree from $x$ until we encounter a node that is the right child of its parent.
- The pseudocode for predecessor is analogous to that of successor.
As you can see, the structure of a binary search tree allows us to determine the successor and the predecessor of a node without ever comparing keys.

Complexity

Running times of find, insert and remove are $O(h)$ where $h$ is the height of the tree.
In the worst case, the height of the tree is $\Theta(n)$ where $n$ is the number of nodes in the tree.

For example, suppose we inserted a sorted sequence of numbers $1, 2, 3, 4, 5$ into the BST. The resulting tree will look like
```
1
 \
  2
   \
    3
     \
      4
       \
        5
```
Therefore, the worst case running time of find, insert and remove can also be expressed as $\Theta(n)$ where $n$ is the number of elements in the tree.

Balanced BSTs

In a perfectly balanced binary tree with height $h$ , the number of nodes $n = 2^{h + 1} - 1$ . In other words, no node has depth $d > \log_2 n$ .
Therefore, the running times of find, insert and remove are $O(\log n)$ where $n$ is the number of nodes in the tree.
In a balanced BST, all the leaf nodes are at the same level.
A balanced BST stays balanced when we insert new elements and remove the existing ones.
Some of the popular data structures that implement self-balancing BSTs include:
- 2-3 trees
- AVL trees
- Red-black trees
Most STL implementations of the ordered associative containers (sets, multisets, maps and multimaps) use red-black trees.

Binary Search Trees

Binary Search Trees

Representing Binary Trees

Operations on BSTs

Find

Minimum and Maximum

Insert

Remove

Successor and Predecessor

Complexity

Balanced BSTs

results matching ""

No results matching ""