18
/ \
12 25
/ \ / \
4 15 25 30
/ / \ /
1 13 17 28
\ \ \
3 14 29
Binary Search Trees
-
An ordered dictionary ADT is a dictionary in which the keys have a total order, just like in a heap. You can insert, find, and remove elements, just as with a hash table. But unlike a hash table, you can also quickly find the element with minimum or maximum key, or the entry nearest another entry in the total order. An ordered dictionary does anything a dictionary or binary heap can do and more, albeit more slowly.
-
A binary search tree (BST) is a simple way to implement an ordered dictionary.
-
In fact, the STL (multi)set and (multi)map use a variation of a BST (a red-black tree) as their internal data structure.
Representing Binary Trees
-
Recall that a binary tree is a rooted tree wherein no node has more than two children. Additionally, every child is either a left child or a right child of its parent, even if it is its parent’s only child. In the most popular binary tree representation, each tree node has three pointers: a parent pointer, a left pointer and a right pointer. Each node also stores a key (its value).
-
In a binary search tree, elements are maintained in a (somewhat) sorted order.
-
A binary search tree satisfies the binary-search-tree property (also called the binary-search-tree invariant):
Let be a node in a binary search tree. If is a node in the left subtree of , then . If is a node in the right subtree of , then .
-
When a node has only one child, that child is either a left child or a right child, depending on whether its key is smaller or larger than its parent’s key. A key equal to the parent’s key can go into either subtree.
-
Below is an example of a binary search tree.
You can verify this is a binary search tree: for instance, the root is 18, its left subtree (rooted at 12) contains numbers from 1 to 17, and its right subtree (rooted at 25) contains numbers from 25 to 30.
-
An in-order traversal of a binary search tree visits the nodes in sorted order. In this sense, a search tree maintains a sorted list of its elements.
For example, the in-order traversal of the above tree is
A pre-order and post-order of a BST do not have any meaning.
Operations on BSTs
-
Operations on a search tree are usually more efficient than the same operations on a sorted linked list.
Find
-
Depending on how we define the data structure, the
find
method can return the key, the node that stores that key or a boolean value. -
Pseudocode:
find(root, x): node = root while node ≠ nil: if x < node.key: node = node.left else if x > node.key node = node.right else: return node return nil
We can also easily modify this code so that it returns the largest key or the smallest key in the tree.
Minimum and Maximum
-
minimum
andmaximum
are procedures that allow us to find the smallest key and the largest key stored in the BST. These can be easily implemented in a BST since the node with the smallest key is the leftmost node in the tree and the node with the largest key is the rightmost node in the tree. -
Depending on how we define the data structure, these methods can return the key or the node that stores that key.
-
Pseudocode for
minimum
:minimum(root, x): node = root while node.left ≠ nil: node = node.left return node
-
Pseudocode for
maximum
:minimum(root, x): node = root while node.right ≠ nil: node = node.right return node
Insert
-
The
insert
method inserts a new node into the tree. -
We start by following the same path through the tree as
find
. When we reaches a null pointer (missing node), replace that node with a new node storing the new key. -
Duplicate keys are allowed. If
insert
finds a node that already has the same key, we’ll put the new node in the right subtree of the older one. (We could just as easily choose the left subtree; it doesn’t matter.) -
Pseudocode:
insert(tree, newKey): newNode = Node() newNode.left = newNode.right = nil newNode.key = newKey nodeToCheck = tree.root parentOfNewNode = nil while nodeToCheck ≠ nil: parentOfNewNode = nodeToCheck if newNode.key < nodeToCheck.key: nodeToCheck = nodeToCheck.left else: nodeToCheck = nodeToCheck.right newNode.parent = parentOfNewNode if parentOfNewNode == nil: // tree was empty tree.root = newNode else if newNode.key < parentOfNewNode.key: parentOfNewNode.left = newNode else: parentOfNewNode.right = newNode
Remove
-
The
remove
method removes a node with the specified key from the tree. If there are multiple nodes with that key, we remove only one node (the first one that we happen to find). -
We start by following the same path through the tree as
find
. If the key is not in the tree, we are done. Otherwise, letn
be the first node with that key.-
If
n
has no children, we easily detach it from its parent. -
If
n
has one child, moven
's child up to taken
's place.n
's child becomes the child ofn
's parent andn
's parent becomes the parent ofn
's child.
-
-
If
n
has two children, however, we have to be a bit more clever. LetsuccessorNode
be the node inn
's right subtree with the smallest key. We detachsuccessorNode
from its parent and replace replacen
withsuccessorNode
in the tree. SincesuccessorNode
's key is the next smallest key after the key we wanted to remove, the binary-search-tree property still holds. -
In order to move subtrees around within the binary search tree, we’ll define a subroutine
transplant
, which replaces one subtree as a child of its parent with another subtree:transplant(tree, oldChild, newChild): if oldChild.parent == nil: tree.root = newChild else if oldChild == oldChild.parent.left: oldChild.parent.left = newChild else: oldChild.parent.right = newChild if newChild ≠ nil: newChild.parent = oldChild.parent
We can then write this pseudocode for
remove
:remove(tree, key): nodeToRemove = find(key) if nodeToRemove ≠ nil: if nodeToRemove.left == nil: transplant(tree, nodeToRemove, nodeToRemove.right) else if nodeToRemove.right == nil: transplant(tree, nodeToRemove, nodeToRemove.left) else: successorNode = minimum(nodeToRemove.right) if successorNode.parent ≠ nodeToRemove: // detach successorNode from its parent transplant(tree, successorNode, successorNode.right) successorNode.right = nodeToRemove.right successorNode.right.parent = successorNode // replace nodeToRemove with successorNode transplant(tree, nodeToRemove, successorNode) successorNode.left = nodeToRemove.left successorNode.left.parent = successorNode
Successor and Predecessor
-
The successor of a node is the node with the smallest key if all keys are distinct and if duplicate keys are allowed.
-
We break the algorithm for finding the successor into two cases:
-
If the right subtree of is nonempty, then the successor of is the leftmost node in the right subtree of .
-
If the right subtree of is empty and has a successor , then is the lowest ancestor of whose left child is also an ancestor of . To find , we simply go up the tree from until we encounter a node that is the left child of its parent.
-
-
Pseudocode:
successor(node): if node.right ≠ nil: return minimum(x.right) else: y = node.parent while y ≠ nil and node == y.right node = y y = y.parent return y
-
-
The predecessor of a node is the node with the smallest key if all keys are distinct and if duplicate keys are allowed.
-
We break the algorithm for finding the successor into two cases:
-
If the left subtree of is nonempty, then the predecessor of is the rightmost node in the left subtree of ..
-
If the left subtree of is empty and has a predecessor , then is the lowest ancestor of whose right child is also an ancestor of . To find , we simply go up the tree from until we encounter a node that is the right child of its parent.
-
-
The pseudocode for
predecessor
is analogous to that ofsuccessor
.
-
-
As you can see, the structure of a binary search tree allows us to determine the successor and the predecessor of a node without ever comparing keys.
Complexity
-
Running times of
find
,insert
andremove
are where is the height of the tree. -
In the worst case, the height of the tree is where is the number of nodes in the tree.
For example, suppose we inserted a sorted sequence of numbers into the BST. The resulting tree will look like
1 \ 2 \ 3 \ 4 \ 5
Therefore, the worst case running time of
find
,insert
andremove
can also be expressed as where is the number of elements in the tree.
Balanced BSTs
-
In a perfectly balanced binary tree with height , the number of nodes . In other words, no node has depth .
-
Therefore, the running times of
find
,insert
andremove
are where is the number of nodes in the tree. -
In a balanced BST, all the leaf nodes are at the same level.
-
A balanced BST stays balanced when we insert new elements and remove the existing ones.
-
Some of the popular data structures that implement self-balancing BSTs include:
-
2-3 trees
-
AVL trees
-
Red-black trees
Most STL implementations of the ordered associative containers (sets, multisets, maps and multimaps) use red-black trees.
-