AVL and Red-Black Search Trees: Balancing Binary Search Trees, Study notes of Algorithms and Programming

AVL and red-black search trees are two types of self-balancing binary search trees. AVL trees use recursive algorithms to maintain height-balance, while red-black trees use color attributes to maintain balance. Both kinds of tree ensure efficient access to data and prevent degenerate trees.

Typology: Study notes

Pre 2010

Uploaded on 08/19/2009

koofers-user-c4f

Binary search trees are designed for efficient access to data. They are built to be search engines that locate an element along a path from the root. In an application, the actual efficiency depends on the shape of the tree. The order in which data enters the tree may cause a subtree to be heavily weighted to one side or the other. In the worst case, the tree is degenerate or "almost degenerate," where most of the n elements are stored as a lone child of a parent. The shape resembles a linked list (Figure 1(a)) and has search efficiency O(n). The other extreme is a complete binary tree that stores the n elements in a tree of minimum height by uniformly distributing the nodes in the left and right subtrees. Access to any element requires no more than int(log2 n) + 1 comparisons and the search efficiency is O(log2 n). A complete tree represents an ideal shape for a search tree.

A normal binary search tree uses search tree ordering to insert an element. The add() method follows a rigid set of rules that locates an element as a leaf node without regard for the overall shape of the tree. We need search trees that use a dynamic insert algorithm that rearranges elements whenever a subtree gets out of balance. The goal is to have a search tree with a measure of balance among the subtrees similar to a complete tree. Over the years, researchers have developed just such search trees. In this document, we will discuss AVL search trees and red-black search trees.

An AVL tree, named after its inventors Adelson-Velskii and Landis, uses recursive insert and delete algorithms that maintain height-balance at each node. By this we mean that for each node, the difference in height of its two subtrees is in the range -1 to 1. Figure 1(b) is an AVL tree. For node 70, the height of its left subtree is 1 and the height of its right subtree is 2. The difference heightL - heightR = -1 and the tree is "slightly tilted" to the right. In contrast to a simple binary search tree, an AVL tree can never become heavily weighted to one side or the other.

A red-black search tree provides a different kind of structure and balance criteria. The design of a red-black tree has its origins in a balanced search tree called a 2-3-4 tree. The name comes from the fact that each node has 2, 3, or 4 links (children). A 2-3-4 tree is perfectly balanced in the sense that no interior node has a null child and all leaf nodes are at the same level in the tree (Figure 1(d)). A red-black tree provides a representation of 2-3-4 trees. The trees feature nodes that have the color attribute BLACK or RED. The tree maintains a measure of balance called the BLACK-height. Figure 1(c) is a representation of the 2-3-4 tree with RED nodes displayed with shading. It is BLACK-height balanced since the path from the root to any empty subtree has two black nodes. This concept will become clear when we introduce red-black trees in Section 4. Modern data structures use red-black trees to implement ordered sets and maps such as the TreeSet and TreeMap collection classes. In Section 7, we create the RBTree class that implements the red-black tree balancing algorithms.
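To make the worst case concrete, the following minimal sketch inserts the Figure 1 list into a plain (unbalanced) binary search tree and measures the result. The class DegenerateBstDemo and its helper names are hypothetical and are not part of the classes discussed in this document; heights are counted in edges, with an empty tree assumed to have height -1.

```java
// Hypothetical demo class: a plain, unbalanced binary search tree of ints.
class DegenerateBstDemo {
    static final class Node {
        final int value;
        Node left, right;
        Node(int value) { this.value = value; }
    }

    // Standard BST insert: the new value always becomes a leaf; duplicates are ignored.
    static Node insert(Node t, int value) {
        if (t == null) return new Node(value);
        if (value < t.value) t.left = insert(t.left, value);
        else if (value > t.value) t.right = insert(t.right, value);
        return t;
    }

    // Height counted in edges; an empty tree is assumed to have height -1.
    static int height(Node t) {
        return t == null ? -1 : Math.max(height(t.left), height(t.right)) + 1;
    }

    // Number of nodes examined while searching for value.
    static int comparisons(Node t, int value) {
        int count = 0;
        while (t != null) {
            count++;
            if (value == t.value) break;
            t = value < t.value ? t.left : t.right;
        }
        return count;
    }

    static Node build(int[] values) {
        Node root = null;
        for (int v : values) root = insert(root, v);
        return root;
    }

    public static void main(String[] args) {
        Node root = build(new int[] {50, 95, 60, 90, 70, 80, 75, 78});
        System.out.println("height = " + height(root));
        System.out.println("comparisons to find 78 = " + comparisons(root, 78));
    }
}
```

For this insertion order every node has a single child, so the sketch reports a height of 7 and 8 comparisons to locate 78. A complete tree holding the same 8 elements would need at most int(log2 8) + 1 = 4 comparisons.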
FIGURE 1
Different search tree structures for the list {50, 95, 60, 90, 70, 80, 75, 78}: (a) Binary Search Tree, (b) AVL Tree, (c) Red-Black Tree, (d) 2-3-4 Tree.

Binary search trees are designed for efficient location of an element. AVL and red-black trees balance a binary search tree so it more nearly resembles a complete tree.

1 AVL Trees

AVL trees are modeled after binary search trees but with new algorithms to insert and delete an element. These operations must preserve the balance feature of the tree. Associated with each AVL tree node is its balanceFactor, which is the difference between the heights of the left and right subtrees.

balanceFactor = height(left subtree) - height(right subtree)

An AVL tree is height-balanced when the balanceFactor for each node is in the range -1 to 1.

If balanceFactor is positive, the node is "heavy on the left" since the height of the left subtree is greater than the height of the right subtree. With a negative balanceFactor, the node is "heavy on the right." A balanced node has balanceFactor = 0. Figure 2 describes three AVL trees with tags -1, 0, or 1 on each node to indicate its balanceFactor.
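The definition can be checked directly with a short sketch. The class BalanceFactorDemo below is hypothetical, and it recomputes heights recursively for clarity rather than storing them in each node; an empty subtree is assumed to have height -1.

```java
// Hypothetical demo class: computes balance factors from recursively computed heights.
class BalanceFactorDemo {
    static final class Node {
        final int value;
        final Node left, right;
        Node(int value, Node left, Node right) {
            this.value = value;
            this.left = left;
            this.right = right;
        }
    }

    // Height counted in edges; an empty subtree is assumed to have height -1.
    static int height(Node t) {
        return t == null ? -1 : Math.max(height(t.left), height(t.right)) + 1;
    }

    // balanceFactor = height(left subtree) - height(right subtree)
    static int balanceFactor(Node t) {
        return height(t.left) - height(t.right);
    }

    // A tree is height-balanced when every node has a balance factor of -1, 0, or 1.
    static boolean isHeightBalanced(Node t) {
        if (t == null) return true;
        int bf = balanceFactor(t);
        return bf >= -1 && bf <= 1
            && isHeightBalanced(t.left) && isHeightBalanced(t.right);
    }

    public static void main(String[] args) {
        Node leaf = new Node(10, null, null);
        Node leftHeavy = new Node(20, leaf, null);     // heavy on the left
        Node chain = new Node(30, leftHeavy, null);    // out of balance at the root
        System.out.println(balanceFactor(leftHeavy) + " " + isHeightBalanced(leftHeavy));
        System.out.println(balanceFactor(chain) + " " + isHeightBalanced(chain));
    }
}
```

Recomputing the height on every call costs time proportional to the subtree size, which is one reason a node-based implementation would typically cache the height in the node instead.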

FIGURE 2

AVL Trees with Height-balance Factor: panels (a), (b), and (c)

The AVLTree Class

The AVLTree class has the same public methods as the STree class. It implements the Collection framework and adds familiar methods that make it a good implementation structure. The constructor creates an empty collection. The find() method is an implementation utility operation. It has an Object parameter and returns a reference to an AVL tree node that matches the argument. The return value is null if the object is not in the tree. For output, the toString() method returns a comma-separated ordered list of elements enclosed in square brackets. Modified versions of the displayTree(), drawTree(), and drawMultiTree() methods provide console and graphical displays of a tree. Unlike the BinaryTree methods, the AVLTree versions display the node label with its balance factor included in parentheses. The UML diagram details the AVLTree methods in the Collection interface and the special class methods.

In each node of an AVL tree, the difference between the heights of its left and right subtrees does not exceed 1.

The AVLTree class implements the Collection interface and builds an AVL tree. It has the same public methods as the STree class.

2 Implementing the AVLTree Class

The building blocks of an AVL tree are AVLNode objects. Like an STNode for the STree class, an AVLNode includes a nodeValue field and references left and right that point to the two children. The node also contains a height field that defines the height of the node as the root of a subtree.

AVLNode fields: left | nodeValue | height | right

The height of the node is defined in terms of the heights of the left and right subtrees.

height(node) = max(height(node.left), height(node.right)) + 1

The following is a declaration of the AVLNode class. It is defined as a private static nested class within the AVLTree class. Since a node object is used only as an implementation structure, we define the data members to be public. This allows us to directly reference the fields when accessing and updating their values. A constructor takes a value argument that initializes the nodeValue field. The height field is set to 0 and the child references are set to null.

AVLNode Class:

private static class AVLNode
{
    // node data
    public Object nodeValue;

    // child links
    public AVLNode left, right;

    // height of the node as the root of a subtree
    public int height;

    // constructor loads only value field of the node
    public AVLNode (Object element)
    {
        nodeValue = element;
        left = null;
        right = null;
        height = 0;
    }
}

The AVLTree add() Method

The implementation of add() uses the private recursive method addNode() to insert a new element. The algorithm provides for the reordering of elements when a node falls out of balance; that is, when the balance factor of the node is -2 or +2. As we will discover, the algorithm introduces single and double rotations that restore height-balance at a node.

An AVLNode contains the node value, references to the node's children, and the height of the node as the root of a subtree.

(Figure 3 panels: "Insert 55 along path 40 - 50 - 60" and "Insert 65 along path 40 - 50 - 60"; each node is labeled with its balance factor in parentheses.)

The addNode() algorithm traverses down a path of nodes from the root using the usual search tree criteria. It proceeds to the left subtree if the new element is less than the value of the current node and to the right subtree if the new element is greater than the value of the current node. The scan terminates at an empty subtree, which becomes the new location for the element in the tree.

Adding an element to the tree may change the balance factor associated with one or more nodes in the search path. As a result, the tree may fall out of balance and require a reordering of nodes to reestablish the height-balance criteria. Since the insertion process is recursive, we have access to the nodes in the search path in reverse order. This allows the method to visit each successive parent back to the root and check its balance factor. In some cases, the factor is changed but remains within the valid range -1 to 1. In other cases, the parent has a balance factor of -2 or 2, indicating that the subtree is out of balance. The algorithm then employs rebalancing operations.

Let us look at examples where an insertion maintains the height-balance of the AVL tree. Figure 3 displays the effect of inserting 55 and then 65 into an AVL tree that initially contains six elements. For the element 55, the search path includes the nodes 40 - 50 - 60, which have balance factor 0. After the insertion, the balance factor for each node on the path changes but still remains in range. The same search path is used for element 65 and only the balance factor for 60 is changed.
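The balance factors in this example can be reproduced with a small sketch. Because neither insertion requires a rotation, a plain binary search tree insert is enough here. The class Figure3Demo, its helper names, and the order 40, 20, 50, 10, 45, 60 used to build the initial six-element tree are assumptions made for illustration.

```java
// Hypothetical demo class: tracks balance factors along the path 40 - 50 - 60.
class Figure3Demo {
    static final class Node {
        final int value;
        Node left, right;
        Node(int value) { this.value = value; }
    }

    // Plain BST insert; no rebalancing is needed for this example.
    static Node insert(Node t, int value) {
        if (t == null) return new Node(value);
        if (value < t.value) t.left = insert(t.left, value);
        else if (value > t.value) t.right = insert(t.right, value);
        return t;
    }

    // Height counted in edges; an empty subtree is assumed to have height -1.
    static int height(Node t) {
        return t == null ? -1 : Math.max(height(t.left), height(t.right)) + 1;
    }

    // Balance factor of the node holding value (the value must be in the tree).
    static int balanceFactorOf(Node root, int value) {
        Node t = root;
        while (t.value != value) t = value < t.value ? t.left : t.right;
        return height(t.left) - height(t.right);
    }

    // Formats each value on the path as value(balanceFactor).
    static String describePath(Node root, int... path) {
        StringBuilder sb = new StringBuilder();
        for (int v : path) {
            sb.append(v).append('(').append(balanceFactorOf(root, v)).append(") ");
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        Node root = null;
        for (int v : new int[] {40, 20, 50, 10, 45, 60}) root = insert(root, v);
        System.out.println("initial:  " + describePath(root, 40, 50, 60));
        root = insert(root, 55);
        System.out.println("after 55: " + describePath(root, 40, 50, 60));
        root = insert(root, 65);
        System.out.println("after 65: " + describePath(root, 40, 50, 60));
    }
}
```

Under these assumptions the path reads 40(0) 50(0) 60(0) initially, 40(-1) 50(-1) 60(1) after inserting 55, and 40(-1) 50(-1) 60(0) after inserting 65, so every factor stays in the range -1 to 1.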

FIGURE 3

Inserting 55 and 65 maintains AVL height-balance

Imbalanced Subtrees

The insertion of a new element in an AVL tree may cause a parent node to become imbalanced when the element is added as a leaf node in a subtree of one of the parent's children. Let us look at the different situations. A parent node has balance factor 2 when a new element X is inserted as a leaf node in a subtree of the parent's left child. The new element is in the left (outside) grandchild subtree when its value is less than the value of the left child (LC) (Figure 4 (a)). The new element is in the right (inside) grandchild subtree when its value is greater than the value of the left child (Figure 4 (b)).

The recursive addNode() algorithm moves to the insertion point using the usual rules for a binary search tree. The addition of an element may cause the tree to be out of balance. The recursive addNode() algorithm reorders nodes as it returns from function calls.

When the new element enters the subtree of an outside grandchild, a single rotation exchanges the parent and child nodes. The figure illustrates the case where element X enters the subtree of the left grandchild of P. A single right rotation rotates the nodes so that the left child (LC) replaces the parent, which becomes a right child. In the process, the nodes in the right subtree of LC (RGC) are attached as a left child of P. This maintains the search tree ordering since nodes in the right subtree are greater than LC but less than P.

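Single Right Rotation

(Figure. Before: P has left child LC and right subtree RC; LC has subtrees LGC, which contains X, and RGC. After: LC is the root with left subtree LGC, which contains X, and right child P; P has subtrees RGC and RC.)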

The private method singleRotateRight() reorders the nodes and updates the height field for both the parent and child. The new subtree, with the left child LC as the root, is the return value.

singleRotateRight():

private static AVLNode singleRotateRight(AVLNode p)
{
    AVLNode lc = p.left;

    // the right subtree of lc (RGC) becomes the left subtree of p
    p.left = lc.right;
    lc.right = p;

    // p is now a child of lc, so update the height of p first
    p.height = max( height(p.left), height(p.right) ) + 1;
    lc.height = max( height(lc.left), p.height ) + 1;

    return lc;
}

A symmetric single left rotation occurs when the new element enters the subtree of the right outside grandchild. The rotation exchanges the parent and right child nodes, and attaches the subtree LGC as a right subtree for the parent node.

When the new element enters the subtree of an outside grandchild, a single rotation exchanges the parent and child node. A rotation is either to the left or the right.

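Single Left Rotation

(Figure. Before: P has left subtree LC and right child RC; RC has subtrees LGC and RGC, which contains X. After: RC is the root with left child P and right subtree RGC, which contains X; P has subtrees LC and LGC.)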

The private method singleRotateLeft() updates the height field for both the parent and the right child. The child is the root of the reordered subtree and is the return value.

singleRotateLeft():

private static AVLNode singleRotateLeft(AVLNode p)
{
    AVLNode rc = p.right;

    // the left subtree of rc (LGC) becomes the right subtree of p
    p.right = rc.left;
    rc.left = p;

    // p is now a child of rc, so update the height of p first
    p.height = max( height(p.left), height(p.right) ) + 1;
    rc.height = max( height(rc.right), p.height ) + 1;

    return rc;
}

Double Rotations

A different rebalancing algorithm occurs when the new element is added to the subtree of an inside grandchild. Let us look at the case where the balance factor of the parent is 2 and thus the imbalance occurs in the left subtree (Figure 6). The new element X is less than the value of the parent and greater than the value of the left child. To rebalance the parent subtree, use a double right rotation, which is a series of two single rotations. Start with a single left rotation about the left child (LC) and follow that with a single right rotation about the parent (P). The two single rotations update the height field for the affected nodes.
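The two-step description above can be sketched as follows. The method names doubleRotateRight() and doubleRotateLeft() appear in the addNode() code of this document, but their bodies are not shown in this section, so the versions below are an assumption built from the two single rotations. The class DoubleRotationDemo, the int-valued Node type, and the node() helper are hypothetical.

```java
// Hypothetical demo class: double rotations composed from two single rotations.
class DoubleRotationDemo {
    static final class Node {
        final int value;
        Node left, right;
        int height;                      // a leaf has height 0
        Node(int value) { this.value = value; }
    }

    // Stored height of a subtree; an empty subtree is assumed to have height -1.
    static int height(Node t) { return t == null ? -1 : t.height; }

    // Helper that builds a node and sets its height from its children.
    static Node node(int value, Node left, Node right) {
        Node n = new Node(value);
        n.left = left;
        n.right = right;
        n.height = Math.max(height(left), height(right)) + 1;
        return n;
    }

    static Node singleRotateRight(Node p) {
        Node lc = p.left;
        p.left = lc.right;
        lc.right = p;
        p.height = Math.max(height(p.left), height(p.right)) + 1;
        lc.height = Math.max(height(lc.left), p.height) + 1;
        return lc;
    }

    static Node singleRotateLeft(Node p) {
        Node rc = p.right;
        p.right = rc.left;
        rc.left = p;
        p.height = Math.max(height(p.left), height(p.right)) + 1;
        rc.height = Math.max(height(rc.right), p.height) + 1;
        return rc;
    }

    // Inside grandchild in the left subtree: rotate left about LC, then right about P.
    static Node doubleRotateRight(Node p) {
        p.left = singleRotateLeft(p.left);
        return singleRotateRight(p);
    }

    // Mirror image: rotate right about RC, then left about P.
    static Node doubleRotateLeft(Node p) {
        p.right = singleRotateRight(p.right);
        return singleRotateLeft(p);
    }

    public static void main(String[] args) {
        // P = 30, LC = 10, and X = 20 entered as the inside (right) grandchild.
        Node p = node(30, node(10, null, node(20, null, null)), null);
        Node root = doubleRotateRight(p);
        System.out.println(root.left.value + " " + root.value + " " + root.right.value);
    }
}
```

In the three-node case in main(), the inside grandchild 20 ends up as the root with children 10 and 30, which restores a balance factor of 0 at every node.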

When the new element enters the subtree of an inside grandchild, a double rotation corrects the imbalance. A double rotation consists of two single rotations.

// recursive descent to the left child
t.left = addNode( t.left, item);

// when backtracking to the parent, check for balance
if( height(t.left) - height(t.right) == 2 )
{
    // if out of balance, determine whether item is in the left or the
    // right subtree of the left child
    if( ((Comparable)item).compareTo(t.left.nodeValue) < 0 )
        t = singleRotateRight(t);
    else
        t = doubleRotateRight(t);
}

The implementation of addNode() uses symmetric code if the original descent is into the right subtree and the parent has a balance factor -2 when backtracking.

addNode():

private AVLNode addNode(AVLNode t, Object element)
{
    if( t == null )
        // empty subtree; the new element becomes a leaf
        t = new AVLNode( element);
    else if( ((Comparable)element).compareTo(t.nodeValue) < 0 )
    {
        // recursive descent to the left child
        t.left = addNode( t.left, element);

        if( height(t.left) - height(t.right) == 2 )
            if( ((Comparable)element).compareTo(t.left.nodeValue) < 0 )
                t = singleRotateRight(t);
            else
                t = doubleRotateRight(t);
    }
    else if( ((Comparable)element).compareTo(t.nodeValue) > 0 )
    {
        // recursive descent to the right child
        t.right = addNode(t.right, element );

        if( height(t.left) - height(t.right) == -2 )
            if( ((Comparable)element).compareTo(t.right.nodeValue) > 0 )
                t = singleRotateLeft(t);
            else
                t = doubleRotateLeft(t);
    }
    else
        // duplicate; throw IllegalStateException
        throw new IllegalStateException();

    t.height = max( height(t.left), height(t.right) ) + 1;

    return t;
}

The add() Method

The details of inserting a new element in a tree are handled primarily by the addNode() method. Overseeing the operation is the responsibility of the add() method. Note that addNode() throws an exception if a duplicate element is already in the tree. This allows an immediate exit from the recursive process. The method add() simply catches the exception and returns the value false to indicate that no new element is added to the tree. Otherwise, addNode() does the insertion and returns the root, which may have changed due to rebalancing. The method concludes by incrementing the tree size and the variable modCount and then returns true.

The add() method assures that item is not in the tree, calls addNode() to insert it, and then increments treeSize and modCount.

add():

// if element is not in the tree, insert it and return true.
// if element is a duplicate, do not insert it and return false
public boolean add(Object element)
{
    try
    {
        root = addNode(root, element);
    }
    catch (IllegalStateException ise)
    { return false; }

    // increment the tree size and modCount
    treeSize++;
    modCount++;

    // we added a node to the tree
    return true;
}

EXAMPLE 2

The example illustrates each of the four AVL tree rotations. We build a tree with elements from the integer array {24, 12, 5, 30, 20, 45, 11, 13, 9, 16}. Rotations occur when inserting elements 5, 45, 9, and 16. The figure displays the tree after adding each of these key elements. You first see the tree after the familiar binary search tree insert algorithm appends the element as a leaf node. The imbalanced parent node is shaded. The next view is the tree after a rotation has reestablished height-balance for the parent.
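The sequence of rotations can be traced with a compact sketch of the insertion algorithm. This is not the AVLTree class itself: the class AvlExampleDemo, its int-valued nodes, the update() helper, and the rotation log are assumptions made so that the example is self-contained. The rebalancing tests follow the same comparisons as addNode().

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: an int-only AVL insert that logs each rotation.
class AvlExampleDemo {
    static final class Node {
        final int value;
        Node left, right;
        int height;                      // a new leaf has height 0
        Node(int value) { this.value = value; }
    }

    static final List<String> log = new ArrayList<>();

    static int height(Node t) { return t == null ? -1 : t.height; }

    static void update(Node t) {
        t.height = Math.max(height(t.left), height(t.right)) + 1;
    }

    static Node rotateRight(Node p) {
        Node lc = p.left;
        p.left = lc.right;
        lc.right = p;
        update(p);                       // p is now the child, so update it first
        update(lc);
        return lc;
    }

    static Node rotateLeft(Node p) {
        Node rc = p.right;
        p.right = rc.left;
        rc.left = p;
        update(p);
        update(rc);
        return rc;
    }

    static Node add(Node t, int item) {
        if (t == null) return new Node(item);
        if (item < t.value) {
            t.left = add(t.left, item);
            if (height(t.left) - height(t.right) == 2) {
                if (item < t.left.value) {
                    log.add("single right at " + t.value);
                } else {
                    log.add("double right at " + t.value);
                    t.left = rotateLeft(t.left);   // first half of the double rotation
                }
                t = rotateRight(t);
            }
        } else if (item > t.value) {
            t.right = add(t.right, item);
            if (height(t.left) - height(t.right) == -2) {
                if (item > t.right.value) {
                    log.add("single left at " + t.value);
                } else {
                    log.add("double left at " + t.value);
                    t.right = rotateRight(t.right); // first half of the double rotation
                }
                t = rotateLeft(t);
            }
        } else {
            throw new IllegalStateException("duplicate: " + item);
        }
        update(t);
        return t;
    }

    static void inorder(Node t, List<Integer> out) {
        if (t == null) return;
        inorder(t.left, out);
        out.add(t.value);
        inorder(t.right, out);
    }

    static Node build(int[] items) {
        log.clear();
        Node root = null;
        for (int item : items) root = add(root, item);
        return root;
    }

    public static void main(String[] args) {
        Node root = build(new int[] {24, 12, 5, 30, 20, 45, 11, 13, 9, 16});
        System.out.println("root = " + root.value + ", rotations = " + log);
    }
}
```

With this sketch the log records a single right rotation at 24 (inserting 5), a single left rotation at 12 (inserting 45), a double left rotation at 5 (inserting 9), and a double right rotation at 20 (inserting 16), and the final root is 24. The first two entries match Parts 1 and 2 of the example; the last two are what the sketch produces for the remaining insertions.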

Part 1: Insert the first three elements 24, 12, and 5. At 5, node 24 has balance factor 2. Viewing 24 as a parent, the new element 5 enters as an outside grandchild in the left subtree 12. The parent is rebalanced with a single right rotation.

(Figure. Insert 24, 12, 5: the tree is the left chain 24 - 12 - 5. Single Rotate Right (P = 24): 12 becomes the root with children 5 and 24.)

Part 2: Insert the next three elements 30, 20, and 45. At 45, node 12 has a balance factor -2. Since 45 entered as an outside grandchild in the right subtree 24, rebalance the parent with a single left rotation.