Trees

[Trees] [Hello. This is Rachel Cardell-Oliver from the University of Western Australia]

Trees are a data structure that occurs in many real world situations: genealogical trees, organizational trees, biological hierarchies, evolutionary trees, population trees, book classification trees, probability trees, decision trees, search trees, graph spanning trees, planning trees, encoding trees, compression trees, program dependency trees, expression/syntax trees, family trees and so on.

How would you represent your family members in a computer? We could simply use a list, but there's a clear hierarchy here.

Let's say we're beginning with my great-grandfather, George. His son (one of) is my grandfather, Alan. Alan has 3 children, my uncle, Richard, my aunt, Felicity, and my mother Nicola. Nicola has three children: Rachel (me), Cathy and Tom. And I have 2 children: Francis and Alexander. Not all family information is shown in this tree. For example, we only show one of two parents.

Why would organising your family members in this way be better than a simple list representation? One reason is that this hierarchical structure, called a tree, contains more information than a simple list. We know the family relationships between everyone just by examining the tree. Also, it can speed up look-up time tremendously, if tree data is sorted.

We can't take advantage of that here, but we'll see an example of this soon. Each person is represented by a node in the tree. Nodes can have child nodes as well as a parent node. These are the technical terms, even when using trees for things besides families. Nicola's node has 3 children and 1 parent, while my node (Rachel) has 2 children and 1 parent. George’s node has no parent and one child. A node with no parent, the topmost node, is called the root of the tree. A leaf node is one that does not have any children on the outside edge of the tree. Francis, Felicity, and Gilbert are all leaf nodes.

The depth of a node is the length of the path from the root to the node. The height of a node is the length of the path from the node to the deepest leaf. Thus, the height of a tree is the length of the path from the root to the deepest tree: the number of levels.

[Figure 18.1 from Weiss]

Nodes with no children (leaves) are sometimes called external nodes and nodes with one or more children are internal nodes.

A binary tree is a specific type of tree, in which each node has, at most 2 children. Binary trees are used in many applications such as programming language compilers, and for expressing codes. It is also an important base for other data structures such as binary search trees and priority queues.

[Figure 18.11 from Weiss]

Here is the class for a node for a binary tree in Java. Every node has some data associated with it and 2 pointers to other nodes.

publicclass BinaryTreeNode {

intnodeValue; // data value associated with a node

BinaryTreeNodeleft; // reference for the left child

BinaryTreeNoderight; // reference for the right child

//constructor here

}

In our family tree, the associated data was each person's name. Names could be represented with a String. As it turns out, a binary tree would NOT be a good representation for a family, since people frequently have more than 2 children.

Tree Traversal

Suppose we wanted to visit all the elements of a binary tree. This is called traversing the tree. Tree traversal is used to search for an element, to print out all elements, and so on.

There are different ways of doing tree traversal depending on how we write the recursion. See UWA lecture notes Topic12-Traversals.pdf for more details.

Pre-order (depth first, parents first)

preorder(t)

if (!t.isEmpty)

visit root of t;

preorder t.left;

preorder t.right;

Post order (children first)

postorder(t)

if (!t.isEmpty)

postorder t.left;

postorder t.right;

visit root of t;

In-order (left them parent then right)

An inorder traversal means to print everything out in the left subtree followed by printing the node itself and then followed by printing everything out in the right subtree.

inorder(t)

if (!t.isEmpty)

inorder t.left;

visit root of t;

inorder t.right;

Level order (breadth first)

A breadth first traversal visits the root, then all nodes at level 1 (children of the root), then all nodes at level 3 (grand-children nodes). That is, starting at root, visit nodes level by level (left to right). This type of traversal does not suit a recursive approach because you have to jump from subtree to subtree.

Consider the results of different traversals for the following tree:

PreOrder Traversal: 621438710

PostOrder Traversal:134271086

InOrder Traversal:123467810

[ Part 1 of this talk is based on a video by Bannus Van der Kloot for Harvard University cs50. Figures are from Weiss, Data Structures Ch 18]