I created a small React application to go along with this blog-post. It allows one to insert nodes into, and visualize the formation of, a B-Tree of order 4.

A link to a github repository will be up soon!

image-title-here

On Trees

Trees are prominent, non-linear data structures with a wide range of use-cases, including:

  • Representing priority queues (as heaps)
  • Serving as the foundational structure of decision making algorithms, i.e. the Decision Tree.
  • Providing effecient access to stored information when balanced, as in the case of B-Trees
  • Representing the DOM!

Tree data structures can be visualized by imagining a real life tree hanging upside down, where its root is at the top and the branches and leaves spread out downwards.

[shotsFired.jpg]

image-title-here

The individual components of a tree are called nodes, and the topmost component is the root node. Nodes with branches are called branch nodes, and the bottom-most nodes with no connecting branches are called leaf nodes. Every node has data, as well as zero or more children, which are the nodes on a deeper level that are connected to it.

More properties of trees:

  • A path is a sequence of connected nodes.
  • A path is of length n where n is the number of nodes in the path.
  • The height of a tree is the length of the longest path from the root node to a leaf.
  • The depth of a node is the length of the path from said node to the root.

Trees can be structured in specific ways to accomplish different tasks. For instance, a Binary Search Tree (shown below) is a tree such that:

  1. The left subtree of a node only contains nodes with keys lesser than said node’s key.
  2. The right subtree of a node only contains nodes with keys greater than said node’s key.
  3. Each left and right subtree must also be Binary Search Trees, i.e. the above two conditions hold all throughout the levels of the tree.
  4. There can be no duplicate nodes.
            [8]        //root, depth = 0. Tree of height 3. key = 8.
           /   \ 
          /     \
         [3]    [10]  //<-- depth = 1. height = 2
        /  \      \
       /    \      \
      [1]   [6]     [14]  //leaf node, depth = 2

A binary search tree with height n has O(log n) levels if implemented correctly. This means searching is an O(log n) operation as compared to an array, where it is an O(n) operation.

On B-Trees

A B-Tree is a self-balancing search tree and extension of the Binary Search Tree. Each node has a maximum of “m” children, which means it is of order “m”. A Binary Search Tree can also be thought of as a B-tree of order 2.

A B-tree of order “m”:

  • Each node has at MOST “m” children.

  • Each internal node has at least children.

  • Root has at least 2 children if it is not a leaf.

  • Non-leaf node with “k” children will have (k-1) keys. For a given key, all children to the left will have keys less than that key, and all children to the right will be greater.

               [[5] [9]]      //2 Keys
                /  |  \       
               /   |   \         
              /    |    \     //3 Children
      [[1][2]] [[6][7]]  [[10][11] 
    
  • All leaves appear in the same level. Important for maintaining a height bound!

The structure of a B-Tree is maintained by the algorithm by which nodes are inserted to the tree. Any insertion must occur at a leaf node. The process is outlined as below, and was adapted from “Introduction to Algorithms by Cormen. It’s a pro-active insertion algorithm:

  1. Start with currentNode as the root.
  2. While currentNode is not a leaf,

    • Find child x to be traversed to.
    • If child x is not full, change currentNode to child x.
    • If child x is full, split() the node into partA and partB. * If they inserted key is < the mid value of currentNode, set currentNode to the firt part, partA, else partB. * When splitting the currentNode, we move a key from currentNode to its parent
  3. Exit loop in step 2 when x is a leaf. At this point, x must have room for the key due to all the splitting we’ve done in advance.

There are numerous other trees worth checking out! These include:

  • Red-Black Trees, a self-balancing binary search tree where each node also has a bit that is interpreted as the color of the node. This is used to maintain balance during insertion and deletions.
  • Tries, a tree used to store a dynamic set or associative array of keys that are usually strings. The nodes do not store the key associated with that node. Instead, the node’s position defines what key it is linked to.
  • AVL Trees, a balanced binary-search tree where the heights of the two child subtrees of a given node differ by at most one.