Comprehensive Trees Tutorial

Introduction to Trees

A tree is a hierarchical data structure that consists of nodes connected by edges. The top node is called the root, and each node can have zero or more child nodes. Trees are used in various applications, including databases, file systems, and network routing.

Key Operations: Insertion, Deletion, Traversal (Preorder, Inorder, Postorder), Searching

Time Complexity: Depends on the type of tree (typically O(log n) for balanced trees)

Space Complexity: O(n)

Types of Trees

1. Binary Tree

A binary tree is a tree data structure where each node has at most two children, referred to as the left child and the right child.

Example (C++):

struct Node {
    int data;
    Node* left;
    Node* right;

    Node(int val) {
        data = val;
        left = right = nullptr;
    }
};

Node* root = new Node(1);
root->left = new Node(2);
root->right = new Node(3);
            

Use Cases: Simple data structures, basic tree algorithms.

Time Complexity (Insertion/Deletion): O(h) where h is the height of the tree

Space Complexity: O(n)

2. Binary Search Tree (BST)

A binary search tree is a binary tree with the property that the left subtree of a node contains only nodes with values less than the node’s key, and the right subtree only nodes with values greater than the node’s key.

Example (C++):

struct Node {
    int data;
    Node* left;
    Node* right;

    Node(int val) {
        data = val;
        left = right = nullptr;
    }
};

Node* insert(Node* node, int key) {
    if (node == nullptr) return new Node(key);

    if (key < node->data)
        node->left = insert(node->left, key);
    else if (key > node->data)
        node->right = insert(node->right, key);

    return node;
}

Node* root = insert(nullptr, 10);
insert(root, 5);
insert(root, 20);
            

Use Cases: Efficient searching, dynamic sets of data, maintaining sorted lists.

Time Complexity (Insertion/Deletion/Search): O(log n) for balanced BST, O(n) for unbalanced BST

Space Complexity: O(n)

3. AVL Tree

An AVL tree is a self-balancing binary search tree where the difference between the heights of left and right subtrees cannot be more than one for all nodes.

Example (C++):

struct Node {
    int data;
    Node* left;
    Node* right;
    int height;
    
    Node(int val) {
        data = val;
        left = right = nullptr;
        height = 1;
    }
};

int height(Node* N) {
    if (N == nullptr)
        return 0;
    return N->height;
}

int getBalance(Node* N) {
    if (N == nullptr)
        return 0;
    return height(N->left) - height(N->right);
}

Node* rotateRight(Node* y) {
    Node* x = y->left;
    Node* T2 = x->right;
    x->right = y;
    y->left = T2;
    y->height = std::max(height(y->left), height(y->right)) + 1;
    x->height = std::max(height(x->left), height(x->right)) + 1;
    return x;
}

Node* rotateLeft(Node* x) {
    Node* y = x->right;
    Node* T2 = y->left;
    y->left = x;
    x->right = T2;
    x->height = std::max(height(x->left), height(x->right)) + 1;
    y->height = std::max(height(y->left), height(y->right)) + 1;
    return y;
}

Node* insert(Node* node, int key) {
    if (node == nullptr)
        return new Node(key);

    if (key < node->data)
        node->left = insert(node->left, key);
    else if (key > node->data)
        node->right = insert(node->right, key);
    else
        return node;

    node->height = 1 + std::max(height(node->left), height(node->right));

    int balance = getBalance(node);

    if (balance > 1 && key < node->left->data)
        return rotateRight(node);

    if (balance < -1 && key > node->right->data)
        return rotateLeft(node);

    if (balance > 1 && key > node->left->data) {
        node->left = rotateLeft(node->left);
        return rotateRight(node);
    }

    if (balance < -1 && key < node->right->data) {
        node->right = rotateRight(node->right);
        return rotateLeft(node);
    }

    return node;
}
            

Use Cases: Applications where the data is dynamic, and the height of the tree needs to be balanced for efficient operations.

Time Complexity (Insertion/Deletion/Search): O(log n) for all operations

Space Complexity: O(n)

4. Red-Black Tree

A red-black tree is a self-balancing binary search tree where each node has an extra bit for denoting the color of the node, either red or black. It ensures that the tree remains balanced, and the path from the root to the farthest leaf is no more than twice the path from the root to the nearest leaf.

Example (C++):

enum Color {RED, BLACK};

struct Node {
    int data;
    bool color;
    Node* left, *right, *parent;

    Node(int data) {
        this->data = data;
        left = right = parent = nullptr;
        this->color = RED;
    }
};

Node* rotateLeft(Node* &root, Node* &pt) {
    Node* pt_right = pt->right;
    pt->right = pt_right->left;

    if (pt->right != nullptr)
        pt->right->parent = pt;

    pt_right->parent = pt->parent;

    if (pt->parent == nullptr)
        root = pt_right;

    else if (pt == pt->parent->left)
        pt->parent->left = pt_right;

    else
        pt->parent->right = pt_right;

    pt_right->left = pt;
    pt->parent = pt_right;
}

Node* rotateRight(Node* &root, Node* &pt) {
    Node* pt_left = pt->left;
    pt->left = pt_left->right;

    if (pt->left != nullptr)
        pt->left->parent = pt;

    pt_left->parent = pt->parent;

    if (pt->parent == nullptr)
        root = pt_left;

    else if (pt == pt->parent->left)
        pt->parent->left = pt_left;

    else
        pt->parent->right = pt_left;

    pt_left->right = pt;
    pt->parent = pt_left;
}
            

Use Cases: Balancing large datasets, implementing associative arrays, multilevel indexing.

Time Complexity (Insertion/Deletion/Search): O(log n) for all operations

Space Complexity: O(n)

5. B-Tree

A B-tree is a self-balancing search tree in which nodes can have more than two children. It is commonly used in databases and file systems to allow quick access to data while maintaining balanced tree properties.

Example (Pseudocode):

// Node structure of B-Tree
struct BTreeNode {
    int *keys;
    int t;
    BTreeNode **C;
    int n;
    bool leaf;
};

BTreeNode* createNode(int t, bool leaf) {
    BTreeNode* node = new BTreeNode();
    node->t = t;
    node->leaf = leaf;
    node->keys = new int[2*t-1];
    node->C = new BTreeNode *[2*t];
    node->n = 0;
    return node;
}

// B-Tree Insertion involves splitting nodes that overflow, and the structure ensures balance by distributing keys across nodes evenly.
            

Use Cases: Database indexing, file systems, efficient insertion, deletion, and searching in large datasets.

Time Complexity (Insertion/Deletion/Search): O(log n) for all operations

Space Complexity: O(n)

6. Heap (Binary Heap)

A binary heap is a complete binary tree that satisfies the heap property: in a max heap, for any given node, the value of the node is greater than or equal to the values of its children, and in a min heap, the value of the node is less than or equal to the values of its children.

Example (C++):

void heapify(int arr[], int n, int i) {
    int largest = i;
    int left = 2*i + 1;
    int right = 2*i + 2;

    if (left < n && arr[left] > arr[largest])
        largest = left;

    if (right < n && arr[right] > arr[largest])
        largest = right;

    if (largest != i) {
        std::swap(arr[i], arr[largest]);
        heapify(arr, n, largest);
    }
}

void heapSort(int arr[], int n) {
    for (int i = n / 2 - 1; i >= 0; i--)
        heapify(arr, n, i);

    for (int i=n-1; i>=0; i--) {
        std::swap(arr[0], arr[i]);
        heapify(arr, i, 0);
    }
}
            

Use Cases: Implementing priority queues, scheduling tasks, finding the k-th smallest or largest element.

Time Complexity (Insertion/Deletion/Search): O(log n) for insertion and deletion, O(n log n) for heap sort

Space Complexity: O(n)

Tree Operations

1. Insertion

Insertion in trees varies depending on the type of tree. In binary search trees, insertion follows the property of the tree to maintain order. In AVL and Red-Black Trees, rotations may be necessary to maintain balance.

2. Deletion

Deletion also depends on the type of tree. For binary search trees, deletion of a node with two children involves finding the in-order successor. In AVL and Red-Black Trees, deletions may require rotations and recoloring to maintain balance.

3. Traversal

Traversal of trees can be done in several ways:

Here’s a code example for Preorder Traversal:

Example (C++):

void preorder(Node* node) {
    if (node == nullptr)
        return;

    std::cout << node->data << " ";
    preorder(node->left);
    preorder(node->right);
}

Node* root = new Node(1);
root->left = new Node(2);
root->right = new Node(3);
preorder(root);  // Output: 1 2 3
            

Here’s a code example for Inorder Traversal:

Example (C++):

void inorder(Node* node) {
    if (node == nullptr)
        return;

    inorder(node->left);
    std::cout << node->data << " ";
    inorder(node->right);
}

Node* root = new Node(1);
root->left = new Node(2);
root->right = new Node(3);
inorder(root);  // Output: 2 1 3
            

Here’s a code example for Postorder Traversal:

Example (C++):

void postorder(Node* node) {
    if (node == nullptr)
        return;

    postorder(node->left);
    postorder(node->right);
    std::cout << node->data << " ";
}

Node* root = new Node(1);
root->left = new Node(2);
root->right = new Node(3);
postorder(root);  // Output: 2 3 1
            

Here’s a code example for Level Order Traversal:

Example (C++):

#include 

void levelOrder(Node* root) {
    if (root == nullptr)
        return;

    std::queue q;
    q.push(root);

    while (!q.empty()) {
        Node* node = q.front();
        std::cout << node->data << " ";
        q.pop();

        if (node->left != nullptr)
            q.push(node->left);

        if (node->right != nullptr)
            q.push(node->right);
    }
}

Node* root = new Node(1);
root->left = new Node(2);
root->right = new Node(3);
levelOrder(root);  // Output: 1 2 3
            

4. Searching

Searching in binary search trees, AVL Trees, and Red-Black Trees typically has a time complexity of O(log n) due to the balanced nature of these trees. In unbalanced trees, searching can take O(n) time.

Here’s a code example for Searching in a BST:

Example (C++):

Node* search(Node* root, int key) {
    if (root == nullptr || root->data == key)
        return root;

    if (root->data < key)
        return search(root->right, key);

    return search(root->left, key);
}

Node* root = insert(nullptr, 10);
insert(root, 5);
insert(root, 20);
Node* result = search(root, 5);
if (result != nullptr)
    std::cout << "Found " << result->data << "\n";  // Output: Found 5
else
    std::cout << "Not Found\n";
            

When to Use Different Types of Trees

Each type of tree has its strengths and is suitable for different scenarios:

1. Binary Tree

Use Cases: When a simple, flexible tree structure is needed without strict balance requirements. Suitable for basic tree operations, expression trees, and simple hierarchical data.

2. Binary Search Tree (BST)

Use Cases: When dynamic data needs to be stored in a sorted manner with efficient searching, insertion, and deletion. Useful for building dictionaries, priority queues, and implementing associative arrays.

3. AVL Tree

Use Cases: When maintaining a balanced tree is crucial for ensuring efficient operations. Useful in database indexing, dynamic data structures, and scenarios where frequent insertions and deletions occur.

4. Red-Black Tree

Use Cases: Similar to AVL trees but with slightly relaxed balancing, making them faster in practice for insertions and deletions. Commonly used in implementing balanced maps, sets, and associative arrays in libraries like C++ STL and Java’s TreeMap.

5. B-Tree

Use Cases: Ideal for database indexing and file systems where large datasets need to be managed efficiently. B-trees excel in scenarios with high read and write throughput and where disk access speed is a concern.

6. Heap (Binary Heap)

Use Cases: Best for implementing priority queues, scheduling tasks, and algorithms like Dijkstra's shortest path. Useful when finding the minimum or maximum element is a frequent operation.