Data Structures & Algorithms – Disjoint sets

A disjoint set, also known as a union-find data structure, is a collection of sets, where each set is disjoint (i.e., no elements are shared between sets). The main operations supported by a disjoint set are:

  • make_set(x): creates a new set containing only element x.
  • find_set(x): returns a representative element of the set containing x.
  • union(x, y): combines the sets containing x and y into a single set.

There are different ways to implement a disjoint set, and one common approach is to use a forest of trees, where each tree represents a set and the root of the tree is the representative element of the set. The find_set operation can be implemented by traversing the tree to find the root of the tree, and the union operation can be implemented by merging two trees by attaching the root of one tree to the root of the other tree.

Here is an example of how to implement a disjoint set using a forest of trees in Python using primitives:

class DisjointSet:
    def __init__(self):
        self.parent = {}

    def make_set(self, x):
        self.parent[x] = x

    def find_set(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find_set(self.parent[x])
        return self.parent[x]

    def union(self, x, y):
        x_root = self.find_set(x)
        y_root = self.find_set(y)
        self.parent[x_root] = y_root

Here is an example of how to implement a disjoint set using a forest of trees in Java using primitives:

import java.util.HashMap;
import java.util.Map;

class DisjointSet {
    Map<Integer, Integer> parent;

    DisjointSet() {
        parent = new HashMap<Integer, Integer>();
    }

    void makeSet(int x) {
        parent.put(x, x);
    }

    int findSet(int x) {
        if (parent.get(x) != x) {
            parent.put(x, findSet(parent.get(x)));
        }
        return parent.get(x);
    }

    void union(int x, int y) {
        int x_root = findSet(x);
        int y_root = findSet(y);
        parent.put(x_root, y_root);
    }
}

Here is an example of how to implement a disjoint set using a forest of trees in Rust using primitives:

struct DisjointSet {
    parent: HashMap<usize, usize>,
}

impl DisjointSet {
    fn new() -> Self {
        Self {
            parent: HashMap::new(),
        }
    }

    fn make_set(&mut self, x: usize) {
        self.parent.insert(x, x);
    }

    fn find_set(&mut self, x: usize) -> usize {
        let p = self.parent.get(&x).unwrap();
        if p != &x {
            self.parent.insert(x, self.find_set(*p));
        }
        *self.parent.get(&x).unwrap()
    }

    fn union(&mut self, x: usize, y: usize) {
        let x_root = self.find_set(x);
        let y_root = self.find_set(y);
        self.parent.insert(x_root, y_root);
    }
}

It’s important to note that the above implementation is not the most efficient way to implement a disjoint set, as it has a worst case time complexity of O(n*alpha(n)), where alpha(n) is the inverse Ackermann function, which is a very slow-growing function. Other implementations such as the path compression and union by rank can be used to optimize the performance of the disjoint set.

In conclusion, a disjoint set is a useful data structure for solving problems that involve combining and querying sets of elements. It can be implemented using a forest of trees in Python, JavaScript, Java and Rust using primitives. However, for better performance, other techniques such as path compression and union by rank can be used to optimize the implementation.