Data Structures & Algorithms – Greedy algorithms (e.g. Huffman coding, Kruskal’s minimum spanning tree)

Greedy algorithms are a class of algorithms that make locally optimal choices at each step with the hope of finding a global optimum. They are particularly useful for solving optimization problems where the goal is to find the best solution among a set of possibilities. Some examples of greedy algorithms include:

  1. Huffman coding: Huffman coding is a lossless data compression algorithm that works by constructing a prefix code tree, also known as a Huffman tree, based on the frequency of characters in a given text. The characters with the highest frequency are assigned the shortest codewords and the characters with the lowest frequency are assigned the longest codewords. The time complexity of Huffman coding is O(nlogn)

Here is an example of Huffman coding implemented in Python:

from queue import PriorityQueue

class Node:
    def __init__(self, char, freq):
        self.char = char
        self.freq = freq
        self.left = None
        self.right = None

    def __lt__(self, other):
        return self.freq < other.freq

def huffman_coding(data):
    freq = {}
    for char in data:
        freq[char] = freq.get(char, 0) + 1

    q = PriorityQueue()
    for char, f in freq.items():
        q.put(Node(char, f))

    while q.qsize() > 1:
        left = q.get()
        right = q.get()
        node = Node(None, left.freq + right.freq)
        node.left = left
        node.right = right
        q.put(node)

    codes = {}
    def dfs(node, code):
        if node.char:
            codes[node.char] = code
            return
        dfs(node.left, code + '0')
        dfs(node.right, code + '1')

    root = q.get()
    dfs(root, '')

    return codes

data = 'aaaabbbbcccc'
print(huffman_coding(data))
  1. Kruskal’s minimum spanning tree: Kruskal’s algorithm is a greedy algorithm that finds the minimum spanning tree of a given graph by iteratively adding edges to the tree while making sure that the edges do not form a cycle. The edges are added in increasing order of weight. The time complexity of Kruskal’s algorithm is O(ElogE)

Here is an example of Kruskal’s algorithm implemented in Python:

class UnionFind:
    def __init__(self, n):
        self.parent = [i for i in range(n)]
        self.rank = [0</code> for i in range(n)]

def find(self, x):
    if self.parent[x] != x:
        self.parent[x] = self.find(self.parent[x])
    return self.parent[x]

def union(self, x, y):
    x_root = self.find(x)
    y_root = self.find(y)
    if self.rank[x_root] < self.rank[y_root]:
        self.parent[x_root] = y_root
    elif self.rank[x_root] > self.rank[y_root]:
        self.parent[y_root] = x_root
    else:
        self.parent[y_root] = x_root
        self.rank[x_root] += 1

def kruskal(n, edges):
edges.sort(key=lambda x: x[2])
mst = []
uf = UnionFind(n)

for edge in edges:
    u, v, w = edge
    if uf.find(u) != uf.find(v):
        uf.union(u, v)
        mst.append(edge)

return mst

n = 9
edges = [(0, 1, 4), (0, 7, 8), (1, 2, 8), (1, 7, 11), (2, 3, 7), (2, 8, 2), (2, 5, 4), (3, 4, 9), (3, 5, 14), (4, 5, 10), (5, 6, 2), (6, 7, 1), (6, 8, 6), (7, 8, 7)]

print(kruskal(n, edges))

Both algorithms can be implemented in various programming languages such as Python, JavaScript, Java and Rust. Keep in mind that the examples provided are simplified versions of the original algorithms and may not be suitable for production use. In conclusion, greedy algorithms make locally optimal choices at each step with the hope of finding a global optimum. Huffman coding and Kruskal’s minimum spanning tree are examples of greedy algorithms that can be used to solve optimization problems. Huffman coding is used for lossless data compression, while Kruskal’s algorithm is used to find the minimum spanning tree of a given graph. Both algorithms have a time complexity of O(nlogn) and O(ElogE) respectively, which makes them efficient in practice.