Union Find

March 7, 2011 – 6 years ago, by sarahgrebing

Short Description

This benchmark makes use of the union-find data structure shown in the code Section.
The verification should be performed for an arbitrary rand() function, even if an
implementation is actually provided with the solution.

Code

class UnionFind {
 static UnionFind create(uint sz);
 uint getNumClasses();
 int find(int a);
 void union(int a, int b);
}

uint rand();
void buildMaze(uint n) {
 UnionFind u = create(n*n);
 while (u.getNumClasses() > 1) {
  uint x = rand() % n, y = rand() % n, d = rand() % 2;
  uint w = x, z = y;
  if (d == 0) w++; else z++;
  if (w < n && z < n) {
   int a = y*n + x, b = w*n + z;
   if (u.find(a) != u.find(b)) {
    output edge ((x,y), (w,z));
    u.union(a, b);
   }
  }
 }
}

Specification

Long Description

This benchmark makes use of the union-find data structure shown in the code Section.

Method create() creates a union-find data structure consisting of sz elements,
identified by the integers from 0 to less than sz. Initially, each element is in an equivalence class by itself; that is, it is its own representative. In other words, the number of equivalence classes, which is returned by getNumClasses(), is initially sz.
As usual, find() returns the representative element for a given element, and union() merges two equivalence classes.
The test harness constructs a random maze. The maze has size n times n. Thus, it
is a graph with n*n nodes. The maze will be a spanning tree of that graph. The maze construction makes use of union-find as illustrated in the code Section.

The verification should be performed for an arbitrary rand() function, even if an
implementation is actually provided with the solution. The test harness shown above is acceptable for verifying partial correctness, but needs to be suitably modified in order to stand a chance of scoring points for termination.
The maze is considered correct if it is a spanning tree. More precisely, the task in this benchmark is to show that all n*n nodes are reachable (either from a particular node or that nodes are pairwise connected) and that there are n*n1
edges. (From these two properties, it follows that nodes are uniquely reachable.)

Verification Tasks

  1. Verify the correctness of the maze creation. 60 points.
  2. Include path compression in the implementation of the union-find algorithms. 10points.
  3. Include node balancing in the implementation of the union-find algorithms. 10points.

Tasks 2 and 3 involve the two standard optimizations that make union-find efficient.
For task 1, any correct implementation of union-find is acceptable.

Supplementary Materials

The paper from which this benchmark is taken was published in may 2010 by K. Rustan M. Leino and Michal Moskal. It contains some more benchmarks which are also included on this site.
Information about the scoring system and the pseudo-code description of the programs can be found here.

Submitted by

Sarah Grebing