实现

一个parent数组用来储存每个节点 (node) 的最终父节点 (root)
一个find方法用来找寻当前节点 (node) 的最终父节点 (root)
一个union方法用来合并两个符合关系的节点
main方法对于需要合并的节点进行判断，并使用合并后的集合

//find the root parent of index i, it represent the whole cluster
public int find( int[] parent, int i ) {
    if(parent[i] == i) {
        return parent[i];
    }
    return find( parent, parent[i] );
}

//merge two node to the same parent because they are correlated
public void union ( int[] parent, int i, int j) {
    int ip = find ( parent, i );
    int jp = find ( parent, j );
    if ( ip != jp ) {
        parent[ jp ] = ip; 
    }
}

public int DSU(int[][] M) {
    if ( M.length == 0 ) {
        return 0;
    }
    //initialize the parent array
    int[] parent = new int [ M.length ];
    for (int i = 0; i < M.length; i++) {
        parent[ i ] = i;
    }
    for (int i = 0; i < M.length; i++) {   
        for (int j = 0; j < M.length; j++) {
            //if satisfied, merge them to one union
            if ( i != j && M[i][j] == 1) {
                union( parent, i, j );
           }
        }
    }
    int count = 0;
    //use the union result
    for (int i = 0; i < parent.length; i++) {
        if ( parent[i] == i ) {
            count++;       
        }
        return count;
}

写在一个function内

public int[] findRedundantConnection ( int[][] edges ) {
    //only one redundant edge, so number of nodes = edges.length
    int[] parent = new int [ edges.length ];
    for(int i = 0; i < edges.length; i++) {
        parent[ i ] = i;
    }
    for(int i = 0; i < edges.length; i++) {
        // because the node start at 1, parent[i] represent the parent of node i+1
        // minus 1 for check and modification, plus 1 when return results
        int start = edges[i][0]-1;
        int end = edges[i][1]-1;
        // ---------------- equivalent to find() function --------------------
        // recursively get the root of the union of start
        int sroot = start;
        while ( sroot != parent[sroot] ) {
            sroot = parent[sroot];
        }
        // recursively get the root of the union of end
        int eroot = end;
        while ( eroot != parent[eroot] ) {
            eroot = parent[eroot];
        }
        // ---------------- equivalent to union() function --------------------
        if ( sroot==eroot ) {
            // is already in the same union, current edge is redundant
            int[] ans= { start+1, end+1 };
            return ans;
        }else {
            // merge to the same union
            parent[eroot] = sroot;
        }
    }
    return null;
}

优化

路径优化 Path Compression

在find时使所有子节点均指向最终父节点，缩短递归查找父节点的路径
注：路径压缩并不会改变每个节点的秩

Path Compression

private int find(int[] parent, int p) {
    if( parent[i] == i ) {
        return parent[i];    
    }    
    parent[i] = find( parent, parent[i] );  //refresh value in the array before pass it to bottom
    return parent[i];
}

按秩合并 Union by Rank

在union时将节点少的树向节点(size)多的树合并。因为节点多少并不一定代表树的高度，所以可以进一步抽象化为将高度(rank，即这里的秩)小的树合并向高度大的树，以减少向上搜索的复杂度

Union by Size

//merge tree with smaller size to tree with larger size
// size[i] is initialized in main method to 1
public void union ( int[] parent, int[] size, int i, int j) {
    int ip = find ( parent, i );
    int jp = find ( parent, j );
    if ( ip != jp ) {
        if ( size[ip] > size[jp] ) {
            parent[ jp ] = ip; 
            size[ip] += size[jp];
        } else {
            parent[ ip ] = jp;
            size[jp] += size[ip];
        }
    }
}

Union By Rank

//merge tree with smaller height to tree with larger height
// rank[i] is initialized in main method to 1
public void union ( int[] parent, int[] rank, int i, int j) {
    int ip = find ( parent, i );
    int jp = find ( parent, j );
    if ( ip != jp ) {
        //only have to modify rank if the current two trees are with equivalent height
        if ( rank[ip] > rank[jp] ) {
            parent[ jp ] = ip; 
        } else if ( rank[jp] > rank[ip] ) {
            parent[ ip ] = jp;
        } else {
            parent[ jp ] = ip; 
            rank[ip]++;
        }
    }
}

复杂度

N = 需要合并的节点数
空间： $O(N)$ ，建立了与节点树等长的parent数组
时间：当同时使用了路径压缩和按秩合并后，合并的时间复杂度为 $O(N\alpha(N))$ ≈ $O(N)$ 。单次操作复杂度为 $O(\alpha(N))$ ≈1。其中 $\alpha$ 是Inverse-Ackerman函数。如果只有路径压缩或按秩合并，单次操作复杂度为 $O\log{N}$ 。