1 The Multiway Cut Problem

The Multiway Cut (MC) problem is a generalization of the known Min-Cut problem. The input is a triple where is an undirected graph with a set of vertices and a set of edges, is the set of terminals, and is a function that assign positive weight to each edge in the graph. A valid cut is a set of edges such that any two vertices are disconnected over the sub-graph . The weight of a given cut is the sum of wights of its edges, that is, . A solution for the MC problem is a valid cut of minimum weight.
Let . Note that when the problem is reduced to the regular Min-Cut problem. For general graphs, any makes the problem NP-Hard. However, there are efficient algorithms for Multiway-Cut in trees. A tree is a connected acyclic graph. It is easy to see that if our graph is a tree, any two vertices in it must be connected by exactly one path. Moreover, . We will use these properties later when we prove that the algorithm gives an optimal solution.

2 Efficient Algorithm for MC in Trees

Let be the input to the algorithm, and is a tree. To simplify the analysis later on, we will work slightly differently; instead of trying to find a small-weight set to discard from the graph, we will try to find a large-weight set to keep in the graph. Note that obviously and . Therefore, finding with maximum weight is equivalent to finding with minimum weight.
The idea is to start with a trivial (and probably not optimal) solution , that is, we don’t keep any edge in the graph such that all the terminals are separated, and we will improve iteratively.
We start off with some definitions:
A connected component (CC) of a vertex over a graph is the set of all vertices that can be reached from in . Note that if then there are at most CCs in (each vertex has its own CC, which is the case when the graph has no edges).
A vertex inside some CC is a master node if it is a terminal vertex. We then say that all the other vertices in that CC are dominated by .
Now it is time to describe the algorithm:
  1. Start by sorting the edges of in descending order according to their weights. Let be the ordered sequence such that .
  2. Initialize a trivial solution . Maintain a set of all the CCs in the resulting graph (each vertex is its own CC).
  3. For each do the following:
    1. Consider the endpoints of the edge and let be the connected components of respectively. If at least one of them do not has a master node, update and let be the same as but with merged into a new single connected component. Otherwise and .
  4. Return the set .

3 The Solution is Correct and Optimal

It is easy to see that is a valid solution; we started with all the terminals separated in their own CC, and in each iteration we never connect two CCs that have different masters (i.e. terminals). Therefore, in the returned set, all the connected components has at most one master, meaning that the terminals are separated.
The fact that has maximum weight follows by the two following claims:
  1. Any optimal solution has . Moreover, .
  2. For any iteration of the algorithm, there is an (possibly different) optimal solution such that .
So we conclude that there exists some optimal such that , thus has maximum weight.
We move to state and prove those claims formally.
Let be the set of all maximum-wight solutions for Multiway-Cut for the instance and is a tree. Then any have . Moreover, .
Since we started from a graph that is a tree, the number of CCs in the final sub-graph is exactly , for any . Let be some optimal solution. It left to show that the sub-graph has exactly connected components:

Assume that there are less than , then we must have two terminals which live in the same CC - invalidating the solution . If we assume that there are more than CCs, then we must have a connected component without a master, so we can safely connect it to some other CC with some edge that is not in (and such edge exists since is connected). That way, we get a new valid solution with greater weight than the weight of which is a contradiction.

Similarly, we prove that has CCs: If it has less than , there must be two terminals in the same CCs, and that contradicts the correctness of . If it has more than CCs after iterations, there must be a CC without a master, so we may connect it with some edge, let it be , to other CC without invalidating the solution. But that wasn’t picked when the algorithm considered it - meaning that it could cause a violation in the sub-graph of the th iteration, and since that sub-graph is contained in the sub-graph of the last iteration then we can’t add to our solution - a contradiction.
Note that since any optimal solution have exactly CCs, we have that each CC must contains one terminal. So in the final sub-graph defined by the solution, any is dominated by exactly one master mode .
Let be the set of all optimal solutions for the instance and in a tree. For any let be the set of optimal solutions such that contained in them. Then, for any , .
For we have thus which is not empty. Assume toward contradiction that the claim is false, and let be the smallest such that , so we know that is not empty, and hence . In other words, during the th iteration, the algorithm added the edge to , causing to be empty.

Fix some . Denote and the subgraphs of . From the update rule of the algorithm, at least one of is not dominated by a master over , w.l.o.g assume it is . Let be the master that dominates over , and consider the path connecting , note that any edge along that path belongs to . Since the path does not exists in and in particular in , there is at least one edge that is part of the path and not in , let be such one with minimum weight.

Observe that it is impossible that because it means that the algorithm already rejected on the th iteration and since it was a violation for and then it is also a violation for therefore does not belongs to any optimal solution in , in contradiction that .

So suppose . Note that in the sub-graph induced by , must be dominated my some terminal . Because is a sub-graph of a tree, if then we can close a cycle by adding to , thus . Consider the sub-graph . The edge connects now the terminals and , and that is the only violation in . Since is a sub-graph of a tree there is only one path connecting and and it must contains the path , so removing the edge will break the connection between leading to a valid solution . By the assumption that the weight of can only increase (in fact, the optimality of forces ) and we get an optimal solution that contains , therefore , contradiction.

4 Running Time and Similar Algorithms

Note that , so step 1 of the algorithm takes . Step 2 is order of , times the cost of the operations inside the loop. There are data structures such as Union-Find that allow us to track the CCs of the current sub-graph in amortize small running time that is smaller than . Therefore, the total running time is bounded by .
Note that the algorithm presented in section 2, is similar to Kruskal’s algorithm for finding the minimum spanning tree. In both cases, the greedy approach leads to an optimal solution.