Ordering cyclic graphs

Suppose there are n sports teams, some of them play each other, in each game one team wins and the other loses and in the end we want to order the teams such that if A beats B than A is before B in the order. If two teams A and B played each other multiple times but, say, A clearly dominated B, we consider as if they would have played once and A beat B. If they played each other several times and there is no clear winner, or if they played only one draw, we consider as if they would have played twice and once A won, once B.

We can represent the whole situation as a directed graph with n vertices {1, …, n} and an edge from i to j if i won against j. Clearly, there are no self-loops. If the graph has no cycles, we can efficiently compute a topological sorting of the vertices and we are done. What if the graph has cycles (two teams that played a draw, or a sequence of teams where A beat B, B beat C, …, X beat A), but we still want to output a reasonable ordering? What is a reasonable ordering?

This is the vertex ordering problem and we can formulate many different criteria. I’ll go through some variants and mention for each of them, whether they are NP-hard or polynomially solvable. It turns out that almost all of them are NP-hard, and those which are not are quite trivial (plus the topological sorting mentioned before, which is also polynomially – in fact, linearly – computable). I’ll try to refer to sources where I found some, where I didn’t find any, the result is simple enough to be considered folklore. Not all the criteria that I list are reasonable – a reasonable criterion should actually lead to the topological sorting when there are no cycles. Did I forget some sensible criteria? Let me know what you think.

The general set-up is the following: we search for a permutation f:[n] -> [n], such as to minimize or maximize some quantity as follows:

1. The first criterion is that we want to minimize/maximize some reasonable function of the number of wins and the number of losses of a team. This includes the methods used in most sport tournaments where wins and loses are worth some number of points and in the end the teams are ranked by their accumulated number of points. This criterion is easy, because we can just compute the indegree/outdegree of each vertex, compute the necessary function values and sort vertices according to it.

Now we move to some more interesting criteria.

  • We either minimize (MIN) or maximize (MAX) some quantity.
  • The quantity is either the SUM, the MAXimum, the MINimum or the COUNT of something.
  • That something is either the length of EDGEs, the length of BACK-EDGEs, or SIGNED length of EDGEs.

This gives 2*4*3=24 cases.

For example, if i->j means that there is an edge from i to j:

MIN COUNT EDGE refers to    

MAX SUM EDGE refers to    

MIN SUM BACK-EDGE refers to    

MAX MIN SIGNED-EDGE refers to     .

These are all for directed graphs, as mentioned earlier. Before starting to enumerate all these cases, let’s define the same problem for undirected graphs. For undirected graphs there are less cases, as we don’t have to consider separately any edge or back-edge or signed edge. Thus there are just 2*4=8 possibilities here.

Now let’s see what we’ve got, numbering continuously all the cases.

Undirected graphs

2. MIN SUM: This is called Minimum Linear Arrangement, a known NP-hard problem.

3. MIN MAX: This is called Minimum Bandwidth, a known NP-hard problem.

4. MIN MIN: makes no sense, it can always be 1.

5. MIN COUNT: makes no sense, it is always the number of edges.

6. MAX SUM: this is equivalent to 2. on the complement graph, therefore NP-hard.

7. MAX MAX: makes no sense, it can always be n-1.

8. MAX MIN: This is the anti-bandwidth problem, studied in literature, see here, or here (leads to paywall). It is NP-hard. This can be seen via the following reduction: “Does G have a Hamiltonian path?” is the same as “can the vertices of G be arranged such that there are edges between neighbors?” is the same as “can the vertices of G be arranged such that the complement of G has anti-bandwith at least 2?”. If we could answer the last question efficiently, we could answer the first.

9. MAX COUNT: makes no sense, it is always the number of edges.

Now we move on to the original question:

Directed graphs

10. MIN SUM EDGE: NP-hard. From 2. if we replace every undirected edge with a directed edge.

11. MIN SUM BACK-EDGE: NP-hard. From 2. if we replace every undirected edge with two directed edges.

12. MIN SUM SIGNED-EDGE: this is quite interesting. Notice that we can replace a path a->b->c with a single edge a->c without changing the sum. If we do this sufficiently many times we will be left with vertices with only incoming or only outgoing edges. To minimize the sum we clearly have to put vertices with outgoing edges first, then vertices with incoming edges. Otherwise we could make a simple replacement that would decrease the sum. Furthermore, vertices with outgoing edges have to be ordered in decreasing order of degree, vertices with incoming edges ordered in increasing order of degree. Again, otherwise, a simple swap would decrease the sum. That’s all.. Observe that what we’ve got is the same as ordering the original vertices in decreasing order of [outderee-indegree]. In this sense, this is a special case of 1.

13. MAX SUM EDGE: NP-hard. From 6. if we replace every edge with a directed edge.

14. MAX SUM BACK-EDGE: NP-hard. From 6. if we replace every edge with two directed edges.

15. MAX SUM SIGNED-EDGE: solution is the reverse order of 12.

16. MIN MAX EDGE: NP-hard. From 3. if we replace every undirected edge with a directed edge.

17. MIN MAX BACK-EDGE: NP-hard. From 3. if we replace every undirected edge with two directed edges.

18. MIN MAX SIGNED-EDGE: NP-hard. From 3. if we replace every undirected edge with two directed edges. Known as Minimum Directed Bandwidth.

19. MAX MAX EDGE: makes no sense, it can always be n-1.

20. MAX MAX BACK-EDGE: makes no sense, it can always be n-1.

21. MAX MAX SIGNED-EDGE: makes no sense, it can always be n-1.

22. MIN MIN EDGE: makes no sense, it can always be 1.

23. MIN MIN BACK-EDGE: makes no sense, it can always be 1 and has to be at least 1 if there are cycles.

24. MIN MIN SIGNED-EDGE: makes no sense, it can always be -(n-1).

25. MAX MIN EDGE: NP-hard. From 8. if we replace every undirected edge with a directed edge.

26. MAX MIN BACK-EDGE: NP-hard. From 8. if we replace every undirected edge with two directed edges.

27. MAX MIN SIGNED-EDGE: NP-hard. From 8. if we replace every undirected edge with two directed edges.

28. MIN COUNT EDGE: makes no sense, it is always the number of edges.

29. MIN COUNT BACK-EDGE: NP-hard, known as Minimum Feedback Arc Set.

30. MIN COUNT SIGNED-EDGE: makes no sense, it is always the number of edges.

31. MAX COUNT EDGE: makes no sense, it is always the number of edges.

32. MAX COUNT BACK-EDGE: NP-hard, it is the reverse order of 29.

33. MAX COUNT SIGNED-EDGE: makes no sense, it is always the number of edges.

Phew :) Did I forget any interesting criteria?

From the point of view of the original motivation (a ranking of teams) the criteria that seem to some extent reasonable are: 11, 12, 17, 18, 29.

1 comment so far ↓

#1 a user on 03.25.14 at 2:28 pm

what if we have vertices 1..n, classifying them by m disjoint types, say type ti is a subset of {1…n} and ti is disjoint to tj and the union of all ti is {1…n}.

we are looking for a topological sort order that minimizes the changes between types.

for example consider the vertices: 1, 2, 3, 4 and the two types: t1={1,2}, t2={3,4}

The order 1,2,3,4 has only one change while the order 1,3,4,2 has two changes ( t1->t2->t1).

Any idea? Would appreciate it if you send me a mail if you got any idea.