我如何才能找到(遍历)有向图中从/到给定节点的所有周期?
例如,我想要这样的东西:
A->B->A
A->B->C->A
而不是: B - > C > B
我如何才能找到(遍历)有向图中从/到给定节点的所有周期?
例如,我想要这样的东西:
A->B->A
A->B->C->A
而不是: B - > C > B
从节点X开始,检查所有子节点(如果无方向,父节点和子节点是等价的)。将这些子节点标记为X的子节点。对于任何这样的子节点A,标记它的子节点是A的子节点,X',其中X'标记为2步远。)如果您稍后点击X并将其标记为X的子节点”,这意味着X处于3节点周期中。回溯到它的父节点很容易(因为算法不支持这一点,所以你可以找到任何一个有X'的父节点)。
注意:如果图是无向的或者有任何双向边,这个算法会变得更复杂,假设你不想在一个周期内两次遍历同一条边。
我曾经在面试中遇到过这样的问题,我怀疑你遇到过这种情况,你来这里寻求帮助。把这个问题分解成三个问题就容易多了。
如何确定下一个有效点 路线 你如何确定一个点是否存在 被使用 你如何避免越过 同样的观点
问题1) 使用迭代器模式提供迭代路由结果的方法。放置获取下一个路由的逻辑的一个好地方可能是迭代器的“moveNext”。要找到有效的路由,这取决于您的数据结构。对我来说,这是一个sql表充满有效的路由可能性,所以我必须建立一个查询,以获得有效的目的地给定的源。
问题2) 当您找到每个节点时,将它们推入一个集合,这意味着您可以通过动态询问正在构建的集合,很容易地查看是否在某个点上“返回”。
问题3) 如果在任何时候你看到你正在折回,你可以从集合中弹出东西并“后退”。然后从这一点开始,再次尝试“前进”。
黑客:如果你正在使用Sql Server 2008,有一些新的“层次结构”的东西,你可以用它来快速解决这个问题,如果你把你的数据结构成树状。
深度优先搜索和回溯应该在这里工作。 保存一个布尔值数组,以跟踪您以前是否访问过某个节点。如果您没有新节点可访问(不涉及已经访问过的节点),那么只需返回并尝试不同的分支。
如果你有一个邻接表来表示图,DFS很容易实现。例如adj[A] = {B,C}表示B和C是A的子结点。
例如,下面的伪代码。“start”是开始的节点。
dfs(adj,node,visited):
if (visited[node]):
if (node == start):
"found a path"
return;
visited[node]=YES;
for child in adj[node]:
dfs(adj,child,visited)
visited[node]=NO;
用开始节点调用上面的函数:
visited = {}
dfs(adj,start,visited)
我在搜索中找到了这个页面,由于循环与强连接组件不相同,我继续搜索,最后,我找到了一个高效的算法,它列出了有向图的所有(基本)循环。这篇论文来自唐纳德·b·约翰逊(Donald B. Johnson),可以在以下链接中找到:
http://www.cs.tufts.edu/comp/150GA/homeworks/hw1/Johnson%2075.PDF
java实现可以在下面找到:
http://normalisiert.de/code/java/elementaryCycles.zip
约翰逊算法的Mathematica演示可以在这里找到,实现可以从右边下载(“下载作者代码”)。
注:实际上,这个问题有很多算法。本文列举了其中一些:
http://dx.doi.org/10.1137/0205007
根据文章,Johnson的算法是最快的。
我无意中发现了下面的算法,它似乎比Johnson的算法更有效(至少对于更大的图)。然而,与Tarjan的算法相比,我不确定它的性能如何。 此外,到目前为止,我只检查了三角形。如果感兴趣,请参阅千叶Norishige和西泽木高雄(http://dx.doi.org/10.1137/0214017)的“树状性和子图列表算法”
如果你想要在图中找到所有基本电路,你可以使用JAMES C. TIERNAN的EC算法,该算法在1970年的一篇论文中发现。
非常原始的EC算法,因为我设法在php中实现它(希望没有错误如下所示)。如果有循环,它也可以找到。这个实现中的电路(试图克隆原始电路)是非零元素。0在这里代表不存在(我们知道它是空的)。
除此之外,下面的实现使算法更具独立性,这意味着节点可以从任何地方开始,甚至从负数开始,例如-4,-3,-2,..等。
在这两种情况下,都要求节点是顺序的。
你可能需要研究原始论文,James C. Tiernan基本电路算法
<?php
echo "<pre><br><br>";
$G = array(
1=>array(1,2,3),
2=>array(1,2,3),
3=>array(1,2,3)
);
define('N',key(array_slice($G, -1, 1, true)));
$P = array(1=>0,2=>0,3=>0,4=>0,5=>0);
$H = array(1=>$P, 2=>$P, 3=>$P, 4=>$P, 5=>$P );
$k = 1;
$P[$k] = key($G);
$Circ = array();
#[Path Extension]
EC2_Path_Extension:
foreach($G[$P[$k]] as $j => $child ){
if( $child>$P[1] and in_array($child, $P)===false and in_array($child, $H[$P[$k]])===false ){
$k++;
$P[$k] = $child;
goto EC2_Path_Extension;
} }
#[EC3 Circuit Confirmation]
if( in_array($P[1], $G[$P[$k]])===true ){//if PATH[1] is not child of PATH[current] then don't have a cycle
$Circ[] = $P;
}
#[EC4 Vertex Closure]
if($k===1){
goto EC5_Advance_Initial_Vertex;
}
//afou den ksana theoreitai einai asfales na svisoume
for( $m=1; $m<=N; $m++){//H[P[k], m] <- O, m = 1, 2, . . . , N
if( $H[$P[$k-1]][$m]===0 ){
$H[$P[$k-1]][$m]=$P[$k];
break(1);
}
}
for( $m=1; $m<=N; $m++ ){//H[P[k], m] <- O, m = 1, 2, . . . , N
$H[$P[$k]][$m]=0;
}
$P[$k]=0;
$k--;
goto EC2_Path_Extension;
#[EC5 Advance Initial Vertex]
EC5_Advance_Initial_Vertex:
if($P[1] === N){
goto EC6_Terminate;
}
$P[1]++;
$k=1;
$H=array(
1=>array(1=>0,2=>0,3=>0,4=>0,5=>0),
2=>array(1=>0,2=>0,3=>0,4=>0,5=>0),
3=>array(1=>0,2=>0,3=>0,4=>0,5=>0),
4=>array(1=>0,2=>0,3=>0,4=>0,5=>0),
5=>array(1=>0,2=>0,3=>0,4=>0,5=>0)
);
goto EC2_Path_Extension;
#[EC5 Advance Initial Vertex]
EC6_Terminate:
print_r($Circ);
?>
然后这是另一个实现,更独立于图形,没有goto和数组值,而是使用数组键,路径,图形和电路存储为数组键(如果你喜欢使用数组值,只需更改所需的行)。示例图从-4开始,以显示其独立性。
<?php
$G = array(
-4=>array(-4=>true,-3=>true,-2=>true),
-3=>array(-4=>true,-3=>true,-2=>true),
-2=>array(-4=>true,-3=>true,-2=>true)
);
$C = array();
EC($G,$C);
echo "<pre>";
print_r($C);
function EC($G, &$C){
$CNST_not_closed = false; // this flag indicates no closure
$CNST_closed = true; // this flag indicates closure
// define the state where there is no closures for some node
$tmp_first_node = key($G); // first node = first key
$tmp_last_node = $tmp_first_node-1+count($G); // last node = last key
$CNST_closure_reset = array();
for($k=$tmp_first_node; $k<=$tmp_last_node; $k++){
$CNST_closure_reset[$k] = $CNST_not_closed;
}
// define the state where there is no closure for all nodes
for($k=$tmp_first_node; $k<=$tmp_last_node; $k++){
$H[$k] = $CNST_closure_reset; // Key in the closure arrays represent nodes
}
unset($tmp_first_node);
unset($tmp_last_node);
# Start algorithm
foreach($G as $init_node => $children){#[Jump to initial node set]
#[Initial Node Set]
$P = array(); // declare at starup, remove the old $init_node from path on loop
$P[$init_node]=true; // the first key in P is always the new initial node
$k=$init_node; // update the current node
// On loop H[old_init_node] is not cleared cause is never checked again
do{#Path 1,3,7,4 jump here to extend father 7
do{#Path from 1,3,8,5 became 2,4,8,5,6 jump here to extend child 6
$new_expansion = false;
foreach( $G[$k] as $child => $foo ){#Consider each child of 7 or 6
if( $child>$init_node and isset($P[$child])===false and $H[$k][$child]===$CNST_not_closed ){
$P[$child]=true; // add this child to the path
$k = $child; // update the current node
$new_expansion=true;// set the flag for expanding the child of k
break(1); // we are done, one child at a time
} } }while(($new_expansion===true));// Do while a new child has been added to the path
# If the first node is child of the last we have a circuit
if( isset($G[$k][$init_node])===true ){
$C[] = $P; // Leaving this out of closure will catch loops to
}
# Closure
if($k>$init_node){ //if k>init_node then alwaya count(P)>1, so proceed to closure
$new_expansion=true; // $new_expansion is never true, set true to expand father of k
unset($P[$k]); // remove k from path
end($P); $k_father = key($P); // get father of k
$H[$k_father][$k]=$CNST_closed; // mark k as closed
$H[$k] = $CNST_closure_reset; // reset k closure
$k = $k_father; // update k
} } while($new_expansion===true);//if we don't wnter the if block m has the old k$k_father_old = $k;
// Advance Initial Vertex Context
}//foreach initial
}//function
?>
我已经分析并记录了EC,但不幸的是,文档是希腊语。
首先,你并不是真的想要找出所有的循环因为如果有1个,那么就会有无穷多个循环。比如A-B-A, A-B-A- b - a等等。或者可以将2个循环组合成一个8-like循环等等……有意义的方法是寻找所有所谓的简单循环——那些除了开始/结束点之外不交叉的循环。如果你愿意,你可以生成简单循环的组合。
One of the baseline algorithms for finding all simple cycles in a directed graph is this: Do a depth-first traversal of all simple paths (those that do not cross themselves) in the graph. Every time when the current node has a successor on the stack a simple cycle is discovered. It consists of the elements on the stack starting with the identified successor and ending with the top of the stack. Depth first traversal of all simple paths is similar to depth first search but you do not mark/record visited nodes other than those currently on the stack as stop points.
The brute force algorithm above is terribly inefficient and in addition to that generates multiple copies of the cycles. It is however the starting point of multiple practical algorithms which apply various enhancements in order to improve performance and avoid cycle duplication. I was surprised to find out some time ago that these algorithms are not readily available in textbooks and on the web. So I did some research and implemented 4 such algorithms and 1 algorithm for cycles in undirected graphs in an open source Java library here : http://code.google.com/p/niographs/ .
顺便说一句,因为我提到了无向图:它们的算法是不同的。构建一棵生成树,然后每一条不属于树的边与树中的一些边一起形成一个简单的循环。这样发现的循环形成了所谓的循环基。所有的简单循环都可以通过组合两个或多个不同的基循环来找到。更多细节请参见:http://dspace.mit.edu/bitstream/handle/1721.1/68106/FTL_R_1982_07.pdf。
在DAG中查找所有循环涉及两个步骤(算法)。
第一步是使用Tarjan的算法找到强连接组件的集合。
从任意顶点开始。 这个顶点的DFS。每个节点x保留两个数字,dfs_index[x]和dfs_lowval[x]。 Dfs_index [x]存储访问节点的时间,而dfs_lowval[x] = min(dfs_low[k]) where K是x的所有子结点在dfs生成树中不是x的父结点。 具有相同dfs_lowval[x]的所有节点都在同一个强连接组件中。
第二步是在连接的组件中找到循环(路径)。我的建议是使用改进版的Hierholzer算法。
这个想法是:
Choose any starting vertex v, and follow a trail of edges from that vertex until you return to v. It is not possible to get stuck at any vertex other than v, because the even degree of all vertices ensures that, when the trail enters another vertex w there must be an unused edge leaving w. The tour formed in this way is a closed tour, but may not cover all the vertices and edges of the initial graph. As long as there exists a vertex v that belongs to the current tour but that has adjacent edges not part of the tour, start another trail from v, following unused edges until you return to v, and join the tour formed in this way to the previous tour.
下面是带有测试用例的Java实现的链接:
http://stones333.blogspot.com/2013/12/find-cycles-in-directed-graph-dag.html
在无向图的情况下,最近发表的一篇论文(无向图中循环和st路径的最优列表)提供了一个渐近最优解。你可以在这里阅读http://arxiv.org/abs/1205.2766或http://dl.acm.org/citation.cfm?id=2627951 我知道它不能回答你的问题,但由于你的问题标题没有提到方向,它可能仍然对谷歌搜索有用
从开始节点s开始的DFS,在遍历过程中跟踪DFS路径,如果在到s的路径中发现从节点v开始的边,则记录该路径。(v,s)是DFS树中的后边,因此表示包含s的周期。
澄清:
Strongly Connected Components will find all subgraphs that have at least one cycle in them, not all possible cycles in the graph. e.g. if you take all strongly connected components and collapse/group/merge each one of them into one node (i.e. a node per component), you'll get a tree with no cycles (a DAG actually). Each component (which is basically a subgraph with at least one cycle in it) can contain many more possible cycles internally, so SCC will NOT find all possible cycles, it will find all possible groups that have at least one cycle, and if you group them, then the graph will not have cycles. to find all simple cycles in a graph, as others mentioned, Johnson's algorithm is a candidate.
关于你关于排列周期的问题,请在这里阅读更多: https://www.codechef.com/problems/PCYCLE
您可以尝试以下代码(输入大小和数字number):
# include<cstdio>
using namespace std;
int main()
{
int n;
scanf("%d",&n);
int num[1000];
int visited[1000]={0};
int vindex[2000];
for(int i=1;i<=n;i++)
scanf("%d",&num[i]);
int t_visited=0;
int cycles=0;
int start=0, index;
while(t_visited < n)
{
for(int i=1;i<=n;i++)
{
if(visited[i]==0)
{
vindex[start]=i;
visited[i]=1;
t_visited++;
index=start;
break;
}
}
while(true)
{
index++;
vindex[index]=num[vindex[index-1]];
if(vindex[index]==vindex[start])
break;
visited[vindex[index]]=1;
t_visited++;
}
vindex[++index]=0;
start=index+1;
cycles++;
}
printf("%d\n",cycles,vindex[0]);
for(int i=0;i<(n+2*cycles);i++)
{
if(vindex[i]==0)
printf("\n");
else
printf("%d ",vindex[i]);
}
}
我发现解决这个问题的最简单的选择是使用名为networkx的python库。
它实现了这个问题的最佳答案中提到的约翰逊算法,但它的执行非常简单。
简而言之,你需要以下几点:
import networkx as nx
import matplotlib.pyplot as plt
# Create Directed Graph
G=nx.DiGraph()
# Add a list of nodes:
G.add_nodes_from(["a","b","c","d","e"])
# Add a list of edges:
G.add_edges_from([("a","b"),("b","c"), ("c","a"), ("b","d"), ("d","e"), ("e","a")])
#Return a list of cycles described as a list o nodes
list(nx.simple_cycles(G))
答案:[['a', 'b', 'd', 'e'], ['a', 'b', 'c']]
基于dfs的带有后边缘的变体确实会发现循环,但在许多情况下,它不会是最小循环。一般来说,DFS给出了存在循环的标志,但它不足以真正找到循环。例如,想象5个不同的循环共用两条边。仅仅使用DFS(包括回溯变量)没有简单的方法来识别周期。
Johnson算法确实给出了所有唯一的简单循环,并具有良好的时间和空间复杂度。
但如果你只想找到最小循环(意味着可能有多个循环通过任何顶点,我们感兴趣的是找到最小循环),并且你的图不是很大,你可以尝试使用下面的简单方法。 它非常简单,但与约翰逊的相比相当慢。
So, one of the absolutely easiest way to find MINIMAL cycles is to use Floyd's algorithm to find minimal paths between all the vertices using adjacency matrix. This algorithm is nowhere near as optimal as Johnson's, but it is so simple and its inner loop is so tight that for smaller graphs (<=50-100 nodes) it absolutely makes sense to use it. Time complexity is O(n^3), space complexity O(n^2) if you use parent tracking and O(1) if you don't. First of all let's find the answer to the question if there is a cycle. The algorithm is dead-simple. Below is snippet in Scala.
val NO_EDGE = Integer.MAX_VALUE / 2
def shortestPath(weights: Array[Array[Int]]) = {
for (k <- weights.indices;
i <- weights.indices;
j <- weights.indices) {
val throughK = weights(i)(k) + weights(k)(j)
if (throughK < weights(i)(j)) {
weights(i)(j) = throughK
}
}
}
Originally this algorithm operates on weighted-edge graph to find all shortest paths between all pairs of nodes (hence the weights argument). For it to work correctly you need to provide 1 if there is a directed edge between the nodes or NO_EDGE otherwise. After algorithm executes, you can check the main diagonal, if there are values less then NO_EDGE than this node participates in a cycle of length equal to the value. Every other node of the same cycle will have the same value (on the main diagonal).
为了重建周期本身,我们需要使用带有父跟踪的稍微修改版本的算法。
def shortestPath(weights: Array[Array[Int]], parents: Array[Array[Int]]) = {
for (k <- weights.indices;
i <- weights.indices;
j <- weights.indices) {
val throughK = weights(i)(k) + weights(k)(j)
if (throughK < weights(i)(j)) {
parents(i)(j) = k
weights(i)(j) = throughK
}
}
}
如果顶点之间有边,父矩阵最初应该包含边缘单元中的源顶点索引,否则为-1。 函数返回后,对于每条边,您都将引用到最短路径树中的父节点。 然后很容易恢复实际的循环。
总之,我们有下面的程序来求所有的最小循环
val NO_EDGE = Integer.MAX_VALUE / 2;
def shortestPathWithParentTracking(
weights: Array[Array[Int]],
parents: Array[Array[Int]]) = {
for (k <- weights.indices;
i <- weights.indices;
j <- weights.indices) {
val throughK = weights(i)(k) + weights(k)(j)
if (throughK < weights(i)(j)) {
parents(i)(j) = parents(i)(k)
weights(i)(j) = throughK
}
}
}
def recoverCycles(
cycleNodes: Seq[Int],
parents: Array[Array[Int]]): Set[Seq[Int]] = {
val res = new mutable.HashSet[Seq[Int]]()
for (node <- cycleNodes) {
var cycle = new mutable.ArrayBuffer[Int]()
cycle += node
var other = parents(node)(node)
do {
cycle += other
other = parents(other)(node)
} while(other != node)
res += cycle.sorted
}
res.toSet
}
还有一个小的main方法来测试结果
def main(args: Array[String]): Unit = {
val n = 3
val weights = Array(Array(NO_EDGE, 1, NO_EDGE), Array(NO_EDGE, NO_EDGE, 1), Array(1, NO_EDGE, NO_EDGE))
val parents = Array(Array(-1, 1, -1), Array(-1, -1, 2), Array(0, -1, -1))
shortestPathWithParentTracking(weights, parents)
val cycleNodes = parents.indices.filter(i => parents(i)(i) < NO_EDGE)
val cycles: Set[Seq[Int]] = recoverCycles(cycleNodes, parents)
println("The following minimal cycle found:")
cycles.foreach(c => println(c.mkString))
println(s"Total: ${cycles.size} cycle found")
}
输出是
The following minimal cycle found:
012
Total: 1 cycle found
DFS c++版本的伪代码在二楼的答案:
void findCircleUnit(int start, int v, bool* visited, vector<int>& path) {
if(visited[v]) {
if(v == start) {
for(auto c : path)
cout << c << " ";
cout << endl;
return;
}
else
return;
}
visited[v] = true;
path.push_back(v);
for(auto i : G[v])
findCircleUnit(start, i, visited, path);
visited[v] = false;
path.pop_back();
}