How to implement depth first search for graph with a non-recursive approach

Posted 2023-03-31

【Title】: How to implement depth first search for graph with a non-recursive approach 【Posted】: 2014-02-02 08:58:36 【Question】:

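I have spent a lot of time on this problem. However, I could only find solutions that use a non-recursive approach for a tree (Non recursive for tree), or a recursive approach for a graph (Recursive for graph).

Many tutorials (I won't give the links here) don't provide an approach either, or the tutorial is simply incorrect. Please help me.

Update:

It is really hard to describe:

If I have an undirected graph:

1-- 2-- 3 --1 is a cycle.

At the step 'push the neighbors of the popped vertex onto the stack', in what order should the vertices be pushed?

If the push order is 2, 4, 3, the vertices in the stack are: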

| |
|3|
|4|
|2|    
 _

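After popping the nodes, we get the result 1 -> 3 -> 4 -> 2 instead of 1 -> 3 -> 2 -> 4.

This is incorrect. What condition should I add to stop this from happening?

【Discussion】:

Using the algorithm in @amit's excellent answer, I cannot get 4 to appear between 3 and 2. There is one important detail in the algorithm: a node is added to the visited set only when it is actually visited, not when it is pushed onto the stack. Marking it as visited on push leads to the problem you ran into, because it prevents 3 from being treated as a child of 2, or 2 as a child of 3.

【Answer 1】: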

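DFS without recursion is basically the same as BFS, but it uses a stack instead of a queue as the data structure.

The thread Iterative DFS vs Recursive DFS and different elements order covers both approaches and the difference between them (and there is one: you will not traverse the nodes in the same order!).

The algorithm for the iterative approach is basically: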

DFS(source):
  s <- new stack
  visited <- {} // empty set
  s.push(source)
  while (s is not empty):
    current <- s.pop()
    if (current is in visited):
        continue
    visited.add(current)
    // do something with current
    for each node v such that (current,v) is an edge:
        s.push(v)
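
As a rough, runnable sketch of the pseudocode above (Python is used for brevity; the dict-of-lists graph and the function names `dfs_mark_on_pop` and `dfs_mark_on_push` are assumptions made for illustration, not part of the answer), the following compares marking a node as visited when it is popped with marking it when it is pushed. On the graph from the question, with the neighbors of 1 pushed in the order 2, 4, 3, the first variant gives 1, 3, 2, 4, while the second gives the 1, 3, 4, 2 order the question complains about.

```python
def dfs_mark_on_pop(graph, source):
    """Iterative DFS following the pseudocode: mark a node only when it is popped."""
    stack = [source]
    visited = set()
    order = []
    while stack:
        current = stack.pop()
        if current in visited:
            continue  # stale duplicate left on the stack
        visited.add(current)
        order.append(current)  # "do something with current"
        for v in graph[current]:
            stack.append(v)
    return order


def dfs_mark_on_push(graph, source):
    """Variant that marks a node as soon as it is pushed (not a true DFS order)."""
    stack = [source]
    visited = {source}
    order = []
    while stack:
        current = stack.pop()
        order.append(current)
        for v in graph[current]:
            if v not in visited:
                visited.add(v)
                stack.append(v)
    return order


# Undirected graph from the question; neighbors of 1 are listed in push order 2, 4, 3.
graph = {1: [2, 4, 3], 2: [1, 3], 3: [1, 2], 4: [1]}

print(dfs_mark_on_pop(graph, 1))   # [1, 3, 2, 4]
print(dfs_mark_on_push(graph, 1))  # [1, 3, 4, 2]
```

To get a specific visiting order among siblings, the neighbor lists can be pushed in reverse of the desired order, since the last neighbor pushed is the first one popped.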

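【Discussion】:

Well, that thread doesn't tell me the concrete function implementation, such as stack.Read(). Anyway, I want to know how to avoid cyclic problem: how to avoid adding the same node repeatedly when adding a node's neighbors, AND how to push the nodes onto the stack in the right order, so that I get the correct order when I pop them.

@Stallman visited keeps track of the visited nodes, so that you don't visit the same node twice. As for the iteration order, if you have a specific order in mind, you can implement it by replacing the for each node... line with your iteration order.

+1 for the good answer! I have to implement executable code to make sure your pseudocode is correct. Or you could provide concrete source code to prove it.

Doesn't this code use extra memory in terms of space complexity? I believe the best space complexity for iterative DFS is O(n).

This code is O(n²) in both space and time. Consider a complete graph (every vertex is connected to every other vertex). For all n vertices, each iteration of the while loop adds another list of n vertices to the stack, so there are O(n²) iterations in total. The space grows a bit more slowly because visited vertices are popped, but it is still O(n²), because it grows as the arithmetic progression n+(n-1)+(n-2)+...+2+1. Try this example code.

【Answer 2】:

This is not an answer but an extended comment, showing the application of the algorithm in @amit's answer to the graph in the current version of the question, assuming 1 is the start node and its neighbors are pushed in the order 2, 4, 3: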

               1
             / |  \
            4  |   2
               3 /

Actions            Stack             Visited
=======            =====             =======
push 1             [1]               
pop and visit 1    []                1
 push 2, 4, 3      [2, 4, 3]         1
pop and visit 3    [2, 4]            1, 3
 push 1, 2         [2, 4, 1, 2]      1, 3
pop and visit 2    [2, 4, 1]         1, 3, 2
 push 1, 3         [2, 4, 1, 1, 3]   1, 3, 2
pop 3 (visited)    [2, 4, 1, 1]      1, 3, 2
pop 1 (visited)    [2, 4, 1]         1, 3, 2
pop 1 (visited)    [2, 4]            1, 3, 2
pop and visit 4    [2]               1, 3, 2, 4
  push 1           [2, 1]            1, 3, 2, 4
pop 1 (visited)    [2]               1, 3, 2, 4
pop 2 (visited)    []                1, 3, 2, 4

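So applying the algorithm while pushing 1's neighbors in the order 2, 4, 3 results in the visit order 1, 3, 2, 4. Regardless of the push order of 1's neighbors, 2 and 3 will be adjacent in the visit order, because whichever of them is visited first will push the other, which has not been visited yet, as well as 1, which has.

【Discussion】:

Nice explanation. Since no source code is provided, I can't mark this as the answer.

Why push nodes that have already been visited?

@Shiro A node may become visited while it is on the stack, so checking for visited during the pop is unavoidable. See node 2 in the example. You could also do the check during the push, but it is not necessary.

@Shiro How would you change the algorithm?

@Shiro The cost of doing that is an additional conditional branch. Conditional branches can be very expensive because they can cause pipeline flushes. It is not obvious that the saving from one simple stack push-pop would cover the cost of the branch, and adding it would make the algorithm more complicated than it needs to be.

【Answer 3】: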

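The DFS logic should be:

1) If the current node has not been visited, visit it and mark it as visited
2) For all of its unvisited neighbors, push them onto the stack

For example, let's define a GraphNode class in Java:

class GraphNode {
    int index;
    ArrayList<GraphNode> neighbors;
}

Here is DFS without recursion: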

void dfs(GraphNode node) {
    // sanity check
    if (node == null) {
        return;
    }

    // use a hash set to mark visited nodes
    Set<GraphNode> set = new HashSet<GraphNode>();

    // use a stack to help depth-first traversal
    Stack<GraphNode> stack = new Stack<GraphNode>();
    stack.push(node);

    while (!stack.isEmpty()) {
        GraphNode curr = stack.pop();

        // current node has not been visited yet
        if (!set.contains(curr)) {
            // visit the node
            // ...

            // mark it as visited
            set.add(curr);
        }

        for (int i = 0; i < curr.neighbors.size(); i++) {
            GraphNode neighbor = curr.neighbors.get(i);

            // this neighbor has not been visited yet
            if (!set.contains(neighbor)) {
                stack.push(neighbor);
            }
        }
    }
}

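We can use the same logic to do DFS recursively, clone a graph, and so on.

【Discussion】:

【Answer 4】:

Many people will say that non-recursive DFS is just BFS with a stack instead of a queue. That is not accurate; let me explain a bit more.

Recursive DFS

Recursive DFS uses the call stack to keep state, which means you don't need to manage a separate stack yourself.

However, for a large graph, recursive DFS (or any recursive function) may result in deep recursion, which can crash your program with a stack overflow (not this website, the real thing).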
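
To illustrate the point about deep recursion, here is a small sketch in Python (the helper names `dfs_recursive` and `dfs_iterative` and the chain-shaped test graph are assumptions made for this example). In CPython the failure shows up as a `RecursionError` rather than a hard crash, because the interpreter enforces a recursion limit; the explicit-stack version is not subject to that limit.

```python
import sys


def dfs_recursive(graph, node, visited):
    """Plain recursive DFS; recursion depth equals the length of the current path."""
    visited.add(node)
    for v in graph[node]:
        if v not in visited:
            dfs_recursive(graph, v, visited)


def dfs_iterative(graph, source):
    """Explicit-stack DFS; depth is bounded by memory, not by the call stack."""
    visited = set()
    stack = [source]
    while stack:
        node = stack.pop()
        if node in visited:
            continue
        visited.add(node)
        stack.extend(graph[node])
    return visited


# A chain 0 -> 1 -> ... -> n-1 that is deeper than the interpreter's recursion limit.
n = sys.getrecursionlimit() + 1000
chain = {i: [i + 1] for i in range(n - 1)}
chain[n - 1] = []

try:
    dfs_recursive(chain, 0, set())
    recursion_failed = False
except RecursionError:
    recursion_failed = True

print("recursive version failed:", recursion_failed)
print("iterative version visited:", len(dfs_iterative(chain, 0)))
```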

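Non-recursive DFS

DFS is not the same as BFS. It has a different space utilization, but if you implement it just like BFS, only with a stack instead of a queue, you will use more space than a true non-recursive DFS needs.

Why more space?

Consider this: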

// From non-recursive "DFS"
for (auto& i : adjacent) {
    if (!visited(i)) {
        stack.push(i);
    }
}

And compare it to this:

// From recursive DFS
for (auto& i : adjacent) {
    if (!visited(i)) {
        dfs(i);
    }
}

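In the first piece of code you put all the adjacent nodes onto the stack before moving on to the next adjacent vertex, and that has a space cost. If the graph is large, it can make a significant difference.

What to do then?

If you decide to solve the space problem by iterating over the adjacency list again after popping the stack, that adds a time complexity cost.

One solution is to add items to the stack one by one, as you visit them. To do that, you can save an iterator in the stack so that the iteration can resume after popping.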
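
A minimal sketch of that idea in Python (the function names and the dict-of-lists graph are assumptions for illustration): each stack entry holds a vertex together with an iterator over its neighbors, so returning to a vertex resumes its neighbor scan where it stopped. For comparison, a push-all-neighbors version is included, and both report the largest stack size they reached on a complete graph.

```python
def dfs_with_iterators(graph, source):
    """DFS that keeps (vertex, neighbor iterator) pairs on the stack."""
    visited = {source}
    order = [source]
    stack = [(source, iter(graph[source]))]
    peak = 1
    while stack:
        _, neighbors = stack[-1]
        for v in neighbors:          # resumes where this iterator stopped last time
            if v not in visited:
                visited.add(v)
                order.append(v)
                stack.append((v, iter(graph[v])))
                peak = max(peak, len(stack))
                break
        else:
            stack.pop()              # all neighbors handled: backtrack
    return order, peak


def dfs_push_all(graph, source):
    """BFS-style loop with a stack: pushes every unvisited neighbor at once."""
    visited = set()
    order = []
    stack = [source]
    peak = 1
    while stack:
        current = stack.pop()
        if current in visited:
            continue
        visited.add(current)
        order.append(current)
        for v in graph[current]:
            if v not in visited:
                stack.append(v)
        peak = max(peak, len(stack))
    return order, peak


# Complete graph on n vertices: every vertex is adjacent to every other vertex.
n = 50
complete = {u: [v for v in range(n) if v != u] for u in range(n)}

order_it, peak_it = dfs_with_iterators(complete, 0)
order_all, peak_all = dfs_push_all(complete, 0)
print("iterator stack peak:", peak_it)   # never exceeds the number of vertices
print("push-all stack peak:", peak_all)  # grows roughly quadratically with n
```

The iterator version visits vertices in the same order as a recursive DFS over the same adjacency lists, because the stack of iterators plays the role of the call stack.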

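Lazy way

In C/C++, a lazy approach is to compile your program with a larger stack size and raise the stack size via ulimit, but that is really lousy. In Java you can set the stack size as a JVM parameter.

【Discussion】:

【Answer 5】:

Recursion is a way of using the call stack to store the state of the graph traversal. You can use a stack explicitly, say by having a local variable of type std::stack; then you won't need recursion to implement DFS, just a loop.

【Discussion】:

Yes, I tried using a stack, but I don't know how to avoid the cycle problem.

【Answer 6】:

OK. If you are still looking for Java code: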

void dfs(Vertex start) {
    Stack<Vertex> stack = new Stack<>(); // initialize a stack
    List<Vertex> visited = new ArrayList<>(); // maintains order of visited nodes
    stack.push(start); // push the start
    while (!stack.isEmpty()) { // loop until the stack is empty
        Vertex popped = stack.pop(); // pop the top of the stack
        if (!visited.contains(popped)) { // skip the vertex if it is already visited
            visited.add(popped); // mark it as visited as it is not yet visited
            for (Vertex adjacent : popped.getAdjacents()) { // get the adjacents of the vertex and add them to the stack
                stack.add(adjacent);
            }
        }
    }

    for (Vertex v1 : visited) {
        System.out.println(v1.getId());
    }
}

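【Discussion】:

You should explain the code a little... The OP should understand how the code works... Providing bare code is like doing their homework for them...

Hmm, I see. Makes sense.

【Answer 7】:

Python code. The time complexity is O(V+E), where V and E are the number of vertices and edges respectively. The space complexity is O(V), because the worst case is a path that contains every vertex without any backtracking (i.e. the search path is a linear chain).

The stack stores tuples of the form (vertex, vertex_edge_index), so that the DFS can resume from a particular vertex at the edge immediately following the last edge processed from that vertex (just like the function call stack of a recursive DFS).

The example code uses a complete digraph, where every vertex is connected to every other vertex. Hence it is not necessary to store an explicit edge list for each node, as the graph itself is the edge list (the graph G contains every vertex).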

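class Vertex:
    # minimal placeholder so the snippet runs (assumed; the class is not shown in the answer)
    def __init__(self, label):
        self.label = label

numv = 1000
print('vertices =', numv)
G = [Vertex(i) for i in range(numv)]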

def dfs(source):
  s = []
  visited = set()
  s.append((source,None))
  time = 1
  space = 0
  while s:
    time += 1
    current, index = s.pop()
    if index is None:
      visited.add(current)
      index = 0
    # vertex has all edges possible: G is a complete graph
    while index < len(G) and G[index] in visited:
      index += 1
    if index < len(G):
      s.append((current,index+1))
      s.append((G[index], None))
    space = max(space, len(s))
  print('time =', time, '\nspace =', space)

dfs(G[0])

Output:

time = 2000 
space = 1000

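Note that time here measures V operations, not E. The value is numv*2 because every vertex is considered twice, once on discovery and once on finishing.

【Discussion】:

I came to *** to avoid a stack overflow.

【Answer 8】:

Actually, a stack alone does not handle discovery time and finish time well. If we want to implement DFS with a stack and also track discovery and finish times, we need to resort to a second, recorder stack. My implementation is shown below; it has been tested on case 1, case 2 and case 3 listed in the code.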

from collections import defaultdict

class Graph(object):

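    def __init__(self, V):
        self.V = V
        # per-instance adjacency list (a class-level defaultdict would be shared by every Graph object)
        self.adj_list = defaultdict(list)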

    def add_edge(self,u,v):
        self.adj_list[u].append(v)

    def DFS(self):
        visited = []
        instack = []
        disc = []
        fini = []
        for t in range(self.V):
            visited.append(0)
            disc.append(0)
            fini.append(0)
            instack.append(0)

        time = 0
        for u_ in range(self.V):
            if (visited[u_] != 1):
                stack = []
                stack_recorder = []
                stack.append(u_)
                while stack:
                    u = stack.pop()
                    visited[u] = 1
                    time+=1
                    disc[u] = time
                    print(u)
                    stack_recorder.append(u)
                    flag = 0
                    for v in self.adj_list[u]:
                        if (visited[v] != 1):
                            flag = 1
                            if instack[v]==0:
                                stack.append(v)
                            instack[v]= 1



                    if flag == 0:
                        time+=1
                        temp = stack_recorder.pop()
                        fini[temp] = time
                while stack_recorder:
                    temp = stack_recorder.pop()
                    time+=1
                    fini[temp] = time
        print(disc)
        print(fini)

if __name__ == '__main__':

    V = 6
    G = Graph(V)

#==============================================================================
# #for case 1
#     G.add_edge(0,1)
#     G.add_edge(0,2)
#     G.add_edge(1,3)
#     G.add_edge(2,1)
#     G.add_edge(3,2) 
#==============================================================================

#==============================================================================
# #for case 2
#     G.add_edge(0,1)
#     G.add_edge(0,2)
#     G.add_edge(1,3)
#     G.add_edge(3,2)  
#==============================================================================

#for case 3
    G.add_edge(0,3)    
    G.add_edge(0,1)

    G.add_edge(1,4)
    G.add_edge(2,4)
    G.add_edge(2,5)
    G.add_edge(3,1)
    G.add_edge(4,3)
    G.add_edge(5,5)    


    G.DFS()
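
As an alternative sketch (not the answer's method), discovery and finish times can also be obtained with a single stack of (vertex, neighbor iterator) pairs that mirrors the recursive call stack: a vertex is finished at the moment its iterator is exhausted. This may matter because, in the recorder-stack version above, a vertex that still had unvisited neighbors when it was popped is only taken off the recorder once the main stack is empty, so its finish time can land after an unrelated sibling has been discovered. The function name `dfs_times` and the dict-based adjacency list are assumptions made for this example; the edges are those of case 3.

```python
def dfs_times(adj, num_vertices):
    """Iterative DFS returning (discovery, finish) time lists, as in recursive DFS."""
    disc = [0] * num_vertices   # 0 means "not discovered yet"; real times start at 1
    fini = [0] * num_vertices
    time = 0
    for root in range(num_vertices):
        if disc[root]:
            continue
        time += 1
        disc[root] = time
        stack = [(root, iter(adj.get(root, ())))]
        while stack:
            u, neighbors = stack[-1]
            for v in neighbors:             # resumes where the iterator stopped
                if not disc[v]:
                    time += 1
                    disc[v] = time
                    stack.append((v, iter(adj.get(v, ()))))
                    break
            else:
                stack.pop()                 # every neighbor of u is handled: u is finished
                time += 1
                fini[u] = time
    return disc, fini


# Edges of case 3, in the order they are added above.
case3 = {0: [3, 1], 1: [4], 2: [4, 5], 3: [1], 4: [3], 5: [5]}
disc, fini = dfs_times(case3, 6)
print(disc)  # [1, 3, 9, 2, 4, 10]
print(fini)  # [8, 6, 12, 7, 5, 11]
```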

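【Discussion】:

【Answer 9】:

I think you need to use a visited[n] boolean array to check whether the current node has been visited earlier.

【Discussion】:

【Answer 10】:

A recursive algorithm works very well for DFS because we try to go as deep as possible, i.e. as soon as we find an unexplored vertex, we immediately explore its FIRST unexplored neighbor. You need to BREAK out of the for loop as soon as you find the first unexplored neighbor.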

for each neighbor w of v
   if w is not explored
       mark w as explored
       push w onto the stack
       BREAK out of the for loop

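【Discussion】:

When we reach the end of a path at the deepest level, won't the stack be empty if we don't actually keep all the neighbors on the stack at once?

【Answer 11】:

I think this is a DFS optimized for space; correct me if I am wrong.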

s = stack
s.push(initial node)
add initial node to visited
while s is not empty:
    v = s.peek()
    if for all E(v,u) there is one unvisited u:
        mark u as visited
        s.push(u)
    else:
        s.pop()
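
The pseudocode above might be sketched in Python roughly as follows (the function name `dfs_peek` and the dict-of-lists graph are assumptions for illustration). The stack only ever holds the current path, so it stays within O(V) entries; the trade-off in this simple form is that each peek rescans the neighbor list from the start, which can cost extra time on high-degree vertices.

```python
def dfs_peek(graph, source):
    """DFS that peeks at the top vertex and pushes one unvisited neighbor at a time."""
    stack = [source]
    visited = {source}
    order = [source]
    while stack:
        v = stack[-1]                      # peek, do not pop yet
        # first unvisited neighbor of v, or None if there is none
        # (assumes None is never used as a vertex label)
        unvisited = next((u for u in graph[v] if u not in visited), None)
        if unvisited is not None:
            visited.add(unvisited)
            order.append(unvisited)
            stack.append(unvisited)
        else:
            stack.pop()                    # nothing left to explore from v
    return order


# Graph from the question, with the neighbors of 1 listed as 2, 4, 3.
graph = {1: [2, 4, 3], 2: [1, 3], 3: [1, 2], 4: [1]}
print(dfs_peek(graph, 1))  # [1, 2, 3, 4]
```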

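【Discussion】:

【Answer 12】:

Use a stack and manage it the way the call stack is managed in the recursive process.

The idea is to push a vertex onto the stack, then push one of its adjacent vertices, which are stored in the adjacency list at the index of that vertex, and to continue this process until we cannot move any further in the graph. When we cannot move ahead, we remove the vertex currently on top of the stack, since it cannot take us to any unvisited vertex.

With the stack we take care of the fact that a vertex is removed only when all the vertices that can be explored from it have been visited, which is what the recursive process does automatically.

For example:

See the example graph here.

(0 (1 (2 (4 4) 2) (3 3) 1) 0) (6 (5 5) (7 7) 6)

The parentheses above show the order in which the vertices are added to the stack and removed from it, so the parenthesis for a vertex is closed only when all the vertices that can be visited from it are done.

(Here I have used the adjacency list representation, implemented as a vector of lists (vector<list<int>> AdjList) using the C++ STL.)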

void DFSUsingStack() {
    /// we keep a check of the vertices visited, the vector is set to false for all vertices initially.
    vector<bool> visited(AdjList.size(), false);

    stack<int> st;

    for(int i=0 ; i<AdjList.size() ; i++){
        if(visited[i] == true){
            continue;
        }
        st.push(i);
        cout << i << '\n';
        visited[i] = true;
        while(!st.empty()){
            int curr = st.top();
            for(list<int> :: iterator it = AdjList[curr].begin() ; it != AdjList[curr].end() ; it++){
                if(visited[*it] == false){
                    st.push(*it);
                    cout << (*it) << '\n';
                    visited[*it] = true;
                    break;
                }
            }
            /// We can move ahead from current only if a new vertex has been added on the top of the stack.
            if(st.top() != curr){
                continue;
            }
            st.pop();
        }
    }
}

【Discussion】:

【Answer 13】:

The following Java code may come in handy:

private void DFS(int v,boolean[] visited){
    visited[v]=true;
    Stack<Integer> S = new Stack<Integer>();
    S.push(v);
    while(!S.isEmpty()){
        int v1=S.pop();
        System.out.println(adjLists.get(v1).name);
        for(Neighbor nbr=adjLists.get(v1).adjList; nbr != null; nbr=nbr.next){
             if (!visited[nbr.VertexNum]){
                 visited[nbr.VertexNum]=true;
                 S.push(nbr.VertexNum);
             }
        }
    }
}

public void dfs() {
    boolean[] visited = new boolean[adjLists.size()];
    for (int v=0; v < visited.length; v++) {
        if (!visited[v])/*This condition is for Unconnected Vertices*/ {

            System.out.println("\nSTARTING AT " + adjLists.get(v).name);
            DFS(v, visited);
        }
    }
}

【Discussion】:
