Graph Algorithms - Search
When working with graph, search is an important topic. For example, search for connectivity, search for shortest path. There are two basic strategies to do search in graph: Depth-first(DFS) and Breadth-first(BFS). Note that in this blog, all the discussions are based on undirected graph. But the strategy can be used to all kind of graphs given they share similar data structures.
What kind of problems we are solving?
The basic idea of search in general is to walk through the data structure and collection information we need. In terms of Graph, only two elements matters: nodes (vertices) and edges. Walking through a graph, really means iterating the nodes in a way.
Data Structure
The next questions to ask is that how can I solve a question by looping through the least nodes? Well to answer this question, we need to decide a data structure to represent graph.
Here we select a straight forward way: adjacent list. Completed code can be found here
Essentially, we use dict of dict to represent nodes, and dict of dict of dict to represent adjacent list. I know.. it is not a list at all. But the idea is the same, the benefit of using a dict is that it is very easy to embed meta data to either nodes or edges. And it is an easy way to extend this data structure to other types of graph, say weighted graph.
1 | class UndirectedGraph(Graph): |
And.. in the end, dict (hash map) is just a list with hashable index instead of int as index. Or in another words, dict is just a generalized list… alright.. too far away. :smirk:
For example, we can represent graph:
1 | 1 - 2 - 3 |
with following:
1 | __adj = {1: {2: {}}, 2:{3:{}, 4:{}}, 3:{2:{}}, 4:{2:{}}} |
Search strategy
Ok, we got our little dict(s), the next question is how can we search or walk through this structure? Well when we meet the first node, there are two obvious ways: 1. go to one of its adjacent node and go even deeper via that node. 2. go to all of its adjacent nodes and do the same for other nodes.
The first way is called depth-first, the second is called breadth-first.
Apperently they have different properties.
Depth-first search
For detailed code, please go https://github.com/wangzhe3224/pygraph/blob/master/pygraph/algos/dfs.py
We can prove that DFS marks all the nodes connected to a given node in time proportional to the sum of their degrees.
Recall degree of a node
is the number of nodes connected to it directly.
This strategy is efficient in may problems:
- Given a graph, are two given nodes are connected? This question, is equivalent to ask, given two nodes, is there a path from node a to b? if so, find me the path (in terms of sequence of nodes of course)!
- How many connected components does the graph have?
All right, let’s try to solve a find path problem using DFS.
Here is one question: given a graph, node a, calculate one path between a and the rest of the nodes, if no path, return None.
So let’s start with a recursive way, which is the nature of DFS.
1 | def path_view(nodes, edge_to: dict, source): |
Breadth-first search
Breadth-first search use a different strategy from depth-first search. It will search all the connected nodes and do the same process to sub-nodes. While depth search will go down a path to the end.
1 | def bfs_path(graph, source): |