A Graph consists of nodes connected or not to other nodes by edges.
There are several ways of representing a graph. In this post, I will discuss 2 of those representations: adjancy list, adjancy matrix. Depending on the nature of the graph (dense or sparse), one or the other representation would be favored.
- Dense graph:
- Sparse graph:
Adjancy list
The key are the nodes, and the value is a list of connected nodes.
{
A: [B, C, E]
B: [A, C]
}
Pros:
- faster and uses less space for sparse Graphs
Cons:
- slower for dense Graphs
Adjancy Matrix
For each connected nodes, the value of the matrix at the corresponding indexes is 1.
If the edge is weighted, we can replace 1 by the weight. For example:
Pros:
- works well for dense Graphs
- faster for dense Graphs
- simple for implemented weighted edges
Cons:
- use more space: takes up to space. For example, 1,000 nodes will take up to 100,000 bytes. Typically, must be below 1000.
For one of of the Kaggle competitions: “Quora, Quora Question Pairs”, I was looking for a datastructure that will enable to speedup the code to generate data augmentatiton.