Graph representation - Adjancy list versus Adjancy Matrix

A Graph consists of nodes connected or not to other nodes by edges.

There are several ways of representing a graph. In this post, I will discuss 2 of those representations: adjancy list, adjancy matrix. Depending on the nature of the graph (dense or sparse), one or the other representation would be favored.

  • Dense graph:
  • Sparse graph:

Adjancy list

The key are the nodes, and the value is a list of connected nodes.

{
A: [B, C, E]
B: [A, C]
}

Pros:

  • faster and uses less space for sparse Graphs

Cons:

  • slower for dense Graphs

Adjancy Matrix

For each connected nodes, the value of the matrix at the corresponding indexes is 1.

If the edge is weighted, we can replace 1 by the weight. For example:

Pros:

  • works well for dense Graphs
  • faster for dense Graphs
  • simple for implemented weighted edges

Cons:

  • use more space: takes up to space. For example, 1,000 nodes will take up to 100,000 bytes. Typically, must be below 1000.

For one of of the Kaggle competitions: “Quora, Quora Question Pairs”, I was looking for a datastructure that will enable to speedup the code to generate data augmentatiton.