Aaron's Blog

https://aaronchenwei.github.io/

View on GitHub
28 November 2022

Graph Algorithms Overivew

by aaronchenwei

1. Algorithm Types

2. Algorithms

2.1. Centrality

Name Mechanism Use Case
Eigenvector A vertex's influence is based on its indegree and the influence of the vertices which refer to it.
  • Finding influencer from a social network
  • Ranking web pages for searching engine
  • Identify the marketing targets within a doctor referral network
  • Personalized version can be used for user user behavior prediction and recommender systems
PageRank Variant of Eigenvector. Adding additional jump probability
Article Rank Variant of PageRank. Lowers the influence of low-degree nodes
Closeness Find vertexes that have shorter distance to all other nodes
  • Recommend locations for a grocery store
Harmonic Variant of Closeness Centrality. Reverses the sum and reciprocal operations in the definition of closeness centrality
  • Same as Closeness, but works better when run on disconnected graph
Degree Variant of Closeness Centrality. Reverses the sum and reciprocal operations in the definition of closeness centrality
  • Same as Closeness, but works better when run on disconnected graph
Betweenness Variant of Closeness Centrality. Reverses the sum and reciprocal operations in the definition of closeness centrality
  • Same as Closeness, but works better when run on disconnected graph
Influence Maximization Variant of Closeness Centrality. Reverses the sum and reciprocal operations in the definition of closeness centrality
  • Same as Closeness, but works better when run on disconnected graph

2.1.1. PageRank

2.1.2. Article Rank

2.1.3. Betweenness

2.1.4. Closeness

2.1.5. Degree

2.1.6. Eigenvector

2.1.7. Harmonic

2.1.8. Influence Maximization

2.2. Community

Name Mechanism Use Case
Weakly Connected Components (WCC) Every vertex is reachable from every other vertex through an undirected path
  • Entity Resolution, obtain a holistic picture of an given user entity
  • Fraud ring detection
Strongly Connected Components Every vertex is reachable from every other vertex through an directed path
  • AML, money laundering ring detection
K Core Maximal connected subgraph in which every vertex is connected to at least k vertices in the subgraph
  • Network analytical characterization of hierarchical layers (Hierarchical result based on different k values)
K-means Partition vertexes into k clusters in which each vertex belongs to the cluster with the nearest mean
  • Grouping transportation orders into clusters based on longitude and latitude
  • Grouping product into categories based on their embeddings
Label Propagation If the plurality of your neighbors all bear the label X, then you should label yourself as also a member of X
  • Anti-fraud, detect fraud rings that is a group of applications/accounts that share common PII info
  • Detect user communities in a social network, at later stage, one could extract features of each user communities for marketing purposes
Speaker-Listener Label Propagation Variant of the Label Propagation algorithm that is able to detect overlapping communities
Louvain Partitions the vertices in a graph by approximately maximizing the graph's modularity score
Local Cluster Coefficient (LCC) local clustering coefficient = Number_trangles/((n-1)n/2)
  • Anti phone call fraud, how many pairs of your callees called each other
Triangle Counting Count the number of triangles in the graph
  • Evaluate the cohesiveness of communities

2.2.1. Connected Components

2.2.2. K Core

2.2.3. K Means

2.2.4. Label Propagation

2.2.5. Local Cluster Coefficient

2.2.6. Louvain

2.2.7. Speaker-Listener Label

2.2.8. Propagation

2.2.9. Triangle Counting

2.3. GraphML/Embeddings

Name Mechanism Use Case
FastRP It generates node embeddings of low dimensionality through random projections from the graph’s adjacency matrix to a low-dimensional matrix
  • Use the embeddings as machine Learning features
  • Use the embeddings for similarity algorithms
Node2Vec Uses random walks in the graph to create a vector representation of a node

2.3.1. FastRP

2.3.2. Node2Vec

2.4. Path

Name Mechanism Use Case
Astar Shortest Path A shortest path algorithm guided by a heuristic function that estimates the cost of the cheapest path
  • Route planning. Planning the best route for delivering an order based on train schedules
  • Flight search. Planning a series of flights that take least amount of travelling time
  • Anti-fraud. Analyse relationship between a new credit card application and a blacklisted application
Shortest Path Return either the shortest or top k shortest unweighted/weighted paths connecting a set of source vertices to a set of target vertices
Cycle Detection Find all the cycles (loops) in a graph
  • AML, find the transaction loops
Estimated Diameter The worst-case length of a shortest path between any pair of vertices in a graph
  • Understand the interconnectedness of the graph to support following analytics, e.g. to choose small world version of WCC or standard version
Maxflow Finding a feasible flow through a flow network that obtains the maximum possible flow rate
  • AML, find the maximum money flow between two accounts
  • Supply chain, find the maximum supply flow from one site to another
Minimum Spanning Tree A minimum spanning tree is a set of edges that can connect all the vertices in the graph with the minimal sum of edge weights
  • Travelling salesman problem
Minimum Spanning Forest A set of minimum spanning trees, one for each WCC

2.4.1. Astar_shortest_path

2.4.2. BFS

2.4.3. Cycle_detection

2.4.4. Estimated_diameter

2.4.5. Maxflow

2.4.6. Minimum_spanning_forest

2.4.7. Minimum_spanning_tree

2.4.8. Shortest Path

2.5. Classification

Name Mechanism Use Case
Greedy Graph Coloring Assigns a unique integer value known as its color to the vertices of a graph such that no neighboring vertices share the same color
  • Find the maximum tasks that can be performed together without conflicting with each other
  • Find the maximum of vertexes that can execute some logic in parallel without conflicting with each other
Maximal Independent Set An independent set of vertices does not contain any pair of vertices that are neighbors

2.5.1. Greedy Graph Coloring

2.5.2. Maximal Independent Set

2.6. Similarity

Name Mechanism Use Case
Cosine Similarity Calculates cosine similarity based on topological common attributes
  • Find similar users, patients
  • User purchased item A also purchased product B
Jaccard Similarity Calculates jaccard similarity based on topological common attributes
Approximate Nearest Neighbors Calculate the approximate nearest neighbors with similarity score based on non-topological attribute values
K Nearest Neighbors Based on similarity scores makes a prediction for every vertex whose label is not known
  • Anti-fraud, predict fraud labels

2.6.1. Cosine

2.6.2. Jaccard

2.6.3. K Nearest Neighbors

2.6.4. Approximate Nearest Neighbors

Name Mechanism Use Case
Adamic Adar Number of shared links between two nodes
  • Collaborative filtering for recommender systems
Common Neighbors Number of common neighbors between two vertices
Preferential Attachment Product of the number of neighbors of the two vertices
Resource Allocation Reciprocal of number of neighbors of common neighbors
Same Community Returns 1 if they two vertices are in the same community, and returns 0 if they are not in the same community
Total Neighbors Total number of neighbors of two vertices

2.7.1. Adamic Adar

2.7.2. Common Neighbors

2.7.3. Preferential Attachment

2.7.4. Resource Allocation

2.7.5. Same Community

2.7.6. Total Neighbors

2.8. Frequent Pattern Mining

Name Mechanism Use Case
Apriori Returns top k most frequent sequential patterns
  • Application UX improvements
  • Customer churn prediction
  • Bad rating cause analysis

2.8.1. Aprior