Graph Algorithms Overivew
by aaronchenwei
1. Algorithm Types
- Centrality
- Indicators of centrality assign numbers or rankings to nodes within a graph corresponding to their network position
- Classification
- Classify the graph into sets
- Community
- A network is said to have community structure if the nodes of the network can be easily grouped into sets of nodes such that each set of nodes is densely connected internally
- GraphML/Embeddings
- A graph embedding is an approach converting neighborhood topology into a fixed size vector of decimal values
- Path
- Identify path between vertexes and path analytics
- Similarity
- Computes similarity between pairs of items
- Topological Link Prediction
- Link prediction is the problem of predicting the existence of a link between two entities in a network
- Frequent Pattern Mining
- An analytical process that finds frequent patterns
2. Algorithms
2.1. Centrality
Name | Mechanism | Use Case |
---|---|---|
Eigenvector | A vertex's influence is based on its indegree and the influence of the vertices which refer to it. |
|
PageRank | Variant of Eigenvector. Adding additional jump probability | |
Article Rank | Variant of PageRank. Lowers the influence of low-degree nodes | |
Closeness | Find vertexes that have shorter distance to all other nodes |
|
Harmonic | Variant of Closeness Centrality. Reverses the sum and reciprocal operations in the definition of closeness centrality |
|
Degree | Variant of Closeness Centrality. Reverses the sum and reciprocal operations in the definition of closeness centrality |
|
Betweenness | Variant of Closeness Centrality. Reverses the sum and reciprocal operations in the definition of closeness centrality |
|
Influence Maximization | Variant of Closeness Centrality. Reverses the sum and reciprocal operations in the definition of closeness centrality |
|
2.1.1. PageRank
2.1.2. Article Rank
2.1.3. Betweenness
2.1.4. Closeness
2.1.5. Degree
2.1.6. Eigenvector
2.1.7. Harmonic
2.1.8. Influence Maximization
2.2. Community
Name | Mechanism | Use Case |
---|---|---|
Weakly Connected Components (WCC) | Every vertex is reachable from every other vertex through an undirected path |
|
Strongly Connected Components | Every vertex is reachable from every other vertex through an directed path |
|
K Core | Maximal connected subgraph in which every vertex is connected to at least k vertices in the subgraph |
|
K-means | Partition vertexes into k clusters in which each vertex belongs to the cluster with the nearest mean |
|
Label Propagation | If the plurality of your neighbors all bear the label X, then you should label yourself as also a member of X |
|
Speaker-Listener Label Propagation | Variant of the Label Propagation algorithm that is able to detect overlapping communities | |
Louvain | Partitions the vertices in a graph by approximately maximizing the graph's modularity score | |
Local Cluster Coefficient (LCC) | local clustering coefficient = Number_trangles/((n-1)n/2) |
|
Triangle Counting | Count the number of triangles in the graph |
|
2.2.1. Connected Components
2.2.2. K Core
2.2.3. K Means
2.2.4. Label Propagation
2.2.5. Local Cluster Coefficient
2.2.6. Louvain
2.2.7. Speaker-Listener Label
2.2.8. Propagation
2.2.9. Triangle Counting
2.3. GraphML/Embeddings
Name | Mechanism | Use Case |
---|---|---|
FastRP | It generates node embeddings of low dimensionality through random projections from the graph’s adjacency matrix to a low-dimensional matrix |
|
Node2Vec | Uses random walks in the graph to create a vector representation of a node |
2.3.1. FastRP
2.3.2. Node2Vec
2.4. Path
Name | Mechanism | Use Case |
---|---|---|
Astar Shortest Path | A shortest path algorithm guided by a heuristic function that estimates the cost of the cheapest path |
|
Shortest Path | Return either the shortest or top k shortest unweighted/weighted paths connecting a set of source vertices to a set of target vertices | |
Cycle Detection | Find all the cycles (loops) in a graph |
|
Estimated Diameter | The worst-case length of a shortest path between any pair of vertices in a graph |
|
Maxflow | Finding a feasible flow through a flow network that obtains the maximum possible flow rate |
|
Minimum Spanning Tree | A minimum spanning tree is a set of edges that can connect all the vertices in the graph with the minimal sum of edge weights |
|
Minimum Spanning Forest | A set of minimum spanning trees, one for each WCC |
2.4.1. Astar_shortest_path
2.4.2. BFS
2.4.3. Cycle_detection
2.4.4. Estimated_diameter
2.4.5. Maxflow
2.4.6. Minimum_spanning_forest
2.4.7. Minimum_spanning_tree
2.4.8. Shortest Path
2.5. Classification
Name | Mechanism | Use Case |
---|---|---|
Greedy Graph Coloring | Assigns a unique integer value known as its color to the vertices of a graph such that no neighboring vertices share the same color |
|
Maximal Independent Set | An independent set of vertices does not contain any pair of vertices that are neighbors |
2.5.1. Greedy Graph Coloring
2.5.2. Maximal Independent Set
2.6. Similarity
Name | Mechanism | Use Case |
---|---|---|
Cosine Similarity | Calculates cosine similarity based on topological common attributes |
|
Jaccard Similarity | Calculates jaccard similarity based on topological common attributes | |
Approximate Nearest Neighbors | Calculate the approximate nearest neighbors with similarity score based on non-topological attribute values | |
K Nearest Neighbors | Based on similarity scores makes a prediction for every vertex whose label is not known |
|
2.6.1. Cosine
2.6.2. Jaccard
2.6.3. K Nearest Neighbors
2.6.4. Approximate Nearest Neighbors
2.7. Topological Link Prediction
Name | Mechanism | Use Case |
---|---|---|
Adamic Adar | Number of shared links between two nodes |
|
Common Neighbors | Number of common neighbors between two vertices | |
Preferential Attachment | Product of the number of neighbors of the two vertices | |
Resource Allocation | Reciprocal of number of neighbors of common neighbors | |
Same Community | Returns 1 if they two vertices are in the same community, and returns 0 if they are not in the same community | |
Total Neighbors | Total number of neighbors of two vertices |
2.7.1. Adamic Adar
2.7.2. Common Neighbors
2.7.3. Preferential Attachment
2.7.4. Resource Allocation
2.7.5. Same Community
2.7.6. Total Neighbors
2.8. Frequent Pattern Mining
Name | Mechanism | Use Case |
---|---|---|
Apriori | Returns top k most frequent sequential patterns |
|