🌌 Method to partition large networks into communities

adriacabeza/GraphClustering

🌌 Graph Clustering into communities

We use the following graphs from the Stanford Network Analysis Project (SNAP, http://snap.stanford.edu/data/index.html): ca-GrQc, Oregon-1, roadNet-CA, soc-Epinions1, and web-NotreDame. The project description is in project.pdf and the final report is in report.pdf.

Initial example visualization and clustering of the graph ca-GrQc

Kamada-Kawai visualization of the ca-GrQc graph, and its clustering using the spectral embedding.
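To make the idea of clustering with a spectral embedding concrete, here is a minimal sketch on a toy graph. It assumes an unnormalized Laplacian L = D - A, a dense eigendecomposition, and a two-way split by the sign of the second-smallest eigenvector. The function names are hypothetical and this is not necessarily how graph_clustering.py does it; the real datasets would need sparse matrices and a sparse eigensolver.

```python
import numpy as np

def spectral_embedding(edges, n, eig_kept=2):
    """Eigenvectors of the unnormalized Laplacian L = D - A for the
    `eig_kept` smallest eigenvalues, one row per vertex."""
    A = np.zeros((n, n))
    for u, v in edges:
        A[u, v] = A[v, u] = 1.0          # undirected, unweighted
    L = np.diag(A.sum(axis=1)) - A
    # eigh handles symmetric matrices and returns eigenvalues in ascending order
    _, vecs = np.linalg.eigh(L)
    return vecs[:, :eig_kept]

def bipartition(edges, n):
    """Split the vertices in two by the sign of the second-smallest eigenvector."""
    fiedler = spectral_embedding(edges, n, eig_kept=2)[:, 1]
    return (fiedler > 0).astype(int)

# Toy graph (an assumption for illustration): two triangles joined by one edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
labels = bipartition(edges, 6)
print(labels)  # the two triangles end up in different clusters
```

For more than two clusters, the usual approach is to keep several eigenvectors and run a clustering algorithm such as k-means on the rows of the embedding.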

Statistics of graph datasets

Graph           #vertices   #edges    #clusters
ca-GrQc         4158        13428     2
Oregon-1        10670       22002     5
soc-Epinions1   75877       405739    10
web-NotreDame   325729      1117563   20
roadNet-CA      1957027     2760388   50
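Vertex and edge counts like these can be computed directly from an edge list. The sketch below assumes the SNAP text format (comment lines starting with #, then one whitespace-separated pair of node ids per line) and treats the graph as undirected with self-loops ignored. graph_stats is a hypothetical helper; the repository's own loader and input format may differ.

```python
def graph_stats(lines):
    """Return (number of vertices, number of undirected edges) for an edge list."""
    vertices = set()
    edges = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue                      # skip blank lines and comments
        u, v = line.split()[:2]
        vertices.update((u, v))
        if u != v:                        # ignore self-loops
            edges.add(frozenset((u, v)))  # (u, v) and (v, u) count once
    return len(vertices), len(edges)

# Tiny made-up sample, not taken from any SNAP file.
sample = ["# FromNodeId\tToNodeId", "0\t1", "1\t0", "1\t2", "2\t2"]
print(graph_stats(sample))
```

To use it on a file, pass an open file handle, for example `graph_stats(open(path))`, where `path` points to a downloaded edge list.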

Run it

Requirements

Python 3 is required. Install the dependencies with:

pip install -r requirements.txt

Recommendations

Using virtualenv is recommended to isolate the packages and the runtime.

Usage

Run the clustering algorithm from the main Python file, graph_clustering.py. You can read the help text for the arguments, and find example commands in EXPERIMENTS.sh. List of arguments:

  • seed: Random seed.
  • iterations: Number of iterations, each run with a different seed.
  • file: Path of the input graph file.
  • outputs_path: Path where the outputs are saved.
  • clustering: Clustering algorithm to use: "kmeans", "custom_kmeans", "kmeans_sklearn", "xmeans" or "agglomerative".
  • random_centroids: Use random centroid initialization for "custom_kmeans".
  • distance_metric: Distance metric for "custom_kmeans": "MINKOWSKI", "CHEBYSHEV" or "EUCLIDEAN".
  • compute_eig: Compute the eigenvectors, or load previously computed ones.
  • k: Number of desired clusters.
  • networkx: Use the networkx library to build the Laplacian.
  • eig_kept: Number of eigenvectors kept.
  • normalize_laplacian: Normalize the Laplacian.
  • invert_laplacian: Invert the Laplacian.
  • second: Use only the second-smallest eigenvector.
  • eig_normalization: Normalize the eigenvectors by "vertex", by "eig", or not at all ("None").
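As an illustration of two of these options, the sketch below shows what the three distance metrics compute and one plausible reading of the eig_normalization modes. It assumes that "vertex" scales each row of the embedding (one row per vertex) to unit length and that "eig" scales each column (one column per eigenvector). The function names and the Minkowski order p are hypothetical; the script's actual behavior may differ.

```python
import numpy as np

def distances(x, centroids, metric="EUCLIDEAN", p=3):
    """Distance from point x to every centroid (one centroid per row)."""
    diff = np.abs(centroids - x)
    if metric == "EUCLIDEAN":
        return np.sqrt((diff ** 2).sum(axis=1))
    if metric == "CHEBYSHEV":
        return diff.max(axis=1)                    # largest coordinate difference
    if metric == "MINKOWSKI":
        return (diff ** p).sum(axis=1) ** (1.0 / p)  # p is an assumed parameter
    raise ValueError(f"unknown metric: {metric}")

def normalize_embedding(emb, mode="vertex"):
    """Scale rows ("vertex") or columns ("eig") of the embedding to unit norm."""
    if mode == "vertex":
        norms = np.linalg.norm(emb, axis=1, keepdims=True)
    elif mode == "eig":
        norms = np.linalg.norm(emb, axis=0, keepdims=True)
    else:                                          # "None": leave unchanged
        return emb
    return emb / np.where(norms == 0, 1.0, norms)  # avoid division by zero

x = np.array([0.0, 0.0])
centroids = np.array([[3.0, 4.0], [1.0, 1.0]])
nearest = int(np.argmin(distances(x, centroids, "CHEBYSHEV")))
rows = normalize_embedding(centroids, "vertex")
```

In a k-means assignment step, each embedded vertex would be assigned to the centroid at index `nearest` under the chosen metric.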

Authors

👤 Álvaro Orgaz Expósito (alvarorgaz)

👤 Adrià Cabeza (adriacabeza)
