Centrality Measures ( Node Importance)

Centrality Measures ( Node Importance)#

Not all nodes are equal. Some nodes connect many others; some nodes act as bridges and some nodes are closer to everyone.

Centrality measures quantify how important or influential a node is within a graph (or network).

Key insight: Different centrality metrics answer different questions about importance. A node can rank #1 on one metric and #5 on another.

We will cover three foundational measures:

Measure	Core question	Analogy
Degree centrality	How many direct connections?	Popularity
Closeness centrality	How quickly can you reach everyone?	Efficiency
Betweenness centrality	How often do you sit on the shortest path between others?	Broker / gatekeeper

Illustration of different centrality roles in a network. Source: towardsdatascience.com

1. Degree Centrality#

The simplest measure. Measures importance based on number of direct connections.

A node’s degree centrality is its number of edges divided by the maximum possible edges.

Example: who has the most connections? Alice and Bob tie in our example. In a social network, a person with many friends has high degree centrality

\[C_D(v) = \frac{\deg(v)}{N - 1}\]

where N is the total number of nodes.

High degree → directly connected to many nodes → “popular” or “well-networked”
Fast to compute: O(V + E)
Limitation: ignores the quality/importance of connections (a node connected to 10 isolated nodes vs. 10 hubs)

2. Closeness Centrality#

Measures how close a node is to all other nodes. It focus on how quickly a node can reach all others (based on shortest path). A node is close-central if its average shortest path to every other node is short.

Example: who can reach everyone else most quickly on average? A node that can quickly reach everyone (few hops) has high closeness.

\[C_C(v) = \frac{N - 1}{\sum_{u \neq v} d(v, u)}\]

where \(d(v, u)\) is the shortest path length between nodes v and u.

Nodes with high closeness are closer (shorter paths) to all others.Source: www.ebi.ac.uk

High closeness → information/influence spreads fast from this node
In a social network: the person who hears gossip first, or spreads news fastest
In a supply chain: the warehouse that can ship to any destination quickest
Complexity: O(V × (V + E)) using BFS from each node

3. Betweenness Centrality#

Measures how often a node acts as a bridge along the shortest path between two other nodes.

How often a node lies on shortest paths between other nodes. Identifies bridge nodes

Example: who is connected to other well-connected nodes? This is the core idea behind Google’s PageRank. A person connecting two groups has high betweenness

\[C_B(v) = \sum_{s \neq v \neq t} \frac{\sigma_{st}(v)}{\sigma_{st}}\]

where:

\(\sigma_{st}\) = total number of shortest paths from s to t
\(\sigma_{st}(v)\) = number of those paths that pass through v

Notes: The normalized version divides by \((N-1)(N-2)/2\) so scores fall in [0, 1]. But in this course, we are not doing it.

High betweenness → gatekeeper / broker — controls information flow
Removing a high-betweenness node disrupts the network most
Used in: fraud detection (unusual intermediaries), infrastructure resilience, influence campaigns
Complexity: O(VE) (Brandes algorithm)

When to Use Which?#

Degree → quick measure of connectivity
Closeness → efficiency of communication
Betweenness → control over information flow
Eigenvector → influence in the network

Connection to Featurization#

Centrality measures are node-level features used in:

Recommendation systems
Social network analysis
Fraud detection
Graph-based machine learning

These features help models understand node importance and role

import networkx as nx

G = nx.Graph()
G.add_edges_from([
    ("You","Alice"),
    ("You","Charlie"),
    ("Alice","Bob"),
    ("Bob","David"),
    ("Charlie","Eve")
])

print("Degree:", nx.degree_centrality(G))
print("Closeness:", nx.closeness_centrality(G))
print("Betweenness:", nx.betweenness_centrality(G))
print("Eigenvector:", nx.eigenvector_centrality(G))

Degree: {'You': 0.4, 'Alice': 0.4, 'Charlie': 0.4, 'Bob': 0.4, 'David': 0.2, 'Eve': 0.2}
Closeness: {'You': 0.5555555555555556, 'Alice': 0.5555555555555556, 'Charlie': 0.45454545454545453, 'Bob': 0.45454545454545453, 'David': 0.3333333333333333, 'Eve': 0.3333333333333333}
Betweenness: {'You': 0.6000000000000001, 'Alice': 0.6000000000000001, 'Charlie': 0.4, 'Bob': 0.4, 'David': 0.0, 'Eve': 0.0}
Eigenvector: {'You': 0.5211200923556694, 'Alice': 0.5211200923556694, 'Charlie': 0.41790694813836154, 'Bob': 0.41790694813836154, 'David': 0.23192160753344954, 'Eve': 0.23192160753344954}