Barabasi-Albert Algorithm for Generating Scale-Free Graphs

In summary, the conversation discusses the implementation of the Barabasi-Albert model in C++ and the expected scale-free degree distribution. However, the resulting degree distribution was exponential instead of power-law, leading to doubts about whether the algorithm was misunderstood. The conversation also delves into the calculation of k_tot and the possibility of errors in the algorithm.
  • #1
dholbach
7
0
I've been working on implementing the Barabasi-Albert model in C++. Barabasi-Albert networks are supposed to be scale-free -- that is, their degree distribution is supposed to be power-law distributed. In order to test whether my program was working correctly, I plotted the degree distribution from a network with a total of N=30,000 nodes. Unfortunately, I seem to have done something very wrong.

The resulting degree distribution from my program had a very nice, very clean exponential distribution and not a power-law distribution. On a linear-log plot, the degree distribution was a straight line for several orders of magnitude. While I'd be happy to share the relevant snippet of code at some point, I wanted to check to see if I've misunderstand BA algorithm at some important point.

1. Create an initial completely connected graph with m0 nodes. I used m0=10 nodes because my understanding is that m0 should be small compared to N. By "completely connected", I mean a graph where every node is connected exactly once to every other node. So, for example, node 0 is connected to 9 other nodes: 1, 2, ..., 9, node 1 is also connected to 9 other nodes: 0, 2, 3, ..., 9, and so on.
2. Create a new node not yet connected to any other node -- call this new node n.
3. Select a node i in the original connected graph.
4. Draw a random number r from a uniform distribution on [0,1].
5. Set [itex] p = k_i/k_{tot} [/itex], where [itex] p = k_i [/itex] is the number of edges connected to node i and [itex] k_{tot} [/itex] is the sum of [itex] k_j [/itex] for all edges in the original connected graph. In other words, [itex] k_{tot} [/itex] is twice the number of edges in the entire graph.
6. If r < p, connect n to i. That is, add i to n's adjacency list and add n to i's adjacency list.
7. Repeat steps 2-6 until the graph has N = 30,000 nodes.

After completing steps 1-7, I have my program output the number of edges connected to every node. Then I use an awk script to construct a histogram. The resulting distribution is exponential and not power-law; what gives? Have I misunderstood the BA algorithm at some point?
 
Physics news on Phys.org
  • #2
Maybe I'm tired and not thinking straight, but is k_tot really twice the number of edges in the entire graph? If m_0 = 10, then at the beginning any node n_i is connected to 9 edges. Since you have 10 nodes and each at the beginning is connected to 9 edges, k_tot is 9 * 10 = 90. Then k_tot is 10 times the number of edges in the whole graph in the beginning scenario, not twice the number. So your base case assumption is flawed, unless I'm messing up on my definitions. Did you implement the incorrect k_tot in an algorithm in this way (counting total no. of edges and doubling it) instead of manually?

Apologies if this is me misinterpreting something, my graph theory is really rusty.
 
  • #3
ModestyKing -- Thanks for your reply!

In my code, k_tot is calculated by summing the degrees of every node. So, even if you were right, it wouldn't matter.

At any rate, the reason that k_tot is 2x the number of edges is because the edges are counted twice -- once at each node. Suppose we have a completely connected graph with five nodes. Each node is connected to four other nodes. So if we sum the edges over all of the nodes, we begin with node 0 and count its four connections -- to nodes 1, 2, and 3. Now we move to node 1 and count it's three connections -- to nodes 0, 2, and 3. But we've already counted the edge that connects node 0 to node 1, so the edge from node 0 to node 1 is counted twice. When we sum the number of connections over all of the nodes in the entire graph, we end up counting every edge twice, since we count any given edge at two nodes. That's why k_tot is twice the number of edges.
 

FAQ: Barabasi-Albert Algorithm for Generating Scale-Free Graphs

What is the Barabasi-Albert algorithm?

The Barabasi-Albert algorithm is a mathematical model used to generate scale-free networks, which are networks where the distribution of connections follows a power law. This algorithm was developed by physicist Albert-László Barabási and his colleagues in the late 1990s.

How does the Barabasi-Albert algorithm work?

The algorithm starts with a small number of nodes and then adds new nodes one at a time. Each new node is connected to a certain number of existing nodes, with the probability of being connected to a particular node being proportional to that node's current degree (number of connections). This leads to the formation of hubs, or nodes with a high number of connections, which is a defining characteristic of scale-free networks.

What is a scale-free network?

A scale-free network is a type of network where the distribution of connections follows a power law. This means that a few nodes have a large number of connections, while most nodes only have a few connections. This type of network is also known as a "rich get richer" network, as the highly connected nodes tend to attract more connections.

What are some applications of the Barabasi-Albert algorithm?

The Barabasi-Albert algorithm has been used in various fields such as social networks, biology, and computer science. It has been used to study the spread of diseases, the structure of the internet, and the formation of friendships in social networks. It has also been used to design efficient routing algorithms for computer networks.

What are the limitations of the Barabasi-Albert algorithm?

One limitation of the algorithm is that it assumes that new nodes can only be connected to existing nodes, which may not always be the case in real-world networks. Additionally, the algorithm does not take into account factors such as geographic constraints or node attributes, which may be important in certain types of networks. It also cannot generate networks with a specific degree distribution, as the resulting networks will always have a power law distribution.

Similar threads

Replies
8
Views
2K
Replies
3
Views
3K
Replies
1
Views
2K
Replies
1
Views
1K
Back
Top