power-law: Erdös, Bacon, and a whole lot of Ego

On The Man

A legendary personality known for his eccentricities, Paul Erdös was a Hungarian mathematician who published more papers than any other mathematician in history, working with hundreds of collaborators over the years. Perhaps the only other mathematician comparable in terms of sheer prolificacy was Leonhard Euler; Erdös published more papers, but Euler published more pages. A large part of his later life was spent living out of a suitcase, and writing papers with those of his colleagues willing to give him room and board.

Indeed, Erdös worked on problems in graph theory, number theory, probability theory, and many other theories, which I'd rather not enumerate. Suffice it to say that so enormous was his output, his friends created the idea of the Erdös Number, as a humorous tribute to the man. It is no surprise that this number has become well known in scientific circles as a tongue-in-cheek measurement of mathematical prominence.

On The Erdös Number

To be assigned an Erdös number, an author must co-write a research paper with an author with a finite Erdös number. Paul Erdös has an Erdös number of zero. Anybody else's Erdös number is k + 1 where k is the lowest Erdös number of any coauthor.

Erdös wrote around 1,500 mathematical articles in his lifetime, mostly co-written. He had 511 direct collaborators; these are the people with Erdös number 1. The people who have collaborated with them (but not with Erdös himself) have an Erdös number of 2 (9267 people as of 2010), those who have collaborated with people who have an Erdös number of 2 (but not with Erdös or anyone with an Erdös number of 1) have an Erdös number of 3, and so forth. A person with no such coauthorship chain connecting to Erdös has an Erdös number of infinity (or an undefined one).

On Other Numbers & Collaboration Distances

If one were therefore to construct a graph using the procedure described above (based on coauthorship), then the result would be what can be called a collaboration graph. The distance between 2 people (or nodes) in the graph is then the collaboration distance.

Indeed, why just stop at Paul Erdös? We can now consider the collaboration graph of movie actors, also known as the Hollywood graph, where 2 actors share an edge if they've starred in a movie together. For this graph, an analog of the Erdös number would be the Bacon number, which measures the collaboration distance to Kevin Bacon (image).

A nifty gadget for the calculation of one's collaboration distance (and Erdös number) can be found here. Incidentally, the Erdös number of the author of this post is 5 (courtesy: the gadget previously indicated).

[Paul Erdös -- Douglas Stinson -- Palash Sarkar -- Sanjay Burman -- Prof. DM -- Yours Truly]

More interestingly, his collaboration distance from Albert Einstein is 7! Not too shabby, I'd say :)

On Egocentric Networks

This brings us to the idea behind "egocentric networks". Social scientists often talk about people's egocentric networks - a fancy name for the cloud of friends and acquaintances that you have, which if diagrammed would have you at the center with edges connecting you to other people in your life. So, the Erdös collaboration graph can be construed and analyzed as the egocentric network of Paul Erdös. Such analyses, however, are most commonly used in the fields of psychology or social pyschology, ethnographic kinship analysis or other genealogical studies of relationships between individuals.

To better understand the idea, we need some definitions.

The "Ego" is an individual "focal" node. A network has as many egos as it has nodes. Egos can be persons, groups, organizations, or whole societies.
The nodes to whom the ego is connected are called the "Alters".
The "Neighborhood" is the collection of the ego and all nodes to whom the ego has a connection at some path length. In social network analysis, the "neighborhood" is almost always one-step; that is, it includes only the ego and its alters. The neighborhood also includes all of the ties among all of the actors to whom ego has a direct connection. The boundaries of ego networks are defined in terms of neighborhoods.
The "N-step neighborhood" expands the definition of the size of the ego's neighborhood by including all nodes to whom ego has a connection at a path length of N, and all the connections among all of these actors. Neighborhoods of greater path length than 1 (i.e. egos adjacent nodes) are rarely used in social network analysis. When we use the term neighborhood here, we mean the one-step neighborhood.

That's all folks ~ !

power-law

Tuesday, February 7, 2012

Erdös, Bacon, and a whole lot of Ego

No comments:

Post a Comment