Saturday, March 31, 2012

Study of Two-Species Interactions Using the Lotka-Volterra Model

An ecological network is a representation of the biotic interactions in an ecosystem, in which species (nodes) are connected by pairwise interactions (links). Species compete, evolve and disperse in the struggle for the resources that sustain their existence. Depending on the specific setting, these interactions can take the form of resource-consumer, plant-herbivore, or parasite-host relationships, among others. When seemingly competitive interactions are examined carefully, they often turn out to be some form of predator-prey interaction in disguise. Predator-prey interaction models are thus the building blocks of large and complex ecological networks.

Generic Predator-Prey Model:
Consider two populations whose sizes at a reference time t are denoted by x(t), y(t) respectively. The functions x and y might denote population numbers or concentrations (number per area) or some other scaled measure of the population sizes, but are taken to be continuous functions. Changes in population size with time are described by the time derivatives dx/dt and dy/dt respectively, and a general model of interacting populations is written in terms of two autonomous differential equations:

dx/dt = x f(x, y)
dy/dt = y g(x, y)

The functions f and g denote the respective per capita growth rates of the two species. It is assumed that ∂f/∂y < 0 and ∂g/∂x > 0: prey growth is depressed by predators, while predator growth is boosted by prey. This general model is often called Kolmogorov's predator-prey model.

Fig: An Example of Periodic activity generated by Predator-Prey Model.

Lotka-Volterra Model:
In 1926, the famous Italian mathematician Vito Volterra proposed a differential equation model to explain the observed increase in predator fish (and corresponding decrease in prey fish) in the Adriatic Sea during World War I. At the same time in the United States, the equations studied by Volterra were derived independently by Alfred Lotka (1925) to describe a hypothetical chemical reaction in which the chemical concentrations oscillate. The Lotka-Volterra model is the simplest model of predator-prey interactions. It is based on linear per capita growth rates, which are written as

f = b - py
g = rx - d

  • The parameter b is the growth rate of x (the prey) in the absence of interaction with species y (the predators). Prey numbers are diminished by these interactions: the per capita growth rate decreases (here linearly) with increasing y, possibly becoming negative.
  • The parameter p measures the impact of predation on (dx/dt)/x.
  • The parameter d is the death (or emigration) rate of species y in the absence of interaction with species x.
  • The term rx denotes the net rate of growth (or immigration) of the predator population in response to the size of the prey population.
The Prey-Predator model with linear per capita growth rates is

dx/dt = (b - py)x
dy/dt = (rx - d)y

This system is referred to as the Lotka-Volterra model and represents one of the earliest models in mathematical ecology.
Solving these equations analytically requires sophisticated mathematical techniques. On top of that, many differential equations have no closed-form solution at all. A more practical approach is to use numerical methods such as Euler's method.
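As a concrete illustration, the system can be integrated with Euler's method in a few lines. The parameter values below are illustrative, not taken from any particular dataset:

```python
import numpy as np

def lotka_volterra_euler(b, p, r, d, x0, y0, dt=0.001, steps=20000):
    """Integrate dx/dt = (b - p*y)*x, dy/dt = (r*x - d)*y with Euler's method."""
    xs, ys = [x0], [y0]
    x, y = x0, y0
    for _ in range(steps):
        # Right-hand side is evaluated at the old (x, y) before updating both.
        x, y = x + dt * (b - p * y) * x, y + dt * (r * x - d) * y
        xs.append(x)
        ys.append(y)
    return np.array(xs), np.array(ys)

# Illustrative parameters: the equilibrium is at x* = d/r = 3, y* = b/p = 2,
# and trajectories starting elsewhere cycle around it.
xs, ys = lotka_volterra_euler(b=1.0, p=0.5, r=0.2, d=0.6, x0=4.0, y0=2.0)
```

Note that Euler's method slowly inflates the closed orbits of this system, so for quantitative work a higher-order scheme (e.g. Runge-Kutta) with a small step is preferable.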

Fig: Example showing cycles formed by population dynamics of two species generated from the model

This model is not very realistic. It does not consider any competition among prey or among predators. As a result, the prey population may grow without bound in the absence of resource limits. Predators have no saturation: their consumption rate is unlimited, and the rate of prey consumption is simply proportional to prey density. Thus, it is not surprising that the model's behavior is unnatural, showing no asymptotic stability. However, numerous modifications of this model exist that make it more realistic.

Wednesday, March 28, 2012

Trawling the Web

The World Wide Web, along with its pages and hyperlinks can be viewed as a directed graph with nodes and edges. It is a fascinating object, and demands careful study. The graph contains over a million nodes, and more than a billion interconnecting links, and appears to grow exponentially with time. In order to study this graph, we need to first develop algorithms that tackle problems such as Web Search, Automatic Community Identification on a graph of such dimensions. It is important to note that these algorithms must be driven by the presence of certain structures in the Web graph. It should also be noted that simple Random graph models do not satisfactorily explain many properties of the web graph because they are not structure-driven. This article explains modified models that better explain such properties.

Trawling the Web for Cyber-Communities

Algorithms such as the HITS algorithm, essentially are search algorithms designed to find high-quality pages (authorities) about a fixed topic. How should one extend such algorithms if the need is to enumerate all topics (under a certain definition)?

We first need to carefully define what is meant by a topic in the web graph. One can easily see that each topic present in the web graph structure can be detected by looking for complete bipartite graphs Ki,j in the graph, as shown in the figure given below.

We define a bipartite core Ci,j to be a graph on i+j nodes that contains Ki,j as a subgraph. This notion is motivated by the fact that for any well-represented topic on the Web, for appropriate values of i and j, there will exist a bipartite core for it in the Web graph. The given figure shows a C4,3 in which the nodes on the left connect to the home pages of major aircraft manufacturers. It is subgraphs like these that represent cyber-communities; in this case, the nodes on the left represent aficionados of commercial aircraft manufacturers. These nodes create hub-like pages, which co-cite the authority-like pages on the right.

It is important to note that the hub pages may not co-cite all the authority pages in such cyber-communities, but every such community will contain a bipartite core Ci,j for non-trivial values of i and j. This process of enumerating all possible bipartite cores in the web for some fixed values of i and j is called trawling.

A natural question that arises is: for small values of i and j, do the cores that are found actually represent cyber-communities, or are they due to coincidental citations? Experiments on the web graph suggest that only a minuscule number of such bipartite cores are coincidental. But how does one actually run such experiments on a web-sized graph? Naïve search algorithms do not scale up to this size.

An algorithmic paradigm called the elimination-generation paradigm is used. The algorithm makes a number of passes over the Web graph. The paradigm is explained below:

Elimination: There are necessary conditions that must be satisfied for a node to participate in a subgraph that can form a bipartite core. For example, in the case of C4,4, any node with in-degree less than or equal to 3 cannot participate on the right side of such a core. Thus, edges directed into such nodes can be pruned from the graph. Similarly, nodes with out-degree less than or equal to 3 cannot participate on the left side. Such necessary conditions form elimination filters.

Generation: Here we consider nodes that barely qualify for potential membership in an interesting subgraph, since it can easily be verified whether they belong to the subgraph or not. Consider the example of C3,3. Let u be any node with out-degree exactly 3. Then u can belong to such a subgraph if and only if the 3 nodes it points to have a neighborhood intersection of size at least 3. This property forms what we call a generation filter. After applying such a filter, regardless of whether we find a core or not, we can prune the node, since all potentially interesting cores containing it have already been enumerated.

The above mentioned algorithm makes several passes through the web graph, eliminating nodes in each pass. In most of the experiments, the benefits of elimination/generation tail off as fewer and fewer nodes are eliminated at each phase, and the algorithm stops after some time.
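A minimal single-machine sketch of the elimination phase (the generation phase is omitted) might look like the following; the function and the edge-set representation are illustrative, not the original trawling implementation, which streamed the graph from disk:

```python
def elimination_prune(edges, i, j):
    """Iteratively apply the elimination filters for C_{i,j} cores.

    edges: set of directed (u, v) pairs.  A fan (left-side) node needs
    out-degree >= j and a center (right-side) node needs in-degree >= i,
    so edges leaving weak fans and edges entering weak centers are
    removed until a fixed point is reached.
    """
    edges = set(edges)
    changed = True
    while changed:
        changed = False
        outdeg, indeg = {}, {}
        for u, v in edges:
            outdeg[u] = outdeg.get(u, 0) + 1
            indeg[v] = indeg.get(v, 0) + 1
        survivors = {(u, v) for (u, v) in edges
                     if outdeg[u] >= j and indeg[v] >= i}
        if survivors != edges:
            edges, changed = survivors, True
    return edges

# A K_{2,2} core plus one dangling edge: pruning for C_{2,2} removes
# the dangling edge and leaves the core intact.
core = {("a", "x"), ("a", "y"), ("b", "x"), ("b", "y")}
pruned = elimination_prune(core | {("c", "x")}, 2, 2)
```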

It must be noted that bipartite cores are not the only interesting subgraphs which constitute enumeration problems, we can also look at bidirectional stars (webrings), cliques, directed trees etc. Experiments run on the web graph help us study some of the local properties of the web.

Degree Distribution:

It was observed that both the in-degrees and out-degrees of the nodes follow a Zipfian (power-law) distribution, with the probability that a node has degree i being proportional to 1/i^alpha. A question that arises is: can such properties be effectively reproduced by the random graph model used to model the web graph?

Number of Bipartite Cores

Trawling experiments run on a crawl of the web containing about 100 million pages reveal how the number of bipartite cores found varies with the size of the left side of the core (i) and with the size of the right side of the core (j). Again a question arises: can such properties be effectively captured by a random graph model of the web?

The above properties lead us to question the simple random graph model to model the web graph. A modified model is required to accurately incorporate such properties.


1. "The Web as a Graph: Measurements, Models and Methods", Kleinberg et al.

Saturday, March 17, 2012

Fortune sometime favors the Poor

In complex networks, the frequent use of terms like 'preferential attachment' and 'assortativity' can convince readers that there is no place for the "poor" in this world. But if things really worked that way, no one could ever become rich from scratch. Fortunately, there is also randomness, so that the poor eventually get a little bit of luck, turn things around, and survive in this competitive world.
In their paper, Pennock et al. (2002) brought out an important observation: although the Web as a whole shows power-law characteristics, with a relatively small number of sites receiving a disproportionately large share of hyperlink references and traffic, microscopic observation shows that the components of the WWW follow something closer to a log-normal degree distribution, reflecting a mixture of "preferential" and "uniform" attachment.
Several studies (Albert et al., 1999; Broder et al., 2000) find that the probability that a randomly selected web page has k links is proportional to k^(-alpha) for large k, with the constant alpha empirically determined to be roughly 2.1 for inbound links and 2.72 for outbound links. When displayed on a log-log plot, this so-called power-law distribution appears linear with slope -alpha. A power-law distribution has a heavy tail, which drops off much more slowly than the tail of a Gaussian distribution. As a result, although the vast majority of web pages have relatively small numbers of links, a few pages have enormous numbers of links—enough to skew the mean well above the median. But if we consider small fragments of the Internet, a somewhat different picture emerges. Looking at the results of Pennock et al. (2002), the degree distributions of individual components (company networks, university networks, the power grid, etc.) follow some kind of mixture distribution instead of a pure power-law degree distribution.

                                                                         Figure 1

It is an open question exactly how peaked distributions for subsets of the web like those in Figure 1 sum together to produce the nearly pure power law for the web as a whole. There must be complementary effects across the fragments of the Internet that build up the ultimate power-law structure. Pennock et al. (2002) suggested that the vast majority of subsets (or subsets containing the vast majority of pages) exhibit a nearly zero mode and dominate this sum. This portion needs to be investigated further.
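To get a feel for how mixing preferential and uniform attachment shapes the degree distribution, here is a toy growth simulation. The details (one link per new node, a single mixing parameter alpha) are my own simplification for illustration, not the exact model of Pennock et al.:

```python
import random

def grow_network(n_nodes, alpha, seed=0):
    """Grow a directed network in which each new node makes one link whose
    target is chosen preferentially (in proportion to in-degree) with
    probability alpha, and uniformly at random with probability 1 - alpha.
    Returns the in-degree of every node."""
    rng = random.Random(seed)
    indeg = [0, 1]   # two seed nodes with one link between them
    targets = [1]    # multiset of past link targets, for preferential picks
    for _ in range(2, n_nodes):
        if rng.random() < alpha:
            t = rng.choice(targets)        # rich get richer
        else:
            t = rng.randrange(len(indeg))  # anyone can get lucky
        indeg[t] += 1
        targets.append(t)
        indeg.append(0)
    return indeg

mostly_preferential = grow_network(5000, alpha=0.9, seed=1)
mostly_uniform = grow_network(5000, alpha=0.1, seed=1)
```

With alpha near 1 the in-degree distribution develops the heavy, power-law-like tail; with alpha near 0 it stays much more even, so the maximum in-degree is far smaller.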

Through a broad survey of observations given by Pennock et al., I would like to share my opinion:
1. In the real world, it may not be the case that any node can get an arbitrary link from any other node; purely theoretical randomness is unlikely in real-world graphs. Think of a real example like building a new website: someone links his newly launched web page to the most popular sites like Twitter or Facebook (preferential attachment), and also to some other sites which are, I think, somehow related to him (what is called here random attachment), i.e., their profiles are strongly semantically related. The relation may be one of the following: the field of interest (research area) is similar to that of the linking page, the profiles match in some way (colleagues or partners), or he was attached to that site previously (say, the institute where he studied earlier). So studying the profiles or background information of both the linking and linked nodes may be a very useful way to uncover the actual cause of the apparent randomness.
2. Another important observation lies slightly beyond this study. If we take a careful look at the power-law curve, we see that it has a heavy tail which drops off much more slowly. If we identify the nodes of high degree, we can see that some of them are spammers. Nowadays, spammers gain rank by creating several web pages and recursively linking them together. It is also observed that some highly popular nodes in social networks, like famous politicians and actors, point to spammers only to gain more and more popularity. Now, if we identify the spammers, delete them from the graph, penalize the high-degree nodes for pointing to them, and redistribute that penalized rank to the authentic web pages pointing to those high-degree nodes, the low-degree nodes may become richer.

So things are not as easy as they look. Over time, I will try to edit this blog if any ideas come up for describing how the different curve-like structures of the components aggregate into the overall power-law curve. I request all readers to post any suggestions they have about this peculiarity, along with any other observations they have noticed.

 "winner's don't take it all: characterizing the competitions for links in web", Pennock et al. (2002).


Sunday, March 11, 2012

Small Worlds and Large Diameters

A small-world network is characterized by a low average path length between any two nodes, and high clustering. Most real networks, including biological networks like protein interaction networks, are known to have this property. Since random networks are also known to have low diameter, one might think that real-world networks have even shorter diameters than their random counterparts. The reason to expect this is that short diameters can potentially increase the efficiency with which a network exchanges mass and/or information, not only in biological networks (e.g. the metabolic network) but also in transportation, communication and computer networks. However, studies at the Department of Ecology and Evolutionary Biology, University of Michigan have shown that the observed diameter in real-world networks is greater than the random expectation. The following are some of the salient observations of this study.
Real networks have greater-than-expected diameters
The diameter of each real network was compared to that of its randomly rewired counterparts, in which the connections between nodes were randomized while the degree of every node remained unchanged. A total of 13 real networks were examined, including two linguistic, three technological, four social, and four biological networks, and all were found to have diameters greater than their random expectations (Table 1). An analysis of the frequency distribution of diameters showed that the greater-than-expected diameters are not caused by the presence of a few extremely long paths, but by the presence of many elongated minimal paths. Overall, the diameters of real networks were found to be 2.3–128.3% greater than their random expectations, with a median difference of 17.4%. Albert and Barabási had also noted earlier that many real networks have longer diameters than those computed under the power-law degree distribution.
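The degree-preserving randomization used as the null model can be sketched with double-edge swaps. This is a generic sketch of the standard rewiring procedure, not the study's actual code:

```python
import random

def degree_preserving_rewire(edges, n_swaps, seed=0):
    """Randomize an undirected graph while keeping every node's degree fixed.

    Repeatedly pick two edges (a,b) and (c,d) and replace them with
    (a,d) and (c,b), skipping swaps that would create a self-loop or a
    duplicate edge."""
    rng = random.Random(seed)
    edges = [tuple(e) for e in edges]
    edge_set = set(frozenset(e) for e in edges)
    for _ in range(n_swaps):
        i1, i2 = rng.randrange(len(edges)), rng.randrange(len(edges))
        (a, b), (c, d) = edges[i1], edges[i2]
        if len({a, b, c, d}) < 4:
            continue  # shared endpoint: swap would make a self-loop
        if frozenset((a, d)) in edge_set or frozenset((c, b)) in edge_set:
            continue  # swap would duplicate an existing edge
        edge_set -= {frozenset((a, b)), frozenset((c, d))}
        edge_set |= {frozenset((a, d)), frozenset((c, b))}
        edges[i1], edges[i2] = (a, d), (c, b)
    return edges

# Example: rewire a 10-node ring; every degree stays exactly 2.
ring = [(k, (k + 1) % 10) for k in range(10)]
rewired = degree_preserving_rewire(ring, n_swaps=200, seed=1)
```

Comparing the diameter of a real network against the distribution of diameters over many such rewired samples gives the "random expectation" referred to in the study.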

Table 1
Real networks have greater-than-expected clustering coefficients
Small-world networks have high clustering coefficients; we have all seen this in the case of protein interaction networks. In 12 of the 13 real networks examined (all except the dolphin association network), the clustering coefficient C was found to be greater than the expected value determined from its randomly rewired networks, and this difference is statistically significant in 7 cases (Table 2). However, there is no consistent relationship between clustering coefficient and diameter among the randomly rewired networks of each real network (Table 2), suggesting that the greater-than-expected diameters of real networks cannot be explained by the greater-than-expected clustering coefficients. Furthermore, across the 13 real networks, the correlation between the expected C and the expected D from their randomly rewired networks is negative (Spearman's R = −0.57, P = 0.047), while the correlation between the Z-score for C and the Z-score for D is not significant (Spearman's R = −0.13, P = 0.68).

Table 2
Network modularization enlarges the diameter
If shorter diameters are beneficial to at least some networks, why do all these networks have longer diameters than expected by chance? It was hypothesized that this phenomenon might be related to modularity (Newman and Girvan) in networks. Simulations were conducted to find the relation between diameter and modularity for networks with a fixed number of nodes and edges. It was observed that the network diameter increases as the modularity increases in these simulated networks (Fig 1a). But the relationship between diameter and modularity is not linear; when the diameter is short, a small percentage increase in diameter allows a substantial percentage increase in modularity (Fig 1a). A similar concave curve is observed when the increases in diameter and modularity are measured by Z-scores rather than absolute values (Fig 1b). Using a similar simulation, the relationship between modularity and diameter was also confirmed for networks with a fixed number of modules but different mean degrees.

If modularization is truly the cause of the higher-than-expected diameters of real networks, all real networks should have modularities greater than expected from their randomly rewired networks. This is indeed the case (Table 1). The percentage excess in modularity (compared to the random expectation) ranges from 4.8 to 206.9% for the 13 networks, with a median of 40.8%. This percentage excess exceeds that for diameter in 10 of the 13 networks, a nonrandom pattern that is consistent with the simulation result in Fig 1 (P = 0.046, one-tail binomial test).

 Fig 1
All networks studied, including biological networks, have diameters greater than their random expectations. This suggests that modularization may well be a universal characteristic of real networks, due to the advantages it brings to network multi-functionality, robustness and evolvability. As a consequence, the network diameter has to be sacrificed to accommodate modular structures. Because shorter diameters could provide higher functional efficiency, the result suggests a tradeoff between network efficiency and multi-functionality, robustness, and/or evolvability. Although there are many networks unstudied in this work, the analysis covers the major types of networks and the results are likely to reflect a general pattern of real networks. It would be an interesting study to look for those rare real-world networks for which the diameter is actually less than the random expectation and examine how they differ from real-world networks in general.

[1] Zhang Z, Zhang J (2009) A Big World Inside Small-World Networks. PLoS ONE 4(5): e5686. doi:10.1371/journal.pone.0005686

Wednesday, March 7, 2012

What is a Random Graph?

A random graph, denoted by Gn,p, consists of n vertices, with each possible edge between two vertices present with independent probability p. Often one wishes to express properties of Gn,p not in terms of p but in terms of the average degree z of a vertex. The average number of edges in the graph as a whole is (1/2)n(n-1)p, and the average number of ends of edges is twice this, since each edge has two ends. So the average degree of a vertex is

z = n(n-1)p / n = (n-1)p ≈ np

where the last approximate equality is good for large n.

Random Graph Vs Real-world Network

Real-world networks show strong clustering, or network transitivity, whereas the random graph model does not. A network is said to show clustering if the probability of two vertices being connected by an edge is higher when the vertices in question have a common neighbor. Watts and Strogatz measured this clustering by defining a clustering coefficient C, which is the average probability that two neighbors of a given vertex are also neighbors of one another. In many real-world networks the clustering coefficient is found to have a high value, anywhere from a few percent to 50 percent or even more. In a random graph, on the other hand, the probabilities of vertex pairs being connected by edges are by definition independent, so there is no greater probability of two vertices being connected if they have a mutual neighbor than if they do not. This means that the clustering coefficient for a random graph is simply C = p, or equivalently C ≈ z/n.
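The claim C = p ≈ z/n is easy to check by simulation; a minimal sketch (function name and parameter values are mine):

```python
import random

def clustering_of_gnp(n, p, seed=0):
    """Sample G(n,p) and measure the mean clustering coefficient: the
    fraction of pairs of neighbors of a vertex that are themselves linked,
    averaged over all vertices of degree >= 2."""
    rng = random.Random(seed)
    adj = [set() for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    total, counted = 0.0, 0
    for u in range(n):
        nbrs = list(adj[u])
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for a in range(k) for b in range(a + 1, k)
                    if nbrs[b] in adj[nbrs[a]])
        total += 2.0 * links / (k * (k - 1))
        counted += 1
    return total / counted

c = clustering_of_gnp(600, 0.05, seed=2)  # expect c close to p = 0.05
```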

Degree Distribution
The probability pk that a vertex in a random graph has degree exactly k is given by the binomial distribution

pk = C(n-1, k) p^k (1-p)^(n-1-k)

In the limit where n >> kz, this becomes

pk = z^k e^(-z) / k!

which is the well-known Poisson distribution. Both the binomial and Poisson distributions are strongly peaked about the mean z, and have a large-k tail that decays rapidly as 1/k!. We can compare these predictions to the degree distributions of real networks by constructing histograms of the degrees of vertices in the real networks.
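The binomial-to-Poisson claim can be checked numerically, assuming nothing beyond the G(n,p) definition above (sizes and seed are arbitrary):

```python
import math, random

def gnp_degree_counts(n, p, seed=0):
    """Sample G(n,p) and return the degree of every vertex."""
    rng = random.Random(seed)
    deg = [0] * n
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                deg[u] += 1
                deg[v] += 1
    return deg

def poisson_pmf(k, z):
    return math.exp(-z) * z**k / math.factorial(k)

n, p = 2000, 0.002                 # average degree z = (n-1)p, about 4
deg = gnp_degree_counts(n, p, seed=42)
z = sum(deg) / n
empirical_p4 = deg.count(4) / n    # fraction of vertices with degree 4
```

The empirical fraction of degree-4 vertices should land close to the Poisson value z^4 e^(-z)/4!, in sharp contrast to the power-law histograms of real networks.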

As the figure shows, in most cases the degree distribution of a real network follows a power law, which is very different from the Poisson distribution.

Small World Effect

Random graphs having non-Poisson degree distributions can be generated by specifying a degree sequence, i.e., a specified set {ki} of the degrees of the vertices i = 1...n. Typically this set will be chosen in such a way that the fraction of vertices having degree k tends to the desired degree distribution pk as n becomes large. Once one has one's degree sequence, the method for generating the graph is as follows: one gives each vertex i a number ki of "stubs", i.e., ends of edges emerging from the vertex, and then one chooses pairs of these stubs uniformly at random and joins them together to make complete edges. When all stubs have been used up, the resulting graph is a random graph with the desired degree sequence.
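The stub-matching construction can be sketched directly. Note that, as in the standard construction, self-loops and multi-edges may occur:

```python
import random

def configuration_model(degree_sequence, seed=0):
    """Build a random (multi)graph with the given degree sequence: create
    k_i stubs for vertex i, shuffle all stubs, and pair them up in order.
    The sum of the degrees must be even."""
    assert sum(degree_sequence) % 2 == 0, "need an even number of stubs"
    rng = random.Random(seed)
    stubs = [i for i, k in enumerate(degree_sequence) for _ in range(k)]
    rng.shuffle(stubs)
    return [(stubs[j], stubs[j + 1]) for j in range(0, len(stubs), 2)]

# Vertex i ends up with exactly degree_sequence[i] edge-ends.
edges = configuration_model([3, 3, 2, 2, 1, 1], seed=7)
```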
The mean number of neighbors of a randomly chosen vertex A in a graph with degree distribution pk is

z = <k> = sum_k k pk

The probability distribution of the degree of the vertex to which an edge leads is proportional to k pk, since an edge is k times more likely to lead to a vertex of degree k than to a vertex of degree 1. The number of edges emerging from that vertex other than the one we arrived along is one less than its total degree, and its correctly normalized distribution qk is therefore

qk = (k+1) p_{k+1} / sum_j j pj

or, equivalently,

qk = (k+1) p_{k+1} / <k>

The average degree of such a vertex is then

sum_k k qk = (<k^2> - <k>) / <k>    ......... (1)

This is the average number of vertices two steps away from our vertex A via a particular one of its neighbors. Multiplying this by the mean degree of A, which is just z = <k>, we thus find that the mean number of second neighbors of a vertex is

z2 = <k^2> - <k>    ......... (2)

We can extend this calculation to further neighbours also. The average number of edges leading from each second neighbour, other than the one we arrived along, is also given by (1), and indeed this is true at any distance m away from vertex A. Thus the average number of neighbours zm at distance m is

zm = (z2 / z1) z_{m-1}

where z1 = z = <k> and z2 is given by Eq. (2). Iterating this equation we then determine that

zm = (z2 / z1)^(m-1) z1    ......... (3)

Depending on whether z2 is greater than z1 or not, this expression will either diverge or converge exponentially as m becomes large, so that the average total number of neighbors of vertex A at all distances is finite if z2 < z1 or infinite if z2 > z1 (in the limit of infinite n).
Thus the graph shows a phase transition similar to that of the random graph precisely at the point where z2 = z1.

Making use of Eq. (2) and rearranging, we find that this condition is also equivalent to

<k^2> - 2<k> = 0

or, as it is more commonly written,

sum_k k (k - 2) pk = 0    ......... (4)
The quantity of interest in many networks is the typical distance through the network between pairs of vertices. If we are "below" the phase transition of Eq. (4), in the regime where there is no giant component, then most pairs of vertices will not be connected to one another at all, so vertex-vertex distance has little meaning. On the other hand, "above" the transition, where there is a giant component, all vertices in this giant component are connected by some path to all others. Eq. (3) tells us the average number of vertices a distance m away from a given vertex A in the giant component. When the total number of vertices within distance m is equal to the size n of the whole graph, m is equal to the so-called "radius" r of the network around vertex A. Indeed, since z2/z1 >> 1 well above the transition, the number of vertices at distance m grows quickly with m in this regime, which means that most of the vertices in the network will be far from A, around distance r, and r is thus also approximately equal to the average vertex-vertex distance l. Well above the transition, therefore, l is given approximately by z_l ≈ n, or

l = ln(n/z1) / ln(z2/z1) + 1    ......... (5)

For the special case of the random graph, for which z1 = z and z2 = z^2 as noted above, this expression reduces to the well-known standard formula

l = ln n / ln z
The important point to notice about Eq. (5) is that the vertex-vertex distance increases logarithmically with the graph size n, i.e., it grows rather slowly. Even for very large networks we expect the typical distance through the network from one vertex to another to be quite small. In social networks this effect is known as the small-world effect, and was famously observed by the experimental psychologist Stanley Milgram in the letter-passing experiments he conducted in the 1960s (Milgram, 1967; Travers and Milgram, 1969; Kleinfeld, 2000).
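As a sanity check of this logarithmic scaling, one can compare the formula l ≈ ln n / ln z for the Poisson random graph against the mean vertex-vertex distance measured by breadth-first search on a sampled graph. The sizes and seed below are arbitrary:

```python
import math, random
from collections import deque

def gnp_adjacency(n, p, rng):
    """Sample G(n,p) as an adjacency list."""
    adj = [[] for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].append(v)
                adj[v].append(u)
    return adj

def mean_distance(adj):
    """Average shortest-path length over all reachable ordered pairs (BFS)."""
    total, pairs = 0, 0
    for s in range(len(adj)):
        dist = {s: 0}
        q = deque([s])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    q.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

n, z = 500, 6.0
adj = gnp_adjacency(n, z / (n - 1), random.Random(3))
predicted = math.log(n) / math.log(z)   # l = ln n / ln z
measured = mean_distance(adj)
```

For n = 500 and z = 6 the prediction is about 3.5, and the BFS measurement lands nearby, illustrating how slowly l grows with n.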

[1] M. E. J. Newman, "Random graphs as models of networks", Santa Fe Institute, 1399 Hyde Park Road, Santa Fe, NM 87501, U.S.A.

Monday, March 5, 2012

Network Analysis of Natural Signals

As we know, a sound source generates waves of different frequencies at the same time, and the quality of a sound depends on the combination of these frequencies. If we look at the following spectrogram of a 19th-century recording of a male voice, we find that certain frequencies occur only together.

This means there is a certain correlation between the different frequencies produced by a source. In "Collective behavior of stock price movements in an emerging market" (Pan and Sinha, PRE 76, 046116, 2007), the authors analyzed time-series data of stock prices to find correlations between different stocks. We can find the exact correlations between frequencies by a similar network analysis of a .wav file in the following steps:
  1. Read the wave file in MATLAB using wavread and generate the spectrogram using the Short-Time Fourier Transform.
  2. We will get a matrix P containing the Power Spectral Density (PSD) of each segment. We can find the correlation matrix for the different frequencies using the corr() function in MATLAB on P.
  3. We can find the eigenvalues and eigenvectors of this correlation matrix using the eig() function.
  4. Let us say N is the number of frequencies and T is the length of each time series. If the N time series of length T are mutually uncorrelated, then the resulting random correlation matrix is called a Wishart matrix, whose statistical properties are well known [1]. In the limit N -> infinity, T -> infinity with Q = T/N >= 1 fixed, the eigenvalue distribution of this random correlation matrix is given by P(lambda) = (Q/2pi) sqrt((lambda_max - lambda)(lambda - lambda_min)) / lambda for lambda_min <= lambda <= lambda_max, and 0 otherwise. The bounds are given by lambda_max,min = (1 ± 1/sqrt(Q))^2.
  5. In the absence of any correlation, all eigenvalues should lie between lambda_min and lambda_max. Since we expect correlations here, we keep only the informative eigenvalues: we discard all eigenvalues between lambda_min and lambda_max (the noise band), and the principal eigenvalue (the global mode) as well.
  6. From the remaining eigenvalues and eigenvectors, we construct the filtered correlation matrix C_new.
  7. The adjacency matrix A of C_new is generated by choosing a threshold correlation c_th such that A_ij = 1 if C_new_ij > c_th, and 0 otherwise.
  8. Using A, we can construct a network of frequencies. Community discovery methods on this network can determine which frequencies occur together.
From this kind of network analysis, we can identify the source of a sound by looking at the community structure of its frequencies.
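The steps above can be sketched end to end in Python with NumPy in place of MATLAB. The signal here is synthetic (two pairs of frequencies that always sound together), the frame size and threshold are illustrative, and for brevity the eigenvector-filtering step 6 is skipped, thresholding the raw correlation matrix instead:

```python
import numpy as np

fs, frame, n_frames = 8000, 256, 200
rng = np.random.default_rng(0)
t = np.arange(frame) / fs

# Synthetic "recording": 500 Hz and 1500 Hz sound together in even frames,
# 1000 Hz and 2000 Hz in odd frames, plus a little noise.
frames = []
for m in range(n_frames):
    if m % 2 == 0:
        x = np.sin(2 * np.pi * 500 * t) + np.sin(2 * np.pi * 1500 * t)
    else:
        x = np.sin(2 * np.pi * 1000 * t) + np.sin(2 * np.pi * 2000 * t)
    frames.append(x + 0.05 * rng.standard_normal(frame))

# Steps 1-3: PSD per frame, correlations across time, eigenvalues.
P = np.abs(np.fft.rfft(np.array(frames), axis=1)) ** 2  # (n_frames, n_bins)
C = np.corrcoef(P.T)                                    # frequency x frequency
eigvals = np.linalg.eigvalsh(C)

# Step 4: Wishart (random-matrix) bounds for purely random correlations.
N, T = P.shape[1], n_frames
Q = T / N
lam_min, lam_max = (1 - 1 / np.sqrt(Q)) ** 2, (1 + 1 / np.sqrt(Q)) ** 2

# Step 7: threshold the correlations into an adjacency matrix.
c_th = 0.8
A = (C > c_th).astype(int)
np.fill_diagonal(A, 0)

bin_of = lambda f: f * frame // fs  # e.g. 500 Hz -> FFT bin 16
```

Co-occurring frequencies (500/1500 Hz and 1000/2000 Hz) end up linked in A, while anti-correlated pairs do not, and the largest eigenvalue of C sits well above lam_max, signaling genuine correlation structure.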

References -
1. A. M. Sengupta and P. P. Mitra, Phys. Rev. E 60, 3389 (1999)

Sunday, March 4, 2012

Diversity-Stability Theory

Theoretical models suggest that there could be multiple relationships between diversity and stability, depending on how we define stability. Stability can be defined at the ecosystem level — for example, a rancher might be interested in the ability of a grassland ecosystem to maintain primary production for cattle forage across several years that may vary in their average temperature and precipitation.

Figure 1 shows how having multiple species present in a plant community can stabilize ecosystem processes if species vary in their responses to environmental fluctuations such that an increased abundance of one species can compensate for the decreased abundance of another. Biologically diverse communities are also more likely to contain species that confer resilience to that ecosystem because as a community accumulates species, there is a higher chance of any one of them having traits that enable them to adapt to a changing environment. Such species could buffer the system against the loss of other species.

Figure 1
Each rectangle represents a plant community containing individuals of either blue or green species and the total number of individuals corresponds to the productivity of the ecosystem. Green species increase in abundance in warm years, whereas blue species increase in abundance in cold years such that a community containing only blue or green species will fluctuate in biomass when there is interannual climate variability. In contrast, in the community containing both green and blue individuals, the decrease in one species is compensated for by an increase in the other species, thus creating stability in ecosystem productivity between years. 

In contrast, if stability is defined at the species level, then more diverse assemblages can actually have lower species-level stability. This is because there is a limit to the number of individuals that can be packed into a particular community, such that as the number of species in the community goes up, the average population sizes of the species in the community goes down. For example, in Figure 2, each of the simple communities can only contain three individuals, so as the number of species in the community goes up, the probability of having a large number of individuals of any given species goes down. The smaller the population size of a particular species, the more likely it is to go extinct locally, due to random — stochastic — fluctuations, so at higher species richness levels there should be a greater risk of local extinctions. Thus, if stability is defined in terms of maintaining specific populations or species in a community, then increasing diversity in randomly assembled communities should confer a greater chance of destabilizing the system.

Figure 2
In this case, these communities are so small that they can only contain 3 individuals. For example, this could be the case for a small pocket of soil on a rocky hillslope. There are 3 potential species that can colonize these communities — blue, dark green, and light green — and for the sake of this example let’s assume that the blue species has traits that allow it to survive prolonged drought. Looking at all possible combinations of communities containing 1, 2 or 3 species, we see that, as the number of species goes up, the probability of containing the blue species also goes up. Thus, if hillslopes in this region were to experience a prolonged drought, the more diverse communities would be more likely to maintain primary productivity, because of the increased probability of having the blue species present.
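The probability argument in Figure 2 can be checked directly by enumerating all possible species combinations at each richness level (a minimal sketch using the three hypothetical species from the figure):

```python
from itertools import combinations
from fractions import Fraction

species = ["blue", "dark_green", "light_green"]

# For each richness level k, enumerate all possible k-species communities
# and compute the fraction that include the drought-tolerant blue species.
for k in (1, 2, 3):
    comms = list(combinations(species, k))
    with_blue = sum(1 for c in comms if "blue" in c)
    print(k, Fraction(with_blue, len(comms)))  # prints 1/3, then 2/3, then 1
```

As the figure describes, the probability of containing the blue species rises from 1/3 in one-species communities to certainty in the three-species community.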

Cedar Creek Biodiversity Experiment

This experiment examines the effects of plant species numbers and functional traits on community and ecosystem dynamics and functioning. It manipulates plant diversity in 168 plots, each 9 m x 9 m, by planting 1, 2, 4, 8, or 16 perennial grassland species per plot. The species planted in a plot were randomly chosen from a pool of 18 species (4 species each of C4 grasses, C3 grasses, legumes, and non-legume forbs, plus 2 species of woody plants). Its high replication (about 35 plots at each level of diversity) and large plots allow observation of responses of herbivorous, parasitoid and predator insects, and allow additional treatments to be nested within plots. Planted in 1994, it has been sampled annually since 1996 for plant aboveground biomass, plant species abundances, and insect diversity and species abundances. Root mass, soil nitrate, light interception, biomass of invading plant species, and C and N levels in soils, roots, and aboveground biomass have been determined periodically. In addition, soil microbial processes, abundances of mycorrhizal fungi, soil bacteria and other fungi, N mineralization rates, patterns of N uptake by various species, and invading plant species have been periodically measured in subprojects of the Biodiversity Experiment.

Key Results

Plant biomass production increased with diversity because of complementary interactions among species, and not because of selection (sampling) effects.

Foliar fungal disease incidence decreased at higher diversity because of greater distance between individuals of a species, and resultant lower rates of disease spread.

Greater plant diversity led to greater diversity of herbivorous insects, and this effect continued up the food web to predator and parasitoid insects.

Fewer novel plant species invaded higher diversity treatments because of their lower soil NO3 levels, greater neighborhood crowding and competition, and a greater chance that functionally similar species would occur in a given neighborhood.

Greater plant species numbers led to greater ecosystem stability (lower year-to-year variation in total plant biomass) but to lower species stability (greater year-to-year variation in abundances of individual species), with the stabilizing effect of diversity mainly attributable to statistical averaging and overyielding effects.
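The statistical averaging part of this result can be illustrated with a toy simulation (all numbers hypothetical): when species fluctuate independently, the year-to-year variability of total community biomass, measured by its coefficient of variation, shrinks as more species are added.

```python
import random, math

random.seed(7)

def total_cv(n_species, years=5000):
    """Coefficient of variation of total biomass for a community of
    n independent species with equal mean biomass (toy numbers)."""
    totals = []
    for _ in range(years):
        totals.append(sum(random.uniform(0, 10) for _ in range(n_species)))
    mean = sum(totals) / years
    var = sum((t - mean) ** 2 for t in totals) / years
    return math.sqrt(var) / mean

for n in (1, 4, 16):
    print(n, round(total_cv(n), 3))  # CV shrinks roughly as 1/sqrt(n)
```

Because the mean of the total grows like n while its standard deviation grows like sqrt(n), the relative variability of the whole ecosystem declines even though each individual species remains just as variable.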

Using Egocentric Networks for Content Distribution in Wireless DTNs

When we come across the term "Egocentric Network", we often start thinking of theoretical concepts like collaboration graphs and Erdos Numbers. I recently came across this very interesting practical application of Egocentric Networks in the area of wireless content distribution, which I would like to share with all of you.

The growing popularity of handheld devices equipped with wireless interfaces such as cellular networks, WiFi and Bluetooth, along with the increase in their storage capacity, has resulted in a new communication paradigm - Delay Tolerant Content Distribution - where each individual carrying a device (node) can receive some content at a certain time, store it, and then forward it to others at a later time. This store-and-forward mechanism can be made more efficient by using the concept of Egocentric Networks.

The nodes are ranked mainly based on their Egocentric Betweenness. Other parameters, such as the pattern of meetings between nodes, connection quality, and content accuracy (identifying the nodes that provide interesting data), are also taken into consideration. The Egocentric Betweenness Bi of a node i is defined as the number of pairs of neighbors of i that are not directly connected to each other. Due to the dynamic nature of wireless Delay Tolerant Networks, Bi is averaged over time to give the Average Egocentric Betweenness. Nodes with a higher average egocentric betweenness are given higher ranks, as they have a larger influence in the network: many nodes depend on them to make connections with other nodes.
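The definition of Bi translates directly into a few lines of code. A minimal sketch, using a small made-up contact graph (the adjacency structure is hypothetical, not from the paper):

```python
from itertools import combinations

def egocentric_betweenness(adj, i):
    """B_i = number of pairs of i's neighbors with no direct link,
    for an undirected graph given as a dict of neighbor sets."""
    neighbors = adj[i]
    return sum(1 for u, v in combinations(neighbors, 2) if v not in adj[u])

# Toy contact graph: node 0 bridges node 3 to the pair {1, 2}.
adj = {
    0: {1, 2, 3},
    1: {0, 2},
    2: {0, 1},
    3: {0},
}
print(egocentric_betweenness(adj, 0))  # pairs (1,3) and (2,3) are unlinked -> 2
```

Node 0 has the highest egocentric betweenness here because two pairs of its neighbors can only reach each other through it, which is exactly why such nodes are ranked highly as relays.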

The content exchange process works as follows - each node sends a content request to the highest-rank node in its locality that has content of interest to it. Locality here can be defined as the N-step neighborhood in the egocentric network of the node, for some empirical value of N. Suppose node A requests content and node B receives the request. B may receive multiple requests; if so, it accepts the request of the highest-rank requesting node and rejects the others. If B accepts the request, A starts the download from B. The content is thus made available to A, which stores it and can forward it to other nodes on request. If B rejects the request, A sends the request to the next highest-rank node, and so on. In this way the content is efficiently distributed among all nodes that are interested in it.
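One round of this accept/reject logic can be sketched as follows. This is a toy model of the idea only, not the actual OnMove protocol; the node names and rank values are invented for illustration, and rejected requesters would retry with their next-ranked provider in a later round.

```python
def serve_requests(requests, rank):
    """One round: each provider accepts only its highest-ranked requester.
    requests is a list of (requester, provider) pairs; rank maps node -> rank."""
    accepted, rejected = {}, []
    by_provider = {}
    for requester, provider in requests:
        by_provider.setdefault(provider, []).append(requester)
    for provider, reqs in by_provider.items():
        winner = max(reqs, key=lambda r: rank[r])  # highest-rank requester wins
        accepted[provider] = winner
        rejected += [r for r in reqs if r != winner]
    return accepted, rejected

rank = {"A": 1, "C": 3, "B": 5}
# A and C both ask B for content; B accepts the higher-ranked C.
acc, rej = serve_requests([("A", "B"), ("C", "B")], rank)
print(acc, rej)  # {'B': 'C'} ['A']
```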

This type of distribution mechanism can be particularly useful for advertising, where the final objective is to achieve maximum dissemination, while taking the interest of the users into consideration.

[1] R. Cuevas, E. Jaho, C. Guerrero, I. Stavrakakis: OnMove: A protocol for Content Distribution in Wireless Delay Tolerant Networks based on Social Information

Friday, March 2, 2012

The Bernoulli distribution is a discrete probability distribution, named after the Swiss scientist Jacob Bernoulli, that takes the value 1 with probability p and the value 0 with probability q = 1 - p. A Bernoulli trial has a random outcome that can be either success or failure. The binomial distribution arises when there is more than one such event, each event is independent, and all events share the same success probability p (so the failure probability is 1 - p). A sequence of n such trials with success probability p is called a binomial experiment, denoted B(n, p). The Bernoulli distribution is a special case of the binomial distribution and a member of the exponential family.
If there are n independent random variables, each a Bernoulli trial with the same success probability p, then their sum X follows the binomial distribution:

P(X = k) = C(n, k) p^k (1 - p)^(n - k),   for k = 0, 1, ..., n

and Binomial(1, p) is exactly the Bernoulli distribution.

A probability mass function (PMF) is a function that gives the probability that a discrete random variable equals a particular value. For the experiment B(n, p), the binomial PMF gives the probability of exactly k successes. Bernoulli experiments also lead to many other distributions, such as the negative binomial and the geometric distribution.

Bernoulli probability mass function:

f(k; p) = p^k (1 - p)^(1 - k),   for k in {0, 1}

which can also be written as:

f(k; p) = p if k = 1, and f(k; p) = 1 - p if k = 0

Binomial probability mass function:

f(k; n, p) = C(n, k) p^k (1 - p)^(n - k),   for k = 0, 1, ..., n

where C(n, k) = n! / (k! (n - k)!) is the binomial coefficient.

Poisson probability mass function:

f(k; λ) = (λ^k e^(-λ)) / k!,   for k = 0, 1, 2, ...

The Poisson distribution is actually the binomial distribution under the limiting conditions n → ∞ and p → 0, with np = λ held constant. Under these restrictions, the limiting form of the binomial distribution is the Poisson distribution.
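This limit can be checked numerically. A small sketch (using a toy value λ = 3, chosen only for illustration) comparing the binomial PMF with the Poisson PMF as n grows while np stays fixed:

```python
from math import comb, exp, factorial

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return lam ** k * exp(-lam) / factorial(k)

lam = 3.0
for n in (10, 100, 10000):
    p = lam / n  # keep np = lambda fixed as n grows
    print(n, binom_pmf(2, n, p), poisson_pmf(2, lam))
```

As n increases, the binomial probability of 2 successes converges to the fixed Poisson value, illustrating why the Poisson distribution is a good model for rare events counted over many opportunities.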
Some properties of networks can be analysed through these distributions. The Poisson distribution is especially useful in counting problems, for example modelling vehicle movement on roads, the accident records of a city, or deciding where to open a new medical centre for faster response.