This is very cool. Cute, interesting, of general interest, and it even mentions food webs! *(Thanks to Steve W. for the link)*

# The Network Structure of Baseball Blogs: Part 1

**03**
*Monday*
May 2010

Posted in Network theory

**30**
*Friday*
Apr 2010

Posted in Network theory
**28**
*Wednesday*
Apr 2010

Posted in Network theory

Gen. Stanley A. McChrystal, the leader of American and NATO forces in Afghanistan, was shown a PowerPoint slide in Kabul…. It certainly would be interesting to apply some network analyses to this! Which components are central? Is the “strategy” highly modular and clustered, or diffuse? What does the degree distribution look like?

p.s. I always knew that PowerPoint was the Enemy.

**21**
*Wednesday*
Apr 2010

Posted in Coral reefs, Network theory

The last post on this topic reported that alpha vertebrate diversity differs among reef communities in the Cayman Islands, Cuba and Jamaica, with the Caymans having the greatest species richness. I also showed that if we consider the reef communities to be random draws from the gamma-level (regional) species pool, we cannot reconstruct food webs with the observed Jamaican connectance. What’s causing this? At least a partial answer is the bias in the degrees of trophic specialization in the communities. Jamaica has greater than expected connectance because it has a relatively greater proportion of generalist species, i.e. those with high in-degrees (many incoming links). Interestingly, if we compare the in-degree distributions of the communities, we find no significant differences (Kolmogorov-Smirnov tests). This first figure illustrates the cumulative frequency functions of each community. The distributions are right-skewed and relatively long-tailed. Although the K-S tests detect no difference, there are asymmetries in the distributions, and it could well be worth decomposing them into pre- and post-modal portions.
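For readers who want to see what the K-S comparison is doing, here is a minimal sketch of the two-sample statistic: the largest vertical gap between two cumulative frequency functions. The in-degree values are made-up placeholders, not the reef data, and the function name is my own. A real analysis would also need the associated p-value, which this sketch does not compute.

```python
from bisect import bisect_right

def ks_statistic(sample_a, sample_b):
    """Largest absolute gap between the two empirical cumulative distributions."""
    a, b = sorted(sample_a), sorted(sample_b)
    gap = 0.0
    # The gap can only change at observed values, so checking those is enough.
    for x in set(a) | set(b):
        cdf_a = bisect_right(a, x) / len(a)  # fraction of a that is <= x
        cdf_b = bisect_right(b, x) / len(b)  # fraction of b that is <= x
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap

# Hypothetical in-degrees for two communities (illustration only).
community_1 = [2, 3, 3, 5, 8, 13, 40]
community_2 = [1, 2, 3, 4, 6, 9, 25]
print(ks_statistic(community_1, community_2))
```

A statistic near zero means the two cumulative curves nearly coincide, while a value of one means the samples do not overlap at all.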

Another way to compare the interaction or link distributions is to look at the properties of those species, present in the regional pool, that are missing from each community. The box plots at left plot the in-link distributions of “missing” vertebrates from each community. There is a clear trend, suggesting again that Jamaica is relatively poor in trophic specialists, while the Caymans are relatively rich, with Cuba in between. K-S tests fail to confirm any significance here, but sample sizes are pretty small, and the tests likely lack power.

A final interesting observation. Given that there are species common to two or more of the communities, we can compare the in-degree distributions of those species only. A series of paired t-tests confirms that species in Jamaica have significantly more incoming interactions than conspecifics in the Caymans and Cuba (Pr(T>t)=0.0004 and 0.0001 respectively). Can this be reconciled with the above observations? This result is telling us that if a species exists in Jamaica and elsewhere, it will have more prey resources in Jamaica! Given that we are recording only vertebrate differences among the communities, it means that these species have **more vertebrate prey resources**. I find this to be very odd, and I’m going to have to wrap my brain around it a bit to explain it. Might be time to decompose those distributions.

**26**
*Friday*
Feb 2010

Posted in Coral reefs, Network theory

The other measure commonly taken of food web networks (in addition to connectance) is the link or degree distribution. Real-world networks, unlike random graphs, rarely have Poisson or normal link distributions, having instead scale-free or power law distributions. The terms scale-free and power law refer to the fact that these distributions lack characteristic scales (see below), and take the general form $P(X) \propto X^{-\gamma}$, where the probability of a value is a power function of the value itself. Scale-free distributions have been found in networks as diverse as the Internet, transportation networks, anatomical circulatory networks, social networks and food webs. There are two features of these distributions that are of importance to food web theory. First, being scale-free means that the distribution has no characteristic scale. Many distributions, for example Poisson or normal distributions, have a characteristic scale, often captured by a peak (or high density region) and measured as a mean or mode. The form of a sample drawn from one of those distributions depends on the range from which it is drawn, whereas the shape of a power law distribution is invariant throughout its range. One part of the distribution may be used to predict another with a simple rescaling of the density (see figure). Therefore, a partial sampling of the range yields an overall view.
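The "simple rescaling" can be checked numerically. The sketch below compares what happens to a power law and to an exponential when the value is doubled. The densities are unnormalized, and the exponent (called `gamma` here) and the decay rate are arbitrary values I picked for illustration.

```python
import math

def power_law(x, gamma=2.0):
    """Unnormalized power law density, x ** -gamma."""
    return x ** -gamma

def exponential(x, rate=1.0):
    """Unnormalized exponential density, exp(-rate * x)."""
    return math.exp(-rate * x)

def rescaling_ratios(density, xs, factor=2.0):
    """Ratio density(factor * x) / density(x) at each x."""
    return [density(factor * x) / density(x) for x in xs]

xs = [1.0, 5.0, 25.0]
power_ratios = rescaling_ratios(power_law, xs)
exp_ratios = rescaling_ratios(exponential, xs)
print(power_ratios)  # the same ratio everywhere along the range
print(exp_ratios)    # the ratio shrinks as x grows
```

For the power law, doubling the value always multiplies the density by the same constant, wherever in the range you look. For the exponential, the effect of doubling depends on where you start, which is what having a characteristic scale means.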

Second, power law distributions are long-tailed decay distributions. The decay of the distribution’s density with increasing X, dictated by the negative exponent $-\gamma$, means that the distribution’s density is concentrated at low values of X. Nevertheless, the long tail means that there is measurable density high in the X range. Contrast this with the exponential distribution in the first figure, which is also a decay distribution, but with a rapidly decaying short tail. A long-tailed link distribution has nodes that are of considerably greater degree than others. These highly linked or hub nodes confer significant resistance to failure of network connectivity. The canonical example is the Internet. Random failure of any single server is unlikely to affect the network broadly because most servers are of low degree (they are drawn from the high density region of P(X)), but there is a high probability that they are linked to high-degree hubs. The network is susceptible, however, to targeted attacks on hubs. This is the now classic work by Barabási and others. It is not clear how long-tailed distributions arise in networks, but models of preferential growth, where new links have a greater probability of being added to already highly linked nodes (the “rich get richer” model), are reasonable hypotheses when applied to flow networks (for example, information, energy) or social networks (the blogosphere, personal relationships).
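Here is a toy version of the “rich get richer” idea, just to show how hubs can emerge. It is a simplified sketch, not a reproduction of any published model: each new node adds a single link, and the starting configuration, function name and seed are my own choices.

```python
import random

def preferential_attachment(n_nodes, seed=0):
    """Grow a network one node at a time; return the degree of every node."""
    rng = random.Random(seed)
    degree = [1, 1]      # start with two nodes joined by one link
    endpoints = [0, 1]   # each node is listed once for every link it holds
    for new_node in range(2, n_nodes):
        # Drawing from the endpoint list picks a node with probability
        # proportional to its current degree.
        target = rng.choice(endpoints)
        degree[target] += 1
        degree.append(1)
        endpoints.extend([target, new_node])
    return degree

degrees = preferential_attachment(2000)
median_degree = sorted(degrees)[len(degrees) // 2]
print(max(degrees), median_degree)
```

Comparing the largest degree with the median gives a quick sense of how lopsided the resulting link distribution is.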

Food webs have been characterized most frequently by their in-link distributions, which are the frequency distributions of the number of prey per consumer species (species in-degree). In-link distributions therefore describe patterns of energy flow in the system, as well as the trophic habits or dietary breadths of the species. Most documented food webs have decaying in-link distributions, which are either scale-free power law distributions, exponentially decaying distributions, or an apparent mixture of the two. Exponentially decaying distributions have greater concentrations of density at low degree. The mixed distributions are best described as mixed exponential-power law distributions of the form

$$P(r) = e^{-r/\varepsilon}$$

where

$$\varepsilon = e^{(\gamma - 1)\ln(M)/\gamma},$$

$\gamma$ is the power law exponent, r is species in-degree, and M is the maximum number of prey species available.

Dunne et al. (2002) examined the link distributions of 16 published food webs, though the survey included both links to (in-links) and from (out-links) predators. They found significant variation among the networks, but most distributions were power law, exponential or uniform. Camacho et al. (2002), in an analysis of six of those same food webs, concluded that trophic in-link distributions in fact follow a universal functional form,

$$P(r) = \frac{1}{2z}\,E_1\!\left(\frac{r}{2z}\right)$$

where z is the coordination number of the network and $E_1$ is the exponential integral function. The above is also a decay function, significantly related to a scaled number of prey, r/2z. The authors derived a value of z=7.5 from the pooled data of the six networks. The distribution itself was derived analytically from an interpretation of the niche model of Williams and Martinez, which has demonstrated some success in describing empirical trophic link distributions. The data are generally aggregated averages of species population distributions, however, and it is not clear to what extent, if any, the niche model actually predicts any underlying community mechanisms, rather than describing those specific parameterized and averaged representations.

The in-link distribution of the Greater Antillean coral reef raises again the issue of species aggregation. The distribution of the guild-level network, where 750 species are aggregated into 265 guilds on the basis of very precise trophic data, is a distinct power law distribution of the form

(The exponent is of particular importance in community resistance to secondary extinction [Roopnarine et al., 2007]; see below). The high resolution of this dataset allows us, however, to also examine the species-level network, for which the distribution is not a decay distribution, but instead has a distinct mode at 36 links (secondary consumers and higher; second figure). The most precise trophic data are available for the vertebrate species in the network, and the vertebrate-only distribution is similar to the overall species-level distribution, though with a mode at 76 links. Clearly the discrepancy between the guild- and species-level distributions is caused by the omission of species richnesses from the aggregated guilds. Thus it remains to be resolved whether communities in fact always have greater proportions of trophic specialists, or whether this pattern is restricted to aggregated data, and whether the pattern occurs naturally at all or is an artifact.

**24**
*Wednesday*
Feb 2010

Posted in Coral reefs, Network theory, Visualization

**Tags**

connectance, coral reef, food webs, Network theory, networks, paleo-food web, real world networks

Another installment in the series (see previous posts on this page).

**System complexity**.– The complexity of a food web depends upon the taxon richness of the system, as well as the topology and dynamics of interspecific interactions. Although richness and topology are captured by graphic depictions, the utility of the depictions is often limited to impressing upon the viewer the overwhelming structural complexity of the systems. For example, here is a Greater Antillean coral reef food web comprising 265 trophic guilds and 4,656 interactions, currently one of the most detailed food web networks available. The system is definitely complicated, as expected of a coral reef community, but not much else can be concluded from the graph. In fact, it is more complicated than illustrated, being based on a dataset comprising 750 species and 34,465 interspecific interactions. Many of the species have been aggregated into sets termed trophic guilds, where members of a guild share prey drawn from the same guild(s), and likewise for predators. Species aggregation is a common way in which to reduce food web network complexity, but there are few formulaic methods for aggregation. The most common method is based on the concept of trophic species (trophospecies), where aggregated species are assumed to have exactly the same prey and predators. The trophic guild concept on the other hand was formulated specifically for fossil taxa and assumes uncertainty in species interactions. It is very important to understand the impacts of aggregation on network structure and dynamics, and the implications for species’ roles in the system. Whether different aggregation schemes yield similar insights into complex systems is currently poorly understood. I will return to this topic in a later post.

**Connectance**.– A number of measures and summary statistics are used to describe and compare food webs, perhaps the most common one being connectance. Food web connectance differs from the graph connectance defined earlier, because the networks are now directional. Each node may link to every other node including itself, but a directional link from species A to B is no longer equivalent to a link from B to A. The maximum number of links possible is therefore the square of the number of nodes. Using symbols common in the food web literature,

$$C = \frac{L}{S^2}$$

where L is the number of directional links in the network, and S is the number of nodes or species. Connectance values are generally well below one, reflecting the relative sparsity of links in food webs, but it is difficult to compare connectances among food webs that use different aggregation schemes. Perhaps given this difficulty, it is quite surprising that there is a regular relationship between L and S spanning a large number of food webs, compiled from a variety of sources, and using different aggregation methods (see also Ings et al.). The exponential nature of the relationship shows that link density, or connectance, increases with increasing node richness. It is possible that increasing taxon richness in a community demands greater connectivity in order to maintain efficient energy transfer and hence stability, or the relationship is simply spurious and any true relationship is obscured by the heterogeneity of food web metadata. This remains, in my opinion, an open problem in food web theory.
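As a quick worked example, directed connectance (links divided by the square of the number of nodes) can be computed for the two versions of the Greater Antillean reef web using the counts quoted above. The function name is my own.

```python
def directed_connectance(links, species):
    """Directed links divided by the maximum possible, species squared."""
    return links / species ** 2

# Counts quoted in the text for the Greater Antillean coral reef web.
guild_level = directed_connectance(4656, 265)      # 265 guilds, 4,656 links
species_level = directed_connectance(34465, 750)   # 750 species, 34,465 links

print(round(guild_level, 4), round(species_level, 4))
```

Both values come out at a little over 0.06, so in this case aggregation into guilds changes connectance only slightly, even though it removes most of the nodes and links.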

**23**
*Tuesday*
Feb 2010

Posted in Graph theory, Network theory

**Tags**

connectance, food webs, graph, interaction strength, Network theory, networks, real world networks, Robustness

Perhaps the most obvious structural element of real food webs that distinguishes them from the graphs presented earlier is the directionality of the links. Links are trophic interactions, that is, predator-prey relationships, and describe the passage of energy from prey species to predators. They can also be used to describe the impact of predation on a prey species, recognizing that the relationship is an asymmetrical one between nodes. The “traditional” manner in which to depict this graphically is with arrows between nodes (Fig. A). Whereas the graphs illustrated so far have been undirected graphs, a food web is defined properly as a directed graph, or digraph. The asymmetry is also reflected by the adjacency matrix, which is no longer symmetric about the diagonal.

The most straightforward applications of Graph Theory to food web biology are analyses of the structure or topology of digraphs. Digraphs are often referred to as networks in modern usage, and the study of digraphs, especially those describing real-world networks such as the Internet or social networks, is described as Network Theory. The reader should be aware, however, that networks are technically digraphs with weighted or parameterized links. A network therefore depicts a food web when it contains species interactions, the direction of those interactions, and some measure of the interactions, such as interaction strength. A digraph without measures or weights on the links is in reality a special case of a food web digraph, one in which all links are considered equivalent.

A very simple three-species food web is illustrated in Fig. A. Species 1 (S1) is prey only (perhaps a primary producer), S2 is both a consumer of S1 and prey to S3, and S3 is the top consumer in the network. Alternative arrangements for three species are illustrated in Figs. B-D, including a simple food chain (Fig. B), a web where the top consumer is also cannibalistic (Fig. C), and a cycle among the three species (Fig. D). These networks bear only information about the existence and direction of interactions among species, but this information is important because structure always affects function (Strogatz, 2001). The basic network approach has proven useful as a means of capturing the complexity of food webs, deriving basic comparative properties such as connectance and link distributions, and assessing one type of robustness against perturbation.
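The three-species arrangements can be written down as adjacency matrices to make the asymmetry concrete. The matrices below are my guess at the layouts of the chain, cannibal and cycle webs, since the figures are not reproduced here. I am also assuming the convention that entry [i][j] is 1 when energy flows from prey i to consumer j.

```python
def is_symmetric(matrix):
    """True if the matrix equals its transpose, as for an undirected graph."""
    n = len(matrix)
    return all(matrix[i][j] == matrix[j][i] for i in range(n) for j in range(n))

def in_degrees(matrix):
    """Column sums: the number of prey of each species."""
    return [sum(row[j] for row in matrix) for j in range(len(matrix))]

# Rows are prey, columns are consumers; species order is S1, S2, S3.
food_chain = [[0, 1, 0],
              [0, 0, 1],
              [0, 0, 0]]
cannibal_top = [[0, 1, 0],
                [0, 0, 1],
                [0, 0, 1]]   # S3 also feeds on itself
cycle = [[0, 1, 0],
         [0, 0, 1],
         [1, 0, 0]]          # S1 feeds on S3, closing the loop

for name, web in [("chain", food_chain), ("cannibal", cannibal_top), ("cycle", cycle)]:
    print(name, is_symmetric(web), in_degrees(web))
```

None of the three matrices is symmetric, and the column sums give each species' number of prey, including the extra self-link of the cannibalistic top consumer.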

**11**
*Thursday*
Feb 2010

Posted in CEG theory, Graph theory, Network theory

I’m currently working on another review/instructional paper, this one examining the relationship of paleo-food webs to graph and network theory (along with excursions into combinatorics and counting). Here is a draft of one of the sections. This is a draft! No references, and sketches of figures will be added to the post as they are completed.

**GRAPHS**

A food web is a summary of interspecific trophic interactions. A mathematical graph is the combination of two sets, commonly written G(V, E), where the elements of E are relationships among the elements of V. Both concepts may be expressed graphically as diagrams of relationships among species or elements, an exercise that makes clear the relationship between the real-world biological system and the abstract mathematical one. The area of mathematics dealing with graphs is known as Graph Theory, familiarity with which proves very useful in the exploration and analysis of not only food webs, but of any real-world system (biological and otherwise) that can be expressed as relationships or interactions among discrete entities. Examples of other systems include networks of genomic interactions, metabolic networks, and phylogenetic trees.

Examine the food webs illustrated in Figure 1. The circles represent species, and the links between them are interspecific interactions. Describing these systems mathematically as G(V,E), the elements of E are relationships among the elements of V. The elements of V are typically referred to as vertices or nodes, and their relationships, or the elements of E, are referred to as edges. Edges are written as pairs of vertices, for example $\{v_i, v_j\}$, where $v_i$ and $v_j$ are vertices in G (that is, $v_i \in V$ and $v_j \in V$). Species in the food web are therefore nodes, and trophic interactions or links are edges. The first web (Fig. 1A) is a system of non-interacting species 1, 2 and 3. It could function only if embedded within a larger system of species with which these species interacted, or if all three species were autotrophic. Such a graph where no vertices or nodes are connected (E is an empty set) is an *unconnected* graph, one with some edges is *connected* (Fig. 1B), while a graph with all vertices connected (Fig. 1C) is a *complete* graph. Any node to which another is linked is termed its neighbor. Note the alternative representation of a graph as an n by n binary adjacency matrix, where element $a_{ij}$ equals one if an edge connects vertices $v_i$ and $v_j$, and zero otherwise.

The three graphs would obviously depict food webs with very different implications for the species involved. For example, the density of interactions increases as the number of edges, or |E|, increases. This density is often described simply by the connectance (C) of the graph, standardized as the ratio of the number of edges to the maximum number of edges possible. Each node or species could hypothetically interact with every other species, therefore the number of possible interactions is n(n-1). Note, however, that links would be counted twice, for example {1,2} and {2,1}, so we halve this number. Then

$$C = \frac{2|E|}{n(n-1)}$$

The connectances of Figures 1A-C are therefore zero, 0.333 and 1. We extend this by noting that species may sometimes interact with themselves trophically if individuals are true cannibals. This situation is illustrated in figures as loops, or unit diagonal entries in the adjacency matrix (Fig. 1D). We can now generalize by stating that the connectance of food webs expressed as graphs is measured as

$$C = \frac{2|E|}{n(n+1)}$$

and the connectance of the complete food web in Fig. 1D is therefore 1.
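The connectance values quoted for Figure 1 can be checked with a few lines. This is a sketch under my reading of the text: without loops the maximum number of edges is n(n-1)/2, and allowing each node one loop adds n more possible edges. The edge counts are inferred from the stated connectances, and the function and argument names are my own.

```python
def graph_connectance(n_nodes, n_edges, allow_loops=False):
    """Edges present divided by the maximum number of undirected edges."""
    max_edges = n_nodes * (n_nodes - 1) // 2   # every distinct pair of nodes
    if allow_loops:
        max_edges += n_nodes                   # plus one possible loop per node
    return n_edges / max_edges

# Three-node graphs matching the descriptions of Figs. 1A-1D.
print(graph_connectance(3, 0))                    # no edges
print(graph_connectance(3, 1))                    # a single edge
print(graph_connectance(3, 3))                    # complete graph
print(graph_connectance(3, 6, allow_loops=True))  # complete, with all loops
```

With three nodes there are three possible pairs, so one edge gives a connectance of one third. Adding the three possible loops raises the maximum to six.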

In addition to the overall link density of the graph, or the number of interactions in the food web, we are also interested in the number of interactions per species. This number indicates how trophically specialized or generalized a species is, and is interesting from both evolutionary and ecological perspectives. For example, very specialized species may have stronger coevolutionary interactions with the species to which they are linked, and specialization itself may require temporally extended intervals of stability or high productivity to evolve. Generalized species, on the other hand, could be less susceptible to major perturbations if the intensities of their interactions are distributed broadly among their neighbors. The number of edges or links attached to a node is termed the degree of the node. The simplest cases are those where all nodes are of the same degree, for example Figs. 1A, 1C and 1D. The distribution of links within the graph, or the link distribution, is then single-valued, and may be described as a Dirac delta function or Kronecker’s delta. A slight generalization, where nodes have the same number of links on average, instead of precisely the same degree, leads to the significant development of random graphs and the eventual study of real-world networks.
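Node degree and the link distribution are easy to read off an adjacency matrix. The sketch below uses two small loop-free graphs that I made up to match the descriptions of an incomplete and a complete three-node graph. The function names are my own.

```python
from collections import Counter

def degrees(adjacency):
    """Degree of each node: the row sums of a symmetric, loop-free matrix."""
    return [sum(row) for row in adjacency]

def link_distribution(adjacency):
    """How many nodes have each degree."""
    return Counter(degrees(adjacency))

one_edge = [[0, 1, 0],
            [1, 0, 0],
            [0, 0, 0]]   # only nodes 1 and 2 are linked
complete = [[0, 1, 1],
            [1, 0, 1],
            [1, 1, 0]]   # every pair of nodes is linked

print(degrees(one_edge), link_distribution(one_edge))
print(degrees(complete), link_distribution(complete))
```

The complete graph has a single-valued link distribution, since every node has degree two, whereas the one-edge graph already has two distinct degree values.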

**Next up:** *Random graphs*

**28**
*Thursday*
Jan 2010

Posted in Network theory