The following is the second part of the excerpt on model-building (see previous post). Apologies for the paleontological and evolutionary biology jargon to any casual readers who might stumble across this. I’ll try to elaborate and explain another time, or you can just leave me a question. The same goes for the citations.

“…the supreme goal of all theory is to make the irreducible basic elements as simple and as few as possible without having to surrender the adequate representation of a single datum of experience.” (Einstein, 1934). There are numerous views and opinions on the purpose of constructing a scientific model, but they all fall roughly between two end-points on a scale of general scientific goals: problem-solving (Grimm and Railsback, 2005) and constructing an underlying theory based on observations (Randall, 2005). For example, Darwin developed the qualitative model of natural selection to explain a theory of evolution that he deduced underlay many of his observations of the natural world. In the process he assumed the existence of a mechanism for the inheritance of traits. Evolutionary biologists of the early 20th century were subsequently faced with reconciling Mendelian genetics with the concepts of natural selection held at that time. Models of quantitative and population genetics were developed that went a long way toward that reconciliation (Fisher, Wright, Haldane). It is not difficult to appreciate the differences between the models and presentations of Darwin and, for example, Fisher, and that brings us to the first guideline.

A model should be appropriate to the problem or question being addressed. Darwin set out to explain patterns which he had observed, and his work is remarkable in its weaving together of so many major themes into the theory of evolution, including inheritance of traits, population growth and the struggle for existence, competition within and among species, and the history of life as recorded in the fossil record. Formulating his theory did not require a detailed understanding of any of these themes comparable to what we have available to us today. Moving beyond Darwin, however, in order to explain observations, test the theory of natural selection, and search for underlying mechanisms, requires the construction of far more specific models, and that is precisely what Fisher did. Darwin presented a convincing case for descent with modification, but it was up to Fisher and others to provide adequate models of natural selection capable of explaining the data. Fisher provided a specific and quantitative model explaining how natural selection can be based on the Mendelian “particulate character of the hereditary elements” (Fisher, 1958). This and related models, plus the later revolution in molecular biology, were crucial to our current understanding of natural selection as an evolutionary mechanism. Nevertheless, the details of these models are not at all essential when incorporating evolutionary theory into many biological models. For example, Sepkoski (1984) did not need them for his kinetic models of Phanerozoic diversity, even though all arguments about taxon origination are based on evolution. How then does one decide on the proper ingredients for a model?

Define the domain of the model. Presumably, all things in nature are connected, but most of these connections are never relevant to understanding the problem immediately at hand. One should define, at the outset, those connections or relationships which are causal, and those that are contingent (Bohm, 1957). To paraphrase an example from Bohm, one can quite accurately deduce a law of gravitation by dropping balls of various sizes and compositions. Now drop a sheet of paper. Does the difference in motion suggest a new law, or an inadequacy of the old one? The paper does eventually meet the ground, and precisely because of the same law of gravitation to which the balls are subject. Its different motion is a result, of course, of air resistance. If your goal is to deduce a law of gravity, then you will realize that gravity here is causal in the net journey of the paper, but the specific steps taken along the journey are contingent upon the sheet’s air resistance. Air resistance is irrelevant to a law of gravity. If your goal, on the other hand, is to model the motion of sheets of paper, then accounting for gravity alone is not sufficient and air resistance is indeed relevant.
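
To make the distinction concrete, here is a minimal Python sketch of the dropped-ball and dropped-paper comparison. It assumes a simple quadratic drag term and steps the motion forward in small time increments. The function name `fall_time`, the masses, areas, drag coefficients, air density, and drop height are all illustrative values chosen for this sketch, not measurements from Bohm or from any experiment. Real paper also flutters and tumbles, which this ignores.

```python
import math

G = 9.81        # gravitational acceleration, m/s^2
RHO_AIR = 1.2   # assumed air density, kg/m^3


def fall_time(height, mass, area, drag_coeff, dt=1e-4):
    """Time (s) to fall `height` metres from rest under gravity plus quadratic drag.

    Setting drag_coeff=0 recovers a gravity-only model.
    """
    k = 0.5 * RHO_AIR * drag_coeff * area  # drag force = k * v^2
    y = v = t = 0.0
    while y < height:
        a = G - (k / mass) * v * v  # net downward acceleration
        v += a * dt
        y += v * dt
        t += dt
    return t


HEIGHT = 2.0  # assumed drop height, m

# Illustrative objects (assumed values): a small steel ball and a flat sheet of paper.
ball = dict(mass=0.1, area=6.6e-4, drag_coeff=0.47)
paper = dict(mass=0.005, area=0.0624, drag_coeff=1.2)

t_gravity_only = math.sqrt(2 * HEIGHT / G)  # textbook result with no air at all
t_ball = fall_time(HEIGHT, **ball)
t_paper = fall_time(HEIGHT, **paper)

print(f"gravity only: {t_gravity_only:.2f} s")
print(f"ball:         {t_ball:.2f} s")
print(f"paper:        {t_paper:.2f} s")
```

Under these assumptions the ball's fall time is nearly indistinguishable from the gravity-only value, while the paper takes much longer. Which of the two models is adequate depends on which object you set out to describe.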

Constrain model specificity. Getting back to the balls for a moment, is air resistance relevant to them? If your measurements were precise enough, you would likely discover differences based on the different surface textures of the balls, perhaps variation over time corresponding to air temperature, and so on. But those details should be unimportant when describing the motion of a dropped ball, because mass and sphericity dominate minor differences in ball composition. Therefore, the addition of air temperature as a causal parameter is unnecessary. There is often a temptation and tendency to strive for realism in biological models to the point at which the model becomes too complex. This point can be recognized either when the addition or variation of parameters fails to alter model output in a predictable manner, or when variation in model output cannot be explained directly as a result of parameter variation. An accurate simulation of reality is not a model unless one can point to exactly why the simulation reproduces all the aspects that make it realistic. Simulations are not necessarily models.
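
One way to check whether a candidate parameter earns its place is to vary it and see how much the output moves. The sketch below does this for air temperature, using the ideal gas law for air density and the standard closed-form fall time for an object dropped from rest with quadratic drag. The function names, the object properties, and the temperature range are assumptions made purely for illustration.

```python
import math


def air_density(temp_c, pressure_pa=101325.0):
    """Dry-air density (kg/m^3) from the ideal gas law."""
    return pressure_pa / (287.05 * (temp_c + 273.15))


def fall_time(height, mass, area, drag_coeff, rho_air, g=9.81):
    """Closed-form fall time (s) from rest with quadratic drag."""
    v_t = math.sqrt(2 * mass * g / (rho_air * drag_coeff * area))  # terminal speed
    x = g * height / v_t**2
    # t = (v_t / g) * acosh(exp(x)), written in a form that avoids overflow
    return (v_t / g) * (x + math.log1p(math.sqrt(-math.expm1(-2 * x))))


def relative_spread(values):
    """(max - min) / min: how much the output moves across the sweep."""
    return (max(values) - min(values)) / min(values)


HEIGHT = 2.0                   # assumed drop height, m
TEMPS_C = [0, 10, 20, 30, 40]  # assumed temperature sweep, degrees Celsius

# Illustrative objects (assumed values).
ball = dict(mass=0.1, area=6.6e-4, drag_coeff=0.47)
paper = dict(mass=0.005, area=0.0624, drag_coeff=1.2)

ball_times = [fall_time(HEIGHT, rho_air=air_density(t), **ball) for t in TEMPS_C]
paper_times = [fall_time(HEIGHT, rho_air=air_density(t), **paper) for t in TEMPS_C]

ball_spread = relative_spread(ball_times)
paper_spread = relative_spread(paper_times)

print(f"ball:  {100 * ball_spread:.4f} % change in fall time")
print(f"paper: {100 * paper_spread:.4f} % change in fall time")
```

With these assumed values, the ball's fall time changes by a tiny fraction of a percent across a 40-degree range, so temperature can be left out of a dropped-ball model without loss. The same sweep shifts the paper's fall time by several percent, a reminder that the verdict depends on the domain of the model.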

Having stated that, a major caveat must be given immediately. Ecological systems are mostly complex; that is, they are the result of the interactions of numerous independently acting or semi-independent entities. Simple models of those entities can sometimes produce complicated results, particularly when causal relationships within the model are nonlinear (as we will see later on). One can model the behavior of a system of entities as a collection of simple models, or ignore that lower level altogether and treat the entire system as a single entity. These approaches are not likely to yield the same results. The former can quickly become a hopelessly realistic simulation, while the latter simply cannot be interpreted at a level relevant to individual entities. A proper model will search for a middle path, and though the search can be as much art as it is science, that’s where the fun is.
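
As a small taste of how a simple nonlinear rule can produce complicated output, the sketch below iterates the discrete logistic map, x(n+1) = r * x(n) * (1 - x(n)), a one-parameter population model. The logistic map is not named in the text above; it is used here as an assumed stand-in, and the growth rates, starting value, and iteration counts are arbitrary illustrative choices.

```python
def logistic_tail(r, x0=0.2, transient=1000, keep=100):
    """Iterate x -> r * x * (1 - x), discard the transient, return the next `keep` values."""
    x = x0
    for _ in range(transient):
        x = r * x * (1.0 - x)
    tail = []
    for _ in range(keep):
        x = r * x * (1.0 - x)
        tail.append(x)
    return tail


def distinct_states(values, decimals=6):
    """Count distinguishable long-run values after rounding away float noise."""
    return len({round(v, decimals) for v in values})


# Same one-line model, three assumed growth rates.
steady = distinct_states(logistic_tail(2.8))   # settles to a single value
cycle = distinct_states(logistic_tail(3.2))    # alternates between two values
erratic = distinct_states(logistic_tail(3.9))  # no short repeating pattern

print(steady, cycle, erratic)
```

Nothing about the rule changes between the three runs except one parameter, yet the long-run behavior goes from a fixed point to a two-point cycle to an irregular sequence.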