PEST - Model-Independent Parameter Estimation and Uncertainty Analysis

Decision Support

A Document

Download a document which provides a comprehensive discussion of the role of modelling in environmental decision-making. This is actually an extract from a much larger document entitled "Methodologies and Software for PEST-Based Model Predictive Uncertainty Analysis". This, together with a suite of files, constitutes a comprehensive tutorial on the use of PEST and its utility support software in model predictive uncertainty analysis and model-based decision support. It can be obtained from the downloads section of these pages.

Some Reflections

The following paper was presented by me, John Doherty (author of PEST), at the recent PEST conference in Potomac, Maryland. I've placed it on the PEST web pages because, in many ways, it encapsulates the philosophical basis and direction of recent PEST developments. A copy of this text, as well as the text of other papers presented at the PEST conference, can be obtained in a soon-to-be-published book. Contact us if you would like a copy.


My intention in writing this paper is to present a few thoughts on where we have gone wrong in using models as a basis for environmental decision-making in the past, and on where we should go from here if we are to do better. There is nothing wrong with making mistakes; there is a lot wrong with making mistakes repeatedly. The art of environmental modelling is still young, and we, as an industry, are still learning. Meanwhile the technology is improving. We are thus well placed to avoid making in the future the mistakes that we have made in the past.

Avoidance of mistakes requires that mistakes be recognized for what they are. This presents us with a problem, for environmental simulation is very different from many other scientific disciplines in that there is no independent measure of correctness and incorrectness. One could argue that models face the ultimate credibility test because they are used to predict the future, and the future will one day come to pass. This idea, however, is flawed, for there is no mathematical reason why a model should be able to predict the future. The mistake here lies with those who think that it can.

This leads us to what, in my opinion, has been the most widespread mistake in model usage to date. This is to expect from models more than can be justified on the basis of mathematics or logic. But how can this happen, given that models are based on mathematics, and that we as modellers are (or should be) skilled in logic? Maybe it is because we are not, as an industry, skilled enough in the mathematics of data processing, in contrast to other forms of mathematics. Whatever the reason, numerical simulators of environmental behaviour - simulators whose roots are based in science and whose algorithms are rooted in mathematics - are regularly used in unscientific and unmathematical ways to make decisions that can have huge environmental, financial and societal repercussions.

This short paper seeks to address the issue of what can be expected from environmental models and, in the light of those expectations, how models should be employed in the future as a basis for environmental management.

Making a Decision

A fundamental aspect of all decision-making is the accommodation of risk. Risk of what? Risk of bad things happening. Here “bad things” can include such phenomena as pollution of water supplies, depletion of river flow, failure to deliver allocated water to users who expect delivery of that water, etc. Associated with each unwanted occurrence is a cost. Sometimes the cost cannot be counted in dollars, but must be assessed in terms of other values held by society. An unwanted occurrence must be avoided, but often not at all costs, for the unwanted occurrences associated with alternative management strategies compete, and they too have costs. Hence environmental decision-making is often a complex process – a process which must include at least the following two sub-processes.

  1. The costs and benefits associated with alternative management decisions must be tabulated.
  2. The risk associated with the occurrence of bad things that may follow from different management decisions must be quantified.

It is the model's responsibility to provide the latter of the above two inputs to the decision-making process.

If possible, use of a model should provide another, less quantifiable but no less important, contribution to the environmental decision-making process. Despite the fact that the costs and benefits associated with different management strategies may be given different values by different stakeholders, public ownership of important decisions must nevertheless be sought. This is more likely to be achieved if a model is employed as a means of co-operatively testing different hypotheses, and/or for qualitatively assessing the comparative likelihood associated with different decision-dependent events, rather than as a device through which one group imposes its will on another.

Two things are required of a model if it is to be used to assess risk. These are:

  1. That it be capable of computing the uncertainty associated with its predictions of future environmental behaviour; and
  2. That the uncertainty associated with these predictions has been minimized through extraction of all available information from existing datasets.

We can add to these that the model should also be capable of suggesting data acquisition strategies that can most effectively reduce the uncertainty associated with predictions of importance.

Model Calibration

An environmental model purports to simulate environmental processes. Many simulators are readily available “off the shelf”. Let us suppose that we have taken a model from the shelf, and that we intend to use it as a basis for environmental management at a particular site. Let us further suppose that the equations which underpin its algorithmic design constitute correct representations of all environmental processes prevailing at that site (a generous supposition). Hence to use the model as a basis for site-specific environmental management, all we have to do is provide the model with correct boundary conditions, and with correct parameters; as a perfect simulator of site-scale processes, it then provides a perfect support for decisions required to manage that site.

This, of course, is fantasy. A model requires thousands, maybe millions, of parameters (maybe a handful of parameters for each of the tens of thousands of cells or elements that it employs). The earth is heterogeneous. It is simply not possible to supply the correct values for the thousands of parameters that a perfect simulator of environmental processes necessarily requires.

It is at this stage that another mistake is often introduced to the chain of “logic” that has as its outcome model usage that is seriously flawed. The logic goes something like this. “Once we calibrate the model, it will be fine. By calibrating it we can provide it with parameters that allow it to replicate past system behaviour (at a handful of observation points). It follows therefore that it will be a good predictor of future system behaviour (at these and other observation points).”

What does “calibrate” mean, anyway? This word was never intended for use in the environmental modelling context. This word was stolen from another setting altogether. The word “calibrate” normally describes fine-tuning of a laboratory instrument to ensure measurement accuracy. During this process one or two dials or knobs are gently turned so that the instrument’s output is correct when provided with a known input. Meticulous and ingenious design ensures that only a very limited number of knobs need to be adjusted during both the factory and laboratory calibration processes, and that for each instrument a unique setting for these knobs guarantees correctness of its outputs.

Meticulous and ingenious design of an environmental model ensures just the opposite of this. A complex model cannot be a perfect simulator of complex environmental reality unless it has thousands of knobs (i.e. “parameters”) that tune its outputs for correctness in the face of known inputs. Furthermore, historical system behaviour, normally measured at only a few points in the model domain, can be replicated by the model using an infinite number of settings of these knobs. The families of settings that allow such replication to take place have some things in common – not just any settings will do. But they have a lot of differences too. These differences can lead to big differences in predictions made by a “calibrated” model – especially predictions that are made at places other than those at which environmental behaviour was replicated during the “calibration” process, and under stresses that are different from those which prevailed at “calibration”.
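The nonuniqueness described above can be shown with a deliberately tiny toy example (entirely hypothetical, not taken from this paper): suppose a model's single calibration observation happens to be sensitive only to the sum of two layer transmissivities, while a prediction of interest is sensitive to one of them alone. Every parameter pair with the same sum “calibrates” the model, yet the pairs yield very different predictions.

```python
def head_at_well(t1, t2):
    """Toy 'calibration' observation: sensitive only to the sum t1 + t2."""
    return 100.0 / (t1 + t2)

def travel_time(t1, t2):
    """Toy 'prediction': sensitive to t1 alone."""
    return 50.0 / t1

# Both parameter pairs reproduce the calibration observation identically...
fit_a = head_at_well(2.0, 8.0)   # 10.0
fit_b = head_at_well(8.0, 2.0)   # 10.0
# ...yet their predictions differ by a factor of four.
pred_a = travel_time(2.0, 8.0)   # 25.0
pred_b = travel_time(8.0, 2.0)   # 6.25
```

A real model multiplies this two-knob situation by thousands of knobs, and the families of equally well-fitting parameter settings become correspondingly larger.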

So why is “calibration” such an established part of modelling culture? Part of the answer lies in wishful thinking – the same thinking that has underpinned all attempts by all societies over all ages to prophesy the future. Unfortunately, in the field of environmental modelling, the notion of “calibration” has received some mathematical support through the suggested use of parsimonious parameterization schemes, these being justified on the basis that model-based data-processing requires formulation of a well-posed inverse problem. Indeed, early versions of PEST supported only these methods. I too was one of the voices that insisted on parameter parsimony – not because the world was simple when PEST was young, or even because models were simple back in those days. It was because I knew no better.

Seldom, if ever, does reality conform to the dictates of wishful thinking. An inability to parameterize complexity uniquely does not make complexity go away. It does, however, hamper a model’s ability to act as a crystal ball, for inestimable complexity is the source of most model predictive uncertainty. Hence a modeller should dispense with parameterization complexity only if he/she wishes to emasculate a model’s ability to quantify predictive uncertainty. But given the fundamental role of risk assessment in model-based decision-making, why would he/she want to do that?

Uncertainty Analysis

So if we dispense with the notion that “calibration” bestows on a model something akin to spiritual powers that endow it with prophetic abilities, where does this leave us? It forces us to abandon parameter parsimony as a parameterization philosophy, and the use of over-determined methods as a platform for model-based data processing. Highly-parameterized methods and ill-posed inverse problems then become our working environment. The comforting notion of parameter uniqueness is abandoned, as we must learn to live in a more complex world of tangled parameter relationships that include (often discontinuous) high-dimensional null spaces wherein nonuniqueness holds sway. But amidst the chaos we can begin to answer some important questions, and gain some important insights into what a model can and cannot achieve as we strive to understand and manage the environment.

Armed with these new insights, the first question that we can address is “how much parameter complexity should be included in a model”? As always, there will be compromises. But, in principle, there can be only one answer to this question. It is this. “If a prediction of interest is sensitive to a parameter, then that parameter must be adjustable in model-based data-processing and in ensuing model-based uncertainty analysis.” Here the word “parameter” can refer to representations of system properties right down to the cell or element level. Computing resources will place limits on this of course. However the important point is this – it is decision-informative predictions that must set the level of parameter complexity employed by a model, and not the (often very limited) dataset available for calibration of the model. If this is not the case, the model’s ability to contribute to the decision-making process which it was built to support will be compromised.

It follows from the above that, to the extent that a prediction of interest is sensitive to parameterization detail, that detail must be represented in a model. Because of the low likelihood that all of the parameters that are required to represent such detail can be estimated uniquely, they must be represented on a stochastic basis. The importance of uncertainty analysis is once again obvious. Also obvious are the properties which an environmental model must possess if it is to form a suitable basis for environmental decision-making. These are as follows.

  1. The model must possess sufficient complexity to ensure that, for any decision-relevant prediction, the (unknown) correct prediction will lie within uncertainty limits established using the model.
  2. Predictive uncertainty margins calculated by the model must not be artificially reduced through failure to represent sufficient prediction-sensitive parameterization complexity in the model, nor artificially expanded through failure to extract maximum parameter-pertinent information from existing environmental data.
  3. Model defects must not introduce bias to model-calculated predictive confidence intervals.

As quantification of predictive uncertainty lies at the heart of modelling requirements, it is to Bayes theorem that attention must now be turned. Conceptually, once we have ensured that a model employs enough parameters for model-based predictive uncertainty analysis to have integrity, quantification of predictive uncertainty becomes a relatively simple matter. All we need is the following:

  1. A prior probability distribution that encompasses all parameters employed by the model.
  2. The statistical properties of measurement noise.

Armed with both of these, and with the dependencies of model outputs on system properties as supplied by the model, predictive uncertainty can be quantified. Conceptually, we therefore have what we need for environmental decision support. Unfortunately, however, the second of the above two enumerated needs constitutes the rock on which our expectations founder.
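In symbols (notation assumed here, not taken from the paper): writing \(\mathbf{k}\) for the vector of model parameters and \(\mathbf{h}\) for the vector of historical measurements, Bayes theorem combines the two ingredients above as

```latex
P(\mathbf{k} \,|\, \mathbf{h}) \;=\; \frac{P(\mathbf{h} \,|\, \mathbf{k})\, P(\mathbf{k})}{P(\mathbf{h})}
```

where \(P(\mathbf{k})\) is the prior parameter probability distribution (item 1 above), and the likelihood \(P(\mathbf{h} \,|\, \mathbf{k})\) rests on the statistical characterization of measurement noise (item 2). The uncertainty of any prediction then follows by propagating the posterior \(P(\mathbf{k} \,|\, \mathbf{h})\) through the model.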

Structural Noise

Models have defects, for they are imperfect simulators of environmental processes. This has a number of consequences, including the following (stated without proof).

  1. When matching model outputs to historical measurements of system state, we cannot expect a level of fit that is commensurate with measurement noise. In most practical cases model-to-measurement fit is dominated by so-called “structural noise”.
  2. In any particular modelling context, structural noise has unknown statistical properties. However in nearly all cases, to the extent that this noise can be assigned a covariance matrix, this matrix will be singular. What does this mean? It means that irrespective of how many observations comprise a calibration dataset, there is an upper limit to the uncertainty reduction that history-matching can achieve. (This is in addition to the limits on parameter uncertainty reduction incurred by the presence of the null space.)
  3. Because of its unknown statistical properties, the likelihood term in Bayes equation can be calculated only approximately at best. Also, because structural noise can effectively shield information residing within an historical environmental dataset from parameters used by the model, its presence places a lower limit on the extent to which the dimensionality of the null space can be reduced through the history-matching process.
  4. Estimates of parameters made through the history-matching process are compromised, as at least some parameters adopt surrogate roles to compensate for model defects. However, the extent to which they do this is unknown as the amount and statistical properties of structural noise are unknown.

Special measures will almost certainly be required for optimal extraction of information from field observations through the history-matching process. In particular, historical measurements of system state and their model-generated counterparts will probably need to be processed in special ways before being matched to each other in order to filter out structural noise, thereby granting at least some model parameters greater access to the information contained within these measurements. Such processing may include digital filtering of time series, temporal, spatial and vertical differencing of head measurements, computation of periodic or event-based flow volumes, and calculation of exceedance and/or duration statistics. The relative weighting applied to different objective function components, each of which is formulated through processing the measurement dataset and its model-generated counterpart in a different way, will be somewhat arbitrary. However a fundamental requirement of any weighting scheme is that each such objective function component should neither dominate the overall objective function, nor be dominated by other objective function components.
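The weighting requirement just described can be sketched as follows. This is a minimal illustration of the balancing idea, not PEST's own algorithm: weights are chosen so that, at the outset of the inversion, every objective function component contributes equally to the total objective function.

```python
import numpy as np

def balance_weights(components):
    """Choose a weight for each objective-function component (a named
    array of residuals) so that every component initially contributes
    the same amount to the total objective function: no component
    dominates, and none is dominated."""
    raw = {name: float(np.sum(res ** 2)) for name, res in components.items()}
    target = np.mean(list(raw.values()))
    return {name: np.sqrt(target / phi) if phi > 0.0 else 1.0
            for name, phi in raw.items()}

def total_objective(components, weights):
    """Total objective function: sum of weighted squared residuals."""
    return sum(np.sum((weights[name] * res) ** 2)
               for name, res in components.items())
```

With a “heads” component and a “flows” component, for example, balanced weights make each contribute half of the initial objective function; as parameters are adjusted the contributions diverge, which is expected. The component names here are illustrative only.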

Seen in this context, perhaps the term “model calibration” may actually regain some lost meaning. Prior to analysing the uncertainty of key model predictions as a basis for environmental decision support, we should undertake highly parameterized history-matching. In doing so we achieve the following outcomes.

  1. We assess the level of structural noise (i.e. model-defect-induced noise) associated with different components of the measurement dataset. This can often be loosely equated to the level of model-to-measurement fit at which unrealistic values, or unrealistic spatial/temporal trends, begin to appear in the estimated parameter field.
  2. We can draw some inferences on the level of spatial and temporal parameter variability that prevails within the study area – thus obtaining insights into prior parameter probability distributions necessary for predictive uncertainty analysis.
  3. We obtain a minimum error variance parameter field which, together with the above insights, provides a basis for subsequent predictive uncertainty analysis.

Notwithstanding the benefits gained from model calibration undertaken in this manner, calibration-constrained predictive uncertainty analysis is seriously compromised by the existence of model structural defects, and the effects that these defects have on model outputs under both calibration and predictive conditions. Bayes equation can only then serve as a guide to model-based predictive uncertainty analysis; the latter must inevitably entail a large degree of subjectivity. The situation is further complicated by the following aspects of data and uncertainty analysis undertaken using a defect-laden model. (These are presented without proof).

  1. The fact that some parameters adopt surrogate roles to compensate for model structural defects may actually reduce rather than increase the potential for error in certain model predictions. In general, to the extent that a prediction “resembles” measurements used in history-matching-based parameter inference, the more likely is this to occur. A subjective decision must often be made by a modeller on whether to allow or disallow parameter surrogacy, to the extent that this can be recognised in calibrated parameter fields.
  2. Parameters of minimum error variance as inferred through the model calibration process may not therefore lead to predictions of minimum error variance. This occurs because minimization of the error variance of different predictions may require different parameters to adopt different surrogate roles.
  3. Model calibration may need to be prediction-specific. When calibrating a model in order to optimize its ability to make a certain type of prediction, greater weights may need to be given to historical observations which most resemble that type of prediction.

All in all, this is a most unsatisfying situation – unsatisfying because optimal model-based data-processing must necessarily be approximate and subjective as it is undertaken using models that have defects. Of course the problem could be eradicated by designing models which have no defects. However such models are unlikely to appear in the foreseeable future (or even in the unforeseeable future for that matter). And even if they do appear, it is hard to imagine that their CPU requirements would be met by computers that will appear in the foreseeable future.

Making a Decision

From the above considerations it is apparent that while highly-parameterized, model-based uncertainty analysis can form a powerful basis for open and informed environmental decision-making, that analysis must necessarily involve a large subjective component, this requiring the exercise of expert judgement at every turn. Perhaps this is not such a bad thing, for it forces the model to become an instrument for making inquiries and testing hypotheses, rather than a laboratory instrument "wannabe" that can be calibrated to give the “right answer”. If model-based data-analysis and scientific inquiry is done in an open manner, with different stakeholder groups involved in the inquiry process, it may even lead to less expensive and less damaging resolution of disputes than that which often accompanies model-based decision-making at the present time. While it would be foolhardy to expect unanimity in such a necessarily subjective decision-making context, it would not be too much to expect the following.

  1. A clearer definition of points of disagreement between different stakeholder groups;
  2. Identification of data elements that form the pivots on which different predictions of future system behaviour depend;
  3. Identification of yet-to-be-collected data elements which would serve to reduce the uncertainty associated with key model predictions;
  4. Optimal design of a monitoring protocol for early warning of unwanted future environmental occurrences.

It is therefore obvious that models cannot be used in isolation when forming the basis for environmental data-processing and decision-making. Use of a complex model in a decision-making context without concomitant use of a tool such as PEST is like using a bolt without a nut; one is simply of no use without the other.

It is also apparent that methodologies for model-based environmental data-processing and uncertainty analysis must include a high degree of user-involvement. Visual inspection of calibration outcomes (including all aspects of model-to-measurement fit, and parameter fields attained through the calibration process) is thus essential. Ideally these should be complemented by numerical analysis methodologies (such as those provided by PEST’s new “Pareto mode” functionality) that present a user with a range of processing outcomes (whether in regularised inversion or uncertainty analysis), thereby allowing him/her to select that which he/she considers to be optimal according to his/her current understanding of a site, and of the decisions that must be made regarding management of that site.


The use of models to analyse environmental data, and to assist in the making of environmental decisions, is not an exact science. Even if a model were a perfect simulator of environmental behaviour, its parameterization would be necessarily incomplete and broad scale because of the large null space to which the model calibration process is denied mathematical entry. Where a model is an imperfect simulator of environmental behaviour (as all models are), the orthogonal complement to the null space (i.e. the solution space) becomes populated by parameters whose estimated values compensate for model structural defects to an unknown degree. The greater the preponderance of these defects, the greater is the need for compensation.
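The split between the null space and its orthogonal complement (the solution space) can be illustrated with a small sketch. The notation is assumed, not taken from this paper: J stands for the Jacobian of model outputs with respect to parameters, and singular value decomposition separates parameter combinations that the calibration dataset informs from those to which it is blind.

```python
import numpy as np

def split_parameter_space(J, tol=1e-8):
    """Split parameter space into a 'solution space' basis (parameter
    combinations informed by the calibration dataset) and a 'null
    space' basis (combinations the dataset cannot see), via singular
    value decomposition of the Jacobian J."""
    U, s, Vt = np.linalg.svd(J, full_matrices=True)
    rank = int(np.sum(s > tol * s[0])) if s.size else 0
    V = Vt.T
    # Columns of V beyond the numerical rank span the null space:
    # adding any such vector to a calibrated parameter set leaves
    # model outputs unchanged to first order (J @ v ~ 0).
    return V[:, :rank], V[:, rank:]
```

For a Jacobian with two observations and three parameters, one parameter combination inevitably falls in the null space, however well those two observations are matched.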

Model defects cannot be completely removed; many of them will not even be recognized. Furthermore, the fact that some parameters will compensate for these defects does not necessarily impair a model’s ability to make predictions; in many cases it will actually improve its ability to do so – something which should be encouraged through exercise of informed subjective judgement in designing a calibration process that extracts all information that is pertinent to a particular prediction from a given historical environmental dataset. Subjectivity, compromises and pragmatism must underpin all aspects of model usage, as attempts to make a model perfect will inevitably lead to models whose run-times and penchant for numerical instability inhibit their use with software such as PEST, thus rendering them useless for environmental data-processing and decision support.

It would thus appear that environmental modelling is a murky business – a business that will always be subjective and hence a hotbed for arguments. But if models can serve to focus those arguments on issues that matter – issues that science rather than rhetoric may ultimately resolve – then these arguments will have served the decision-making process well. This is a far more productive outcome of model usage than that which seeks to award one duelling model victory over another, when both in fact make false and undeserving claims to prophetic visions of the environmental future.
