Previous Chapter: 3 - Introduction to the Mechanism

Chapter 4 - Introduction to Model Testing

Chapter 4 introduces the essential scientific process that determines whether a model is reliable: validation. This chapter explains what validation is, how it differs from calibration, and why it is the only meaningful criterion for deciding whether a model should be trusted. While climate discussions often focus on projections or consensus, scientific credibility depends on whether a model can accurately reproduce known observational history. If it cannot, its predictions cannot be relied upon.

The chapter begins with a simple but crucial distinction. A model may be carefully constructed, mathematically elegant, or widely accepted, but it is not scientifically valid unless its results match observations. Many climate-related claims appear authoritative because they are presented through official bodies or repeated widely in media, but authority is not evidence. Validation is evidence. Without it, a model is nothing more than an assumption illustrated with mathematics. Chapter 4 emphasises that this is not a criticism unique to climate science: every scientific model, regardless of discipline, must meet this standard.

To highlight the difference between appearance and reliability, the chapter discusses calibration, the process by which model parameters are adjusted to achieve a desired match to some portion of the data. Calibration is acceptable, even necessary, but it does not prove anything. A calibrated model can always be made to fit the data it was tuned to; the true test is whether it also matches data outside the calibration window. Valid models pass this test. Weak models fail it. Many climate models appear convincing only because they are calibrated to fit a narrow range of conditions. When tested against the greenhouse-effect record from 1985–2022, they perform poorly.
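The difference between calibration and validation can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the series is synthetic (it is not the greenhouse-effect record), the "model" is a simple straight line, and the 60-month calibration window is arbitrary. The point is only that a small error inside the calibration window says nothing about the error outside it.

```python
import math
import random

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def rmse(predicted, observed):
    """Root-mean-square error between two equal-length sequences."""
    total = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return math.sqrt(total / len(observed))

# Synthetic, made-up series (NOT real observations): a gentle curve plus noise.
random.seed(0)
months = list(range(120))
series = [0.0005 * t ** 2 + random.gauss(0, 0.1) for t in months]

# Calibrate on the first 60 months only; hold the rest back for validation.
split = 60
a, b = fit_line(months[:split], series[:split])
model = [a + b * t for t in months]

calibration_rmse = rmse(model[:split], series[:split])
validation_rmse = rmse(model[split:], series[split:])
print(f"calibration RMSE: {calibration_rmse:.3f}")
print(f"validation RMSE:  {validation_rmse:.3f}")
```

In this toy case the straight line fits the calibration window closely but drifts away from the held-back months, which is the pattern the chapter describes for a model that has been calibrated but not validated.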

The chapter then introduces the scientific attitude required for honest model testing: beginning with the null hypothesis. In climate terms, the null hypothesis is that natural climate variability remains the dominant factor unless the data clearly show otherwise. This is consistent with hundreds of millions of years of climate history, during which natural cycles produced dramatic temperature swings long before industrial activity existed. A model that assumes from the outset that human emissions must be the dominant cause of warming effectively rejects the null hypothesis without evidence. When this assumption is embedded directly into the model structure, as is the case with the IPCC model, it becomes logically impossible for the model ever to disprove the assumption. This is the opposite of scientific testing.
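One common way to put a null hypothesis to work in model testing is a skill score, which asks whether a candidate model beats a null baseline on the same observations. The sketch below is illustrative only: the numbers are invented, the constant baseline is a hypothetical stand-in for "no change beyond the reference level", and the skill-score formula is a standard technique swapped in here rather than the method used in the chapter.

```python
def mse(predicted, observed):
    """Mean squared error between two equal-length sequences."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)

def skill_score(candidate, null_baseline, observed):
    """1 - MSE(candidate) / MSE(null). Positive means the candidate beats the null."""
    return 1.0 - mse(candidate, observed) / mse(null_baseline, observed)

# Invented values for illustration (NOT real measurements).
observed = [1.0, 1.2, 0.9, 1.4, 1.1]
null_baseline = [1.0] * len(observed)      # hypothetical null: a fixed reference level
candidate = [1.0, 1.1, 1.0, 1.3, 1.1]      # hypothetical candidate model output

skill = skill_score(candidate, null_baseline, observed)
print(f"skill relative to null baseline: {skill:.2f}")
```

A score at or below zero would mean the candidate does no better than the null baseline, so the null hypothesis stands; only a clearly positive score on data the model was not tuned to counts as evidence against it.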

Next, Chapter 4 turns to the data itself. The availability of 459 monthly observations of the total greenhouse effect, spanning nearly four decades, provides an unprecedented opportunity to assess model performance. These measurements capture changing surface temperatures, atmospheric composition, and greenhouse-gas abundances. They form a robust dataset against which any model can be tested. If a model’s predictions diverge sharply from this real-world history, the model cannot be considered valid, regardless of how often it is cited or how widely it is believed.

The chapter outlines the key criteria used to judge model performance:

  1. Shape matching - Does the model reproduce the pattern of changes seen in observations?
  2. Magnitude - Does it match the size of the variations in the greenhouse-effect record?
  3. Parameter plausibility - Are the model’s input assumptions physically reasonable?
  4. Predictive power - When applied to periods it was not tuned to, does the model still match the real data?
  5. Consistency with spectral physics - Does the model respect the known radiative behaviour of greenhouse gases?

A model must satisfy all of these criteria to be considered reliable. Passing one or two is not enough. The chapter notes that many modern models fail on multiple criteria, especially when tested over the full 1985–2022 period.
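Several of the criteria above can be expressed as simple numerical checks, as in the sketch below. It is a set of assumptions rather than the chapter's actual procedure: correlation stands in for shape matching, the ratio of standard deviations for magnitude, a bounds check for parameter plausibility, and hold-out error for predictive power. All thresholds, parameter names, and data values are hypothetical, and consistency with spectral physics is omitted because it cannot be reduced to a one-line statistic.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def spread(xs):
    """Population standard deviation."""
    m = sum(xs) / len(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / len(xs))

def rmse(predicted, observed):
    """Root-mean-square error between two equal-length sequences."""
    total = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return math.sqrt(total / len(observed))

def evaluate(model, observed, params, bounds, holdout_start,
             min_corr=0.9, max_spread_error=0.2, max_holdout_rmse=0.5):
    """Apply four of the five criteria; all thresholds are hypothetical."""
    checks = {
        # 1. Shape matching: does the model track the observed pattern?
        "shape": pearson(model, observed) >= min_corr,
        # 2. Magnitude: is the size of the variation comparable?
        "magnitude": abs(spread(model) / spread(observed) - 1.0) <= max_spread_error,
        # 3. Parameter plausibility: is each parameter inside its stated bounds?
        "parameters": all(bounds[k][0] <= v <= bounds[k][1] for k, v in params.items()),
        # 4. Predictive power: error on the part of the record not used for tuning.
        "prediction": rmse(model[holdout_start:], observed[holdout_start:]) <= max_holdout_rmse,
    }
    # A model must pass every check, not just one or two.
    checks["all_pass"] = all(checks.values())
    return checks

# Invented values for illustration (NOT real measurements or real model output).
observed = [0.0, 1.0, 0.5, 1.5, 1.0, 2.0]
good_model = [0.1, 0.9, 0.6, 1.4, 1.1, 1.9]
bad_model = [2.0, 1.0, 1.5, 0.5, 1.0, 0.0]   # right spread, wrong shape
params = {"sensitivity": 1.2}                 # hypothetical parameter
bounds = {"sensitivity": (0.0, 3.0)}          # hypothetical plausible range

good = evaluate(good_model, observed, params, bounds, holdout_start=4)
bad = evaluate(bad_model, observed, params, bounds, holdout_start=4)
print(good)
print(bad)
```

The second example passes the magnitude and parameter checks yet fails overall, which mirrors the rule that satisfying one or two criteria is not enough.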

One of the most important insights introduced here is that good scientific models are falsifiable. They can be proven wrong. This allows science to advance by eliminating incorrect ideas. But models that embed their conclusions as assumptions cannot be falsified: they “prove” their own premises and therefore cannot be tested honestly. Chapter 4 warns that when such models dominate public policy, science risks becoming circular: predictions are treated as evidence, and evidence is ignored when it contradicts predictions.

The chapter concludes by preparing readers for the model comparison that follows. The three models introduced in Chapter 1 (the IPCC Model, the TRANS Model, and the Cardinal Model) will each be tested using the criteria described here. The greenhouse-effect dataset provides the independent, observational benchmark needed to judge them. Only one model will be shown to match the real-world record, and the chapter sets the stage for that outcome by giving readers the methodological tools to interpret the results objectively.

By the end of Chapter 4, readers understand both the logic and the importance of model validation. They are ready to explore how the greenhouse effect functions in detail and to assess the credibility of competing models based on evidence rather than rhetoric or consensus. This marks the transition from conceptual groundwork to empirical analysis, which forms the core of the chapters ahead.


Next Chapter: 5 - The CARDINAL Model – Summarised Construction and Outcomes