Upper Air Temperature Validation

MSU & AMSU Data Comparison with In Situ Observations

In order to validate the MSU- and AMSU-derived lower tropospheric temperature data, we would like to compare the long-term changes with those measured directly. The most widespread instruments for directly measuring the temperature of the upper air are radiosondes (commonly called weather balloons). These balloons ascend through the atmosphere, measuring various meteorological variables (including temperature) and radioing the results back to the surface. The radiosonde data, while having the advantage of being a direct measurement of temperature, have two major disadvantages. First, most of the radiosonde stations are located in northern hemisphere land areas, leaving large regions of the world’s oceans and the southern hemisphere essentially unmonitored. Second, there are calibration errors or inhomogeneities in the radiosonde dataset that occur over time as instrumentation is upgraded, observing practices are changed, or processing code is improved. The effects of these inhomogeneities need to be removed before long-term changes in temperature can be analyzed. This process has been performed by a number of groups over the past decade or so, resulting in a number of homogenized datasets. We use the most recent versions of these datasets for comparison with the satellite data. The datasets we use are listed below.

Name of Dataset                               Originating Institution
HadAT (Thorne et al., 2005)                   Hadley Centre, UK Met Office
RAOBCORE (Haimberger, 2007)                   University of Vienna
RICH (Haimberger et al., 2008)                University of Vienna
IUK (Sherwood, 2007; Sherwood et al., 2008)   University of New South Wales

MSU/AMSU Equivalent Temperature

Radiosondes measure temperatures at specific levels, while the satellite data are temperatures averaged over thick atmospheric layers, weighted by the temperature weighting functions shown in Fig. 1. To convert the radiosonde measurements to an MSU/AMSU-equivalent temperature for each channel, we construct a weighted average of the temperatures measured at the specific radiosonde levels and at the Earth’s surface.

The exact values of these weights are calculated using a radiative transfer model, and depend on which levels are available in the radiosonde measurements, the surface atmospheric pressure at the radiosonde location, and whether the surface is water or land.
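The weighted average described above can be sketched as follows. This is a minimal illustration, not the processing code used for the comparison: the function name, the pressure levels, and all weight values are placeholders chosen for the example, whereas the real weights come from a radiative transfer model and vary with the available levels, the surface pressure, and the surface type.

```python
def msu_equivalent_temperature(level_temps, level_weights,
                               surface_temp, surface_weight):
    """Weighted average of radiosonde level temperatures and the surface temperature.

    The weights are normalized by their sum, so they do not need to add up
    to exactly one. (Hypothetical helper; the weights are supplied by the caller.)
    """
    if len(level_temps) != len(level_weights):
        raise ValueError("one weight is required per radiosonde level")
    total_weight = sum(level_weights) + surface_weight
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(t * w for t, w in zip(level_temps, level_weights))
    weighted_sum += surface_temp * surface_weight
    return weighted_sum / total_weight


# Placeholder temperatures (K) at four illustrative levels, e.g. 850, 700, 500, 300 hPa.
level_temps_k = [281.0, 272.5, 255.0, 230.0]
# Placeholder weights; NOT the weights produced by a radiative transfer model.
placeholder_weights = [0.30, 0.30, 0.20, 0.05]

tlt_equivalent = msu_equivalent_temperature(
    level_temps_k, placeholder_weights, surface_temp=288.0, surface_weight=0.15
)
print(round(tlt_equivalent, 2))
```

In practice a separate set of weights would be needed for each channel and for each combination of available levels.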

MSU/AMSU weighting function
Fig. 1.  Temperature weighting functions for the 4 MSU/AMSU products.

Gridded Data

In order to compare with satellite data, we require TLT-equivalent temperatures in gridded form. HadAT, RAOBCORE and RICH are available in gridded form: HadAT as gridded TLT temperatures, and RAOBCORE and RICH as gridded temperatures at discrete levels. We converted the RAOBCORE and RICH data to equivalent TLT using discrete weights. Since HadAT is available on a 36 x 36 latitude-longitude grid, we regridded it to a 144 x 72 grid for ease of comparison. IUK is available as adjusted monthly average profiles at each radiosonde location. We gridded these profiles onto a 10 x 10 degree grid using a “bucket” method, converted them to TLT using discrete weights, and then regridded the result to a 144 x 72 grid.
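A "bucket" gridding step of the kind mentioned above might look like the sketch below. The text does not spell out the details of the method, so this assumes the simplest version: each station value is dropped into the 10 x 10 degree cell that contains it, and cells holding several stations are averaged. The function name and the station values are hypothetical.

```python
def bucket_grid(stations, cell_deg=10.0):
    """Average station values into latitude-longitude cells of size cell_deg.

    stations: iterable of (lat_deg, lon_deg, value) tuples.
    Returns a list of rows (south to north); each row runs eastward from 0 degrees.
    Cells that contain no station are set to None.
    """
    n_lat = int(180 / cell_deg)
    n_lon = int(360 / cell_deg)
    sums = [[0.0] * n_lon for _ in range(n_lat)]
    counts = [[0] * n_lon for _ in range(n_lat)]
    for lat, lon, value in stations:
        # Clamp so that a station at exactly 90N falls in the top row.
        i = min(int((lat + 90.0) // cell_deg), n_lat - 1)
        # Wrap longitudes into [0, 360) so that -10 and 350 share a cell.
        j = int((lon % 360.0) // cell_deg) % n_lon
        sums[i][j] += value
        counts[i][j] += 1
    return [
        [sums[i][j] / counts[i][j] if counts[i][j] else None for j in range(n_lon)]
        for i in range(n_lat)
    ]


# Hypothetical station anomalies (K): two stations share a cell, one is alone.
example_stations = [(48.2, 16.4, 1.0), (47.0, 15.0, 3.0), (-33.9, 151.2, 5.0)]
grid = bucket_grid(example_stations)
print(grid[13][1], grid[5][15])
```

Leaving empty cells as None rather than filling them keeps the sparse sampling visible to later steps, which matters for the subsampled comparisons.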

The spatial sampling for all adjusted radiosonde datasets is relatively sparse, with much of the weight centered over northern hemisphere land areas. Fig. 2 shows the regions sampled by RAOBCORE 1.4 in January 1987, a typical radiosonde sampling pattern.

Typical RAOBCORE sampling
Fig. 2.  Regions sampled by the RAOBCORE 1.4 product in January 1987.

Sub-Sampling Satellite Data to Match Radiosonde Locations

Radiosonde sampling must be taken into account when comparing radiosonde datasets to MSU/AMSU TLT, since the temperatures are likely to behave differently in the sampled and non-sampled areas. A typical HadAT-subsampled TLT map is shown here:

RSS TLT subsampled for HadAT
Fig. 3.  RSS TLT temperature for the month of January 1987 subsampled to match the gridded version of the HadAT dataset for that month.

Raw Global v. Sub-Sampled Satellite Data Compared to Radiosonde Trends

To illustrate the importance of sampling, we performed two comparisons using TLT data. In the first, we calculated area-weighted global averages for each dataset, ignoring whether a given pixel in the satellite data was sampled by the radiosonde dataset. We will call these raw global averages. In the second, we calculated area-weighted averages using only those pixels that were sampled by both the satellite and the radiosonde dataset. Note that this results in multiple versions of the subsampled satellite data, one for each radiosonde dataset.
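The two kinds of average can be illustrated with a short sketch. It assumes cosine-of-latitude area weights on a regular latitude-longitude grid, which is a common choice but is not stated in the text; the function name and the tiny example grids are hypothetical.

```python
import math


def area_weighted_mean(field, cell_lats, mask=None):
    """Area-weighted mean of a gridded field using cos(latitude) weights.

    field: rows of values, one row per latitude in cell_lats; None marks missing data.
    mask: optional rows of booleans; when given, only cells where it is True are used.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for i, lat in enumerate(cell_lats):
        weight = math.cos(math.radians(lat))
        for j, value in enumerate(field[i]):
            if value is None:
                continue
            if mask is not None and not mask[i][j]:
                continue
            weighted_sum += weight * value
            weight_total += weight
    if weight_total == 0.0:
        raise ValueError("no valid cells to average")
    return weighted_sum / weight_total


# Hypothetical 2 x 2 example: the satellite grid is complete, the radiosonde grid is not.
cell_lats = [0.0, 60.0]
satellite = [[1.0, 1.0], [3.0, 3.0]]
radiosonde = [[0.8, None], [None, None]]

# Cells where the radiosonde dataset has valid data.
sampled = [[value is not None for value in row] for row in radiosonde]

raw_mean = area_weighted_mean(satellite, cell_lats)
subsampled_mean = area_weighted_mean(satellite, cell_lats, mask=sampled)
print(raw_mean, subsampled_mean)
```

Because the mask is built from one radiosonde dataset, repeating the last step with a different radiosonde grid gives a different subsampled satellite average.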

Here we plot 5 time series: the globally averaged HadAT data, the "raw" RSS and UAH time series, and the HadAT-subsampled RSS and UAH time series. In this case, differences between HadAT and both satellite datasets are reduced significantly by subsampling. This occurs for both the long-term trends and the short time scale differences.

Fig. 4  Globally (75S to 75N) area-weighted averaged TLT anomaly time series for the 1979-2006 period. Each time series is low-pass filtered using a filter that strongly attenuates variability that occurs on less than a six-month time scale.

Effects of Sub-Sampling

We made similar plots for each radiosonde dataset, for global, tropical, northern extratropical and southern extratropical averages. The short time scale differences were nearly always reduced by subsampling, while the differences in long-term trends were sometimes increased by subsampling, particularly in the southern extratropics, where the number of radiosonde stations is small and errors in individual stations therefore matter more. The trend results are summarized in Fig. 5 below. For the global and tropical latitude bands, subsampling improves the agreement between satellite and radiosondes. In the northern extratropics, subsampling has little effect (a small increase) on the satellite trends, since this region is well sampled in all datasets. In the southern extratropics, subsampling increases the difference in trends between the satellite datasets and 3 of the adjusted radiosonde datasets.
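For readers who want to reproduce trend comparisons of this kind, a long-term trend can be estimated from a monthly anomaly series with an ordinary least-squares fit. The text does not say how the trends were computed, so the fit below is an assumption, and the function name and synthetic series are hypothetical.

```python
def trend_per_decade(monthly_anomalies):
    """Ordinary least-squares linear trend of a monthly series, in units per decade."""
    n = len(monthly_anomalies)
    if n < 2:
        raise ValueError("at least two months are needed to fit a trend")
    years = [k / 12.0 for k in range(n)]  # time in years since the first month
    t_mean = sum(years) / n
    y_mean = sum(monthly_anomalies) / n
    covariance = sum((t - t_mean) * (y - y_mean)
                     for t, y in zip(years, monthly_anomalies))
    variance = sum((t - t_mean) ** 2 for t in years)
    return 10.0 * covariance / variance  # slope per year, scaled to per decade


# Synthetic 28-year series (336 months) warming at exactly 0.02 K per year.
synthetic = [0.02 * (k / 12.0) for k in range(336)]
print(round(trend_per_decade(synthetic), 3))
```

Applying the same function to a radiosonde series and to the matching subsampled satellite series gives the pair of numbers being compared in each zonal band.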

Summary Plots

Fig. 5.  Summary of homogenized radiosonde and MSU/AMSU TLT trends for 1979-2006, except for IUK, which is for the period 1979-2005. The four groups along the bottom are for averages made over different zonal bands: Global (75S-75N), Southern Extratropics (75S-30S), Tropics (30S-30N), and Northern Extratropics (30N-75N). Within each group, there are 3 subgroups. The leftmost subgroup is the area-weighted trend for each radiosonde dataset. The next two subgroups are the trends for RSS and UAH MSU/AMSU data subsampled at locations where each radiosonde dataset has valid data. The color coding of the symbols matches the color coding of the radiosonde trend symbols.

The error bars in the summary figures were found using a Monte-Carlo approach to model the internal error in the RSS datasets. The outer error bars are the 2-sigma error with uncertainty in the diurnal adjustment included. The inner error bars are the 2-sigma errors when we ignore the uncertainty in the diurnal adjustment. For more details see Mears et al., 2011.

Other Channels, Other Radiosonde Datasets

To see plots similar to Figs. 4 and 5, but for other channels and radiosonde datasets, please use our validation browse tool.


The excellent agreement between all TLT datasets in the northern extratropics should put to rest any doubt that there is significant warming of the troposphere in this region. This level of agreement is likely to be due to the high quality of most of the radiosonde stations in this region and the dense sampling of radiosondes (which increases the likelihood that neighbor-based homogenization techniques are successful), along with the relatively small diurnal adjustments needed for the satellite datasets.

In the tropics, the agreement between TLT datasets is not as good, with RSS typically showing more warming than either the UAH or radiosonde datasets. Both the radiosonde and satellite datasets may contain errors in this region. Many tropical radiosonde stations show substantial inhomogeneities, which may be undercorrected, leading to a cooling bias (Thorne et al., 2005; Sherwood et al., 2008), though the amount of this bias is unknown. The differences between the satellite datasets in the tropics are likely to be due to differences between the corrections the two groups make for diurnal drift.

In the southern extratropics, the two satellite datasets are again in good agreement with each other for TLT. This, combined with the very small diurnal adjustments needed due to the small amounts of land present, gives us confidence in the satellite data in this region, at least more than a few degrees from the edge of the Antarctic continent (see the discussion of problems at the Antarctic edge in Mears et al., 2009). Radiosonde data in this region are sparse, making the southern extratropical radiosonde averages prone to error. Also note that the different radiosonde datasets have very different sampling in the southern extratropics. For example, HadAT does not include the radiosonde stations along the Antarctic coast, which the other datasets do include.

As we move higher in the atmosphere, the range of trends found by the radiosonde datasets tends to increase. This increase in spread is much too large to be explained by differences in sampling, and thus is likely to be caused by differences in the homogenization procedures used to generate the datasets. The inhomogeneities in the radiosonde data tend to be larger at lower pressure, where the temperature sensing elements in the radiosonde packages become less firmly coupled to the air temperature and thus are more susceptible to the effects of solar heating. These larger inhomogeneities are harder to correct, perhaps leading to the larger spread in results. In the tropics and southern extratropics, most of the radiosonde datasets (except for RAOBCORE) show more cooling than either satellite dataset. This suggests that the radiosonde datasets are undercorrected, particularly in regions with sparse spatial sampling, where the neighbor analysis methods used by the homogenization efforts are most likely to have difficulties.

This leads us to suspect that the radiosonde results are less reliable higher in the atmosphere. This is likely to be at least part of the cause for the decreasing agreement between radiosonde results and satellite results as we move to higher altitudes.

The two satellite datasets also differ more from each other at higher altitude. There is a large difference between RSS and UAH TMT in the tropics, with RSS being in good agreement with the RAOBCORE results, and UAH showing agreement with the IUK and HadAT results. In the lower stratosphere, RSS shows substantially less cooling than UAH at all latitudes. Unfortunately, due to the suspect reliability of the radiosonde datasets in the stratosphere, they cannot be used to unambiguously determine which satellite dataset is closer to being correct.


Mears, C. A., F. J. Wentz, P. Thorne, and D. Bernie (2011), Assessing uncertainty in estimates of atmospheric temperature changes from MSU and AMSU using a Monte-Carlo estimation technique, J. Geophys. Res., doi:10.1029/2010JD014954, in press.

Haimberger, Leopold, 2007: Homogenization of Radiosonde Temperature Time Series Using Innovation Statistics. Journal of Climate, 20, 1377-1403.

Haimberger, Leopold, Tavolato, C., and Sperka, S., 2008: Towards the Elimination of Warm Bias in Historic Radiosonde Records -- Some New Results From a Comprehensive Intercomparison of Upper Air Data. Journal of Climate, 21, 4587-4606.

Sherwood, Steven C., 2007: Simultaneous Detection of Climate Change and Observing Biases in a Network With Incomplete Sampling. Journal of Climate, 20, 4047-4062, 10.1175/JCLI4215.1.

Sherwood, Steven C., Meyer, Cathryn L., and Allen, Robert J., 2008: Robust tropospheric warming revealed by iteratively homogenized radiosonde data.

Thorne, Peter W., Parker, David E., Tett, Simon F. B., Jones, Phillip D., McCarthy, Mark, Coleman, Holly, and Brohan, Philip, 2005: Revisiting Radiosonde Upper-Air Temperatures From 1958 to 2002. Journal of Geophysical Research, 110, D18105, doi:10.1029/2004JD005753.