RADIOSONDE DATA SPECIFICATION DOCUMENT
--------------------------------------
Henrik Vedel, DMI
The Danish Meteorological Institute, DMI, is providing radiosonde data for the
TOUGH project. This file gives a description of the format of the data files,
the way in which the specific humidity is deduced from the radiosonde data,
and of how to obtain the data.
Region covered:
---------------
The region from which the radiosonde data are extracted is defined by,
-30.0 <= longitude <= 40.0 degrees,
25.0 <= latitude <= 89.9 degrees.
"Meteorological" Radiosonde data
------------------------------
The radiosonde data which we obtain at the DMI from the GTS network, via which
meteorological data are distributed, contain geopotential height, temperature,
dewpoint temperature, wind speed and wind direction at selected pressure
levels. The levels are a combination of standard pressure levels and extra
levels at which the atmospheric profile was found to change behavior during
each particular radiosonde ascent.
Of these data we extract pressure, geopotential height, temperature, and
dewpoint temperature for the TOUGH project. Further the specific humidity is
calculated and included in the extracted data files, as it is the specific
humidity which enters the calculation of ztd.
About heights
-------------
For the purpose of comparing radiosonde data with GPS zenith tropospheric
delay, it is necessary to corretct for the offset in height between the GPS
site and the radiosonde site and to carry out a vertical integration. If doing
so, note that the heights provided in the radiosonde records are geopotential
heights, which differs from geometry heights. The reference surface of
reference for the heights is the geoid. The document
http://www.dmi.dk/f+u/publikation/vidrap/Sr00-04.pdf provides
further details on the height issue.
About calculation of relative humidity and specific humidity
based on radiosonde report data.
--------------------------------
For the purpose of calculating a value of the zenith tropospheric delay (ZTD)
from the radiosonde data, the specific humidity (q) is required. However, in
the WMO GTS Network radiosonde data, the only information provided for
deriving this is the dewpoint temperature. The radiosondes DO NOT measure
dewpoint temperature (T_dew), but rather relative humidity (RH). The ground
station equipment converts the measured relative humidity to dewpoint
temperatures before transmitting the radiosonde report via the GTS network.
There can be a systematic bias associated with this conversion if converting
back to RH using another formula. The bias between using a physically
'precise' formula and the formula actually used by the typical radiosonde ground
equibment we have found to be of the order 2.3 mm of ZTD. To circumvent these
problems we carry out the following steps:
1) Convert back from T_dew to relative humidity RH by using the inverse of the
formula used by the most common radiosonde ground equipment (we do not know
what is used at each site).
2) Convert from RH to specific humidity q, using also pressure and temperature
from the radiosonde report, and a precise formula for the water vapor
saturation pressure.
Below is a more detailed account of the equations used for the TOUGH radiosonde
dataset and some background information on the problem.
Some formulas and definitions
--------------------------------
The definition of relative humidity is
H = e/esat(T), (1)
where e is the actual water vapor pressure and esat(T) is the saturation water
vapor pressure at temperature T. The radiosonde measures H and T. The calculation
of esat given the temperature is described below.
The dewpoint temperature is defined as the temperature at which the water vapor
saturation pressure equals the actual water vapor pressure, in other words
esat(T_dew) = e = H*esat(T). (2)
The specific humidity, which is the quantity entering the nwp models as
well as the ztd calculation is defined as,
q = density_water_vapor/total_density_air = eps*e/(p-e*(1-eps)), (3)
where eps is the ratio of molecular weight of water to that of dry air, and
p is the total pressure (measured by the radiosonde is this case).
Deducing q it is tempting to take e = esat(T_dew) using the most precise formulae
for the function esat(T) available, but unfortunately we don't know which expression for
esat(T) was used in the original conversion from the measured (H,T) -> T_dew to get
the reported dewpoint temperature, so that approach might create a bias.
The Esat function
-----------------
Esat itself can be found by solving the so-called Classius-Clapeyron equation,
which is complicated and involves various assumptions. Various analytical
approximations to esat exist, of different complexity and precision.
The one used internally in the Hirlam model is rather precise (compared against
lab. data and more elaborate analytical expressions), having the form,
esat_hirlam = R2*exp(R3*(T-T0)/(T-R4)), (4)
R2= 610.78 Pa, T0=273.16 K,
R3_ice=21.875 K, R3_liquid=17.269 K,
R4_ice=7.66 K, R4_liquid=35.86 K,
with the 'ice' constants being used below -15 Celsius and the 'liquid' constants
above freezing, and a gradual change, linear in T, from the ice constants to the
liquid constants over the -15 to 0 Celsius interval. This being an ad hoc way to
mimic the existence of a three phase medium in that temperature interval.
Conversion from relative humidity H to dewpoint temperature T_dew
--------------------------------
For carrying out the conversion from H to T_dew, an approximate function can
be derived for inverting the esat(T) relation.
It is unclear for what historical reason the measured quantity, the relative
humidity, is not broadcast along with the the dewpoint temperature.
There is unfortunately no meteorological standard for the formula to use for
the conversion of H to T_dew (Johm Elms, UK Met Office, personal com. via the
WMO). Different radiosonde ground stations may use different formulas! During
meteorological data assimilation the dewpoint temperatures are converted back
to relative humidity or to specific humidity using each meteorological
office's standard formula. This mismatch is not a big concern for weather
models, as the errors introduced are small compared to other uncertainties,
both it the models and the radiosonde measurement of relative humidity (which
is good to about 1-2%). But it may create a systematic bias in comparisons
with independent datasets where the measurement accuracy may be more precise,
for example in comparisons with GPS ztd's, when data for longer periods are
treated statistically.
The Digicora approximate H => T_dew relation
--------------------------------
The Digicora radiosonde ground equipment uses the following formula for the
conversion between H and T_dew:
T_dew= T*2*K/ {T*ln(100/H%)+2*K}, with (5)
K= 15*ln(100/H%) -2.0*(T-273.15)+ 2711.5
Temperatures are in K, while H% underlines that in the above version of the
Digicora formula H should be specified in %.
This is NOT the most correct formula to use. Their point of view is that
because of the error on the H measurement is of the order of 1-2%, (referring
probably to absolute error) a high precision is not necessary for the
conversion. This argument fails when it comes to statistical consideration of
large portions of data. About half the stations are estimated to use the
Digicora formula (John Elms, UK Met Office, personal com. via the WMO), simply
because Digicora and similar equipment is the most widespread. Probably none
uses a version involving both the liquid and solid phase of water as the
internal hirlam routine.
Bias introduced through the approximate Esat function
--------------------------------
We can calculate the effect of using the more approximate Digicora relation
compared to using the more precise HIRLAM esat relation in the following manner.
- convert the T_dew back to H by inverting the Digicora relation, (5)
- calculate e = H*esat(T) using the precise HIRLAM esat relation, (4)
- calculate q(p,e) and integrate to get ZTD
This ztd_digicora estimate has essentially removed the effect of the
imprecise H=> T_dew conversion.
For comparison, we can carry out
- calculate e = esat(T_dew) using the precise HIRLAM esat relation
- calculate q(p,e) and integrate to get ZTD
This ztd_HIRLAM estimate has incorrectly assumed the digicora relation is equivalent
to the more precise HIRLAM relation.
Using the radiosonde profiles we have been extracting for the MAGIC
project (period 199811-200005) I have determined the offset in
ztd from using one rather than the other of the Digicora and Hirlam
conversion relations. I find,
= 2.28+/-1.02 mm N=107520
<_station> = 2.31+/-0.38 mm N=135
A small, but clearly significant bias.
(To be clear: This just compares radiosonde profiles against
themselves using ztd as a measure and using different formulas for the
Td->H conversion. There are no hirlam model runs and no gps data
involved at all.)
Based on this comparison, we have decided that it is necessary to remove
the effect of the imprecise initial H => T_dew conversion when providing
specific humidities for the MAGIC radiosonde dataset.
Undoing the initial imprecise H=> T_dew conversion
--------------------------------
As stated above, each meteorological office has a method for converting
the radiosonde dewpoint temperatures back to relative humidities.
A third relation for esat, provided below, is the one used currently
in the Hirvda assimilation package being developed for Hirlam
(the conversion routine comes from Ecmwf I believe),
esat_hirvda = R2*exp(R3*(T-T0)/(T-R4)), (6)
R2=611.21 Pa, R3=17.502 K, R4=32.19 K, T0=273.16 K
This equation provides a means for converting back to relative humidity
that is equivalent to inverting the imprecise Digicora formula.
The difference in ztd's based on the Digicora and Hirvda routines is
very small, of the order 0.01mm, with a standard deviation of the order 0.1mm,
demonstrating that the Hirvda routine successfully removes the bias introduced
by the imprecise Digicora formula.
Offsets in relative humidity between the Digicora and the Hirvda routine
do exist, but they are much smaller than between Hirlam and Digicora and
appear at temperatures at which the humidity is very low. The use of esat_hirvda
over the Digicora relation is preferred however, as not all radiosonde stations
use Digicora equipment, and as a functional form for esat(T) is sometimes
useful.
Implementation for the TOUGH dataset
--------------------------------
It has been decided, therefore, to use the following approach when deriving
specific humidities for TOUGH from radiosonde reports.
We first find the best approximation to the original relative humidity
measured by the radiosonde using the Hirvda formula, (6).
H = esat_hirvda(T_dew)/esat_hirvda(T), (7)
with T_dew & T from the report.
Then we use the most precise formula for saturation pressure from the
HIRLAM internal system to transform from relative humidity to specific
humidity.
q = eps*H*esat_hirlam(T)/(p-H*esat_hirlam(T)*(1-eps)), (8)
with p & T from the report, and H from the line above.
This approach is supposed to give reasonable specific humidities in an average
sense. If the conversion formula used at a particular radiosonde station is
known, or if the measured relative humidities are available directly,
then then this results may be improved a bit.
Important: the original T_dew dewpoint temperatures from the GTS network
are included in the TOUGH datafiles.
Warning against bad data.
-------------------------
The data we extract have
passed a low level quality control only. Some will still contain
erroneous measurements. Whenever the number "-9999.90" appears in
the data files it means the observation in question is missing or
noted to be wrong. Other data can be errorneous as well, but harder
to identify.
Format of the radiosonde data files containing daily data:
-----------------------------------------------------------
On a daily basis the DMI extracts radiosonde data.
Each station report is written as a separate file,
named identifier.time, where identifier is the international 5 digit
identifier of the station, and time is the time of the
radiosonde launch in the format yyyymmhh. The files will be
in ascii format, the content structured as:
Line 1: #, ccode, name
Line 2: ident, lat, lon, alt, nlev, yyyy, mm, dd, hh, mm
Line 3: p(1),phi(1),T(1),T_dew(1),q(1)
Line 4: p(2),phi(2),T(2),T_dew(2),q(2)
Line 5: .
Line 6: .
Line 7: .
Where
# = start of a new record
ccode = standard letter code for countries
name = name of the station (e.g. name or nearby town or locality)
ident = The international identifier used for identification
of meteorological stations, a five digit integer number,
may include leading zeroes.
lat = latitude in degrees
lon = longitude in degrees
alt = altitude in meters
nlev = number of levels at which there are reports.
yyyy = year (4 digits)
mm = month (2 digits)
dd = day (2 digits)
hh = hour (2 digits)
mm = minutes (2 digits)
p = pressure [Pascal]
phi = geopotential [m**2/s**2]
T = temperature [K]
T_dew = dewpoint temperature [K]
q = specific humidity
A fortran routine writing the information listed above in the format
we are using is provided as a separate document.
Notice:
Normally the pressure levels appear in order of decreasing pressure in
the reports, but sometimes this is not the case throughout the full
sequence. In some cases the whole or part of the pressure sequence
might be repeated, often with one type of data reported missing in the
first set and not in the second. We have not corrected for such things.
The number nlev simply counts the number of pressure levels for which
there were reports in the original radiosonde data file = the number
for which we have extracted data.
Secondly, one should know that the equipment in radiosondes sometimes
gives wrong results, though not obviously so. Liquid water on the
instruments has a degrading effect on the measurements. Something of
particular importance to the estimates of the dewpoint temperature (and
thereby also the specific humidity).
Having said this, radiosondes in general provide high quality
measurements, and they are independent of meteorological models
and their shortcomings.
Archiving:
----------
The data are stored time wise as well as station wise.
The time wise stored data can be found as
./met/rs/recent/yyyymm/rsyyyymmdd.tar.gz at the ftp-server ftp.dmi.dk,
with yyyymmdd=year+month+day_of_month in a 4+2+2 digit format.
Data stored station wise can be found as
./met/rs/past/yyyymm/stationid.yyyymm.gz at the ftp-server,
with the station-id being the 5 digit international identifier for each
station (evt. with leading 0's). See, the ./met/rs/rs.list file
for a conversion of this number to names and coordinates.
The station archives are made using the Unix cat command with the
input file having the format specified above.
Updates of the 'recent' archives are made daily with a delay of a two
days relative to current time. The purpose of the delay is to ensure that
almost all radiosonde data for that day has arrived at DMI. In near
real time some stations would be sometimes missing.
Updates of the 'past' archives are made once a month
Old data may get removed from the DMI ftp server if disc space
becomes a problem.
Access to the DMIs ftp server, placement of radiosonde data
-----------------------------------------------------------
Connect by ftp to ftp.dmi.dk Give "tough" as your username
and give the usual TOUGH password as your password. Upon being
accepted you'll be located in the top tough directory on the
ftp-server. All material related to the TOUGH project will be
in this directory or its sub-directories.
In case of problems getting on to dmi ftp server try the ACRI site,
following the description further below.
The radiosonde data will be located in the subdirectories
./met/rs/recent/yyyymm and ./met/rs/past/yyyymm, with yyyymm
corresponding to the year and the month in a 4+2 digit format.
The main magic directory will contain files describing the
structure of the site and eventual updates in the procedures. The
directory ./met/rs will contain a file called rs.list,
which will list all radiosonde stations for which data have been
extracted and placed on the server during the MAGIC period of operation.
In rs.list there will be one line for each radiosonde specifying:
ident, ccode, name, lat, lon, alt
where,
ident = 5 digit meteorological identifier
ccode = country code
name = name of radiosonde station
lat = latitude [degrees]
lon = longitude [degrees]
alt = altitude [m]
Note, that as radiosondes are sometimes malfunctioning or stations taken
out of service, there might well be radiosonde stations listed in the
file rs.list for which there are no data available for a period.
Access to radiosonde data at ACRI-ST
------------------------------------
The radiosonde data extracted by DMI are to be made available also from
a ftp-server at ACRI-ST. Currently this service is not operating.
Time period.
------------
The extraction is done starting at 2003 04 01 and shall continue during
the duration of the TOUGH project.
Acknowledgements.
-----------------
Jennifer Haase and Digicora are thanked for help leading to the current
treatment of humidity in the radiosonde reports extracted by DMI for TOUGH.