Meteorological data from ground stations

From Agri4castWiki
Revision as of 13:59, 19 February 2014 by Meteoconsult (talk | contribs) (Conversion to daily values)

General description

The processing of observed station weather into the MCYFS involves four steps:

preprocessing of station weather data

Data acquisition from weather stations

Weather stations (black dots) for which data are available for (part of) the period from 1975 until the current day

The stations are limited to those that regularly collect data and can supply the data in near real time (Burrill and Vossen, 1992). Relevant station information includes station number, station name, latitude, longitude and altitude. This information is available in the table STATION.

Some of the historic meteorological data are purchased directly from National Meteorological Services. Others are acquired via the GTS. As data are obtained from a variety of sources, considerable preprocessing is necessary to convert them to a standard format. Two different procedures are applied for distinct subsets of the data. Around 1992 the historic meteorological data represented approximately 380 stations in the EU, Switzerland, Poland and Slovenia with data from 1949 to 1991 (Burrill and Vossen, 1992). The historic sets were later extended with stations in eastern Europe, western Russia, the Maghreb and Turkey. The historic data are converted into consistent units and checked for realistic values. The database was also scanned for inconsistencies, such as successive days with the same value for a variable, or minimum temperatures higher than maximum temperatures (Burrill and Vossen, 1992).

From 1991 to present, meteorological data are received in near real time from sources like the GTS network for different hours within one day. The data are pre-processed and quality checked using the AMDAC software package (MeteoConsult, 1991) which extracts, decodes and processes the GTS data.

Available stations

Available temperature stations 1975
Available temperature stations 2009

The station database stored in table STATION holds over 7100 stations distributed over 74 countries. Over 3000 of these stations provide weather data in near real time. All weather data are stored in the station weather database (table WEATHER_OBS_STATION), which currently holds over 33.8 million records. The figures on the right illustrate the increase in the number of stations available for the temperature indicator between 1975 and 2009. In general the station density in the monitored areas is considered sufficiently high for the purpose of the project.

Raw station data are collected from various sources:

Data from outside the ECOMET area are transmitted from the KNMI as if WMO essential (GTS). A number of countries in Europe, especially in the east, are aiming to become members of ECOMET. This might lead to a reduction in the amount of data freely available.

Priority is given to meteorological stations located in agricultural zones and evenly distributed over the mainland (rather than on islands, for Portugal, Spain and Greece in particular). For western Russia (west of the Urals), the main areas covered are the agricultural districts.

Basic indicators

The basic indicators that are received from weather stations include:

  • Precipitation
  • Temperature
  • Measured radiation
  • Sunshine
  • Cloud cover
  • Vapour pressure
  • Wind speed
  • Snow depth
  • Humidity

Observations of maximum and minimum temperatures, precipitation amounts and sunshine duration (when available) are contained in the main-hour synoptic reports. METAR data provide temperature, dew point, visibility and cloud amount. As far as available, they can be used for intermediate or even non-standard (i.e. all but main and intermediate) hours. From most countries outside Europe, 3-hourly synoptic data are exchanged worldwide.

Data quality check

The software package Actual Meteorological Database Construction (AMDAC) is used to perform decoding, filling and quality evaluation of actual meteorological data which are used as input for agro-meteorological models. The chain of data processing and quality control can be described as follows:

Near real-time pre-processing (3-hourly data)

  • Decode intermediate-hour and main-hour SYNOP reports and METAR reports from weather stations circulating on the GTS in the geographical zone of interest;
  • Extract or calculate and store the meteorological parameters in a separate database;
  • Check the quality of the observed elements in the received weather reports by performing boundary and time consistency checks; the latter is done by comparing the values of reported parameters with those previously or subsequently reported from the same station;
  • Correct automatically obvious errors detected while performing these checks;
  • Automatically fill gaps in the database through interpolation based on time consistency criteria;
  • Flag dubious observations which cannot be corrected automatically;
  • Write all automatic corrections and flagged dubious observations to a log file;
  • Have the flagged observations checked and, if necessary, corrected by a trained meteorologist; when a correction is done, the derived parameters are recalculated and the data are written back to the database.
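The automatic part of this chain (boundary check, time consistency check, gap filling and flagging) can be sketched as follows. This is only an illustration and not the AMDAC implementation: the limits `T_MIN`, `T_MAX` and `MAX_STEP`, the function name `check_series` and the choice to fill only single gaps by linear interpolation are assumptions made for the example.

```python
from typing import Optional

# Hypothetical limits for 3-hourly air temperature (deg C); not taken from AMDAC.
T_MIN, T_MAX = -35.0, 50.0
# Hypothetical maximum plausible jump between consecutive 3-hourly reports (deg C).
MAX_STEP = 15.0


def check_series(values: list[Optional[float]]):
    """Return (cleaned series, indices flagged as dubious, log of actions)."""
    cleaned = list(values)
    flagged: list[int] = []
    log: list[str] = []

    # 1. Boundary check: physically impossible values are treated as obvious errors.
    for i, v in enumerate(cleaned):
        if v is not None and not T_MIN <= v <= T_MAX:
            cleaned[i] = None
            log.append(f"index {i}: value {v} outside bounds, removed")

    # 2. Time consistency: a value that jumps away from both neighbours is dubious.
    for i in range(1, len(cleaned) - 1):
        prev, cur, nxt = cleaned[i - 1], cleaned[i], cleaned[i + 1]
        if None in (prev, cur, nxt):
            continue
        if abs(cur - prev) > MAX_STEP and abs(cur - nxt) > MAX_STEP:
            flagged.append(i)
            log.append(f"index {i}: value {cur} flagged for manual check")

    # 3. Gap filling: interpolate single missing values between two known neighbours.
    for i in range(1, len(cleaned) - 1):
        if cleaned[i] is None:
            a, b = cleaned[i - 1], cleaned[i + 1]
            if a is not None and b is not None:
                cleaned[i] = (a + b) / 2
                log.append(f"index {i}: gap filled with {cleaned[i]}")

    return cleaned, flagged, log
```

Flagged values are deliberately left unchanged in the cleaned series, since in the chain described above they are handed to a meteorologist rather than corrected automatically.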

Dedicated trained and qualified meteorologists go through the dubious observation values that are flagged as such by the AMDAC automatic pre-checking program. The MIDAS software application is used to graphically visualize and analyze additional information such as:

  • Station observation data
  • Satellite images
  • Radar data

which may serve as the basis of either confirmation or rejection of the observed values.

Conversion to daily values

Once the database has been filled following the method described above, the 3-hourly data are aggregated to daily values. This includes the following indicators:

  • Precipitation (daily and 6-hourly)
  • Temperature (daily maximum, daily minimum and 3-hourly)
  • Measured radiation
  • Sunshine
  • Cloud cover
  • Vapour pressure
  • Wind speed
  • Snow depth
  • Humidity (3-hourly)
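The aggregation rules themselves are not spelled out here. As a rough sketch under stated assumptions, the example below sums the four 6-hourly precipitation amounts, averages the 3-hourly wind speeds, and uses reported daily extremes when present, falling back to the extremes of the 3-hourly temperatures otherwise. The function name `to_daily`, its arguments and the fallback rule are hypothetical.

```python
from statistics import mean
from typing import Optional, Sequence


def _valid(values: Sequence[Optional[float]]) -> list[float]:
    """Drop missing (None) observations."""
    return [v for v in values if v is not None]


def to_daily(temp_3h, precip_6h, wind_3h, reported_tx=None, reported_tn=None):
    """Aggregate one day of sub-daily observations to daily values."""
    temps, rain, winds = _valid(temp_3h), _valid(precip_6h), _valid(wind_3h)
    return {
        # Prefer the reported extremes; otherwise fall back to the 3-hourly values.
        "Tx": reported_tx if reported_tx is not None else (max(temps) if temps else None),
        "Tn": reported_tn if reported_tn is not None else (min(temps) if temps else None),
        # A daily sum is only formed when all four 6-hourly amounts are present.
        "RRR": sum(rain) if len(rain) == 4 else None,
        "ff10": mean(winds) if winds else None,
    }
```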

A final check is then performed on these daily values before an output file is created for further processing. This automated quality check consists in verifying the data according to the table below. If errors are found, the meteorologist will check the data again and make modifications if relevant.

Parameter | Symbol | Constraint
Daily mean of total cloud cover | N | 0 to 8 octas
Measured sunshine duration | MeaSun | 0 to 24 hours
Measured radiation | RadMea | 0 to 36 MJ/m²
Minimum temperature | Tn | -35 to 35°C, depending on region
Maximum temperature | Tx | -20 to 50°C, depending on region
Maximum temperature - Minimum temperature | Tx-Tn | 0 < Tx-Tn < 30°C
Daily mean vapour pressure | e | 0 to 35 hPa, depending on region
Daily mean wind speed at 10 metres | ff10 | 0 to 15 m/s
Amount of precipitation from 6 UTC to 6 UTC | RRR | 0 to 140 mm, depending on region
Air temperature | TT | -35 to 50°C, depending on region
Relative humidity | RH | 5 to 100%, depending on region
Daily mean vapour pressure deficit | vpd | 0 to 60 hPa, depending on region
Daily mean slope of saturation vapour pressure vs. temperature curve | slope | 0 to 3 hPa/°C
Daytime mean of total cloud cover | N | 0 to 8 octas
Penman evaporation | ETP | 0 to 25 mm/day, depending on region
Snow cover | SNOW | (Tn+Tx)/2 < 10°C
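A range check of this kind could look like the sketch below. It uses the widest limits from the table and ignores the regional variation, which is not specified here. The names `LIMITS` and `check_daily` are hypothetical, and the snow rule is one possible reading of the constraint: reported snow is only accepted when the mean of Tn and Tx is below 10°C.

```python
# Widest limits from the table above; regional adjustments are not modelled.
LIMITS = {
    "N": (0, 8), "MeaSun": (0, 24), "RadMea": (0, 36),
    "Tn": (-35, 35), "Tx": (-20, 50), "e": (0, 35),
    "ff10": (0, 15), "RRR": (0, 140), "TT": (-35, 50),
    "RH": (5, 100), "vpd": (0, 60), "slope": (0, 3), "ETP": (0, 25),
}


def check_daily(record: dict) -> list[str]:
    """Return the names of all constraints violated by one daily record."""
    errors = []
    for name, (low, high) in LIMITS.items():
        value = record.get(name)
        if value is not None and not low <= value <= high:
            errors.append(name)

    tn, tx = record.get("Tn"), record.get("Tx")
    if tn is not None and tx is not None:
        # The daily temperature range must be positive and below 30 degrees.
        if not 0 < tx - tn < 30:
            errors.append("Tx-Tn")
        # Assumed interpretation: snow is implausible on a warm day.
        snow = record.get("SNOW")
        if snow is not None and snow > 0 and (tn + tx) / 2 >= 10:
            errors.append("SNOW")
    return errors
```

Records that return a non-empty list would be the ones passed back to the meteorologist for a second look.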

The ranges in the above table are wide but can still be exceeded in some countries. For instance, in summer maximum temperatures above 40°C are quite common in the southern countries. However, in most situations it is known in advance that the potential range for a variable is much smaller. To detect small potential errors, the MOS (Model Output Statistics) forecasting system of Meteo Consult is used. The MOS forecast errors for the first 24 hours are usually very small. Therefore the observation thresholds are defined as a small range around the MOS forecast. If the observation is outside this range, it is flagged for manual control by a meteorologist (this already happens during the pre-processing checks of the 3-hourly values). MOS takes seasons and the applicable climatology into account. Excessive persistence is very unlikely and spatial consistency is large. This is different for rainfall (see Consecutive zero values for rainfall).

A system is currently being implemented that will store the various flags at station level. It still has to be decided which flags need to be included. At least the period definition of the daily rainfall sum will be included.

Finally, the metadata of all stations in the MCYFS database are checked once a year.
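The threshold test around the MOS forecast reduces to a simple comparison. The sketch below assumes a symmetric tolerance around the forecast value; the function name `outside_mos_range` and the tolerance argument are hypothetical, since the actual width of the range is not given here.

```python
def outside_mos_range(observed: float, mos_forecast: float, tolerance: float) -> bool:
    """True when the observation deviates from the MOS forecast by more than the tolerance."""
    return abs(observed - mos_forecast) > tolerance
```

An observation for which this returns True would be flagged for manual control rather than rejected outright.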

Consecutive zero values for rainfall

Erroneous series of consecutive zero rainfall values are difficult to detect. Such errors can only be identified by inspecting longer time series going back several weeks to several months.

The following procedure is in place:

  • Each month the rainfall sum over the last 30, 90 and 180 days is checked. Flagged data that have a suspiciously low rainfall sum over the analysed periods are compared with surrounding stations with the MIDAS work station and the MARS viewers.
  • In case of suspicious data the historic time series of this specific station and surrounding stations are retrieved.
  • If the time series of the station are found to be wrong (thus wrongly zero for a long period) the following actions are executed:
    • The station is added to a black list: the station is immediately excluded from the operational station list.
    • The erroneous time series are deleted from RAIN_OBS_STATION and the RAIN value in WEATHER_OBS_STATION is set to ‘Null’.
    • All affected grid cells (WEATHER_OBS_GRID) and regions (WEATHER_OBS_REGION and WEATHER_OBS_AEZ) are reprocessed. In case these erroneous data were also used in the crop simulation and yield forecast these data sets are also reprocessed.
    • Before the data are mirrored to the analysts, the analysts are informed, to secure an optimal analysis environment.
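The monthly check on rainfall sums can be sketched as below. The comparison rule is an assumption: a window is marked suspicious when the station's sum is below a fraction `RATIO` of the mean sum of the surrounding stations. The value of `RATIO` and the function name `suspicious_windows` are hypothetical, and in practice the comparison is made by a meteorologist using the MIDAS work station and the MARS viewers.

```python
WINDOWS = (30, 90, 180)  # days, as in the procedure above
RATIO = 0.1              # hypothetical threshold, not taken from the MCYFS


def suspicious_windows(station: list[float], neighbours: list[list[float]]) -> list[int]:
    """Return the window lengths for which the station's rainfall sum looks too low.

    All series hold daily rainfall in mm, most recent day last.
    """
    flagged = []
    for days in WINDOWS:
        if len(station) < days:
            continue
        own_sum = sum(station[-days:])
        neighbour_sums = [sum(n[-days:]) for n in neighbours if len(n) >= days]
        if not neighbour_sums:
            continue
        reference = sum(neighbour_sums) / len(neighbour_sums)
        # A dry station among dry neighbours is not suspicious.
        if reference > 0 and own_sum < RATIO * reference:
            flagged.append(days)
    return flagged
```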

Once a year each station on the blacklist is verified and it is decided if it can be restored to the operational work flow. Falsely blocked data is backordered and reprocessed.

Sufficient observations per country

Each month an overview is created showing the number of stations delivered per country. Information is also added on sudden changes and follow-up actions. Similar listings are made on a daily basis for internal use.

Example monthly overview (pdf)

Example daily overview (pdf)

Ingestion into the database

After the station weather data have passed all checks, they are exported to a fixed-format ASCII file (S-file) that contains the data of a single day and can be imported into the table WEATHER_OBS_STATION.

Calculation of advanced parameters

Global radiation

Global radiation is the daily sum of incoming solar radiation that reaches the earth's surface. It is mainly composed of wavelengths between 0.3 μm and 3 μm. Approximately half of the incoming radiation, the part with wavelengths between 0.4 and 0.7 μm, is Photosynthetically Active Radiation (PAR). Global radiation is the driving variable in the growth-determining CO2 assimilation process and thus crop growth models are sensitive to radiation data (van Diepen, 1992). A major problem is the scarcity of measured global radiation. Where no direct observations are available, it must be derived from sunshine duration, cloud cover and/or temperature, on the basis of relatively weak relationships.

The global radiation calculation uses one of three formulae (Ångström-Prescott, Supit-Van Kappel, and Hargreaves), depending on the availability of meteorological parameters. An important component in these formulae is the amount of Angot radiation which is the extraterrestrial radiation integrated over the day at certain latitude on a certain day. In fact, all of the three formulae estimate the fraction of Angot radiation actually received at the earth surface. The calculation of the Angot radiation and the three different formulae are described by Supit et al. (1994) and van der Goot (1998a).

Ångström-Prescott, Supit-Van Kappel, and Hargreaves regression constants

The main problem with the application of the Ångström-Prescott, Supit-Van Kappel, and Hargreaves formulae is the quality of the regression constants. Studies by Supit (1994), Supit and van Kappel (1998) and van Kappel and Supit (1998) showed no relationship between latitude and the coefficients, although such a relation is frequently used to estimate these regression constants. Initially Supit and van Kappel (1998) and van Kappel and Supit (1998) obtained sets of regression constants for the formulae for as many weather stations as possible, with a geographic distribution that corresponds to the area of interest for the MCYFS. As a result, a set of 256 reference stations was identified for which a relevant set of measured radiation data and other parameters in the formulae existed. For these stations regression constants were calculated based on measured radiation data for the three formulae mentioned above.

In 2012 the regression coefficients of these solar radiation models were updated using a new set of weather station data and an alternative source of radiation data: 6 years (2005-2010) of the down-welling surface shortwave radiation flux (DSSF) 30-minute product derived from Meteosat Second Generation satellite data by the Land Surface Analysis Satellite Applications Facility (LSA SAF) (Bojanowski et al., 2013). For each solar radiation model a set of weather stations was selected that had sufficient observations of either sunshine duration, cloud cover and temperature, or temperature only, to perform a regression analysis. Results are stored in table STATION_REFERENCE_COEFFICIENTS.

The program SupitConstants uses this set of data (via the view SUPIT_REFERENCE_STATIONS), consisting of latitude, longitude, altitude and calculated regression constants, to derive the regression constants for all stations in the MCYFS. Interpolation of the regression constants of the reference stations to other stations is based on a distance weighted average of the three nearest stations. This process is carried out once, unless the set of reference stations changes or when new stations are added.
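The interpolation step can be illustrated with the sketch below. It is not the SupitConstants code: it assumes inverse-distance weights, great-circle (haversine) distances, and ignores altitude, none of which is specified above. The function names and the layout of `references` are hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in decimal degrees."""
    p1, p2 = radians(lat1), radians(lat2)
    dlat, dlon = p2 - p1, radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(p1) * cos(p2) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def interpolate_constants(lat, lon, references):
    """Distance-weighted average of the constants of the three nearest reference stations.

    references: list of (lat, lon, constants), with constants a tuple of floats.
    """
    by_distance = sorted(
        ((haversine_km(lat, lon, rlat, rlon), consts) for rlat, rlon, consts in references),
        key=lambda item: item[0],
    )
    nearest = by_distance[:3]
    # A station that coincides with a reference station takes its constants directly.
    if nearest[0][0] == 0.0:
        return tuple(nearest[0][1])
    weights = [1.0 / distance for distance, _ in nearest]
    total = sum(weights)
    n_constants = len(nearest[0][1])
    return tuple(
        sum(w * consts[i] for w, (_, consts) in zip(weights, nearest)) / total
        for i in range(n_constants)
    )
```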

Interpolated regression constants are written in the table SUPIT_CONSTANTS and copied to table STATION. After the regression constants have been established for all stations, global radiation can be calculated by CGMS using any one of the above formulae. Finally, the CGMS writes the derived daily global radiation of every station in the table WEATHER_OBS_STATION_CALCULATED (see flowchart).

The following hierarchical method is used to calculate global radiation in CGMS (Supit and van Kappel, 1998). If observed/measured global radiation is available it will be used.

Angot radiation

The principal component of all calculations is the extraterrestrial radiation, or Angot radiation. In fact, all of the global radiation calculations estimate the fraction of Angot radiation actually received. The extraterrestrial radiation is calculated as:
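The formula itself is not reproduced on this page. As an illustration only, the sketch below computes daily extraterrestrial radiation with the formulation of FAO Irrigation and Drainage Paper 56 (Allen et al., 1998); the exact expression used in CGMS is described by Supit et al. (1994) and may differ in detail. The function name `angot_radiation` is hypothetical.

```python
from math import acos, cos, pi, radians, sin, tan

SOLAR_CONSTANT = 0.0820  # MJ m-2 min-1, value used in FAO-56


def angot_radiation(latitude_deg: float, day_of_year: int) -> float:
    """Daily extraterrestrial radiation in MJ m-2 day-1 (FAO-56 formulation)."""
    phi = radians(latitude_deg)
    # Inverse relative distance Earth-Sun and solar declination.
    dr = 1 + 0.033 * cos(2 * pi * day_of_year / 365)
    delta = 0.409 * sin(2 * pi * day_of_year / 365 - 1.39)
    # Sunset hour angle; clamped so polar day and polar night do not raise errors.
    x = max(-1.0, min(1.0, -tan(phi) * tan(delta)))
    omega = acos(x)
    return (24 * 60 / pi) * SOLAR_CONSTANT * dr * (
        omega * sin(phi) * sin(delta) + cos(phi) * cos(delta) * sin(omega)
    )
```

In the same formulation the astronomical day length, needed for the relative sunshine duration, follows from the sunset hour angle as 24/π times `omega`.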

Ångström-Prescott formula

In case sunshine duration is available, global radiation is calculated using the equation postulated by Ångström (1924) and modified by Prescott (1940). The two constants in this equation depend on the geographic location.

Supit-Van Kappel formula

When sunshine duration is not available but minimum and maximum temperature and cloud cover are known, the Supit-Van Kappel formula is used, which is an extension of the Hargreaves formula (Supit, 1994). Again, the regression coefficients depend on the geographic location.

Hargreaves formula

Finally, when only the minimum and maximum temperatures are known the equation of Hargreaves et al. (1985) is used. Again, the regression coefficients depend on the geographic location.
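The hierarchy described in the sections above can be sketched as a single function. The functional forms are written as they are commonly given in the cited literature (Supit et al., 1994; Supit and van Kappel, 1998) and should be checked against those sources. The function name, the argument names and the layout of the `constants` dictionary are hypothetical, and the station-specific regression constants must be supplied by the caller.

```python
from math import sqrt


def global_radiation(angot, constants, measured=None, sunshine=None,
                     day_length=None, tmin=None, tmax=None, cloud=None):
    """Estimate daily global radiation (same units as angot), or None if impossible.

    constants: {"angstrom": (a, b), "supit": (a, b, c), "hargreaves": (a, b)}
    cloud: mean total cloud cover in octas (0 to 8).
    """
    # 1. Measured radiation always takes precedence.
    if measured is not None:
        return measured
    # 2. Angstrom-Prescott: fraction depends on relative sunshine duration.
    if sunshine is not None and day_length:
        a, b = constants["angstrom"]
        return angot * (a + b * sunshine / day_length)
    if tmin is not None and tmax is not None:
        temp_term = sqrt(max(tmax - tmin, 0.0))
        # 3. Supit-Van Kappel: temperature range plus cloud cover.
        if cloud is not None:
            a, b, c = constants["supit"]
            return angot * (a * temp_term + b * sqrt(1 - cloud / 8)) + c
        # 4. Hargreaves: temperature range only.
        a, b = constants["hargreaves"]
        return angot * a * temp_term + b
    return None
```

Passing the constants explicitly keeps the sketch free of invented coefficient values; in the MCYFS they would come from the interpolated per-station constants.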


Evapotranspiration

Daily data collected from weather stations do not contain potential evapotranspiration. This parameter is calculated by the CGMS with the well-known Penman-Monteith equation (Allen et al., 1998). In general, the evapotranspiration from a reference surface, the so-called reference crop evapotranspiration or reference evapotranspiration, can be described by the FAO Penman-Monteith equation:

Evapotranspiration from a wet bare soil surface (ES0) and from a water surface (E0) are calculated with the well-known Penman formula (Penman, 1948). Only the albedo and surface roughness differ for these two types of evapotranspiration as explained below.
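The equation is not reproduced on this page. For illustration, the daily FAO Penman-Monteith equation as published in FAO Irrigation and Drainage Paper 56 (Allen et al., 1998) can be written as the function below. The function and argument names are hypothetical, and the pressure terms are in kPa rather than the hPa used elsewhere on this page.

```python
def reference_et0(delta, net_radiation, soil_heat_flux, gamma,
                  mean_temp, wind_2m, vp_deficit):
    """Reference evapotranspiration ET0 in mm/day (FAO-56 Penman-Monteith).

    delta          slope of the saturation vapour pressure curve (kPa/degC)
    net_radiation  net radiation at the crop surface (MJ m-2 day-1)
    soil_heat_flux soil heat flux density (MJ m-2 day-1), about 0 for daily steps
    gamma          psychrometric constant (kPa/degC)
    mean_temp      mean daily air temperature at 2 m (degC)
    wind_2m        wind speed at 2 m (m/s)
    vp_deficit     saturation minus actual vapour pressure (kPa)
    """
    radiation_term = 0.408 * delta * (net_radiation - soil_heat_flux)
    aerodynamic_term = gamma * 900.0 / (mean_temp + 273.0) * wind_2m * vp_deficit
    return (radiation_term + aerodynamic_term) / (delta + gamma * (1 + 0.34 * wind_2m))
```

Note that the wind speed in this equation refers to 2 m height, whereas the station data hold the wind speed at 10 metres, so a height conversion is needed before the equation is applied.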

The net absorbed radiation depends on incoming global radiation, net outgoing long-wave radiation, the latent heat and the reflection coefficient of the considered surface (albedo). For E0 and ES0 albedo values of 0.05 and 0.15 are used respectively. The evaporative demand is determined by humidity, wind speed and surface roughness. For a free water surface and for the wet bare soil (E0, ES0) a surface roughness value of 0.5 is used. For a more detailed description of the underlying formulae we refer to Supit et al. (1994) and van der Goot (1997).

The calculated E0, ES0, and ET0 are stored in table WEATHER_OBS_STATION_CALCULATED.