CMETEO

From Agri4castWiki
Revision as of 15:53, 21 February 2014 by Hendrik2 (talk | contribs) (The process)


Introduction

CMETEO stands for CORINE-meteo. The CMETEO software tool is an application suite that aggregates meteo indicators, as found in e.g. the CGMS12EU.WEATHER_OBS_GRID data set or similar data sets at grid resolution (the meteo grid). Examples of meteo indicators are temperature (maximum, minimum, mean or daily), wind speed, precipitation, radiation, potential transpiration and evaporation. In the context of MARS, the weather indicators are given on a regular grid. CMETEO aggregates these indicators to spatial units such as administrative regions (NUTS levels 3 to 0) or agri-environmental regions, using either:

  • a general area weighted algorithm
  • a landcover (or even crop) specific area weighted algorithm.
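Both algorithms come down to an area-weighted mean; they differ only in which area is used as the weight. The following is a minimal sketch, not the CMETEO implementation: the function name and the sample numbers are hypothetical, and it assumes each grid cell contributes one value and one area per region.

```python
from typing import Iterable, Optional, Tuple


def area_weighted_mean(cells: Iterable[Tuple[float, float]]) -> Optional[float]:
    """Return the area-weighted mean of (value, area) pairs, or None without area."""
    total_area = 0.0
    weighted_sum = 0.0
    for value, area in cells:
        if area <= 0:
            continue  # a cell without area in the region carries no weight
        weighted_sum += value * area
        total_area += area
    if total_area == 0:
        return None
    return weighted_sum / total_area


# Hypothetical grid cells overlapping one region: (temperature, area).
# General weighting: the area is the part of the cell inside the region.
general = area_weighted_mean([(10.0, 30.0), (20.0, 10.0)])
# Landcover specific weighting: the area is the arable land of each cell.
arable = area_weighted_mean([(10.0, 5.0), (20.0, 15.0)])
```

With these sample numbers the general mean is pulled towards the first cell, while the landcover specific mean is pulled towards the second, because most of the arable land lies there.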

The process

The process can be summarised as follows.

When activated, CMETEO first uses the supplied interface parameters to initialise the other parameters that steer the process:

  • parameters according to the user name (ROI), theme and spatial type (regions, agri-environmental zones):
    • type of aggregation {relevant for weather forecast members: plain (in case of one or no members); median (in case of 50/51 members)}
    • type of interval {day, dekad}
    • whether to include a crop specific aggregation {yes, no}
    • thresholds that determine whether a grid cell is included, based on its landcover or crop specific area {a comma-separated list of integers, one for each threshold needed, in the range 0 to 99}
    • a list of all landcovers that are relevant for the ROI
    • a list of all cmeteo-members, each being a combination of a day-offset (day in forecast depth) and a member of a meteo-ensemble. Where no weather forecast data apply, there is only one cmeteo-member and no day-offset, e.g. for data sets like CMETEO_GRID_WEATHER (see the description of the interface for details).
  • a list of all days to process
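The notion of cmeteo-members and the two aggregation types can be illustrated as follows. This is a sketch under assumptions: the function names are hypothetical, and it assumes that 'median' means the median across ensemble members for each day-offset, which the text above does not state explicitly.

```python
from itertools import product
from statistics import median
from typing import Dict, List, Tuple

Member = Tuple[int, int]  # (day_offset, member_no)


def cmeteo_members(day_offsets: List[int], member_nos: List[int]) -> List[Member]:
    """Every combination of a day-offset and an ensemble member."""
    return list(product(day_offsets, member_nos))


def combine_members(values: Dict[Member, float], method: str) -> Dict[int, float]:
    """Reduce the values of all members to one value per day-offset."""
    per_offset: Dict[int, List[float]] = {}
    for (day_offset, _member_no), value in values.items():
        per_offset.setdefault(day_offset, []).append(value)

    result: Dict[int, float] = {}
    for day_offset, vals in per_offset.items():
        if method == "plain":
            # plain applies when there is one member (or none) per day-offset
            if len(vals) != 1:
                raise ValueError("plain aggregation expects a single member")
            result[day_offset] = vals[0]
        elif method == "median":
            result[day_offset] = median(vals)
        else:
            raise ValueError(f"unknown aggregation method: {method}")
    return result


members = cmeteo_members([0, 1], [1, 2, 3])
sample = {(0, 1): 1.0, (0, 2): 5.0, (0, 3): 3.0,
          (1, 1): 2.0, (1, 2): 2.0, (1, 3): 8.0}
medians = combine_members(sample, "median")
```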

CMETEO then processes the regions of level 0 (= countries) belonging to the ROI:

  • get a region (country)
  • get all subregions of all levels linked to that region.
  • gather input for each subregion:
    • initialise an array of weather indicators for all days, all landcovers and all members of cmeteo (day-offsets and member_no's).
    • for each day:
      • collect the available weather indicators of all cmeteo-members for the grid cells within the subregion.
      • for each threshold:
        • if the area is above the threshold, sum the values per cmeteo-member and landcover. The values are weighted by the landcover specific area of the grid cell.
        • optionally, add a crop specific aggregation, weighted by the area of the crop in the region (note that crop area statistics are only available at administrative/region level and can therefore only be used in aggregations from NUTS level 3 to higher levels)
        • add the results to the parent region (the region that contains this subregion) to facilitate aggregation of the regions at higher levels.
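The loop over grid cells, thresholds and parent regions can be sketched as below. This is an illustration only: the class, the function and the region codes are hypothetical, the threshold is assumed to be compared against the percentage of the grid cell covered by the landcover, and the sketch adds each contribution to all enclosing regions at once rather than passing it up level by level.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Sequence, Tuple

Key = Tuple[str, str, int]  # (region, landcover, threshold)
Cell = Tuple[float, str, float, float]  # (value, landcover, area, share in %)


class Accumulator:
    """Keeps a running weighted sum and total area per key."""

    def __init__(self) -> None:
        self.sums: Dict[Key, List[float]] = defaultdict(lambda: [0.0, 0.0])

    def add(self, key: Key, weighted_sum: float, area: float) -> None:
        entry = self.sums[key]
        entry[0] += weighted_sum
        entry[1] += area

    def mean(self, key: Key) -> Optional[float]:
        weighted_sum, area = self.sums.get(key, (0.0, 0.0))
        return weighted_sum / area if area > 0 else None


def aggregate_subregion(acc: Accumulator, region: str, ancestors: Sequence[str],
                        cells: Sequence[Cell], thresholds: Sequence[int]) -> None:
    for value, landcover, area, share_pct in cells:
        for threshold in thresholds:
            if share_pct <= threshold:
                continue  # landcover share in this cell is not above the threshold
            # add to the subregion itself and to every region that contains it
            for target in (region, *ancestors):
                acc.add((target, landcover, threshold), value * area, area)


acc = Accumulator()
aggregate_subregion(acc, "NL11", ["NL1", "NL"],
                    [(10.0, "arable", 20.0, 40.0), (20.0, "arable", 5.0, 3.0)],
                    thresholds=[0, 5])
aggregate_subregion(acc, "NL12", ["NL1", "NL"],
                    [(30.0, "arable", 25.0, 50.0)],
                    thresholds=[0, 5])
```

Keeping sums and areas instead of means is what makes the higher levels cheap: a parent's mean follows directly from the accumulated totals of its subregions.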

As a result, an array is filled with aggregated weather indicators per (sub)region, landcover (or even crop), day and cmeteo-member. The records of the array are merged into the database according to the following rules:

  • if the record does not contain any result (no aggregation), then delete its former representation from the database if one exists
  • if the record contains a result (an aggregation), then update its former representation in the database if one exists
  • if the record contains a result (an aggregation), then insert it into the database if no former representation exists
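The three merge rules can be expressed compactly. The sketch below uses a plain dictionary as a stand-in for the database table; the function name and the key layout are hypothetical.

```python
from typing import Dict, Hashable, Optional


def merge_record(store: Dict[Hashable, float], key: Hashable,
                 result: Optional[float]) -> str:
    """Apply the delete/update/insert rules for one record; return the action."""
    if result is None:
        # no aggregation: remove the former representation if one exists
        if key in store:
            del store[key]
            return "deleted"
        return "skipped"
    action = "updated" if key in store else "inserted"
    store[key] = result
    return action


# key: (region, landcover, day, cmeteo-member); values are made up
store = {("NL", "arable", "2014-02-20", (0, 1)): 4.2}
actions = [
    merge_record(store, ("NL", "arable", "2014-02-20", (0, 1)), 4.5),
    merge_record(store, ("BE", "arable", "2014-02-20", (0, 1)), 3.9),
    merge_record(store, ("NL", "arable", "2014-02-20", (0, 1)), None),
]
```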

The procedure stops after processing all indicated regions of level 0 belonging to the ROI.

Environment

The CMETEO package is designed to operate in several environments (i.e. database schemas) which may differ in both input (different tables names, definitions, structure, interval of time, etc.) and output datasets (different type of spatial resolutions, agri-environmental zones etc). All these environments need to have the basic data (gridded weather, list of regions, land cover or even crop specific areas per grid cell) and a number of CMETEO database objects available for the CMETEO process.

Software tools needed by CMETEO

The essential functionality of the process is implemented in the packages CMETEO_admin and CMETEO_process. In addition, CMETEO uses functionality implemented in several other software tools, which are listed in the following table.

tool | remarks
REGLISTS | supplies specific lists of regions directed by input parameters.
DATE_GENERATOR | generic tool to generate lists of dates and some additional attributes based on the input parameters.
PROCMAN | package serving configurations for complex programs like CMETEO.
PROCESS LOGGING | set of procedures to send info to the user interface (you may extend these to send to log tables).
UPDATE_EVENTS_ARCHIVE | procedure to signal the PMB (Project Management Board) that the processing ended successfully.

Installation of CMETEO

To install CMETEO, take the following steps:

  • check that the objects needed by the CMETEO package are available in the environment (see Software tools needed by CMETEO)
  • install the objects of the CMETEO package
  • add configurations to PROCMAN for any schema you wish to process CMETEO on

These steps are detailed in the following paragraphs.

CMETEO is installed in the database, so you need to decide in which schema to install it. Currently, CMETEO is installed in the MRSMAN schema, which is intended as the base for applications that process datasets held in other schemas. As CMETEO is such an application (serving different environments), MRSMAN is a good choice. However, CMETEO should not be installed in a schema that is itself to be processed by CMETEO (i.e. a schema storing the gridded weather).

Tools or objects needed by CMETEO do not necessarily have to be installed in the same schema as CMETEO. It is sufficient to grant access to these objects to the owner (schema) in which CMETEO will be installed.

Installation of CMETEO, check the environment

Check that the following tools or objects are valid and available:

Tools, currently installed in the FORALL schema:

  • REGLISTS (requires the tables REGION_MAPPINGS and REGION_MAPPINGS_PER_ROI)
  • DATE_GENERATOR

Tools, currently installed in the MRSMAN schema:

  • PROCMAN
  • PROCESS LOGGING

Plus the objects TD_NUMBER and TD_ERRORS.

Tool, currently installed in the PMB schema:

  • UPDATE_EVENTS_ARCHIVE


Installation of CMETEO, install the package

CMETEO is delivered as a package containing scripts for the application software and accompanying objects. They must be installed in this order:

  • threshold_value_tp.sql
  • all: cmeteo_...._tbl.sql
  • cmeteo_region_mappings_vw.sql
  • cmeteo_process_pks.sql
  • cmeteo_process_pkb.sql
  • cmeteo_admin_pks.sql
  • cmeteo_admin_pkb.sql
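When the delivery arrives as an unordered set of files, the prescribed order can be restored mechanically. The helper below is a sketch: the function names are hypothetical, and the two table script names in the example are invented stand-ins for the elided cmeteo_...._tbl.sql files.

```python
import re
from typing import List

# Patterns in the prescribed installation order.
INSTALL_ORDER = [
    r"threshold_value_tp\.sql",
    r"cmeteo_.+_tbl\.sql",
    r"cmeteo_region_mappings_vw\.sql",
    r"cmeteo_process_pks\.sql",
    r"cmeteo_process_pkb\.sql",
    r"cmeteo_admin_pks\.sql",
    r"cmeteo_admin_pkb\.sql",
]


def install_rank(filename: str) -> int:
    """Position of a script in the installation order."""
    for rank, pattern in enumerate(INSTALL_ORDER):
        if re.fullmatch(pattern, filename):
            return rank
    raise ValueError(f"not part of the CMETEO delivery: {filename}")


def in_install_order(filenames: List[str]) -> List[str]:
    # table scripts share one rank; sort them by name within that rank
    return sorted(filenames, key=lambda name: (install_rank(name), name))


delivery = [
    "cmeteo_admin_pkb.sql", "cmeteo_regions_tbl.sql", "cmeteo_process_pks.sql",
    "threshold_value_tp.sql", "cmeteo_region_mappings_vw.sql",
    "cmeteo_crops_tbl.sql", "cmeteo_admin_pks.sql", "cmeteo_process_pkb.sql",
]
ordered = in_install_order(delivery)
```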

Installation of CMETEO, configure for schema

CMETEO uses configurations in the PROCMAN tool to do the processing of weather indicators for each database schema in a proper way. Configuration in PROCMAN guides CMETEO to:

  • check the availability of datasets (tables, views) needed for the processing
  • create standardized entries for these datasets (tables, views)
  • get settings for some details while processing

Of course, the datasets and the settings must be in accordance with each other. Thus, for each set to be processed you need to define a 'process' which declares the datasets involved and the specification of the details. You may declare several processes to be executed per schema. E.g. in schema CGMS12EU two processes are declared, named cmeteo-cgms12eu-regions and cmeteo-cgms12eu-zones.

For the details you need to specify values for:

  • aggregation method
  • etc.

To find the availability of datasets (e.g. for the process of cmeteo-cgms12eu-regions) you need to supply PROCMAN, table PRCMN_PROCESS_OBJECTS, with specifications (name, type, owner etc.) of:

  • CMETEO_AGGREGATION_AREAS
  • CMETEO_REGION_GRID_LANDCOVERS
  • CMETEO_WEATHER_OBS_GRID
  • CMETEO_WEATHER_OBS_REGION
  • CMETEO_WEATHER_OBS_REGION_IO

To make CMETEO run the process (e.g. cmeteo-cgms12eu-regions), you need to supply PROCMAN, table PRCMN_PROCESS_OBJECTS with info to create (dynamic) synonyms:

  • CMETEO_CROPS
  • CMETEO_GRID_LANDCOVER_AREAS
  • CMETEO_GRID_WEATHER
  • CMETEO_LANDCOVERS
  • CMETEO_PROCESS
  • CMETEO_REGIONS
  • CMETEO_REGION_MAPPINGS
  • CMETEO_REGION_WEATHER
  • CMETEO_REJECTED_WEATHER

You also need to set some parameters (table PRCMN_CONFIGURATIONS). PROCMAN can use default settings for all parameters needed, so you only need to specify values for the parameters that are specific to the process.
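The idea of defaults that are overridden per process can be pictured as a simple merge. This sketch does not reflect how PROCMAN stores or resolves its settings; the dictionaries and the function name are hypothetical, and the values are taken from the parameter list in this section.

```python
from typing import Dict

# Default values as they appear in the parameter list.
DEFAULTS: Dict[str, str] = {
    "destination": "name of (view/table) for weather",
    "event id": "-1",
    "roi": "replace with a non-virtual roi",
}

# Process-specific values for cmeteo-cgms12eu-regions.
CGMS12EU_REGIONS: Dict[str, str] = {
    "destination": "weather_obs_region",
    "event id": "19",
    "roi": "EUR",
}


def effective_settings(defaults: Dict[str, str],
                       overrides: Dict[str, str]) -> Dict[str, str]:
    """Process-specific values win; everything else falls back to the default."""
    settings = dict(defaults)
    settings.update(overrides)
    return settings


settings = effective_settings(DEFAULTS, CGMS12EU_REGIONS)
partial = effective_settings(DEFAULTS, {"roi": "EUR"})
```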

TODO: explain parameters (their use and predefined lists)

The list below contains parameters for the CMETEO procedure, their default value and values for processing weather indicators for Regions in CGMS12EU (as an example).

PARAMETER_GROUP PARAMETER ARGUMENT_TYPE DEFAULT VALUE FORMAT CGMS12EU (cmeteo-cgms12eu-regions)
aggregation method text plain
day in interval text day1
destination text name of (view/table) for weather weather_obs_region
event id number -1 9999 19
group indicator text std
include crops text no
interval dates text day
numeric fractions set of strings NIL FRAC_AREA_TEMP_MAX,FRAC_AREA_TEMP_MIN,FRAC_AREA_PRECIPITATION,AVG_TEMP
roi text replace with a non-virtual roi EUR
thresholds grid set of integer 0,5 0
NIL format fraction number text 9.99
NIL operator text ne
NIL source text zero
NIL thresholds set of number 0
NIL unit type text unit
FRAC_AREA_TEMP_MIN format fraction number text 9.99
FRAC_AREA_TEMP_MIN operator text le
FRAC_AREA_TEMP_MIN source text temp_min
FRAC_AREA_TEMP_MIN thresholds set of number 0,-8,-10,-18,-20
FRAC_AREA_TEMP_MIN unit type text unit
FRAC_AREA_TEMP_MAX format fraction number text 9.99
FRAC_AREA_TEMP_MAX operator text ge
FRAC_AREA_TEMP_MAX source text temp_max
FRAC_AREA_TEMP_MAX thresholds set of number 25,30,35,40
FRAC_AREA_TEMP_MAX unit type text unit
FRAC_AREA_TEMP_PRECIPITATION format fraction number text 9.99
FRAC_AREA_TEMP_PRECIPITATION operator text ge
FRAC_AREA_TEMP_PRECIPITATION source text rainfall
FRAC_AREA_TEMP_PRECIPITATION thresholds set of number 1,3,5,10,15,30
FRAC_AREA_TEMP_PRECIPITATION unit type text unit
AVG_TEMP format fraction number text 99.9
AVG_TEMP operator text ge
AVG_TEMP source text temp_avg
AVG_TEMP thresholds set of number 0,2,4,6,8,10
AVG_TEMP unit type text remains
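The parameters of the numeric fraction groups are not explained here yet. One plausible reading, offered purely as an assumption, is that each group yields the fraction of the area for which the source indicator satisfies the operator against each threshold (e.g. the fraction of area with temp_min le -8). The sketch below implements that reading; the function name is hypothetical and the rounding to two decimals merely mirrors the 9.99 format.

```python
import operator
from typing import Callable, Dict, Iterable, Optional, Sequence, Tuple

OPERATORS: Dict[str, Callable[[float, float], bool]] = {
    "le": operator.le,
    "ge": operator.ge,
    "ne": operator.ne,
}


def area_fractions(cells: Iterable[Tuple[float, float]], op_name: str,
                   thresholds: Sequence[float]) -> Dict[float, Optional[float]]:
    """Fraction of total area whose value satisfies `value <op> threshold`."""
    cell_list = list(cells)  # (value, area) pairs
    compare = OPERATORS[op_name]
    total_area = sum(area for _value, area in cell_list)
    fractions: Dict[float, Optional[float]] = {}
    for threshold in thresholds:
        if total_area == 0:
            fractions[threshold] = None
            continue
        hit_area = sum(area for value, area in cell_list
                       if compare(value, threshold))
        fractions[threshold] = round(hit_area / total_area, 2)
    return fractions


# Made-up minimum temperatures and areas for the grid cells of one region.
temp_min_cells = [(-9.0, 25.0), (-1.0, 50.0), (3.0, 25.0)]
frac_temp_min = area_fractions(temp_min_cells, "le", [0, -8, -10])
```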

Administration

Before the process is executed, some checks and tasks, mainly administrative, are carried out.

  • The module checks that no other instance of CMETEO is active in its schema and registers itself as active. Only one process per schema can be active at a time.
  • After registering, it checks that all objects needed for the selected theme (including the ROI) are available.
  • It also redefines some synonyms (dynamic synonyms like CMETEO_REGION_WEATHER and CMETEO_GRID_LANDCOVER_AREAS) according to the demands of the process.
  • Before processing, it checks the availability of weather indicators for the selected interval.
  • After processing, it collects counts of processed data and some other items and informs the user via the tool PROCESS LOGGING.
  • It signals the Project Management Board that the process has ended successfully.
  • Finally, it releases the lock on active processes.

Interface

The CMETEO package is designed to run from a batch file or to be started from a command line. Both methods start with the same procedure, CMETEO_ADMIN.DO_AGGREGATION. This procedure defines the interface, which consists of obligatory and optional parameters.

To get CMETEO working properly, you supply:

  • the name of the database schema that holds the data to be processed. Although the schema name is not defined as a parameter, it is part of the interface: you supply it as the user name when you log in to the database to start CMETEO.
  • p_theme: theme, an indication of the group of meteo indicators you wish to process. See the list below for possible themes.
  • p_spatial: spatial indicator, e.g. 'Regions', 'agri-environmental zones'. See the list below for possible alternatives.

N.B. you must supply the parameters for theme and spatial indicators exactly as they are defined.

Optionally, you can supply:

  • qry_region: a set of identifiers of regions to be processed.
  • p_yr2start,p_yr2end,p_dy2start,p_dy2end: a day or an interval of days to be processed.
  • p_refresh: an indicator whether or not to refresh the previously aggregated weather indicators.

Details of the interface

Themes and spatial indicators as used by CMETEO to perform the process in schema:
schema | process | theme | spatial indicator
cgfs_eur_09 | cmeteo-cgfs-eps-nuts | weather(simulated) eps | NUTS
cgfs_eur_09 | cmeteo-cgfs-eps-zones | weather(simulated) eps | agri-environmental zones
cgfs14eu | cmeteo-cgfs14eu-his-regions | weather(simulated) his | Regions
cgfs14eu | cmeteo-cgfs14eu-his-zones | weather(simulated) his | agri-environmental zones
cgfs_eur_09 | cmeteo-cgfs-ope-nuts | weather(simulated) ope | NUTS
cgfs_eur_09 | cmeteo-cgfs-ope-zones | weather(simulated) ope | agri-environmental zones
cgms12eu | cmeteo-cgms12eu-regions | weather(observed) | Regions
cgms12eu | cmeteo-cgms12eu-zones | weather(observed) | agri-environmental zones
cgms_asia_08 | cmeteo-cgms-asia | Weather - CGMS - Analysis | Regions
gwsi_e_12 | cmeteo-gwsi_e | Weather - GWSI - Analysis | Regions
gwsi_t_12 | cmeteo-gwsi_t | Weather - GWSI - Analysis | Regions


  • The theme must be supplied in p_theme. See the table above for possible alternatives. It is an extraction of the process table in MRSMAN.
  • The spatial indicator must be supplied in p_spc_res. See the table above for possible alternatives. It is an extraction of the process table in MRSMAN.
  • (Optional) An interval or day to process in the set of: p_yr2start,p_dy2start,p_yr2end,p_dy2end. CMETEO uses several possibilities to define an interval. A coarse way is to mention a year to start and a year to end. In this case all days from the first of January of the starting year to the last day of December of the ending year will be processed. A more fine grained way is to provide a date to start and a date to end in which case only the days in between will be processed, or even one day if both entered dates are the same. You may supply any combination of years and dates. If you don't supply any, the default interval will be chosen, which starts at the first of January of the current year and runs up to the day before today. In fact, no day in the future will be processed.
  • (Optional) an indicator to enable replacement of previously aggregated weather indicators in p_refresh, being one of 'yes', 'no'. The default is 'yes'. This one will be used when no value for this parameter has been supplied.
  • (Optional) A set of identifiers of regions to be processed, in the parameter qry_region. You may supply a query (SQL statement) that delivers the preferred subset of regions for which you need cmeteo. The regions must be identified by their reg_map_id as found in table REGION_MAPPINGS.

If supplied, the selected set must contain some or all reg_map_ids of level 0 (!) for the relevant ROI, e.g.:

    select rm.reg_map_id
    from region_mappings rm
    where rm.reg_level = 0
    and rm.reg_code in ('LU', 'BE', 'NL');

Alternatively, you may supply a comma-separated list of region mapping ids. If you leave this parameter empty, all regions of the ROI will be processed.
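The rules for the interval parameters (p_yr2start, p_dy2start, p_yr2end, p_dy2end) described in the list above can be sketched as a small resolver. Several points are assumptions rather than documented behaviour: the function name, passing years as integers and days as dates, letting a supplied date take precedence over a supplied year, and capping the end of the interval at today.

```python
from datetime import date, timedelta
from typing import Optional, Tuple


def resolve_interval(today: date,
                     yr2start: Optional[int] = None,
                     dy2start: Optional[date] = None,
                     yr2end: Optional[int] = None,
                     dy2end: Optional[date] = None) -> Tuple[date, date]:
    """Return (first day, last day) to process; first > last means nothing to do."""
    if dy2start is not None:
        start = dy2start                   # fine grained: explicit start date
    elif yr2start is not None:
        start = date(yr2start, 1, 1)       # coarse: first of January
    else:
        start = date(today.year, 1, 1)     # default: start of the current year

    if dy2end is not None:
        end = dy2end
    elif yr2end is not None:
        end = date(yr2end, 12, 31)         # coarse: last day of December
    else:
        end = today - timedelta(days=1)    # default: the day before today

    # assumption: never process a day in the future
    return start, min(end, today)


run_day = date(2014, 2, 21)
default_interval = resolve_interval(run_day)
two_years = resolve_interval(run_day, yr2start=2012, yr2end=2013)
```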

Several datasources work with the generic interface of CMETEO. The most important ones are listed in the table below.

Datasets (in CGMS12EU) used by cmeteo:
dataset containing | base set | adapted set for cmeteo | name in cmeteo | remarks
weather indicators per gridcell for meteodata | WEATHER_OBS_GRID | CMETEO_WEATHER_OBS_GRID (view) | CMETEO_GRID_WEATHER | input for cmeteo; intermediate set contains standardized attribute names
weather indicators per administrative (NUTS) region | WEATHER_OBS_REGION | CMETEO_WEATHER_OBS_REGION (view) | CMETEO_REGION_WEATHER | output of cmeteo; intermediate set contains standardized attribute names
weather indicators per agri-environmental region | WEATHER_OBS_AEZ | CMETEO_WEATHER_OBS_AEZ (view) | CMETEO_REGION_WEATHER | output of cmeteo; intermediate set contains standardized attribute names
landcovers or crops to be evaluated | CMETEO_LANDCOVER_EU (in schema FORALL) | CMETEO_LANDCOVERS (synonym) | CMETEO_LANDCOVERS | input for cmeteo; intermediate set contains standardized attribute names
areas/surfaces for landcovers per grid and per admin. region | LINK_REGION_GRID_LANDCOVER | CMETEO_REGION_GRID_LANDCOVERS (view) | CMETEO_GRID_LANDCOVER_AREAS | input for cmeteo; intermediate set contains standardized attribute names
areas/surfaces for landcovers per grid and per agri-env. region | LINK_AEZ_GRID_LANDCOVER | CMETEO_AEZ_GRID_LANDCOVERS (view) | CMETEO_GRID_LANDCOVER_AREAS | input for cmeteo; intermediate set contains standardized attribute names
areas/surfaces for crops per administrative region | AGGREGATION_AREAS | CMETEO_AGGREGATION_AREAS (view) | CMETEO_AGGREGATION_AREAS | input for cmeteo
weather indicators per region which are rejected | | | CMETEO_REJECTED_WEATHER | output of cmeteo, containing weather indicators per region and date that could not be stored in regular output dataset via cmeteo_region_weather
regions and their mappings | CMETEO_REGION_MAPPINGS (in schema MRSMAN) | CMETEO_REGION_MAPPINGS (synonym) | CMETEO_REGION_MAPPINGS | input for cmeteo
regions to process | CMETEO_REGIONS (in schema MRSMAN) | CMETEO_REGIONS (synonym) | CMETEO_REGIONS | input/output for cmeteo


Various schemas hold datasets that contain weather indicators, e.g. for meteo grid cells or for regions of some type. These datasets vary both in the name of the set and in the names of the attributes. The collection of attributes may also vary (with or without day_offsets, with or without members of the meteo ensemble). Nevertheless, all the sets contain the indicators that are essential for the processing of CMETEO.

To avoid adapting the CMETEO application every time another dataset in a schema must be processed, or a new schema is included in the processing scheme of CMETEO, the structure of the datasets is standardised, both for input and for output. The standardized dataset for input is called CMETEO_GRID_WEATHER. For each variant of a dataset containing grid weather, a view is created that presents all attributes needed for the CMETEO process, either as data selected from the dataset or as a dummy value. CMETEO only reads this data; there is no need to update it.
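What such a standardising view does can be mimicked in a few lines: rename the attributes of a source record to the standard names and fill in dummy values for attributes the source lacks. The sketch is illustrative only; the attribute names, the dummy values and the function name are hypothetical and not taken from the actual view definitions.

```python
from typing import Any, Dict, Mapping

# Hypothetical dummy values for attributes that a source dataset may lack.
STANDARD_DEFAULTS: Dict[str, Any] = {"day_offset": 0, "member_no": 1}


def standardise(record: Mapping[str, Any],
                column_map: Mapping[str, str]) -> Dict[str, Any]:
    """Build a standard record; column_map maps standard name -> source name."""
    standard = dict(STANDARD_DEFAULTS)
    for standard_name, source_name in column_map.items():
        standard[standard_name] = record[source_name]
    return standard


# A source without forecast attributes: day_offset and member_no become dummies.
observed_row = {"grid_no": 1001, "day": "2014-02-20", "temperature_max": 8.4}
observed_map = {"grid_no": "grid_no", "day": "day", "temp_max": "temperature_max"}
standard_observed = standardise(observed_row, observed_map)

# A forecast source that does carry a day-offset and an ensemble member.
forecast_row = {"grid": 1001, "fc_day": "2014-02-20", "tmax": 9.1,
                "offset": 3, "member": 17}
forecast_map = {"grid_no": "grid", "day": "fc_day", "temp_max": "tmax",
                "day_offset": "offset", "member_no": "member"}
standard_forecast = standardise(forecast_row, forecast_map)
```

Both results have the same set of attribute names, which is the property that lets one application read every variant without modification.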

For the output of CMETEO, a view is also created for each variant of output. In some schemas several such views can exist. In most cases the name of a view hints at the name of the dataset on which it is based. CMETEO uses the synonym CMETEO_REGION_WEATHER, which denotes the proper view, to access the data and to write the results. At the start of the processing, CMETEO is able to create the synonym so that it points to the right view.