CMETEO

From Agri4castWiki
Revision as of 18:25, 24 February 2014 by Henk


Introduction

CMETEO stands for CORINE-meteo. The software tool CMETEO is an application suite that aggregates meteo indicators as found in e.g. the CGMS12EU.WEATHER_OBS_GRID data set or similar data sets at grid resolution (meteo grid). Examples of meteo indicators are temperature (maximum, minimum, mean or daily), wind speed, precipitation, radiation, potential transpiration and evaporation. In the context of MARS, the weather indicators are given on a regular grid. These indicators are aggregated to spatial units such as administrative regions (NUTS levels 3 to 0) or agri-environmental regions according to:

  • a general area weighted algorithm
  • a landcover (or even crop) specific area weighted algorithm.
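
The difference between the two algorithms can be illustrated with a minimal sketch. It is not taken from the CMETEO package: the class GridCell, its field names and the numbers are assumptions made for illustration only. Both variants compute a weighted mean; they differ only in the area used as weight.

```python
from dataclasses import dataclass


@dataclass
class GridCell:
    value: float           # weather indicator of the cell, e.g. mean temperature
    area_in_region: float  # area of the cell that falls inside the region
    landcover_area: float  # area of one land cover inside that part of the cell


def weighted_mean(cells, weight):
    """Return sum(w * value) / sum(w), or None when all weights are zero."""
    total_weight = sum(weight(c) for c in cells)
    if total_weight == 0:
        return None
    return sum(weight(c) * c.value for c in cells) / total_weight


# Made-up example: two grid cells overlapping one region.
cells = [
    GridCell(value=10.0, area_in_region=100.0, landcover_area=10.0),
    GridCell(value=20.0, area_in_region=300.0, landcover_area=90.0),
]

general = weighted_mean(cells, lambda c: c.area_in_region)    # general area weighted
landcover = weighted_mean(cells, lambda c: c.landcover_area)  # landcover specific
print(general, landcover)
```

In this example the landcover specific value lies closer to the second cell, because that cell holds most of the land cover.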

The process

In short, the process runs as follows.

When activated, CMETEO first uses the supplied interface parameters to initialise the other parameters that steer the process.

  • parameters according to the user name (ROI), theme and spatial type (regions, agri-environmental zones):
    • type of aggregation {relevant in case of weather forecast members: plain (in case of one or no members); median (in case of 50/51 members)}
    • type of interval {day, dekad}
    • indication to include a crop specific aggregation {yes,no}
    • thresholds whether to include a grid cell based on its landcover or crop specific area {a comma separated list of integers, one for each threshold needed; integers are in the range 0 up to 99}
    • get a list of all landcovers that are relevant for the ROI
    • get a list of all cmeteo-members, being combinations of day-offsets (days in forecast depth) and members of meteo-ensembles. Where no weather forecast data apply, the list of cmeteo-members holds only one member and no day-offset, e.g. for data sets like CMETEO_GRID_WEATHER (see the description of the interface for details).
  • get a list of all days to process
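
As an illustration of the type of aggregation, the sketch below reduces the ensemble members of each grid cell to one value before the spatial aggregation takes place. It is a sketch under assumptions, not the CMETEO implementation: the function names, the data layout (pairs of member values and area) and the numbers are hypothetical.

```python
from statistics import median


def cell_value(member_values, method):
    """Reduce the member values of one grid cell and day to a single value."""
    if method == "median":
        return median(member_values)
    if method == "plain":
        # Assumption: plain is only used when there is one value per cell and day.
        if len(member_values) != 1:
            raise ValueError("plain expects exactly one value per grid cell and day")
        return member_values[0]
    raise ValueError(f"unknown aggregation method: {method}")


def aggregate(cells, method):
    """Area-weighted mean of the per-cell values; cells holds (member_values, area) pairs."""
    total_area = sum(area for _, area in cells)
    return sum(cell_value(values, method) * area for values, area in cells) / total_area


# Made-up example: two grid cells with three ensemble members each.
ensemble = [([1.0, 3.0, 8.0], 1.0), ([4.0, 6.0, 5.0], 3.0)]
result = aggregate(ensemble, "median")
print(result)
```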

Continue to process regions of level 0 (=countries) belonging to the ROI:

  • get a region (country)
  • get all subregions of all levels linked to that region.
  • gather input for each subregion:
    • initialise an array of weather indicators for all days, all landcovers and all members of cmeteo (day-offsets and member_no's).
    • for each day:
      • collect available weather indicators of all cmeteo-members for the grid cells within the subregion.
      • for each threshold:
        • if the size of the area is above the threshold, summarise the values per cmeteo-member and landcover. The values are weighted by the landcover specific area of the grid cell.
        • optionally, add a crop specific aggregation, weighted for the area of the crop in the region (note that crop area statistics are only available at administrative/region level and thus can only be used in aggregations from NUTS level 3 to higher levels)
        • add the results to the parent region (that contains this subregion) to facilitate aggregation of the regions at higher levels.
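
The threshold test and the roll-up to parent regions described above can be sketched as follows. This is a simplified illustration under assumptions: the region codes and parent links are hypothetical, "above the threshold" is read as strictly greater, and members, days and land covers are left out. Keeping the weighted sum and the sum of weights separately allows a parent to be aggregated from the sums of its subregions.

```python
from collections import defaultdict

# (region, threshold) -> [sum of weight * value, sum of weight]
sums = defaultdict(lambda: [0.0, 0.0])

# Hypothetical parent links; the level 0 region has no parent.
PARENT = {"NL11": "NL1", "NL12": "NL1", "NL1": "NL", "NL": None}


def add_cell(region, threshold, value, landcover_area, cell_area):
    """Add one grid cell to a region and its ancestors if it passes the threshold."""
    percentage = 100.0 * landcover_area / cell_area
    if percentage <= threshold:  # assumption: "above" means strictly greater
        return
    while region is not None:
        accumulator = sums[(region, threshold)]
        accumulator[0] += landcover_area * value
        accumulator[1] += landcover_area
        region = PARENT[region]


def aggregate(region, threshold):
    """Landcover-area weighted mean, or None when no cell passed the threshold."""
    weighted, weight = sums[(region, threshold)]
    return weighted / weight if weight else None


# Made-up grid cells: (subregion, value, landcover area, cell area).
grid_cells = [("NL11", 10.0, 2.0, 100.0), ("NL11", 20.0, 50.0, 100.0), ("NL12", 30.0, 50.0, 100.0)]
for threshold in (0, 5):
    for subregion, value, landcover_area, cell_area in grid_cells:
        add_cell(subregion, threshold, value, landcover_area, cell_area)

print(aggregate("NL11", 5), aggregate("NL", 5), aggregate("NL", 0))
```

The first cell holds only 2% of the land cover, so it contributes at threshold 0 but is skipped at threshold 5.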

As a result, an array is filled with aggregated weather indicators per (sub)region, landcover (or even crop), day and cmeteo-member. The records of the array are merged into the database according to the following rules:

  • if the record does not contain any result (no aggregation), then delete its former representation from the database if one exists
  • if the record contains a result (an aggregation), then update its former representation in the database if one exists
  • if the record contains a result (an aggregation), then insert it into the database if no former representation exists
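
The three merge rules can be sketched with a dictionary standing in for the database table. This is an illustration only: the function name, the keys and the values are made up, and a value of None is assumed to mark a record without a result.

```python
def merge(database, records):
    """Apply aggregated records to database; both map a key to a value.

    A value of None means that no aggregation could be made for that key.
    Returns the action taken per key.
    """
    actions = {}
    for key, value in records.items():
        if value is None:
            if key in database:
                del database[key]      # rule 1: delete the former representation
                actions[key] = "delete"
        elif key in database:
            database[key] = value      # rule 2: update the former representation
            actions[key] = "update"
        else:
            database[key] = value      # rule 3: insert a new record
            actions[key] = "insert"
    return actions


# Made-up example keyed by (region, day).
database = {("NL", "2014-02-23"): 4.1, ("BE", "2014-02-23"): 3.9}
records = {("NL", "2014-02-23"): 4.3, ("BE", "2014-02-23"): None, ("LU", "2014-02-23"): 2.8}
actions = merge(database, records)
print(actions)
```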

The procedure stops after processing all indicated regions of level 0 belonging to the ROI.

Environment

The CMETEO package is designed to operate in several environments (i.e. database schemas), which may differ in both input (different table names, definitions, structure, time interval, etc.) and output datasets (different types of spatial resolution, agri-environmental zones, etc.). All these environments need to have the basic data (gridded weather, list of regions, land cover or even crop specific areas per grid cell) and a number of CMETEO database objects available for the CMETEO process.

Software tools needed by CMETEO

The essential functionality of the process is implemented in the packages CMETEO_admin and CMETEO_process. In addition, CMETEO uses functionality that is implemented in several other software tools, which are listed in the following table.

| tool | remarks |
| REGLISTS | supplies specific lists of regions directed by input parameters. |
| DATE_GENERATOR | generic tool to generate lists of dates and some additional attributes based on the input parameters. |
| PROCMAN | package serving configurations for complex programs like CMETEO. |
| PROCESS LOGGING | set of procedures to send info to the user interface (you may extend these to send to log tables). |
| UPDATE_EVENTS_ARCHIVE | procedure to signal the PMB (Project Management Board) that the processing ended successfully. |

Installation of CMETEO

To install CMETEO, carry out the following steps:

  • check the environment: verify that the objects needed by the CMETEO package are available (see Software tools needed by CMETEO)
  • install the objects of the CMETEO package
  • add configurations to PROCMAN for any schema you wish to process CMETEO on

The following paragraphs describe these steps in detail.

CMETEO is installed in the database, so you need to decide in which schema to install it. Currently, CMETEO is installed in the MRSMAN schema, which is intended to be the base of applications that process datasets in other schemas. As CMETEO is such an application (serving different environments), the MRSMAN schema is a good choice. However, CMETEO should not be installed in a schema that stores the gridded weather to be processed by CMETEO.

Tools or objects needed by CMETEO do not necessarily have to be installed in the same schema as CMETEO. It is sufficient to grant access to these objects to the owner/schema in which CMETEO will be installed.

Installation of CMETEO, check the environment

Check that the following tools or objects are valid and available:

Tools, currently installed in the FORALL schema:

  • REGLISTS (requires the tables REGION_MAPPINGS and REGION_MAPPINGS_PER_ROI)
  • DATE_GENERATOR

Tools, currently installed in the MRSMAN schema:

  • PROCMAN
  • PROCESS LOGGING

Plus the objects TD_NUMBER and TD_ERRORS.

Tool, currently installed in the PMB schema:

  • UPDATE_EVENTS_ARCHIVE

Installation of CMETEO, install the package

CMETEO is delivered as a package containing scripts for the application software and accompanying objects. They must be installed in this order:

  • threshold_value_tp.sql
  • all: cmeteo_...._tbl.sql
  • cmeteo_region_mappings_vw.sql
  • cmeteo_process_pks.sql
  • cmeteo_process_pkb.sql
  • cmeteo_admin_pks.sql
  • cmeteo_admin_pkb.sql

In each target db schema (with gridded weather) the user has to install a synonym CMETEO_ADMIN and some synonyms for the installed types by running the following scripts:

  • cmeteo_admin_syn.sql
  • td_threshold_value_syn.sql
  • td_threshold_values_syn.sql

Also, some views have to be installed in each target schema. It cannot be said beforehand how these views should be created, as this depends on the tables available in the schema. For schema CGMS12EU, the scripts to create them are supplied as part of the installation.

Installation of CMETEO, configure for schema

CMETEO uses configurations in the PROCMAN tool to do the processing of weather indicators for each database schema in a proper way. Configuration in PROCMAN guides CMETEO to:

  • check the availability of datasets (tables, views) needed for the processing
  • create standardized entries for these datasets (tables, views)
  • get settings for some details while processing

Of course, the datasets and the settings must be in accordance with each other. Thus, for each set to be processed a 'process' must be defined which declares the datasets involved and the specification of the details. Several processes can be declared to be executed per schema. E.g. in schema CGMS12EU two processes are declared, named cmeteo-cgms12eu-regions and cmeteo-cgms12eu-zones.

For the details some values need to be defined e.g.:

  • aggregation method
  • event ID for Project Management Board
  • list of threshold-indicators to be produced in the spatial aggregation
  • etc.

To check the availability of datasets (e.g. for the process cmeteo-cgms12eu-regions), the required datasets need to be defined in PROCMAN, table PRCMN_PROCESS_OBJECTS, with their specifications (name, type, owner, etc.). For example, for CGMS12EU and aggregation to administrative regions these are:

  • CMETEO_AGGREGATION_AREAS
  • CMETEO_REGION_GRID_LANDCOVERS
  • CMETEO_WEATHER_OBS_GRID
  • CMETEO_WEATHER_OBS_REGION (also check for trigger CMETEO_WEATHER_OBS_REGION_IO)

To make CMETEO run the process (e.g. cmeteo-cgms12eu-regions), additional object names need to be defined in PROCMAN, table PRCMN_PROCESS_OBJECTS to create (dynamic) synonyms:

  • CMETEO_CROPS
  • CMETEO_GRID_LANDCOVER_AREAS
  • CMETEO_GRID_WEATHER
  • CMETEO_LANDCOVERS
  • CMETEO_PROCESS
  • CMETEO_REGIONS
  • CMETEO_REGION_MAPPINGS
  • CMETEO_REGION_WEATHER
  • CMETEO_REJECTED_WEATHER

Finally, some parameters need to be set (table PRCMN_CONFIGURATIONS). The PROCMAN tool is able to use default settings for all parameters needed, so you only need to specify values for the parameters that are specific to the process.

CMETEO uses a limited set of parameters. The table below lists these parameters with their default value and a description with some hints for their use.

| PARAMETER_GROUP | PARAMETER | ARGUMENT_TYPE | DEFAULT VALUE | DESCRIPTION |
|  | roi | text | replace with a non-virtual roi | The code of a Region of Interest (ROI) for which the aggregation has to take place. Choose one of the ROIs in the table REGION_MAPPINGS_PER_ROI in the FORALL schema. |
|  | interval dates | text | day | An indicator for the interval between successive dates of gridded weather. Meaningful values are {day, dekade}. |
|  | day in interval | text | day1 | Indicates the position, within the interval (see: interval dates), of the date as found in the input (gridded weather). Meaningful values are {day1, last day}. If the interval is day, only day1 is to be used. |
|  | aggregation method | text | plain | Indicates the method to derive the aggregation. Useful values are {plain, median}. Plain just gathers the values for each grid and day and aggregates them. Median first takes the median values of a set of members, then aggregates them. |
|  | include crops | text | no | An indicator to add aggregations weighted by areas of crops (from aggregation areas), not being land covers. You can choose: {yes, no}. |
|  | thresholds grid | set of integer | 0,5 | An integer, or a set of integers separated by commas, to define a threshold to select the grids. A grid is taken into account if its fraction occupied by the land cover is above the threshold. The threshold is expressed as a percentage (100 times the fraction). You can use any value(s) between 0 and 99. If you don't wish to apply a threshold, just set 0. |
|  | destination | text | name of table for weather | The name of the dataset (table) in which the results will finally be stored. You may prefix it with the schema. |
|  | event id | number | -1 | The identifier (number) of the entry in the Process Management Board (PMB), table EVENTS, that identifies a type of action. CMETEO uses this id to signal the PMB that the process has completed successfully. See the PMB manual. |
|  | group indicator | text | std | (Deprecated) A code to guide the selection of regions to process. It is not used any more. |
|  | numeric fractions | set of strings | NIL | A name, or a set of names, of indicators that need to be derived during spatial aggregation (events like rainy days, or temperature above a threshold), by comparing primary values with a threshold. You can supply any set of names. However, you need to supply details for each name in the set. See the examples of NIL in the parameter_group. |
| NIL | source | text | zero | Indication of the source of a (scalar) value to compare with the threshold(s). In case of NIL (not relevant) it should be the string zero. In other cases (it is thematically relevant) an attribute of a table is used. Currently, you can use {rainfall, temp_max, temp_min, temp_avg}. |
| NIL | operator | text | ne | Code for the comparison operator. In case of NIL (not relevant) the string ne is used. In other cases (it is thematically relevant) you can use one of {gt, ge, eq, ne, le, lt}. |
| NIL | thresholds | set of number | 0 | A number, or a set of numbers, used as thresholds to compare to the value as found in the source. In case of NIL (not relevant) the value 0 is used. In other cases (it is thematically relevant) it may be number(s) with a fractional part, e.g. 3.25. Numbers must be separated by commas. |
| NIL | unit type | text | unit | Code to define the threshold-indicators (how to process the result of the comparison). In case of NIL (not relevant) the string unit is used. In other cases (it is thematically relevant) you may just count them, so adding 1 to the aggregation (e.g. rainy days), or use the remains above (or below) the threshold to summarise (e.g. temperature above a threshold). Meaningful codes are {unit, remains}. Currently, unit results in a value of 0 or 1; remains delivers the remaining part of the source below or above the threshold. |
| NIL | format fraction number | text | 9.99 | The format mask for the value of the fraction to use in the resulting dataset. In case of NIL (not relevant) the string 9.99 is used. In other cases (it is thematically relevant) this mask should be equal to the comparative column in the table where the results will be written. |
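
As an illustration of the value accepted by the parameter thresholds grid, the sketch below parses a comma separated list and checks the documented range of 0 to 99. The function name is hypothetical and the error handling is an assumption; it does not describe how PROCMAN or CMETEO validate the setting.

```python
def parse_thresholds(text):
    """Parse a value such as "0,5" into a list of integer percentages."""
    values = [int(part) for part in text.split(",")]
    for value in values:
        if not 0 <= value <= 99:
            raise ValueError(f"threshold {value} is outside the range 0 to 99")
    return values


default_thresholds = parse_thresholds("0,5")
print(default_thresholds)
```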


The table below also lists the parameters and supplies values for processing weather indicators for Regions in CGMS12EU (as an example). As you see the default values remain in place if applicable.

| PARAMETER_GROUP | PARAMETER | ARGUMENT_TYPE | DEFAULT VALUE | CGMS12EU (for process: cmeteo-cgms12eu-regions) |
|  | aggregation method | text | plain |  |
|  | day in interval | text | day1 |  |
|  | destination | text | name of table for weather | weather_obs_region |
|  | event id | number | -1 | 19 |
|  | group indicator | text | std |  |
|  | include crops | text | no |  |
|  | interval dates | text | day |  |
|  | numeric fractions | set of strings | NIL | FRAC_AREA_TEMP_MAX, FRAC_AREA_TEMP_MIN, FRAC_AREA_PRECIPITATION, AVG_TEMP |
|  | roi | text | replace with a non-virtual roi | EUR |
|  | thresholds grid | set of integer | 0,5 | 0 |
| NIL | format fraction number | text | 9.99 |  |
| NIL | operator | text | ne |  |
| NIL | source | text | zero |  |
| NIL | thresholds | set of number | 0 |  |
| NIL | unit type | text | unit |  |
| FRAC_AREA_TEMP_MIN | format fraction number | text |  | 9.99 |
| FRAC_AREA_TEMP_MIN | operator | text |  | le |
| FRAC_AREA_TEMP_MIN | source | text |  | temp_min |
| FRAC_AREA_TEMP_MIN | thresholds | set of number |  | 0,-8,-10,-18,-20 |
| FRAC_AREA_TEMP_MIN | unit type | text |  | unit |
| FRAC_AREA_TEMP_MAX | format fraction number | text |  | 9.99 |
| FRAC_AREA_TEMP_MAX | operator | text |  | ge |
| FRAC_AREA_TEMP_MAX | source | text |  | temp_max |
| FRAC_AREA_TEMP_MAX | thresholds | set of number |  | 25,30,35,40 |
| FRAC_AREA_TEMP_MAX | unit type | text |  | unit |
| FRAC_AREA_TEMP_PRECIPITATION | format fraction number | text |  | 9.99 |
| FRAC_AREA_TEMP_PRECIPITATION | operator | text |  | ge |
| FRAC_AREA_TEMP_PRECIPITATION | source | text |  | rainfall |
| FRAC_AREA_TEMP_PRECIPITATION | thresholds | set of number |  | 1,3,5,10,15,30 |
| FRAC_AREA_TEMP_PRECIPITATION | unit type | text |  | unit |
| AVG_TEMP | format fraction number | text |  | 99.9 |
| AVG_TEMP | operator | text |  | ge |
| AVG_TEMP | source | text |  | temp_avg |
| AVG_TEMP | thresholds | set of number |  | 0,2,4,6,8,10 |
| AVG_TEMP | unit type | text |  | remains |
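
To show how the operator, thresholds and unit type settings could work together, the sketch below evaluates two of the indicator groups listed above for a single grid cell value. It is an interpretation, not the CMETEO code: the function name is hypothetical, the cell values are made up, and "remains" is assumed to be the absolute difference between the value and the threshold when the comparison holds. The spatial weighting of these per-cell results is left out.

```python
import operator

# The operator codes of the configuration map onto Python's operator module.
OPERATORS = {name: getattr(operator, name) for name in ("gt", "ge", "eq", "ne", "le", "lt")}


def threshold_indicator(value, op, threshold, unit_type):
    """Return the contribution of one value to a threshold-indicator."""
    hit = OPERATORS[op](value, threshold)
    if unit_type == "unit":
        return 1 if hit else 0
    if unit_type == "remains":
        # Assumption: the part of the value above (or below) the threshold.
        return abs(value - threshold) if hit else 0.0
    raise ValueError(f"unknown unit type: {unit_type}")


# FRAC_AREA_TEMP_MIN: operator le, unit type unit, with a made-up temp_min of -9.0.
temp_min_hits = [threshold_indicator(-9.0, "le", t, "unit") for t in (0, -8, -10, -18, -20)]

# AVG_TEMP: operator ge, unit type remains, with a made-up temp_avg of 5.5.
avg_temp_remains = [threshold_indicator(5.5, "ge", t, "remains") for t in (0, 2, 4, 6, 8, 10)]

print(temp_min_hits, avg_temp_remains)
```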

Administration

Before the process is executed, some checks and tasks, mainly administrative, are carried out.

  • The module checks that no other instance of CMETEO is active in its schema and registers itself as active. Only one process per schema can be active at a time.
  • After registering, it checks if all objects are available that are needed for the selected theme (including the ROI).
  • It also redefines some synonyms (dynamic synonyms like CMETEO_REGION_WEATHER and CMETEO_GRID_LANDCOVER_AREAS) according to the demands of the process.
  • Before processing, it checks the availability of weather indicators for the selected interval.
  • After processing, it collects counts of processed data and some other items and informs the user via the tool PROCESS LOGGING.
  • It signals the Project Management Board that the process has successfully ended.
  • Finally it releases the lock on active processes.

Interface

The CMETEO package is designed to run from a batch file or to be started from the command line. Both methods start with the same procedure, CMETEO_ADMIN.DO_AGGREGATION. This procedure defines the interface, which consists of both obligatory and optional parameters.

To get CMETEO working properly, you supply:

  • the name of the database schema that holds the data to be processed (although the schema name is not defined as a parameter, this name is part of the interface). You supply this name as the user name when you login to the database to start CMETEO.
  • p_theme: theme, an indication of the group of meteo indicators you wish to process. See the list below for possible themes.
  • p_spatial: spatial indicator, e.g. 'Regions', 'agri-environmental zones'. See the list below for possible alternatives.

N.B. you must supply the parameters for theme and spatial indicators exactly as they are defined.

Optionally, you can supply:

  • qry_region: a set of identifiers of regions to be processed.
  • p_yr2start,p_yr2end,p_dy2start,p_dy2end: a day or an interval of days to be processed.
  • p_refresh: an indicator whether or not to refresh the previously aggregated weather indicators.

Details of the interface

Themes and spatial indicators as used by CMETEO to perform the process in schema:
| schema | process | theme | spatial level |
| cgfs14eu | cmeteo-cgfs14eu-eps-regions | weather(simulated) eps | Regions |
| cgfs14eu | cmeteo-cgfs14eu-eps-zones | weather(simulated) eps | agri-environmental zones |
| cgfs14eu | cmeteo-cgfs14eu-his-regions | weather(simulated) his | Regions |
| cgfs14eu | cmeteo-cgfs14eu-his-zones | weather(simulated) his | agri-environmental zones |
| cgfs14eu | cmeteo-cgfs14eu-ope-regions | weather(simulated) ope | Regions |
| cgfs14eu | cmeteo-cgfs14eu-ope-zones | weather(simulated) ope | agri-environmental zones |
| cgms12eu | cmeteo-cgms12eu-regions | weather(observed) | Regions |
| cgms12eu | cmeteo-cgms12eu-zones | weather(observed) | agri-environmental zones |
| cgms_asia_08 | cmeteo-cgms-asia | Weather - CGMS - Analysis | Regions |
| gwsi_e_12 | cmeteo-gwsi_e | Weather - GWSI - Analysis | Regions |
| gwsi_t_12 | cmeteo-gwsi_t | Weather - GWSI - Analysis | Regions |


  • The theme must be supplied in p_theme. See the table above for possible alternatives. It is an extraction of the process table (PRCMN_PROCESSES) in MRSMAN.
  • The spatial level must be supplied in p_spatial. See the table above for possible alternatives. It is an extraction of the process table (PRCMN_PROCESSES) in MRSMAN.
  • (Optional) An interval or day to process in the set of: p_yr2start (yyyy),p_dy2start (ddmmyyyy),p_yr2end (yyyy),p_dy2end (ddmmyyyy). CMETEO uses several possibilities to define an interval. A coarse way is to mention a year to start (p_yr2start) and a year to end (p_yr2end). In this case all days from the first of January of the starting year to the last day of December of the ending year will be processed. A more fine grained way is to provide a date to start (p_dy2start) and a date to end (p_dy2end) in which case only the days in between will be processed, or even one day if both entered dates are the same. You may supply any combination of years and dates. Note that in case you enter both years and dates the entered dates will overrule entered years. If you don't supply any, the default interval will be chosen, which starts at the first of January of the current year and runs up to the day before today. In fact, no day in the future will be processed.
  • (Optional) an indicator (p_refresh) to enable update of previously aggregated weather indicators, being one of 'yes', 'no'. The default is 'yes'. This one will be used when no value for this parameter has been supplied.
  • (Optional) A set of identifiers of regions (q_regmap) to be processed. You may add a query or sql statement that delivers the preferred subset of regions for which you need cmeteo. The regions must be identified by their reg_map_id as found in table REGION_MAPPINGS in db schema FORALL.

If supplied, the selected set must contain some or all reg_map_ids of level 0 (!) for the relevant ROI, e.g.: select rm.reg_map_id from region_mappings rm where rm.reg_level = 0 and rm.reg_code in ('LU', 'BE', 'NL'); You may also supply a limited, comma separated list of ids of region mappings. If you leave this parameter empty, all regions of the ROI will be processed.
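
The rules for the interval to process (dates overrule years, and a default interval applies when nothing is supplied) can be sketched as below. This is an illustration under assumptions, not the CMETEO code: the function name is hypothetical, the start and end are resolved independently of each other, and an end date later than today is cut back to today.

```python
from datetime import date, datetime, timedelta


def resolve_interval(today, yr2start=None, yr2end=None, dy2start=None, dy2end=None):
    """Return (first day, last day) to process; dates are strings in ddmmyyyy format."""
    def parse(text):
        return datetime.strptime(text, "%d%m%Y").date()

    if dy2start is not None:        # an entered date overrules an entered year
        start = parse(dy2start)
    elif yr2start is not None:
        start = date(yr2start, 1, 1)
    else:                           # default: first of January of the current year
        start = date(today.year, 1, 1)

    if dy2end is not None:
        end = parse(dy2end)
    elif yr2end is not None:
        end = date(yr2end, 12, 31)
    else:                           # default: the day before today
        end = today - timedelta(days=1)

    return start, min(end, today)   # assumption: never run into the future


today = date(2014, 2, 24)
default_interval = resolve_interval(today)
years_only = resolve_interval(today, yr2start=2012, yr2end=2013)
date_overrules_year = resolve_interval(today, yr2start=2012, yr2end=2013, dy2start="15062013")
print(default_interval, years_only, date_overrules_year)
```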

Several data sources work with the generic interface of CMETEO. The most important ones are listed in the table below.

Datasets (in CGMS12EU) used by CMETEO:
| dataset containing | base set | adapted for CMETEO | name in CMETEO | remarks |
| weather indicators per gridcell for meteodata | WEATHER_OBS_GRID | CMETEO_WEATHER_OBS_GRID (view) | CMETEO_GRID_WEATHER | input for cmeteo; intermediate set contains standardized attribute names |
| weather indicators per administrative region | WEATHER_OBS_REGION | CMETEO_WEATHER_OBS_REGION (view) | CMETEO_REGION_WEATHER | output of cmeteo; intermediate set contains standardized attribute names |
| weather indicators per agri-environmental region | WEATHER_OBS_AEZ | CMETEO_WEATHER_OBS_AEZ (view) | CMETEO_REGION_WEATHER | output of cmeteo; intermediate set contains standardized attribute names |
| landcovers or crops to be evaluated | CMETEO_LANDCOVER_EU (in schema FORALL) | CMETEO_LANDCOVERS (synonym) | CMETEO_LANDCOVERS | input for cmeteo; intermediate set contains standardized attribute names |
| areas/surfaces for landcovers per grid and per admin. region | LINK_REGION_GRID_LANDCOVER | CMETEO_REGION_GRID_LANDCOVERS (view) | CMETEO_GRID_LANDCOVER_AREAS | input for cmeteo; intermediate set contains standardized attribute names |
| areas/surfaces for landcovers per grid and per agri-env. region | LINK_AEZ_GRID_LANDCOVER | CMETEO_AEZ_GRID_LANDCOVERS (view) | CMETEO_GRID_LANDCOVER_AREAS | input for cmeteo; intermediate set contains standardized attribute names |
| areas/surfaces for crops per administrative region | AGGREGATION_AREAS | CMETEO_AGGREGATION_AREAS (view) | CMETEO_AGGREGATION_AREAS | input for cmeteo |
| weather indicators per region which are rejected |  |  | CMETEO_REJECTED_WEATHER | output of cmeteo, containing weather indicators per region and date that could not be stored in regular output dataset via cmeteo_region_weather |
| regions and their mappings | CMETEO_REGION_MAPPINGS (in schema MRSMAN) | CMETEO_REGION_MAPPINGS (synonym) | CMETEO_REGION_MAPPINGS | input for cmeteo |
| regions to process | CMETEO_REGIONS (in schema MRSMAN) | CMETEO_REGIONS (synonym) | CMETEO_REGIONS | input/output for cmeteo |


Various schemas hold datasets that contain weather indicators, e.g. for meteo gridcells or regions of some type. However, these datasets vary both in their name and in the names of their attributes. The collection of attributes may also vary (with or without day_offsets, with or without members of the meteo ensemble). All the sets do contain the indicators that are essential for the processing of CMETEO. To avoid adapting the CMETEO application every time another dataset in a schema must be processed, or a new schema is included in the processing scheme of CMETEO, attention was paid to standardising the structure of the datasets, both as input and as output. The standardized dataset for input is called CMETEO_GRID_WEATHER. For each variant of a dataset containing grid weather, a view is created that presents all attributes needed for the CMETEO process, either as data selected from the dataset or as a dummy value. CMETEO just reads this data.

For the output of CMETEO, a view is also created for each variant of output. In some schemas several such views can exist. In most cases, the name of a view hints at the name of the dataset on which it is based. CMETEO uses a synonym, CMETEO_REGION_WEATHER, that denotes the proper view, to access the data and to write the results. At the start of the processing, CMETEO creates the synonym so that it points to the right view (for either regions or AEZ).