To make the most of the value that analysts add through further data processing and through map and graphic display analysis, the MARS analysts need a tool in a common web framework that is able, at the same time, to:
- visualize the results of MCYFS according to user definition of pages/content (portal concept)
- visualize specific pre-defined process output (warehouse concept)
- enable interactive re-parameterisation and re-launch of processes
- enable interactive geographical data visualisation.
In 2003, such a tool was developed to provide MARS statisticians with an integrated interface that reduces the time needed to access the information and supports the analysis through more automated data processing; this tool is referred to here as COBO.
COBO links the heterogeneous environments used by the analysts in the forecast process. It consists of a mix of newly developed and automated procedures in a web-based environment, which provides an integrated guide for the user through the different steps of producing the estimates.
Each COBO user has an account with a password and a role, which together limit access to the data. Two types of user are foreseen: the analyst, who performs the analysis and proposes forecasts for selected crops and countries, and the administrator, who is responsible for “opening” the analysis (starting the whole procedure by loading updated data) and for deciding on the final forecasts to be published.
The statistical part of COBO
The statistical forecasting process can be subdivided into three main phases, all integrated in the COBO tool. Parts of the first two phases are run automatically in CGMS Level 3 (Yield Forecasting), developed in the S+ environment, while other analyses are carried out in COBO through SPSS.
First phase: data import and inspection
The first phase of the statistical forecasting process consists of updating the reference official time series (Eurostat data).
Every 30 days, Eurostat sends by email a file extracted from its CRONOS DB, containing time series of crop area, yield and production at national and regional level; in the following steps these data are used to study and evaluate trends.
The COBO page for data import:
COBO allows the administrator to inspect the imported data, both graphically and in table format, and, if necessary, to change imported values manually:
Second phase: production and proposal of forecasts by the analysts
Once the data loading process has been completed and confirmed by the administrator, each analyst is allowed to analyze and propose forecasts separately for each crop and country, following the pre-defined scheme described above.
The forecast production process lets each analyst carry out an independent analysis and, at the end, propose to the administrator one or more plausible forecasts per country and crop; COBO registers all proposed values and displays them on a summary page so that the administrator can make the final choice.
The steps performed by the analysts are described below in the order proposed by COBO. This order reflects their priority.
The CGMS trend analysis and prediction model
Official statistics of regional mean yields are predicted automatically by the CGMS using one of its simulated predictors:
- Potential dry weight of the simulated biomass (ton.ha⁻¹).
- Water limited dry weight of the simulated biomass (ton.ha⁻¹).
- Potential dry weight of the simulated storage organs (ton.ha⁻¹).
- Water limited dry weight of the simulated storage organs (ton.ha⁻¹).
The statistical sub system of the CGMS uses a combination of a linear time trend and crop growth simulation results.
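As a rough illustration of what combining a linear time trend with a simulation result can look like, the sketch below fits yield = b0 + b1 * year + b2 * simulated by ordinary least squares. This is an assumption made for illustration: the text does not specify the exact model form or estimation method used by the CGMS statistical sub system, and the function names are hypothetical.

```python
from typing import Sequence, Tuple


def fit_trend_plus_simulation(
    years: Sequence[int],
    simulated: Sequence[float],
    official_yield: Sequence[float],
) -> Tuple[float, float, float]:
    """Least-squares fit of: yield = b0 + b1 * year + b2 * simulated.

    Assumed model form (not taken from the CGMS documentation).
    Returns (b0, b1, b2), solved in closed form on centred data.
    """
    n = len(years)
    if not (n == len(simulated) == len(official_yield)) or n < 3:
        raise ValueError("need at least three aligned observations")

    # Centre every series so the intercept drops out of the normal equations.
    t_bar = sum(years) / n
    s_bar = sum(simulated) / n
    y_bar = sum(official_yield) / n
    t = [v - t_bar for v in years]
    s = [v - s_bar for v in simulated]
    y = [v - y_bar for v in official_yield]

    # Sums of squares and cross products for the 2x2 normal equations.
    s_tt = sum(a * a for a in t)
    s_ss = sum(a * a for a in s)
    s_ts = sum(a * b for a, b in zip(t, s))
    s_ty = sum(a * b for a, b in zip(t, y))
    s_sy = sum(a * b for a, b in zip(s, y))

    det = s_tt * s_ss - s_ts * s_ts
    if abs(det) < 1e-12:
        raise ValueError("year and simulated predictor are collinear")

    b1 = (s_ty * s_ss - s_sy * s_ts) / det  # time trend coefficient
    b2 = (s_sy * s_tt - s_ty * s_ts) / det  # simulation coefficient
    b0 = y_bar - b1 * t_bar - b2 * s_bar    # intercept
    return b0, b1, b2


def predict_yield(coeffs: Tuple[float, float, float], year: int, simulated: float) -> float:
    """Apply the fitted coefficients to one year and one simulated value."""
    b0, b1, b2 = coeffs
    return b0 + b1 * year + b2 * simulated
```

Here `simulated` would hold one of the four predictors listed above (for example, water limited dry weight of storage organs) for each year of the official yield series.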
As a first attempt, the analyst considers the CGMS results produced by the default launch of CGMS Level 3 and displayed automatically in COBO:
If the analyst is not satisfied with the automatic results from CGMS Level 3, part of the time series analysis can be re-parameterized (for instance, by changing the length of the time series) and a new CGMS simulation launched:
Trend analysis in COBO
When the accuracy of the CGMS prediction is deemed insufficient for a given combination of country and crop, the MARS analyst redefines the trend periods and functions. First, trends over a longer period (1975 to the current year) are determined, if yield statistics are available for that period. Next, trends over more recent periods are studied. Besides the trend period, different trend functions (linear, quadratic, logarithmic and exponential) are studied:
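The comparison of trend functions can be sketched as follows. This is a minimal illustration, not the COBO implementation: the function names are hypothetical, the exponential trend is fitted on the log scale (which assumes strictly positive yields), and the in-sample root mean squared error (RMSE) is used as an assumed comparison criterion.

```python
import math
from typing import Dict, List, Sequence


def least_squares(design: List[List[float]], y: Sequence[float]) -> List[float]:
    """Solve the normal equations (X^T X) beta = X^T y by Gaussian elimination."""
    k = len(design[0])
    # Augmented matrix [X^T X | X^T y].
    a = [
        [sum(r[i] * r[j] for r in design) for j in range(k)]
        + [sum(r[i] * v for r, v in zip(design, y))]
        for i in range(k)
    ]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, k):
            f = a[r][col] / a[col][col]
            for c in range(col, k + 1):
                a[r][c] -= f * a[col][c]
    beta = [0.0] * k
    for i in range(k - 1, -1, -1):
        tail = sum(a[i][j] * beta[j] for j in range(i + 1, k))
        beta[i] = (a[i][k] - tail) / a[i][i]
    return beta


def rmse(observed: Sequence[float], fitted: Sequence[float]) -> float:
    """Root mean squared error between observed and fitted values."""
    n = len(observed)
    return math.sqrt(sum((o - f) ** 2 for o, f in zip(observed, fitted)) / n)


def trend_rmse(years: Sequence[int], yields: Sequence[float]) -> Dict[str, float]:
    """Fit four candidate trend functions and return the in-sample RMSE of each."""
    first = years[0]
    # Time index starts at 1 so that log(t) is defined for the first year.
    t = [float(y - first + 1) for y in years]
    fitted: Dict[str, List[float]] = {}

    b = least_squares([[1.0, ti] for ti in t], yields)
    fitted["linear"] = [b[0] + b[1] * ti for ti in t]

    b = least_squares([[1.0, ti, ti * ti] for ti in t], yields)
    fitted["quadratic"] = [b[0] + b[1] * ti + b[2] * ti * ti for ti in t]

    b = least_squares([[1.0, math.log(ti)] for ti in t], yields)
    fitted["logarithmic"] = [b[0] + b[1] * math.log(ti) for ti in t]

    # Exponential trend y = exp(b0 + b1 * t), fitted on log(yield).
    b = least_squares([[1.0, ti] for ti in t], [math.log(v) for v in yields])
    fitted["exponential"] = [math.exp(b[0] + b[1] * ti) for ti in t]

    return {name: rmse(yields, values) for name, values in fitted.items()}
```

Calling `trend_rmse` on different sub-periods of the same series (for example, from 1975 onwards and from a more recent start year) gives one way to compare trend periods as well as trend functions. In-sample RMSE tends to favour the more flexible functions, so it is only a starting point for the analyst's judgement.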
Scenario-analysis in SPSS
To deal with the residual uncertainty arising from the unknown evolution of the season between the moment the forecast is issued and the moment the crop is harvested, agro-meteorological scenarios can be produced and analysed. The scenario analyses consist of finding the most similar agro-meteorological years, based on the time series of parameters simulated by the CGMS. The analysis is based on Principal Component Analysis (PCA), Factor Analysis and Cluster Analysis; the crop indicators of the CGMS for all available years are used as input. Further elements are given in section 3.6.
Once agro-climatic years similar to the current year have been detected, the resulting range of final yields of these years can be attributed to the current season. If a trend exists, the range of final yields is corrected for this trend before the yields are related to the current year.
Scenario analyses and the subsequent forecast computations are carried out in COBO by calling the SPSS routines; the MARS analyst can decide which indicators are used for each country and crop. Scenario analyses are always executed, for two reasons: to produce and constantly update pessimistic and optimistic crop yield scenarios at EU level at any time of the year, and to provide output that can serve as the principal forecast if the other models do not perform well statistically.
As in the CGMS section, the analyst is shown a set of “default” scenario analyses carried out automatically every 10 days, and can also create a new scenario fitted to the specific situation being analyzed:
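The idea of attributing the trend-corrected yields of similar years to the current season can be sketched as below. Note that this is a deliberately simplified stand-in: it selects similar years by nearest-neighbour distance on standardized indicators, not by the PCA, Factor Analysis and Cluster Analysis performed in SPSS, and it assumes a linear yield trend. All function and parameter names are hypothetical.

```python
import math
from statistics import mean, pstdev
from typing import Dict, List, Sequence, Tuple


def similar_years(
    indicators: Dict[int, Sequence[float]],
    current: Sequence[float],
    k: int = 3,
) -> List[int]:
    """Return the k past years whose indicator vectors are closest to `current`.

    Simplified stand-in for the PCA / cluster analysis step: Euclidean
    distance on indicators standardized to zero mean and unit variance.
    """
    years = sorted(indicators)
    n_ind = len(current)
    columns = [[indicators[y][j] for y in years] + [current[j]] for j in range(n_ind)]
    mu = [mean(c) for c in columns]
    sd = [pstdev(c) or 1.0 for c in columns]  # guard against constant indicators

    def standardize(v: Sequence[float]) -> List[float]:
        return [(v[j] - mu[j]) / sd[j] for j in range(n_ind)]

    z_current = standardize(current)
    dist = {y: math.dist(standardize(indicators[y]), z_current) for y in years}
    return sorted(years, key=lambda y: dist[y])[:k]


def scenario_range(
    yields: Dict[int, float],
    analogues: Sequence[int],
    target_year: int,
) -> Tuple[float, float]:
    """Project the yields of the analogue years onto `target_year`.

    Each analogue contributes its deviation from an assumed linear trend,
    added to the trend value of the target year. Returns (low, high).
    """
    yrs = sorted(yields)
    y_bar = mean(yrs)
    v_bar = mean(yields[y] for y in yrs)
    slope = sum((y - y_bar) * (yields[y] - v_bar) for y in yrs) / sum(
        (y - y_bar) ** 2 for y in yrs
    )
    intercept = v_bar - slope * y_bar

    def trend(year: int) -> float:
        return intercept + slope * year

    projected = [trend(target_year) + (yields[y] - trend(y)) for y in analogues]
    return min(projected), max(projected)
```

The low and high values returned by `scenario_range` play the role of the pessimistic and optimistic yield scenarios for the current season.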
Summary results of the current analysis are displayed on the COBO page, with a link to the detailed SPSS output:
Trimmed average or custom values
The last options available for the forecast in COBO are the average of the last five years or a trimmed average:
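Both averages are simple to compute; a minimal sketch follows. The text does not define how the trimmed average is trimmed, so the version here assumes that the single lowest and single highest values are dropped before averaging. The function names are hypothetical.

```python
from typing import Dict, Sequence


def last_n_average(yields_by_year: Dict[int, float], n: int = 5) -> float:
    """Average yield over the n most recent years in the series."""
    recent = [yields_by_year[y] for y in sorted(yields_by_year)[-n:]]
    if not recent:
        raise ValueError("no yield data available")
    return sum(recent) / len(recent)


def trimmed_average(values: Sequence[float], trim: int = 1) -> float:
    """Average after dropping the `trim` lowest and `trim` highest values.

    The trimming rule is an assumption made for this sketch.
    """
    if len(values) <= 2 * trim:
        raise ValueError("not enough values left after trimming")
    kept = sorted(values)[trim : len(values) - trim]
    return sum(kept) / len(kept)
```

A trimmed average is less sensitive than the plain five-year average to a single exceptional season, such as a severe drought year.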
Third phase: the final choice of the forecast
The different elements (prediction models, trend analysis, scenario analysis, trimmed average and the analyst's judgement of the results) lead to a yield forecast. Based on their experience and expert knowledge, the MARS analysts propose the yield forecast during the MARS bulletin meetings. The proposed forecasts are checked by the statistical administrator and discussed collegially, with the aim of optimising the pattern of reliability given by all sources, i.e. giving priority to the “preferred predictors” according to the moment of the year. Decisions, final choices and all intermediate steps taken through the COBO tool are registered and traceable, and the whole forecast process is therefore transparent and repeatable. The following chart summarises the approach: