Various authors have proposed subdividing crop yield into three components: mean yield, multi-annual trend and residual variation (e.g. Vossen, 1989; Dagnelie et al., 1983; Dennet et al., 1980; Odumodu and Griffits, 1980). It is assumed that the interacting effects of climate, soil, management, technology, etc. determine the mean yield. Observed national, regional and sub-regional yields show a trend over time. The trend is mainly due to long-term economic and technological dynamics such as increased fertiliser application, improved crop management methods, new high-yielding varieties, etc. The third component, the residual variation, is considered to be the variation among years (Dennet et al., 1980). It is exactly this part which should be explained by weather, crop and remote sensing indicators.
According to Dennet et al. (1980) and Odumodu and Griffits (1980), the technological time trend should be removed from the crop yield time series, assuming that the residual variation is independent of that trend. This approach can be summarised as (Vossen, 1989):
$Y_{T,\mathrm{obs}} = Y_{\mathrm{avg}} + f(T) + e$
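The decomposition can be illustrated with a minimal sketch. It assumes a linear form for f(T), NumPy for the arithmetic, and made-up yield values; the function name `detrend_linear` is hypothetical and not part of the CGMS.

```python
import numpy as np

def detrend_linear(years, yields):
    """Split a yield series into mean, linear trend and residual parts."""
    years = np.asarray(years, dtype=float)
    yields = np.asarray(yields, dtype=float)
    mean_yield = yields.mean()
    # Centre the time axis so the fitted trend averages to zero and the
    # mean yield remains a separate component.
    t = years - years.mean()
    slope = np.dot(t, yields - mean_yield) / np.dot(t, t)
    trend = slope * t
    residual = yields - mean_yield - trend
    return mean_yield, trend, residual

# Made-up yields (ton.ha-1) for ten years, for illustration only.
years = np.arange(1980, 1990)
yields = np.array([4.1, 4.0, 4.4, 4.3, 4.7, 4.5, 4.9, 5.2, 5.0, 5.4])
mean_yield, trend, residual = detrend_linear(years, yields)
```

The residual series is the part that would subsequently be regressed against weather or other indicators.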
Palm and Dagnelie (1993) fitted various time trend functions to national yield series (ton.ha-1) of several crops for 9 EU member states. Regressions were executed for the period prior to 1983 and a forecast for 1983 was made. This procedure was repeated for successive years up to 1988. The prediction results were compared with national yield values. Of the tested functions, a quadratic function of time performed best. However, differences with a simple linear trend function were small. In a next step, these authors removed the trend from the yield series using the quadratic function. The residuals for the period prior to 1983 were regressed against various meteorological parameters and a prediction for 1983 was made. Again, this procedure was repeated for successive years up to 1988. This was done for 19 Departments in France. Comparing the predicted and official yield series demonstrated that the applied meteorological variables did not improve the prediction accuracy.
Swanson and Nyankori (1979), for corn and soybean production in the USA, Sakamoto (1978), for wheat production in South Australia, and Agrawal and Jain (1982), for rice yields in the Raipur District in India, considered the technological time trend to be dependent on the residual variation. According to Winter and Musick (1993), Hough (1990) and Smith (1975), weather affects farm management practices such as planted area, timing of field operations, application of inputs, etc. Hence, the time trend should be analysed simultaneously with the explanatory variables. This approach can be summarised as (Vossen, 1989):
$Y_{T,\mathrm{obs}} = b_0 + f(T) + f(\mathrm{weather}) + e$
Swanson and Nyankori (1979) showed that the time trend was underestimated when weather data were not analysed simultaneously with the time trend. Similar results were found for millet in Botswana (Vossen, 1989).
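How omitting weather can distort the estimated trend is sketched below. This is not a reproduction of the cited studies: the data are made up and noise-free, the weather index `W` is hypothetical and happens to decline over the sample period, and both f(T) and f(weather) are assumed to be linear.

```python
import numpy as np

# Made-up series: ten years, a weather index that tends to decline over time.
T = np.arange(10, dtype=float)
W = np.array([1.0, 0.8, 0.9, 0.6, 0.7, 0.4, 0.5, 0.2, 0.3, 0.0])
Y = 3.0 + 0.1 * T + 0.5 * W  # assumed "true" relation, no noise

# Trend fitted on its own: the weather effect leaks into the slope.
X_trend = np.column_stack([np.ones_like(T), T])
trend_only, *_ = np.linalg.lstsq(X_trend, Y, rcond=None)

# Trend and weather fitted simultaneously.
X_joint = np.column_stack([np.ones_like(T), T, W])
joint, *_ = np.linalg.lstsq(X_joint, Y, rcond=None)

print("slope, trend only:", trend_only[1])
print("slope, trend and weather:", joint[1])
```

In this constructed case the trend-only slope comes out at roughly half the slope used to generate the data, while the joint fit recovers it. With real data the size and direction of the distortion depend on how weather happened to evolve over the period.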
The previous equation does not account for the interaction between crop growth and weather variability. Root characteristics and soil physical properties are not accounted for either. Therefore Vossen (1990b, 1992) proposed using crop growth simulation results to describe year-to-year yield variation. In a crop growth simulation model, weather and soil characteristics are summarised, and crop characteristics, including yield, form the output; i.e. simulation results quantitatively represent the influence of weather variables on crop growth. The yield can be written as:
$Y_{T,\mathrm{obs}} = b_0 + f(T) + f(\mathrm{simulation}) + e$
Official statistics of regional mean yields are predicted by the CGMS using one of the following simulated predictors (see Crop Simulation):
- Potential above ground biomass (ton.ha-1 dry weight)
- Water limited above ground biomass (ton.ha-1 dry weight)
- Potential storage organs biomass (ton.ha-1 dry weight)
- Water limited storage organs biomass (ton.ha-1 dry weight)
Originally, it was intended to predict yields solely from the water limited weight of storage organs. Later on, the other three predictors were added. Water limited yield, for instance, is inappropriate for a region with a lot of irrigation. Furthermore, drought stress can be strongly reduced where groundwater influences the crop, a factor that is not included in the CGMS. The simulated biomass indicators were added because they are more robust and less sensitive to modelling errors in the distribution of assimilates. Moreover, they also allow yield prediction during the growing season, when grain filling has not yet started or grains are still very small (de Koning et al., 1993). In the current system all of the crop parameters can be used as predictors, including Leaf Area Index, Soil Moisture and Development Stage. With little effort any indicator can be added, such as the ratio between the estimated actual crop transpiration and potential crop transpiration (Ta/Tp, see Crop Simulation) or remote sensing indicators (see Remote Sensing).
The statistical sub system of the CGMS uses a combination of a linear time trend and crop growth simulation results as proposed by Vossen (1990b, 1992). This prediction model can be described as:
$Y_T = b_0 + b_1 T + b_2 S_T$
Sub-optimal production circumstances such as drought, low temperatures etc. are allowed for by the constant b2, which should lie between 0 and 1.
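A minimal sketch of fitting this model by ordinary least squares and applying it one year ahead is given below. It assumes that S_T denotes the simulated indicator for year T, that T is counted in years since the start of the window, and that nine years of data are available. The values are made up and noise-free, and the function name `fit_trend_plus_simulation` is hypothetical.

```python
import numpy as np

def fit_trend_plus_simulation(t, s, y):
    """Least-squares estimates (b0, b1, b2) for y = b0 + b1*t + b2*s."""
    design = np.column_stack([np.ones_like(t), t, s])
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coeffs

# Nine-year window: year index, simulated indicator, official yield (made up).
t_window = np.arange(9, dtype=float)
s_window = np.array([6.0, 7.5, 5.5, 8.0, 6.5, 7.0, 5.0, 8.5, 7.5])
y_window = 2.0 + 0.05 * t_window + 0.4 * s_window  # noise-free by construction

b0, b1, b2 = fit_trend_plus_simulation(t_window, s_window, y_window)

# Forecast for the tenth year from its simulated indicator value.
t_next, s_next = 9.0, 6.0
forecast = b0 + b1 * t_next + b2 * s_next
```

Because the example data contain no noise, the fit returns the coefficients used to generate them, and b2 falls inside the expected range of 0 to 1.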
Per region, for a moving window of at least 9 years, the regression coefficients are established and subsequently used for yield prediction of the 10th year ('one-year-ahead'). The selection of the predictor to forecast the final yield is as follows:
- Each candidate predictor is fitted to the data currently available for this region.
- Candidates with a negative estimate of b2 are rejected because of the nature of the process.
- From the remaining ones, that with the lowest jackknife mean square error is selected.
Jackknife errors are calculated by treating an observation as absent and using the predictor to assess its value. This reveals the error in predicting the observation that had been kept out of sight. Obviously, jackknife errors are not entirely relevant in the present situation, where we want to predict the future rather than reconstruct the past. For direct application it is more relevant to investigate the one-year-ahead prediction. Still, the jackknife method is used because the jackknife error-size estimates are less variable, being based on a larger number of predictions. With the same number of observations 'n', the jackknife method has 'n' error estimates while the 'one-year-ahead' prediction has only 'n-y' error estimates, where 'y' is the number of years on which the prediction is based. More detailed descriptions are given by de Koning et al. (1993) and Jansen (1995).
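The selection steps and the jackknife error can be sketched as follows. This is an illustration under stated assumptions, not the CGMS implementation: the model is the linear trend plus one simulated indicator, the jackknife is taken to be leave-one-out refitting, all data are made up, and the candidate names (including the artificial `inverse_indicator`) and function names are hypothetical.

```python
import numpy as np

def fit(T, S, Y):
    """Least-squares (b0, b1, b2) for Y = b0 + b1*T + b2*S."""
    X = np.column_stack([np.ones_like(T), T, S])
    b, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return b

def jackknife_mse(T, S, Y):
    """Mean squared error when each year in turn is left out and predicted."""
    errors = []
    for i in range(len(Y)):
        keep = np.arange(len(Y)) != i
        b = fit(T[keep], S[keep], Y[keep])
        errors.append(Y[i] - (b[0] + b[1] * T[i] + b[2] * S[i]))
    return float(np.mean(np.square(errors)))

def select_predictor(T, candidates, Y):
    """Reject candidates with negative b2, then pick the lowest jackknife MSE."""
    best_name, best_mse = None, np.inf
    for name, S in candidates.items():
        if fit(T, S, Y)[2] < 0:
            continue
        mse = jackknife_mse(T, S, Y)
        if mse < best_mse:
            best_name, best_mse = name, mse
    return best_name, best_mse

# Made-up data for one region and ten years.
T = np.arange(10, dtype=float)
wl_storage = np.array([6.0, 7.5, 5.5, 8.0, 6.5, 7.0, 5.0, 8.5, 7.5, 6.0])
pot_biomass = np.array([12.0, 12.5, 12.2, 12.8, 12.1, 12.9, 12.4, 12.6, 12.3, 12.7])
noise = np.array([0.05, -0.03, 0.02, -0.04, 0.01, 0.03, -0.05, 0.02, -0.01, 0.0])
Y = 2.0 + 0.05 * T + 0.4 * wl_storage + noise

candidates = {
    "water_limited_storage_organs": wl_storage,
    "potential_biomass": pot_biomass,
    "inverse_indicator": 10.0 - wl_storage,  # artificial; yields a negative b2
}
best_name, best_mse = select_predictor(T, candidates, Y)
```

In this example the artificial inverse indicator fits the data as well as the water limited storage organs, but it is rejected because its b2 estimate is negative.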
A quadratic trend function is also considered in the CGMS. However, based on results of Palm and Dagnelie (1993) and de Koning et al. (1993), it was concluded that a linear trend sufficiently describes the increasing official yields. A smooth trend of any type over a large number of years assumes a continuity which might be unrealistic (de Koning et al., 1993; Vossen, 1992; Vossen, 1990a). According to Vossen and Rijks (1995) the predictor should only be based on data from the recent past. The series should nevertheless be long enough to give a sufficient number of degrees of freedom in the regression analysis. Gradual shifts in the time trend are allowed for by the shortness of the time series used to derive the predictor.
Required input data are stored in the tables
The statistics have a wider range of crops than the ones considered by the Crop Simulation. Therefore yields of some of the 'statistical crops' are forecasted using the same 'CGMS crop'. This relation is stored in table STAT_CROP.
To be able to run the forecast in batch mode, all model parameters are stored in advance in tables:
Every ten days all stored models are run and results are written to the tables:
Before the start of each growing season, yield forecasts are produced based on the long-term average and corrected for a technological trend. The MARS analyst can change the length of the time series. This re-defines the trend function and results in different CGMS level 3 forecasts.
When the accuracy for a certain combination of country and crop is deemed insufficient, the MARS analyst starts to redefine trend periods and functions using Excel, SPSS or the user interface of the CGMS statistical tool.
First, trends for a longer period (1975 until current year) are determined if yield statistics for such a period are available. Next, trends for more recent periods are studied. For Eastern Europe the period after 1990 is used (to exclude strong changes caused by political changes around 1990). For countries within the European Union the period after 1992 is important because in 1992 the Common Agricultural Policy went through important changes that affected yield and planted areas.
Besides changing the trend period, different trend functions are studied. Yield statistics of each country are taken directly from the CRONOS database, which is updated each month. Linear, quadratic and other types of trend are studied. MARS analysts also study the minimum and maximum trend evolution by separating the data set into two groups representing the 50% highest and 50% lowest values.
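The text does not specify how the two groups are formed. One possible reading, sketched below, splits the years at the median residual around an overall linear trend and fits a separate linear trend to each half. The splitting rule, the function names and the yield values are all assumptions made for illustration.

```python
import numpy as np

def linear_trend(years, values):
    """Fit a line on a centred time axis; return (slope, intercept, centre)."""
    centre = years.mean()
    slope, intercept = np.polyfit(years - centre, values, 1)
    return slope, intercept, centre

def trend_line(years, fit):
    """Evaluate a fitted trend at the given years."""
    slope, intercept, centre = fit
    return intercept + slope * (years - centre)

# Made-up national yields (ton.ha-1).
years = np.arange(1980, 1990, dtype=float)
yields = np.array([4.1, 4.0, 4.4, 4.3, 4.7, 4.5, 4.9, 5.2, 5.0, 5.4])

# Assumed rule: split at the median residual around the overall trend.
overall = linear_trend(years, yields)
residual = yields - trend_line(years, overall)
upper = residual >= np.median(residual)

# Separate trends for the higher and the lower half of the data.
upper_line = trend_line(years, linear_trend(years[upper], yields[upper]))
lower_line = trend_line(years, linear_trend(years[~upper], yields[~upper]))
```

Splitting on raw yield values instead would mostly separate early years from late years in a trending series, which is why the residual-based split is used in this sketch.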
Other prediction models
In some cases the MARS analyst builds his/her own prediction models for certain combinations of crops and countries. These models can include any number of indicators from the Crop Simulation module, Weather Monitoring module or Remote Sensing module. They can be built in SPSS or the CGMS statistical tool.