Calibration Manager

From Agri4castWiki
Revision as of 14:22, 9 April 2018 by Raymond


Introduction

The Calibration Manager is a Python package that links the Python implementation of WOFOST (PCSE) to the open optimization library NLopt and to the various databases required to run and optimize the model. This makes it possible to calibrate one crop parameter or a combination of crop parameters (e.g. TSUM1 and TSUM2, SPAN and SLATB, TDWI) using one or more target variables (e.g. day of anthesis, day of maturity, LAI-max, harvest index). The selected target variables are combined in a single objective function that is optimized. The calibration manager offers additional functionality, such as normalizing target variables, assigning weights to experiments and calendars, setting criteria for when expert knowledge is allowed to enter the optimization, and applying additional crop masks to exclude non-agricultural areas from the calibration.

Architecture

Calibration manager architecture.jpg

Components:

  • Calibration manager (Python package)
  • PCSE (Python Wofost crop model)
  • NLopt (nonlinear optimization module)
  • AgroPheno database (SQLite) defining observed:
    • Start of season (DOP or DOE)
    • Flowering (DOA)
    • End of season (DOM or DOH)
    • Target LAI (LAI-max and LAI-end)
    • Target Harvest index (HI)
  • CGMS12 or CGMS14 database (Oracle) defining

Calibration of TSUMS

TSUM1 and TSUM2 (and TSUMEM) can be calibrated together by optimizing a combined objective function. Instead of using a separate simple model to simulate phenology, this calibration uses the full Python WOFOST implementation, including vernalization and photoperiodicity, and is thus also suitable for calibrating winter crops with more complex temperature dependencies.

Using experimental observations

Calibrating tsum1 tsum2.jpg

Preferably, experiments are used to calibrate TSUMS. As a first requirement, each experiment should have an observed start-of-season, which is needed to start a simulation and to simulate flowering and end-of-season. In the calibration, PCSE starts on either DOP or DOE, depending on which is available. There is no need to derive one from the other based on a rule of thumb or on a local average of the observed duration between DOP and DOE. As a second requirement, the experiment should have an observed flowering and/or an observed end-of-season, to compare with the simulated flowering and end-of-season. Within each zone, there should be at least one experiment with observed flowering and one with observed end-of-season to properly distribute TSUM1 and TSUM2 over the season. In summary: per calibration zone we need at least one DOP-DOA or DOE-DOA pair and at least one DOP-DOM, DOE-DOM, DOP-DOH or DOE-DOH pair.

Consider the example of a calibration zone with 8 experiments (see graph). For experiments 1-7, an observed start-of-season is available, so the simulation can start and simulated flowering and end-of-season are available. For experiments 1, 2, 3, 5, 6 and 7, observed flowering and/or end-of-season are available (green blocks). These can be compared to the simulated values, giving 7 pairs for which the difference can be calculated.
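The two requirements above can be illustrated with a minimal Python sketch. The record layout, the function name `comparable_phases` and all dates are assumptions made for illustration; in particular, the source does not state which experiment carries which observation, so the data below are invented only to reproduce the counts of the example (7 pairs in total).

```python
# Hypothetical records for the 8-experiment example; all dates are invented.
experiments = {
    1: {"DOP": "2005-10-15", "DOM": "2006-07-20"},
    2: {"DOP": "2005-10-20", "DOA": "2006-05-28", "DOM": "2006-07-25"},
    3: {"DOE": "2005-11-02", "DOM": "2006-07-18"},
    4: {"DOP": "2005-10-18"},                       # start only, no targets
    5: {"DOP": "2006-10-12", "DOM": "2007-07-22"},
    6: {"DOE": "2006-11-05", "DOA": "2007-05-30"},
    7: {"DOP": "2006-10-22", "DOH": "2007-08-01"},
    8: {"DOA": "2007-05-26", "DOM": "2007-07-19"},  # no start-of-season
}


def comparable_phases(record):
    """Return the observed phases that can be compared with a simulation."""
    # Requirement 1: without DOP or DOE the simulation cannot be started.
    if "DOP" not in record and "DOE" not in record:
        return []
    phases = []
    # Requirement 2: an observed flowering and/or end-of-season is needed.
    if "DOA" in record:
        phases.append("vegetative")
    if "DOM" in record or "DOH" in record:
        phases.append("full_season")
    return phases


pairs = [(exp_id, phase)
         for exp_id, record in experiments.items()
         for phase in comparable_phases(record)]
n_vegetative = sum(1 for _, phase in pairs if phase == "vegetative")
n_full_season = sum(1 for _, phase in pairs if phase == "full_season")
print(len(pairs), n_vegetative, n_full_season)  # 7 2 5
```

Experiment 4 contributes nothing because it has no observed target, and experiment 8 is dropped because the simulation cannot be started.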

All differences are combined in a single objective function f(x):

f(x) = √(f(a) + f(m))

where:

  • f(a) = mean squared difference of anthesis
  • f(m) = mean squared difference of maturity

The calibration manager searches for TSUM1 and TSUM2 where f(x) is minimized. In the above example, the observations from experiment 8 are not used because there is no observed start-of-season. Thus, flowering and maturity cannot be simulated and differences with observed flowering and maturity cannot be calculated.
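The objective function can be written down in a few lines of Python. This is a minimal sketch, not the package's own code: the function names are hypothetical, and dates are assumed to be expressed as day numbers so that differences are in days.

```python
import math


def mean_squared_difference(simulated, observed):
    """Mean of the squared differences between paired day numbers."""
    squared = [(s - o) ** 2 for s, o in zip(simulated, observed)]
    return sum(squared) / len(squared)


def objective(sim_anthesis, obs_anthesis, sim_maturity, obs_maturity):
    """f(x) = sqrt(f(a) + f(m)) for one candidate TSUM1/TSUM2 pair."""
    f_a = mean_squared_difference(sim_anthesis, obs_anthesis)
    f_m = mean_squared_difference(sim_maturity, obs_maturity)
    return math.sqrt(f_a + f_m)


# Invented day numbers: anthesis is off by 3 days, maturity by 4 days.
value = objective([153, 147], [150, 150], [204, 196, 204], [200, 200, 200])
print(value)  # 5.0
```

Here f(a) = 9 and f(m) = 16, so f(x) = √25 = 5 days. Note that the anthesis and maturity lists may have different lengths, because not every experiment has both observations.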

Weight for experiments

Calibrating tsum1 tsum2 weighted.jpg

To prevent grids that contain many experiments from dominating the calibration for the entire zone, the experiments are weighted in such a way that each grid has the same weight. Consider the example of a calibration zone with 8 experiments and 2 regional calendars (see graph). Three experiments (1, 2 and 3) fall inside grid number 1, so each of them is given a weight of 1/3. There is only a single experiment (5) in grid number 2, which is therefore given a weight of 1. The remaining experiments are weighted in the same way.
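The per-grid weighting can be sketched as follows. The function name and the mapping from experiment to grid are assumptions for illustration; only the four experiments whose grid is named in the text are included.

```python
from collections import Counter


def experiment_weights(grid_of_experiment):
    """Give each experiment weight 1/n, with n the experiments in its grid."""
    per_grid = Counter(grid_of_experiment.values())
    return {exp_id: 1.0 / per_grid[grid_no]
            for exp_id, grid_no in grid_of_experiment.items()}


# Experiments 1, 2 and 3 lie in grid 1; experiment 5 lies in grid 2.
weights = experiment_weights({1: 1, 2: 1, 3: 1, 5: 2})
for exp_id, weight in sorted(weights.items()):
    print(exp_id, round(weight, 3))
```

With this scheme the weights within every grid sum to 1, so a grid with three experiments counts as much as a grid with one.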

Including regional calendars

If not enough experimental data are available, regional calendars are added. Regional calendars are included if, within the zone:

  • There is not a single experiment defining the vegetative phase (DOP-DOA or DOE-DOA)
  • Or there is not a single experiment defining the full season (DOP-DOM or DOE-DOM or DOP-DOH or DOE-DOH)
  • Or the combined number of above phases is less than 15 (configurable).

In the example, there are 2 experiments defining the vegetative phase and 5 experiments defining the full season. Theoretically this is enough to calibrate TSUM1 and TSUM2 for this zone. However, because the combined number of these observations is lower than 15, the two regional calendars are also included.

Regional calendars are valid for many grids and many years. Therefore a weight is assigned to each calendar that is used, so that calendars do not outnumber the available experimental observations, which are only valid for a single grid and a single year. The weight is calculated in such a way that each calendar as a whole has a weight of 1. In the example, two regional calendars are found: regional calendars 9 and 10. Calendar 9 is assigned to 5 grids and is valid for the years 2001-2010: 5 × 10 = 50 combinations. Calendar 10 is assigned to 4 grids and is valid for the years 1991-2010: 4 × 20 = 80 combinations. Each combination of calendar 9 gets a weight of 1/50, and each combination of calendar 10 gets a weight of 1/80.
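The inclusion rule and the calendar weights can be sketched in Python as below. The function names and the `min_phases` parameter name are hypothetical; the default of 15 is the configurable threshold mentioned in the text.

```python
def calendars_needed(n_vegetative, n_full_season, min_phases=15):
    """True if regional calendars must supplement the experiments of a zone."""
    return (n_vegetative == 0
            or n_full_season == 0
            or n_vegetative + n_full_season < min_phases)


def calendar_weight(n_grids, first_year, last_year):
    """Weight per grid-year combination so that the calendar sums to 1."""
    n_years = last_year - first_year + 1
    return 1.0 / (n_grids * n_years)


print(calendars_needed(2, 5))          # True: 2 + 5 = 7 is below 15
print(calendar_weight(5, 2001, 2010))  # 0.02, i.e. 1/50
print(calendar_weight(4, 1991, 2010))  # 0.0125, i.e. 1/80
```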

To limit the amount of AgroPheno data included from regional calendars, only grids that fall within the Mapspam crop mask are included. This limitation is not applied to experiments. After excluding regional-calendar grids outside the Mapspam crop mask, some zones may still have no vegetative phase and full season defined. In that case, grids outside the Mapspam crop mask are allowed as well.
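A minimal sketch of this fallback is given below. It simplifies the rule to "fall back when no calendar grid lies inside the mask"; the function name and the data layout (a mapping from grid number to a mask flag) are assumptions for illustration.

```python
def select_calendar_grids(calendar_grids, in_crop_mask):
    """Prefer calendar grids inside the crop mask; otherwise allow all grids."""
    masked = [g for g in calendar_grids if in_crop_mask.get(g, False)]
    # Simplified fallback: with no grid inside the mask, grids outside
    # the mask are allowed as well.
    return masked if masked else list(calendar_grids)


# Invented grid numbers and mask flags.
mask = {101: True, 102: False, 103: True}
print(select_calendar_grids([101, 102, 103], mask))  # [101, 103]
print(select_calendar_grids([102, 104], mask))       # [102, 104]
```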

Processing sequence

The calibration manager works in the following way:

1 Loop over zones
  1.1 Get all grid_no’s of zone
  1.2 Loop over grids
    1.2.1 Get all experiments of grid
    1.2.2 Get regional calendars of grid, including a flag indicating whether the grid has a MAPSPAM pixel > 50 ha of physical area for rainfed wheat
    1.2.3 In case an experiment or regional calendar has DOH and not DOM, convert DOH to DOM by subtracting a (configurable) number of days
    1.2.4 In case an experiment or regional calendar has both DOH and DOM, remove DOH
  1.3 End loop over grids
  1.4 Count experiment combinations with DOP and DOE per grid
  1.5 Count experiment combinations with DOP-DOA and/or DOE-DOA
  1.6 Count experiment combinations with DOP-DOM, DOE-DOM
  1.7 Calculate weights:
    1.7.1 Weight experiments = 1/(number of experiments in the grid, from step 1.4)
    1.7.2 Weight calendars (applied when the count from step 1.5 is 0, or the count from step 1.6 is 0, or their sum is below 15):
      1.7.2.1 if the number of grid-year combinations under the MAPSPAM mask > 0
      1.7.2.2 then weight = 1/(count of grid-year combinations)
      1.7.2.3 else weight = 1/(count of grid-year combinations including grids outside the MAPSPAM mask)
  1.8 Loop over grid-year combinations for experiments and regional calendars
    1.8.1 Run simulation with PCSE
    1.8.2 Calculate differences between simulated and observed flowering and end-of-season
  1.9 End loop over grid-year combinations
  1.10 Calculate combined RMSD of all differences, using weights for regional calendars
  1.11 Calibration: repeat steps 1.8, 1.9 and 1.10 with different TSUM1, TSUM2 until the combined RMSD is minimized (with NLopt)
2 End loop over zones

3 Export optimal TSUM1, TSUM2 and other calibration stats for each zone
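The core of the sequence (DOH to DOM conversion, simulation, weighted RMSD and the search for TSUM1 and TSUM2) can be sketched in self-contained Python. Everything here is a simplified stand-in under stated assumptions: `simulate` is a toy thermal-time model replacing a PCSE run, the exhaustive grid search replaces NLopt, the 7-day harvest delay is an invented default for the configurable value, the way weights enter the RMSD is an assumption, and the record fields (`start`, `daily_tsum`, `weight`) and all numbers are hypothetical.

```python
import math
from itertools import product


def normalise_end_of_season(record, harvest_delay_days=7):
    """Keep DOM only; derive it from DOH when DOM is missing."""
    rec = dict(record)
    doh = rec.pop("DOH", None)
    if rec.get("DOM") is None and doh is not None:
        rec["DOM"] = doh - harvest_delay_days
    return rec


def simulate(rec, tsum1, tsum2):
    """Toy stand-in for PCSE: constant thermal time accumulated per day."""
    doa = rec["start"] + math.ceil(tsum1 / rec["daily_tsum"])
    dom = doa + math.ceil(tsum2 / rec["daily_tsum"])
    return doa, dom


def weighted_msd(diffs, weights):
    """Weighted mean of squared differences (0.0 when there are no pairs)."""
    if not diffs:
        return 0.0
    return sum(w * d * d for d, w in zip(diffs, weights)) / sum(weights)


def combined_rmsd(records, tsum1, tsum2):
    """Combine anthesis and maturity differences into a single value."""
    a_diffs, a_w, m_diffs, m_w = [], [], [], []
    for rec in records:
        sim_doa, sim_dom = simulate(rec, tsum1, tsum2)
        if rec.get("DOA") is not None:
            a_diffs.append(sim_doa - rec["DOA"])
            a_w.append(rec["weight"])
        if rec.get("DOM") is not None:
            m_diffs.append(sim_dom - rec["DOM"])
            m_w.append(rec["weight"])
    return math.sqrt(weighted_msd(a_diffs, a_w) + weighted_msd(m_diffs, m_w))


def calibrate(records, candidates):
    """Brute-force search over candidate values (stand-in for NLopt)."""
    return min(product(candidates, candidates),
               key=lambda pair: combined_rmsd(records, *pair))


# Invented observations, expressed as day numbers.
raw = [
    {"start": 100, "daily_tsum": 10.0, "DOA": 180, "DOH": 277, "weight": 1.0},
    {"start": 110, "daily_tsum": 12.5, "DOA": 174, "DOM": 246, "weight": 0.5},
]
records = [normalise_end_of_season(r) for r in raw]
best = calibrate(records, range(500, 1201, 50))
print(best)  # (800, 900)
```

In the real package each evaluation is a full PCSE run per grid-year combination, which is why a dedicated optimizer is used instead of an exhaustive search.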