A hitchhiker’s guide to modelling

by Aalt-Jan van Dijk

Wageningen University, The Netherlands

In my last few posts (‘Tell me who your sister is and I’ll tell you who you are’ and ‘Cold kick-start for flowering’), I discussed a couple of modelling papers. I therefore thought it would be good to give a more general ‘travel guide’ for this type of paper. To do so, I will mostly focus on a recent paper by Seaton et al. (2015), which deals with the link between the circadian clock and flowering in response to photoperiod and temperature. The components of their model include CYCLING DOF FACTOR 1 (CDF1), FLAVIN-BINDING, KELCH REPEAT, F-BOX 1 (FKF1), GIGANTEA (GI), CONSTANS (CO) and FLOWERING LOCUS T (FT). For the purpose of this guide, I will group the results into the three stages that can be distinguished in a typical modelling approach (Fig. 1).


Fig. 1. Three different modelling stages and typical examples of results obtained in these stages.

(1) The first stage consists of constructing the model. One example from the Seaton paper is that the effect of GI on CDF1 protein stability is ‘sufficient to explain the lower CO transcript levels observed in the gi mutant’. This means that a model variant in which the effect of GI on CDF1 stability is incorporated is able to reproduce the known fact that the gi mutant has lower CO levels. The construction phase is, in most cases, also when parameter values have to be obtained. Often, the strength of the interactions in a quantitative model is estimated from data such as time courses: one adjusts the parameter values to get as good a fit as possible between predictions and data. One example of such parameter fitting in the Seaton paper is the influence of temperature on the regulation of FT by SHORT VEGETATIVE PHASE (SVP) and FLOWERING LOCUS M (FLM). According to the paper, ‘the action of these regulators can be modelled by introducing a uniform activation of FT expression at 27°C’. What is done here is that three parameters in the model are changed when the temperature is 27°C instead of 22°C. Importantly, given the high number of parameters and the relatively noisy and/or scarce data, a good fit is not yet proof of the validity of a model, or of any claim based on results obtained during this construction phase.
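To make the idea of parameter fitting concrete, here is a minimal sketch in Python. It is not the Seaton et al. model: the one-gene model `mrna_level`, the parameter names `k_syn` and `k_deg`, the time points and the ‘measurements’ are all made up for illustration, and a simple grid search stands in for the optimisation routines used in practice.

```python
import math

def mrna_level(t, k_syn, k_deg):
    """Hypothetical toy model: constant synthesis and first-order decay,
    starting from zero. This is the solution of dx/dt = k_syn - k_deg * x."""
    return (k_syn / k_deg) * (1.0 - math.exp(-k_deg * t))

def sum_squared_error(k_syn, k_deg, times, data):
    """Misfit between model predictions and the time course."""
    return sum((mrna_level(t, k_syn, k_deg) - y) ** 2 for t, y in zip(times, data))

def grid_fit(times, data, syn_grid, deg_grid):
    """Try every parameter combination on the grid and keep the best one."""
    best = None
    for k_syn in syn_grid:
        for k_deg in deg_grid:
            err = sum_squared_error(k_syn, k_deg, times, data)
            if best is None or err < best[0]:
                best = (err, k_syn, k_deg)
    return best

# Made-up time course: generated with k_syn = 2.0 and k_deg = 0.5,
# plus small fixed offsets that play the role of measurement noise.
times = [0.0, 1.0, 2.0, 4.0, 8.0, 12.0]
offsets = [0.0, -0.04, 0.03, -0.02, 0.04, -0.03]
data = [mrna_level(t, 2.0, 0.5) + e for t, e in zip(times, offsets)]

# Candidate parameter values (an arbitrary choice for this sketch).
syn_grid = [round(0.1 * i, 2) for i in range(1, 51)]
deg_grid = [round(0.05 * i, 2) for i in range(1, 41)]

best_error, best_k_syn, best_k_deg = grid_fit(times, data, syn_grid, deg_grid)
print("best fit:", best_k_syn, best_k_deg, "error:", best_error)
```

With only two parameters and clean data, the fit recovers values close to the ones used to generate the data. With many parameters and noisy or scarce data, very different parameter sets can fit about equally well, which is why a good fit alone says little.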

(2) The second stage of modelling deals with model predictions and their validation. Typical phrases to look for if you want to travel in that direction are ‘previously unseen data’ or ‘test data’. An example from the Seaton et al. paper is that model simulations of the response of CO and FT were compared with data ‘not used for parameter optimisation’. Model predictions and data can be compared either quantitatively or qualitatively. In this paper, the comparison is qualitative. This makes it difficult to decide how much evidence really lies in the fact that model and data agree to a certain extent. A qualitative match can sometimes be trivial, if choices made in the construction stage inevitably lead to certain model behaviour. Note that discrepancies between model predictions and experiments can be very informative. According to Seaton et al. (2015), ‘the model is unable to fully describe the dynamics of CO and FT mRNA in the cca1;lhy double mutant’. This could indicate that CIRCADIAN CLOCK ASSOCIATED 1 (CCA1) and LATE ELONGATED HYPOCOTYL (LHY) also regulate CO and FT transcription through mechanisms other than those included in the model.

Various other examples of validation could be given. In my recent flowering time model (Leal Valentim et al., 2015), validation was performed by predicting changes in expression level in mutant backgrounds and comparing these predictions with independent expression data, and by comparing predicted and experimental flowering times for several double mutants. This involved a quantitative comparison between experiments and predictions. In the model by Satake et al. (2013), parameters were fitted to data obtained in controlled experiments in the lab. The model was subsequently used to predict the response to temperature conditions measured in the field. Comparison of these predictions with measurements indicated a convincing ability to predict flowering.
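The difference between a qualitative and a quantitative comparison can be sketched in a few lines of Python. The numbers below are invented and do not come from any of the papers mentioned; `same_trend` and `rmse` are hypothetical helper names, and ‘rises and falls at the same time points’ is only one of many possible ways to define a qualitative match.

```python
import math

def rmse(predicted, observed):
    """Quantitative comparison: root-mean-square error."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))

def same_trend(predicted, observed):
    """Qualitative comparison: do both series go up and down at the same steps?"""
    def directions(series):
        # +1 for an increase, -1 for a decrease, 0 for no change
        return [(b > a) - (b < a) for a, b in zip(series, series[1:])]
    return directions(predicted) == directions(observed)

# Invented expression levels at five time points in a held-out condition,
# i.e. data that were not used to fit the parameters.
predicted = [0.1, 0.4, 1.0, 0.6, 0.2]
observed = [0.3, 0.9, 2.0, 1.5, 0.4]

print("qualitative match:", same_trend(predicted, observed))
print("RMSE:", rmse(predicted, observed))
```

In this made-up case the model rises and falls when the data do, so the qualitative match is perfect, yet the predicted peak is only half as high as the observed one. Whether that counts as agreement depends on the question being asked.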

(3) Finally, a third stage in modelling is to interpret model features or results emerging from the modelling to obtain biological insights. An example from the Seaton paper is that coherent feed-forward networks are identified. In such network structures, a single component plays multiple reinforcing roles in the system. Concepts such as feed-forward loops, feedback loops and related network motifs are intuitively appealing and can lead to new insights into how biology works. Such network motifs are also discussed in a study by Jaeger et al. (2013). There, interlocked feedback loops were claimed to be relevant in a developmental context because they enable the acquisition of a fate that is stable over many cell divisions.
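What ‘multiple reinforcing roles’ means can be illustrated with a toy coherent feed-forward loop in Python. This is a sketch under stated assumptions, not the network from Seaton et al.: the components A, R and T, the rate constants and the function `simulate_target` are all hypothetical. Component A helps the target T in two ways: it speeds up the removal of a repressor R of T, and it also boosts T production directly.

```python
def simulate_target(a_level, removes_repressor=True, activates_target=True,
                    t_end=50.0, step=0.01):
    """Euler integration of a toy coherent feed-forward loop.

    Indirect arm: A speeds up degradation of R, and R represses T.
    Direct arm:   A increases production of T.
    Returns the level of T at the end of the simulation.
    """
    repressor, target = 1.0, 0.0
    for _ in range(int(t_end / step)):
        # R is made at a constant rate and degraded faster when A acts on it.
        r_decay = 1.0 + (2.0 * a_level if removes_repressor else 0.0)
        d_repressor = 1.0 - r_decay * repressor
        # T production is boosted by A and inhibited by R; T decays at rate 1.
        boost = 1.0 + (2.0 * a_level if activates_target else 0.0)
        d_target = boost / (1.0 + 4.0 * repressor) - target
        repressor += step * d_repressor
        target += step * d_target
    return target

both_arms = simulate_target(1.0)
direct_only = simulate_target(1.0, removes_repressor=False)
indirect_only = simulate_target(1.0, activates_target=False)
neither = simulate_target(1.0, removes_repressor=False, activates_target=False)

print("both arms:", both_arms)
print("direct arm only:", direct_only)
print("indirect arm only:", indirect_only)
print("neither arm:", neither)
```

Switching off either arm lowers the final level of T, and switching off both lowers it further. In this toy example the two roles of A reinforce each other, which is the defining feature of a coherent feed-forward structure.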

One general remark that I still want to make is that for modelling, as for any scientific experiment, it all starts with good questions. To illustrate this, consider the answer given to ‘the great Question of Life, the Universe and Everything’ by the computer in ‘The Hitchhiker’s Guide to the Galaxy’. ‘Forty-two’, said Deep Thought, with infinite majesty and calm. ‘I checked it very thoroughly, and that quite definitely is the answer. I think the problem, to be quite honest with you, is that you’ve never actually known what the question is’.

This small ‘travel guide’ is of course far from complete. Some aspects that I have not discussed include technical issues such as the type of model used, for example continuous vs. discrete or stochastic vs. deterministic. In addition to the molecular level discussed here, models could include various other levels, such as cells or tissues. Finally, it is important to realise that results in the validation stage or the interpretation stage can sometimes be quite sensitive to choices and parameter values defined during the construction stage. Nevertheless, I hope this ‘travel guide’ will help you find your way when reading your next modelling paper!


Seaton DD, Smith RW, Song YH, MacGregor DR, Stewart K, Steel G, Foreman J, Penfield S, Imaizumi T, Millar AJ, Halliday KJ. 2015. Linked circadian outputs control elongation growth and flowering in response to photoperiod and temperature. Molecular Systems Biology 11(1): 776. doi: 10.15252/msb.20145766.

Leal Valentim F, van Mourik S, Posé D, Kim MC, Schmid M, van Ham RCHJ, Busscher M, Sanchez-Perez GF, Molenaar J, Angenent GC, Immink RGH, van Dijk ADJ. 2015. A quantitative and dynamic model of the Arabidopsis flowering time gene regulatory network. PLoS ONE. doi: 10.1371/journal.pone.0116973.

Satake A, Kawagoe T, Saburi Y, Chiba Y, Sakurai G, Kudoh H. 2013. Forecasting flowering phenology under climate warming by modelling the regulatory dynamics of flowering-time genes. Nature Communications 4: 2303.

Jaeger KE, Pullen N, Lamzin S, Morris RJ, Wigge PA. 2013. Interlocking feedback loops govern the dynamic behaviour of the floral transition in Arabidopsis. The Plant Cell 25(3): 820-833.


About Flowering Highlights

Flowering Newsletter published by the Journal of Experimental Botany