
# Journal of Integrative Bioinformatics

Editor-in-Chief: Schreiber, Falk / Hofestädt, Ralf

Managing Editor: Sommer, Björn

Ed. by Baumbach, Jan / Chen, Ming / Orlov, Yuriy / Allmer, Jens

Editorial Board: Giorgetti, Alejandro / Harrison, Andrew / Kochetov, Aleksey / Krüger, Jens / Ma, Qi / Matsuno, Hiroshi / Mitra, Chanchal K. / Pauling, Josch K. / Rawlings, Chris / Fdez-Riverola, Florentino / Romano, Paolo / Röttger, Richard / Shoshi, Alban / Soares, Siomar de Castro / Taubert, Jan / Tauch, Andreas / Yousef, Malik / Weise, Stephan / Hassani-Pak, Keywan

CiteScore 2018: 0.90

SCImago Journal Rank (SJR) 2018: 0.315

Open Access
Online ISSN: 1613-4516
Volume 15, Issue 3

# BioModelKit – An Integrative Framework for Multi-Scale Biomodel-Engineering

Mary-Ann Blätke
Published Online: 2018-09-06 | DOI: https://doi.org/10.1515/jib-2018-0021

## Abstract

While high-throughput technology, advanced techniques in biochemistry and molecular biology have become increasingly powerful, the coherent interpretation of experimental results in an integrative context is still a challenge. BioModelKit (BMK) approaches this challenge by offering an integrative and versatile framework for biomodel-engineering based on a modular modelling concept with the purpose: (i) to represent knowledge about molecular mechanisms by consistent executable sub-models (modules) given as Petri nets equipped with defined interfaces facilitating their reuse and recombination; (ii) to compose complex and integrative models from an ad hoc chosen set of modules including different omic and abstraction levels with the option to integrate spatial aspects; (iii) to promote the construction of alternative models by either the exchange of competing module versions or the algorithmic mutation of the composed model; and (iv) to offer concepts for (omic) data integration and integration of existing resources, and thus facilitate their reuse. BMK is accessible through a public web interface (www.biomodelkit.org), where users can interact with the modules stored in a database, and make use of the model composition features. BMK facilitates and encourages multi-scale model-driven predictions and hypotheses supporting experimental research in a multilateral exchange.

This article offers supplementary material which is provided at the end of the article.

## 1 Introduction

In the mid-1990s, systems biology emerged as a new discipline in biosciences aiming to achieve a systems-level understanding by encouraging the development of advanced high-throughput technologies to investigate molecular mechanisms on various omic levels. The consistent interpretation and integration of the resulting big data, which are not only voluminous but also complex and diverse, remains a challenge that demands integrative modelling and analysis frameworks.

The well-established formalism of Petri nets (PNs) has been shown to ideally support multidisciplinary research [1], [2], [3], [4]. By their definition, PNs allow for an intuitive representation of reaction networks including concurrency and synchronisation [5], [6].

Figure 1:

Overview of the PN framework.

A PN is defined by two sets of nodes, places representing conditions (e.g. molecules) and transitions describing events (e.g. reactions), that are connected via directed arcs carrying arc-weights to reflect the stoichiometry. Tokens, which reside only in places, indicate the value of a condition. A transition is called enabled and may fire if the number of tokens (marking) on each of its pre-places is equal to or greater than the corresponding arc-weight. Only one transition can fire at a time. During firing, the chosen enabled transition removes tokens from its pre-places and adds tokens to its post-places, both according to the arc-weights. After firing, a new state is reached. Playing the so-called token game allows exploring the state space.
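The firing rule described above can be sketched in a few lines of Python. This is a minimal illustration only, not code from Snoopy, Charlie or Marcie; the dictionary-based net encoding, the function names `is_enabled` and `fire`, and the toy reaction 2 A + B → C are assumptions made for this example.

```python
from typing import Dict, Tuple

Marking = Dict[str, int]
# Hypothetical encoding: transition name -> (pre-place weights, post-place weights)
Net = Dict[str, Tuple[Dict[str, int], Dict[str, int]]]


def is_enabled(marking: Marking, pre: Dict[str, int]) -> bool:
    """A transition is enabled if every pre-place holds at least arc-weight tokens."""
    return all(marking.get(place, 0) >= weight for place, weight in pre.items())


def fire(marking: Marking, net: Net, transition: str) -> Marking:
    """Fire one enabled transition and return the new marking (the input is not modified)."""
    pre, post = net[transition]
    if not is_enabled(marking, pre):
        raise ValueError(f"transition {transition!r} is not enabled")
    new_marking = dict(marking)
    for place, weight in pre.items():   # remove tokens from the pre-places
        new_marking[place] = new_marking.get(place, 0) - weight
    for place, weight in post.items():  # add tokens to the post-places
        new_marking[place] = new_marking.get(place, 0) + weight
    return new_marking


# Toy net (assumed for illustration): 2 A + B -> C
net: Net = {"r1": ({"A": 2, "B": 1}, {"C": 1})}
m0: Marking = {"A": 3, "B": 1, "C": 0}
m1 = fire(m0, net, "r1")  # one step of the token game: {"A": 1, "B": 0, "C": 1}
```

After this single step, `r1` is no longer enabled in `m1`, because place `B` is empty. Repeating such steps for every enabled transition is what exploring the state space amounts to.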

Standard PNs, i.e. discrete and time-free PNs, were initially used for structural analysis, see, e.g. [7], [8], [9], [10], [11]. Moreover, PNs offer a unifying framework, which enables quantitative (i.e. stochastic, continuous, hybrid) interpretations of a qualitative model [12], [13]. Coloured PNs combine PNs with a programming language to obtain a scalable modelling language for concurrent systems. This combination provides the primitives for encoding data manipulation and creating compact and parameterizable models [14]. Therefore, coloured PNs offer a convenient framework to model complex multi-scale [15], multi-level and multi-dimensional aspects of a biomolecular system, see also [16] for a recent review.

For our purposes, we decided to use the compatible tools:

• Snoopy – a tool for modelling and animation/simulation of hierarchical uncoloured/coloured qualitative, stochastic, continuous and hybrid PNs [17], [18], [19];

• Charlie – an analysis tool of standard PN properties [20];

• Marcie – a symbolic reachability analysis tool, including CTL and CSL model checking [21].

which support all aspects of PNs as described above and more.

Modelling concepts need to be designed in a way to provide a clear and precise outline of the modelled molecular mechanisms and to adapt the model content according to the analysis focus flexibly. These two claims involve the reuse and recombination of models or parts thereof. Here, the fundamental concept of modularisation that has been widely used in engineering to reduce the complexity of problems by decomposing a system into functional parts with defined interfaces comes into focus. BioModelKit (short, BMK) seizes on the idea of modularisation in the context of modelling; we summarize related work in Section 2.

BMK is a web-based platform (www.biomodelkit.org) that relies on an a priori molecule-centred modularisation concept, where sub-models (modules) are designed consistently with matching interface networks to allow for the automatic composition of complex and coherent models, see Section 3. Molecule-centred means that a module describes the function and interaction of one particular component. Section 3.1 provides all necessary terminology of the used modularisation concept. BMK makes use of different module types to integrate molecular mechanisms and correlations on the main omic levels (metabolome, proteome, transcriptome, genome), i.e. we distinguish mechanistic modules (protein, protein degradation, mRNA, and gene modules) and causal modules (causal influence and allelic influence modules), see Figure 2 – Box 1 and Section 3.2 for details. A module is given as a PN, which relates the molecular structure of the component and different conformational states thereof (places) to molecular events (transitions) interfering with the molecular function, see Section 3.3.

The module coupling is not restricted to a particular omic or abstraction level, which allows molecular mechanisms and correlations on various levels to be aggregated consistently in a composed model. Modules can be recombined arbitrarily to compose alternative models with competing module versions or reused in the context of a different biological system. The arbitrary recombination of modules also enables the user to decide which omic levels to include in a composed model. Section 4 describes the details of the model composition based on an ad hoc chosen set of modules.

Figure 2:

Overview of the BioModelKit framework.

BMK offers several additional features to enhance the versatility of the proposed framework. First, an annotation file in the BMK markup language (BMKml) complements the PN graph of each module, holding details about the authorship and modelling process, documentation of all model elements, and cross-references to resources. So far, we exploit three ways to obtain modules, see Figure 2 – Box 2: (i) direct engineering of modules based on published knowledge; (ii) reverse engineering of modules from complex (omic) data sets; and (iii) modularisation of gene regulatory networks [22] and SBML encoded models [23]. The three approaches are discussed in Section 5.2. The algorithmic mutation is another feature of BMK to generate alternative models automatically by mimicking the effect of gene knockouts and structure-function mutations. This approach allows for in-depth in silico mutation studies to determine variations in a molecular mechanism reproducing the desired function, see Section 5.3. Another recent addition to BMK is the spatiotemporal transformation that adds a position to all components in the composed model, which allows tracking the movement of the components and their interaction in a defined space (1-, 2-, or 3-dimensional) with restriction to the assumed cellular arrangement [24]; details are explained in Section 5.4.

The accessibility of modules and the model composition, as well as the features mentioned above, are crucial components of BMK, which are realised through the web interface and the database, see Figure 2 – Box 3. Section 6 discusses aspects of their implementation.

A brief discussion and outlook on BMK is given in Section 7.

## 2 Related Work

In systems biology, modularisation approaches have mostly been applied to decompose complex models for analysis purposes, see [25]. Only a few approaches like the one from Cooling et al. [26] suggest composing models of ab initio built standard virtual parts (SVPs), which are accessible via CellML [27]. In all cases, manual adjustments are required to couple the obtained modules.

The public dissemination of models is another critical issue in life sciences to not only reproduce results but also to accelerate their reuse. Next to CellML, which is a repository for a wide range of biological models, the Biomodels database [28], BIGG Models [29] and the SEEK platform [30] offer an infrastructure to collect and exchange published models.

While some of the model repositories mentioned above support the simulation of models, none of them supports the automatic composition of models from a set of existing models. The major obstacles to the automated coupling of models are the use of diverse modelling formalisms, the incompatibility of granularities, the lack of naming convention standards for model elements (reactions, species, parameters etc.) and the inconsistent formulation of assumptions and constraints used in a model, concomitant with insufficient model documentation [31].

As already introduced, BMK applies the concept of modularisation to compose models from a set of modules that are given as reusable Petri net submodels with interfaces, which are specifically and consistently designed for model composition in an automated fashion. To the best of our knowledge, the framework established with BMK is unique. There exist no comparable frameworks or tools.

## 3 Modularisation

BMK employs an a priori molecule-centred modularisation concept, where sub-models (modules) are designed consistently with matching interface networks to allow for the automatic model composition. Molecule-centred means that a module describes the function of one particular component by a PN. The PN relates the molecular structure of the component and its conformational states (places) to molecular events (transitions) interfering with the molecular function. Below, we introduce the formalisation used within BMK, compare Figure 3, which exemplifies the modularisation and model composition concept on a simple example consisting of a kinase and its transcription factor as a feedback activator.

## 3.1 Module Terminology

In this section, we introduce the module terminology of the molecule-centred modularisation approach as in [32]. This formalisation is essential for the realisation of BMK including all of its features and functionalities. Each term is followed by a brief illustration based on the running example depicted in Figure 3. Figure 3(A) provides a short description of the molecular mechanisms, while Figure 3(B) depicts the corresponding modules and the composed model.

A component is any chemical compound that contributes to a biological process at the molecular level.

#### Running Example

Components in the running example are kinase protein pK, kinase mRNA mK, kinase gene gK, transcription factor protein pTF.

A genetic component is a component with genetic information like genes, mRNAs, and proteins.

#### Running Example

All of the components above are genetic components.

A non-genetic component is a component without genetic information like metabolites, second messengers, energy equivalents, ions, etc.

#### Running Example

None of the components above is a non-genetic component.

Figure 3:

(A) Molecular mechanism: Active kinase (K) phosphorylates its transcription factor (TF), the phosphorylated TF binds to the K gene, which starts the transcription, followed by protein translation of K. (B) Modular representation by PNs: Two protein modules for the TF and K, one gene, mRNA and protein degradation module for K according to the molecular mechanism in (A). Nodes highlighted in the same colour define a shared interface network. Modules share their tokens after model composition and thus, the dotted tokens appear on shared places enabling the execution of interactions.

A module ${M}_{{c}_{0}}$ describes the functionality of one particular genetic component ${c}_{0}$, called the main component, including its interactions with ${n}_{IC}\ge 0$ components. Here, $C\left({M}_{{c}_{0}}\right)=\left\{{c}_{0},\dots ,{c}_{{n}_{IC}}\right\}$ defines the total set of components of module ${M}_{{c}_{0}}$. We distinguish mechanistic module types (protein, protein degradation, mRNA and gene modules) and causal module types (allelic influence and causal influence modules), see Section 3.2 for details.

#### Running Example

Modules related to

• kinase K: protein module MpK, protein degradation module MdK, gene module MgK, and mRNA module MmK;

• transcription factor TF: protein module MpTF

A functional unit u is a defined part of a component $c\in C\left({M}_{{c}_{0}}\right)$ with an assigned function that is of importance for the functionality of the component. A genetic component $c\in C\left({M}_{{c}_{0}}\right)$ consists of ${n}_{FU}\ge 1$ functional units. We assume that non-genetic components cannot be decomposed into more than one functional unit. Furthermore, $U\left(c\right)=\left\{{u}_{1},\dots ,{u}_{{n}_{FU}}\right\}$ defines the total set of functional units of component $c\in C\left({M}_{{c}_{0}}\right)$, and $\mathcal{U}\left({M}_{{c}_{0}}\right)=\bigcup _{c\in C\left({M}_{{c}_{0}}\right)}U\left(c\right)$ defines the total set of functional units in module ${M}_{{c}_{0}}$.

#### Running Example

Functional units of the involved components are

• kinase protein pK: catalytic domain (CD) part of MpK, MdK and MmK, as well as MpTF due to its interaction;

• kinase gene gK: (i) enhancer (E) part of MgK, as well as MpTF due to its interaction, (ii) artificial functional unit for transcriptional activity (A) part of MgK, as well as MmK because A triggers the mRNA synthesis;

• kinase mRNA mK: RNA sequence (mRNA_K) part of MmK; and

• transcription factor protein pTF: (i) tyrosine residue (Y) part of MpTF, as well as MpK due to its interaction, (ii) DNA binding site (BS) part of MpTF, as well as MgK due to its interaction.

A molecular state s describes a specific physical constitution of a functional unit $u\in \mathcal{U}\left({M}_{{c}_{0}}\right)$. A functional unit $u\in U\left(c\right)$ can adopt ${n}_{MS}\ge 1$ different molecular states. Furthermore, $S\left(u\right)=\left\{{s}_{1},\dots ,{s}_{{n}_{MS}}\right\}$ defines the total set of molecular states of a functional unit $u\in U\left(c\right)$ and $\mathcal{S}\left({M}_{{c}_{0}}\right)=\bigcup _{u\in \mathcal{U}\left({M}_{{c}_{0}}\right)}S\left(u\right)$ defines the union of all sets S(u) over $u\in \mathcal{U}\left({M}_{{c}_{0}}\right)$. Each molecular state $s\in \mathcal{S}\left({M}_{{c}_{0}}\right)$ can be mapped to at least one functional unit according to the function ${\lambda }^{u}\left(s\right)=\left\{u\in \mathcal{U}\left({M}_{{c}_{0}}\right)\mid s\in S\left(u\right)\right\}$ and to at least one component according to the function ${\lambda }^{c}\left(s\right)=\left\{c\in C\left({M}_{{c}_{0}}\right)\mid s\in \bigcup _{u\in U\left(c\right)}S\left(u\right)\right\}$.

#### Running Example

Molecular states of functional units

• catalytic domain CD of kinase protein pK: active (pKCDact) or inactive (pKCDinact, initial/active molecular state in MpK);

• enhancer E of kinase gene gK: free (gKE, initial/active molecular state in MgK) or bound to DNA binding site BS of transcription factor protein pTF (gKEpTFBS);

• transcriptional activity of the kinase gene A: active (gKAact) or inactive (gKAinact, initial/active molecular state in MgK);

• RNA sequence mRNAK of kinase mRNA mK: mature (mKmRNAKmature);

• tyrosine residue Y of transcription factor protein pTF: unphosphorylated (pTFY, initial/active molecular state in MpTF) or phosphorylated (pTFYp); and

• DNA binding site BS of transcription factor protein pTF: free (pTFBS, initial/active molecular state in MpTF) or bound to enhancer E of kinase gene gK (gKEpTFBS).

An interaction state ${s}_{IS}$ is a molecular state ${s}_{IS}\in \mathcal{S}\left({M}_{{c}_{0}}\right)$ that can be mapped to more than one component, $|{\lambda }^{c}\left(s\right)|>1$, defining a complex $k={\lambda }^{c}\left(s\right)$. Accordingly, ${\mathcal{S}}_{IS}\left({M}_{{c}_{0}}\right)=\left\{s\in \mathcal{S}\left({M}_{{c}_{0}}\right)\mid |{\lambda }^{c}\left(s\right)|>1\right\}$, ${\mathcal{S}}_{IS}\left({M}_{{c}_{0}}\right)\subseteq \mathcal{S}\left({M}_{{c}_{0}}\right)$ defines the total set of interaction states.

#### Running Example

The complex of the transcription factor protein pTF and the kinase gene gK defines an interaction state gKEpTFBS in MpTF and MgK.
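The mappings $\lambda^{u}$ and $\lambda^{c}$ and the resulting set of interaction states can be illustrated on the running example. The following Python sketch is an assumption-laden illustration, not part of BMK: the dictionary `STATES` and the function names are hypothetical, and for brevity the sets S(u) of all four components are pooled in one table instead of being given per module.

```python
from typing import Dict, Set, Tuple

# S(u) for each functional unit of the running example, keyed by (component, unit).
# Pooling all components in one table is a simplification made for this sketch.
STATES: Dict[Tuple[str, str], Set[str]] = {
    ("pK", "CD"): {"pKCDact", "pKCDinact"},
    ("gK", "E"): {"gKE", "gKEpTFBS"},
    ("gK", "A"): {"gKAact", "gKAinact"},
    ("mK", "mRNA_K"): {"mKmRNAKmature"},
    ("pTF", "Y"): {"pTFY", "pTFYp"},
    ("pTF", "BS"): {"pTFBS", "gKEpTFBS"},
}


def lambda_u(state: str) -> Set[Tuple[str, str]]:
    """Functional units (with their component) that can adopt the given state."""
    return {unit for unit, states in STATES.items() if state in states}


def lambda_c(state: str) -> Set[str]:
    """Components that own at least one functional unit adopting the given state."""
    return {component for component, _unit in lambda_u(state)}


def interaction_states() -> Set[str]:
    """States mapped to more than one component, i.e. |lambda_c(s)| > 1."""
    all_states = set().union(*STATES.values())
    return {s for s in all_states if len(lambda_c(s)) > 1}
```

With this table, `interaction_states()` contains only `gKEpTFBS`, and `lambda_c("gKEpTFBS")` yields the complex of `gK` and `pTF`, in line with the running example.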

The term active molecular state defines the current state of a functional unit $u\in \mathcal{U}\left({M}_{{c}_{0}}\right)$; only one molecular state can be active at each time point. The function $\alpha :S\left(u\right)\to \left\{0,1\right\}$ determines the active molecular state with the constraint $\sum _{s\in S\left(u\right)}\alpha \left(s\right)=1$. If α(s) = 1 the molecular state s is active, otherwise if α(s) = 0 the molecular state s is inactive.

#### Running Example

See above, examples for molecular states.

The initial state ${s}_{0,u}\in S\left(u\right)$ with $\alpha \left({s}_{0,u}\right)=1$ and ${s}_{0,u}\notin {\mathcal{S}}_{IS}\left({M}_{{c}_{0}}\right)$ is assumed to be the ground state of a functional unit, which cannot be an interaction state. This assumption is necessary to exclude inconsistencies during the composition of models. The union of all initial states is given by ${S}_{0}\left({M}_{{c}_{0}}\right)=\bigcup _{u\in U\left({c}_{0}\right)}{s}_{0,u}$.

#### Running Example

See above, examples for molecular states.
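The two constraints on active and initial states lend themselves to a simple consistency check. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, α is represented as a plain dictionary, and the DNA binding site BS of pTF from the running example serves as test data.

```python
from typing import Dict, Set


def is_valid_activity(alpha: Dict[str, int], unit_states: Set[str]) -> bool:
    """Check alpha: S(u) -> {0, 1} with exactly one active state (sum of alpha equals 1)."""
    return (
        set(alpha) == unit_states
        and all(value in (0, 1) for value in alpha.values())
        and sum(alpha.values()) == 1
    )


def is_valid_initial_state(s0: str, alpha: Dict[str, int], interaction_states: Set[str]) -> bool:
    """An initial state must be active and must not be an interaction state."""
    return alpha.get(s0, 0) == 1 and s0 not in interaction_states


# Functional unit BS of pTF: free (pTFBS) or bound to the enhancer (gKEpTFBS).
S_BS = {"pTFBS", "gKEpTFBS"}
S_IS = {"gKEpTFBS"}  # interaction states of the running example
alpha_ground = {"pTFBS": 1, "gKEpTFBS": 0}
alpha_bound = {"pTFBS": 0, "gKEpTFBS": 1}
```

Both `alpha_ground` and `alpha_bound` are valid activity assignments, but only `pTFBS` qualifies as an initial state, because `gKEpTFBS` is an interaction state.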

A molecular event e represents an action changing the molecular states of the functional units. The relations between molecular states in $\mathcal{S}\left({M}_{{c}_{0}}\right)$ are described by ${n}_{ME}\ge 1$ molecular events. Here, $\mathcal{E}\left({M}_{{c}_{0}}\right)=\left\{{e}_{1},\dots ,{e}_{{n}_{ME}}\right\}$ defines the total set of molecular events.

#### Running Example

Molecular events are

• pKt1 in MpK: activation of catalytic domain pKCDinact to pKCDact;

• pKt2 in MpK: inactivation of catalytic domain pKCDact to pKCDinact;

• pKpTFt1 in MpK and MpTF: phosphorylation of tyrosine residue pTFY to pTFYp triggered by the active catalytic domain of the kinase pKCDact;

• pTFt1 in MpTF: dephosphorylation of tyrosine residue pTFYp to pTFY;

• gKpTFt1 in MgK and MpTF: binding of pTFBS and gKE forming the complex gKEpTFBS;

• gKpTFt2 in MgK and MpTF: dissociation of complex gKEpTFBS to pTFBS and gKE;

• gKpTFt3 in MgK: activation of gKAinact to gKAact triggered by the complex gKEpTFBS;

• gKt1 in MgK: inactivation of gKAact to gKAinact;

• gKmKt1 in MmK: synthesis of mature kinase mRNA mKmRNAKmature;

• mKt1 in MmK: degradation of mature kinase mRNA mKmRNAKmature;

• pKmKt1 in MmK: synthesis of kinase protein defined by pKCDinact; and

• dKt1 in MdK: degradation of kinase protein defined by pKCDinact and pKCDact.

The stoichiometric coefficient $\nu \left(e,s\right)\in {\mathbb{N}}_{0}$ (products) and $\nu \left(s,e\right)\in {\mathbb{N}}_{0}$ (educts) defines the stoichiometry with which a molecular state $s\in \mathcal{S}\left({M}_{{c}_{0}}\right)$ mediates a molecular event $e\in \mathcal{E}\left({M}_{{c}_{0}}\right)$. The set of molecular states participating in a molecular event e can be distinguished into a set of educts $\bullet e=\left\{s\in \mathcal{S}\left({M}_{{c}_{0}}\right)\mid \nu \left(s,e\right)\ne 0\right\}$, and a set of products $e\bullet =\left\{s\in \mathcal{S}\left({M}_{{c}_{0}}\right)\mid \nu \left(e,s\right)\ne 0\right\}$. Vice versa, $\bullet s=\left\{e\in \mathcal{E}\left({M}_{{c}_{0}}\right)\mid \nu \left(e,s\right)\ne 0\right\}$ defines the set of molecular events, where the molecular state s is a product, and $s\bullet =\left\{e\in \mathcal{E}\left({M}_{{c}_{0}}\right)\mid \nu \left(s,e\right)\ne 0\right\}$ defines the set of molecular events, where molecular state s is an educt. A molecular event $e\in \mathcal{E}\left({M}_{{c}_{0}}\right)$ can be defined as $e=\left\{\left(\epsilon \left(\bullet e\right),\epsilon \left(e\bullet \right)\right),\epsilon \left(\bullet e\right)\to \epsilon \left(e\bullet \right)\right\}$, where $\epsilon \left(\bullet e\right)=\sum _{s\in \bullet e}\nu \left(s,e\right)\cdot s$ and $\epsilon \left(e\bullet \right)=\sum _{s\in e\bullet }\nu \left(e,s\right)\cdot s$. Each molecular event $e\in \mathcal{E}\left({M}_{{c}_{0}}\right)$ occurs with a rate r(e).

#### Running Example

All stoichiometric coefficients are equal to one.
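The pre- and post-sets follow directly from the stoichiometric coefficients. The Python sketch below encodes ν for the three events of the kinase protein module MpK. The table layout and function names are hypothetical, and the sketch assumes that "triggered by pKCDact" means pKCDact is both educt and product of pKpTFt1, which is an interpretation rather than something stated explicitly in the text.

```python
from typing import Dict, Set, Tuple

# nu(s, e): educt coefficients, keyed by (state, event).
NU_EDUCT: Dict[Tuple[str, str], int] = {
    ("pKCDinact", "pKt1"): 1,
    ("pKCDact", "pKt2"): 1,
    ("pKCDact", "pKpTFt1"): 1,  # assumed: the catalyst is consumed ...
    ("pTFY", "pKpTFt1"): 1,
}
# nu(e, s): product coefficients, keyed by (event, state).
NU_PRODUCT: Dict[Tuple[str, str], int] = {
    ("pKt1", "pKCDact"): 1,
    ("pKt2", "pKCDinact"): 1,
    ("pKpTFt1", "pKCDact"): 1,  # ... and produced again
    ("pKpTFt1", "pTFYp"): 1,
}


def educts(e: str) -> Set[str]:
    """Pre-set of event e: states with nu(s, e) != 0."""
    return {s for (s, ev), w in NU_EDUCT.items() if ev == e and w != 0}


def products(e: str) -> Set[str]:
    """Post-set of event e: states with nu(e, s) != 0."""
    return {s for (ev, s), w in NU_PRODUCT.items() if ev == e and w != 0}


def producers(s: str) -> Set[str]:
    """Pre-set of state s: events for which s is a product."""
    return {e for (e, st), w in NU_PRODUCT.items() if st == s and w != 0}


def consumers(s: str) -> Set[str]:
    """Post-set of state s: events for which s is an educt."""
    return {e for (st, e), w in NU_EDUCT.items() if st == s and w != 0}


def as_reaction(e: str) -> str:
    """Render epsilon(pre-set) -> epsilon(post-set) as a readable formal sum."""
    lhs = " + ".join(f"{NU_EDUCT[(s, e)]} {s}" for s in sorted(educts(e)))
    rhs = " + ".join(f"{NU_PRODUCT[(e, s)]} {s}" for s in sorted(products(e)))
    return f"{lhs} -> {rhs}"
```

For example, `as_reaction("pKpTFt1")` renders the phosphorylation event as a formal sum of educts and products, with all coefficients equal to one as in the running example.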

An interaction event ${e}_{IS}$ is a molecular event ${e}_{IS}\in \mathcal{E}\left({M}_{{c}_{0}}\right)$ that involves more than one component, $|{\lambda }^{c}\left(\bullet e\cup e\bullet \right)|>1$. The total set of interaction events is given by ${\mathcal{E}}_{IA}\left({M}_{{c}_{0}}\right)=\left\{e:|{\lambda }^{c}\left(\bullet e\cup e\bullet \right)|>1\right\}$.

#### Running Example

Interaction events given by interaction of (i) kinase and transcription factor protein (pKpTFt1), (ii) kinase gene and transcription factor protein (gKpTFt1, gKpTFt2, gKpTFt3), (iii) kinase gene and mRNA (gKmKt1), and (iv) kinase mRNA and protein (pKmKt1).

A molecular process E(u) defines the set of molecular events changing the molecular state of a functional unit $u\in U\left(c\right)$, such that $E\left(u\right)=\bigcup _{s\in S\left(u\right)}\left(\left(\bullet s\cup s\bullet \right)\setminus \left(\bullet s\cap s\bullet \right)\right)$.

#### Running Example

Molecular processes in module

• MpK: (i) pKt1 and pKt2 for pKCD, and (ii) pKpTFt1 for pTFY;

• MgK: (i) gKpTFt1 and gKpTFt2 for pTFBS and gKE, and (ii) gKpTFt3 and gKt1 for gKA;

• MmK: (i) gKmKt1 and mKt1 for mKmRNAKmature, and (ii) pKmKt1 for pKCD;

• MpTF: (i) pKpTFt1 and pTFt1 for pTFY, and (ii) gKpTFt1 and gKpTFt2 for pTFBS and gKE; and

• MdK: (i) dKt1 for pKCD.
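The definition of a molecular process can be checked against the running example for module MpK. In the Python sketch below, E(u) collects every event that produces or consumes a state of the functional unit and then drops events that do both, which leave the state unchanged. The event table and the function name are hypothetical, and encoding pKCDact as both educt and product of pKpTFt1 is an assumption about how the catalytic role is represented.

```python
from typing import Dict, Set, Tuple

# Events of the kinase protein module MpK: event -> (educts, products).
EVENTS: Dict[str, Tuple[Set[str], Set[str]]] = {
    "pKt1": ({"pKCDinact"}, {"pKCDact"}),
    "pKt2": ({"pKCDact"}, {"pKCDinact"}),
    # Assumed encoding: the active catalytic domain is read and restored.
    "pKpTFt1": ({"pKCDact", "pTFY"}, {"pKCDact", "pTFYp"}),
}


def molecular_process(unit_states: Set[str]) -> Set[str]:
    """E(u): union over s in S(u) of (pre(s) | post(s)) minus (pre(s) & post(s))."""
    process: Set[str] = set()
    for s in unit_states:
        producing = {e for e, (_pre, post) in EVENTS.items() if s in post}
        consuming = {e for e, (pre, _post) in EVENTS.items() if s in pre}
        process |= (producing | consuming) - (producing & consuming)
    return process


E_CD = molecular_process({"pKCDact", "pKCDinact"})  # catalytic domain of pK
E_Y = molecular_process({"pTFY", "pTFYp"})          # tyrosine residue of pTF
```

Under these assumptions, `E_CD` contains pKt1 and pKt2 but not pKpTFt1, and `E_Y` contains only pKpTFt1, which matches the molecular processes listed for MpK.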

## 3.2 Module Types

The modularisation concept makes use of different module types to integrate molecular mechanisms and correlations on all of the main omic levels (metabolome, proteome, transcriptome, genome). Accordingly, we distinguish mechanistic module types [33], compare Figure 2 – Box 2:

A gene module ${M}_{g,{c}_{0}}$ represents the transcriptional activity of a gene controlled by the formation of the regulatory landscape and the pre-initiation complex. These processes include complex non-covalent interactions with transcription factors or other regulatory proteins interacting with silencer or enhancer sequences of the gene.

An mRNA module ${M}_{m,{c}_{0}}$ represents the biosynthesis of a particular mRNA by transcription of the respective gene; the post-transcriptional modification of the mRNA including capping, (alternative) splicing and polyadenylation; the translation into the proteins encoded by the processed mRNA; and the degradation of the mRNA including its potential control through proteins or small interfering anti-sense RNA molecules. These processes include covalent modification and non-covalent interactions of the mRNA.

A protein module ${M}_{p,{c}_{0}}$ describes the functionality of a particular protein (single polypeptide chain), including changes in the protein conformation, non-covalent interactions with other components and covalent modifications that regulate the functionality. Thus, a protein module represents the formation and cleavage of covalent and non-covalent bonds, as well as conformational changes of the protein structure.

A protein degradation module ${M}_{d,{c}_{0}}$ represents the degradation of a protein by proteolysis in lysosomes, ubiquitin-dependent degradation by the proteasome, or degradation by digestive enzymes or by any other possible mechanism that may lead to the degradation or inactivation of the protein. The post-translational proteolytic processing is described in the corresponding protein module.

The module types introduced above exclusively rely on known molecular mechanisms, even though some mechanistic details may not necessarily be considered. To integrate causal relationships if molecular mechanisms are unknown, we introduce two more module types [33]:

An allelic influence module ${M}_{ai,{c}_{0}}$ represents the effects of alleles (mutated versions of a gene) on molecular processes. In contrast to gene modules, the described effects are causal influences, which might be directly or indirectly mediated by unknown processes.

A causal influence module ${M}_{ci,{c}_{0}}$ describes the influence of arbitrary entities, other than alleles, on molecular processes.

## 3.3 Module Petri Net

The formal description of a module M, see Section 3.1, has to be translated into a PN $\mathcal{N}=\left\{P,T,F,f,v,{m}_{0}\right\}$. Places represent molecular states and transitions represent molecular events. The function ${\varrho }_{p\to s}:P\to \mathcal{S}\left(M\right)$ (${\varrho }_{t\to e}:T\to \mathcal{E}\left(M\right)$) maps a place $p\in P$ (transition $t\in T$) to a molecular state $s\in \mathcal{S}\left(M\right)$ (molecular event $e\in \mathcal{E}\left(M\right)$); both mappings are bijective.

The PN $\mathcal{N}\left({M}_{{c}_{0}}\right)=\left\{P,T,F,f,v,{m}_{0}\right\}$ of a module ${M}_{{c}_{0}}$ is defined as:

• Set of places $P=\bigcup _{c\in C\left({M}_{{c}_{0}}\right)}{P}^{c}$, where

• ${P}^{c}=\bigcup _{u\in U\left(c\right)}{P}^{u}$ is the set of places representing a component $c\in C\left({M}_{{c}_{0}}\right)$ and ${P}^{u}=\left\{p:{\varrho }_{p\to s}\left(p\right)\in S\left(u\right)\right\}$ is the set of places representing a functional unit $u\in \mathcal{U}\left({M}_{{c}_{0}}\right)$;

• Set of transitions $T=\bigcup _{c\in C\left({M}_{{c}_{0}}\right)}{T}^{c}$, where

• ${T}^{c}=\bigcup _{u\in U\left(c\right)}{T}^{u}$ is the set of transitions related to a component $c\in C\left({M}_{{c}_{0}}\right)$ and ${T}^{u}=\left\{t:{\varrho }_{t\to e}\left(t\right)\in E\left(u\right)\right\}$ is the set of transitions related to a functional unit $u\in \mathcal{U}\left({M}_{{c}_{0}}\right)$;

• Set of arcs $F:={F}_{SA}\subseteq \left(P\times T\right)\cup \left(T\times P\right)$;

• Arc-weights f: $F\to {\mathbb{N}}_{0}$, where

• $\mathrm{\forall }s\in \mathcal{S}\left({M}_{{c}_{0}}\right),e\in \mathcal{E}\left({M}_{{c}_{0}}\right)$ with $s\in \bullet e$: ${f}_{SA}\left({\varrho }_{p\to s}^{-1}\left(s\right),{\varrho }_{t\to e}^{-1}\left(e\right)\right)=\nu \left(s,e\right)$ (input arc), and

• $\mathrm{\forall }s\in \mathcal{S}\left({M}_{{c}_{0}}\right),e\in \mathcal{E}\left({M}_{{c}_{0}}\right)$ with $s\in e\bullet$: ${f}_{SA}\left({\varrho }_{t\to e}^{-1}\left(e\right),{\varrho }_{p\to s}^{-1}\left(s\right)\right)=\nu \left(e,s\right)$ (output arc);

• Set of firing rates $v:T\to H$ with $H=\bigcup _{t\in T}h\left(t\right)$, where:

• $\mathrm{\forall }e\in \mathcal{E}\left({M}_{{c}_{0}}\right):h\left({\varrho }_{t\to e}^{-1}\left(e\right)\right)=r\left(e\right)$;

• Initial marking m0: $P\to {\mathbb{N}}_{0}$:

• ${M}_{{c}_{0}}$ is a gene, protein, causal or allelic influence module:

• $\mathrm{\forall }s\in {S}_{0}\left({M}_{{c}_{0}}\right):{m}_{0}\left({\varrho }_{p\to s}^{-1}\left(s\right)\right)={n}_{{c}_{0}}$ (${n}_{{c}_{0}}$ is the assumed initial number of copies for component ${c}_{0}$; here we assume ${n}_{{c}_{0}}=1$), and

• $\mathrm{\forall }s\in \mathcal{S}\left({M}_{{c}_{0}}\right)\setminus {S}_{0}\left({M}_{{c}_{0}}\right):{m}_{0}\left({\varrho }_{p\to s}^{-1}\left(s\right)\right)=0$,

• ${M}_{{c}_{0}}$ is a protein degradation or mRNA module:

• $\mathrm{\forall }s\in \mathcal{S}\left({M}_{{c}_{0}}\right):{m}_{0}\left({\varrho }_{p\to s}^{-1}\left(s\right)\right)=0$.

#### Running Example

See Figure 3(B) for the Petri net representation of the modules used in the running example in Figure 3(A). The modules are also provided as Snoopy files in Supplementary Material 1.
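The translation of a module into a PN can be sketched as follows. This Python fragment is a simplified illustration, not the BMK implementation: the function `build_module_net`, the module-type labels, and the event encodings are hypothetical, firing rates are omitted, and only standard arcs are covered. It derives places, transitions and arc-weights from the stoichiometric coefficients and sets the initial marking according to the module type.

```python
from typing import Dict, Set, Tuple

# event -> (educts {state: nu(s, e)}, products {state: nu(e, s)})
Events = Dict[str, Tuple[Dict[str, int], Dict[str, int]]]

MARKED_TYPES = {"gene", "protein", "causal_influence", "allelic_influence"}
UNMARKED_TYPES = {"protein_degradation", "mRNA"}


def build_module_net(module_type: str, events: Events,
                     initial_states: Set[str], n_copies: int = 1) -> dict:
    """Build places P, transitions T, arc-weights f and initial marking m0."""
    places: Set[str] = set()
    arcs: Dict[Tuple[str, str], int] = {}
    for event, (educts, products) in events.items():
        for state, weight in educts.items():    # input arc (place, transition)
            places.add(state)
            arcs[(state, event)] = weight
        for state, weight in products.items():  # output arc (transition, place)
            places.add(state)
            arcs[(event, state)] = weight
    if module_type in MARKED_TYPES:
        m0 = {p: (n_copies if p in initial_states else 0) for p in places}
    elif module_type in UNMARKED_TYPES:
        m0 = {p: 0 for p in places}
    else:
        raise ValueError(f"unknown module type: {module_type!r}")
    return {"P": places, "T": set(events), "f": arcs, "m0": m0}


# Kinase protein module MpK; the catalyst encoding of pKpTFt1 is an assumption.
mpk_events: Events = {
    "pKt1": ({"pKCDinact": 1}, {"pKCDact": 1}),
    "pKt2": ({"pKCDact": 1}, {"pKCDinact": 1}),
    "pKpTFt1": ({"pKCDact": 1, "pTFY": 1}, {"pKCDact": 1, "pTFYp": 1}),
}
mpk_net = build_module_net("protein", mpk_events, initial_states={"pKCDinact"})

# A fragment of the kinase mRNA module MmK (event encodings assumed).
mmk_events: Events = {
    "gKmKt1": ({"gKAact": 1}, {"gKAact": 1, "mKmRNAKmature": 1}),
    "mKt1": ({"mKmRNAKmature": 1}, {}),
}
mmk_net = build_module_net("mRNA", mmk_events, initial_states=set())
```

In `mpk_net`, only `pKCDinact` carries a token. The place `pTFY` stays empty because it belongs to the interacting component pTF and is therefore not part of the initial states of MpK; it would obtain its token from MpTF once the modules are composed and share their tokens.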

The protein degradation module ${M}_{d,{c}_{0}}$ is a special case with respect to the use of arcs. Assuming a transition $t\in T$ represents the molecular event of protein degradation $e={\varrho }_{t\to e}\left(t\right)$ of the protein defined by module ${M}_{p,{c}_{0}^{\mathrm{\prime }}}$, then the set of places representing:

• non-interaction states: ${P}_{{M}_{p,{c}_{0}^{\mathrm{\prime }}}}^{nonIS}={\varrho }_{p\to s}^{-1}\left(\bigcup _{u\in U\left({c}_{0}^{\mathrm{\prime }}\right)}S\left(u\right)\setminus {\mathcal{S}}_{IS}\left({M}_{p,{c}_{0}^{\mathrm{\prime }}}\right)\right)$ are connected with transition t using:

• reset arcs ${f}_{XA}\left({P}_{{M}_{p,{c}_{0}^{\mathrm{\prime }}}}^{nonIS},t\right)=1$, and

• marking-dependent standard arcs ${f}_{SA}\left(t,{P}_{{M}_{p,{c}_{0}^{\mathrm{\prime }}}}^{nonIS}\right)={m}_{0}\left({P}_{{M}_{p,{c}_{0}^{\mathrm{\prime }}}}^{nonIS}\right)$;

• interaction states: ${P}_{{M}_{p,{c}_{0}^{\mathrm{\prime }}}}^{IS}={\varrho }_{p\to s}^{-1}\left(\bigcup _{u\in U\left({c}_{0}^{\mathrm{\prime }}\right)}S\left(u\right)\cap {\mathcal{S}}_{IS}\left({M}_{p,{c}_{0}^{\mathrm{\prime }}}\right)\right)$ are connected with transition t using:

• inhibitory arcs ${f}_{IA}\left({P}_{{M}_{p,{c}_{0}^{\mathrm{\prime }}}}^{IS},t\right)=1$.

This transformation ensures that proteins are only degraded if they are not interacting with other components and that only one copy is degraded.

#### Running Example

In the kinase protein degradation module MdK, transition dKt1 represents the degradation of the kinase protein pK. The non-interaction states of the kinase protein pK are pKCDact and pKCDinact. Each of the two corresponding places, with the initial markings m0(pKCDact) and m0(pKCDinact), respectively, is connected with transition dKt1 via a reset arc (arc with two arrowheads), which deletes all tokens in case of firing, and a marking-dependent standard arc, which adds m0(pKCDact) − 1 tokens to place pKCDact and m0(pKCDinact) − 1 tokens to place pKCDinact. This subnet allows deleting exactly one copy of the kinase protein pK, regardless of the active state of the functional unit CD.

The kinase protein degradation module MdK includes no interaction state. If an interaction state represented by a place p existed, it would have to be connected with transition dKt1 via an inhibitory arc (arc with an empty circle as the arrowhead). Thus, the degradation of the kinase protein pK could not occur as long as place p is marked.

## 3.4 Interface Networks

Interface networks are shared subgraphs of modules describing an interaction between the represented components or a functional relationship (e.g. links between translation, transcription or protein degradation); in the most trivial case, an interface network consists only of a single place. Identical interface networks are used to couple modules and are therefore crucial for the model composition.

A set of modules $I=\left\{{M}_{1},\dots ,{M}_{n}\right\}$ of components involved in one particular interaction share the interface network ${\mathcal{N}}_{I}\left(M\right)=\left\{{P}^{I},{T}^{I},{F}^{I},{f}^{I},{v}^{I},{m}_{0}^{I}\left(M\right)\right\}$, where ${\mathcal{N}}_{I}\left(M\right)\subseteq \mathcal{N}\left(M\right)$, $M\in I$. All nodes in an interface network ${\mathcal{N}}_{I}\left(M\right)$ are declared as logical (fusion) nodes. The set of places ${P}^{I}$ and transitions ${T}^{I}$ in an interface network ${\mathcal{N}}_{I}$ with e.g. $I=\left\{{M}_{{c}_{0,1}},{M}_{{c}_{0,2}}\right\}$ depends on the type of interaction among the two modules ${M}_{{c}_{0,1}}$ and ${M}_{{c}_{0,2}}$, which can be categorised into:

• regulation – assuming ${M}_{{c}_{0,1}}$ and ${M}_{{c}_{0,2}}$ can be any module type:

• ${T}^{I}=\left\{t:{\varrho }_{t\to e}\left(t\right)\in {\mathcal{E}}_{IA}\left({M}_{{c}_{0,1}}\right)\cap {\mathcal{E}}_{IA}\left({M}_{{c}_{0,2}}\right)\right\}$, and

• ${P}^{I}=\left\{p:{\varrho }_{p\to s}\left(p\right)\in \bullet \left({\mathcal{E}}_{IA}\left({M}_{{c}_{0,1}}\right)\cap {\mathcal{E}}_{IA}\left({M}_{{c}_{0,2}}\right)\right)\cup \left({\mathcal{E}}_{IA}\left({M}_{{c}_{0,1}}\right)\cap {\mathcal{E}}_{IA}\left({M}_{{c}_{0,2}}\right)\right)\bullet \right\}$;

#### Running Example

Transitions and places describing the molecular mechanism in the interface network, shared by the modules

• kinase protein MpK and transcription factor protein MpTF are ${T}^{I}$ = {pKpTFt1} and ${P}^{I}$ = {pTFY, pTFYp, pKCDact}, see Figure 3(B) nodes highlighted in blue; and

• kinase gene MgK and transcription factor protein MpTF are ${T}^{I}$ = {gKpTFt1, gKpTFt2} and ${P}^{I}$ = {pTFBS, gKE, gKEpTFBS}, see Figure 3(B) nodes highlighted in red. In general, interface networks of protein modules do not include events affecting the transcriptional activity of a gene. Thus, transition gKpTFt3 is not part of the interface network of the kinase gene MgK and transcription factor protein MpTF.

• transcription – assuming ${M}_{{c}_{0,1}}$ is a gene module and ${M}_{{c}_{0,2}}$ is an mRNA module:

• ${T}^{I}=\mathrm{\varnothing }$, and

• ${P}^{I}=\left\{p:{\varrho }_{p\to s}\left(p\right)\in \mathcal{S}\left({M}_{{c}_{0,1}}\right)\wedge {\varrho }_{p\to s}\left(p\right)\text{ is a transcriptionally active state of }{c}_{0,2}\right\}$;

#### Running Example

Places in the interface network describing part of the transcription shared by the modules kinase gene MgK and kinase mRNA MmK are ${P}^{I}$ = {gKAact}, see Figure 3(B) nodes highlighted in orange.

• translation – assuming ${M}_{{c}_{0,1}}$ is an mRNA module and ${M}_{{c}_{0,2}}$ is a protein module:

• ${T}^{I}=\mathrm{\varnothing }$, and

• ${P}^{I}=\left\{p:{\varrho }_{p\to s}\left(p\right)\in {\mathcal{S}}_{0}\left({M}_{{c}_{0,2}}\right)\right\}$;

#### Running Example

Places in the interface network describing part of the translation shared by the modules of kinase mRNA MmK and kinase protein MpK are ${P}^{I}$ = {pKCDinact}, see Figure 3(B) nodes highlighted in green.

• degradation – assuming ${M}_{{c}_{0,1}}$ is a protein module and ${M}_{{c}_{0,2}}$ is a protein degradation module:

• ${T}^{I}=\mathrm{\varnothing }$, and

• ${P}^{I}\subseteq \left\{p:{\varrho }_{p\to s}\left(p\right)\in \mathcal{S}\left({M}_{{c}_{0,1}}\right)\right\}$.

#### Running Example

Places in the interface network describing part of the degradation shared by the modules of kinase protein MpK and kinase protein degradation MdK are ${P}^{I}$ = {pKCDinact, pKCDact}, see Figure 3(B) nodes highlighted in yellow.

The redundant interface networks shared among modules might appear unnecessarily complicated, but they are of tremendous benefit for modules with complex interaction sites because they secure the correct functioning of modules in a composed model. Moreover, interface networks ensure that interactions can only be executed if all modules of components involved in the interaction are part of the composed model, in accordance with the real-world scenario; see next section.
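Because all nodes of an interface network are logical (fusion) nodes, two modules that take part in the same interaction carry identically named copies of the shared places and transitions. The following Python sketch illustrates only this consequence: it reduces each module to its node names and recovers the interface network as the set of names occurring in both modules. The class `Module`, the function `interface_network` and the variable names are hypothetical, and the node sets are limited to the nodes named in the running examples above, so they are incomplete with respect to Figure 3(B).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    """A module reduced to the names of its places and transitions."""
    name: str
    places: frozenset
    transitions: frozenset


def interface_network(a, b):
    """Return (P_I, T_I): the nodes that carry the same name in both modules."""
    return a.places & b.places, a.transitions & b.transitions


# Node sets restricted to the nodes mentioned in the running examples
# (assumption: the real modules contain further, module-internal nodes).
m_pk = Module(
    "MpK",
    frozenset({"pKCDact", "pKCDinact", "pTFY", "pTFYp"}),
    frozenset({"pKpTFt1"}),
)
m_ptf = Module(
    "MpTF",
    frozenset({"pTFY", "pTFYp", "pKCDact", "pTFBS", "gKE", "gKEpTFBS"}),
    frozenset({"pKpTFt1", "gKpTFt1", "gKpTFt2"}),
)
m_gk = Module(
    "MgK",
    frozenset({"pTFBS", "gKE", "gKEpTFBS", "gKAact"}),
    frozenset({"gKpTFt1", "gKpTFt2", "gKpTFt3"}),
)

places, transitions = interface_network(m_pk, m_ptf)
print(sorted(places), sorted(transitions))
```

In this reduced picture, gKpTFt3 stays outside the interface network of MgK and MpTF simply because it occurs in the gene module only, which matches the remark in the running example.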

## 4 Model Composition

The model composition relies on the interface networks introduced in Section 3.4. A composed model is defined by the set of modules $G=\left\{{M}_{1},\dots ,{M}_{{n}_{M}}\right\}$, ${n}_{M}\ge 1$, which can also be written as $G={G}_{g}\cup {G}_{m}\cup {G}_{p}\cup {G}_{d}\cup {G}_{ai}\cup {G}_{ci}$, where:

• ${G}_{g}=\left\{{M}_{g,{c}_{0,1}},\dots ,{M}_{g,{c}_{0,{n}_{g}}}\right\}$ – subset of gene modules,

• ${G}_{m}=\left\{{M}_{m,{c}_{0,1}},\dots ,{M}_{m,{c}_{0,{n}_{m}}}\right\}$ – subset of mRNA modules,

• ${G}_{p}=\left\{{M}_{p,{c}_{0,1}},\dots ,{M}_{p,{c}_{0,{n}_{p}}}\right\}$ – subset of protein modules,

• ${G}_{d}=\left\{{M}_{d,{c}_{0,1}},\dots ,{M}_{d,{c}_{0,{n}_{d}}}\right\}$ – subset of protein degradation modules,

• ${G}_{ai}=\left\{{M}_{ai,{c}_{0,1}},\dots ,{M}_{ai,{c}_{0,{n}_{ai}}}\right\}$ – subset of allelic influence modules, and

• ${G}_{ci}=\left\{{M}_{ci,{c}_{0,1}},\dots ,{M}_{ci,{c}_{0,{n}_{ci}}}\right\}$ – subset of causal influence modules.

All module subsets ${G}_{g}$, ${G}_{m}$, ${G}_{p}$, ${G}_{d}$, ${G}_{ai}$, and ${G}_{ci}$ are pairwise disjoint.

The PN of the composed model $\mathcal{N}\left(G\right)=\left({P}^{G},{T}^{G},{F}^{G},{f}^{G},{v}^{G},{m}_{0}^{G}\right)$ is given by:

• Set of places ${P}^{G}=\bigcup _{{M}_{{c}_{0,i}}\in G}{P}_{{M}_{{c}_{0,i}}}$;

• Set of transitions ${T}^{G}=\bigcup _{{M}_{{c}_{0,i}}\in G}{T}_{{M}_{{c}_{0,i}}}$;

• Set of arcs ${F}^{G}=\bigcup _{{M}_{{c}_{0,i}}\in G}{F}_{{M}_{{c}_{0,i}}}$;

• Arc weights ${f}^{G}:{F}^{G}\to {\mathbb{N}}_{0}$

• $\mathrm{\forall }a\in {F}^{G}$ with $a\in {F}^{{M}_{1}},\dots ,{F}^{{M}_{m}}$ and $\left\{{M}_{1},\dots ,{M}_{m}\right\}\subseteq G$: ${f}^{G}\left(a\right)={f}^{{M}_{1}}\left(a\right)=\dots ={f}^{{M}_{m}}\left(a\right)$;

• Firing rates ${v}^{G}:{T}^{G}\to {H}^{G}$, ${H}^{G}=\bigcup _{{M}_{{c}_{0,i}}\in G}{H}_{{M}_{{c}_{0,i}}}$;

• Initial marking: ${m}_{0}^{G}:{P}^{G}\to {\mathbb{N}}_{0}$, where

• $\mathrm{\forall }p\in {P}^{G}$ with ${\varrho }_{p\to s}\left(p\right)\in \bigcup _{M\in G}{\mathcal{S}}_{0}\left(M\right)$: ${m}_{0}^{G}\left(p\right)={n}_{{c}_{0}}$ with ${c}_{0}={\lambda }^{c}\left({\varrho }_{p\to s}\left(p\right)\right)$, and

• $\mathrm{\forall }p\in {P}^{G}$ with ${\varrho }_{p\to s}\left(p\right)\notin \bigcup _{M\in G}{\mathcal{S}}_{0}\left(M\right)$: ${m}_{0}^{G}\left(p\right)=0$.

#### Running Example

See Figure 3(B) and the caption for the Petri net representation of the composed model of the running example in Figure 3(A). The composed model is also provided as a Snoopy file in Supplementary Material 2.
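The union-based definition of $\mathcal{N}\left(G\right)$ given in this section can be pictured with a short Python sketch. It is a simplified illustration under stated assumptions, not the BMK implementation: the class `PetriNet` and the function `compose` are hypothetical, firing rates are omitted, each module carries its own initial marking instead of deriving it from copy numbers, and the two toy modules (with node names borrowed from the binding example of Figure 4) are invented for demonstration. Shared nodes merge because they carry the same name, and a shared arc must have the same weight in every module that contains it.

```python
from dataclasses import dataclass, field


@dataclass
class PetriNet:
    places: set = field(default_factory=set)
    transitions: set = field(default_factory=set)
    arcs: dict = field(default_factory=dict)             # (source, target) -> weight
    initial_marking: dict = field(default_factory=dict)  # place -> tokens


def compose(modules):
    """Build the composed net as the union of places, transitions and arcs."""
    net = PetriNet()
    for module in modules:
        net.places |= module.places
        net.transitions |= module.transitions
        for arc, weight in module.arcs.items():
            # An arc shared by several modules must agree on its weight.
            if net.arcs.setdefault(arc, weight) != weight:
                raise ValueError(f"conflicting weights for shared arc {arc}")
        for place, tokens in module.initial_marking.items():
            if net.initial_marking.setdefault(place, tokens) != tokens:
                raise ValueError(f"conflicting initial marking for place {place}")
    # Places without an explicit initial marking start empty.
    for place in net.places:
        net.initial_marking.setdefault(place, 0)
    return net


# Hypothetical toy modules for components A and B sharing the complex AB.
mod_a = PetriNet(
    places={"A", "AB"},
    transitions={"ABt1", "ABt2"},
    arcs={("A", "ABt1"): 1, ("ABt1", "AB"): 1, ("AB", "ABt2"): 1, ("ABt2", "A"): 1},
    initial_marking={"A": 2},
)
mod_b = PetriNet(
    places={"B", "AB"},
    transitions={"ABt1", "ABt2"},
    arcs={("B", "ABt1"): 1, ("ABt1", "AB"): 1, ("AB", "ABt2"): 1, ("ABt2", "B"): 1},
    initial_marking={"B": 3},
)

composed = compose([mod_a, mod_b])
print(sorted(composed.places), len(composed.arcs), composed.initial_marking)
```

Composing only `mod_a` leaves ABt1 without its input from B, which is why the sketch cannot by itself reproduce the guarantee that an interaction fires only when all partners are present; in BMK this is handled by the interface network definitions.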

The definition of interface networks within modules also allows integrating various levels of granularity into a composed model. Modules can be recombined arbitrarily to compose alternative models or reused in the context of a different biological system. The arbitrary recombination of modules enables the user to decide which omic levels to include in a composed model.

Composed models can be submitted to an analysis framework using the PN tools Snoopy [18], Charlie [20] or Marcie [21]. Snoopy [18] allows converting the composed model into an SBML file to employ other analysis tools of choice as well.

## 5 Features

Based on the molecule-centred modularisation and model composition concept, we developed several extensions to enhance the versatility and usefulness of BMK, and thus to accelerate its usage and potential fields of application. Below, we briefly summarise important BMK features and provide references for further details.

## 5.1 Module Annotation

As discussed by Le Novère et al. [31], most published biological models are lost due to missing access or insufficient characterisation. Based on these observations, Le Novère et al. proposed a guideline for curating and encoding models called MIRIAM (Minimum Information Requested In the Annotation of Models). To adhere to the MIRIAM guideline, a module consists not only of its underlying PN but also of an annotation file. We therefore defined the BMK markup language (BMKml); its XML Schema Definition (XSD) is documented in [32]. In summary, the annotation file specifies:

• module name, type and a cross-reference to the Ensembl database [34] of the corresponding component;

• information to clarify the authorship including contact details, creation and modification dates of the module, as well as a short description of the considered molecular mechanisms; and

• for each place, transition and parameter a description, literature references and cross-references to other databases giving more detail about the modelled molecular mechanisms or linking a GO annotation [35].

The module annotation provides a complete characterisation and documentation of the module to ease the understanding of the modelled molecular mechanisms and the reuse in composed models.
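The kinds of information listed above can be pictured as a simple record. The following Python sketch is not BMKml, whose actual schema is documented in [32]; all class names, field names and example values are hypothetical placeholders chosen only to mirror the three bullet points. The helper `undocumented_nodes` shows one use of such a record: checking which nodes of a PN still lack an annotation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class NodeAnnotation:
    """Annotation of one place, transition or parameter (hypothetical layout)."""
    node_id: str
    description: str
    literature: List[str] = field(default_factory=list)
    cross_references: List[str] = field(default_factory=list)


@dataclass
class ModuleAnnotation:
    """Module-level annotation mirroring the items listed in the text."""
    name: str
    module_type: str
    ensembl_reference: str
    authors: List[str]
    created: str
    modified: str
    description: str
    nodes: List[NodeAnnotation] = field(default_factory=list)


def undocumented_nodes(annotation, node_ids):
    """Return the node ids of the PN that have no annotation entry yet."""
    documented = {node.node_id for node in annotation.nodes}
    return sorted(set(node_ids) - documented)


# Placeholder values only; none of them is taken from BMK.
example = ModuleAnnotation(
    name="kinase protein",
    module_type="protein",
    ensembl_reference="PLACEHOLDER-ID",
    authors=["A. Author"],
    created="2018-01-01",
    modified="2018-01-02",
    description="Toy annotation for illustration.",
    nodes=[NodeAnnotation("pKCDact", "active state of functional unit CD")],
)
print(undocumented_nodes(example, ["pKCDact", "pKCDinact"]))
```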

## 5.2 Module Construction

So far, we exploit three ways to obtain modules, compare Figure 2 – Box 1:

1. The direct engineering of modules in Snoopy [18] based on knowledge provided in textbooks, publications, and databases is the default approach to construct modules. We used this approach to compose models for the following case studies:

• Pain signalling – the composed model comprises 38 modules of key players involved in molecular mechanisms of nociception, which occurs in the peripheral terminals of dorsal root ganglion neurones and is responsible for the detection of noxious and painful stimuli [36], [37], [38], [39], [40].

• IL6-induced JAK-STAT signalling – the composed model comprises seven protein modules (JAK, STAT, IL6, IL6R, gp130, SHP2, SOCS3), as well as a protein degradation, mRNA and gene module for the feedback inhibitor SOCS3 [25], [41].

• Phosphate regulatory network in enterobacteria – the composed model comprises nine protein modules (PhoA, PstS, PstA, PstB, PstC, PhoU, PhoR, PhoA), as well as a protein degradation, mRNA and gene module for PhoA, which sense and regulate the external Pi [33].

2. The reverse engineering of modules from complex (omic) datasets, either to obtain causal modules or to map data to module prototypes [32], [33] (e.g. in the case of conceptualised gene or mRNA modules), is a straightforward approach to integrate experimental results directly into composed models and thus to test for their effect on the model’s behaviour.

3. The modularisation and thus, the integration of existing models will be another valuable resource to obtain modules. In particular, we suggest a routine to decompose mechanistic SBML encoded models [23] into a set of related modules. Modules can also be obtained from gene regulatory models defined by Boolean networks as described in [22].

## 5.3 Algorithmic Mutation

Life science has shown that, in order to understand a biological system, it is necessary to study not only the native system state but also its genetic variations. Genetic variations are a consequence of mutations in the genome, where some of the mutations are indifferent, and others lead to a gain or even a loss of function. Mutations either occur naturally or are experimentally induced. Modelling approaches need to introduce equivalent mutation workflows. BMK suggests algorithmic mutation based on the modularisation concept to conveniently mimic gene deletions and structure-function mutations, which are described in detail in [32].

Removing modules from a composed model mimics gene deletions. This mutation can be performed as single, double or even multiple gene knockouts to sufficiently block the effect of redundant gene functions.

Since functional units in a module have a relation to the structure of the modelled component, their alteration can mimic structure-function mutations. By deleting the transitions in the subgraph of a functional unit, the functional unit can only adopt a subset of the previously reachable molecular states and can therefore not execute its complete functionality. As stated above, such an alteration could have an indifferent effect, but it could also increase or decrease a specific function encoded by the molecular mechanism.

Both mutation algorithms can be performed either on the entire set of modules or on a subset of modules/functional units that can be filtered through the provided references in the annotation file, see Section 5.1. The systematic mutation of a composed model yields a large set of alternative models and demands analysis workflows based on machine learning to reveal fundamental insights and identify structures with a significant effect on the system’s behaviour.
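Both kinds of mutation described in this section amount to simple set operations on a composed model. The Python sketch below is an illustration under stated assumptions rather than the algorithm from [32]: the function names `knockout_variants` and `disable_functional_unit` are hypothetical, modules are represented by placeholder strings, and the transition names `t1` to `t3` are invented. The first function enumerates single and double knockouts by removing modules, the second removes the transitions belonging to one functional unit.

```python
from itertools import combinations


def knockout_variants(modules, max_order=2):
    """Yield (removed, remaining) for every knockout of 1..max_order modules."""
    names = sorted(modules)
    for order in range(1, max_order + 1):
        for removed in combinations(names, order):
            remaining = {n: modules[n] for n in names if n not in removed}
            yield removed, remaining


def disable_functional_unit(transitions, unit_transitions):
    """Structure-function mutation: drop the transitions of one functional unit."""
    return set(transitions) - set(unit_transitions)


# Placeholder module contents; only the module names come from the text.
example_modules = {"JAK": "module JAK", "STAT": "module STAT", "SOCS3": "module SOCS3"}

variants = list(knockout_variants(example_modules, max_order=2))
for removed, remaining in variants:
    print("knockout", removed, "->", sorted(remaining))

# Hypothetical transition names for a module with one mutated functional unit.
mutated = disable_functional_unit({"t1", "t2", "t3"}, {"t2", "t3"})
print(sorted(mutated))
```

The number of alternative models grows combinatorially with the knockout order, which is the reason the text calls for dedicated analysis workflows once mutations are applied systematically.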

## 5.4 Spatiotemporal Transformation

Molecular mechanisms are affected by the spatial distribution and movement of the involved components. Therefore, the cellular arrangement concerning compartments, membranes and pools, as well as the cell geometry and size, have to be considered to understand the spatiotemporal behaviour of a system. Representing biomolecular mechanisms and movement in one coherent model is a challenging task.

Based on the modularisation concept, we propose a spatiotemporal transformation algorithm that integrates those spatial aspects into the composed model, see [24], [42] for details. The spatiotemporal transformation algorithm extends the composed model with a spatial model controlling the movement and interaction of the involved components without interfering with the modelled molecular mechanisms. In the first step of the spatiotemporal transformation, the cellular space needs to be defined by a grid, as well as cellular substructures on that particular grid. This step includes the assignment of the components in the composed model to a set of substructures. The position of each component on the grid is encoded by a set of coordinate places (one, two or three places depending on the dimension of the grid) and the corresponding marking. The movement of the component is realised by increasing or decreasing the marking on the coordinate places by transitions with respect to the grid and its substructures. The introduced subnets responsible for the movement must further distinguish whether the component is in an interactive state forming a complex or not, to guarantee a synchronous movement of all components forming an interactive complex. Additional rules have to be introduced to ensure that interaction between components encoded in the interface networks can only occur if they reside close to each other. The spatiotemporal transformation employs coloured PNs to allow for a scalable representation of the spatial model concerning the copy number of the involved components. The copy number of components has to be realised via colour and not via the marking, to distinguish the individual copies with respect to their position, movement and interaction.

Applying the spatiotemporal transformation to a composed model allows studying how space might affect the dynamic behaviour of a system without constituting a new additional model or framework only for this specific purpose.

Figure 4:

Basics of the spatial transformation concept.

#### Running Example

In the example given in Figure 4, the mechanistic model describes the binding of A and B to form complex AB, and its dissociation. The spatial model provides places encoding the (x, y)-coordinates of A and B. The marking of those places reflects the position of A and B on a 2-dimensional grid, which is defined by the parameters (xDimL, xDimU) and (yDimL, yDimU) (lower/upper bound at the x/y-axis). Also, the spatial model encodes spatial constraints to restrict interactions among A and B locally. Therefore, the transitions ABt1 and ABt2 are connected with the coordinate places by read arcs. The firing rates are multiplied by a Boolean expression defining a neighbourhood condition, which evaluates whether A and B are at the same position. Only if A and B are at the same position can the interaction be executed in the model. In the spatial model, A and B can either move as single entities or as complex AB. In each case, they can move either towards the lower bound of an axis (t{x, y} L{A, B, AB}) or towards the upper bound of an axis (t{x, y} U{A, B, AB}). To move towards the lower bound of an axis, the marking of the coordinate place must be greater than xDimL/yDimL. This is checked by the read arcs with the arc-weight xDimL + 1/yDimL + 1. To move towards the upper bound of an axis, the marking of the coordinate place must be less than xDimU/yDimU. This is checked by the inhibitory arcs with the arc-weights xDimU/yDimU. In addition, A and B are only allowed to move as a single entity if the place AB is empty. This is checked by the inhibitory arcs connecting the place AB with transitions t{x, y} L{A, B} and t{x, y} U{A, B}. In contrast, to move A and B as a complex, place AB must not be empty. This is checked by the read arcs connecting place AB with transitions t{x, y} LAB and t{x, y}UAB.
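The enabling conditions of this example can be written down as plain predicates. The Python sketch below is an illustration, not the coloured PN of Figure 4; all function names are hypothetical. It assumes the common arc semantics in which a read arc with weight w requires a marking of at least w, and an inhibitory arc with weight w requires a marking below w.

```python
def can_move_lower(coord, dim_lower):
    """Read arc with weight dim_lower + 1: marking must be at least that weight."""
    return coord >= dim_lower + 1


def can_move_upper(coord, dim_upper):
    """Inhibitory arc with weight dim_upper: marking must stay below that weight."""
    return coord < dim_upper


def may_move_single(ab_tokens):
    """Inhibitory arc from place AB: A and B move alone only if AB is empty."""
    return ab_tokens == 0


def may_move_complex(ab_tokens):
    """Read arc from place AB: the complex moves only if AB is marked."""
    return ab_tokens >= 1


def interaction_rate(base_rate, pos_a, pos_b):
    """Firing rate multiplied by the Boolean neighbourhood condition."""
    return base_rate * int(pos_a == pos_b)


# Example grid along x with xDimL = 1 and xDimU = 3 (values chosen arbitrarily).
x_dim_l, x_dim_u = 1, 3
for x in (1, 2, 3):
    print(x, can_move_lower(x, x_dim_l), can_move_upper(x, x_dim_u))

print(interaction_rate(0.5, (2, 2), (2, 2)), interaction_rate(0.5, (2, 2), (3, 2)))
```

Under these assumptions a component can never leave the interval from the lower to the upper bound: a move down is blocked at the lower bound and a move up is blocked at the upper bound.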

## 6 BioModelKit Web Interface and Database

The BMK web interface (BMKwi, www.biomodelkit.org) gives public access to the modules by storing and linking all elements of the underlying PN graph and all annotations in the BMK database (BMKdb), see Figure 2 – Box 3. Thus, BMK is more than just a repository of hard-coded models.

BMKdb uses MySQL as its relational database management system. The database schema has been derived from the XSD schema of the Snoopy file format [18] used to save the PN graphs of a module and the BMKml format to store the module annotation, see Section 5.1. The content of the PN graph and the annotation file of a module are not only stored as hard copies on the BMK file server, but are also transferred as records to the BMKdb, where all PN graph elements and information in the annotation file are explicitly stored and linked to each other, see Figure 5. The database schema and its relation to the introduced module terminology and annotations are described in [32]. Storing modules in BMKdb allows not only organising submitted modules but also keeping track of module versioning. Therefore, the BMKdb can provide various module versions for one component that might display different levels of granularity, hypotheses on the modelled molecular mechanisms, or refinements and updates due to new insights.

BMKwi is running on a Linux/Apache 2.2.14 web server with MySQL client version 5.1.67, PHP 5.2.0, and JavaScript 1.8.5. Its main functionality is to publish modules. Users can therefore browse, search and inspect modules; the corresponding content is automatically generated by querying the BMKdb. Users registered at BMK can also manage their private profiles, which includes creating module collections. Module collections can be filled with ad hoc chosen modules published at BMK. Later on, modules stored in a private collection can be used for model composition, algorithmic model mutation or spatiotemporal transformation, which are available as automated features. Registered users can also curate and submit modules to BMK, which will be available after successful approval. Figure 5 summarises the structure of the web interface and possible user interactions.

The simple click-and-run principle makes it easy for every user to compose complex models and apply the implemented features, see the movie in Supplementary Material 3 for a demonstration.

Figure 5:

BioModelKit database and navigation scheme of the web-interface with features available for user interaction.

## 7 Discussion and Outlook

With BMK we propose a molecule-centred modularisation concept which is geared towards the ab initio design of modules, describing the functionality and interactions of one particular molecule by a PN that includes interface networks to automate the model composition. We showed that modules can be constructed from various data resources using direct and reverse engineering approaches as well as by decomposing existing models, which allows integrating molecular mechanisms and causal correlations on the metabolic, proteomic, transcriptomic and genomic level. In the scope of BMK, we set up a MySQL database that stores and organises the PN graph and all information provided in the MIRIAM-compliant annotation file of each module, which allows keeping track of the module versioning. We further designed a web interface to access modules stored in BMKdb. The BMKwi enables the user to perform the composition of models based on an ad hoc chosen set of modules. The model composition can be combined with mutation algorithms to generate alternative models accordingly, which allows mimicking naturally occurring genetic variations and their effect in silico. BMK also allows transforming a composed model into a spatiotemporal model integrating information about the cell geometry and arrangement concerning compartments, membranes and pools.

We demonstrated the practicability and versatility of the concept within BMK on several case studies, like JAK-STAT signalling [25], [41], molecular mechanisms of nociception [36], [37], [38], [39], [40] and phosphate regulation in enterobacteria [33]; additional case studies are currently in progress. So far, the BMKdb holds only the modules of the JAK-STAT signalling case study; the modules of the other two case studies will be made available with the next release and update of the BMKwi and BMKdb.

BMK is still under continuous development; we aim to enhance the functionality of the web interface according to user priorities. Soon, we will release modules constructed in so far unpublished case studies, as well as modules obtained by the modularisation of existing models, to increase the number of available models in BMK. We intend to develop a BMK ontology, which addresses the naming convention of model elements (reactions, species, parameters etc.) to allow for a more consistent naming among modules stored in BMK. Based on the BMK ontology, we plan to develop a module questionnaire that allows for a straightforward translation of molecular mechanisms into modules compliant with the BMK standards. Future developments of the BMKwi will also focus on the implementation of modularisation workflows for automatic omic data integration, which will enable the user to directly test for the effect of the experimental results on the model’s behaviour.

We see great potential for BMK in in-depth in silico mutation studies to identify models reproducing a desired behaviour. This approach can considerably supplement experimental work. Another exciting field for BMK is in silico synthetic biology; module interface networks can easily be altered to add or remove functionality, which allows rewiring the network structure of the composed model synthetically and thus analysing the effect of desired properties. Last but not least, individualising modules or conceptualising composed models based on omic data might help to obtain new insights about the underlying molecular circumstances and to predict strategies to improve the molecular mechanisms. Such workflows are of particular interest in precision medicine, to integrate patient-specific data and to suggest potential therapeutic strategies to treat clinical disorders, or in precision crop plant breeding, to recommend suitable genotypes, genetic variations or breeding strategies to increase the yield and yield stability, as well as the nutritional value.

BMK is well suited to study complex and diverse biological systems, to obtain new insights about the involved molecular mechanisms and, based on that, to make predictions and raise hypotheses that support and benefit experimental work in multilateral collaborations.

## Acknowledgement

We would like to thank Monika Heiner, Mostafa Herajy, Fei Liu, Christian Rohr, and Martin Schwarick for their contributions in developing and supporting the use of Snoopy, Marcie, and Charlie; and Wolfgang Marwan for supporting the development of BMK by his supervision. We appreciate countless productive discussions with all of them.

## References

• [1]

Koch I, Reisig W, Schreiber F. Modeling in systems biology. The Petri Net Approach. Springer Science & Business Media, 2010.

• [2]

Somekh J, Peleg M, Eran A, Koren I, Feiglin A, Demishtein A, et al. A model-driven methodology for exploring complex disease comorbidities applied to autism spectrum disorder and inflammatory bowel disease. J Biomed Inform. 2016;63:366–78.

• [3]

Xu H, Curtis TY, Powers SJ, Raffan S, Gao R, Huang J, et al. Genomic, biochemical, and modeling analyses of asparagine synthetases from wheat. Front Plant Sci. 2018;8:2013.

• [4]

Zechendorf E, Vaßen P, Zhang J, Hallawa A, Martincuks A, Krenkel O, et al. Heparan sulfate induces necroptosis in murine cardiomyocytes: A medical-in silico approach combining in vitro experiments and machine learning. Front Immunol. 2018;9:885.

• [5]

Baldan P, Cocco N, Marin A, Simeoni M. Petri nets for modelling metabolic pathways: a survey. Nat Comput. 2010;9:955–89.

• [6]

Chaouiya C. Petri net modelling of biological networks. Brief Bioinform. 2007;8:210–9.

• [7]

Heiner M, Koch I. Petri net based model validation in systems biology. In: Proc. ICATPN 2004. vol. 3099 of LNCS. Springer, 2004:216–37.

• [8]

Sackmann A, Heiner M, Koch I. Application of petri net based analysis techniques to signal transduction pathways. BMC Bioinformatics. 2006;7:482.

• [9]

Heiner M. Understanding network behaviour by structured representations of transition invariants – a petri net perspective on systems and synthetic biology. In: Condon A, Harel D, Kok J, Salomaa A, Winfree E, editors. Algorithmic Bioprocesses, Natural Computing Series. Berlin, Heidelberg: Springer, 2009. p. 367–89. Available from: http://www.springerlink.com/content/m8t30720r141442m

• [10]

Zevedei-Oancea I, Schuster S. Topological analysis of metabolic networks based on petri net theory. In Silico Biology. 2003;3:323–45.

• [11]

Koch I. Petri nets in systems biology. SoSyM. 2014;14:703–10.

• [12]

Blätke MA, Heiner M, Marwan W. Chapter 7 – BioModel Engineering with Petri Nets. In: Algebraic and Discrete Mathematical Methods for Modern Biology. Boston: Elsevier Inc., 2015:141–93.

• [13]

Gilbert D, Heiner M, Lehrack S. A unifying framework for modelling and analysing biochemical pathways using petri nets. In: Computational Methods in Systems Biology. Berlin, Heidelberg: Springer Berlin Heidelberg, 2007:200–16.

• [14]

Jensen K, Kristensen LM. Coloured petri nets: modelling and validation of concurrent systems. Berlin Heidelberg: Springer-Verlag; 2009.

• [15]

Gao Q, Gilbert D, Heiner M, Liu F, Maccagnola D, Tree D. Multiscale modelling and analysis of planar cell polarity in the drosophila wing. IEEE/ACM Trans Comput Biol Bioinform. 2013;10:337–51.

• [16]

Liu F, Heiner M, Gilbert D. Coloured petri nets for multi-level, multiscale, and multi-dimensional modelling of biological systems. Brief Bioinform. 2017;bbx150.

• [17]

Rohr C, Marwan W, Heiner M. Snoopy – a unifying petri net framework to investigate biomolecular networks. Bioinformatics (Oxford, England). 2010;26:974–5.

• [18]

Heiner M, Herajy M, Liu F, Rohr C, Schwarick M. Snoopy – A unifying Petri net tool. In: Proc. PETRI NETS 2012. vol. 7347 of LNCS. Springer, 2012:398–407.

• [19]

Marwan W, Rohr C, Heiner M. 2012. Petri Nets in Snoopy: A Unifying Framework for the Graphical Display, Computational Modelling, and Simulation of Bacterial Regulatory Networks. In: van Helden J, Toussaint A, Thieffry D, editors. Bacterial Molecular Networks. Methods in Molecular Biology (Methods and Protocols), vol 804. Springer, New York, NY. p. 409–37.

• [20]

Heiner M, Schwarick M, Wegener J. Charlie – an extensible Petri net analysis tool. In: Devillers R, Valmari A, editors. Proc. PETRI NETS 2015. vol. 9115 of LNCS. Springer, 2015:200–11.

• [21]

Heiner M, Rohr C, Schwarick M. MARCIE – model Checking and reachability analysis done effiCIEntly. In: Colom J, Desel J, editor(s). Proc. PETRI NETS 2013. Vol. 7927 of LNCS. Berlin, Heidelberg: Springer, 2013:389–99.

• [22]

Jehrke L. Modulare modellierung und graphische darstellung boolescher netzwerke mit hilfe automatisch erzeugter Petri-netze und ihre simulation am beispiel eines genregulatorischen netzwerkes [Masterthesis]; 2014.

• [23]

Soldmann M. Transformation monolithischer SBML-modelle biomolekularer netzwerke in Petri netz module [Masterthesis]; 2014.

• [24]

Blätke MA, Rohr C. BioModelKit: spatial modelling of complex multiscale molecular biosystems based on modular models. In: Advances in Biological processes and Petri nets (BioPPN). vol. 160, 1-2 of Fundamenta Informaticae. IOS Press, 2018:221–54.

• [25]

Blätke MA, Dittrich A, Rohr C, Heiner M, Schaper F, Marwan W. JAK/STAT signalling – an executable model assembled from molecule-centred modules demonstrating a module-oriented database concept for systems and synthetic biology. Mol Biosyst. 2013;9:1290–307.

• [26]

Cooling MT, Rouilly V, Misirli G, Lawson JR, Yu T, Hallinan J, et al. Standard virtual biological parts: a repository of modular modeling components for synthetic biology. Bioinformatics (Oxford, England). 2010;26:925–31.

• [27]

Lloyd CM, Lawson JR, Hunter PJ, Nielsen PMF. The cellML model repository. Bioinformatics (Oxford, England). 2008;24:2122–3.

• [28]

Li C, Donizelli M, Rodriguez N, Dharuri H, Endler L, Chelliah V, et al. BioModels database: an enhanced, curated and annotated resource for published quantitative kinetic models. BMC Syst Biol. 2010;4:1.

• [29]

King ZA, Lu J, Dräger A, Miller P, Federowicz S, Lerman JA, et al. BiGG models: a platform for integrating, standardizing and sharing genome-scale models. Nucleic Acids Res. 2016;44(D1):D515–22.

• [30]

Zhu Q, Wong AK, Krishnan A, Aure MR, Tadych A, Zhang R, et al. Targeted exploration and analysis of large cross-platform human transcriptomic compendia. Nat Methods. 2015;12:211–4.

• [31]

Le Novère N, Finney A, Hucka M, Bhalla DUS, Campagne F, Collado-Vides J, et al. Minimum information requested in the annotation of biochemical models (MIRIAM). Nat Biotechnol. 2005;23:1509–15.

• [32]

Blätke MA. BioModelKit a framework for modular biomodel-engineering. [Phd Thesis]; 2017.

• [33]

Blätke MA, Heiner M, Marwan W. Predicting phenotype from genotype through automatically composed petri nets. In: Proc. 10th International Conference on Computational Methods in Systems Biology (CMSB 2012), London. vol. 7605 of LNCS/LNBI. Springer, 2012:87–106.

• [34]

Flicek P, Aken BL, Beal K, Ballester B, Caccamo M, Chen Y, et al. Ensembl 2008. Nucleic Acids Res. 2008;36(Database issue):D707–14.

• [35]

Gene Ontology Consortium. The Gene Ontology Project in 2008. Nucleic Acids Res. 2008;36(Database issue):D440–4.

• [36]

Blätke MA. Petri-netz modellierung mittels eines modularen und hierarchischen ansatzes mit anwendung auf nozizeptive signalkomponenten. [Diploma Thesis]; 2010.

• [37]

Blätke MA, Meyer S, Stein C, Marwan W. Petri net modeling via a modular and hierarchical approach applied to nociception. In: Proc. 1st Int. Workshop on Biological Processes & Petri Nets (BioPPN), satellite event of Petri Nets 2010;2010:135–46.

• [38]

Blätke MA, Marwan W. Modular and hierarchical modelling concept for large biological Petri nets applied to nociception. In: Proc. 17th German Workshop on Algorithms and Tools for Petri Nets (AWPN 2010). vol. 643 of CEUR Workshop Proceedings. CEUR-WS.org, 2010:42–50.

• [39]

Blätke MA, Meyer S, Marwan W. Pain signaling – a case study of the modular Petri net modeling concept with prospect to a protein-oriented modeling platform. In: Proc. 2nd International Workshop on Biological Processes & Petri Nets (BioPPN), satellite event of Petri NETS 2011. vol. 724 of CEUR Workshop Proceedings. CEUR-WS.org, 2011:117–34.

• [40]

Blätke MA, Marwan W. A database-supported modular modelling platform for systems and synthetic biology. In: Proc. 3rd International Workshop on Biological Processes & Petri Nets (BioPPN), satellite event of Petri NETS 2012. vol. 852 of CEUR Workshop Proceedings. CEUR-WS.org, 2012:18–19.

• [41]

Blätke MA, Dittrich A, Rohr C, Heiner M, Schaper F, Marwan W. JAK–STAT signalling as example for a database-supported modular modelling concept. In: Proc. 10th International Conference on Computational Methods in Systems Biology (CMSB 2012), London. vol. 7605 of LNCS/LNBI. Springer, 2012:362–5.

• [42]

Blätke MA, Rohr C. A Coloured Petri net approach for spatial biomodel engineering based on the modular model composition framework Biomodelkit. In: Proc. 6th Int. Workshop on Biological Processes & Petri Nets (BioPPN 2015), satellite event of Petri Nets 2015. vol. 1373 of CEUR Workshop Proceedings. CEUR-WS.org, 2015:37–54.

• [43]

Liu F, Blätke M, Heiner M, Yang M. Modelling and simulating reaction–diffusion systems using coloured Petri nets. Comput Biol Med. 2014;53:297–308.

• [44]

Pârvu O, Gilbert D, Heiner M, Liu F, Saunders N, Shaw S. Spatial-temporal modelling and analysis of bacterial colonies with phase variable genes. ACM Trans Model Comput Simul. 2015;25:13–25.

• [45]

Liu F, Heiner M. Multiscale modelling of coupled Ca2+ channels using coloured stochastic Petri nets. IET Syst Biol. 2013;7:106–13.

• [46]

Blätke MA, Rohr C, Heiner M, Marwan W. A Petri-net-based framework for biomodel engineering. In: Large-Scale Networks in Engineering and Life Sciences. Modeling and Simulation in Science, Engineering and Technology. Cham: Springer International Publishing, 2014:317–66.

## Supplementary Material

Revised: 2018-05-21

Accepted: 2018-06-07

Published Online: 2018-09-06

Funding Source: Federal Ministry of Education and Research

Award identifier / Grant number: FKZ0315449F, FKZ0316177D

This work has been partly supported by the German Federal Ministry of Education and Research, Funder Id: 10.13039/501100002347 (FKZ0315449F, FKZ0316177D).

Conflict of Interest Statement: All authors have read the journal’s Publication ethics and publication malpractice statement available at the journal’s website and hereby confirm that they comply with all its parts applicable to the present scientific work.

Citation Information: Journal of Integrative Bioinformatics, Volume 15, Issue 3, 20180021, ISSN (Online) 1613-4516,
