According to a World Health Organization (WHO) report, 246 million people worldwide have diabetes, and this number is expected to exceed 380 million by 2025. Fast and accurate diagnosis of the disease has therefore become a significant challenge for machine learning researchers. This paper aims to design a robust model for diabetes diagnosis using a hybrid of the Chaotic-Jaya (CJaya) algorithm and the Extreme Learning Machine (ELM), named CJaya-ELM. The Jaya algorithm with a chaotic learning approach is used to optimize the random parameters of the ELM classifier. The efficacy of the designed model is assessed on the Pima Indian diabetes dataset. CJaya-ELM is compared with basic ELM, ELM optimized by the Teaching Learning Based Optimization (TLBO) algorithm (TLBO-ELM), Multi-Layer Perceptron (MLP), Jaya-optimized MLP (Jaya-MLP), TLBO-optimized MLP (TLBO-MLP), and CJaya-optimized MLP models. CJaya-ELM achieved the highest testing accuracy of 0.9687, with a sensitivity of 1, a specificity of 0.9688, and an area under the curve (AUC) of 0.9782. The results show that CJaya-ELM effectively classifies both the positive and negative samples of the Pima dataset and outperforms the competing models.
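The idea of tuning an ELM's random hidden-layer parameters with a chaotic Jaya search can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the logistic map as the chaotic sequence, its seeds, the sigmoid activation, the search bounds, the use of training error as the fitness, and all function names (`elm_fit_predict`, `chaotic_jaya_elm`, `sensitivity_specificity`) are hypothetical choices that the abstract does not specify.

```python
import numpy as np

def elm_fit_predict(params, X_train, y_train, X_eval, n_hidden):
    """Train a single-hidden-layer ELM whose hidden weights and biases are
    taken from `params`, then predict binary labels for X_eval."""
    n_features = X_train.shape[1]
    W = params[: n_features * n_hidden].reshape(n_features, n_hidden)
    b = params[n_features * n_hidden:]
    sigmoid = lambda Z: 1.0 / (1.0 + np.exp(-Z))
    H = sigmoid(X_train @ W + b)
    # ELM output weights: least-squares solution via the pseudoinverse.
    beta = np.linalg.pinv(H) @ y_train
    scores = sigmoid(X_eval @ W + b) @ beta
    return (scores >= 0.5).astype(int)

def error_rate(params, X, y, n_hidden):
    """Fitness to minimize (assumption: misclassification rate on X, y)."""
    return float(np.mean(elm_fit_predict(params, X, y, X, n_hidden) != y))

def chaotic_jaya_elm(X, y, n_hidden=10, pop_size=20, n_iter=30, seed=0):
    """Jaya search over ELM hidden parameters, with the two random
    coefficients replaced by logistic-map sequences (assumed variant)."""
    rng = np.random.default_rng(seed)
    dim = X.shape[1] * n_hidden + n_hidden
    pop = rng.uniform(-1.0, 1.0, size=(pop_size, dim))
    fitness = np.array([error_rate(p, X, y, n_hidden) for p in pop])
    history = [fitness.min()]
    c1, c2 = 0.7, 0.2  # logistic-map states; seed values are assumptions
    for _ in range(n_iter):
        best = pop[fitness.argmin()].copy()
        worst = pop[fitness.argmax()].copy()
        for i in range(pop_size):
            c1 = 4.0 * c1 * (1.0 - c1)
            c2 = 4.0 * c2 * (1.0 - c2)
            # Jaya move: toward the best solution, away from the worst.
            cand = (pop[i] + c1 * (best - np.abs(pop[i]))
                    - c2 * (worst - np.abs(pop[i])))
            cand = np.clip(cand, -1.0, 1.0)
            f = error_rate(cand, X, y, n_hidden)
            if f < fitness[i]:  # greedy acceptance
                pop[i], fitness[i] = cand, f
        history.append(fitness.min())
    return pop[fitness.argmin()], history

def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    return tp / (tp + fn), tn / (tn + fp)
```

In practice the fitness would more plausibly be computed on a held-out validation split rather than on the training data, and the reported test metrics would come from a separate test set; the sketch uses a single set only to stay short.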
We present here CellML 2.0, an XML-based language for describing and exchanging mathematical models of physiological systems. MathML embedded in CellML documents is used to define the underlying mathematics of models. Models consist of a network of reusable components, each with variables and equations giving relationships between those variables. Models may import other models to create systems of increasing complexity. CellML 2.0 is defined by the normative specification presented here, prescribing the CellML syntax and the rules by which it should be used. The normative specification is intended primarily for the developers of software tools which directly consume CellML syntax. Users of CellML models may prefer to browse the informative rendering of the specification (https://cellml.org/specifications/cellml_2.0/) which extends the normative specification with explanations of the rules combined with examples of their usage.
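To make the component, variable, and embedded-MathML structure concrete, the following sketch parses a small hand-written CellML-style document with Python's standard library. The example model (`decay_model`, a first-order decay equation), its variable names, and the helper `summarize_model` are hypothetical and for illustration only; the document has not been validated against the normative specification, which remains the authority on what is legal CellML 2.0.

```python
import xml.etree.ElementTree as ET

CELLML_NS = "http://www.cellml.org/cellml/2.0#"
MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Hypothetical model: dx/dt = -k * x, written by hand for illustration.
DOC = """
<model xmlns="http://www.cellml.org/cellml/2.0#" name="decay_model">
  <units name="per_second">
    <unit units="second" exponent="-1"/>
  </units>
  <component name="decay">
    <variable name="t" units="second"/>
    <variable name="x" units="dimensionless" initial_value="1"/>
    <variable name="k" units="per_second" initial_value="0.5"/>
    <math xmlns="http://www.w3.org/1998/Math/MathML">
      <apply><eq/>
        <apply><diff/><bvar><ci>t</ci></bvar><ci>x</ci></apply>
        <apply><times/><apply><minus/><ci>k</ci></apply><ci>x</ci></apply>
      </apply>
    </math>
  </component>
</model>
"""

def summarize_model(xml_text):
    """Map each component name to its variables (name -> units) and the
    number of MathML blocks it contains."""
    root = ET.fromstring(xml_text.strip())
    ns = {"cellml": CELLML_NS, "mathml": MATHML_NS}
    summary = {}
    for comp in root.findall("cellml:component", ns):
        variables = {
            v.get("name"): v.get("units")
            for v in comp.findall("cellml:variable", ns)
        }
        summary[comp.get("name")] = {
            "variables": variables,
            "math_blocks": len(comp.findall("mathml:math", ns)),
        }
    return summary

if __name__ == "__main__":
    print(summarize_model(DOC))
```

A generic XML parser like this only reads the syntax; checking the rules the specification prescribes (for example units consistency or import resolution) requires a dedicated CellML tool.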
Biological models often contain elements that have inexact numerical values, since they are based on values that are stochastic in nature or data that contains uncertainty. The Systems Biology Markup Language (SBML) Level 3 Core specification does not include an explicit mechanism to include inexact or stochastic values in a model, but it does provide a mechanism for SBML packages to extend the Core specification and add additional syntactic constructs. The SBML Distributions package for SBML Level 3 adds the necessary features to allow models to encode information about the distribution and uncertainty of values underlying a quantity.
Rule-based modeling is an approach that permits constructing reaction networks based on the specification of rules for molecular interactions and transformations. These rules can encompass details such as the interacting sub-molecular domains and the states and binding status of the involved components. Conceptually, fine-grained spatial information such as locations can also be provided. Through “wildcards” representing component states, entire families of molecule complexes sharing certain properties can be specified as patterns. This can significantly simplify the definition of models involving species with multiple components, multiple states, and multiple compartments. The Systems Biology Markup Language (SBML) Level 3 Multi Package Version 1 extends the SBML Level 3 Version 1 core with the “type” concept in the Species and Compartment classes. Therefore, reaction rules may contain species that can be patterns and exist in multiple locations. Multiple software tools, such as Simmune and BioNetGen, support this standard, which thus also serves as a medium for exchanging rule-based models. This document provides the specification for Release 2 of Version 1 of the SBML Level 3 Multi package. No design changes have been made to the description of models between Release 1 and Release 2; changes are restricted to the correction of errata and the addition of clarifications.
This paper presents a report on outcomes of the 10th Computational Modeling in Biology Network (COMBINE) meeting that was held in Heidelberg, Germany, in July of 2019. The annual event brings together researchers, biocurators and software engineers to present recent results and discuss future work in the area of standards for systems and synthetic biology. The COMBINE initiative coordinates the development of various community standards and formats for computational models in the life sciences. Over the past 10 years, COMBINE has brought together standard communities that have further developed and harmonized their standards for better interoperability of models and data. COMBINE 2019 was co-located with a stakeholder workshop of the European EU-STANDS4PM initiative that aims at harmonized data and model standardization for in silico models in the field of personalized medicine, as well as with the FAIRDOM PALs meeting to discuss findable, accessible, interoperable and reusable (FAIR) data sharing. This report briefly describes the work discussed in invited and contributed talks as well as during breakout sessions. It also highlights recent advancements in data, model, and annotation standardization efforts. Finally, this report concludes with some challenges and opportunities that this community will face during the next 10 years.
This special issue of the Journal of Integrative Bioinformatics presents papers related to the 10th COMBINE meeting together with the annual update of COMBINE standards in systems and synthetic biology.
To date, many proteins generated by large-scale genome sequencing projects remain uncharacterized and are the subject of intensive investigation by both experimental and computational means. Knowledge of protein subcellular localization (SCL) is of key importance for elucidating protein function. However, predicting it remains a challenging task, especially for multi-site proteins, which shuttle between cell compartments to perform their biological functions, and for proteins that lack significant homology to proteins of known subcellular location. Owing to their low cost and reasonable accuracy, machine learning-based methods have gained much attention in this context, helped by the availability of a plethora of biological databases and annotated proteins for analysis and benchmarking. Various predictive models have been proposed to tackle the SCL problem using different protein sequence features pertaining to subcellular localization; however, the overwhelming majority of them focus on single localization and cover a very limited number of cellular locations. Prediction has mainly been based on sorting signals, amino acid composition, and homology. To improve prediction quality, the focus is currently on knowledge extracted from annotation databases, such as protein–protein interactions and Gene Ontology (GO) functional domain annotations, which have recently become widely adopted and essential information for learning systems. To address this problem, in the present study we treated SCL prediction as a multi-label learning problem and sought to label both single-site and multi-site unannotated bacterial protein sequences by mining protein homology relationships using both the GO terms of protein homologs and PSI-BLAST profiles.
Experiments using 5-fold cross-validation on the benchmark datasets showed a significant improvement in the results obtained by the proposed consensus multi-label prediction model, which discriminates six compartments for Gram-negative and five compartments for Gram-positive bacterial proteins.
A standardized approach to annotating computational biomedical models and their associated files can facilitate model reuse and reproducibility among research groups, enhance search and retrieval of models and data, and enable semantic comparisons between models. Motivated by these potential benefits and guided by consensus across the COmputational Modeling in BIology NEtwork (COMBINE) community, we have developed a specification for encoding annotations in Open Modeling and EXchange (OMEX)-formatted archives. Distributing modeling projects within these archives is a best practice established by COMBINE, and the OMEX metadata specification presented here provides a harmonized, community-driven approach for annotating a variety of standardized model and data representation formats within an archive. The specification primarily includes technical guidelines for encoding archive metadata, so that software tools can more easily utilize and exchange it, thereby spurring broad advancements in model reuse, discovery, and semantic analyses.
Synthetic biology builds upon genetics, molecular biology, and metabolic engineering by applying engineering principles to the design of biological systems. When designing a synthetic system, synthetic biologists need to exchange information about multiple types of molecules, the intended behavior of the system, and actual experimental measurements. The Synthetic Biology Open Language (SBOL) has been developed as a standard to support the specification and exchange of biological design information in synthetic biology, following an open community process involving both wet bench scientists and dry scientific modelers and software developers, across academia, industry, and other institutions. This document describes SBOL 3.0.0, which condenses and simplifies previous versions of SBOL based on experiences in deployment across a variety of scientific and industrial settings. In particular, SBOL 3.0.0 (1) separates sequence features from part/sub-part relationships, (2) renames Component Definition/Component to Component/Sub-Component, (3) merges Component and Module classes, (4) ensures consistency between data model and ontology terms, (5) extends the means to define and reference Sub-Components, (6) refines requirements on object URIs, (7) enables graph-based serialization, (8) moves Systems Biology Ontology (SBO) for Component types, (9) makes all sequence associations explicit, (10) makes interfaces explicit, (11) generalizes Sequence Constraints into a general structural Constraint class, and (12) expands the set of allowed constraints.
This document defines Version 0.3 Markup Language (ML) support for the Systems Biology Graphical Notation (SBGN), a set of three complementary visual languages developed for biochemists, modelers, and computer scientists. SBGN aims at representing networks of biochemical interactions in a standard, unambiguous way to foster efficient and accurate representation, visualization, storage, exchange, and reuse of information on all kinds of biological knowledge, from gene regulation, to metabolism, to cellular signaling. SBGN is defined independently of programming languages and software encoding; however, it is oriented primarily towards allowing models to be encoded using XML, the eXtensible Markup Language. The notable changes from the previous version include the addition of attributes to better specify metadata about maps, as well as support for multiple maps, sub-maps, colors, and annotations. These changes enable a more efficient exchange of data with other commonly used systems biology formats (e.g., BioPAX and SBML) and between tools supporting SBGN (e.g., CellDesigner, Newt, Krayon, SBGN-ED, STON, cd2sbgnml, and MINERVA). More details on SBGN and related software are available at http://sbgn.org. With this effort, we hope to increase the adoption of SBGN in bioinformatics tools, ultimately enabling more researchers to visualize biological knowledge in a precise and unambiguous manner.