Systems biology plays a central role in biological network analysis in the post-genomic era. Cytoscape is the standard bioinformatics tool offering the community an extensible platform for computational analysis of the emerging cellular network together with experimental omics data sets. However, only a few apps/plugins/tools are available for simulating network dynamics in Cytoscape 3. Many approaches of varying complexity exist, but none of them has been integrated into Cytoscape as an app/plugin yet. Here, we introduce PetriScape, the first Petri net simulator for Cytoscape. Although discrete Petri nets are quite simplistic models, they are capable of modeling global network properties and simulating their behaviour. In addition, they are easy to understand and to visualize. PetriScape comes with the following main functionalities: (1) import of biological networks in SBML format, (2) conversion into a Petri net, (3) visualization as a Petri net, and (4) simulation and visualization of the token flow in Cytoscape. As the first Cytoscape plugin for Petri nets, it allows straightforward Petri net model creation, simulation and visualization with Cytoscape, providing clues about the activity of key components in biological networks.
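The token flow of a discrete Petri net can be illustrated with a minimal sketch. It is not PetriScape's implementation: the data layout, the function names (`enabled`, `fire`, `simulate`) and the deterministic rule of firing the first enabled transition in name order are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

Marking = Dict[str, int]
# Hypothetical layout: a transition is (inputs, outputs), each mapping
# a place name to an arc weight.
Transition = Tuple[Dict[str, int], Dict[str, int]]


def enabled(marking: Marking, transition: Transition) -> bool:
    """A transition is enabled if every input place holds enough tokens."""
    inputs, _ = transition
    return all(marking.get(place, 0) >= weight for place, weight in inputs.items())


def fire(marking: Marking, transition: Transition) -> Marking:
    """Consume tokens from input places and produce tokens on output places."""
    if not enabled(marking, transition):
        raise ValueError("transition is not enabled")
    inputs, outputs = transition
    new_marking = dict(marking)
    for place, weight in inputs.items():
        new_marking[place] = new_marking.get(place, 0) - weight
    for place, weight in outputs.items():
        new_marking[place] = new_marking.get(place, 0) + weight
    return new_marking


def simulate(marking: Marking, transitions: Dict[str, Transition],
             max_steps: int) -> List[Marking]:
    """Fire the first enabled transition (sorted by name) until none is enabled
    or max_steps is reached; return the sequence of markings."""
    trace = [dict(marking)]
    for _ in range(max_steps):
        candidates = [name for name in sorted(transitions)
                      if enabled(marking, transitions[name])]
        if not candidates:
            break
        marking = fire(marking, transitions[candidates[0]])
        trace.append(dict(marking))
    return trace


# Toy reaction 2 A + B -> C, modeled as one transition.
net = {"r1": ({"A": 2, "B": 1}, {"C": 1})}
trace = simulate({"A": 4, "B": 1, "C": 0}, net, max_steps=10)
```

In this toy net the transition fires once and then stalls because place B is empty, which is the kind of behaviour a token-flow visualization makes visible.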
Background: A precise experimental identification of transcription factor binding motifs (TFBMs), accurate to a single base pair, is time-consuming and difficult. For several databases, TFBM annotations are extracted from the literature and stored 5ʹ → 3ʹ relative to the target gene. Mixing the two possible orientations of a motif results in poor information content of subsequently computed position frequency matrices (PFMs) and sequence logos. Since these PFMs are used to predict further TFBMs, we address the question of whether the TFBMs underlying a PFM can be re-annotated automatically to improve both the information content of the PFM and subsequent classification performance.
Results: We present MoRAine, an algorithm that re-annotates transcription factor binding motifs. Each experimentally verified motif underlying a PFM is compared against every other such motif. The goal is to re-annotate TFBMs by possibly switching their strands and shifting them by a few positions in order to maximize the information content of the resulting adjusted PFM. We present two heuristic strategies to perform this optimization and subsequently show that MoRAine significantly improves the corresponding sequence logos. Furthermore, we justify the method by evaluating specificity, sensitivity, true positive, and false positive rates of PFM-based TFBM predictions for E. coli using the original database motifs and the MoRAine-adjusted motifs. The classification performance is considerably increased if MoRAine is used as a preprocessing step.
Conclusions: MoRAine is integrated into a publicly available web server and can be used online or downloaded as a stand-alone version from http://moraine.cebitec.uni-bielefeld.de.
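The idea of re-orienting motifs to raise the information content of a PFM can be sketched as follows. This is not one of the two MoRAine heuristics: it is a plain greedy strand-flipping loop, it assumes all motifs have equal length, it omits shifting, and it uses a uniform background without pseudocounts. All function names are hypothetical.

```python
import math
from typing import List

ALPHABET = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the motif as read on the opposite strand."""
    return seq.translate(COMPLEMENT)[::-1]


def information_content(motifs: List[str]) -> float:
    """Sum over all columns of (2 - Shannon entropy) in bits."""
    n = len(motifs)
    total = 0.0
    for column in zip(*motifs):
        entropy = 0.0
        for base in ALPHABET:
            freq = column.count(base) / n
            if freq > 0:
                entropy -= freq * math.log2(freq)
        total += 2.0 - entropy
    return total


def greedy_reorient(motifs: List[str]) -> List[str]:
    """Flip single motifs to the other strand while that strictly
    increases the information content of the resulting alignment."""
    current = list(motifs)
    best = information_content(current)
    improved = True
    while improved:
        improved = False
        for i in range(len(current)):
            trial = current[:i] + [reverse_complement(current[i])] + current[i + 1:]
            score = information_content(trial)
            if score > best + 1e-12:
                current, best = trial, score
                improved = True
    return current


# "ACTT" is the reverse complement of "AAGT", i.e. the same site
# annotated on the other strand.
adjusted = greedy_reorient(["AAGT", "AAGT", "ACTT"])
```

Because each accepted flip strictly increases the score and there are finitely many orientations, the loop terminates, though only at a local optimum.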
Over the last decade the evaluation of odors and vapors in human breath has gained more and more attention, particularly in the diagnostics of pulmonary diseases. Ion mobility spectrometry coupled with multi-capillary columns (MCC/IMS) is a well-known technology for detecting volatile organic compounds (VOCs) in air. It is a comparatively inexpensive, non-invasive, high-throughput method, which is able to handle the moisture that comes with human exhaled air and allows for the characterization of VOCs at very low concentrations. To identify discriminating compounds as biomarkers, it is necessary to have a clear understanding of the detailed composition of human breath. Therefore, in addition to the clinical studies, there is a need for a flexible and comprehensive centralized data repository that is capable of gathering all kinds of related information. Moreover, there is a demand for automated data integration and semi-automated data analysis, in particular with regard to the rapid data accumulation emerging from the high-throughput nature of the MCC/IMS technology. Here, we present a comprehensive database application and analysis platform, which combines metabolic maps with heterogeneous biomedical data in a well-structured manner. The design of the database is based on a hybrid of the entity-attribute-value (EAV) model and the EAV-CR, which incorporates the concepts of classes and relationships. Additionally, it offers an intuitive user interface that provides easy and quick access to the platform’s functionality: automated data integration and integrity validation, versioning and roll-back strategy, data retrieval as well as semi-automatic data mining and machine learning capabilities. The platform will support MCC/IMS-based biomarker identification and validation. The software, schemata, data sets and further information are publicly available at http://imsdb.mpi-inf.mpg.de.
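The EAV idea with classes and relationships can be made concrete with a small sketch using Python's built-in sqlite3 module. The schema below is a generic illustration, not the schema of the platform described above: all table names, column names, helper functions and example values are hypothetical.

```python
import sqlite3
from typing import Dict, List


def create_schema(conn: sqlite3.Connection) -> None:
    """Create a generic EAV layout plus a relationship table (hypothetical)."""
    conn.executescript("""
        CREATE TABLE entity (id INTEGER PRIMARY KEY, class TEXT NOT NULL);
        CREATE TABLE attribute (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
        CREATE TABLE eav_value (
            entity_id INTEGER NOT NULL REFERENCES entity(id),
            attribute_id INTEGER NOT NULL REFERENCES attribute(id),
            value TEXT NOT NULL
        );
        CREATE TABLE relationship (
            subject_id INTEGER NOT NULL REFERENCES entity(id),
            predicate TEXT NOT NULL,
            object_id INTEGER NOT NULL REFERENCES entity(id)
        );
    """)


def add_entity(conn: sqlite3.Connection, entity_class: str,
               attributes: Dict[str, object]) -> int:
    """Insert an entity of the given class; store each attribute as one row."""
    entity_id = conn.execute(
        "INSERT INTO entity (class) VALUES (?)", (entity_class,)).lastrowid
    for name, value in attributes.items():
        conn.execute("INSERT OR IGNORE INTO attribute (name) VALUES (?)", (name,))
        attribute_id = conn.execute(
            "SELECT id FROM attribute WHERE name = ?", (name,)).fetchone()[0]
        conn.execute(
            "INSERT INTO eav_value (entity_id, attribute_id, value) VALUES (?, ?, ?)",
            (entity_id, attribute_id, str(value)))
    return entity_id


def relate(conn: sqlite3.Connection, subject_id: int, predicate: str,
           object_id: int) -> None:
    """Record a typed relationship between two entities."""
    conn.execute(
        "INSERT INTO relationship (subject_id, predicate, object_id) VALUES (?, ?, ?)",
        (subject_id, predicate, object_id))


def get_attributes(conn: sqlite3.Connection, entity_id: int) -> Dict[str, str]:
    """Reassemble the attribute rows of one entity into a dictionary."""
    rows = conn.execute(
        "SELECT a.name, v.value FROM eav_value v "
        "JOIN attribute a ON a.id = v.attribute_id WHERE v.entity_id = ?",
        (entity_id,))
    return dict(rows.fetchall())


def related(conn: sqlite3.Connection, subject_id: int, predicate: str) -> List[int]:
    """Return the ids of all entities linked from subject_id via predicate."""
    rows = conn.execute(
        "SELECT object_id FROM relationship WHERE subject_id = ? AND predicate = ?",
        (subject_id, predicate))
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
create_schema(conn)
patient = add_entity(conn, "patient", {"age": 61})            # made-up example data
measurement = add_entity(conn, "measurement", {"device": "MCC/IMS"})
relate(conn, measurement, "taken_from", patient)
```

The design trade-off is the usual one for EAV: new attribute types need no schema change, at the price of values being stored untyped (here as text) and queries requiring joins.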
CoryneRegNet is an ontology-based data warehouse of corynebacterial transcription factors and regulatory networks. Initially, it was designed to provide methods for the analysis and visualization of the gene regulatory network of Corynebacterium glutamicum. We have now integrated the genomes and transcriptional interactions of three other corynebacteria, C. diphtheriae, C. efficiens, and C. jeikeium, into CoryneRegNet, providing comparative analysis and visualization with GraphVis. We also integrated the high-performance PSSM search tool PoSSuM search to detect potential transcription factor binding sites within and across species. As an application, we reconstruct in silico the regulatory network of the iron metabolism regulator DtxR in the four corynebacteria.
CoryneRegNet is freely accessible at https://www.cebitec.uni-bielefeld.de/groups/gi/software/coryneregnet/. The final slash (/) is mandatory. In order to use the GraphVis feature, Java (at least version 1.4.2) is required.
Comparative analysis of biological networks is a major problem in computational integrative systems biology. By computing the maximum common edge subgraph between a set of networks, one is able to detect conserved substructures between them and quantify their topological similarity. To aid such analyses we have developed CytoMCS, a Cytoscape app for computing inexact solutions to the maximum common edge subgraph problem for two or more graphs. Our algorithm uses an iterative local search heuristic for computing conserved subgraphs, optimizing a squared edge conservation score that is able to detect not only fully conserved edges but also partially conserved edges. It can be applied to any set of directed or undirected, simple graphs loaded as networks into Cytoscape, e.g. protein-protein interaction networks or gene regulatory networks. CytoMCS is available as a Cytoscape app at http://apps.cytoscape.org/apps/cytomcs.
IMS2 is an Integrated Medical Software system for the analysis of Ion Mobility Spectrometry (IMS) data. It assists medical staff with the following IMS data processing steps: acquisition, visualization, classification, and annotation. IMS2 provides data analysis and interpretation features and also helps to improve the classification by increasing the number of pre-classified data sets. It is designed to facilitate early detection of lung cancer, one of the most common cancer types with one million deaths each year around the world.
After reviewing the IMS technology, we first describe the software architecture of IMS2 and then the integrated classification module, including necessary pre-processing steps and different classification methods. The Lung Hospital Hemer (Germany) provided IMS data of 35 patients suffering from lung cancer and 72 samples of healthy persons. IMS2 correctly classifies 99% of the samples, evaluated using 10-fold cross-validation.
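The evaluation scheme named above, 10-fold cross-validation, can be sketched generically. The sketch does not reproduce the IMS2 classifiers or pre-processing: the interleaved (unshuffled, unstratified) fold assignment, the toy nearest-mean classifier and all names are assumptions made for illustration.

```python
from typing import Callable, List, Sequence

Predictor = Callable[[float], int]
Trainer = Callable[[List[float], List[int]], Predictor]


def k_fold_indices(n_samples: int, k: int) -> List[List[int]]:
    """Assign indices 0..n-1 to k interleaved folds whose sizes differ by at most one."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    return [list(range(i, n_samples, k)) for i in range(k)]


def cross_validate(samples: Sequence[float], labels: Sequence[int],
                   train: Trainer, k: int = 10) -> float:
    """Train on k-1 folds, test on the held-out fold, and return overall accuracy."""
    correct = 0
    for held_out in k_fold_indices(len(samples), k):
        held = set(held_out)
        train_x = [x for i, x in enumerate(samples) if i not in held]
        train_y = [y for i, y in enumerate(labels) if i not in held]
        predict = train(train_x, train_y)
        correct += sum(1 for i in held_out if predict(samples[i]) == labels[i])
    return correct / len(samples)


def nearest_mean_trainer(xs: List[float], ys: List[int]) -> Predictor:
    """Toy classifier: predict the class whose training mean is closest."""
    means = {}
    for label in set(ys):
        values = [x for x, y in zip(xs, ys) if y == label]
        means[label] = sum(values) / len(values)
    return lambda x: min(means, key=lambda label: abs(x - means[label]))


# Two well-separated one-dimensional toy classes (made-up numbers).
samples = [float(v) for v in range(10)] + [float(v) for v in range(20, 30)]
labels = [0] * 10 + [1] * 10
accuracy = cross_validate(samples, labels, nearest_mean_trainer, k=10)
```

Every sample is used for testing exactly once and never by a model that saw it during training, which is what makes the reported accuracy an out-of-sample estimate. With imbalanced classes a stratified split would usually be preferable.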
Electronic laboratory notebooks (ELNs) are more accessible and reliable than their paper-based alternatives and are thus widely adopted. While a large number of commercial products are available, small- to mid-sized laboratories often cannot afford the costs or are concerned about the longevity of the providers. Turning towards free alternatives, however, raises questions about data protection, which are not sufficiently addressed by available solutions. To serve as legal documents, ELNs must prevent scientific fraud through technical means such as digital signatures. It would also be advantageous if an ELN were integrated with a laboratory information management system to allow for a comprehensive documentation of experimental work, including the location of samples that were used in a particular experiment. Here, we present OpenLabNotes, which adds state-of-the-art ELN capabilities to OpenLabFramework, a powerful and flexible laboratory information management system. In contrast to comparable solutions, it protects the intellectual property of its users by offering data protection with digital signatures. OpenLabNotes effectively closes the gap between research documentation and sample management, thus making OpenLabFramework more attractive for laboratories that seek to increase productivity through electronic data management.
Distinct bacteria are able to cope with highly diverse lifestyles; for instance, they can be free living or host-associated. Thus, these organisms must possess a large and varied genomic arsenal to withstand different environmental conditions. To facilitate the identification of genomic features that might influence bacterial adaptation to a specific niche, we introduce LifeStyle-Specific-Islands (LiSSI). LiSSI combines evolutionary sequence analysis with statistical learning (Random Forest with feature selection, model tuning and robustness analysis). In summary, our strategy aims to identify conserved consecutive homology sequences (islands) in genomes and to identify the most discriminant islands for each lifestyle.
The explosion of biological data has largely influenced the focus of today’s biology research. Integrating and analysing large quantities of data to provide meaningful insights has become the main challenge for biologists and bioinformaticians. One major problem is the combined analysis of data of different types, such as phenotypes and genotypes. Such data is modelled as bi-partite graphs where nodes correspond to the different data points, mutations and diseases for instance, and weighted edges represent associations between them. Bi-clustering is a special case of clustering designed for partitioning two different types of data simultaneously. We present a bi-clustering approach that solves the NP-hard weighted bi-cluster editing problem by transforming a given bi-partite graph into a disjoint union of bi-cliques. Here we contribute an exact algorithm that is based on fixed-parameter tractability. We first evaluated its performance on artificial graphs. Afterwards, as an example, we applied our Java implementation to data from genome-wide association studies (GWAS), aiming to discover new, previously unobserved geno-to-pheno associations. We believe that our results will serve as guidelines for further wet lab investigations. Generally, our software can be applied to any kind of data that can be modelled as bi-partite graphs. To our knowledge it is the fastest exact method for the weighted bi-cluster editing problem.
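The objective of weighted bi-cluster editing can be illustrated by computing the cost of one candidate solution. The sketch below is not the exact fixed-parameter algorithm of the abstract and is written in Python rather than Java. It assumes the common formulation in which a positive weight marks an existing edge (cost of deleting it) and a negative weight marks a missing edge (cost of inserting it). Pairs absent from the weight table are ignored, and all names and numbers are hypothetical.

```python
from typing import Dict, Tuple


def editing_cost(weights: Dict[Tuple[str, str], float],
                 left_cluster: Dict[str, int],
                 right_cluster: Dict[str, int]) -> float:
    """Cost of turning the bi-partite graph into the given union of bi-cliques.

    weights maps (left node, right node) to a similarity: > 0 means the edge
    exists, < 0 means it does not. The cluster dictionaries assign each node
    to a bi-cluster id; nodes sharing an id must end up fully connected.
    """
    cost = 0.0
    for (u, v), w in weights.items():
        same_cluster = left_cluster[u] == right_cluster[v]
        if w > 0 and not same_cluster:
            cost += w        # existing edge between bi-clusters must be deleted
        elif w < 0 and same_cluster:
            cost += -w       # missing edge inside a bi-cluster must be inserted
    return cost


# Toy mutation/disease graph with made-up weights.
weights = {("m1", "d1"): 2.0, ("m1", "d2"): -1.0,
           ("m2", "d1"): 0.5, ("m2", "d2"): 3.0}
two_clusters = editing_cost(weights, {"m1": 0, "m2": 1}, {"d1": 0, "d2": 1})
one_cluster = editing_cost(weights, {"m1": 0, "m2": 0}, {"d1": 0, "d2": 0})
```

In this toy case splitting into two bi-clusters deletes only the weak m2-d1 edge, whereas merging everything into one bi-clique requires inserting the m1-d2 edge at a higher cost. An exact method searches for the assignment that minimizes this cost.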