The steadily growing importance of crop plants for nutrition and industrial use makes it necessary to improve their properties continuously. To support life science research in this area, the information system MetaCrop was developed, which manages data on metabolic networks (pathways) of agronomically important organisms. Besides relying exclusively on manually curated, and therefore high-quality, information, the system focuses on the high granularity of the stored data, which is intended to capture the high biological complexity of plants. MetaCrop is available at http://metacrop.ipk-gatersleben.de.
Endothelin-1 (ET-1), with its vasoconstrictive and proliferation-stimulating effects, could play a role in the pathogenesis of primary pulmonary hypertension. We investigated the relationship between the ET-1 like immunoreactivity and the ET-receptor density, the grade of the pulmonary vasculopathy, and properties of the pulmonary circulation in patients with pulmonary hypertension due to congenital heart disease.
Twenty-six patients with a median age of 1 year and 1 month (range: 6 weeks to 17 years 9 months) were assigned to group I (n = 15) with a pulmonary to systemic flow ratio (Qp/Qs) ≥ 1.5 and a pulmonary to systemic resistance ratio (Rp/Rs) ≤ 0.3 (“high flow–low resistance group”) and to group II (n = 11) with a Qp/Qs < 1.5 and an Rp/Rs > 0.3 (“low flow–high resistance group”).
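The group assignment described above can be written as a small rule. The following is a minimal sketch: the function name `assign_group` and the handling of value pairs that meet neither set of criteria (returning `None`) are assumptions made for illustration and are not taken from the study.

```python
from typing import Optional


def assign_group(qp_qs: float, rp_rs: float) -> Optional[str]:
    """Return "I", "II", or None using the thresholds stated in the text.

    qp_qs: pulmonary to systemic flow ratio (Qp/Qs)
    rp_rs: pulmonary to systemic resistance ratio (Rp/Rs)
    """
    # Group I: high flow, low resistance
    if qp_qs >= 1.5 and rp_rs <= 0.3:
        return "I"
    # Group II: low flow, high resistance
    if qp_qs < 1.5 and rp_rs > 0.3:
        return "II"
    # Assumption: combinations outside both definitions stay unassigned
    return None
```

For example, `assign_group(2.0, 0.2)` falls into group I, while `assign_group(1.2, 0.5)` falls into group II.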
Patients belonging to group II showed a higher ETA-receptor density in lung arteries (p < 0.05) and parenchyma (p < 0.01) than patients in group I. Patients with the highest ET-1 like immunoreactivity in lung artery walls also showed a trend towards a higher ETA-receptor density. The ETB-receptor expression was low and not related to any of the above factors.
Our results suggest that the paracrine lung ET-1 system is up-regulated in pediatric patients with secondary pulmonary hypertension associated with congenital heart disease.
Search engines and retrieval systems are popular tools on the life science desktop. Manually inspecting hundreds of database entries that reflect a life science concept or fact is time-intensive daily work. What matters here is not the number of query results but their relevance. In this paper, we present the LAILAPS search engine for life science databases. The concept is to combine a novel feature model for relevance ranking, a machine learning approach to model user relevance profiles, ranking improvement by user feedback tracking, and an intuitive and slim web user interface that estimates relevance rank by tracking user interactions. Queries are formulated as simple keyword lists and are expanded by synonyms. Supporting a flexible text index and a simple data import format, LAILAPS can easily be used both as a search engine for comprehensive integrated life science databases and for small in-house project databases.
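As an illustration of expanding a keyword query by synonyms, the following minimal sketch looks each keyword up in a synonym table and appends any matches. The function `expand_query` and the synonym table are hypothetical and only show the general idea; they do not reproduce how LAILAPS performs the expansion.

```python
from typing import Dict, List


def expand_query(keywords: List[str], synonyms: Dict[str, List[str]]) -> List[str]:
    """Expand a keyword list with synonyms, keeping order and dropping duplicates."""
    expanded: List[str] = []
    seen = set()
    for keyword in keywords:
        # The keyword itself comes first, followed by its synonyms (if any).
        for term in [keyword] + synonyms.get(keyword.lower(), []):
            key = term.lower()
            if key not in seen:
                seen.add(key)
                expanded.append(term)
    return expanded


# Hypothetical synonym table, for illustration only.
SYNONYMS = {"barley": ["Hordeum vulgare"], "amylase": ["diastase"]}

if __name__ == "__main__":
    print(expand_query(["barley", "amylase", "seed"], SYNONYMS))
```

Keywords without an entry in the table, such as `seed` here, pass through unchanged.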
With a set of features extracted from each database hit, in combination with user relevance preferences, a neural network predicts user-specific relevance scores. Using expert knowledge as training data for a predefined neural network, or users' own relevance training sets, a reliable relevance ranking of database hits has been implemented.
In this paper, we present the LAILAPS system, its concepts, benchmarks and use cases. LAILAPS is publicly available for SWISSPROT data at http://lailaps.ipk-gatersleben.de
Newborn screening (NBS) is an established screening procedure in many countries worldwide, aiming at the early detection of inborn errors of metabolism. For decades, dried blood spots have been the standard specimen for NBS. The procedure of blood collection is well described and standardized and includes many critical pre-analytical steps. We examined the impact of contamination with some anticipated common substances on NBS results obtained from dried blood spot samples. This possible pre-analytical source of uncertainty has been poorly examined in the past.
Capillary blood was obtained from 15 adult volunteers and applied to 10 screening filter papers per volunteer. Nine filter papers were contaminated without visible trace. The contaminants were baby diaper rash cream, baby wet wipes, disinfectant, liquid infant formula, liquid infant formula hypoallergenic (HA), ultrasonic gel, breast milk, feces, and urine. The differences between control and contaminated samples were evaluated for 45 NBS quantities. We estimated whether the contaminations might lead to false-positive NBS results.
Eight of nine investigated contaminants significantly altered NBS analyte concentrations and potentially caused false-positive screening outcomes. A contamination with feces was most influential, affecting 24 of 45 tested analytes followed by liquid infant formula (HA) and urine, affecting 19 and 13 of 45 analytes, respectively.
A contamination of filter paper samples can have a substantial effect on the NBS results. Our results underline the importance of good pre-analytical training to make the staff aware of the threat and ensure reliable screening results.
Knowledge found in biomedical databases, in particular in Web information systems, is a major bioinformatics resource. In general, this biological knowledge is represented worldwide in a network of databases. These data are spread among thousands of databases, which overlap in content but differ substantially with respect to content detail, interface, formats and data structure.
To support the functional annotation of lab data, such as protein sequences, metabolites or DNA sequences, as well as semi-automated data exploration in information retrieval environments, an integrated view of databases is essential. Search engines have the potential to assist in data retrieval from these structured sources, but fall short of providing a comprehensive knowledge excerpt from the interlinked databases. A prerequisite for supporting the concept of an integrated data view is to acquire insights into cross-references among database entities. This is hampered by the fact that only a fraction of all possible cross-references are explicitly tagged in the particular biomedical information systems.
In this work, we investigate to what extent an automated construction of an integrated data network is possible. We propose a method that predicts and extracts cross-references from multiple life science databases and possible referenced data targets. We study the retrieval quality of our method and report on first, promising results. The method is implemented as the tool IDPredictor, which is published under the DOI 10.5447/IPK/2012/4 and is freely available at http://dx.doi.org/10.5447/IPK/2012/4.
To support the interpretation of measured molecular facts, such as gene expression experiments or EST sequencing, the functional or systems biology context has to be considered. To do so, the relationship to existing biological knowledge has to be discovered. In general, biological knowledge is represented worldwide in a network of databases. In this paper we present a method for knowledge extraction from life science databases, which spares scientists screen scraping and web clicking approaches.
We developed a method for the extraction of knowledge networks from distributed, heterogeneous life science databases. To cope with the very large data volume, the method is based on the concept of data linkage graphs (DLG). We present efficient software that enables the joining of millions of data points over hundreds of databases. To motivate possible applications, we computed networks of protein knowledge, which interconnect metabolic, disease, enzyme and gene function data.
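To illustrate the idea of joining entries across databases, the sketch below treats cross-references as edges of an undirected graph and collects every entry reachable from a starting entry by breadth-first search. This is an assumption about how a data linkage graph might be traversed, not the authors' implementation; the entry identifiers and the helper names `build_graph` and `linked_entries` are hypothetical.

```python
from collections import defaultdict, deque
from typing import Dict, Iterable, Set, Tuple

Entry = Tuple[str, str]  # (database name, entry identifier)


def build_graph(links: Iterable[Tuple[Entry, Entry]]) -> Dict[Entry, Set[Entry]]:
    """Build an undirected adjacency map from cross-reference pairs."""
    graph: Dict[Entry, Set[Entry]] = defaultdict(set)
    for source, target in links:
        graph[source].add(target)
        graph[target].add(source)
    return graph


def linked_entries(graph: Dict[Entry, Set[Entry]], start: Entry) -> Set[Entry]:
    """Return all entries connected to `start`, including `start` itself."""
    visited = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        # .get() avoids creating new keys for entries without links
        for neighbour in graph.get(current, ()):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(neighbour)
    return visited


# Hypothetical cross-references, for illustration only.
LINKS = [
    (("protein", "P1"), ("enzyme", "E1")),
    (("enzyme", "E1"), ("pathway", "M1")),
    (("gene", "G9"), ("disease", "D2")),
]

if __name__ == "__main__":
    print(sorted(linked_entries(build_graph(LINKS), ("protein", "P1"))))
```

Starting from the protein entry, the traversal reaches the enzyme and pathway entries but not the unrelated gene and disease pair.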
The computed networks established a holistic relationship between measured experimental facts and the combined biological knowledge. This was successfully applied to a high-throughput functional classification of barley EST and gene expression experiments, with the perspective of an automated pipeline for the provisioning of controlled annotation of plant gene arrays and chips.
Availability: The data linkage graphs (XML or TGF format), the integrated database schema (GML or GRAPH-ML) and the graph computation software may be downloaded from the following URL: http://pgrc.ipk-gatersleben.de/dlg/
Mapping genetic variance within the genotypes of a population to phenotypic variance in a systematic and high-throughput manner is of interest for biodiversity and breeding research. Besides the established and efficient high-throughput genotyping technologies, phenotyping capabilities have received increased attention in the last decade. This results in an increasing amount of phenotype data from well-scaling, automated sensor platforms. Data stewardship is therefore a central component in making experimental data from multiple domains interoperable and reusable. To ensure standardized and comprehensive sharing of scientific and experimental data among domain experts, the FAIR data principles are applied for machine readability and scalability. In this context, the BrAPI consortium provides comprehensive and commonly agreed FAIR guidelines for offering scientific data through a BrAPI layer in a RESTful manner. This paper presents the concepts, best practices and implementations to meet these challenges. For the IPK-Gatersleben, one of the world's leading plant research institutes, it is of vital interest to transform legacy data infrastructures into a bio-digital resource center for plant genetic resources (PGR). This paper also demonstrates the benefits of integrated database back-ends, established data stewardship processes, and FAIR data exposition through machine-readable, highly scalable programmatic interfaces.
Efficient and effective information retrieval in the life sciences is one of the most pressing challenges in bioinformatics. The incredible growth of life science databases into a vast network of interconnected information systems is both a big challenge and a great opportunity for life science research. The knowledge found on the Web, in particular in life science databases, is a valuable major resource. To bring it to the scientist's desktop, well-performing search engines are essential. Here, neither the response time nor the number of results is the most important factor. For millions of query results, the most crucial factor is the relevance ranking.
In this paper, we present a feature model for relevance ranking in life science databases and its implementation in the LAILAPS search engine. Motivated by observing user behavior during the inspection of search engine results, we condensed a set of 9 relevance-discriminating features. These features are used intuitively by scientists who briefly screen database entries for potential relevance. The features are both sufficient to estimate the potential relevance and efficiently quantifiable.
The derivation of a relevance prediction function that computes the relevance from these features constitutes a regression problem. To solve this problem, we used artificial neural networks that have been trained with a reference set of relevant database entries for 19 protein queries.
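Framed as regression, relevance prediction maps a feature vector to a numeric score that can then be used for sorting. The sketch below shows only the forward pass of a small feed-forward network with one hidden layer. The weights, the use of three features instead of nine, the activation functions and all names are made up for illustration; this is not the trained LAILAPS model.

```python
import math
from typing import Dict, List, Sequence


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def predict_relevance(
    features: Sequence[float],
    hidden_weights: Sequence[Sequence[float]],
    hidden_bias: Sequence[float],
    output_weights: Sequence[float],
    output_bias: float,
) -> float:
    """Forward pass: features -> tanh hidden layer -> sigmoid score in (0, 1)."""
    hidden = [
        math.tanh(sum(w * f for w, f in zip(weights, features)) + bias)
        for weights, bias in zip(hidden_weights, hidden_bias)
    ]
    return sigmoid(sum(w * h for w, h in zip(output_weights, hidden)) + output_bias)


# Hypothetical, untrained weights for 3 features and 2 hidden units.
HIDDEN_W = [[0.8, 0.3, 0.5], [0.2, 0.9, 0.4]]
HIDDEN_B = [-0.5, -0.3]
OUTPUT_W = [1.2, 0.7]
OUTPUT_B = -0.4


def rank_hits(hits: Dict[str, List[float]]) -> List[str]:
    """Sort hit identifiers by predicted relevance, highest first."""
    def score(name: str) -> float:
        return predict_relevance(hits[name], HIDDEN_W, HIDDEN_B, OUTPUT_W, OUTPUT_B)
    return sorted(hits, key=score, reverse=True)


if __name__ == "__main__":
    # Hypothetical feature vectors for two database hits.
    print(rank_hits({"hit_b": [0.2, 0.1, 0.3], "hit_a": [0.9, 0.8, 0.7]}))
```

In a real setting the weights would be fitted to a reference set of judged entries rather than chosen by hand.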
Supporting a flexible text index and a simple data import format, these concepts are implemented in the LAILAPS search engine. It can easily be used both as a search engine for comprehensive integrated life science databases and for small in-house project databases. LAILAPS is publicly available for SWISSPROT data at http://lailaps.ipk-gatersleben.de