Documentation of research data plays a key role in biomedical engineering innovation processes. It provides the evidence needed to prove an invention, contributes to fulfilling regulatory requirements and helps to meet the high standards of funding authorities, industrial partners and society. Additionally, documentation directly affects quality issues such as traceability and repeatability. Furthermore, sophisticated documentation contributes to fulfilling the provisions and aims of good scientific practice and good laboratory practice.
For decades, the paper-bound laboratory notebook (LN) has been used successfully for producing quality documentation. Most scientists record measurements, interim results and ideas in their individual notebooks (see Figure 1).
At the same time, working methods and processes in biomedical engineering have changed massively over the last decade. The increased use of computers and electronic measuring instruments has forced scientists in academic and other environments to manage increasingly diverse digital datasets from various sources [4, 15], resulting in a mix of digital and analogue data. This development is accelerated by the increasing application of data analysis software and of high-throughput systems (the number of publications listed in PubMed for the search term “biomedical engineering AND high-throughput” increased from 123 (2003–2007) to 414 (2008–2012), and for “biomedical engineering AND automation” from 136 to 350). Thus, it can be hypothesized that an electronic alternative to the common paper-bound notebooks could improve the quality of documentation.
Electronic laboratory notebooks (ELNs) are well established as part of the quality management system in chemical and pharmaceutical industrial laboratories. Applications of electronic variants of the LN in a commercial environment were seen as responsible for a 20% growth in efficiency. In recent years, the range of ELN software has grown quickly and now extends from open source to commercial subject-specific products. The latest developments can be installed on tablet PCs or multi-touch tables.
Currently, an interface that connects ELNs to an external authentication point is being developed in a project funded by the German Research Foundation (DFG). With this interface, the electronically stored data can automatically be equipped with a “qualified electronic signature”, which preserves the data with a high degree of authenticity.
Despite the success of ELNs in a commercial environment and an application rate of 50% in the pharmaceutical industry, only 4% of LNs in academia are electronic. The exact reason for this imbalance is unknown. There is a lack of information and comparative studies in the field of ELNs in academic environments. Therefore, a survey on documentation in laboratories was conducted.
This paper aims to establish a prediction model for scientists' preference for, and acceptance of, paper-bound or electronic documentation using binary logistic regression.
An online survey was conducted to determine the experiences of scientists using LNs and their attitudes towards documentation. In a multistage process, questions that emerged from a literature search were categorized using brainstorming and relevance tests. This resulted in a questionnaire focusing on the use of LNs, collaboration within a laboratory, documentation with LNs, ELNs and demographic background. The method was validated in a pretest with five scientists, who examined the redundancy, user motivation, scalability, intelligibility and uniqueness of the questions. The final questionnaire, consisting of 36 items, was implemented with the open source software LimeSurvey. The survey language was German. For more details see Supplementary Information. More than 200 researchers in academic as well as industrial laboratories in Germany were invited to participate in the anonymous online survey. They were contacted at a fair (MEDICA 2012 in Düsseldorf), at a congress (BMT 2012 – Biomedical Technology Congress in Jena) or via e-mail distribution lists (project “innovating medical technology in.nrw”, Institute of Applied Medical Engineering RWTH Aachen University and regional networks). In total, 105 contributions (101 of them complete) were received between March and November 2012. The participation rate of researchers who received a personal invitation by e-mail was 68.4% (39/57). The other researchers participated via a registration link.
The prediction model was developed using binary logistic regression, which is used to identify the relationship between one or more predictor variables and a categorical outcome variable in a univariate or multivariate approach. In this paper, binary logistic regression is used to analyze the influence of 18 factors (predictor variables, see Table 1) on the dichotomous outcome variable system preference (LN or ELN). Unlike other statistical methods, the “(…) logistic regression analysis does not require that the data are drawn from a multivariate normal distribution with equal variances and covariances for the explanatory variables.” [18, page 1390].
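For orientation, the general form of a binary logistic regression model can be written as follows. This is the standard textbook formulation, not a formula taken from the study itself. The symbols are assumptions introduced here: $Y=1$ is assumed to denote a preference for ELN, $x_1,\dots,x_k$ are the predictor variables and $\beta_0,\dots,\beta_k$ the regression coefficients.

```latex
\[
\operatorname{logit}(p) = \ln\frac{p}{1-p} = \beta_0 + \sum_{i=1}^{k} \beta_i x_i,
\qquad
p = P(Y = 1 \mid x_1,\dots,x_k)
  = \frac{1}{1 + e^{-\left(\beta_0 + \sum_{i=1}^{k} \beta_i x_i\right)}}
\]
```

In this formulation, $e^{\beta_i}$ is the odds ratio associated with a one-unit change in $x_i$ while the other predictors are held constant.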
In the literature a minimum sample size between 50 and 100 is recommended [5, 23]. The required size increases with the number of predictor variables. Considering this, a pre-selection of relevant variables was performed. In this first step, each predictor is tested individually for statistical significance in a univariate logistic regression. In order to include all influencing factors, the predictors with a significance level of p≤20% in the univariate model are then tested for significance in a multivariate model. In the second step, insignificant predictors with p>5% are excluded iteratively from the prediction model (see Figure 2).
Model fit in a logistic regression is sensitive to collinearity among the predictors. Therefore, the dependence between the predictor variables is examined and considered in the model. Finally, validation tests for the overall model and goodness-of-fit are executed (overall statistics, Hosmer and Lemeshow test, Nagelkerke R2, omnibus test). All statistical calculations were carried out using SPSS (v.19).
The survey participants were asked about their professional backgrounds (multiple replies possible). The free-text answers were categorized into the fields of engineering, natural sciences and human medicine based on the subject classification for personnel at universities.
The variable affinity for technology is based on five questions about the frequency of using laptops, tablet PCs, smartphones, social networks and Internet forums. In order to identify more and less technically able researchers, the answers were weighted with never=0, less often than weekly=1, weekly=2 and daily=3. The cumulative results of these five questions reflect the level of affinity for technology.
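The scoring rule for affinity for technology can be illustrated with a short Python sketch. The weights and the five items come from the description above. The dictionary layout, the item keys and the function name are assumptions made for this example.

```python
# Weights as described in the text: never=0, less often than weekly=1, weekly=2, daily=3.
WEIGHTS = {"never": 0, "less often than weekly": 1, "weekly": 2, "daily": 3}

# The five usage questions; the key names are chosen for this sketch.
ITEMS = ("laptop", "tablet PC", "smartphone", "social networks", "internet forums")


def affinity_for_technology(answers: dict) -> int:
    """Sum the weighted answers of the five items (possible range: 0 to 15)."""
    return sum(WEIGHTS[answers[item]] for item in ITEMS)


example_answers = {
    "laptop": "daily",
    "tablet PC": "never",
    "smartphone": "daily",
    "social networks": "weekly",
    "internet forums": "less often than weekly",
}
example_score = affinity_for_technology(example_answers)  # 3 + 0 + 3 + 2 + 1 = 9
```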
Another issue, the anticipated strengths of the systems LN and ELN, was analyzed in ten different categories (see Table 1). For each category the question offered three answer options:
(1) LN has an advantage over ELN
(2) Neither LN nor ELN has an advantage
(3) ELN has an advantage over LN
It is to be expected that factors determining the researcher’s system choice (LN/ELN) are also seen as strengths of the respective system. In order to interpret the factors with regard to the low rate of ELNs in the academic environment, the answers (2) and (3) are combined into “LN has no advantage over ELN”. In the logistic regression analysis, missing answers are excluded.
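The recoding of the three answer options into a binary predictor can be sketched as follows. The numeric codes 1 to 3 for the answer options, the 0/1 coding of the result and the use of `None` for a missing answer are assumptions made for this example, not details taken from the study.

```python
from typing import List, Optional


def recode_strength(answer: Optional[int]) -> Optional[int]:
    """Map answer 1 to 1 ("LN has an advantage") and answers 2 or 3 to 0
    ("LN has no advantage over ELN"); a missing answer stays None."""
    if answer is None:
        return None
    if answer not in (1, 2, 3):
        raise ValueError(f"unexpected answer code: {answer}")
    return 1 if answer == 1 else 0


def drop_missing(values: List[Optional[int]]) -> List[int]:
    """Exclude missing answers, as done before the regression analysis."""
    return [value for value in values if value is not None]


raw_answers = [1, 2, None, 3, 1]  # hypothetical answers for one category
recoded = drop_missing([recode_strength(answer) for answer in raw_answers])
```

Because missing answers are dropped per question, the number of usable cases can differ from one predictor to the next.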
Characteristics of the survey sample
The survey sample is dominated by young academic scientists working predominantly with paper-based LNs: 69% of the respondents are aged between 24 and 35 years and just as many have used an LN within the past month. One hundred and nine of 134 professional backgrounds could be assigned to the fields of engineering, natural sciences and human medicine. Most respondents have an educational background in natural science (68), followed by engineering (31) and medicine (10). The majority (73%) work in a university environment, followed by research facilities such as the Fraunhofer-Gesellschaft or Max-Planck-Gesellschaft (12%). Additional characteristics of the survey sample are given in Table 2. Among participants in an academic environment, the share with experience in using ELNs was 5% (4/74), which confirms the rate reported in the literature.
The results of the pre-selection of relevant variables are presented in Table 1. Gender, individual affinity, amount of work, mobility, data backup, affinity for technology and protection of intellectual property are significant at the 20% level. The response frequencies for data backup, quality assurance in the laboratory, team work, readability of the data, information exchange with colleagues and information exchange with cooperation partners are strongly imbalanced between the answer groups. The variation in the total number of respondents between the individual logistic regressions is explained by the varying number of missing answers to each question.
Multivariable logistic regression
Table 3 presents the results of the final logistic regression, including Wald statistics testing the significance of the individual regression coefficients. The significant predictors (p≤5%) amount of work, mobility and personal affinity are integrated in the logistic regression model. To evaluate the prediction model, tests of the overall model and of goodness-of-fit were carried out. The overall statistics (score=46.567; df=4; p<0.001) imply that at least one of the independent variables contributes to the prediction of the outcome. The prediction model with the three predictors was more effective than the model without predictor variables (null model). Goodness-of-fit was assessed with the Hosmer and Lemeshow test (χ2=1.773; df=4; p=0.777). Because the null hypothesis of this test is a good model fit, the non-significant result suggests that the model describes the data adequately. Further confirmation comes from Nagelkerke R2=0.728 and the omnibus test (χ2=55.606; df=4; p<0.001). In a cross-classification of the observed and predicted values of the outcome variable (cut-off value=0.5), the model predicts 89.0% of the cases correctly (null model: 63%). This is a further indication of a good model fit.
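The reported p-values and the classification rate can be related to their inputs with a small Python sketch. It uses the closed-form upper-tail probability of the chi-square distribution, which is valid for even degrees of freedom only. The function names are assumptions of this example, as is the convention that a predicted probability equal to the cut-off counts as the positive class. The probabilities passed to `classification_rate` in any usage would be model outputs, which are not available here.

```python
import math
from typing import Sequence


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail probability P(X > x) of a chi-square variable with even df.

    Closed form: exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!
    """
    if df <= 0 or df % 2:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 0.0
    for i in range(df // 2):
        if i > 0:
            term *= half / i  # builds (x/2)^i / i! incrementally
        total += term
    return math.exp(-half) * total


def classification_rate(
    probabilities: Sequence[float], observed: Sequence[int], cutoff: float = 0.5
) -> float:
    """Share of cases whose predicted class (probability >= cutoff) matches the observation."""
    predicted = [1 if p >= cutoff else 0 for p in probabilities]
    hits = sum(int(p == o) for p, o in zip(predicted, observed))
    return hits / len(observed)


# Test statistics as reported in the text, both with df=4.
hosmer_lemeshow_p = chi2_sf_even_df(1.773, 4)
omnibus_p = chi2_sf_even_df(55.606, 4)
```

With the Hosmer and Lemeshow statistic from the text, the function reproduces the reported p-value of about 0.777, and the omnibus statistic yields a value far below 0.001.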
The classification of the advantages of each system by the researcher is based mainly on presumptions. Merely 10% of respondents have experience in using ELNs.
Anticipated strengths of an electronic documentation
The pre-selection analysis displays no significant contribution of quality assurance in the laboratory, team work, readability of the data and information exchange with colleagues or information exchange with cooperation partners to system preference (see Table 1). Most of the 27 researchers preferring paper-bound documentation stated that the electronic variant is at least equal in these categories. In consequence, these variables are not decisive for the scientist’s system preference.
A more detailed analysis of the responses revealed that 63% of the LN-preferring researchers agreed that ELNs have an advantage over LNs in the field of information exchange with cooperation partners. For team work and readability of the data the percentage rises to 72%. Because they are independent of the individual system preference, and in consideration of the low level of ELN operational experience, these categories can be referred to as anticipated strengths of ELNs. This illustrates the potential of electronic documentation. It can be hypothesized that, particularly in applied sciences characterized by a high level of cooperation, the transfer between partners in academic and industrial environments would benefit from electronic documentation. In this way, the documentation could speed up the innovation process.
Influence of the predictors on the system preference
In the logistic regression analysis, three variables with a significant influence on the scientist’s documentation preference are identified, with amount of work being the most significant. Compared with researchers who see an advantage of paper-bound notebooks over electronic documentation systems, the odds “preferring ELN/preferring LN” are higher by a factor of 58.543 in the group “LN has no advantage”. To increase the acceptance of ELNs, researchers must be convinced that their workload would not increase by using an ELN. Thus, an adequate documentation system has to facilitate the respective workflow and should be adapted to the specific processes in an academic environment.
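To make the meaning of such an odds ratio concrete, the following Python sketch converts an odds ratio into the corresponding regression coefficient and shows how it shifts a probability. The odds ratio 58.543 is the value reported for amount of work. The baseline probability of 0.10 is a purely hypothetical input chosen for illustration, and the function names are assumptions of this example.

```python
import math


def coefficient_from_odds_ratio(odds_ratio: float) -> float:
    """Logistic regression coefficient that corresponds to a given odds ratio."""
    return math.log(odds_ratio)


def shifted_probability(baseline_probability: float, odds_ratio: float) -> float:
    """Probability after multiplying the baseline odds by the odds ratio."""
    baseline_odds = baseline_probability / (1.0 - baseline_probability)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)


beta_amount_of_work = coefficient_from_odds_ratio(58.543)  # roughly 4.07

# Hypothetical baseline: 10% ELN preference in the group "LN has an advantage".
hypothetical_shift = shifted_probability(0.10, 58.543)
```

An odds ratio multiplies odds, not probabilities, so the resulting probability depends on the baseline that is assumed.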
The predictor variable mobility is not significant at the 5% level but correlates with the predictor amount of work [p(amount of work by mobility)=0.018]. A lack of mobility is seen as contributing to the amount of work. Paper-bound notebooks are characterized by unrestricted mobility. An electronic documentation system relying on stationary computers, such as multi-touch tables, would have a negative influence on the system preference.
Another predictor with a highly significant impact on the system preference is personal affinity, with an odds ratio of 37.537. The wording personal affinity is vague and closely linked to the system preference question, so the high significance level (p<0.001) is not surprising.
However, it implies that the preference also depends on a personal aspect. This aspect could depend on additional influencing factors that vary from person to person, but also on the nature of the research. Previous studies report that researchers are reluctant to share data, which is explained by the highly competitive nature of biomedical research [1, 21]. The benefit of an ELN in such an environment is doubtful. This could also explain the low influence of anticipated strengths of ELNs, such as information exchange, on the system preference.
The affinity for technology, based on the frequency of using laptops, tablet PCs, smartphones, social networks and Internet forums, does not contribute significantly to the system preference. It is also noticeable that the system choice is independent of the research type, the professional background of the scientists and the interdisciplinary composition of the laboratory. Although a shift in science from individual working to teamwork has been described, and researchers anticipate better support from ELNs in this field, the influence on the system choice is small.
Impact of the prediction model
The prediction model supports laboratory managers searching for the most suitable documentation system. During the selection process two choices have to be made: first, deciding between paper-bound and electronic laboratory notebooks; second, selecting among the various ELN software solutions. The latter choice often depends on several aspects such as quality assurance, improvement of information transfer, costs and many others. In this context, scientists have stated that, “Insufficient user acceptance has long been an obstacle to the successful adoption of IT/IS” [25, page 67]. This is what the prediction model specifically addresses.
The model enables the identification of predictors that are crucial determinants of user acceptance of ELN software. The identified predictors in turn indicate performance requirements that have to be met by the selected ELN. The predictor amount of work, for example, suggests the need to support the scientists’ workflow: the better the workflow assistance, the higher the user acceptance. Thus, based on the prediction model, different ELNs can be ranked or rated. A follow-up study analyzing the influence of individual ELN functions on the user perception regarding the crucial predictors would complement the prediction model.
This study allows the probable acceptance level of ELNs to be assessed and compared on the basis of their functions before purchase or installation. The prediction model enables a better-informed decision on whether a system change is advisable.
We have shown that the acceptance of an ELN in a specific environment can be predicted. The documentation system preference in an academic environment can be described via a mathematical model. Because of the dependence of system choice on specific parameters, the acceptance of an ELN is predictable before implementation.
The significant factors are independent of the professional background, research type or age and enable the development of one common prediction model.
The possibility of identifying crucial factors for the scientist’s choice allows academic demands to be defined. ELNs can achieve high acceptance if they demonstrably lower the workload. Because of the low level of ELN operational experience in the academic environment, such evidence is an important part of gaining acceptance.
The documentation system preference of the researcher is independent of possible improvements offered by electronic documentation in the fields of quality management, information transfer and data backup, even though these items are perceived as strengths of ELNs. Whether an electronic alternative to the common paper-bound notebooks could foster the quality of documentation remains undetermined.
Placement of the thesis within the research context
Several studies aim to support scientists in a digital world by developing tools and technical infrastructure. Projects such as multi-touch tables for social scientists, new ELN concepts or the effort to connect ELNs to an external authentication point display the trend towards electronic applications supporting research processes. Our study complements these technical approaches by including user acceptance and motivation.
Constraints of the study
The survey sample consists of 101 participants, which limited the number of parameters in the prediction model. In particular, for the items information exchange, team work, readability and quality assurance, significance cannot be expected [8, 16].
The item affinity for technology is based on the frequency of using laptops, tablet PCs, smartphones, social networks and Internet forums. The number of scientists who regularly use tablet PCs is quite small: 71% of the respondents have never used a tablet PC. Considering the rapid spread of tablet PCs and their similarity to LNs in look and feel, an influence on the system preference cannot be excluded.
Unanswered questions and future research
Additional parameters could influence the system choice of the researcher. Including further influencing factors would be a reasonable way to improve the predictions of the model.
Whether the predictions of the model correspond to practice remains unresolved. An experimental investigation with different ELNs is essential.
The project is co-funded by the European Union (ERDF – European Regional Development Fund – Investing in your future) and the German federal state North Rhine-Westphalia (NRW), under the operational program “Regional Competitiveness and Employment” 2007–2013 (EFRE).
Campbell EG, Clarridge BR, Gokhale M, et al. Data withholding in academic medicine: characteristics of faculty denied access to research results and biomaterials. Res Policy 2002; 29: 303–12.
Ebel HF, Bliefert C, Greulich W. Schreiben und Publizieren in den Naturwissenschaften, 5th edition. Weinheim: Wiley-VCH 2006.
Elliott M. It's Not About the Paper. Available from: http://www.scientificcomputing.com/its-not-about-the-paper.aspx?terms=chemistry. (Last accessed March 2013.)
Feagan L, Rohrer J, Garrett A, et al. Bioinformatics process management: information flow via a computational journal. Source Code Biol Med 2007; 2: 9.
Fromm S. Binäre logistische Regressionsanalyse: eine Einführung für Sozialwissenschaftler mit SPSS für Windows. In: Otto-Friedrich-Universität Bamberg, editor. Bamberger Beiträge zur empirischen Sozialforschung. Bamberg 2005: 5–35.
Goddard NH, Macneil R, Ritchie J. eCAT: Online electronic lab notebook for scientific research. Autom Exp 2009; 1: 4.
Herrmann K, Möllers M, Wittenhagen M, et al. Ein interaktiver Multitouch-Tisch als multimediale Arbeitsumgebung. In: Schomburg S, Leggewie C, Lobin H, Puschmann C, editors. Digitale Wissenschaft: Stand und Entwicklung digital vernetzter Forschung in Deutschland. Köln 2011: 45–50.
Hosmer DW, Lemeshow S. Applied logistic regression. 2nd edition. New York: Wiley 2000.
Innovating medical technology IN.NRW Available from: http://medtec-innrw.de. (Last accessed March 2013.)
LimeSurvey. Available from: http://www.limesurvey.org. (Last accessed March 2013.)
Lu DL, Kowalski T, Uthaman S. Are laboratory notebooks necessary in a first inventor to file world? J Commerc Biotechnol 2012; 18: 67–68.
Macneil R. The benefits of integrated systems for managing both samples and experimental data: an opportunity for labs in universities and government research institutions to lead the way. Autom Exp 2011; 3: 2.
Multitouch Lab Journal. Available from: http://www.info-design.net/laborbuch. (Last accessed March 2013.)
Nelson MR, Reisinger SJ, Henry SG. Designing databases to store biological information. Biosilico 2003; 1: 134–142.
Norušis MJ. SPSS 14.0 advanced statistical procedures companion. Upper Saddle River, NJ: Prentice Hall 2005.
Peters B, Dewil R, Smets IY. Improved process control of an industrial sludge centrifuge-dryer installation through binary logistic regression modeling of the fouling issues. J Process Contr 2012; 22: 1387–1396.
Potthoff J, Rieger S. Elektronisches Laborbuch: Beweiswerterhaltung und Langzeitarchivierung in der Forschung. In: Schomburg S, Leggewie C, Lobin H, Puschmann C, editors. Digitale Wissenschaft: Stand und Entwicklung digital vernetzter Forschung in Deutschland. Köln 2011: 149–156.
Sally Lee E, McDonald DW, Anderson N, Tarczy-Hornoch P. Incorporating collaborator concepts into informatics in support of translational interdisciplinary biomedical research. Int J Med Inform 2009; 78: 10–21.
Statistisches Bundesamt. Bildung und Kultur, Personal an Hochschulen – Fächersystematik –, Available from: https://www.destatis.de/DE/Methoden/Klassifikationen/BildungKultur/PersonalStellenstatistik.pdf?__blob=publicationFile. (Last accessed March 2013.)
Urban D. Logit-Analyse: Statistische Verfahren zur Analyse von Modellen mit qualitativen Response-Variablen. Stuttgart: G. Fischer 1993.
Wright JM. Make it better but don't change anything. Autom Exp 2009; 1: 5.
Wuchty S, Jones BF, Uzzi B. The increasing dominance of teams in production of knowledge. Science 2007; 316: 1036–1039.
About the article
Published Online: 2013-11-13
Published in Print: 2014-04-01