Published by De Gruyter Oldenbourg, November 8, 2018

ScaDS Dresden/Leipzig – A competence center for collaborative big data research

René Jäkel, Eric Peukert, Wolfgang E. Nagel and Erhard Rahm


The efficient and intelligent handling of large, often distributed and heterogeneous data sets increasingly determines scientific and economic competitiveness in most application areas. Mobile applications, social networks, multimedia collections, sensor networks, data-intensive scientific experiments, and complex simulations now generate a data deluge. Processing and analyzing these data sets with innovative methods opens up new opportunities for their exploitation and yields new insights. However, the resulting resource requirements usually exceed the capabilities of state-of-the-art methods for the acquisition, integration, analysis and visualization of data; these challenges are summarized under the term big data. ScaDS Dresden/Leipzig, a Germany-wide competence center for collaborative big data research, bundles efforts to realize data-intensive applications across a wide range of fields in science and industry. In this article, we present the basic concept of the competence center and give insights into some of its research topics.


Funding source: Bundesministerium für Bildung und Forschung

Award Identifier / Grant number: 01IS14014A-D

Funding statement: This work was supported by the German Federal Ministry of Education and Research (BMBF, 01IS14014A-D) by funding the competence center for Big Data “ScaDS Dresden/Leipzig”.


We thank the Center for Information Services and High Performance Computing (ZIH) at TU Dresden for generous allocations of computer time. Furthermore, the authors would like to express their gratitude to Hendrik Herold, Florian Jug, Jochen Tiepmar, Joachim Staib, Peter Winkler, Richard Grunzke and Bernd Schuller for providing insights into their research fields.


Received: 2018-10-01
Revised: 2018-10-16
Accepted: 2018-10-18
Published Online: 2018-11-08
Published in Print: 2018-12-19

© 2018 Walter de Gruyter GmbH, Berlin/Boston