
Foundations of Computing and Decision Sciences

The Journal of Poznan University of Technology

4 Issues per year

CiteScore 2016: 0.75

SCImago Journal Rank (SJR) 2016: 0.330
Source Normalized Impact per Paper (SNIP) 2016: 0.709

Open Access

Tools for Distributed Systems Monitoring

Łukasz Kufel
  • Institute of Computing Science, Poznan University of Technology, Poznan, Technical Operations Manager, Expedia.com, Poland
Published Online: 2016-12-13 | DOI: https://doi.org/10.1515/fcds-2016-0014


The management of distributed systems infrastructure requires a dedicated set of tools. One such tool, which helps visualize the current operational state of all systems and sends notifications when a failure occurs, is available within a monitoring solution. This paper provides an overview of monitoring approaches for gathering data from distributed systems and of the major factors to consider when choosing a monitoring solution. Finally, we discuss the tools currently available on the market.

Keywords: distributed systems; monitoring; monitoring solution; monitoring tools


  • [1] Aceto G., Botta A., De Donato W., Pescape A., Cloud monitoring: A survey, Computer Networks, vol. 57, pp. 2093-2115, 2013.

  • [2] Boccia V. et al., Infrastructure Monitoring for distributed Tier1: The ReCaS project use-case, International Conference on Intelligent Networking and Collaborative Systems, Salerno, Italy, 2014.

  • [3] Fatema K., Emeakaroha V. C., Healy P. D., Morrison J. P., Lynn T., A survey of Cloud monitoring tools: Taxonomy, capabilities and objectives, Journal of Parallel and Distributed Computing, vol. 74, no. 10, pp. 2918-2933, 2014.

  • [4] Hakulinen T., Ninin P., Nunes R., Riesco-Hernandez T., Revisiting CERN Safety System Monitoring (SSM), Proceedings of International Conference on Accelerator & Large Experimental Physics Control Systems, San Francisco, California, USA, 2013.

  • [5] Hernantes J., Gallardo G., Serrano N., IT Infrastructure-Monitoring Tools, IEEE Software, vol. 32, no. 4, pp. 88-93, 2015.

  • [6] Horalek J., Sobeslav V., Proactive ICT Application Monitoring, Latest Trends in Information Technology, Wseas Press, pp. 49-54, 2012.

  • [7] Kent K., Souppaya M., Guide to Computer Security Log Management, US Nat'l Inst. Standards and Technology, Sept. 2006; http://csrc.nist.gov/publications/nistpubs/800-92/SP800-92.pdf.

  • [8] Kufel L., Security Event Monitoring in a Distributed Systems Environment, IEEE Security & Privacy, vol. 11, no. 1, pp. 36-43, 2013.

  • [9] Massie M., Li B., Nicholes B., Vuksan V., Monitoring with Ganglia, Book published by O’Reilly Media, 2013.

  • [10] Smit M., Simmons B., Litoiu M., Distributed, application-level monitoring for heterogeneous clouds using stream processing, Future Generation Computer Systems, vol. 29, pp. 2103-2114, 2013.

  • [11] Spellmann A., Gimarc R., Capacity Planning: A Revolutionary Approach for Tomorrow’s Digital Infrastructure, Computer Measurement Group Conference, La Jolla, California, USA, 2013.

  • [12] Terenziani P., Coping with Events in Temporal Relational Databases, IEEE Trans. Knowledge and Data Eng., vol. 25, no. 5, pp. 1181-1185, 2013.

  • [13] Tierney B., Crowley B., Gunter D., Holding M., Lee J., Thompson M., A Monitoring Sensor Management System for Grid Environments, Proceedings of The Ninth International Symposium On High-performance Distributed Computing, IEEE CS, pp. 97-104, 2000.

  • [14] Amazon AWS Micro instance limitations, https://aws.amazon.com/ec2/faqs, Jul 2016.

  • [15] AppDynamics, Application Performance Monitoring & Management, http://www.appdynamics.com, Apr 2016.

  • [16] Datadog, Cloud Monitoring as a Service, http://www.datadoghq.com, Apr 2016.

  • [17] DevOps support teams, http://theagileadmin.com/what-is-devops, Apr 2016.

  • [18] External Data Representation (XDR), Wikipedia page, https://en.wikipedia.org/wiki/External_Data_Representation, Feb 2016.

  • [19] Ganglia Monitoring System, http://ganglia.sourceforge.net, Feb 2016.

  • [20] Graphite, Graphs rendering application, http://graphite.readthedocs.org, Apr 2016.

  • [21] High availability, Wikipedia page, https://en.wikipedia.org/wiki/High_availability, Feb 2016.

  • [22] HP Operations Manager, http://hp.com/go/Ops, Feb 2016.

  • [23] Hyperic Application & System Monitoring, http://sourceforge.net/projects/hyperic-hq, Feb 2016.

  • [24] IBM SmartCloud Monitoring, http://ibm.com/software/tivoli/products/smartcloudmonitoring, Feb 2016.

  • [25] Icinga, Open Source Monitoring, http://www.icinga.org, Apr 2016.

  • [26] InfluxData, The platform for time-series data, https://influxdata.com, Apr 2016.

  • [27] International Telecommunication Union, X.733: Information technology - Open Systems Interconnection - Systems Management: Alarm reporting function, http://www.itu.int/rec/T-REC-X.733/en, Apr 2016.

  • [28] Live monitoring console of Wikimedia Grid, http://ganglia.wikimedia.org, Feb 2016.

  • [29] ManageEngine Applications Manager, http://appmanager.com, Feb 2016.

  • [30] Nagios - The Industry Standard In IT Infrastructure Monitoring, http://www.nagios.org, Feb 2016.

  • [31] New Relic, Application Performance Management & Monitoring, http://newrelic.com, Apr 2016.

  • [32] PagerDuty, The Incident Resolution Platform For IT Operations & DevOps Teams, http://www.pagerduty.com, Apr 2016.

  • [33] Prometheus, Monitoring system and time-series database, http://prometheus.io, Apr 2016.

  • [34] Request for Comments (RFC) 5424 - The Syslog Protocol, http://tools.ietf.org/html/rfc5424#section-6.2.1, Feb 2016.

  • [35] Request for Comments (RFC) 5674 - Alarms in Syslog, https://tools.ietf.org/html/rfc5674.html, Apr 2016.

  • [36] Riemann, A network monitoring system, http://riemann.io, Apr 2016.

  • [37] Sensu, Monitoring for today’s infrastructure, https://sensuapp.org, Apr 2016.

  • [38] Shinken Monitoring, http://shinken-monitoring.org, Apr 2016.

  • [39] Windows Event Types, http://msdn.microsoft.com/en-us/library/windows/desktop/aa363662.aspx, Feb 2016.

  • [40] Zabbix - The Enterprise-Class Open Source Network Monitoring Solution, http://www.zabbix.com, Feb 2016.

About the article

Received: 2016-02-22

Accepted: 2016-09-15

Published Online: 2016-12-13

Published in Print: 2016-11-01

Citation Information: Foundations of Computing and Decision Sciences, Volume 41, Issue 4, Pages 237–260, ISSN (Online) 2300-3405, DOI: https://doi.org/10.1515/fcds-2016-0014.

© by Łukasz Kufel. This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License (CC BY-NC-ND 4.0).
