Biometrical Letters

The Journal of Polish Biometric Society

2 Issues per year

Open Access
Online
ISSN
1896-3811
Clustering of Symbolic Data based on Affinity Coefficient: Application to a Real Data Set

Áurea Sousa
  • Corresponding author
  • University of Azores, Department of Mathematics, CEEAplA, and CMATI, 9501-855-Ponta Delgada, Portugal
Helena Bacelar-Nicolau
  • Corresponding author
  • University of Lisbon, Faculty of Psychology, Laboratory of Statistics and Data Analysis, 1649-013-Lisboa, Portugal, and DataScience
Fernando C. Nicolau
Osvaldo Silva
Published Online: 2013-06-05 | DOI: https://doi.org/10.2478/bile-2013-0015

SUMMARY

In this paper, we illustrate an application of Ascendant Hierarchical Cluster Analysis (AHCA) to complex data taken from the literature (interval data). The analysis is based on the weighted generalized affinity coefficient, standardized by the method of Wald and Wolfowitz. The probabilistic aggregation criteria used belong to a parametric family of methods under the probabilistic approach to AHCA, known as the VL methodology. Finally, we compare the results achieved using our approach with those obtained by other authors.
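
As a rough illustration of the kind of procedure summarized above, the following minimal sketch computes the basic affinity coefficient (the sum of the square roots of the products of relative frequencies) between small frequency profiles and then merges the most similar clusters step by step. It is not the authors' method: the standardization by the Wald and Wolfowitz method, the weighting, the handling of interval data, and the VL aggregation criteria are all omitted, and single linkage is swapped in as the aggregation rule. The function names and the data are hypothetical.

```python
import math

def affinity(p, q):
    """Basic affinity coefficient between two non-negative profiles.

    Each profile is normalized to relative frequencies first, so the
    result lies in [0, 1] and equals 1 for proportional profiles.
    """
    sp, sq = sum(p), sum(q)
    return sum(math.sqrt((a / sp) * (b / sq)) for a, b in zip(p, q))

def agglomerate(profiles, similarity):
    """Ascendant hierarchical clustering with single linkage (assumed here).

    Returns the merges in order as (cluster_a, cluster_b, similarity),
    where clusters are tuples of row indices.
    """
    clusters = [(i,) for i in range(len(profiles))]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: similarity of the closest pair of members.
                s = max(similarity(profiles[a], profiles[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or s > best[0]:
                    best = (s, i, j)
        s, i, j = best
        merges.append((clusters[i], clusters[j], s))
        merged = tuple(sorted(clusters[i] + clusters[j]))
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges

# Hypothetical frequency profiles (rows are the units to be clustered).
profiles = [[8, 1, 1], [7, 2, 1], [1, 1, 8], [2, 2, 6]]
merges = agglomerate(profiles, affinity)
for left, right, s in merges:
    print(left, right, round(s, 4))
```

With these made-up profiles, units 0 and 1 are merged first, then units 2 and 3, and the two resulting clusters are joined last.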

Keywords: Ascendant Hierarchical Cluster Analysis; Symbolic Data; Interval Data; Affinity Coefficient; VL Methodology

About the article

Published in Print: 2013-06-01


Citation Information: Biometrical Letters, Volume 50, Issue 1, Pages 27–38, ISSN (Print) 1896-3811, DOI: https://doi.org/10.2478/bile-2013-0015.

This content is open access.
