This paper presents a method for estimating the chronological age of adolescents from their voice signal. For every examined child, the vowels a, e, i, o and u were recorded in extended phonation. Sixty voice parameters were extracted from each recording. The voice recordings were supplemented with a height measurement to check whether it could improve the accuracy of the proposed solution. Predictor selection was performed using the LASSO (least absolute shrinkage and selection operator) algorithm. Age was estimated using random forest (RF) regression, which was tested using 10-fold cross-validation. The lowest absolute error (0.37 ± 0.28 years) was obtained for the boys-only group when all selected features were included in the prediction. In all cases, the achieved accuracy was higher for boys than for girls, which results from the fact that the voice changes more with age in males than in females. The results suggest that the presented approach can be employed for accurate age estimation during rapid development in children.
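The pipeline summarized in the abstract (LASSO predictor selection, random forest regression, 10-fold cross-validation) can be illustrated with a minimal sketch. It assumes scikit-learn and NumPy, which the abstract does not name, and it runs on synthetic stand-in data, not the study's recordings. The variable names, the hyperparameters, and the choice to run the selection inside each fold are all assumptions made for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LassoCV
from sklearn.model_selection import KFold, cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (NOT the study's data): 200 hypothetical subjects,
# 60 "voice parameters" plus height as a 61st column.
rng = np.random.default_rng(0)
n_subjects, n_voice = 200, 60
age = rng.uniform(10.0, 18.0, size=n_subjects)  # hypothetical age range, years
X = rng.normal(size=(n_subjects, n_voice + 1))
X[:, 0] += -0.8 * age  # a made-up feature that decreases with age
X[:, 1] += 0.5 * age   # a made-up feature that increases with age
X[:, -1] = 100.0 + 4.0 * age + rng.normal(scale=5.0, size=n_subjects)  # height, cm

# LASSO keeps predictors with non-negligible coefficients; the random forest
# then regresses age on the kept predictors. Hyperparameters are assumptions.
pipeline = make_pipeline(
    StandardScaler(),
    SelectFromModel(LassoCV(cv=5, random_state=0)),
    RandomForestRegressor(n_estimators=100, random_state=0),
)

# 10-fold cross-validation: every subject is predicted by a model that
# never saw that subject during selection or fitting.
cv = KFold(n_splits=10, shuffle=True, random_state=0)
predicted = cross_val_predict(pipeline, X, age, cv=cv)

# Report the error in the abstract's "mean ± standard deviation" form.
abs_err = np.abs(predicted - age)
mae, sd = abs_err.mean(), abs_err.std()
print(f"absolute error: {mae:.2f} ± {sd:.2f} years")
```

Placing the selection step inside the pipeline means it is refitted on each training fold, which keeps held-out subjects from influencing the choice of predictors. Whether the study selected predictors per fold or once on the full data set is not stated in the abstract.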
We would like to thank Bruce Turner for the English language corrections.
Research funding: Authors state no funding involved.
Conflict of interest: Authors declare no conflict of interest.
Informed consent: Informed consent is not applicable.
Ethical approval: The conducted research is not related to either human or animal use.
©2020 Walter de Gruyter GmbH, Berlin/Boston