Paladyn, Journal of Behavioral Robotics

Editor-in-Chief: Schöner, Gregor

CiteScore 2017: 0.33

SCImago Journal Rank (SJR) 2017: 0.104

ICV 2017: 99.90

Open Access
Active Choice of Teachers, Learning Strategies and Goals for a Socially Guided Intrinsic Motivation Learner

Sao Mai Nguyen / Pierre-Yves Oudeyer
Published Online: 2013-05-10 | DOI: https://doi.org/10.2478/s13230-013-0110-z


We present an active learning architecture that allows a robot to learn which data collection strategy is most efficient for acquiring motor skills that achieve multiple outcomes, and to generalise over its experience to achieve new outcomes. The robot explores its environment through both interactive learning and goal babbling. It learns simultaneously when, whom and what to imitate among several available teachers, and when to forgo social guidance in favour of active goal-oriented self-exploration. This is formalised in the framework of life-long strategic learning.

The proposed architecture, called Socially Guided Intrinsic Motivation with Active Choice of Teacher and Strategy (SGIM-ACTS), relies on hierarchical active decisions about what and how to learn, driven by an empirical evaluation of learning progress for each learning strategy. We illustrate it with an experiment in which a simulated robot learns to control its arm to realise two different kinds of outcomes. At each learning episode, the robot has to choose actively and hierarchically: 1) what to learn: which outcome is most interesting to select as a goal for goal-directed exploration; 2) how to learn: which data collection strategy to use among self-exploration, mimicry and emulation; 3) whom to imitate: once it has decided when and what to imitate by choosing mimicry or emulation, which teacher to imitate from a set of different teachers. We show that SGIM-ACTS learns significantly more efficiently than any single learning strategy, and coherently selects the best strategy with respect to the chosen outcome, taking advantage of the available teachers (who have different levels of skill).
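The three choices described above can be illustrated with a minimal sketch. It is not the SGIM-ACTS implementation: the outcome and teacher labels, the windowed progress estimate and the epsilon-greedy rule are all assumptions made for illustration, and the hierarchy (what, then how, then whom) is flattened into a single choice over triples.

```python
import random
from collections import defaultdict, deque

# Hypothetical labels: the abstract names the three strategies,
# but not the outcome types or the teachers.
OUTCOMES = ["outcome_A", "outcome_B"]
TEACHERS = ["teacher_1", "teacher_2"]
# Self-exploration needs no teacher; mimicry and emulation pair with one.
STRATEGIES = [("self_exploration", None)] + [
    (s, t) for s in ("mimicry", "emulation") for t in TEACHERS
]
CHOICES = [(o, s, t) for o in OUTCOMES for (s, t) in STRATEGIES]


class ProgressTracker:
    """Stores recent competence scores per (outcome, strategy, teacher)."""

    def __init__(self, window=10):
        # Keep at most 2 * window scores so old episodes are forgotten.
        self.history = defaultdict(lambda: deque(maxlen=2 * window))

    def record(self, choice, competence):
        self.history[choice].append(competence)

    def progress(self, choice):
        """Absolute change in mean competence, older half vs newer half.

        Returns None when fewer than two scores are available.
        """
        h = list(self.history[choice])
        if len(h) < 2:
            return None
        half = len(h) // 2
        older, newer = h[:half], h[half:]
        return abs(sum(newer) / len(newer) - sum(older) / len(older))


def choose(tracker, epsilon=0.2, rng=random):
    """Pick the next (outcome, strategy, teacher) triple.

    Untried triples are sampled first; afterwards a random triple is
    picked with probability epsilon, else the one with highest progress.
    """
    untried = [c for c in CHOICES if tracker.progress(c) is None]
    if untried:
        return rng.choice(untried)
    if rng.random() < epsilon:
        return rng.choice(CHOICES)
    return max(CHOICES, key=tracker.progress)
```

In a learning loop, each episode would call `choose`, run the selected strategy, measure how well the goal outcome was reached, and pass that score to `record`. Triples whose competence has stopped changing then lose out to those where the robot is still improving.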

Keywords: strategic learner; imitation learning; mimicry; emulation; artificial curiosity; intrinsic motivation; interactive learner; active learning; goal babbling; robot skill learning


About the article

Received: 2012-12-15

Accepted: 2013-03-27

Published Online: 2013-05-10

Published in Print: 2012-09-01

Citation Information: Paladyn, Journal of Behavioral Robotics, Volume 3, Issue 3, Pages 136–146, ISSN (Online) 2081-4836, DOI: https://doi.org/10.2478/s13230-013-0110-z.

© Sao Mai Nguyen et al. This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 License (BY-NC-ND 3.0).

Citing Articles

Lorenzo Jamone, Emre Ugur, Angelo Cangelosi, Luciano Fadiga, Alexandre Bernardino, Justus Piater, and Jose Santos-Victor
IEEE Transactions on Cognitive and Developmental Systems, 2018, Volume 10, Number 1, Page 4
Emre Ugur and Justus Piater
IEEE Transactions on Cognitive and Developmental Systems, 2017, Volume 9, Number 4, Page 328
Mustafa Ersen, Erhan Oztop, and Sanem Sariel
IEEE Robotics & Automation Magazine, 2017, Volume 24, Number 3, Page 108
Anna-Lisa Vollmer, Britta Wrede, Katharina J. Rohlfing, and Pierre-Yves Oudeyer
Frontiers in Neurorobotics, 2016, Volume 10
James Law, Patricia Shaw, Mark Lee, and Michael Sheldon
IEEE Transactions on Autonomous Mental Development, 2014, Volume 6, Number 2, Page 93
Albert Ali Salah, Pierre-Yves Oudeyer, Cetin Mericli, and Javier Ruiz-del-Solar
IEEE Transactions on Autonomous Mental Development, 2014, Volume 6, Number 2, Page 77
Serena Ivaldi, Sao Mai Nguyen, Natalia Lyubova, Alain Droniou, Vincent Padois, David Filliat, Pierre-Yves Oudeyer, and Olivier Sigaud
IEEE Transactions on Autonomous Mental Development, 2014, Volume 6, Number 1, Page 56
Thomas Cederborg and Pierre-Yves Oudeyer
IEEE Transactions on Autonomous Mental Development, 2013, Volume 5, Number 3, Page 222
Emre Ugur, Yukie Nagai, Erol Sahin, and Erhan Oztop
IEEE Transactions on Autonomous Mental Development, 2015, Volume 7, Number 2, Page 119
Pedro Cardoso-Leite and Daphne Bavelier
Current Opinion in Neurology, 2014, Volume 27, Number 2, Page 185
Sao Mai Nguyen and Pierre-Yves Oudeyer
Autonomous Robots, 2014, Volume 36, Number 3, Page 273
