
Paladyn, Journal of Behavioral Robotics

Editor-in-Chief: Schöner, Gregor


Open Access

Object Affordance Driven Inverse Reinforcement Learning Through Conceptual Abstraction and Advice

Rupam Bhattacharyya / Shyamanta M. Hazarika
  • Biomimetic and Cognitive Robotics Lab, Department of Mechanical Engineering, Indian Institute of Technology Guwahati, Assam, India
Published Online: 2018-09-14 | DOI: https://doi.org/10.1515/pjbr-2018-0021


Abstract

Within human Intent Recognition (IR), a popular approach to learning from demonstration is Inverse Reinforcement Learning (IRL). IRL extracts an unknown reward function from samples of observed behaviour. Traditional IRL systems require large datasets to recover the underlying reward function. Object affordances have been used for IR; however, the existing literature on recognizing intents through object affordances falls short of utilizing their true potential. In this paper, we seek to develop an IRL system that drives human intent recognition and can handle high-dimensional demonstrations by exploiting object affordances. An architecture for recognizing human intent is presented, built around an extended Maximum Likelihood Inverse Reinforcement Learning (MLIRL) agent. The inclusion of a Symbolic Conceptual Abstraction Engine (SCAE), along with an advisor, allows the agent to work on a Conceptually Abstracted Markov Decision Process. The agent recovers an object-affordance-based reward function from high-dimensional demonstrations. This function drives a Human Intent Recognizer through identification of probable intents. Performance of the resulting system on the standard CAD-120 dataset shows encouraging results.
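To make the core idea concrete, the following is a minimal sketch of Maximum Likelihood IRL with a linear reward over per-state affordance features: a soft (Boltzmann) policy is derived from the current reward weights via soft value iteration, and the weights are adjusted by gradient ascent on the log-likelihood of the demonstrated state-action pairs. The 4-state MDP, feature vectors, and demonstrations below are hypothetical illustrations, not the paper's actual abstracted MDP or dataset; the gradient is computed by finite differences for clarity rather than analytically.

```python
import numpy as np

# Hypothetical toy problem: 4 abstract states, 2 actions, 3-dim affordance
# features per state.  Reward is linear: r(s) = w . phi(s).
np.random.seed(0)
n_states, n_actions, n_feats = 4, 2, 3
phi = np.random.rand(n_states, n_feats)   # affordance feature vector per state
# Random transition model P(s' | s, a); each row is a distribution over s'.
T = np.random.dirichlet(np.ones(n_states), size=(n_states, n_actions))
gamma = 0.9

def soft_policy(w, n_iters=100):
    """Boltzmann policy from soft value iteration under reward r(s) = w . phi(s)."""
    r = phi @ w
    V = np.zeros(n_states)
    for _ in range(n_iters):
        Q = r[:, None] + gamma * (T @ V)                       # shape (S, A)
        m = Q.max(axis=1)
        V = m + np.log(np.exp(Q - m[:, None]).sum(axis=1))     # stable log-sum-exp
    Q = r[:, None] + gamma * (T @ V)
    m = Q.max(axis=1)
    logZ = m + np.log(np.exp(Q - m[:, None]).sum(axis=1))
    return np.exp(Q - logZ[:, None])                           # pi(a | s)

def log_likelihood(w, demos):
    """Log-probability of the demonstrated (state, action) pairs."""
    pi = soft_policy(w)
    return sum(np.log(pi[s, a]) for s, a in demos)

# Demonstrations: (state, action) pairs observed from the "expert".
demos = [(0, 1), (1, 1), (2, 0), (3, 1)]

# MLIRL core: gradient ascent on the demonstration log-likelihood,
# with the gradient estimated by central finite differences.
w = np.zeros(n_feats)
lr, eps = 0.1, 1e-5
for _ in range(300):
    grad = np.array([
        (log_likelihood(w + eps * np.eye(n_feats)[i], demos) -
         log_likelihood(w - eps * np.eye(n_feats)[i], demos)) / (2 * eps)
        for i in range(n_feats)])
    w += lr * grad

print("learned reward weights:", w)
```

In the paper's setting, the state space would be the conceptually abstracted one produced by the SCAE, and the recovered weights over affordance features would then score candidate intents; here the sketch only shows the likelihood-maximization step itself.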

Keywords: inverse reinforcement learning; object affordance; human intent recognition; MDP



About the article

Received: 2017-12-29

Accepted: 2018-08-03

Published Online: 2018-09-14

Citation Information: Paladyn, Journal of Behavioral Robotics, Volume 9, Issue 1, Pages 277–294, ISSN (Online) 2081-4836, DOI: https://doi.org/10.1515/pjbr-2018-0021.


© 2018 Rupam Bhattacharyya, published by De Gruyter. This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License (CC BY-NC-ND 4.0).
