AI, Robotics, and Markov Decision Processes: Enhancing Precision and Autonomy in Healthcare Systems

Authors

  • Ikram Dahamou * Laboratory of Information Processing and Decision Support, Sultan Moulay Slimane University, Faculty of Sciences and Techniques Beni-Mellal, Morocco. https://orcid.org/0009-0007-5109-6865
  • Cherki Daoui Laboratory of Information Processing and Decision Support, Sultan Moulay Slimane University, Faculty of Sciences and Techniques Beni-Mellal, Morocco. https://orcid.org/0000-0001-5435-6414

https://doi.org/10.22105/ahse.vi.38

Abstract

Robotic systems are increasingly integrated into healthcare to enhance precision, autonomy, and efficiency. This study provides a systematic review of decision-making systems and control architectures for autonomous and social robots in hospitals, with a specific focus on Markov Decision Processes (MDPs) and their variants. A systematic search of ScienceDirect, SpringerLink, and IEEE databases was conducted covering the last three decades. Inclusion criteria focused on studies describing action-selection or decision-making methods for autonomous or semiautonomous healthcare robots. The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) methodology was applied to identify, screen, and analyze relevant publications. The review identifies major application areas of MDP-based decision-making in healthcare: 1) surgical robotics, predominantly using Completely Observable Markov Decision Processes (COMDPs), 2) rehabilitation, where Partially Observable Markov Decision Processes (POMDPs) combined with deep reinforcement learning are common, 3) telemedicine, using COMDP frameworks with multi-agent coordination, 4) elderly care, leveraging POMDPs with human feedback, and 5) emergency response, applying multi-robot COMDPs enhanced with Bayesian updates. Emerging trends include hybrid COMDP–POMDP approaches and integration with machine learning for real-world deployment. MDP-based decision-making systems demonstrate strong potential to improve autonomy and adaptability in healthcare robotics. While COMDPs are effective in structured environments such as surgery, POMDPs are increasingly preferred for human-centered and uncertain contexts. Key challenges remain, including a lack of standardized benchmarks, limited clinical validation, and computational complexity. Addressing these gaps will be essential for the safe, efficient, and ethical deployment of robotic systems in healthcare.

Keywords:

Markov decision processes, Healthcare robotics, Human-robot interaction, Partially observable Markov decision processes, Decision support systems, Deep reinforcement learning

Published

2025-08-27

How to Cite

Dahamou, I., & Daoui, C. (2025). AI, Robotics, and Markov Decision Processes: Enhancing Precision and Autonomy in Healthcare Systems. Annals of Healthcare Systems Engineering, 2(3), 186-207. https://doi.org/10.22105/ahse.vi.38
