We focus on the problem of adapting companion robots to their users. Until recently, most proposals for intelligent service robots were user-independent. Our work is part of the FUI-RoboPopuli project, which aims to endow entertainment companion robots with adaptive and social behaviour. We concentrate on the capacity of robots to learn how to adapt and personalise their behaviour according to their users. Markov Decision Processes (MDPs) are widely used in adaptive robot applications, and several approaches have been proposed to decrease the sample complexity of learning the MDP model, including the reward function. In previous work, we proposed two algorithms that learn the MDP reward function by analysing interaction traces (i.e. the history of interactions between the robot and its users), including users' feedback. The first algorithm is direct and certain. The second is able to detect which information about the users (profiles) and/or the environment is important in the adaptation process. In this paper, we present a user study in addition to simulated experiments. These experiments show that our algorithms are able to learn, through interactions with real users, a reward function that leads to adapted and personalised robot behaviour. We also show that our algorithms can handle exceptions and ambiguities in users' feedback during experiments with real users.