Evaluation of linearly solvable Markov decision process with dynamic model learning in a mobile robot navigation task

Overview of attention for article published in Frontiers in Neurorobotics, January 2013

Mentioned by: 1 X user
Citations: 14 Dimensions
Readers on: 42 Mendeley, 1 CiteULike
Title: Evaluation of linearly solvable Markov decision process with dynamic model learning in a mobile robot navigation task
Published in: Frontiers in Neurorobotics, January 2013
DOI: 10.3389/fnbot.2013.00007
Authors: Ken Kinjo, Eiji Uchibe, Kenji Doya

Abstract

The linearly solvable Markov decision process (LMDP) is a class of optimal control problems in which the Bellman equation can be converted into a linear equation by an exponential transformation of the state value function (Todorov, 2009b). In an LMDP, the optimal value function and the corresponding control policy are obtained by solving an eigenvalue problem in a discrete state space, or an eigenfunction problem in a continuous state space, using knowledge of the system dynamics and of the action, state, and terminal cost functions. In this study, we evaluate the effectiveness of the LMDP framework in real robot control, in which the dynamics of the body and the environment have to be learned from experience. We first perform a simulation study of a pole swing-up task to evaluate the effect of the accuracy of the learned dynamics model on the derived action policy. The result shows that a crude linear approximation of the non-linear dynamics can still solve the task, albeit with a higher total cost. We then perform real robot experiments on a battery-catching task using our Spring Dog mobile robot platform. The state is given by the position and the size of a battery in the robot's camera view and by two neck joint angles. The action is the pair of wheel velocities, while the neck joints are controlled by a visual servo controller. We test linear and bilinear dynamics models in tasks with quadratic and Gaussian state cost functions. In the quadratic cost task, the LMDP controller derived from a learned linear dynamics model performed equivalently to the optimal linear quadratic regulator (LQR). In the non-quadratic task, the LMDP controller with a linear dynamics model showed the best performance. The results demonstrate the usefulness of the LMDP framework in real robot control even when simple linear models are used for dynamics learning.
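
To make the exponential transformation mentioned in the abstract concrete, the following is a minimal sketch of a discrete first-exit LMDP in the style of Todorov's formulation. Writing the desirability as z(x) = exp(-v(x)), the Bellman equation becomes the linear recursion z(x) = exp(-q(x)) * sum over x' of p(x'|x) z(x'), and the optimal controlled transition is proportional to p(x'|x) z(x'). Everything specific here is an assumption made for illustration and is not taken from the paper: the five-state chain, the random-walk passive dynamics `P`, the state costs `q`, the function name `solve_first_exit_lmdp`, and the choice of a simple fixed-point iteration in place of an explicit eigen-solver.

```python
import math


def solve_first_exit_lmdp(P, q, terminal, iters=2000):
    """Illustrative first-exit LMDP solver (hypothetical helper, not the paper's code).

    P        : passive dynamics, P[i][j] = p(j | i), each row sums to 1
    q        : state costs (terminal cost at terminal states)
    terminal : list of booleans marking terminal states
    Returns desirability z, value v = -log z, and the controlled transitions U.
    """
    n = len(q)
    # Terminal desirability is fixed at exp(-terminal cost); interior starts at 1.
    z = [math.exp(-q[i]) if terminal[i] else 1.0 for i in range(n)]
    for _ in range(iters):
        # Linear Bellman update: z(i) = exp(-q(i)) * sum_j p(j|i) z(j).
        z = [
            z[i] if terminal[i]
            else math.exp(-q[i]) * sum(P[i][j] * z[j] for j in range(n))
            for i in range(n)
        ]
    v = [-math.log(zi) for zi in z]
    # Optimal controlled transitions: reweight passive dynamics by desirability.
    U = []
    for i in range(n):
        weights = [P[i][j] * z[j] for j in range(n)]
        total = sum(weights)
        U.append([w / total for w in weights])
    return z, v, U


# Assumed toy problem: a 5-state chain with an unbiased random walk as passive
# dynamics, a reflecting left end, and an absorbing goal at state 4.
P = [
    [0.5, 0.5, 0.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0, 0.0],
    [0.0, 0.5, 0.0, 0.5, 0.0],
    [0.0, 0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 0.0, 0.0, 1.0],
]
q = [0.1, 0.1, 0.1, 0.1, 0.0]
terminal = [False, False, False, False, True]

z, v, U = solve_first_exit_lmdp(P, q, terminal)
print("value function:", [round(x, 3) for x in v])
print("P(move right | state 2):", round(U[2][3], 3))
```

In this toy problem the value function decreases toward the goal and the controlled transitions shift probability mass toward it relative to the unbiased passive walk. The iteration converges here because every interior state has positive cost; a problem with zero-cost interior states may need more iterations or a direct linear solve.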

X Demographics

The data shown below were collected from the profile of 1 X user who shared this research output.
Mendeley readers

The data shown below were compiled from readership statistics for 42 Mendeley readers of this research output.

Geographical breakdown

Country          Count   As %
United Kingdom   1       2%
Japan            1       2%
United States    1       2%
Canada           1       2%
Unknown          38      90%

Demographic breakdown

Readers by professional status   Count   As %
Researcher                       10      24%
Student > Master                 8       19%
Student > Ph. D. Student         6       14%
Student > Bachelor               4       10%
Student > Doctoral Student       4       10%
Other                            7       17%
Unknown                          3       7%

Readers by discipline            Count   As %
Computer Science                 16      38%
Engineering                      6       14%
Medicine and Dentistry           3       7%
Social Sciences                  2       5%
Mathematics                      2       5%
Other                            8       19%
Unknown                          5       12%
Attention Score in Context

This research output has an Altmetric Attention Score of 1. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 05 April 2013.
All research outputs: #20,187,333 of 22,703,044 outputs
Outputs from Frontiers in Neurorobotics: #684 of 845 outputs
Outputs of similar age: #248,729 of 280,707 outputs
Outputs of similar age from Frontiers in Neurorobotics: #20 of 20 outputs
Altmetric has tracked 22,703,044 research outputs across all sources so far. This one is in the 1st percentile – i.e., 1% of other outputs scored the same or lower than it.
So far Altmetric has tracked 845 research outputs from this source. They receive a mean Attention Score of 4.2. This one is in the 1st percentile – i.e., 1% of its peers scored the same or lower than it.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 280,707 tracked outputs that were published within six weeks on either side of this one in any source. This one is in the 1st percentile – i.e., 1% of its contemporaries scored the same or lower than it.
We're also able to compare this research output to 20 others from the same source and published within six weeks on either side of this one. This one is in the 1st percentile – i.e., 1% of its contemporaries scored the same or lower than it.