The RoboCup Physical Agent Challenge 97
Overview
Physical bodies play an important role in enabling agents to develop complex
behaviors that achieve goals in the dynamic real world, an aspect to which
traditional AI research has paid little attention. The RoboCup Physical
Agent Challenge provides a good testbed for examining how physical bodies
contribute to realizing intelligent behaviors within the RoboCup framework
\cite{Kitano_et_al95}. RoboCup is an attempt to foster AI and robotics
research by using the game of soccer as a representative domain in which a
wide range of technologies can be integrated and new technologies can be
developed. While RoboCup envisions a set of longer-range challenges over the
next few decades, it also encompasses various short-term challenge goals. In
this paper, we present three technical challenges as the RoboCup Physical
Agent Challenge 97: (1) moving the ball to a specified area (shooting,
passing, and dribbling) with no, stationary, or moving obstacles; (2)
catching the ball from an opponent or a teammate (receiving, goal-keeping,
and intercepting); and (3) passing the ball between two players. The first
two concern single-agent skills, while the third is related to a simple
cooperative behavior. We explain why we set up these challenges and how the
realized skills should be evaluated.
The ultimate goal in AI, and probably in robotics, is to build
intelligent systems capable of developing complex behaviors that accomplish
given tasks through interactions with a dynamically changing physical
world. Traditional AI research has mainly sought methodologies of symbol
manipulation for knowledge acquisition, representation, and reasoning, with
little attention to their application in dynamic real worlds. Robotics, in
contrast, has put much more emphasis on the design and construction of
hardware systems and their control. However, recent topics spanning the two
areas include design principles of autonomous agents, multi-agent
collaboration, strategy acquisition, real-time reasoning and planning,
intelligent robotics, sensor fusion, and behavior learning. These topics
expose new aspects with which traditional approaches seem unable to cope.
In order to address these issues and ultimately achieve this goal,
physical bodies play an important role: they enable the system to interact
with physical environments, to learn from those environments, and to
develop its internal representation. The meaning of ``having a physical
body'' can be summarized as follows:
- Sensing and acting capabilities are not separable, but tightly
coupled.
- In order to accomplish given tasks, the sensor and actuator spaces
should be abstracted under resource-bounded conditions (memory, processing
power, controller, etc.).
- The abstraction depends both on the fundamental embodiment of the
agent and on its experiences (interactions with the environment).
- The consequence of the abstraction is an agent-based, subjective
representation of the environment, and it can be evaluated through the
consequences of the agent's behaviors.
Even as we advocate the importance of ``having a physical body,'' we
must show how such a system performs when coping with new issues in a
concrete task domain. In other words, we need a standard problem, one that
people regard as new and that exposes various important aspects of
intelligent behavior in real worlds.
Research Issues of RoboCup Physical Agent Track
The research issues of RoboCup include adaptive behaviors; flexible,
multi-modal cooperative behaviors among multiple agents; and optimal
communication strategies in complex, dynamic, and uncertain real worlds.
The physical agent track, in particular, focuses on the integration of
hardware, software, and communication.
RoboCup physical agent challenges are summarized as follows:
Perception:
The player must observe, in real time, the behaviors of other objects
that it cannot completely predict or control, in order to act
appropriately. Such objects include the ball, opponents, in some sense
teammates, and even the referee. Capabilities of wide-range perception,
discrimination of other agents, and estimation of their locations and
motions in the presence of occlusions are needed. Such perception is a
basic technology for expanding robotic applications.
Action:
The player needs quick acceleration, deceleration, and turns; motion
stability; and skill and power in kicking or trapping the ball. Some of
these requirements conflict with each other: for example, the more
powerfully the player kicks, the less precisely it can do so. Increasing
the mechanical stiffness to satisfy both would increase mass and energy
consumption, which affects other motion capabilities. In addition, the
prohibition of wired connections and the limitations on weight severely
constrain the mechanical design. Therefore, not only high skill in each
motion ability but also the total balance of the whole system is required.
Situation and Behavior:
The task domain itself is simple, but an almost infinite number of
situations can occur as the positions and relative motions of the ball,
the goals, and the players change, along with the context of the game. The
optimal behavior changes from one situation to another. Since our goal is
more than a ``dumb'' soccer-playing team, we need abilities beyond simple
reflexive behaviors: situation understanding, selection and modification of
tactics, minimal communication with teammates, and teamwork acquired
through practical training. These issues are closely related to cognitive
issues such as the organization of a spatio-temporal memory of the world
and the categorization of sets of motor behaviors into skills (symbols?)
grounded in physical bodies.
Realtime:
Since the situation changes rapidly with the motions of the ball and
the other players, there is no time to carefully analyze the situation and
deliberate over a plan. The player must act immediately: kicking the ball
at once, dribbling it when no opponents are nearby in order to wait for a
better situation, or moving to a position synchronized with a teammate's
motion.
The challenges described above are significant long-term ones;
realizing a good soccer-playing robot team will take a few decades.
However, because the final target is clear, several subgoals can be
derived, defining mid-term and short-term challenges. One of the major
reasons RoboCup is attractive to so many researchers is that it requires
the integration of a broad range of technologies into a team of complete
agents, as opposed to a task-specific functional module. The long-term
research issues are too broad to compile as a list of specific items.
Nevertheless, they would involve a broad range of technological issues,
ranging from the development of physical components, such as
high-performance batteries and motors, to highly intelligent real-time
perception and control software.
The mid-term technical challenges, targeted at the next ten years, can
be made more concrete, and a partial list of specific topics can be
compiled. The following is a partial list of research areas involved in
the RoboCup physical agent track, mainly targeted at the mid-term time
span:
- agent architecture in general,
- implementation of real-time and robust sensing,
- realization of stable and high-speed robot control,
- sensor fusion,
- behavior learning in multi-agent environments,
- cooperation in dynamic environments.
The RoboCup Physical Agent Challenge should be understood in the
context of these larger and longer-range challenges, rather than as a
one-shot challenge. We therefore wish to provide a series of short-term
challenges that naturally lead to the accomplishment of the mid-term and
long-term ones. The RoboCup Physical Agent Challenge-97 is the first
attempt of this initiative, together with the RoboCup Multi Agent
Challenge-97.
Overview of The RoboCup Physical Agent Challenge-97
For the RoboCup Physical Agent Challenge-97, we offer three specific
challenges, essential not only for RoboCup but also for mobile robotics
research in general. These challenges specifically address the real robot
league, rather than RoboCup's simulator league. Challenges for software
agents are described elsewhere.
The fundamental issue for researchers who wish to build real robot
systems that play soccer in RoboCup is how to obtain the basic skills for
controlling a ball in various situations. Typical examples are shooting
the ball into the goal, intercepting the ball from an opponent, and
passing the ball to a teammate. These skills are needed to realize
cooperative behaviors with teammates and competitive ones against
opponents in a soccer game. Among the basic ball-control skills, we
selected three challenges as the RoboCup Physical Agent Challenge-97:
- moving the ball to a specified area with no, stationary, or moving
obstacles,
- catching the ball from an opponent or a teammate, and
- passing the ball between two players.
These three challenges have many variations in different situations,
such as passing, shooting, dribbling, receiving, and intercepting the ball
with or without opponents whose defensive skills range from amateur to
professional level. Although they may seem specific to RoboCup, these
challenges can be regarded as very general tasks in mobile robotics
research on flat terrain. Since target reaching, obstacle avoidance, and
their coordination are basic tasks in that area, shooting the ball while
avoiding opponents that try to block the player ranks among the most
difficult challenges in the field. Once a robot succeeds in acquiring
these skills, it can, within its physical capabilities, move any object to
any location.
From another perspective, these three challenges can be regarded as a
sequence within one task, guiding the growth of the complexity of the
internal representation in accordance with the complexity of the
environment \cite{Asada96j}.
In the case of visual sensing, the agent can discriminate the static
environment (and its own body, if observed) from other objects by directly
correlating the motor commands it sends with the visual information
observed during their execution. In other words, such things can be
labeled as the self body or the stationary environment. Other active
agents, by contrast, have no simple, straightforward relationship with the
agent's own motions. At an early stage they are treated as noise or
disturbances, because they show no direct visual correlation with the
agent's motor commands. Later, they can be found to exhibit more
complicated, higher-order correlations (cooperation, competition, and so
on), and the complexity increases drastically. Between these extremes lies
the ball, which can be stationary or moving as a result of the motions of
the agent itself or of other agents.
The complexities of both the environment and the agent's internal
representation can be categorized as a cognitive issue in general, and
such issues are naturally involved in this challenge.
In the following, we describe the challenges more concretely.
The RoboCup Physical Agent Challenge 97 (I)
-- ball moving challenge
Objectives
The objective of this challenge is to examine how the most fundamental
skill, moving a ball to a specified area with no (Level I), stationary
(Level II), or moving (Level III) obstacles in the field, can be acquired
in various kinds of agent architectures, and to evaluate the merits and
demerits of the realized skills using standard tasks.
The specifications of the ball and of the field surface are common to
all challenges in the physical agent track. In order for various behaviors
to emerge, the field surface should be neither so rough that it prevents
the ball from rolling nor so smooth that there is no friction: the former
would make kicks and passes impossible, the latter dribbling.
Since the task environment varies little in Level I, the focus there is
on agent architecture, especially sensing capability. In Level II, motion
control is the central issue, and in Level III, predicting the motion of
obstacles is the key issue.
Technical Issues
Vision
General computer and robot vision issues are too broad to address here.
Finding and tracking independently moving objects (the ball, players, the
referee) and estimating their motion parameters (2-D, and eventually 3-D)
against a complicated background (field lines, goals, corner poles, flags
waved by supporters in the stadium) is too difficult for current computer
and robot vision technology to perform completely in real time.
In order to focus on skill acquisition, visual image processing should
be drastically simplified. Discrimination by color, such as a red ball, a
blue goal, or a yellow opponent, makes it easy to find and track objects
in real time \cite{Asada96a}. Nevertheless, robust color discrimination is
a tough problem, because the digitized signals are highly sensitive to
slight changes in lighting conditions. In the case of remote (wireless)
processing, additional noise from environmental factors can cause fatal
errors in image processing. Currently, a human programmer adjusts the key
parameters for discriminating colored objects on site. Self-calibration
methods should be developed; they would broaden image-processing
applications much more widely in general.
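To illustrate, the kind of on-site color discrimination described above can be sketched as simple threshold tests in RGB space followed by a centroid computation. The thresholds, labels, and the tiny synthetic image below are hypothetical illustrations, not any team's tuned values.

```python
# Minimal sketch of color-threshold object discrimination. The RGB
# thresholds are hypothetical; in practice a programmer tunes them on
# site for the current lighting conditions.

def classify_pixel(r, g, b):
    """Label a pixel as ball / goal / opponent / background by RGB thresholds."""
    if r > 180 and g < 100 and b < 100:
        return "ball"        # red ball
    if b > 180 and r < 100 and g < 100:
        return "goal"        # blue goal
    if r > 180 and g > 180 and b < 100:
        return "opponent"    # yellow opponent
    return "background"

def find_object(image, label):
    """Return the centroid (x, y) of pixels matching `label`, or None."""
    xs, ys = [], []
    for y, row in enumerate(image):
        for x, (r, g, b) in enumerate(row):
            if classify_pixel(r, g, b) == label:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return (sum(xs) / len(xs), sum(ys) / len(ys))

# Tiny 2x3 synthetic image: red ball pixels at (1, 0) and (1, 1),
# one blue goal pixel at (2, 1).
image = [
    [(10, 10, 10), (250, 30, 30), (10, 10, 10)],
    [(10, 10, 10), (240, 40, 40), (30, 30, 220)],
]
print(find_object(image, "ball"))  # -> (1.0, 0.5)
```

A self-calibration method would, in effect, adjust the threshold constants automatically instead of leaving them to the programmer.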
Alternatively, visual tracking hardware based on image-intensity
correlation inside window regions can be used to find and track objects
against the complicated background once the initial windows are set
\cite{Inoue92}. A color-tracking version is now commercially available. As
long as the initialized color pattern inside each window does not change
too much, tracking is largely successful. Coping with pattern changes due
to lighting conditions and occlusions is one of the central issues when
using this sort of hardware \cite{Adachi96b}.
As long as the vision system can cope with the above issues and capture
images of both the specified area (the target) and the ball, there should
be no problem \cite{Nakamura95d,Nakamura96a}. To prevent the agent from
losing the target and/or the ball (and, in Levels II and III, the
obstacles), an active vision system with panning and tilting motions seems
preferable, but this complicates the control system and raises the problem
of organizing spatial memory to retain information about lost objects. A
more practical way is to use a wide-angle lens. One extreme of this sort
is an omnidirectional vision system that captures the image all around the
agent. Such lenses seem very useful not only for acquiring the basic
skills but also for realizing cooperative behaviors in multi-agent
environments; conic and hyperboloidal versions are currently commercially
available \cite{Ishiguro96}.
Other perception
With other sensing strategies, the agent must still find the ball (and,
in Levels II and III, the obstacles) and identify the target. Besides
vision, the typical sensors used in mobile robot research are range
finders, sonar, and bumper sensors. However, it seems difficult for any of
them, or any combination of them, to discriminate the ball (and, at higher
levels, the obstacles) and the target, unless special equipment such as a
transmitter is placed inside the ball or the target, or a global
positioning system, in addition to on-board sensing and communication
links, is used to report the positions of all agents. The simplest case is
no on-board sensing at all, only a global positioning system; this is the
approach adopted in the small-robot league of the physical agent track,
because on-board sensing facilities are limited by its size regulations.
In Levels II and III, obstacle avoidance and its coordination with ball
carrying (or passing/shooting) are required. One good strategy is to
assign sensor roles in advance: for example, sonar and bumper sensors are
used for obstacle avoidance, while the vision sensor is used for target
reaching. Alternatively, the robot can learn to assign the sensor roles
\cite{Nakamura96c}.
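A fixed sensor-role assignment of this kind can be sketched as a small priority arbiter: contact and proximity sensors pre-empt vision-based target reaching. The thresholds and command names below are hypothetical assumptions, not taken from the cited work.

```python
# Sketch of hand-assigned sensor roles: bumper and sonar handle obstacle
# avoidance with priority over vision-based target reaching. The distance
# threshold, bearing tolerance, and command names are illustrative.

def select_action(bumper_hit, sonar_range_m, target_bearing_deg):
    """Pick a motor command from sensor readings, avoidance first."""
    if bumper_hit:
        return "back_up"                  # contact: retreat
    if sonar_range_m is not None and sonar_range_m < 0.3:
        return "turn_away"                # obstacle close: avoid
    if target_bearing_deg is None:
        return "search"                   # target not visible: look around
    if abs(target_bearing_deg) > 10:
        return "turn_to_target"           # align heading with vision target
    return "go_forward"                   # clear path: approach the target

print(select_action(False, 1.2, 3.0))   # -> go_forward
print(select_action(False, 0.2, 3.0))   # -> turn_away
```

Learning the assignment, as in the cited approach, would replace this fixed priority ordering with one acquired from experience.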
Action
As described in section \ref{sec:RI}, the total balance of the whole
system is a key issue in designing the robot. For the system to exhibit a
wider variety of behaviors, a more complicated mechanical system and more
sophisticated control techniques are necessary; we should start with a
simpler one and step up from there. The simplest case is a car-like
vehicle that has only two DOFs (degrees of freedom; here, forward motion
and turning) and pushes the ball to the target (dribbling).
The target can be simply a location, the goal (shooting), or a teammate
(passing). In the case of a location, a dribbling skill that carries the
ball there may be sufficient. In the latter cases, the task is to kick the
ball in the desired direction without caring about the final position of
the ball. To distinguish this from a simple dribbling skill, more DOFs may
be needed to realize a kicking motion with one foot (or, we may say, arm).
In the case of passing, velocity control of the ball becomes a technical
issue, because the teammate receiving the pass is not stationary but
moving.
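The velocity-control problem can be illustrated with a minimal lead-pass computation that aims the kick at the teammate's future position, under the simplifying assumptions of constant velocities and no rolling friction; the function and the numbers are hypothetical illustrations.

```python
import math

# Sketch of a lead-pass calculation: aim the kick at where a moving
# teammate will be, assuming constant velocities and ignoring friction.
# A real pass must also model the ball's deceleration after the kick.

def lead_pass(ball, mate, mate_vel, ball_speed):
    """Return the kick direction (unit vector) that meets the teammate.

    Solves |mate + mate_vel*t - ball| = ball_speed * t for the smallest t > 0.
    """
    rx, ry = mate[0] - ball[0], mate[1] - ball[1]
    a = mate_vel[0] ** 2 + mate_vel[1] ** 2 - ball_speed ** 2
    b = 2 * (rx * mate_vel[0] + ry * mate_vel[1])
    c = rx ** 2 + ry ** 2
    if abs(a) < 1e-9:                      # speeds equal: equation is linear
        if abs(b) < 1e-9:
            return None
        t = -c / b
    else:
        disc = b * b - 4 * a * c
        if disc < 0:
            return None                    # no interception possible
        roots = [(-b - math.sqrt(disc)) / (2 * a),
                 (-b + math.sqrt(disc)) / (2 * a)]
        ts = [t for t in roots if t > 0]
        if not ts:
            return None
        t = min(ts)
    ix = mate[0] + mate_vel[0] * t - ball[0]
    iy = mate[1] + mate_vel[1] * t - ball[1]
    norm = math.hypot(ix, iy)
    return (ix / norm, iy / norm)

# Teammate 4 m ahead moving sideways at 1 m/s; ball kicked at 2 m/s.
print(lead_pass((0, 0), (4, 0), (0, 1), 2.0))  # -> roughly (0.866, 0.5)
```

Under friction, the required time-to-intercept grows with distance, so the quadratic above would be replaced by a deceleration model.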
In Levels II and III, obstacle avoidance and its coordination with ball
carrying (or passing/shooting) are required. To switch smoothly between
the two behaviors, the robot should slow down, but this increases the
chance that an opponent takes the ball. To avoid such situations, the
robot must switch behaviors quickly, which causes instability in its
motion. One remedy is an omnidirectionally movable vehicle based on a
sophisticated mechanism \cite{Asama95}, which can move in any direction at
any time. In addition to the motion control problem, further issues must
be considered, such as how to coordinate the two behaviors (the switching
conditions) \cite{Uchibe96c}.
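One simple, hypothetical way to tame the switching instability mentioned above is hysteresis: use different distance thresholds for entering and leaving the avoidance behavior, so that an obstacle hovering near the boundary does not cause rapid chattering between behaviors. The thresholds below are illustrative assumptions.

```python
# Sketch of behavior switching with hysteresis: avoid chattering between
# "carry_ball" and "avoid_obstacle" when an obstacle hovers near the
# switching distance. The two thresholds are illustrative assumptions.

ENTER_AVOID = 0.4   # switch to avoidance when an obstacle is closer (m)
EXIT_AVOID = 0.7    # switch back to carrying only when farther away (m)

def next_behavior(current, obstacle_dist):
    """Return the next behavior given the current one and obstacle distance."""
    if current == "carry_ball" and obstacle_dist < ENTER_AVOID:
        return "avoid_obstacle"
    if current == "avoid_obstacle" and obstacle_dist > EXIT_AVOID:
        return "carry_ball"
    return current      # inside the hysteresis band: keep current behavior

# An obstacle oscillating around 0.5 m does not cause rapid switching.
behavior = "carry_ball"
trace = []
for d in [1.0, 0.5, 0.3, 0.5, 0.6, 0.8]:
    behavior = next_behavior(behavior, d)
    trace.append(behavior)
print(trace)
# -> ['carry_ball', 'carry_ball', 'avoid_obstacle', 'avoid_obstacle',
#     'avoid_obstacle', 'carry_ball']
```

The learned switching conditions of the cited work play the same role as these fixed thresholds, but are acquired from experience.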
Mapping from perception to action
There are several approaches to implementing control mechanisms that
perform the given task. A conventional approach is first to reconstruct a
geometric model of the environment (ball, goal, other agents, etc.), then
deliberate a plan, and finally execute it. However, this sort of approach
is not suitable for a dynamically changing game environment, because the
reconstruction process is time-consuming, although a simple carrying task
(Level I) can be performed this way.
A lookup table mapping perception to action, however obtained, seems
suitable for quick action selection. One can build such an LUT by
hand-coding, given a priori, precise knowledge of the environment (the
ball, the goals, and the other agents) and of the agent model
(kinematics/dynamics). In a simple task domain a human programmer can do
this to some extent, but it seems difficult to cover all possible
situations completely. The opposite approach is to learn action selection
given almost no a priori knowledge. Between these extremes lie several
variations with more or less knowledge. The approaches can be summarized
as follows:
\begin{enumerate}
\item complete hand-coding (no learning),
\item parameter tuning given structural (qualitative) knowledge
(self-calibration),
\item typical reinforcement learning, such as Q-learning, with almost no
a priori knowledge but with the state and action spaces given
\cite{Asada96a,Uchibe96c},
\item action selection with construction of the state and action spaces
\cite{Asada96h,Takahashi96b},
\item tabula rasa learning (nothing assumed?).
\end{enumerate}
These approaches should be evaluated from various viewpoints.
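Approach 3 in the list above can be illustrated on a toy problem; the one-dimensional state space, actions, and reward below are hypothetical simplifications for exposition, not the setups of the cited work.

```python
import random

# Toy Q-learning sketch for approach 3: the state is a coarse ball
# position on a line (0..4, goal at 4) and the two actions push the
# ball left or right. States, actions, and rewards are illustrative.

N_STATES, GOAL = 5, 4
ACTIONS = [-1, +1]                  # push ball left / right
ALPHA, GAMMA, EPS = 0.5, 0.9, 0.1

random.seed(0)
Q = [[0.0, 0.0] for _ in range(N_STATES)]

def greedy(s):
    """Greedy action index for state s, breaking ties randomly."""
    if Q[s][0] == Q[s][1]:
        return random.randrange(2)
    return 0 if Q[s][0] > Q[s][1] else 1

for _ in range(500):                # training episodes
    s = 0
    while s != GOAL:
        a = random.randrange(2) if random.random() < EPS else greedy(s)
        nxt = max(0, min(N_STATES - 1, s + ACTIONS[a]))
        r = 1.0 if nxt == GOAL else 0.0
        target = r if nxt == GOAL else r + GAMMA * max(Q[nxt])
        Q[s][a] += ALPHA * (target - Q[s][a])  # standard Q-learning update
        s = nxt

policy = [greedy(s) for s in range(N_STATES - 1)]
print(policy)  # expect [1, 1, 1, 1]: always push the ball toward the goal
```

The real challenge, of course, lies in the continuous, high-dimensional sensor and actuator spaces; constructing the discrete state and action spaces themselves is exactly approach 4.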
Evaluation
In order to evaluate the achieved skills, we set up the following
standard tasks with some variations.
\begin{enumerate}
\item Move the ball into a specified circular area of radius 25 cm in
the middle-size class and 8 cm in the small-size class. Variations are
different initial configurations of the ball, the robot, and the target
area.
\item Move the ball into the goal, whose size is specified in the
regulations. Variations are the same as above.
\item Task 1 with stationary obstacles. Different obstacle densities
make the variations.
\item Task 2 with stationary obstacles. Different obstacle densities
make the variations.
\item Task 1 with moving obstacles. The number of obstacles is fixed,
but their policy, including speed, can change from task to task.
\item Task 2 with moving obstacles. The number of obstacles is fixed,
but their policy, including speed, can change from task to task.
\end{enumerate}
Speed and accuracy are the main issues in Level I. In Pre-RoboCup96,
the hand-coded teams outperformed the AI-based teams in the simulation
league. How will this tendency change in the physical agent track,
especially in Levels II and III?
The RoboCup Physical Agent Challenge 97 (II)
-- ball catching challenge
Objectives
The objective of this challenge is to examine how the most fundamental
skill of catching a ball under several conditions, such as pass receiving
(Task A), goal keeping (Task B), or intercepting (Task C), can be
acquired, and to evaluate the merits and demerits of the realized skills
using standard tasks.
Technical Issues
In addition to the issues in challenge (I), several issues remain:
- Task A:
Predicting the ball's speed and direction is a key issue in receiving
the ball. To receive a passed ball while moving, the relationship between
the moving ball and the agent's own motion must be made clear.
- Task B:
In addition to the above, protecting the goal is important. To estimate
the goal position, the agent may have to watch the goal area lines and the
penalty area line. Again, an omnidirectional lens is much better for
seeing the incoming ball and the goal position simultaneously. Inside the
goal area the agent can receive and hold the ball, while outside it the
agent may only kick the ball away (protecting the goal rather than
receiving). Discriminating these lines may make the vision problem
considerably more complicated.
- Task C:
This seems similar to Task A, but the main difference is taking the
ball from a pass between opponents. It requires more accurate prediction
of the motions not only of the ball but also of the opponents (both passer
and receiver), and it places a greater load on the perception module.
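All three tasks rest on predicting where the moving ball will be. The simplest sketch is constant-velocity extrapolation from two timed observations; a real ball decelerates under rolling friction, so this is only a first approximation, and the numbers below are illustrative.

```python
# Sketch of ball-motion prediction shared by Tasks A-C: estimate the
# ball's velocity from two timed observations and extrapolate assuming
# constant velocity (ignoring rolling friction).

def predict_ball(p0, t0, p1, t1, t_future):
    """Linearly extrapolate the ball position from two observations."""
    dt = t1 - t0
    vx = (p1[0] - p0[0]) / dt
    vy = (p1[1] - p0[1]) / dt
    h = t_future - t1
    return (p1[0] + vx * h, p1[1] + vy * h)

# Ball seen at (0, 0) at t=0 s and at (1, 0.5) at t=0.5 s:
# where will it be at t=1.5 s?
print(predict_ball((0, 0), 0.0, (1, 0.5), 0.5, 1.5))  # -> (3.0, 1.5)
```

Task C would apply the same prediction to the opponents' positions as well, intercepting the ball where the predicted pass trajectory comes within reach.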
Evaluation
Although we can test Tasks A and B with human players, we would like to
test them using the skills obtained in challenge (I). For Task C there are
no opponents passing the ball to each other, so what we can do is increase
the speed of the ball. After the following challenge, we can check both
skills, that is, passing and intercepting, though each should of course be
evaluated separately in advance.
The RoboCup Physical Agent Challenge 97 (III)
-- cooperative behavior challenge (pass and receive)
Objectives
The objective of this challenge is to examine how the most fundamental
cooperative skill, passing a ball between two players, can be acquired,
and to evaluate the merits and demerits of the realized skills using
standard tasks.
Challenge (III) focuses on a basic skill of cooperative behavior
between two agents, whereas challenges (I) and (II) focus on the basic
skills of a single agent, even though the environment includes other
agents (possibly opponents). If challenges (I) and (II) are successfully
achieved, passing the ball between two players might become easy: simply a
combination of passing and receiving skills. From the viewpoint of
cooperative behavior, however, there are further issues.
Technical Issues
In addition to the issues in challenges (I) and (II), three issues
remain.
- Since the control architecture is decentralized rather than
centralized, each agent should know the passing and receiving capabilities
not only of itself but also of the other. The simplest case is that the
other agent has the same skill level; otherwise, the agent must estimate
the other's skill level, which amounts to agent modeling. Learning from
observation \cite{Kuniyoshi93} seems promising, but the problem also
involves partial observability due to limited perception.
- Even if both agents have reasonable passing and receiving skills, the
timing of passing and receiving must be learned between the two. If both
agents try to improve their behaviors simultaneously, learning will not
converge, because each agent's policy changes while the other is
learning. To prevent this, one agent should act as a coach (with a fixed
policy) and the other as a learner; in that case, modeling the learner
becomes another issue for good teaching.
- The selection of the passing direction depends on the motions of the
opponents. This raises the opponent-modeling issue, which makes the
cooperative behavior much harder to realize.
Evaluation
Since the full challenge, with its many issues, is very hard at the
current stage, Challenge 97 will only check the feasibility of cooperative
behavior in a benign environment, that is, two players of the same skill
level with no opponents.