The RoboCup Physical Agent Challenge 97

Overview

Physical bodies play an important role in enabling agents to develop complex behaviors that achieve goals in the dynamic real world, an aspect to which traditional AI research has paid little attention. The RoboCup Physical Agent Challenge provides a good testbed for examining how physical bodies contribute to realizing intelligent behaviors within the RoboCup framework \cite{Kitano_et_al95}. RoboCup is an attempt to foster AI and robotics research by using the soccer game as a representative domain in which a wide range of technologies can be integrated and new technologies can be developed. While RoboCup envisions a set of longer-range challenges over the next few decades, it also encompasses various short-term challenge goals. In this paper, we present three technical challenges as the RoboCup Physical Agent Challenge 97: (1) moving the ball to a specified area (shooting, passing, and dribbling) with no, stationary, or moving obstacles; (2) catching the ball from an opponent or a common-side player (receiving, goal-keeping, and intercepting); and (3) passing the ball between two players. The first two concern single-agent skills, while the third involves a simple cooperative behavior. We explain why we set up these challenges and how the realized skills should be evaluated.

The ultimate goal of AI, and probably of robotics, is to build intelligent systems capable of developing complex behaviors that accomplish given tasks through interactions with a dynamically changing physical world. Traditional AI research has mainly sought methodologies of symbol manipulation for knowledge acquisition, representation, and reasoning, with little attention to their application in dynamic real worlds. Robotics, meanwhile, has put much more emphasis on the design and construction of hardware systems and their control.
However, recent topics spanning the two areas include design principles of autonomous agents, multi-agent collaboration, strategy acquisition, real-time reasoning and planning, intelligent robotics, sensor fusion, and behavior learning. These topics expose new aspects with which traditional approaches seem to have difficulty coping. In order to address these issues and ultimately achieve this goal, physical bodies play an important role in enabling the system to interact with physical environments, which lets the system learn from the environment and develop its internal representation. The meanings of ``having a physical body'' can be summarized as follows:
• Sensing and acting capabilities are not separable, but tightly coupled.
• In order to accomplish the given tasks, the sensor and actuator spaces should be abstracted under resource-bounded conditions (memory, processing power, controllers, etc.).
• The abstraction depends both on the fundamental embodiment of the agent and on its experiences (interactions with its environment).
• The consequence of the abstraction is an agent-based, subjective representation of the environment, and it can be evaluated through the consequences of the agent's behaviors.
Even as we advocate the importance of ``having a physical body,'' we must show that systems built on this principle perform well on the new issues in a concrete task domain. In other words, we need a standard problem, widely recognized as new, that exposes various important aspects of intelligent behavior in real worlds.

Research Issues of RoboCup Physical Agent Track

The research issues of RoboCup include adaptive behaviors, flexible and multimodal cooperative behaviors among multiple agents, and optimal communication strategies in complex, dynamic, and uncertain real worlds. The physical agent track in particular focuses on the integration of hardware, software, and communication. The RoboCup physical agent challenges are summarized as follows:

Perception:
The player should observe, in real time, the behaviors of other objects that it cannot completely predict or control, in order to act appropriately toward them. Such objects include the ball, opponents, in some sense common-side players, and even the referee. Capabilities of wide-range perception, discrimination of other agents, and estimation of their locations and motions under occlusion are needed. Such perception is a basic technology for expanding robotic applications.

Action:
The player needs the capabilities of quick acceleration/deceleration and turning, stable motion, and both skill and power in kicking or trapping the ball. Some of these requirements conflict with one another. For example, the more powerfully the player kicks, the less precisely it can do so. Improving the mechanical stiffness to satisfy both would increase mass and energy consumption, which in turn affects other motion capabilities. In addition, the prohibition of wired connections and the limitations on weight severely constrain the mechanical design. Therefore, not only high skill in each individual motion but also a good total balance of the whole system is required.

Situation and Behavior:
The task domain itself is simple, but an almost infinite number of situations arises from dynamic changes in the positions and relative motions of the ball, the goal, and the players, and from the context of the game. The optimal behavior changes from one situation to another. Since our goal is more than a ``dumb'' soccer-playing team, we need abilities beyond simple reflexive behaviors: situation understanding, tactics selection and modification, minimal communication with common-side players, and teamwork behaviors acquired through practical training. These issues are closely related to cognitive questions such as the organization of a spatio-temporal memory of the world and the categorization of sets of motor behaviors into skills (symbols?) grounded by the physical body.

Realtime:
Since the situation changes rapidly with the motions of the ball and the other players, there is no time to carefully analyze the situation and deliberate over a plan. Therefore, the player should act immediately: kicking the ball at once, dribbling it when no opponents are nearby in order to wait for a better situation, or moving to a certain position in synchrony with a common-side player's motion.

The challenges described above are significant long-term ones for realizing a good soccer-playing robot team, and will take a few decades to meet. However, owing to the clarity of the final target, several subgoals can be derived, which define mid-term and short-term challenges. One of the major reasons why RoboCup is attractive to so many researchers is that it requires the integration of a broad range of technologies into a team of complete agents, as opposed to a task-specific functional module. The long-term research issues are too broad to compile as a list of specific items. Nevertheless, they involve a broad range of technological issues, ranging from the development of physical components, such as high-performance batteries and motors, to highly intelligent real-time perception and control software. The mid-term technical challenges, which are the target for the next 10 years, can be made more concrete, and a partial list of specific topics can be compiled. The following is a partial list of research areas involved in the RoboCup physical agent track, mainly targeted at the mid-term time span:

• agent architecture in general,
• implementation of realtime and robust sensing,
• realization of stable and high-speed robot control,
• sensor fusion,
• behavior learning in multi-agent environments, and
• cooperation in dynamic environments.
The RoboCup Physical Agent Challenge should be understood in the context of larger and longer-range challenges, rather than as a one-shot challenge. Thus, we wish to provide a series of short-term challenges which naturally lead to the accomplishment of the mid-term and long-term challenges. The RoboCup Physical Agent Challenge-97 is the first attempt of this initiative, together with the RoboCup Multi-Agent Challenge-97.

Overview of The RoboCup Physical Agent Challenge-97

For the RoboCup Physical Agent Challenge-97, we offer three specific challenges, essential not only for RoboCup but also for mobile robotics research in general. These challenges specifically address the real robot league, rather than RoboCup's simulator league; challenges for software agents will be described elsewhere. The fundamental issue for researchers who wish to build real robot systems that play soccer in RoboCup is how to obtain the basic skills for controlling a ball in various kinds of situations. Typical examples are shooting the ball into the goal, intercepting the ball from an opponent, and passing the ball to a common-side player. These skills are needed to realize cooperative behaviors with common-side players and competitive ones against opponents in a soccer game. Among the basic ball-control skills, we selected three challenges as the RoboCup Physical Agent Challenge-97:
• moving the ball to the specified area with no, stationary, or moving obstacles,
• catching the ball from an opponent or a common side player, and
• passing the ball between two players.
These three challenges have many variations in different kinds of situations, such as passing, shooting, dribbling, receiving, and intercepting the ball, with or without opponents whose defensive skills vary from amateur to professional level. Although they may seem specific to RoboCup, these challenges can be regarded as very general tasks in mobile robotics research on flat terrain. Since target reaching, obstacle avoidance, and their coordination are basic tasks in that area, shooting the ball while avoiding opponents that try to block the player should be ranked among the most difficult challenges in the field. Once a robot succeeds in acquiring these skills, it can in principle move anything anywhere within its physical capabilities. From another point of view, these three challenges can be regarded as a sequence of one task that guides the growth of the complexity of the internal representation in step with the complexity of the environment \cite{Asada96j}. In the case of visual sensing, the agent can discriminate the static environment (and its own body, if observed) from everything else by directly correlating the motor commands it issues with the visual information observed during their execution; such things can be labeled as the self body or the stationary environment. Other active agents, by contrast, have no simple and straightforward relationship with the agent's own motions. In the early stage they are treated as noise or disturbance, having no direct visual correlation with the agent's motor commands; later, they can be found to have more complicated, higher-order correlations (cooperation, competition, and others), and the complexity increases drastically. Between these extremes lies the ball, which can be stationary or moving as a result of the agent's own or other agents' motions.
The complexities of both the environment and the agent's internal representation belong to cognitive issues in general, and such issues are naturally involved in this challenge. In the following, we describe the challenges more concretely.

The RoboCup Physical Agent Challenge 97 (I) -- ball moving challenge

Objectives

The objective of this challenge is to examine how the most fundamental skill, moving a ball to a specified area under several conditions with no (Level I), stationary (Level II), or moving (Level III) obstacles in the field, can be acquired with various kinds of agent architectures, and to evaluate the merits and demerits of the realized skills using standard tasks. The specifications of the ball and of the field surface are a common issue for all challenges in the physical agent track. In order for various behaviors to emerge, the field surface should be neither so rough that it prevents the ball from rolling nor so smooth that there is no friction: the former would rule out kicks and passes, the latter dribbling. Since the task environment varies little in Level I, the focus there is on agent architecture, especially sensing capability. In Level II, motion control is the central issue, and in Level III, prediction of obstacle motion is the key issue.
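As a rough first-order check on these surface specifications, the rolling ball can be modeled as decelerating at a constant rate $\mu g$, where $\mu$ is an effective rolling-friction coefficient (an illustrative simplification, not part of the regulations):

```latex
v(t) = v_0 - \mu g t, \qquad d = \frac{v_0^2}{2 \mu g}
```

For example, a pass kicked at $v_0 = 2$ m/s on a surface with $\mu = 0.1$ travels roughly $2$ m before stopping; a surface with much larger $\mu$ kills passes, while $\mu \to 0$ makes dribbling impossible.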

Technical Issues

Vision

General computer and robot vision issues are too broad to deal with here. Finding and tracking independently moving objects (the ball, players, the referee) and estimating their motion parameters (2-D, and further 3-D) against a complicated background (field lines, goals, corner poles, flags waved by supporters in the stadium) is too difficult for current computer and robot vision technology to perform completely in real time.

In order to focus on skill acquisition, visual image processing should be drastically simplified. Discrimination by color information, such as a red ball, a blue goal, or a yellow opponent, makes it easy to find and track objects in real time \cite{Asada96a}. Nevertheless, robust color discrimination is a tough problem because the digitized signals are highly sensitive to slight changes in lighting conditions. In the case of remote (wireless) processing, additional noise from environmental factors causes fatal errors in image processing. Currently, a human programmer adjusts the key parameters used to discriminate colored objects on site. Self-calibration methods should be developed; they would greatly widen the applicability of image processing in general.
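To illustrate how such color discrimination works, and how brittle it is, a minimal color-based ball finder might look as follows; the HSV band and the saturation/brightness cutoffs are exactly the kind of parameters a programmer currently tunes on site (all values here are illustrative assumptions):

```python
import numpy as np

# Hypothetical HSV thresholds for a "red ball"; in practice these are the
# key parameters a human operator re-tunes for each lighting condition.
BALL_HUE_RANGE = (0, 10)   # assumed hue band (OpenCV-style 0-179 scale)
BALL_SAT_MIN = 100         # minimum saturation, rejects washed-out pixels
BALL_VAL_MIN = 50          # minimum brightness, rejects shadow pixels

def find_ball(hsv_image):
    """Return the centroid (row, col) of ball-colored pixels, or None.

    hsv_image: H x W x 3 uint8 array in HSV color space.
    """
    h, s, v = hsv_image[..., 0], hsv_image[..., 1], hsv_image[..., 2]
    mask = ((h >= BALL_HUE_RANGE[0]) & (h <= BALL_HUE_RANGE[1])
            & (s >= BALL_SAT_MIN) & (v >= BALL_VAL_MIN))
    if not mask.any():
        return None            # ball lost: the caller must start a search
    rows, cols = np.nonzero(mask)
    return rows.mean(), cols.mean()
```

A small lighting change that shifts pixels outside the fixed band makes the ball vanish from the mask, which is the failure mode motivating self-calibration.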

Alternatively, visual tracking hardware based on image-intensity correlation inside a window region can be used to find and track objects against a complicated background once the initial windows are set \cite{Inoue92}. A color-tracking version is currently commercially available. As long as the initialized color pattern inside each window does not change much, tracking is largely successful. Coping with pattern changes due to lighting conditions and occlusions is one of the central issues when using this sort of hardware \cite{Adachi96b}.
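Such window-based trackers essentially search for the window position that minimizes an intensity-difference score against a stored template. A toy sum-of-absolute-differences (SAD) version of that idea, as a sketch only (the search radius and array shapes are arbitrary):

```python
import numpy as np

def track(frame, template, prev_rc, search=5):
    """Return the top-left (row, col) of the window minimizing the sum of
    absolute differences (SAD) to the template, searching +/- `search`
    pixels around prev_rc, the window position in the previous frame.

    frame: 2-D grayscale array; template: smaller 2-D array.
    """
    th, tw = template.shape
    best, best_rc = None, prev_rc
    for dr in range(-search, search + 1):
        for dc in range(-search, search + 1):
            r, c = prev_rc[0] + dr, prev_rc[1] + dc
            if r < 0 or c < 0 or r + th > frame.shape[0] or c + tw > frame.shape[1]:
                continue       # candidate window falls outside the image
            diff = frame[r:r+th, c:c+tw].astype(int) - template.astype(int)
            score = np.abs(diff).sum()
            if best is None or score < best:
                best, best_rc = score, (r, c)
    return best_rc
```

The failure modes mentioned above appear directly here: if lighting or occlusion changes the pixels under the window, the SAD minimum no longer corresponds to the object.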

As long as the vision system can cope with the above issues and capture images of both the specified area (the target) and the ball, there should be no problem \cite{Nakamura95d,Nakamura96a}. To prevent the agent from losing the target and/or the ball (and, in Levels II and III, the obstacles), an active vision system with panning and tilting motions seems preferable, but this complicates the control system and raises the problem of organizing spatial memory to retain information about lost objects. A more practical way is to use a wider-angle lens. One extreme of this sort is an omnidirectional vision system that captures the image all around the agent. Such a lens seems very useful not only for acquiring the basic skills but also for realizing cooperative behaviors in multi-agent environments; conic and hyperboloidal versions are currently commercially available \cite{Ishiguro96}.

Other perception

With other sensing strategies, the agent should find the ball (and, in Levels II and III, the obstacles) and know what the target is. Besides vision, typical sensors used in mobile robot research are range finders, sonar, and bumper sensors. However, it seems difficult for any of them, or any combination of them, to discriminate the ball (and, at the higher levels, the obstacles) and the target, unless special equipment such as a transmitter is placed inside the ball or the target, or a global positioning system, in addition to on-board sensing, is used together with communication links to broadcast the positions of all agents. The simplest case is no on-board sensing but only a global positioning system; this is the approach adopted in the small-robot league of the physical agent track, because on-board sensing facilities are limited by the size regulations.

In Levels II and III, an obstacle avoidance behavior and its coordination with the ball-carrying (or passing/shooting) behavior are required. One good strategy is to assign the sensor roles in advance; for example, the sonar and bumper sensors are used for obstacle avoidance while the vision sensor is used for target reaching. One can also make the robot learn to assign the sensor roles \cite{Nakamura96c}.
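The fixed role-assignment strategy can be sketched as a simple priority arbiter; the thresholds and velocities below are illustrative assumptions, not values from the challenge:

```python
def select_action(bumper_hit, sonar_range_m, ball_bearing_rad):
    """Arbitrate between obstacle avoidance and target reaching.

    Sensor roles are assigned in advance (the assumption of this sketch):
    bumper and sonar trigger avoidance; vision drives ball reaching.
    Returns (linear_velocity_m_s, angular_velocity_rad_s).
    """
    SAFE_RANGE = 0.4            # assumed sonar threshold in meters
    if bumper_hit:
        return -0.2, 1.0        # contact: back off and turn away
    if sonar_range_m is not None and sonar_range_m < SAFE_RANGE:
        return 0.1, 0.8         # obstacle close: slow down and steer around
    if ball_bearing_rad is None:
        return 0.0, 0.5         # ball lost: rotate in place to search
    return 0.5, 1.5 * ball_bearing_rad   # proportional steering to the ball
```

The fixed priority (bumper over sonar over vision) is the hand-coded counterpart of what a learning method such as \cite{Nakamura96c} would acquire.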

Action

As described in section \ref{sec:RI}, the total balance of the whole system is a key issue in designing the robot. In order for the system to exhibit a wider variety of behaviors, a more complicated mechanical system and more sophisticated control techniques are necessary. We should start with a simpler system and then step up. The simplest case is a car-like vehicle with only two DOFs (degrees of freedom; here, forward motion and turning) that pushes the ball to the target (dribbling).

The target can be a mere location, the goal (shooting), or one of the common-side players (passing). In the case of a location, a dribbling skill that carries the ball there might be sufficient. In the latter cases, the task is to kick the ball in the desired direction without caring about the final position of the ball. To distinguish this from a simple dribbling skill, we may need more DOFs to realize a kicking motion with a foot (or what we might call an arm). In the case of passing, velocity control of the ball may be a technical issue because the common-side player receiving the pass is not stationary but moving.

In Levels II and III, an obstacle avoidance behavior and its coordination with the ball-carrying (or passing/shooting) behavior are required. To switch smoothly between the two behaviors, the robot should slow down, but this increases the chance that an opponent takes the ball. To avoid such situations, the robot may switch behaviors quickly, which in turn destabilizes its motion. One can use an omnidirectionally movable vehicle based on a sophisticated mechanism \cite{Asama95}; such a vehicle can move in any direction at any time. Beyond the motion control problem, there are further issues to consider, such as how to coordinate the two behaviors (the switching conditions) \cite{Uchibe96c}.
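One common way to switch quickly without chattering between the two behaviors is hysteresis: separate enter and exit thresholds on the switching condition. A minimal sketch (the 0.3 m / 0.6 m thresholds are assumed values):

```python
class BehaviorSwitch:
    """Two-threshold (hysteresis) switching between 'avoid' and 'carry'.

    A single distance threshold makes the robot chatter between behaviors
    near the boundary; separated enter/exit thresholds keep each behavior
    active long enough for stable motion.
    """
    ENTER_AVOID = 0.3   # switch to avoidance when an obstacle is this close
    EXIT_AVOID = 0.6    # return to ball carrying only beyond this range

    def __init__(self):
        self.mode = "carry"

    def update(self, obstacle_range_m):
        if self.mode == "carry" and obstacle_range_m < self.ENTER_AVOID:
            self.mode = "avoid"
        elif self.mode == "avoid" and obstacle_range_m > self.EXIT_AVOID:
            self.mode = "carry"
        return self.mode
```

Between 0.3 m and 0.6 m the current behavior persists, so small range fluctuations near the boundary cannot toggle the mode every control cycle.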

Mapping from perception to action

There are several approaches to implementing control mechanisms that perform the given task. A conventional approach is first to reconstruct a geometric model of the environment (ball, goal, other agents, etc.), then deliberate over a plan, and finally execute it. However, this sort of approach is unsuitable for a dynamically changing game environment because of its time-consuming reconstruction process, although it can perform a simple carrying task (Level I).

A look-up table (LUT) encoding the mapping from perception to action, however it is obtained, seems suitable for quick action selection. One can build such an LUT by hand-coding, given a priori precise knowledge of the environment (the ball, the goals, and other agents) and of the agent model (kinematics/dynamics). In a simple task domain a human programmer can do this to some extent, but covering all possible situations completely seems difficult. The opposite approach is to learn action selection given almost no a priori knowledge. Between these extremes lie several variations with more or less knowledge. The approaches are summarized as follows:
\begin{enumerate}
\item complete hand-coding (no learning),
\item parameter tuning given structural (qualitative) knowledge (self-calibration),
\item typical reinforcement learning, such as Q-learning, with almost no a priori knowledge but with the state and action spaces given \cite{Asada96a,Uchibe96c},
\item action selection with state and action space construction \cite{Asada96h,Takahashi96b}, and
\item tabula rasa learning (nothing assumed?).
\end{enumerate}
These approaches should be evaluated from various points of view.
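Approach 3 can be illustrated with a tiny tabular Q-learning example. The 1-D ``approach the ball'' task, its discretization, and all learning parameters below are toy assumptions for illustration, not a prescription for a real robot:

```python
import random

def q_learning_demo(n_states=10, episodes=300, alpha=0.5, gamma=0.9, eps=0.1):
    """Tabular Q-learning on a toy task: the state is the discretized
    distance to the ball; action 0 moves one cell toward the ball and
    action 1 one cell away; reaching cell 0 ("ball contact") pays reward 1.
    The table q is exactly the perception-to-action LUT discussed above,
    filled in by learning rather than by hand.
    """
    q = [[0.0, 0.0] for _ in range(n_states)]
    rng = random.Random(0)                 # fixed seed for repeatability
    for _ in range(episodes):
        s = n_states - 1                   # start farthest from the ball
        while s > 0:
            # epsilon-greedy action selection
            if rng.random() < eps:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] >= q[s][1] else 1
            s2 = s - 1 if a == 0 else min(n_states - 1, s + 1)
            r = 1.0 if s2 == 0 else 0.0
            # standard Q-learning update
            q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])
            s = s2
    return q
```

After training, the greedy action in every non-terminal state is ``move toward the ball,'' i.e., the learned LUT implements the obvious policy without it ever being coded by hand.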

Evaluation

In order to evaluate the achieved skills, we set up the following standard tasks with some variations.
1. Move the ball to a specified circular area whose radius is 25 cm in the middle-size league and 8 cm in the small-size league. Variations are different configurations of the ball near the robot and the target area.
2. Move the ball into the goal, whose size is specified in the regulations. Variations are the same as above.
3. Task 1 with stationary obstacles. Different obstacle densities make the variations.
4. Task 2 with stationary obstacles. Different obstacle densities make the variations.
5. Task 1 with moving obstacles. The number of obstacles is fixed, but their policy, including speed, can change from task to task.
6. Task 2 with moving obstacles. The number of obstacles is fixed, but their policy, including speed, can change from task to task.
Speed and accuracy are the main issues in Level I. In Pre-RoboCup-96, the hand-coded teams outperformed the AI-based teams in the simulation league. How will this tendency change in the physical agent track, especially in Levels II and III?

The RoboCup Physical Agent Challenge 97 (II) -- ball catching challenge

Objectives

The objective of this challenge is to examine how the most fundamental skill of catching a ball under several conditions, such as pass receiving (Task A), goal keeping (Task B), or intercepting (Task C), can be acquired, and to evaluate the merits and demerits of the realized skills using standard tasks.

Technical Issues

In addition to the issues in challenge (I), several issues remain.
Task A (pass receiving): Prediction of the ball's speed and direction is a key issue in receiving the ball. To receive a passed ball while moving, the agent must work out the relationship between the moving ball and its own motion.
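The prediction issue can be sketched with a constant-velocity ball model and a brute-force search for the earliest reachable interception time; the time step and horizon are arbitrary choices, and friction is deliberately ignored:

```python
import math

def predict_ball(p0, v, t):
    """Constant-velocity prediction of the ball position (a deliberately
    simple model that ignores rolling friction).
    p0, v: (x, y) position in m and velocity in m/s; t: seconds ahead."""
    return (p0[0] + v[0] * t, p0[1] + v[1] * t)

def intercept_time(ball_p, ball_v, robot_p, robot_speed, dt=0.05, horizon=5.0):
    """Earliest time at which a robot moving straight at robot_speed can
    reach the predicted ball position; None if unreachable in the horizon."""
    for i in range(int(horizon / dt) + 1):
        t = i * dt
        bx, by = predict_ball(ball_p, ball_v, t)
        dist = math.hypot(bx - robot_p[0], by - robot_p[1])
        if dist <= robot_speed * t:     # robot can be there by time t
            return t
    return None
```

For example, a ball at the origin rolling at 1 m/s toward a robot 2 m away that also moves at 1 m/s meets it after about one second, halfway between them.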
Task B (goal keeping): In addition to the above issue, protecting the goal is important. To estimate the goal position, the agent may have to watch the goal area lines and the penalty area line; again, an omnidirectional lens is much better suited to seeing the incoming ball and the goal position simultaneously. Inside the goal area line, the agent can receive and hold the ball, while outside this line it may have to kick the ball away (not receiving it, just protecting the goal). Discriminating these lines may make the vision task considerably more complicated.
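For goal keeping, a first cut is to move the keeper to the point where the ball's path crosses the goal line. A minimal sketch under a straight-line, constant-velocity ball model (friction and spin ignored, as an assumption of this sketch):

```python
def goal_line_crossing(ball_p, ball_v, goal_x):
    """Predict the y coordinate at which a ball moving with constant
    velocity crosses the goal line x = goal_x; None if it never will.

    ball_p, ball_v: (x, y) position in m and velocity in m/s.
    """
    if ball_v[0] == 0.0:
        return None                 # ball not moving toward or away in x
    t = (goal_x - ball_p[0]) / ball_v[0]
    if t <= 0.0:
        return None                 # ball moving away from the goal line
    return ball_p[1] + ball_v[1] * t
```

The keeper then positions itself at the predicted y (clamped to the goal mouth); a None result means no repositioning is needed yet.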
Task C (intercepting): This seems similar to Task A, but the main difference is taking the ball from a pass between opponents. It requires more accurate prediction of the motions of not only the ball but also the opponents (both passer and receiver), and places a heavier load on the perception module.

Evaluation

Although we can test Tasks A and B with human players, we would like to test them with the skills obtained in challenge (I). For Task C, there are no opponents passing the ball to each other, so what we can do is increase the speed of the ball. After the following challenge (III), we can check both skills, that is, the passing skill and the intercepting one. Of course, each should also be evaluated separately in advance.

The RoboCup Physical Agent Challenge 97 (III) -- cooperative behavior challenge (pass and receive)

Objectives

The objective of this challenge is to examine how the most fundamental cooperative skill, passing a ball between two players, can be acquired, and to evaluate the merits and demerits of the realized skills using standard tasks.

Challenge (III) focuses on a basic skill of cooperative behavior between two agents, while challenges (I) and (II) focus on the basic skills of a single agent, even though the environment includes other agents (possible opponents). If challenges (I) and (II) are successfully achieved, passing the ball between two players might seem easy: a combination of the passing and receiving skills. From the viewpoint of cooperative behavior, however, there are further issues.

Technical Issues

In addition to the issues in challenges (I) and (II), three issues remain.
• Since the control architecture is not centralized but decentralized, each agent should know the passing and receiving capabilities not only of itself but also of the other agent. The simplest case is that the other agent has the same level of skill; otherwise, the agent should estimate the skill level of the other, which amounts to agent modeling. Learning from observation \cite{Kuniyoshi93} seems promising, but the problem includes partial observability due to the limitations of perception.
• Even if both agents have reasonable passing and receiving skills, the timing of passing and receiving must be learned between the two. If both agents try to learn to improve their behaviors simultaneously, the learning may not converge, because each agent's policy changes while the other is learning. To prevent this, one of the agents should act as a coach (with a fixed policy) and the other as a learner. In that case, modeling the learner is a further issue for good teaching.
• The selection of the passing direction depends on the motions of opponents. This raises the opponent-modeling issue, which makes the cooperative behavior much harder to realize.

Evaluation

Since a challenge with this many issues is very hard at the current stage, the 97 challenge will only check the possibility of cooperative behavior in a benign environment, that is, two players of the same skill level with no opponents.