Jun Rekimoto
Sony Computer Science Laboratories Inc.
3-14-13 Higashigotanda, Shinagawa-ku,
Tokyo 141 Japan
+81-3-5447-4380
rekimoto@csl.sony.co.jp
http://www.csl.sony.co.jp/person/rekimoto.html
abstract
This paper proposes a new field of user interfaces called multi-computer direct manipulation and presents a pen-based direct manipulation technique that can be used for data transfer between different computers as well as within the same computer. The proposed Pick-and-Drop allows a user to pick up an object on a display and drop it on another display as if he/she were manipulating a physical object. Even though the pen itself does not have storage capabilities, a combination of Pen-ID and the pen manager on the network provides the illusion that the pen can physically pick up and move a computer object. Based on this concept, we have built several experimental applications using palm-sized, desk-top, and wall-sized pen computers. We also considered the importance of physical artifacts in designing user interfaces in a future computing environment.
keywords
direct manipulation, graphical user interfaces, input devices, stylus interfaces, pen interfaces, drag-and-drop, multi-computer user interfaces, ubiquitous computing, computer augmented environments
However, using multiple computers without considering the user-interface introduces several problems. The first problem resides in a restriction of today's input devices. Almost all keyboards and pointing devices are tethered to a single computer; we cannot share a mouse between two computers. Therefore, using multiple computers on the same desk top often results in a ``mouse (or keyboard) jungle'', as shown in Figure~\ref{fig:jungle}. It is very confusing to distinguish which input device belongs to which computer.
The other problem is the fact that today's user interface techniques are not designed for multiple-computer environments. Oddly enough, as compared with remote file transmission, it is rather cumbersome to transfer information from one computer to another on the same desk, even though they are connected by a network. A cut-and-paste on a single computer is easy, but the system often forces users to transfer information between computers in a very different way. A quick survey reveals that people transfer information from display to display quite irregularly (Table 1). Interestingly, quite a few people even prefer to transfer data {\em by hand} (e.g., read a text string on one display and type it on another computer), especially for short text segments such as an e-mail address or a universal resource locator (URL) for the World Wide Web. We consider these tendencies to be caused by a lack of easy direct data transfer user interfaces (e.g., copy-and-paste or drag-and-drop) between different but nearby computers.
0 | 1 | 2 | > 3 |
0% | 7.7 % | 38.5 % | 53.8 % |
By | Through | By | By | Through | |
hand | shared files | ftp | floppies | Other | |
62.9 % | 62.9 % | 57.1 % | 34.3 % | 20.0 % | 22.9% |
The first problem is partially solved by using more sophisticated input devices such as a stylus. Today's stylus input devices, such as WACOM's, provide untethered operation and thus can be shared among many pen-sensitive displays. This situation is more natural than that of a mouse, because in the physical world, we do not have to select a specific pencil for each paper. With the second problem, however, we have much room for improvement from the viewpoint of user interfaces.
Although some systems use multi-display configurations [bolt80,PARC-CSL-95-1,dualui], direct manipulation techniques for multi-display environments have not been well explored to date. We believe that the concept of multi-display direct manipulation offers many new design challenges to the field of human-computer interfaces.
In this paper, we propose a new pen-based interaction technique called "Pick-and-Drop". This technique lets a user exchange information from one display to another in the manner of manipulating a physical object. This technique is a natural extension to the drag-and-drop technique, which is popular in many of today's GUI applications. Figure 2 shows the conceptual difference between the traditional data transfer method and Pick-and-Drop.
Figure 2: The conceptual difference between remote copy and Pick-and-Drop
However, simply applying the drag-and-drop to pen user interfaces presents a problem. It is rather difficult to drag an object with a pen while keeping the pen tip in contact with the display surface. It is often the case that a user accidentally drops an object during the drag operation, especially when dragging over a large display surface.
Development of our proposed Pick-and-Drop method started as a useful alternative to drag-and-drop for overcoming this problem. With Pick-and-Drop, the user first picks up a computer object by tapping it with the pen tip and then lifts the pen from the screen. After this operation, the pen virtually holds the object. Then, the user moves the pen tip towards the designated position on the screen without contacting the display surface. When the pen tip comes close enough to the screen, a shadow of the object appears on the screen (Figure 3) as visual feedback showing that the pen has the data. Then, the user taps the screen with the pen and the object moves from the pen to the screen at the tapped position. This method looks much more natural than that of drag-and-drop. In our real lives, we regularly pick up an object from one place and drop it on another place, rather than sliding it along the surface of something. We would also like to mention that this Pick-and-Drop metaphor might be more familiar to people who normally use chopsticks at meals.
There are a number of opportunities where people need to exchange information from one computer to another. Examples include:
Although these operations can also be implemented by using remote copy or shared file systems, we feel that it is more natural to allow a user to manipulate a computer object as if it were a real (physical) object.
\begin{figure} \centerline{\epsfile{file=figs/PickDropSysConf.eps,width=0.47\textwidth}} \caption{System configuration} \label{fig:config} \end{figure}
\begin{figure} \centerline{\epsfile{file=figs/icons.eps,width=0.47\textwidth}} \caption{Pen and icons: (a) the pen contacts the display, (b) the pen lifts up, but remains close to the screen, and (c) the pen is away from the screen} \label{fig:shadow} \end{figure}
When a user taps an object (typically an icon) on the screen with the pen, the pen manager binds its object ID to the pen ID. This binding represents a situation in which the pen virtually holds the object (even though the pen itself does not contain any storage). When the user moves the same pen towards the other display, the pen manager supplies the type of the bound object to the display. Then the shadow of the data appears on the display below the current pen position. At this moment, the pen does not touch the screen. Finally, when the user touches the display with the pen, the pen manager asks the first computer to transfer the data to the second computer.
Since each pen has its own ID, simultaneous Pick-and-Drop operations by more than one pen can overlap. This feature would be useful in a collaborative setting.
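The bookkeeping performed by the pen manager can be illustrated with a minimal sketch. It assumes a simple in-memory table keyed by pen ID; the class, method, and host names are hypothetical and are not taken from the actual implementation, and the network communication is omitted.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: class, method, and host names are illustrative assumptions.
class PenManagerSketch {
    // What the manager remembers about the object a pen is "holding".
    static final class Binding {
        final String sourceHost;
        final String objectId;
        final String objectType;

        Binding(String sourceHost, String objectId, String objectType) {
            this.sourceHost = sourceHost;
            this.objectId = objectId;
            this.objectType = objectType;
        }
    }

    // One entry per pen ID, so several pens can hold objects at the same time.
    private final Map<Integer, Binding> held = new HashMap<>();

    // Pick: the pen tapped an object, so bind that object to the pen ID.
    void pick(int penId, String sourceHost, String objectId, String objectType) {
        held.put(penId, new Binding(sourceHost, objectId, objectType));
    }

    // Proximity: a display asks for the type of the held object to draw its shadow.
    Optional<String> heldType(int penId) {
        Binding b = held.get(penId);
        return b == null ? Optional.empty() : Optional.of(b.objectType);
    }

    // Drop: clear the binding and describe the transfer the source should perform.
    Optional<String> drop(int penId, String destinationHost) {
        Binding b = held.remove(penId);
        if (b == null) {
            return Optional.empty();
        }
        return Optional.of(b.objectId + ": " + b.sourceHost + " -> " + destinationHost);
    }
}
```

Because the table is keyed by pen ID, a second pen can pick up a different object while the first is still holding one, which is the overlap described above.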
Note that Pick-and-Drop can also coexist with the normal drag-and-drop by using a time-out. The system distinguishes between these two operations by measuring the period of time between pen-down and pen-up. When a user touches an object with the pen and drags it without lifting the pen tip, it initiates a drag-and-drop instead of a Pick-and-Drop.
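The time-out rule can be sketched as a small classifier. This is an illustration under assumptions: the text does not state the time-out value, so the threshold is a constructor parameter, and the class and enum names are hypothetical.

```java
// Hypothetical sketch: the names and the idea of a single fixed threshold are assumptions.
class PenGestureClassifier {
    enum Gesture { PICK, DRAG_AND_DROP }

    private final long tapTimeoutMillis;

    PenGestureClassifier(long tapTimeoutMillis) {
        this.tapTimeoutMillis = tapTimeoutMillis;
    }

    // Classify a pen-down/pen-up pair by how long the tip stayed on the surface.
    Gesture classify(long penDownMillis, long penUpMillis) {
        long contactMillis = penUpMillis - penDownMillis;
        // A short contact is a tap (pick); a longer one is treated as a drag.
        return contactMillis <= tapTimeoutMillis ? Gesture.PICK : Gesture.DRAG_AND_DROP;
    }
}
```

With an assumed threshold of 300 ms, a 120 ms tap would be treated as a pick and a 900 ms contact as a drag-and-drop.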
The state transition of Pick-and-Drop is shown in Figure XX.
A pen's proximity to the screen can be sensed by combining the motion event and a time-out. When a user moves a pen close to the screen, the screen begins reading motion events from the pen. If motion events occur continuously, the system regards the pen as being near the screen. When a pen leaves the screen, motion events cease, and the system can detect this by setting a time-out. This technique is used for both the Pick and the Drop operations (Figure~\ref{fig:state}).
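A minimal sketch of this proximity test follows. It assumes that the application receives a callback for each motion event and can read a millisecond clock; the class name, method names, and time-out value are hypothetical.

```java
// Hypothetical sketch: names and the time-out value are illustrative assumptions.
class ProximityTracker {
    private final long timeoutMillis;
    private long lastMotionMillis = -1; // -1 means no motion event has been seen yet

    ProximityTracker(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Called whenever the display reports a motion event from the pen.
    void onMotionEvent(long nowMillis) {
        lastMotionMillis = nowMillis;
    }

    // The pen counts as near while motion events keep arriving within the time-out.
    boolean isNear(long nowMillis) {
        return lastMotionMillis >= 0 && nowMillis - lastMotionMillis <= timeoutMillis;
    }
}
```

A display could poll `isNear` to decide when to show or hide the shadow of the held object.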
The simplest usage of Pick-and-Drop is to support the exchange of information between two co-workers. When two people need to transfer a file or a short text segment between computers, they can simply pick it up from one person's PDA display and drop it on the other's display (Figure~\ref{fig:pdas}). Note that these two PDAs are communicating via wireless networks.
It is also possible to pick up information from a kiosk terminal in a public space or an office. In our laboratory, we use a kind of ``push media'' terminal that periodically retrieves selected information from external and internal news sources on the World Wide Web. The terminals are installed at public spaces in the laboratory such as the coffee corner, and continuously display information~\cite{jr:intelli97}. We added a Pick-and-Drop capability to this system so that people can pick up URL information from the terminal and drop it onto their PDAs (Figure~\ref{fig:kiosk}).
This example can also be seen as a variation of Pick-and-Drop. The user picks up pen attributes and drops (draws) on the canvas by using the same pen.
The concept of multi-display operations is also helpful for considering interaction between desk-top computers. For example, when a user is editing a document on a desk-top computer, he/she can also use several small tablets on the desk that act as "temporary work buffers". The user can freely Pick-and-Drop diagrams or text elements between the desk-top display and the tablets. We refer to this work style as "Anonymous Displays", because users no longer regard such a tablet as a distinct computer. Instead, the user can easily introduce an additional tablet to the desk space according to their work load. Pick-and-Drop supports intuitive data transfer without bothering with each computer's symbolic name.
As compared with the virtual paste buffers used in traditional GUI systems, employing physical tablets provides a more natural and spatial interface for users. The user can freely arrange tablets on a physical desk-top according to his/her work style. Since all information on the tablets is visible, the user can correctly handle more than two work buffers. Even though the size of the main desk-top display is limited and fixed, the user can add as many work spaces as desired without consuming space on the main display. The concept of Anonymous Displays is to introduce a familiar physical artifact into computer work spaces. Note that we do not have to sacrifice computational power when introducing tangible objects into user-interfaces. For example, the user could perform a ``global search'' on all of the anonymous tablets. Such a capability is unavailable in a {\em real} physical environment.
Our prototype system called {\em PaperIcons} allows Pick-and-Drop between a paper object and a computer display (Figure~\ref{fig:paper}). The user can pick up an object from a printed page and drop it on a display. The page is placed on a pen-sensitive tablet and a camera is mounted over the tablet. The camera is used to identify the opened page by reading an ID mark printed on it. The user can freely flip through the booklet to find a desired icon. The system determines which icon is picked based on the page ID and the picked position on the tablet. Currently, the position of the page on the tablet is assumed and fixed, but it can also be tracked by the video camera by locating markers on the printed page.
Although it is also possible to implement the icon book as an electronic display and to provide a Pick-and-Drop operation between the booklet and other computers, the PaperIcons style is quite suitable for selecting ``clip art'' or ``color samples'' from a physical book. If the user is accustomed to a frequently used book, he/she can flip through pages very quickly by feeling the thickness of the book.
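The icon lookup in PaperIcons, which combines the page ID with the picked position on the tablet, could be sketched as follows. The data layout is an assumption: each page is given a list of rectangular regions in tablet coordinates, and all names are hypothetical. The sketch also assumes the fixed page position mentioned above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: the region layout and all names are illustrative assumptions.
class PaperIconIndex {
    // A printed icon occupying a rectangle on a page, in tablet coordinates.
    private static final class Region {
        final int x, y, width, height;
        final String iconName;

        Region(int x, int y, int width, int height, String iconName) {
            this.x = x;
            this.y = y;
            this.width = width;
            this.height = height;
            this.iconName = iconName;
        }

        boolean contains(int px, int py) {
            return px >= x && px < x + width && py >= y && py < y + height;
        }
    }

    // Page ID (read by the camera) -> icons printed on that page.
    private final Map<Integer, List<Region>> pages = new HashMap<>();

    void addIcon(int pageId, int x, int y, int width, int height, String iconName) {
        pages.computeIfAbsent(pageId, k -> new ArrayList<>())
             .add(new Region(x, y, width, height, iconName));
    }

    // Resolve a pick from the page ID and the pen position reported by the tablet.
    Optional<String> iconAt(int pageId, int penX, int penY) {
        for (Region r : pages.getOrDefault(pageId, Collections.<Region>emptyList())) {
            if (r.contains(penX, penY)) {
                return Optional.of(r.iconName);
            }
        }
        return Optional.empty();
    }
}
```

If the page position were tracked by the camera instead of being fixed, the pen position would first have to be transformed into page coordinates before this lookup.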
Modifier buttons attached to the stylus are used for pen identification. Since the WACOM stylus has two modifier buttons, the system can distinguish up to three pens simultaneously (note that the two modifier buttons are alternatives, so only one of them can be active at a time). This number is sufficient for testing the Pick-and-Drop concept, but may not be for practical applications. There are several possibilities to extend the number of distinguishable pens. One way is to attach a wireless tag to each pen. Another possibility is to use an infrared beacon.
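One reading of this identification scheme is sketched below. It assumes that the three distinguishable states are "no button", "first button", and "second button", and that both buttons are never reported together; this interpretation and all names are assumptions, not a description of the actual driver interface.

```java
// Hypothetical sketch: the state-to-ID mapping is an assumed interpretation.
class PenIdFromButtons {
    // Returns pen ID 0, 1, or 2; assumes at most one modifier button is active.
    static int penId(boolean firstButton, boolean secondButton) {
        if (firstButton && secondButton) {
            throw new IllegalArgumentException("buttons are assumed to be mutually exclusive");
        }
        if (firstButton) {
            return 1;
        }
        if (secondButton) {
            return 2;
        }
        return 0;
    }
}
```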
All the applications described in the APPLICATIONS Section were developed with Java~\cite{java-white-lang95}. The pen manager is also a Java application and communicates with applications over TCP/IP connections. When Pick-and-Drop occurs, one (source) application transmits a Java object (e.g., a file icon) to another (destination) application. We use Java's Serializable interface~\cite{java-serial97} for implementing object transfers. All instances of classes that implement Serializable can be converted to and from a byte sequence. When one computer transfers a Java object, the system first serializes it and sends the resulting byte sequence to the other computer. The receiving computer then de-serializes and recreates the object.
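The serialize, send, and de-serialize steps can be illustrated with the standard `java.io` object streams. This is a minimal sketch rather than the system's code: the `FileIcon` class and its fields are hypothetical, and an in-memory byte array stands in for the TCP/IP connection.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical transferable object; the fields are illustrative assumptions.
class FileIcon implements Serializable {
    private static final long serialVersionUID = 1L;
    final String name;
    final int sizeBytes;

    FileIcon(String name, int sizeBytes) {
        this.name = name;
        this.sizeBytes = sizeBytes;
    }
}

class ObjectTransferSketch {
    // Source side: convert the object into a byte sequence.
    static byte[] serialize(Serializable obj) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Destination side: recreate the object from the received bytes.
    static Object deserialize(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("class of transferred object is unknown", e);
        }
    }
}
```

In a networked setting the same object streams could wrap a socket's streams instead of byte arrays; the receiving application must have the transferred class available to recreate the object.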
Among the computers described in the APPLICATIONS Section, wall-sized displays (computers) and desk-top displays are directly connected to the Ethernet, while the PDAs use wireless local area networks (LAN). We use a Proxim RangeLan2 wireless LAN that employs 2.4~GHz spread spectrum radios and achieves a 1.6~Mbps data transmission rate.
Pick-and-Drop is physical and visible as opposed to {\em symbolic}. We observed how people behave when copying information between two different computers and found that they extensively interchange symbolic concepts. In fact, a copy operation could not be completed without verbal support. For example, a typical conversation was: ``Mount {\em Disk C:} of my computer on your computer.'', ``What is your machine's name?'' ``{\em Goethe}.'' ``Open folder {\em Document97} on my {\em Disk C:} and ...''. In this example sequence, {\em ``Disk C:''}, {\em ``Goethe''}, and {\em ``Document97''} are symbolic concepts and unnecessary information for simply exchanging files. On the other hand, information exchange using Pick-and-Drop was more direct. They simply moved the icon as if it were a physical object. Although this operation might also be supported verbally, it is more like a conversation for exchanging physical objects (e.g., ``Pick up {\em this} icon'', or ``Drop it {\em here}'').
The visibility of Pick-and-Drop plays an important role in collaborative settings. Consider, for example, two or more people working together with many computers. When one participant moves data using Pick-and-Drop, this operation is visible and understandable to the others. On the other hand, when a traditional file transfer method is used, the other participants might become confused because the intention behind the operation cannot be effectively communicated.
Although Pick-and-Drop and the shared file solution can be used in conjunction (especially when transferring data to remote computers), there are some issues where Pick-and-Drop looks more natural.
First, as described in the previous section, shared files force the user to deal with certain symbolic concepts such as a machine's name or a file system's name, even though they can actually transfer data by using drag-and-drop. Since the screen sizes of PDAs are normally limited, opening another machine's file folder often hides local folders, making operations inconvenient. If the user has to deal with more than two computers, keeping track of ``which folder belongs to which machine'' becomes a significant problem, which is similar to the ``mouse jungle'' problem described in the INTRODUCTION Section. In our daily lives, we do not need to have a ``remote drawer'' mounted on the dresser for moving a physical object from one drawer to another. We simply pick it up and move it.
Second, a unit of data transfer is not always a file. We often need to copy a short text segment such as a URL from computer to computer. Although it is possible to transfer such a data element through a temporary file, this operation is more complicated than Pick-and-Drop.
In summary, the shared file approach is a good solution for transferring data between geographically separated computers, but not so intuitive between computers within close proximity.
The Spatial Data Management System (SDMS)~\cite{bolt80} is a well-known multi-modal system that uses hand pointing and voice commands. SDMS is also a multi-display system. Information is displayed on a wall-sized projection display and the operator uses a small touch-sensitive display mounted on the armrest of a chair. Although the user manipulates two different screens to perform a single task, direct inter-computer manipulation is not considered.
The PARC TAB is a palm-sized computer that was developed at Xerox PARC as part of the Ubiquitous Computing project~\cite{PARC-CSL-95-1}. It is also used in a multi-display environment. For example, the PARC TAB can be used as a telepointer for the LiveBoard system~\cite{liveboard92}. However, direct manipulation techniques between the PARC TAB and the LiveBoard were not seriously considered.
The DigitalDesk~\cite{wellner93} is a computer augmented desk consisting of the combination of a desk, a tablet, a camera, and a projector. The PaperPaint application developed for the DigitalDesk allows select-and-copy operations between paper and a projected image. Video Mosaic~\cite{mackay94} also introduces a user interface using physical paper into a video editing system.
The PDA-ITV system~\cite{dualui} tries to use a PDA as a commander for interactive TV. Although it uses two different displays for one task, the roles of the PDA and the TV are static; the PDA always acts as a commander for the TV. Inter-computer manipulation is not considered. For example, it is not possible to grab information from the TV screen and drop it onto the PDA.
The PaperLink system~\cite{arai97} is a computer augmented pen with a video camera that is capable of recognizing printed text. Although PaperLink can pick up information from paper, it does not support inter-computer operations. For example, it was not designed to manipulate a computer object and paper information with the same PaperLink pen.
The Audio-Notebook system~\cite{stifelman96} augments paper-based notetaking with voice memos. It allows a user to make links between written notes and voice notes. The system uses printed marks on each page for automatic page detection. The Ultra Magic Key system~\cite{usuda97} is another example of a paper-based user interface; it allows a user to manipulate the system through paper. The user mounts a piece of paper (specially printed for this system) on a folder and touches the surface of the paper with their finger. The tip of the finger is tracked by a camera mounted above the tablet. The camera is also used to distinguish the paper type. These configurations have some similarity to the PaperIcons system described in the APPLICATIONS Section, but none of them support interactions {\em between} paper and the computer.
Finally, the Graspable User Interface~\cite{fitz95} proposes a new way to interact with computer objects through physical handles called {\em bricks}. The user can attach a brick to a computer object on the screen such as a pictorial element in a diagram editor. Pick-and-Drop and Graspable UIs share many concepts in the sense that both try to add {\em physicalness} to virtual worlds. Unlike Pick-and-Drop, the Graspable UI mainly deals with a single display environment.
At the moment, the prototype system is immature and there is a lot of room for improvement. We would like to expand the number of identifiable pens by introducing radio-frequency (RF) tags. Currently, the system can only exchange Java serializable objects, but it should also be possible to implement Pick-and-Drop with more general file transfer protocols and cut-and-paste protocols such as the X Window System Inter-Client Communication Conventions Manual (ICCCM).
In addition to enhancing implementations, there are many ways to extend the idea of multi-display operations. We are planning to build and evaluate an application that supports informal discussions between two or more participants in the same place. In this system, each participant has their own PDA, while a wall-sized display serves as a common workplace for all participants. Using Pick-and-Drop, the participants can easily exchange information between their PDA and the wall, or between individual PDAs.
\begin{figure} \centerline{\epsfile{file=figs/clearboard.ps,width=0.47\textwidth}} \caption{Pick-and-Drop in a ClearBoard setting (ENVISIONMENT)} \label{fig:clearboard} \end{figure}
Another possible improvement would be to combine Pick-and-Drop with currently available pen interface techniques. For example, several pen gestures like ``grouping'' can be integrated into Pick-and-Drop; the user first selects a group of objects by making a group gesture, then picks them up and drops them on another display.
It would also be interesting to incorporate Pick-and-Drop with video-conferencing systems such as the ClearBoard~\cite{ishii92chi}. With such a setting, two users could meet over the network through a shared video window. Each user stores his/her own information on the PDA, and they can exchange information between PDAs through the window (Figure~\ref{fig:clearboard}). Using the Pick-and-Drop metaphor, users can seamlessly integrate their personal work spaces with their shared work spaces.
The concept of Pick-and-Drop is not limited to pen user interfaces. It should be possible to implement an interface similar to Pick-and-Drop by using normal displays and a wireless mouse. Since a wireless mouse is not tethered to a single computer, we can suppose that each user owns a wireless mouse as a personalized pointing device. Such a device can be used to transfer information from one computer to another, while providing personal identification. This concept also suggests a future role of PDAs -- that of manipulating multiple devices in a networked environment. For example, one can pick up TV program information on a web page (with a PDA), then drop this information on a VCR to request a recording of it.
As a final remark, the common design philosophy behind all these systems is the understanding that we are living in a fusion of physical (real) and virtual (computer) worlds. Each has its own advantages and disadvantages. Pick-and-Drop, for example, adds physicalness to user interfaces, because we feel that traditional data transfer methods are too virtual and hard to learn due to their lack of physical aspects. In contrast, many augmented reality systems add virtual properties to the physical world~\cite{Bajura92,feiner93,jr:uist95}. However, these two approaches do not contradict one another. We believe that one of the most important roles of user interface design is to {\em balance} the virtuality and physicalness of the target area.