To appear in: Computers in Human Behavior, Vol. 13, No. 4, pp. 443-463, 1997. Elsevier.
Dag Svanæs
Department of Computer and Information Science
University of Trondheim, NTNU
7055 Dragvoll, Norway
e-mail: dags@ifi.ntnu.no
Designers of computer-based material are currently forced by the available design tools to express interactivity with concepts derived from the logical-mathematical paradigm of computer science. For designers without training as programmers this represents a barrier. The three psychological experiments presented indicate that it is possible to express interactive behaviour in a more direct fashion by letting designers compose software from interaction elements with built-in behaviour. The resulting «kinaesthetic thinking» of the software designers shows similarities with visual and musical thinking. To support this style of design it might be necessary to rebuild parts of today's software from scratch using simple interactive building blocks. As an illustration, a design tool based on a pixel-level agent architecture is presented.
On most popular personal computer platforms a wide variety of multimedia tools are currently available. These are easy to use, require little or no programming skill, and range from editors for pixel graphics and animation to tools for integrating the different media resources. Most of the tools provide excellent support for graphics, sound and video. The problems arise when designers want to be creative concerning interactivity. If a designer wants to create interactive solutions that were not imagined by the tool makers, he or she has to make use of a scripting language like HyperTalk, or even leave the tools altogether and program in a language like C++ or Java.
Most designers do not have training as programmers, and for these users programming becomes a barrier that cannot be crossed without detailed help from a professional programmer. If such help is not available, the designer has hit the wall and has to settle for solutions with less interactivity. As the potential for interactivity is the most powerful feature of the computer compared to other media, this is a very unfortunate situation.
Andersen et al. (1993) describe the problems they had as non-programmers with the design and implementation of a multimedia product about the Scandinavian Bronze Age. To illustrate how a landscape was experienced by people in the Bronze Age, they introduced the concept of 'interactive texture'. The idea was quite simple:
While we now see water as a hindrance to locomotion, and firm ground as a help, the situation was to some extent the opposite at that time when water united (because of boats) and land divided (because of large forests and moors). We can let the user experience this through his fingers by making the cursor move differently in different areas. If the spot is on land, it travels slowly, while it goes quickly if it is on sea (p. 260).
To implement this feature they found the hypermedia tools they were using quite inadequate, and instead had to do advanced scripting/programming. From this and similar experiences they conclude:
Logical-mathematical intelligence is of course necessary for programming, but it brings forth sterile and boring products... The real solution is to invent programming environments and to create a style of programming that artistic people find themselves at home with ... to remove programming from the clutches of logical-mathematical intelligence, and hand it over to musical and spatial intelligence (p. 262).
I find this to be an exciting and relevant design challenge put forward by representatives of a growing user group.
Alan Kay, one of the inventors of personal computing, points to the same challenge when he justifies the need for research in end-user-programming (Cypher, 1993):
The term «computer literacy» also surfaces in the sixties, and in its strongest sense reflected the belief that the computer was going to be more like the book than a Swiss army knife. Being able to «read» and «write» in it would be as universally necessary as reading and writing became after Gutenberg (p. xiii).
Having described «reading,» he continues:
Writing on the other hand requires the end-user to somehow construct the same kind of things that they had been reading - a much more difficult skill ... One of the problems is range. By this I mean that when we teach children English, it is not our intent to teach them a pidgin language, but to gradually reveal the whole thing: the language that Jefferson and Russell wrote in ... In computer terms, the range of aspiration should extend at least to the kinds of applications purchased from professionals. By comparison, systems like HyperCard offer no more than a pidgin version of what is possible on the Macintosh. It doesn't qualify by my standards (p. xiii).
There is little reason to believe that the current technological development alone will change this situation. Most of the popular windowing systems have internally grown increasingly complex during the 90s. The programming skills needed to implement inventive behaviour on top of the current windowing systems now far exceed what a curious user-interface designer can learn in his or her spare time. A modern personal computer now consists of layer upon layer of constructed systems of abstraction, from assembler languages, through programming languages and user interface environments, up to the visible levels of the applications in use.
The structure and content of these layers are not determined by the hardware alone, as many different systems of abstraction are possible. They are the result of thousands of design decisions made by numerous programmers and systems architects over a long period of time, and they represent massive investments that give the current solutions a strong momentum.
From the point of view of computer users with little or no skill in programming who want to create interactive software, the current technological development has introduced the following problems:
Research aimed at addressing these problems should have as its technological objective to develop a new class of software tools that make it easier for non-programmers to become computer «writers» (in Kay's terminology). This is a very broad objective that covers most of the current research in end-user-programming (see Cypher, 1993). Despite a lot of interesting results from this research tradition, most of the work is technology-driven and does not challenge the dominant logical-mathematical paradigm of computer science.
The route I have taken to this problem is to start out by studying how interactive behaviour is conceptualised by users who have not been exposed to the dominant paradigms of programming, that is, to study their naive theories of interactivity. This is done through three controlled psychological experiments. The results from these experiments point to some interesting differences between the dominant paradigm and the subjects' «intuitive» way of understanding interactive computer behaviour. From this it is possible to spell out some possible consequences for the design of interactive software.
Studies of naive theories have been done on domains as diverse as physics, electricity, heat, Micronesian navigation, and arithmetic. Naive theories can give insights into the nature of alternative understandings of a domain, and have proven valuable to designers and instructors.
The naive theories of physics present a paradigm case. Several experimental studies indicate that it is common to reason about motion with pre-Newtonian concepts. In a study by White & Horwitz (1987), high school students were shown the drawing in figure 1 and asked what happens when the runner holding the ball drops it. Does it follow path A (correct), B or C? Only 20 percent got it right. When asked to explain their reasons for answering B or C, the students made use of a naive theory of motion very similar to the medieval impetus theory that was dominant in the millennium from the 6th century until the time of Galileo and Newton.
Figure 1. Possible paths for a falling ball.
Naive theories have similarities with Lakoff & Johnson's (1980) classification of important implicit metaphors in everyday English. These metaphors are available as resources to the language user. Lakoff & Johnson's work says nothing about which metaphor a given language user will make use of in a certain situation, but it lists the metaphors available. If we take the description of temporal phenomena as an example, Lakoff & Johnson (1980) find that in English, time is described in one of two ways: (1) as something stationary that you pass through, as in «we are approaching the end of the year;» or (2) as a moving object, as in «the time for action has arrived.» No other implicit metaphors are in common use in English. A multitude of other metaphors are theoretically possible, such as seeing time as colour. From Lakoff & Johnson's work we can conclude that statements such as «today time is green» are meaningless in English, or at least that the intended meaning of such statements needs to be explained at length to the listener.
To a designer of a natural-language user-interface for a scheduling system, this kind of information would be very useful even though it would not predict what a specific person would say in a specific situation. It would, on the other hand, dramatically restrict the cases that would have to be dealt with, and justify putting some strong limitations on the grammar of the user interface.
Andersen & Madsen (1988) report on a study of the naive theory of some Danish librarians concerning their database system. Based on their empirical findings, the authors proposed a redesign of the query language to better fit the conceptions of the librarians.
Little has been done to study naive theories of graphical user interface (GUI) behaviour. Every usability test involving qualitative questions about the user's intuitive perception of a GUI can be seen as an attempt at searching for such theories, but I have found no attempts at generalising from these findings to a naive theory of the workings of the computer itself. Turkle (1984) has described how users intuitively reason about and relate to computers, but her study was done before the widespread use of GUIs and focused on the verbal interaction between user and computer. With the advent of bitmapped graphics and mouse input it is possible to see the computer as an interactive medium where the dominant mode of interaction is non-verbal.
The experiments presented here attempt to uncover some aspects of how computer-naive users reason about the behaviour of graphical user interfaces. The areas that might benefit most from a better understanding of the naive theories of GUIs are those where non-programmers are given some control over the behaviour of their systems. This includes systems for end-user programming, visual programming and personalisation of GUIs.
Where pure experimental psychology would ask questions about the psychological mechanisms involved, my intent is not to inform psychology as a research field but to build an understanding of how interactive media is experienced in use. As the question posed is of a qualitative kind, concerning an understanding of the interactive experience, a qualitative research methodology is appropriate.
Qualitative methods have to a large extent been used for interpreting field data. In the present context this would correspond to the analysis of human-computer interaction in natural settings. A lot of such field data is available, but it does not provide the level of detail necessary to build the kind of theory I am aiming at here. To be able to study the interactive experience qualitatively down to the level of detail of the separate interaction elements, I consequently found it necessary to set up a controlled experiment.
Having decided to design a psychological experiment, one is faced with the question whether the experimental set-up should mimic a realistic situation or whether it is possible to learn anything from analysing how users interact with purely abstract interaction elements. Referring to the Bauhaus tradition, the present research has a lot in common with Itten's systematic study of how different colours interact in visual art (Itten, 1973). Itten's resulting theory says things like: if you want an object in a painting to appear as if it is placed behind another object, you should make it darker than the one in front. Rules of thumb of this kind hold whether you paint fruit, skyscrapers or people, and whether the painting is to hang on a wall or be used as an illustration in an advertisement. As the phenomena we are looking for are hopefully of a similarly general kind, there should be no need to use figurative examples or provide a naturalistic context for interpretation. This opens up the possibility of using purely abstract interaction examples as stimuli in a controlled experiment. The term «abstract» is used here in the same way as modernist painters like Kandinsky (1994) used it, to denote non-figurative expressions of art.
The subjects were seven undergraduates, five male and two female. Their experience with graphical user interfaces ranged from «almost nothing» to «a bit.» They were paid to participate.
A total of 38 Finite State Automata (FSAs) were developed for the experiment. Each consists of one, two or three squares. The squares can be either white or black. Changes in colour always come as an immediate response to a user action. The two user actions detected are «mouse button press» and «mouse button release.» All examples were deterministic. The subjects were offered an additional «repeat» button that enabled them to bring the FSAs back to their initial state whenever they liked.
Figure 2. The FSAs as they were presented to the subjects.
Figure 2 shows a snapshot from the screen with a two-square FSA. The FSAs can be described formally as an initial state and a set of production rules. The complexity of the 38 FSAs ranges from one-state FSAs with no transitions to an 8-state FSA with 24 transitions. The FSAs were presented in increasing order of complexity, starting with all the FSAs with only one square, then all the two-square FSAs, and last all the three-square FSAs.
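As an illustration of this formal description, the sketch below writes out an FSA as an initial state plus production rules over the two detected events. It is given in Java for readability (the experiment software itself was written in Smalltalk/V), and the class and method names are mine, not those of the experiment software:

```java
import java.util.HashMap;
import java.util.Map;

public class SquareFsa {
    enum Event { PRESS, RELEASE }

    private final String initialState;
    private String state;                            // e.g. "white" or "black"
    private final Map<String, String> rules = new HashMap<>();

    SquareFsa(String initialState) {
        this.initialState = initialState;
        this.state = initialState;
    }

    // Production rule: in state `from`, event `e` leads to state `to`.
    void addRule(String from, Event e, String to) {
        rules.put(from + "/" + e, to);
    }

    // Undefined transitions leave the state unchanged; the examples were
    // deterministic, and colour changes came as immediate responses.
    void handle(Event e) {
        state = rules.getOrDefault(state + "/" + e, state);
    }

    void reset() { state = initialState; }           // the «repeat» button

    public static void main(String[] args) {
        // A one-square FSA with «toggle behaviour»: each press flips it.
        SquareFsa toggle = new SquareFsa("white");
        toggle.addRule("white", Event.PRESS, "black");
        toggle.addRule("black", Event.PRESS, "white");
        toggle.handle(Event.PRESS);
        toggle.handle(Event.RELEASE);                // no rule: stays black
        System.out.println(toggle.state);            // prints "black"
    }
}
```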
In all three experiments the user worked with a Macintosh IIci personal computer running SMALLTALK/V.
The subjects were instructed to freely describe the examples as they explored them. They were explicitly told that the experiment was set up to learn about how people experience interacting with computers, not to learn about how good the individual subject was at reaching some answer.
Apple's usability testing guidelines (Tognazzini, 1992) were followed rigorously when setting up the experiment. This includes instructing the subjects on how to think aloud, explaining to them beforehand the purpose of all the equipment in the room, and telling them that they had the right to stop the experiment at any time, without losing their pay, if they for any reason felt uncomfortable.
The 38 examples were presented to the subject one at a time, and the subject was given control of the progress of the experiment. As seen in Figure 2, a «next» button was put on the screen for this purpose. Care was taken not to induce in the subject any preconceptions about the nature of the examples they were to explore. The experimenter referred to the examples by pointing to «that there» on the screen. The experimenter sat by the side of the subject throughout the session, and was available for technical assistance.
The sessions were videotaped. The camera was pointed at the screen, and the subjects were aware that they themselves were never filmed. An external directional microphone was used to increase audio quality.
In addition, all interactions were logged with time stamps that made it possible to reproduce the interaction in detail. The resulting four hours of taped material were then transcribed together with the non-verbal interactions.
All subjects got engaged in the task very quickly. On some occasions they asked the experimenter whether they were doing the right thing, but they never lost concentration. With a total of three exceptions, all FSAs were explored and described by all subjects. The subjects all controlled the progress of the experiment in a very natural way. It seemed as if, at a certain point in the exploration of an FSA, it was completely «understood», and the subject was ready for the next one.
All subjects developed a good skill at exploring the FSAs. In some cases complex FSAs were fully «grasped» in a few seconds. The subjects often had a much harder time describing them. As one subject put it: «It is easy to do, but hard to explain.»
The subjects often described an FSA as being identical to some previously tested FSA. The FSAs were also described as modifications of previous FSAs, as in: «It is like that other one, but when …» What was being compared were the actual FSAs as they were experienced in interaction, and not some decomposed understanding of the interaction. A possible term for the FSAs as perceived reality could be Interaction Gestalts.
Lakoff & Johnson's (1980) theory of metaphor was used as an inspiration for doing an analysis of the implicit metaphors used by the subjects in describing interactive behaviour. To show how this analysis is done, an example is provided.
Figure 3. The state transition diagram for an FSA.
Figure 3 shows the state transition diagram of a three-square FSA. The squares are numbered from the left. The FSA starts out in the colour combination white-black-white. It has four states. The colour of the rightmost square can be changed by clicking on it. This square has «toggle behaviour.» When the rightmost square is black, it is possible to swap colour between the leftmost and the middle square by clicking on the one being white.
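Spelled out in the production-rule style introduced above, this FSA has four states and six transitions (W = white, B = black, squares numbered 1-3 from the left). The listing below is my reconstruction from the description, not code from the experiment:

```java
// The figure-3 FSA as production rules over colour triples. Square 3
// toggles freely; when square 3 is black, clicking the white one of
// squares 1-2 swaps their colours.
public class Figure3Fsa {
    public static void main(String[] args) {
        String[][] rules = {
            // state, clicked square, next state
            { "WBW", "3", "WBB" },  // toggle square 3
            { "WBB", "3", "WBW" },
            { "BWB", "3", "BWW" },
            { "BWW", "3", "BWB" },
            { "WBB", "1", "BWB" },  // swap squares 1 and 2
            { "BWB", "2", "WBB" },  //   (only while square 3 is black)
        };
        for (String[] r : rules)
            System.out.println(r[0] + "  --click " + r[1] + "-->  " + r[2]);
    }
}
```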
One subject (female) described this behaviour as:
« ...OK, you have to begin with the one to the right to be able to do anything ... then you can move that one [square 1] ... and turn the one to the right on and off ...»
We see here that she conceives of the FSA as consisting of a switch in position 3 that can be turned on and off, and a white square that can be moved between positions 1 and 2 whenever the switch is «on.» Her mental model of the FSA is illustrated in figure 4:
Figure 4. A graphical representation of a mental model.
A total of ten implicit metaphors were identified in the material. The first four of these were:
The last six metaphors all describe movements in some space. The object moving is either the user as in: «then I move to the left,» or the FSA or some part of it as in: «the white square moves to the right.» The three spaces are:
Table 1 shows the six combinations with example quotes:
Table 1. The six metaphors involving spatialisations.
These ten metaphors were often used in combination, as the example above illustrates.
Figure 5. The Interaction Gestalt editor.
To test the practical usefulness of the idea of Interaction Gestalts, a second experiment was set up. The editor shown in figure 5 is based on the idea of Interaction Gestalts and was created for this experiment. The user works directly with interactive single-square FSAs, constructing complex FSAs from a set of four basic building blocks with built-in behaviour. Each FSA is treated as a separate entity that can be cut, copied and pasted. A set of unary and binary operations has been defined on the FSAs. The editor can be seen as implementing an algebra of interactive single-square FSAs.
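The text does not enumerate the editor's actual operations, so the sketch below merely suggests what such an algebra of single-square FSAs could look like. The unary operation invert (the same behaviour with colours swapped) and the binary operation beside (juxtaposing two FSAs into a two-square widget) are illustrative assumptions, not the editor's real repertoire:

```java
import java.util.List;

public class FsaAlgebra {
    // A single-square FSA reduced to the simplest case: a colour, and a
    // flag saying whether a mouse press toggles it.
    static class Square {
        boolean black;
        final boolean togglesOnPress;
        Square(boolean black, boolean togglesOnPress) {
            this.black = black;
            this.togglesOnPress = togglesOnPress;
        }
        void press() { if (togglesOnPress) black = !black; }
    }

    // Unary operation (assumed): the same behaviour with colours swapped.
    static Square invert(Square s) {
        return new Square(!s.black, s.togglesOnPress);
    }

    // Binary operation (assumed): juxtapose two squares into a two-square
    // widget; each part still reacts only to input on itself.
    static List<Square> beside(Square a, Square b) {
        return List.of(a, b);
    }

    public static void main(String[] args) {
        Square toggle = new Square(false, true);       // white, toggles
        List<Square> widget = beside(toggle, invert(toggle));
        widget.get(0).press();                         // press left square only
        System.out.println(widget.get(0).black + " " + widget.get(1).black);
        // prints "true true": both squares are now black
    }
}
```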
The tool was tested individually on three high school students (ages 16-17) with no background in programming. They first ran through the examples of experiment I to learn about the design domain, and were then asked to reproduce some of the FSAs using the tool.
All subjects were able to reproduce all the FSAs given. They worked directly with the interactive user-interface elements in this non-verbal manner, constructing complex new widgets from the set of elementary FSAs.
There is a strong tradition in Scandinavia of involving end-users directly in the design of software through techniques of participatory design. In the third experiment the design process was used as a gateway to the naive theories of the target domain. The approach consisted of letting a group of end-users construct a tool for building FSAs through participatory design with iterative prototyping.
The subjects were five high school students (ages 16-17) with no programming background. As in experiment II, they first individually went through the same FSAs as in experiment I to get acquainted with the problem domain.
They were then, as a group, presented with the design problem and asked to come up with as many design ideas as possible. This brainstorming led to an initial design that was prototyped by a programmer and tested in the following session. Modifications and extensions were incorporated into the next prototype, and this iteration was repeated five times until the group was satisfied with the result.
The subjects were not exposed to any tools or design ideas before or during the design process. They consequently had to use their own conceptions about the domain as basis for their design.
The group designed an advanced tool incorporating ideas similar to those leading to experiment II. The final version of their editor can be seen in figure 6.
Figure 6. The result of the design process.
It allows for the construction of two-square widgets. The design group identified a set of 10 elementary one-square widgets (FSAs) which they listed on the left. They further defined operations and modifiers on these interactive building blocks. When asked to reconstruct some given widgets with their tool, they reasoned directly with their elementary interactive widgets.
Figure 7. State-transition diagram for the widget.
The widget shown in figure 7 consists of two squares, the left initially white and the right initially black. When you press the mouse button in either of the squares, the colours change. This creates an illusion of spatial movement.
Their «language game» evolved in parallel with the evolution of the prototype in what could be called a language-artefact cycle.
This widget was described as follows:
«I think it is a five [widget #5] and a seven [widget #7].»
«No, it is a seven and an eight with both-react [a modifier].»
Figure 8. «A seven and an eight with both-react.»
Figure 8 illustrates graphically how the widget was conceptualised by the subjects. They saw it as consisting of two single-square widgets with toggle behaviour, the left initially white and the right initially black. These single-square widgets were available as building blocks in their tool (widgets #7 and #8 respectively). They further added their modifier «both-react», which makes both widgets sensitive to the same input. This gives a two-square widget with the required behaviour.
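Read this way, «both-react» amounts to routing one and the same input event to both single-square toggles, which is what creates the impression of movement. The sketch below is my illustration of that reading, not the group's prototype:

```java
// Two one-square toggle widgets, the left initially white (their #7) and
// the right initially black (their #8). «Both-react» makes both widgets
// sensitive to the same input, so one press flips both colours and the
// white square appears to jump to the other side.
public class BothReact {
    static class Toggle {
        char colour;                                  // 'W' or 'B'
        Toggle(char initial) { colour = initial; }
        void press() { colour = (colour == 'W') ? 'B' : 'W'; }
    }

    public static void main(String[] args) {
        Toggle left = new Toggle('W');                // widget #7
        Toggle right = new Toggle('B');               // widget #8
        System.out.println("" + left.colour + right.colour);   // WB
        // «both-react»: one press event is delivered to both widgets.
        left.press();
        right.press();
        System.out.println("" + left.colour + right.colour);   // BW
    }
}
```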
From a computer-science perspective the editors of experiments II and III represent hybrid solutions that introduce an unnecessary level of abstraction on top of the underlying internal state-transition formalisms. From a user-centred (phenomenological) perspective it seems that simple widgets with built-in behaviour (perceived as Interaction Gestalts) are natural atomic objects for this problem domain. The state-transition formalisms normally used for describing GUI behaviour rest on an abstraction of linear time with stable states. When designing interactive behaviour directly and non-verbally with Interaction Gestalts, there seems to be no need to involve abstractions of time.
If we contrast these results with the dominant programming paradigm in computer science (Object-Oriented Programming, or OOP), we find some striking differences concerning the conception of time and concerning the nature of objects.
In OOP, interactive behaviour is described in algorithmic terms. Underlying an algorithmic description is the notion of linear, discrete forward-moving time.
In the naive theory emerging from the experiments, linear time is only one out of many possible ways of conceptualising interactive behaviour. In most of the cases the interactive experiences were not dissected into discrete events, but were dealt with as meaningful wholes either directly or through metaphor.
In OOP, an object is described as a fixed entity having an objective existence in «the external world.» This way of seeing objects conforms with the way natural phenomena are described in systems theory and in most of the natural sciences. Philosophers have called the underlying scientific paradigm «positivism» and «objectivism.»
The objects described by the subjects in the experiment existed only for them through interaction. The objects emerged as a result of the interplay between the intentions of the user, the user's actions, and the feedback given by the system. When the intentions of the user changed, different objects emerged. This world view corresponds to the view put forward by phenomenological philosophers like Heidegger (Dreyfus, 1991) and Merleau-Ponty (1962). The way they see it, it is meaningless to talk about objects as existing «in the external world» independent of the intentionality of the subject. For these philosophers, the structure of the physical world emerges to us as a result of our interaction with it.
The fact that the subjects were able to reason directly with these interactive widgets without having to break them up further leads to the assumption that we are dealing here with a mode of thinking which has a lot in common with visual thinking (Arnheim, 1969) and musical thinking. Meaningful experiential wholes were mentally compared, superimposed, inverted, etc., as if they were images. Johnson (1987) proposes "kinaesthetic image schemata" as a term to describe experiential wholes resulting from interaction with our physical environment. He claims that these schemata have a lot in common with visual image schemata, and that they have a very fundamental role in cognition and understanding. Following his terminology, I propose the term "Kinaesthetic Thinking" to signify direct cognitive operations on tactile-kinaesthetic sense experiences, i.e. on Interaction Gestalts.
For many purposes the term "tactile" is used as a synonym for "kinaesthetic". In the case of GUIs with mouse input, I would find it misleading to talk about "Tactile Thinking" as no true tactile feedback is provided by the computer.
To support the view that kinaesthetic image schemata are important in human cognition, Lakoff (1987) reports on psychological experiments where blind subjects perform mental operations on tactile pictures:
It seems to me that the appropriate conclusion to draw from these experiments is that much of mental imagery is kinaesthetic - that is, it is independent of sensory modality and concerns awareness of many aspects of functioning in space: orientation, motion, balance, shape judgement, etc. [p. 446].
The notion of a separate kinaesthetic sense modality can be traced back to the early work of the dance theoretician Rudolf Laban (1988). He defines the kinaesthetic sense:
...the sense by which we perceive muscular effort, movement, and position in space. Its organs are not situated in any one particular part of the body, as those of seeing and hearing... [p. 111].
He says about the process of composing a dance:
...this cannot be an intellectual process only, although the use of words tends to make it so. The explanatory statements represent solely a framework which has to be filled out and enlivened by an imagery based on a sensibility for movement. [p. 110].
His "imagery based on a sensibility for movement" is very close to what I mean with Kinaesthetic Thinking. It is important to note that imagery in the context of choreography does not mean visual imagery, but an imagery which happens directly in the kinaesthetic sense modality when the dancer uses the body as medium and material.
In the context of interaction design, Kinaesthetic Thinking involves not only "a sensibility for movement", but also a sensibility for orchestrated responses to movement, i.e. interaction.
Kinaesthetic Thinking is to a large extent "tacit" in the sense that it is not simply manipulation of symbolic representations. In the terminology of Polanyi (1966), the trained Kinaesthetic Thinker possesses "Tacit Knowledge".
Schön (1983) reports on professional practices in fields as diverse as architecture and psychoanalysis. He found an important aspect of the design practices he studied to be a dialogue between the designer and the material. Each practice has its own specific "language of designing" which to a large extent is influenced by the medium through which the design ideas are expressed. Architects communicate their ideas mainly through sketches on paper. Most of their thinking also happens in this medium. To use the terminology of Norman (1993), the act of drawing sketches on paper is an integral part of the cognitive process of the architect. Architects think with brain, hands and eyes. They often experience it as if the materials of the situation "talk back" and take part in the process of creation.
Most writers have experienced the same phenomenon: the process of writing brings forth ideas that were not necessarily present before the actual writing started. The writer has entered into a dialogue with the text.
This view of the design process is very different from the view held by some theoreticians, e.g. Newell and Simon (1972), that design is a rational process best explained as goal-seeking behaviour. Rationalistic views of this kind were until recently widely held within computer science with respect to the software design process. As a result of extensive field studies of software design practices, it is currently hard to defend a view of software design as a rational top-down process.
Papert (1992) borrows the terms bricolage/bricoleur from the French anthropologist Levi-Strauss to describe design processes involving large elements of improvisation based on the material available. As the "father" of the educational programming language LOGO, he has observed how children are able to construct interesting programs in a bottom-up fashion. Their unstructured and playful behaviour shows important similarities with what Levi-Strauss observed among "primitive" tribesmen. The latter constructed their artefacts through playful improvisation from what was available in their natural environment.
Papert's colleague Martin (1995) observed the same phenomenon among university students in the LEGO robot course at MIT. In this course, the students were given the task of constructing working robots using an extended version of the LEGO Technics construction set. The set is extended with useful electronics, and with a programming environment for a microprocessor on board the robots. He observed that:
While some students do take a "top-down" approach towards their projects, enough do not so that we as educators should take notice. Most students' projects evolve iteratively and in a piecemeal fashion: they build on top of their previous efforts, sometimes pausing to re-design a previously working mechanism or integrate several previously separate ones.
From this he concludes that the design environments and materials should encourage playful design and creative exploration in the same way as the LEGO bricks do. Applied to software environments, he claims that this means providing clean levels of abstraction with well-defined and observable building blocks.
As Turkle (1984) has pointed out, the computer differs from other media in that it is both a constructive and a projective medium. In physics you cannot change the laws of nature to better fit naive theories of motion, but with a computer you can create software tools that enable users to construct their systems with abstractions and representations that are close to their own intuitive conceptions.
To support the bricoleur interaction designer, it is important that the tools and materials allow the designer to work fluently in an iterative and explorative modus operandi. To use the terminology of Schön (1983), it is important that the software environment enhances a "dialogue with the material" in the design process. The popularity of WYSIWYG (What You See Is What You Get) interfaces in computer-based tools for graphics and layout design indicates that this is best done in the sense modality of the resulting product. For the interactive aspects of user interfaces, this means designing in the kinaesthetic sense modality.
The design tools of experiments II and III have a lot in common with the end-user-programming environment AgentSheets, developed by Alex Repenning (1993). AgentSheets provides an agent architecture that allows domain experts to build domain-specific construction sets. The construction sets consist of agents with built-in behaviour which can be placed on a grid in a visual programming environment. A number of construction sets have been built and successfully tested on end users. Repenning has coined the term "tactile programming" to describe this way of constructing interactive software.
The findings from the three experiments reported here indicate that it is possible to create tools that enable designers to "tacitly" construct interactive behaviour directly in the kinaesthetic sense modality. From this it is possible to sum up the previous discussions in a set of guidelines for making user-interface design tools more supportive to Kinaesthetic Thinking:
In an attempt to try out the ideas presented here, a very first prototype was made of a design tool that enables interaction designers to construct graphical user interfaces by "painting" with pixels with inherent behaviour. I built this prototype to show the basic functionality of a possible new class of design tools. Informal user tests have been done which encourage me to continue developing the prototype along the lines outlined here. The basic idea is to allow the designer to construct interactive behaviour directly at the pixel level with "tools" resembling those in pixel-based paint programs like MacPaint. The prototype has strong similarities with AgentSheets (Repenning, 1993). In AgentSheets, the agents are represented as icons on a grid. Here, I have taken this approach to an extreme by letting each pixel be an interactive agent communicating with its neighbouring agents (i.e. pixels).
The functionality of the prototype is best shown through an example. Figure 9 shows an example of an interactive drawing which was constructed with the prototype. It illustrates an on/off switch which controls a light bulb. The switch has toggle behaviour.
Figure 9. An interactive drawing.
To construct this interactive drawing in black and white, four additional "colours" were needed, as illustrated in figure 10:
Figure 10. An interactive drawing in the making.
A runtime system takes care of making adjacent interactive pixels share input. This makes it possible to construct large areas, like the switch in the above example, which behave as interactive objects. Interactive objects can similarly be split into separate interactive objects simply by taking them apart on the canvas.
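A minimal sketch of this runtime idea: treat the canvas as a grid of pixel agents, and let a press spread through the connected region of like pixels, so that a contiguous area built by the designer behaves as one interactive object. The grid representation and the two-state behaviour below are my simplifications, not the prototype's actual code:

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class PixelAgents {
    // 0 = inert pixel; 1 = interactive pixel, "off"; 2 = interactive, "on"
    static int[][] canvas = {
        { 0, 1, 1, 0 },
        { 0, 1, 1, 0 },
        { 0, 0, 0, 0 },
    };

    // A press on one pixel is shared with all adjacent interactive pixels
    // in the same state: a flood fill toggles the whole connected region.
    static void press(int row, int col) {
        int from = canvas[row][col];
        if (from == 0) return;                       // inert: no reaction
        int to = (from == 1) ? 2 : 1;
        Deque<int[]> pending = new ArrayDeque<>();
        pending.push(new int[] { row, col });
        while (!pending.isEmpty()) {
            int[] p = pending.pop();
            int r = p[0], c = p[1];
            if (r < 0 || r >= canvas.length) continue;
            if (c < 0 || c >= canvas[r].length) continue;
            if (canvas[r][c] != from) continue;
            canvas[r][c] = to;                       // react ...
            pending.push(new int[] { r + 1, c });    // ... and pass the event
            pending.push(new int[] { r - 1, c });    //     on to the four
            pending.push(new int[] { r, c + 1 });    //     neighbouring
            pending.push(new int[] { r, c - 1 });    //     pixel agents
        }
    }

    public static void main(String[] args) {
        press(0, 1);                                 // press the 2x2 block
        for (int[] row : canvas) {
            StringBuilder line = new StringBuilder();
            for (int v : row) line.append(v);
            System.out.println(line);                // 0220 / 0220 / 0000
        }
    }
}
```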
The present study shows that users with no formal training in programming are able to reason about the behavioural aspects of GUIs without breaking the interactions down into discrete events. I have introduced the term Kinaesthetic Thinking to signify the modus operandi involved when interactive behaviour is designed directly in the kinaesthetic sense modality, i.e. without making use of textual or visual representations of behaviour. This is the "visual thinking" of interaction design.
Kinaesthetic Thinking is poorly supported by the current user-interface design tools. Great benefits should be expected concerning productivity, creativity, and job satisfaction from building design tools that better support this artistic way of working with interactive behaviour.
It is to me an open question how the psychology of Kinaesthetic Thinking should be studied further. How should the experiments be set up, and how should the results be interpreted? Will it be possible to identify simple gestalt principles in the kinaesthetic domain similar to those in the visual domain? If so, will these principles be as applicable to interaction design as the gestalt principles of the visual domain and the insights about visual thinking have been to graphical design?
I plan to develop the presented prototype into a working design tool together with a representative user group. A first step will be to develop a version of the prototype that enables kids to add interactivity to their computer drawings. I plan to develop this tool in Java and make it publicly available on the World Wide Web to gain practical experience with tools supporting Kinaesthetic Thinking.
To extend this approach from the well-defined domain of the experiments to the domain of today's complex software, we are faced with a trade-off between visual elegance and flexibility. To be able to give end-users back the control of their tools, we might have to rebuild today's software systems from scratch using simple interactive building blocks that fit together in simple ways. By doing this we will probably lose some of the elegance of the current software, but we will gain a new kind of flexibility. The situation is very similar to the difference between a toy built from LEGO bricks and other toys. Most ordinary toys have only one use, and when the kids are tired of them they are thrown away. A LEGO toy can be taken apart, recycled, modified, and extended forever. The drawback is of course that a toy made of LEGO bricks looks much less like what it is supposed to mimic.
It is currently an open question how the market would react to user interfaces and applications that have a different look and feel than today's software, but which can easily be opened up, studied, understood, modified, and re-used by most ordinary users.
Thanks to J. Spohrer, S. Houde, A. Cypher and numerous others at Apple Computer who gave me the opportunity to discuss these ideas during my sabbatical in their Advanced Technology Group. Also special thanks to W. Verplank, F. Nake, A. Repenning and A. Mørch for fruitful feedback.
Andersen, P. B., & Madsen, K. H. (1988). Design and professional languages. In P. B. Andersen & T. Bratteteig (Eds.), Computers and Language at Work (pp. 157-196). Oslo: SYDPOL, University of Oslo.
Andersen, P. B., Holmqvist, B., & Jensen, J. F. (1993). The computer as medium. Cambridge: Cambridge University Press.
Arnheim, R. (1969). Visual Thinking. Berkeley, CA: University of California Press.
Cypher, A. (Ed.) (1993). Watch What I Do: Programming by Demonstration. Cambridge, MA: MIT Press.
Dreyfus, H. L. (1991). Being-in-the-World: A Commentary on Heidegger's Being and Time, Division I. Cambridge, MA: MIT Press.
Itten, J. (1973). The art of color: The subjective experience and objective rationale of color. New York: Van Nostrand Reinhold.
Johnson, M. (1987). The body in the mind. Chicago: University of Chicago Press.
Kandinsky, W. (1994). Complete writings on art. New York: Da Capo Press.
Laban, R. (1988). Modern Educational Dance. Plymouth: Northcote House.
Lakoff, G., & Johnson, M. (1980). Metaphors we live by. Chicago: University of Chicago Press.
Lakoff, G. (1987). Women, Fire, and Dangerous Things. Chicago: University of Chicago Press.
Martin, F. G. (1995). A Toolkit for Learning: Technology of the MIT LEGO Robot Design Competition. Cambridge, MA: MIT.
Merleau-Ponty, M. (1962). Phenomenology of Perception (C. Smith, Trans.). New York: Humanities Press.
Newell, A., & Simon, H. A. (1972). Human Problem Solving. Englewood Cliffs, NJ: Prentice-Hall.
Norman, D. (1993). Things that make us smart. Reading, MA: Addison-Wesley.
Papert, S. (1992). The Children's Machine. New York: Basic Books.
Polanyi, M. (1966). The Tacit Dimension. London: Routledge & Kegan Paul.
Repenning, A. (1993). Agentsheets: A Tool for Building Domain-Oriented Dynamic, Visual Environments. Ph.D. thesis, Department of Computer Science, University of Colorado, Boulder.
Schön, D. (1983). The Reflective Practitioner: How Professionals Think in Action. London: Basic Books.
Tognazzini, B. (1992). TOG on Interface (pp. 79-89). Reading, MA: Addison-Wesley.
Turkle, S. (1984). The second self: Computers and the human spirit. New York: Simon & Schuster.
White, B. Y., & Horwitz, P. (1987). ThinkerTools: Enabling children to understand physical laws (Report No. 6470). Cambridge, MA: BBN Laboratories.