Towards a Design Methodology for Adaptive Applications

Malcolm McIlhagga, Ann Light and Ian Wakeman
School of Cognitive and Computing Sciences
University of Sussex
Brighton BN1 9QH

May 18, 1998

Abstract

We describe an abstract architecture of adaptive applications, and indicate where we believe crucial design decisions must be made. We illustrate the use of the abstract model in the design of an image proxy, and show where studies are required in determining the appropriate design points. In particular, even though adaptation to resource constraints is generally considered a systems problem, the adaptation is visible to the user in changes in utility, and so the user must be involved in designing the application. Finally, we discuss the politics that creep in when designers change the semantics of applications.

1 Introduction

In an age of cheap mobile compute devices, we would like a ubiquitous computing environment [20] in which applications can follow us around as we move from computing interface to interface, and from device to device. However, because of the well-known constraints of battery power, display capabilities and network characteristics for mobile computing, resources are not equally available across all platforms. To allow applications to scale across platforms, we need applications that adapt to the changing constraints of the resources available in the environment.

There have been a number of adaptive applications, ranging from those attempting to alleviate the effects of congestion and limited bandwidth, such as vat [5], vic [15], ivs [6] and rat [17], to User Interface Management Systems such as Amulet [16] which attempt to adapt to the display capabilities, through to battery management schemes within the operating system. The rise of middleware has generated renewed interest in providing generic adaptation policies, but we believe that the interfaces to manipulate resource requirements are phrased in the wrong vocabulary [4]. Recently there has been an attempt at defining object design patterns [18], but these do not address the application holistically. The work on the Glomop architecture [3] is closest in approach to our own. We have focussed on developing a policy control architecture rather than a scalable solution, and in particular on generalising to an architecture of adaptive applications.

In this paper, we provide a number of design choices which should be explicitly addressed when designing an adaptive application. We believe these design choices are common across all adaptive applications, and are fundamental in determining how successful the application is in adapting to changing resources while still providing utility to the user.
After explaining the design choices, we illustrate their use in the design of an Image Proxy for the WWW. A feature of adaptive applications is that they are no longer purely a systems design problem – they interact with users directly, and we believe that user studies must be undertaken to determine the appropriate interface and behaviours for the application. We finish by describing some of the political issues that arise when adaptive applications are used. As a by-product of this work, we have empirically determined appropriate quality factors for JPEG images, which are much lower than those produced by standard scanner software. We show that if JPEG images are reduced to the threshold quality factors, then image file sizes can be significantly reduced, lightening the ever-increasing load on the Internet.

2 Adapting to Constraints: Choices in Application Design

An adaptive application is one in which the application changes its behaviour according to the perceived constraints in the environment, so as to maintain the semantics of the application for the user.

[Figure 1: A Layered View of the Adaptive Application. Layers from top to bottom: user utility; application semantics; the application, comprising an adaptation controller and components exposing Open Implementation (OI) interfaces; resource constraints (network, display, battery, ...).]

If we take the above definition as our starting point, then we can immediately break the problem of designing an adaptive application into various stages. The constraints in the environment should first be determined. For mobile computing applications, the likely constraints are network quality of service, battery power and display capabilities.

We must next define the semantics of the application. What is the application trying to achieve? This is user and task specific, and can vary from context to context. As we will argue, the more generic the application, the more difficult it is to determine the application semantics. Having made a best guess at the semantics, a design goal is to ensure that the semantics of the application remain invariant across the constraint-reacting behaviours of the application. By doing so, we hope that the utility of the application to the user remains high.

In order to maintain the semantics, changes in the behaviour of the application should ensure that the application continues to work whilst adapting to the changing constraints. If the behaviour changes to adapt to a restriction of some needed resource, such as bandwidth, the change should inflict the minimum damage upon the essential functionality of the application. This leads us to a solution in which the implementation of the application is opened up so as to allow the choice of an equivalent but less resource-hungry implementation. We thus come to the first design point in our methodology.

Design Choice: Providing an Open Implementation of the components in the Underlying Technology

The concept of open implementation [7] from software engineering enables the designers of components to open up the implementation of their component so that it can be adjusted and adapted to suit various needs. The behaviour of a component should be described by its interface abstractions, but the implementation is generally hidden behind the interface.
Following the design guidelines of open implementation (OI) [13], the designer of the component will offer further meta-interfaces through which the programmer can adjust the implementation of the component. If we are aiming to produce distributed applications which scale across network and display capabilities, then OI offers a suitable software engineering approach to enable this scalability. Simply put, the OI part of a software component exposes the network requirements of a particular implementation of the component. By allowing the adaptation controller to manipulate the implementation of a component, the controller can adjust the network requirements of the component and thus adapt it to the prevailing network conditions.

Following the guidelines in [8], the designer of an open implementation interface should attempt to:

1. Define the abstract black-box interface, which instantiates the "useful" behaviour of the component.

2. Using domain knowledge of both the inherent implementation of the component and of how clients will use the component, define interfaces which allow the client to control implementation strategies.

The inherent implementation of the component is based around the abstractions that will be used to implement it. For the case of the file caching mechanism described in [12], these are the disk buffers and caches of a file system. For the network retrieval architecture described in Section 3, the abstractions are the compression schemes and representations of the various forms of multimedia. Since the abstractions used in understanding the implementation of a component are close to the domain of the application, the designer will be better able to tune the adaptation of the application.

Design Choice: Determining the Degrees of Freedom in Degradation Trajectories

The OI interfaces of the components will normally provide a discrete set of implementations along some axis of variation.
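As a concrete illustration of the two guidelines, the split between a black-box interface and an OI meta-interface might be sketched as follows. This is a hypothetical sketch, not code from the paper: the class, strategy names and bandwidth figures are all illustrative.

```python
class SpeechCodec:
    """Black-box interface: encode audio. The meta-interface below lets an
    adaptation controller inspect and swap the implementation strategy."""

    # Hypothetical strategies: name -> approximate bandwidth in kbit/s.
    STRATEGIES = {"pcm": 64, "adpcm": 32, "lpc": 5}

    def __init__(self):
        self._strategy = "pcm"

    # --- base (black-box) interface ---
    def encode(self, samples):
        # A real codec would compress here; we just tag the payload.
        return (self._strategy, samples)

    # --- meta (open implementation) interface ---
    def implementation_choices(self):
        """Available strategies, cheapest first."""
        return sorted(self.STRATEGIES, key=self.STRATEGIES.get)

    def bandwidth_needed(self):
        """Resource requirement of the current implementation."""
        return self.STRATEGIES[self._strategy]

    def set_strategy(self, name):
        """Adaptation controller adjusts the implementation here."""
        if name not in self.STRATEGIES:
            raise ValueError(name)
        self._strategy = name


codec = SpeechCodec()
codec.set_strategy("lpc")          # controller reacts to low bandwidth
print(codec.bandwidth_needed())    # 5
```

The base interface never changes as the controller manipulates the meta-interface, which is exactly the invariant-semantics property the text asks for.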
Imagine a simple speech tool which provides a set of encodings to use for speech. As the degree of compression in the encodings gets higher, the bandwidth requirements decrease but the quality of the resultant speech decreases, thereby decreasing the utility of the application. For more complex applications, each component may provide other choices of implementation which are orthogonal to each other, in that they can reduce resource usage in different ways. If we can add redundancy to the speech encodings, using some of our bandwidth to provide data to repair lost packets at the expense of greater latency, then we have another set of choices about how to use valuable bandwidth.

If we regard each component as supplying an OI interface which can be adjusted independently of the other components, the designer has a choice of implementation for each component. The number of independent axes which contain implementation choices is the degree of freedom of the application. Adjusting the Open Implementation of a component can be viewed as placing the component at some new point in the space of (implementation choice, resource usage, application utility). As we select a new set of implementation choices to decrease the resource usage of a component, we generally decrease the utility of the overall application. Thus for each of the possible implementations of the component, we could theoretically measure the utility of the resultant application. The designer must understand this space of possible implementations, for it is from this set of implementation choices that the designer chooses a path of degradation in the face of resource constraints.
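The implementation space and a degradation trajectory through it can be made concrete. The points below are invented purely to illustrate the (implementation choice, resource usage, utility) space, with encoding and redundancy as two orthogonal degrees of freedom as in the speech example.

```python
# Hypothetical (choice, resource usage, utility) points for a speech tool
# with two degrees of freedom: encoding and redundancy.
points = [
    # (encoding, redundancy, bandwidth_kbps, utility 0..1) -- invented figures
    ("pcm",   "fec",  80, 0.95),
    ("pcm",   "none", 64, 0.90),
    ("adpcm", "fec",  40, 0.80),
    ("adpcm", "none", 32, 0.70),
    ("lpc",   "none",  5, 0.40),
]

def best_under(budget_kbps):
    """Pick the highest-utility implementation that fits the resource budget."""
    feasible = [p for p in points if p[2] <= budget_kbps]
    return max(feasible, key=lambda p: p[3]) if feasible else None

# A degradation trajectory is this choice replayed as the budget shrinks:
trajectory = [best_under(b) for b in (100, 50, 10)]
```

Replaying the choice over a falling budget traces exactly the "path of degradation" the designer must plan: utility falls monotonically as the constraint tightens.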

Design Choice: How Transparent Should the Degradation Be?

Having plotted the implementation space, the designer must next decide how resource usage is to be monitored and controlled. For networked applications, this has generally been through a congestion signal such as packet loss, or for processor-sensitive applications, by examining the process queue length [9]. In our abstraction, control is through a generic adaptation controller, but this is generally integrated with some other component. The designer must next decide upon the resource availabilities at which to switch implementations, and whether this should include the user within the loop. Systems which exclude the user and use closed-loop feedback need to worry about the stability of the control loop, and the effect that varying application utility will have upon the user. Systems which include the user must be certain that the user is educated about the need for their collusion in adapting the application. Systems in the latter case include battery monitors querying the user about whether to close down applications, or video conferencing systems in which adaptation would reduce the quality to less than the minimum demanded [6].

Design Choice: Selecting Run-time versus Design-time Behaviour Specifications

Having investigated the implementation space and decided upon how degradation should be controlled, the designer must next decide when the possible trajectories through the implementation space are decided – when designing the application, or through interpreting some later set of instructions. Design-time behaviours include the use of reactive protocols in the face of congestion, such as in the video conferencing tools vic and ivs. These are fixed policies of degradation, or fixed trajectories through the implementation space. In general, design-time policies are easier to encode.
However, as applications become more generic and are used for more disparate tasks, the semantics of the application are increasingly determined by the context of use, and the utility of the application is difficult to pinpoint. An application which simply reports the state of some object has simple semantics and will only be used in a very limited set of contexts, and thus its behaviour can be pre-determined: the user can either have an update or not. On the other hand, a web browser is used in many different ways [10, 14], and the semantics of the retrieved pages depend highly upon context. As designers, we can only make best guesses about the utility of various implementations, and so we can only select trajectories through implementation space that approximate the profile of some imaginary user. In real situations, we must allow users to override profiles to determine which aspects of the application semantics are important to them at run-time.

Design Choice: Selecting Fixed Trajectory Behaviours

Users of applications don't care about the subtleties of network and display performance – they just want the software to work. Thus whether the behaviours are fixed at design-time or run-time, the user should be able to use the application without fiddling with editors to set values up. So the designer must determine and install a set of trajectories through the implementation space that correspond to some expected path through resource availability. We thus provide some fixed trajectories which give the application a behaviour when degrading in the face of resource shortage. In doing so, the designer has to determine the likely contexts in which the application is to be used, so that they can guess at the likely semantics, and then experiment and select appropriate behaviours. The choice of fixed trajectories has a large impact on whether users will use an application, and so is a very important part of the design process.
We believe that the context and associated semantics can only be determined by users, so as designers we must bring users into the design process. If possible, we should study them in situ, but if this is not feasible due to project budgetary constraints, they should at least be involved in experimental mock-ups.

Design Choice: Designing Appropriate Interfaces for Customisation

Traditionally, customisation of networking and distributed applications has been via dialogues which provide direct access to actual variables within the program. However, this is of limited use when the abstractions are themselves complex and have no meaning within the experience of the user. The open implementation of the components provides abstractions within the application domain to allow manipulation of behaviour. However, these abstractions may be far removed from the experience and understanding of the user. In particular, the various nuances and abstractions of networked and distributed applications are rarely understood by programmers, so it is unlikely that the user will be able to comprehend behaviour specification in terms of networking abstractions. Instead, the abstractions of the application must be mapped from the system image onto metaphorical controls which build upon the experiences of the user. By using metaphors from the experience of the user, we can educate the user to have an appropriate mental model of the application [1].


3 Technological Choice: The Lowband Multimedia Information Retrieval Architecture

We use active multimedia objects to adapt themselves to the available bandwidth and display characteristics. An active multimedia object is some form of multimedia, such as an image or a video, with an associated piece of code that can control various aspects of the object at the point it is served, cached or displayed. These aspects are such things as the level of lossy compression in a JPEG image, the frame rate of video, or the colour depth and size of an image adapted to a specific display device. We use the obvious media hierarchy to represent the media, as in Figure 2, and provide control interfaces to the media, as in Table 1. These control interfaces can be remotely called across the network, transforming the media before it is downloaded. As long as the overhead of the remote calls plus the transformed media size is less than the original media size, the transformations may considerably reduce the time taken to download the media, while being better suited to the display. The control interfaces are the Open Implementation of the components in our technology. Full details of the network aspects of the architecture can be found in [4].

    Media
    ├── Audio: PCM, LPC, ADPCM
    ├── Video: M-JPEG, MPEG
    ├── Image: XBM, GIF, TIFF, JPEG
    └── Text:  ASCII

Figure 2: A partial media hierarchy

Interface name   Description                                                Applicable classes
compress         Apply a lossless compression to the media                  uncompressed media
scale            Reduce media to display in a smaller size                  Video, Images
reduce           Apply lossy compression                                    all
toAscii          Translate media into some ASCII equivalent, typically      all
                 a description of what the media represents
toBmp            Convert an image to an X-Bitmap                            image
toGiff           Convert an image to a GIF                                  image
toHtml           Convert a media to some HTML representation                all
toProgressive    Convert a JPEG to a progressive scan representation        JPEG
toBaseline       Encode a JPEG using baseline Huffman coding only           JPEG

Table 1: Control interfaces

To demonstrate the viability of the architecture, we have developed a test-bed application that acts as an image proxy for web pages, providing similar functionality to [2]. Image tags within pages requested by browsers are replaced by applets which talk to a proxy holding the image, applying the transformations before downloading the image. The decision about what set of transformations to apply to a given piece of multimedia is made within the applet. This decision comes from interpreting an instance of what we have termed the Media Policy Language, mpl. This script is at the heart of the configurability of our system; it weaves the transformations and media together. As the name suggests, mpl defines policies on what to do to media in particular situations of bandwidth availability and display characteristics, as described in Section 5.
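The applet's decision about whether a remote transformation pays off follows directly from the size condition stated above. A minimal sketch, with all figures hypothetical:

```python
def worth_transforming(original_size, transformed_size, call_overhead):
    """Apply a remote transformation only if the call overhead plus the
    smaller payload still beats shipping the original media untouched
    (the condition stated in the text, in bytes)."""
    return call_overhead + transformed_size < original_size


# Hypothetical figures: a 200 KB JPEG reduced to 60 KB, 2 KB of call overhead.
print(worth_transforming(200_000, 60_000, 2_000))  # True
```

A real applet would evaluate this per candidate transformation from Table 1 and pick among the profitable ones according to policy.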


4 Plotting the Implementation Space: Experiments on Images

For the image proxy, the key points in the implementation space occur when the user detects a reduction in quality due to the adaptation taking place in the proxy. For the scale method, this is too dependent upon context to determine – the authors of the HTML may have fixed on a particular width of images, and any deviation will reduce the quality. However, for the reduce method operating on JPEGs, we can generally reduce the quality factor without impact upon the user. We could find no hard numbers within the literature, and from observation most JPEGs were encoded with too high a quality factor, so we designed an experiment to determine the limits of the quality factor on user perceptions. We could then relate this to the corresponding file sizes.

4.1 Perceptual Limits on JPEG Quality

The primary aim of this experiment was to determine the point at which users can no longer perceive the difference between a high-quality image and the same image reduced in quality through JPEG compression – this point we shall call the 'just perceivable difference' or JPD. When combined with knowledge about the approximate relationship to file sizes, we can plot the quality threshold for the various implementations of JPEG images. It is asserted that the mean of the JPD (in JPEG index) from a sample of Web users, plus twice the standard deviation of the JPD between an image of high quality and the same image compressed, represents the cut-off point to which all JPEG images can be compressed in a default policy: given a normal distribution, 98% of users would see no perceivable difference in image quality, and those who notice a difference would only notice a slight variation. In this study it was hypothesised that a difference would be found between JPDs for the different types of images used in the experiment.

4.1.1 Methodology in determining the Just Perceivable Difference

Four different pictures were considered, one at a time. Each subject was shown a series of images arranged in pairs – one being a full-quality image and the other an image that was gradually degraded and then improved in quality throughout the series (the quality of an image reduces as JPEG compression increases). Subjects were asked to say if the two images appeared the same or different in quality.

The process of determining the JPD for each image involves two phases, in a technique which approaches the subject's limit from both directions. Phase one comprises presenting two identical images and asking the subject if they are the same or different in quality. The presentation is repeated as the JPEG index of one of the presented images is reduced, until the subject indicates that they notice a difference in the quality of the images. Phase two differs from phase one in that one of the images is of very low quality (has a very low JPEG index). The presentation is repeated as the JPEG index of the low-quality image is increased, until the subject indicates that they notice no difference in the quality of the images.

Each presentation pair is organised in the following way: the first image is presented for 1 second and then removed from the screen. After a gap of 1 second, the second image is displayed for 1 second and then removed from the screen. The high-quality image and the altered-quality image were presented in a random order. After the second image has been removed from the screen, two buttons appear and the subject can indicate with a mouse click whether the images appear "THE SAME" or "NOT THE SAME" in quality. Phases one and two are repeated for each picture to complete a set; a set comprises determining the JPD for all four images. Each subject completed the set twice. As well as recording the subjects' scores, age and sex were noted.
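The two-phase procedure can be modelled as a simple staircase search. This is a hypothetical sketch of the protocol, not the experiment software; `perceives_difference` stands in for a real subject's responses.

```python
def find_jpd(perceives_difference, start=75, low=5, step=5):
    """Two-phase staircase, as in the experiment: degrade one image from full
    quality until the subject reports a difference, then improve a very
    low-quality image until the difference disappears; take the JPD as the
    midpoint of the two crossing points.  `perceives_difference(q)` models the
    subject's answer when the comparison image has JPEG index q."""
    # Phase one: reduce quality from the top until a difference is noticed.
    q = start
    while q > low and not perceives_difference(q):
        q -= step
    down_cross = q
    # Phase two: raise quality from the bottom until no difference is noticed.
    q = low
    while q < start and perceives_difference(q):
        q += step
    up_cross = q
    return (down_cross + up_cross) / 2


# A simulated subject who notices any JPEG index below 30:
print(find_jpd(lambda q: q < 30))  # 27.5
```

Approaching the limit from both directions, as the text describes, cancels the bias a one-directional sweep would introduce.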
Statistically, a conservative cut-off point is one where almost no proportion of the population sees a difference between the image they thought they were downloading and the image that they receive via the lowband proxy server. We have defined this as the mean plus twice the standard deviation of a sample of JPDs. However, a less conservative figure might be the mean plus one standard deviation; these issues are discussed below in Section 4.2.

The images: Since it was predicted that there might be a difference in the effects of compression on different kinds of image, and also in the detail users observe when looking at different kinds of image, it was important to show several different types of image to the subjects.


Four kinds of image were selected: a face, a tree, a map and an interior scene. The latter two pictures included several straight lines, whereas the former two were structurally more natural. All of them contained areas of detail and areas of block colour. This combination of image types was deemed to offer a cross-section of the kind of images that users would encounter on the Web, and also images with differing susceptibility to the JPEG compression algorithm.

Presentation criteria: Image presentation time is bounded by two factors: the visual persistence effect [11] and subject strategy development. The visual persistence effect is found when we look at an object for a short period of time: for a short period after a visual stimulus has been removed we can recall nearly all of the detail of the stimulus, but after that time recall becomes much worse. That is to say, we have a short-term visual buffer. The persistence effect is typically measured at around 250ms. If the presentation time and the delay between presentations were lower than 1000ms, then the persistence effect would represent a more significant part of the subject's recall mechanism, producing unwanted and uncontrolled-for effects. In pre-experiment trials, subjects reported that they developed strategies to determine image quality difference when the image was displayed for more than about a second – subjects probably develop strategies anyway, but the effect is less marked with a shorter presentation time. It was therefore decided that a presentation time of 1000ms for each image and a delay of 1000ms between images, combined with a random presentation order, would be the best presentation criteria for the experiment.

The subjects: 20 people took part in this section of the study. They ranged in age from 14 to 40, and all were regular Web users. Everyone had good or corrected sight. They were positioned approximately 75cm from the screen.

4.1.2 Results

Table 2 combines both sets of scores for each picture used in the experiment. All figures are measured in units of JPEG compression index. Table 3 shows the improvement in compression ratio when different cut-off points are used; a JPEG compression index of 75 was used as the baseline. Finally, Table 4 shows the probabilities and significance that the scores generated across subjects and between pictures are drawn from the same population. Clearly, in all cases but one the images can be said to have different JPDs.

1st and 2nd Set Combined (JPEG index)

          mean     sd      wse     wie
    face  27.13    8.37    8.22    13.65
    tree  18.10   10.18    7.48    18.43
    map   25.06    8.92    9.85    18.31
    room  31.40    8.92    6.02    10.63
    all   25.42    8.48    7.89    12.95

wse = within-subject mean squared error
wie = within-image mean squared error

Table 2: Mean scores of both sets for all subjects and all pictures.

    Cut-off point   JPEG index   Compression ratio
    mean + 2sd      42.38         57%
    mean + sd       33.90         78%
    mean            25.42        113%

Table 3: Mean compression ratios for meaningful cut-off points

4.2 Image Quality Choices

There are two aspects of interest regarding how Web users interact with the images that they download in Web pages: the users' perceptual limits, and their tolerance to visibly degraded images under differing conditions and in

    Picture pair   P of similarity   Significance
    face-tree      3.26e-09          High
    face-map       0.0724            None
    face-room      4.18e-05          High
    tree-map       2.41e-05          High
    tree-room      5.08e-10          High
    map-room       1.21e-06          High

Table 4: Results of pairwise t-tests for each picture pair

different contexts. Establishing the mean JPD over a number of images not only allows the calculation of a cut-off point for default policies, but also delimits the variation that can meaningfully be offered to users for customisation of their policy. The hypothesis proposed that the mean of the sample (25.42) plus two times the standard deviation (8.48), giving a further compression of around 57%, would be practicable. An alternative would be to adopt the mean plus one standard deviation, a value of 33.90. This would also be meaningful – 16% of users would notice only slight degradation in image quality, and the compression would be higher at 78%. In other words, a meaningful cut-off point does not emerge from the statistics but must be chosen.

The probabilities of similarity, derived from performing a pairwise t-test between each of the pictures, are very revealing. There are highly significant differences between each of the images – except between the face and the map. Authors should be made aware of these findings. In particular, their tools should be designed to encourage the use of a default setting that reflects the cut-off point established here: images intended for use in web pages generated by popular graphic and art packages can be saved at a JPEG index of 34. Finally, authors should be encouraged to write policies in which the quality of the image reflects its importance.

In conclusion, the variation in subjects' scores combined with the differences across images encourages a conservative estimate of mean + 2sd. However, if only mean + sd is used, a few users may notice the degradation, but this is a small price to pay for the added compression. The cut-off point of mean + sd (a JPEG index of 34) will be used for the system's default policy.
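The cut-off arithmetic can be checked directly from the sample statistics in Table 2; the population fractions are the standard normal-distribution figures the text appeals to.

```python
from statistics import NormalDist

mean, sd = 25.42, 8.48   # sample JPD statistics from Table 2

conservative = mean + 2 * sd   # 42.38: ~98% of users see no difference
relaxed = mean + sd            # 33.90: ~84% see no difference, ~16% notice

print(round(conservative, 2), round(relaxed, 2))
# Fraction of the population below the conservative cut-off,
# under the normal assumption:
print(round(NormalDist(mean, sd).cdf(conservative), 3))  # 0.977
```

The 0.977 figure shows why the text's "98%" claim for mean + 2sd is a slight round-up of the usual two-sigma coverage.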

5 Run-time Policies: The Media Policy Description Language

We have chosen to implement the policy of degradation at run-time for the following reasons:

- The underlying technology has four main methods through which the implementation of the transferred media can be manipulated. The application thus has at least four degrees of freedom in which we can change the implementation, providing a wide choice of trajectories of degradation.

- The application using the media is very generic – the Web model of documents is being pushed by Microsoft and others as the basic metaphor for the next generation of machine interfaces. Since the uses of the application are legion, we as designers cannot constrain the choice of degradation policy.

- The ability to transform the media entirely across types implies an even wider implementation space. Since transformations may be chosen as more convenient by the user – e.g. a partially sighted user may transform text to audio – we believe that allowing users to determine policies of adaptation will provide more exciting use of the technology.

- In downloading a particular media instance, there are two associated actors who are interested in how the media is downloaded and displayed: the person initiating the download and the author of the media. Each of these actors may have preferences about how the media should arrive – users may want the download as fast as possible, yet authors may require a colour display. To allow each of these actors to define requirements, run-time resolution of policies is easier.

Policy degradation objects are written in the Media Policy Language, mpl, and are interpreted within the application and the server. They can be combined with each other to form new policy objects representing the combined needs and experience of user, author and system designer.


The policy objects have been used to control the preferences users have in how images are viewed across the Web. However, the principle applies equally to other multimedia applications (networked or otherwise), and indeed to any application that wants to scale its interface. User, authorial and default policies could be devised to allow an application's widgets to present themselves in a sensible manner on differing platforms and visual displays.

mpl is a rule-based language that allows the mapping of certain actions or mutations to specific groups or subgroups of media, according to the current networking and display conditions. The general format is:

    path : condition : action [, submission]

Path means: apply the action to the specified media type if the condition is true and no rule belonging to a different rule group overrides the path. Conditions are legal boolean expressions. We use environment variables to hold values of networking and display conditions and attributes of the media, as determined by the run-time environment [19]. Each environment variable has a unique name and a type. Currently the variables are:

    FILESIZE        int
    MEDIA-HEIGHT    int
    MEDIA-WIDTH     int
    MEDIA-DEPTH     int
    DISPLAY-WIDTH   int
    DISPLAY-HEIGHT  int
    DISPLAY-DEPTH   int
    BANDWIDTH       int     (in bytes/second)
    RATE            int     (flow rate for streams, e.g. frames per second)
    MIME            string  (MIME type of the media object)
    META            string  (used for passing any other info)

Actions are compress (lossless), reduce (quality: lossy compression), scale (dimension) and transform (from one media type to another), etc., as in Table 1. For instance, if a user wishes to compress all objects over 10k:

    media.* : SIZE >= 10*1024 : compress;

or if a user has a monochrome browser:

    media.image.* : true : toMono;

or, for a more complicated expression, when the browser is always used over low-bandwidth links:

    media.image.jpeg : meta(progressive) == false : toProgressive, submit default;
    media.* : true : scale 75, submit author;
    media.* : true : reduce 50, submit author;
    media.* : true : compress 10;

Policies can be combined with information from the environment to select paths of degradation depending upon the prevailing constraints. So if the user wishes to ensure that all download times are less than five seconds, they first attempt to reduce the quality; if this fails, they scale the object, using the resolution process in Section 5.1:

    media.* : SIZE/BANDWIDTH > 5 : reduce 50;
    media.* : SIZE/BANDWIDTH > 5 : scale 75;
    media.* : true : compress 10;

An author's policy to make sure that a JPEG fits in the available space:

    media.image.jpeg."www.site.org/pics/mypic.jpeg" : MEDIA-WIDTH > DISPLAY-WIDTH : scale (MEDIA-WIDTH / DISPLAY-WIDTH);
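The path matching and condition testing that mpl performs might be sketched as follows. This is a hypothetical Python model, not the actual mpl interpreter; the rule set and environment values are invented, mirroring the five-second-download example above.

```python
import fnmatch

# One mpl-style rule: (path pattern, condition on the environment, action).
rules = [
    ("media.*",       lambda e: e["FILESIZE"] / e["BANDWIDTH"] > 5, "reduce 50"),
    ("media.*",       lambda e: e["FILESIZE"] / e["BANDWIDTH"] > 5, "scale 75"),
    ("media.image.*", lambda e: True,                               "compress 10"),
]

def actions_for(media_path, env):
    """Return the actions whose path matches the media object and whose
    condition holds in the current environment."""
    return [action for pattern, condition, action in rules
            if fnmatch.fnmatch(media_path, pattern) and condition(env)]


# Invented environment: a 120 KB object over a 4 KB/s link -> 30 s download.
env = {"FILESIZE": 120_000, "BANDWIDTH": 4_000}
print(actions_for("media.image.jpeg", env))
```

A real interpreter would then feed the selected actions into the multi-pass resolution process of Section 5.1 rather than simply listing them.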

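The five-second download-time policy above can be traced by hand. The following sketch is our own illustration: the reduce and scale helpers are hypothetical stand-ins for the real media interfaces, and the assumption that scaling both dimensions shrinks the byte size roughly quadratically is ours, not the paper's.

```python
# Illustrative sketch of evaluating the five-second download policy.
# The helper functions are hypothetical stand-ins for the reduce and
# scale interfaces of the media objects.

def reduce_quality(size, percent):
    # Stand-in for 'reduce': lossy reduction shrinks the byte size.
    return int(size * percent / 100)

def scale_dims(size, percent):
    # Stand-in for 'scale': scaling both dimensions by p% shrinks the
    # byte size roughly quadratically (our assumption).
    return int(size * (percent / 100) ** 2)

def apply_policy(size, bandwidth):
    """Fire the rules in order until the download-time condition holds."""
    if size / bandwidth > 5:     # media.* : SIZE/BANDWIDTH > 5 : reduce 50;
        size = reduce_quality(size, 50)
    if size / bandwidth > 5:     # media.* : SIZE/BANDWIDTH > 5 : scale 75;
        size = scale_dims(size, 75)
    return size                  # compress 10 would follow for non-lossy media

# A 200 KB image over a slow modem link (~1800 bytes/second):
final_size = apply_policy(200 * 1024, 1800)
```

An object already small enough to download within five seconds passes through untouched, which is exactly the "no rule fires" termination condition described in Section 5.1.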

5.1 Policy Resolution

The policy scripts are compiled together to create a single executable policy. The policy can act on the multimedia object with which it is associated through the various interfaces discussed earlier: transformation, reduction, scaling and compression. When the policy is activated (asked to transform its media), it goes through a number of parsing and resolution phases. These phases determine which rules are relevant to the associated multimedia object and which rules can be removed or overridden by others, and they establish a definitive precedence order between rules from different policy sources. Initially, any rule which applies to media other than that of the attached multimedia object is removed. Rules which clash are then resolved according to the following criteria:

1. user rules have precedence over authorial and default rules, except when 3. or 4. is in place,
2. authorial rules have precedence over default rules, except when 3. is in place,
3. user and authorial rules can specify that they submit to default rules,
4. user rules can specify that they submit to authorial rules.

Four phases of multi-pass rule activation then take place:

Transformation involves firing rules that transform the multimedia object from one type to another in the media hierarchy (Figure 2). Rules that utilise the transformation interface are key to the writing of policies that cope with difficult display attributes and location-dependent data.

Reduction involves firing rules that utilise the reduce interface, thus reducing the quality of the attached multimedia object and so improving download time. The reduction phase is a multi-pass operation: each pass of the rules further reduces the multimedia object, and passes are repeated until no rule fires, that is, until all of the media's size and quality requirements are met. Of course, this may never happen!
So, the multi-pass mechanism is constrained by a set of heuristics which can identify looping, the exhaustive limits of compression, and rules which would reduce the multimedia object to the edge of our perception.

Scaling is the process of altering the dimensions of the media. Images are scaled in the X and Y dimensions, as are video (which also has the frames-per-second dimension). For some media types scaling is not meaningful, or takes on a different meaning: it is not meaningful to scale ASCII text, and it is only meaningful to scale audio in terms of amplitude or tone, qualities that can be adjusted to suit the user but do not affect the download time.

Compression is a simple one-pass treatment of applying non-lossy compression where possible. Media that benefit from this are those which, unlike JPEG and MPEG, do not support their own compression technique; anything under the MIME type text/* can benefit from the use of non-destructive compression¹.

This multi-phase, multi-pass process of resolution is at the heart of what makes the policy work. It combines the disparate needs of the user, the author and the system designers.
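The precedence criteria above can be sketched as a small resolver. This is our own illustration of criteria 1-4, not the mpl implementation; the (source, submission) rule representation and the half-rank demotion are assumptions.

```python
# Illustrative sketch of the clash-resolution criteria. Rules are modelled
# as (source, submission) pairs; names and ranks are our own assumptions.

RANK = {"user": 0, "author": 1, "default": 2}   # lower rank = higher precedence

def effective_rank(source, submission):
    """Apply criteria 1-4: a submission clause demotes a rule just below
    the source it submits to."""
    if submission == "submit default":
        return RANK["default"] + 0.5            # criterion 3: yields to defaults
    if submission == "submit author" and source == "user":
        return RANK["author"] + 0.5             # criterion 4: yields to the author
    return RANK[source]                         # criteria 1 and 2

def resolve(clashing_rules):
    """Return the winning (source, submission) pair among clashing rules."""
    return min(clashing_rules, key=lambda rule: effective_rank(*rule))

# A user rule that submits to the author loses to an authorial rule:
winner = resolve([("user", "submit author"), ("author", None)])
# winner == ("author", None)
```

Without a submission clause the ordinary precedence applies, so a plain user rule beats both authorial and default rules.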

6 Setting the Fixed Trajectories: Determining a Default User Policy

There are three good reasons for the existence of a default policy. First, it is necessary to provide reasonable behaviour for an object if no policy is provided by either user or author. Second, it is sensible to insist on certain behaviours unless the user or author overrides them. It is sensible to use lossless compression on objects that implement no compression of their own, such as text, HTML, PostScript, etc. It is also useful to be able to provide default policy decisions that affect the quality of the media only slightly but massively improve download time or display usability, such as a small reduction in the quality of a JPEG or in the frame rate of 30 fps video (see Section 4.1). Finally, the default policy is the basis from which the user incrementally develops their own policies.

In an ideal world in which we had usable technology, we would have studied users in context, using browsers connected to the lowband proxy. Unfortunately, the performance of the proxy was stymied by the performance problems of RMI and in particular of Object Serialisation. Whilst we wait for Sunsoft to fix these in a new release of the JDK, we have conducted exploratory interviews to try to capture user opinions about acceptable trajectories of degradation and interface behaviour.

¹ The GZIP compression utility is used as a publicly available non-destructive algorithm.


6.1 Image Quality Tolerance in Context

It is relatively straightforward to determine the point at which images of reduced quality become perceptibly different from their full-quality counterparts. The point varies between users, affected by physical factors, but the range is measurable. However, the same standards cannot be applied to the issue of users' tolerance. Tolerance is a notional concept of image acceptability and, clearly, a number of variables complicate any measurement of it. It can be predicted that there would be differences in how individuals regard the issue, based on personal taste and also on what their relationship with the image might be at the time: do they need information from it? Do they like it? Do they even notice its presence? In order to understand a little more of the interaction between these factors, we undertook an exploratory study. The study was not expected to generate any definitive design criteria, but rather to explore whether such criteria were appropriate at all. Obviously, a thorough test of quality tolerance would have involved working with several different pictures drawn from different genres, presented in different contexts and evaluated in the pursuit of different tasks. Since, at this point of development, the purpose in involving users was to gather indicative data, a much simpler method was sufficient. It involved presenting users with one picture in several conditions. Semi-structured questioning optimised the amount of information that could be gleaned. There were four groups of questions: user-related, image-related, time-related and product-related.

The image
The chosen picture was presented to participants, without context, in four states of degradation: the full Web image of JPEG quality factor 75, and reduced images of quality factors 18, 12 and 6².
After looking at the full quality image (see below), users were given the three degraded images and asked whether they would accept "all or any of these" in the context of a web page they had requested. The picture presented to them showed a room with ornamental detail in old furniture and a series of antique portraits. These details were fairly clear in the original image, but by the third degraded image (of quality factor 6) nothing except shape and disintegrated colour was evident. The image was picked as holding enough detail, in its full state, to arouse detached interest or curiosity, but as being emotionally neutral. This was confirmed by the results of the first question, which asked for users' response to the full quality picture's content. From the user's point of view, when using the Web there are two categories of image: the ones that the user actively searches to see, and those that just arrive as part of pages (which can be further broken down into those which happen to arouse interest and those which do not). A "neutral" image modelled the haphazard condition and allowed for a similar response between users in terms of engagement. Other conditions were discussed.

The users
10 people participated in this study. At the time of the interviews they were all using the Web at least 3 hours a week, with some up to 10 hours. Only one browsed regularly with images disabled, but another occasionally used a text-only browser. One had been using a text-only browser extensively to get "a gist" of pages, but had abandoned this tactic in the last year as images had become more integral to pages. Otherwise, users viewed the Web through a Netscape browser of version 2.2 or above; one used a black and white VDU and another a small screen. Most had access to fast network connections for which they did not pay. Users' responses to the three degraded images in the second part of the study varied widely.
Most users qualified their answers even before committing themselves to considering the images in front of them. As expected, context and purpose were determining factors: "it depends on what the image is for", "surely depends on the relevance it has to the page being called"; while one person identified the size at which the picture appeared as the major issue. A majority agreed that the best quality image of the three (22 Kb) was "acceptable". However, everyone also agreed that there were times when this was not good enough: if the purpose of finding the picture was to study the detail or to print it off, then higher quality was needed. Users also identified other kinds of images that would require better rendering, images of faces or fine art in particular. Opinion split about the cut-off point of this notional acceptability, with an increasing number rejecting the image as quality went down. Some people felt that no picture was preferable to a degraded one, while others felt that general information could still be gleaned from it. Most users mentioned either context or purpose again at this point. One user accepted any quality of image: "I just wouldn't look at it as long", she said of the poorest. Another user rejected all the images: "I'd find any quite frustrating if I was trying to look at the picture for some reason". So a difference in general tolerance became apparent, even in a small sample.

² Corresponding to approximately 73, 22, 15 and 9 Kb.


Waiting for images
Users were vague as to how long they would wait for images to appear on a page. Several said it was relative to what they were used to. Typically, it was "not very long". One user had measured his tolerance and identified it as 15 seconds. Other figures, where hazarded, stretched from a few seconds to a couple of minutes. Answering the question appeared difficult for most users: it was accompanied by long pauses and claims that it was a difficult question, and one person watched a second go by on his watch before answering. There was also an implicit acknowledgement that patience varies from occasion to occasion. Of course, context was a determining factor too: if users knew that they wanted to see the picture, then they would wait considerably longer than if they had no interest. And one pointed out that if she knew how long the wait was likely to be, she could plan round it. Several other interesting behaviours were referred to:

- browsing as a background activity while doing something else
- browsing with more than one browser window open so that multiple web activities could be conducted
- reading text as the rest of the page (i.e. images) downloads
- clicking on links in the text of the appearing page without waiting for the image to download ("clicking through pages")

Despite these mechanisms for avoiding waiting, frustration with slow-downloading images was apparent. The reasons for this varied: for instance, there might be no interest in the particular image, or there might be no interest in the page. Given the exploratory nature of much web activity and the poor description of links, it was not surprising that several people said they wanted to see quickly whether they had found something interesting or not. Reference was made to the cost of waiting for images, especially across a slow modem link. There was consensus that, all other things being equal, users would prefer images that arrived in 2 seconds rather than half a minute.

Preview mechanism
Users were then asked whether they would appreciate the option of previewing images fast at low quality. They were told that:

- the quality would be around that of the most degraded image they had just been shown
- it would take a few seconds or less to arrive under usual circumstances
- the full image would be easily summoned, if they wanted to see (and wait for) this.

All of the sample welcomed the idea enthusiastically. They were then asked whether a “point and click” mechanism would be appropriate for summoning the full image. Most people felt that this would be intuitive, though several points were raised:

- This could be confused with hitting a link, especially as some pictures were links.
- Newcomers might not know that clicking for a full quality image was an option.
- An icon should accompany degraded images to indicate their status.
- A slider mechanism would allow a choice of how much quality was returned.
- Dragging an icon over the picture to be upgraded would allow for several to be chosen and reloaded at the same time.
- A menu choice could allow all the pictures on a particular page to be upgraded simultaneously.

In summary, users' tolerance of visibly degraded images depends on their purpose in viewing the image, the kind of image viewed and the context in which the image is placed, although there are also observable differences in people's underlying tolerance levels. Despite this, users would seem to appreciate a fast, degraded "preview" image in place of a full-quality picture on a web page, as long as they knew that they could summon the full picture easily. These findings suggest that pushing compression to the edge of perceptibility and beyond would be acceptable in writing default policies, as long as users could easily amend them. But would they? There is no convenient correlation between tolerance and readiness to customise: some of the most critical users are impatient with software controls, while others would be happy to adjust them at length.

The findings also suggest that units of time are not an intuitive measure for users. Relative terms such as "faster" might be more appropriate for describing speed of delivery than actual timings. Since bandwidth varies during downloading and timings cannot be guaranteed, this strategy is also supported by local circumstances.

7 User Customisation: The Policy Editors

The Media Policy Language would be sufficient for an expert programmer to configure their browser using a raw text editor. However, users cannot be expected to write mpl scripts. Instead, we have constructed a series of layers around the policy descriptions, transmogrifying the concepts of mpl into abstractions and language that are more meaningful to the user. Each user of the system begins with a default mpl file, which provides basic facilities to improve the Web experience over low-bandwidth links or on resource-poor displays. They are then allowed access to concentric shells of editors which adjust the parameters according to their expertise. However, since little is known about the appropriate models upon which to build interfaces for distributed applications, the designs of the interfaces allow for the collection of information about the minimum levels of quality users wish to accept.

Figure 3: The simple editor

The prototype offers two interfaces to the user: simple and advanced. A first dialog box, common to both, allows the user to select details of constraints and persistent viewing preferences from a series of choices. This creates a policy which never overestimates the capacity of the display screen and can make maximal use of other resources, such as bandwidth. The advanced screen offers a mechanism for further customising the display. We anticipate that most people will opt to use the simple controls (Figure 3): a series of buttons and a slider which allow the user to specify change in terms of outcome (faster or smaller images) rather than method (e.g. compression and transformation). The system gives immediate feedback by representing the prospective changes on an image in the dialog box before a decision is confirmed. A further "advanced" button gives access to the code which describes the policies (see below), and confident users can adjust settings there directly. The interfaces were developed directly with users, who chose the terminology in which the outcomes were described. We gave this aspect of the work particular attention, as concepts of bandwidth, image file size and compression are not familiar to some of the users who would benefit from employing the architecture, while the frustrations of delay and display are common to all. It is also easy to cater for a user viewing a web page from a mobile device such as a laptop or palmtop: the user can specify a policy that accounts for the display and bandwidth attributes of their device; indeed, they could select a predefined policy in the same way that device drivers are selected by users for new hardware.
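As an illustration of how an outcome-oriented control might map onto mpl actions, consider the sketch below. This is our own hypothetical mapping, not the prototype editor's: the slider bands, percentages and rule choices are assumptions.

```python
# Illustrative sketch only: translating a "faster images" slider position
# (0-100) into mpl-style rule text, in the spirit of the simple editor.
# The bands and percentages are our own assumptions.

def slider_to_rules(position):
    """Higher slider positions trade more image quality for speed."""
    rules = ["media.* : true : compress 10;"]          # lossless help is always safe
    if position >= 25:
        rules.append("media.* : true : reduce 50;")    # add lossy quality reduction
    if position >= 50:
        rules.append("media.* : true : scale 75;")     # shrink dimensions too
    if position >= 75:
        rules.append("media.image.* : true : toMono;") # drop colour as a last resort
    return rules

# A mid-range setting yields lossless compression plus moderate reduction:
rules = slider_to_rules(40)
```

The point of such a mapping is that the user reasons about outcomes ("faster") while the editor emits methods (compress, reduce, scale, transform) on their behalf.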


8 Policy politics: Use and Usability

As system designers, we expected that developing an application that attempted to maximise utility under resource constraints would unconditionally be "a good thing". However, for shared applications in which the parties do not necessarily share the same goals, issues arise about which of the parties should retain power. The degree of control that authors of Web pages have over presentation compares unfavourably with that in most other media, as the standard layout tool, HTML, is a logical mark-up language. HTML is changing slowly to take in more visual factors, such as specified fonts and styles, because of producers' dissatisfaction with this state of affairs. However, these new tags do not guarantee that what is produced is what the user sees, because of the variation in browsers' interpretations. And users continue to have the opportunity to override some of the layout tags provided, and to switch off images and Java applets if they so choose. The balance of power between user, technology and author is in continual flux, remaining a negotiable feature of the Web as originally intended.

On the face of it, policies strengthen the user's hand and contribute to this uncertain environment for the author. If some authors continue to show the same cavalier attitude to image file size and the provision of "Alt" tags as a text alternative, then at least users now have redress. But policies have been designed to allow either side to claim or relinquish control. An author must defer to user preferences where they exist, and cannot insist on certain standards of presentation but only request them. Conversely, a user may define a policy that is in part overridden where an author has produced a policy, and in part defines their definite display and download needs. Thus, they can receive something close to the intentions of the author (if they so wish), or something that is "now readable" on their palmtop.
In the final analysis, if the two parties have conflicting policies, then the user's will triumph. After all, it is the user who must wait while images download, and the user who has special needs, such as small displays or disabilities, that necessitate differing policies. The prototype software offers a workable alternative to browsing with images disabled, and therefore goes a long way towards ensuring that the author's vision is delivered, if in slightly modified form.

A philosophical view of the conflict between the author's and the user's policies is that we are seeing a concrete representation of the post-modern clash about control over the text. Distribution of media over networks has provided another form of distance between the author and the reader, where the network and display may change the experience intended by the author into something completely different, even before the reader brings themselves to bear. By making the changes forced by networked delivery explicit and configurable, we can enable the author and the user to enter into a dialogue about what the media is intended for, and use the policy precedence to resolve the asynchronous dispute harmoniously. But this dialogue can only occur if authors recognise that they must take account of the various alternate representations of their creations.

9 Future Work

We are currently extending the image proxy to a general proxy architecture in which any media can become active. We are also investigating authoring tools that encourage the generation of alternate representations of multimedia presentations, and that allow authors to express their policies.

References

[1] L. Clark and M. A. Sasse. Conceptual design reconsidered: the case of the Internet session directory tool. In People and Computers XII: Proceedings of HCI'97, pages 67-85, Bristol, August 1997.
[2] Armando Fox and Eric A. Brewer. Reducing WWW latency and bandwidth requirements via real-time distillation. In Proc. Fifth Intl. WWW Conference, Paris, May 1996.
[3] Armando Fox, Steven D. Gribble, Eric A. Brewer, and Elan Amir. Adapting to network and client variability via on-demand dynamic distillation. In Proc. Seventh Intl. Conf. on Arch. Support for Prog. Lang. and Oper. Sys. (ASPLOS-VII), Cambridge, Ma, October 1996.
[4] Malcolm McIlhagga, Ian Wakeman, and Andy Ormsby. Signalling in a component based world. In Proceedings of the First IEEE Open Architectures for Signalling, San Francisco, Ca., April 1998.
[5] Van Jacobson and Steve McCanne. Visual audio tool - vat. Manual pages, 1992.
[6] Ian Wakeman, Jean Bolot, and Thierry Turletti. Multicast congestion control in the distribution of variable bit rate video in the Internet. In Proceedings of ACM SIGCOMM '94. ACM, August 1994.
[7] G. Kiczales. Beyond the black box: Open implementation. IEEE Software, January 1996.
[8] Gregor Kiczales, John Lamping, Cristina Videira Lopes, Anurag Mendhekar, and Gail Murphy. Open implementation design guidelines. In Proceedings of the International Conference on Software Engineering, Boston, Ma, May 1997.
[9] Isidor Kouvelas and Vicky Hardman. Overcoming workstation scheduling problems in a real-time audio tool. In Proceedings of the Usenix Annual Technical Conference, Anaheim, Ca, January 1997.
[10] Ann Light. Interactivity on the Web. Web document <http://www.cogs.susx.ac.uk/users/annl/tax.html>, April 1998.
[11] Gerald M. Long. Iconic memory: A review and critique of the study of short-term memory. Psychological Bulletin, 88(3):787-820, 1980.
[12] Chris Maeda. A metaobject protocol for controlling file buffer caches. In Proceedings of ISOTAS '96, 1996.
[13] Chris Maeda, Arthur Lee, Gail Murphy, and Gregor Kiczales. Open implementation analysis and design. In Symposium on Software Reusability (SSR'97), Boston, Ma, May 1997. ACM.
[14] Gary Marchionini. Browsing: Not lazy searching. Slides from talk given at the ASIS 96 panel on browsing, <http://www.ee.umd.edu/~march/asis96/teal.html>, 1996.
[15] Steven McCanne and Van Jacobson. vic: A flexible framework for packet video. In Proceedings of ACM Multimedia, San Francisco, Ca, November 1995.
[16] Brad A. Myers, Richard G. McDaniel, Robert C. Miller, Alan S. Ferrency, Andrew Faulring, Bruce D. Kyle, Andrew Mickish, Alex Klimovitski, and Patrick Doane. The Amulet environment: New models for effective user interface software development. IEEE Transactions on Software Engineering, 23(6):347-365, June 1997.
[17] Colin Perkins, Vicky Hardman, Isidor Kouvelas, and Angela Sasse. Multicast audio: The next generation. In Proceedings of INET 97, Kuala Lumpur, Malaysia, June 1997.
[18] Edward J. Posnak, R. Greg Lavender, and Harrick M. Vin. An adaptive framework for developing multimedia software components. CACM, 40(10):43-47, October 1997.
[19] Nick Sharples and Ian Wakeman. Netbase: Gaining access to Internet quality of service from an application. Technical Report CSRP 476, School of Cognitive and Computing Sciences, University of Sussex, 1998.
[20] M. Weiser. Some computer science issues in ubiquitous computing. Communications of the ACM, 36(7):75-85, July 1993.
