Application of a simple visual attention model to the communication

Nicolas Maisonneuve

Application of a simple visual attention model to the communication overload problem Received: 07-06-2007 / Accepted: 22-06-2007

Abstract The network organization model described several years ago by Lenge [2] has, to a large extent, become a reality. Furthermore, the convergence of devices and networks has increased the connections among people: communication is always on, anywhere, anytime. But this sense of real-time, global awareness has a price: an overload of solicitations and interruptions. The main goal of this paper is to present a simple mechanism, based on an attention model, that selects from a list of received items (emails, blog/community feeds, IM) the most salient ones for a user with limited attention. This attention model is inspired by the visual attention model of J.M. Wolfe [21], which reflects a biological behavior; for our purposes, visual signals are replaced by communication stimuli. Like this visual attention model, our model has two forms of guidance. Bottom-up guidance directs attention toward signals whose features differ from those of their neighbors (reputation, solicitation scarcity, audience focus, etc.). Top-down guidance directs attention toward signals that have target features (the user's current interests, represented in an intention profile).

Keywords: attention aware system, communication overload, visual attention model, social network, computer mediated communication, attention economy

N. Maisonneuve, The Centre for Advanced Learning Technologies, INSEAD, Fontainebleau. Tel.: +33 (0) 1 6072 9168. E-mail: [email protected]

1. Introduction

1.1. Context

The Internet increases people's social capital and the connections among people. Not surprisingly, the time dedicated to our communication has increased considerably. This is also true from the enterprise perspective. The network organization model described several years ago by Lenge [2] has, to a large extent, become a reality. Increasingly, today's knowledge workers work in virtual and distributed teams. Furthermore, the convergence and interoperability of devices (offline devices such as cell phones are connected to online channels such as IM and email) and networks provide a real-time and global sense of awareness. Related to this, a new user behavior is emerging, described by L. Stone [17] as “continuous partial attention” (CPA): people attempt to stay partially but continuously aware of the activity within their networks. “Continuous partial attention involves constantly scanning for opportunities and staying on top of contacts, events, and activities in an effort to miss nothing.” It is an always-on, anywhere, anytime, any-place behavior.

1.2. Problem

Being aware of the activity of virtual teams, virtual communities and personal or professional social networks is a difficult task that requires making choices. According to a Basex study [16], the total cost to the U.S. economy of attention-management problems caused by e-mail and other online tools amounts to about $588 billion a year. An information-rich environment is characterized by competition for the user's attention (attention economy [2]), while a person still has a limited capacity to manage his attention (and his social relationships [3]). A typical situation for a user in a rich social environment is to decide, from a list of received items (emails, IM messages, articles received from community or newsgroup feeds), which ones are the most salient given his limited attention capacity (e.g. time to read). The selection should keep him aware of unexpected but important events while avoiding information that is noise with respect to his current interests.

1.3. Proposed approach

In this early step of our research we introduce an attention aware system based on the application of a visual attention model to a communication context. More specifically, we adapt the visual attention model of J.M. Wolfe called “guided search 2.0” [21] by replacing visual stimuli with communication stimuli. We use this analogy because the two problems are similar. Humans have developed a versatile ability to extract, filter and compress information from visual stimuli. The retina has approximately 130 million cells to capture light, but only about 1 million fibers leave the eye. This high-level faculty to filter and compress (roughly 100:1) the visual information overload is achieved notably through a complex set of attentional processes that enable and guide the selection of incoming perceptual information.

The remainder of this paper is organized as follows. In Section 2 we present some attention models, especially the “guided search 2.0” model. In Section 3 we introduce our approach and its adaptation to attention management in a communication context. Section 4 presents a few scenarios related to our problem, and Section 5 concludes.

2. Related work

2.1. Attention models

Attention models (not only the visual ones) usually try to do the same thing: given a set of constraints (e.g. limited resources) and some information about the environment and the task, they attempt to predict which option will be chosen in order to achieve a given goal. A lot of research has addressed the design of attention aware systems that deal with the issue of interruption (see Roda [13] for a review). Recently, Huberman et al. [8] attempted to answer, in a general way, the same kind of problem we have (selecting salient items in a rich information environment). They formulated it as a restless bandit problem, a dynamic allocation problem from the multi-armed bandit (MAB) family. But this approach reflects neither the psychological nor the social aspect of attention.

Furthermore, in the fields of human and robotic vision and of psychology, a lot of research has been done to develop visual attention models (predicting the allocation of human visual attention). These models fall into one of two categories [6]:

• Models of visual sampling/monitoring behavior (how do people scan or monitor a set of areas). This question is mainly studied in the aviation domain, especially in supervisory control tasks (cf. [6] for a review).
• Models of visual search (how do people locate an object in the visual environment).

We think that studying both types of visual model can help with our attention regulation problem, but in this paper we focus only on visual search models and notably, because of its simplicity, on the guided search 2.0 model proposed by Wolfe.

2.2. The “guided search 2.0” visual attention model

A person searching for visual targets among distractor items guides attention with a mix of top-down (endogenous attention) and bottom-up (exogenous attention) activations. The bottom-up activation depends on the stimuli's properties and is largely independent of the user's knowledge. Treisman [18] proposed that in a preattentive stage only simple, basic visual features, such as intensity, color, orientation and motion, are computed in parallel over the entire visual scene, resulting in feature maps. These feature maps aim at detecting, for each feature, the salient areas in the scene. However, saliency cannot always capture attention in a purely bottom-up fashion if attention is focused or directed elsewhere in advance. It is therefore necessary to recognize that attention is also controlled by top-down information relevant to the current visual behavior. The information that guides attention in this case can be labeled top-down, meaning that it depends on the observer's knowledge.

Fig. 1 – “guided search 2.0” visual attention model

In the guided search model, a visual scene is decomposed into one feature map per feature (intensity, color, orientation, motion, etc.). All the feature maps are then combined and modulated by the user's task into a single scalar “saliency map”, which encodes the saliency of each location in the scene (see Fig. 1).
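The combination step described above can be illustrated with a minimal Python sketch. It is a deliberate simplification and not Wolfe's full model: each feature map is a small grid of activations in [0, 1], the task modulation is reduced to one hypothetical weight per feature, and the function names, grid values and weights are all assumptions made for this example.

```python
def combine_feature_maps(feature_maps, task_weights):
    """Weighted sum of 2-D feature maps into one saliency map, rescaled so the peak is 1.

    feature_maps: dict name -> grid (list of rows) of activations in [0, 1].
    task_weights: dict name -> weight expressing the task's modulation
                  (hypothetical; features without a weight default to 1.0).
    """
    names = list(feature_maps)
    rows = len(feature_maps[names[0]])
    cols = len(feature_maps[names[0]][0])
    saliency = [[0.0] * cols for _ in range(rows)]
    for name in names:
        weight = task_weights.get(name, 1.0)
        for r in range(rows):
            for c in range(cols):
                saliency[r][c] += weight * feature_maps[name][r][c]
    peak = max(max(row) for row in saliency)
    if peak > 0:
        saliency = [[value / peak for value in row] for row in saliency]
    return saliency


def most_salient_location(saliency):
    """Return the (row, column) of the highest value in the saliency map."""
    cells = [(r, c) for r in range(len(saliency)) for c in range(len(saliency[0]))]
    return max(cells, key=lambda rc: saliency[rc[0]][rc[1]])


# Illustrative 2x2 scene: a color singleton at (0, 1), an orientation singleton at (1, 0).
feature_maps = {
    "color":       [[0.1, 0.9], [0.1, 0.1]],
    "orientation": [[0.1, 0.1], [0.8, 0.1]],
}
# Assumed task: the observer is looking for a particular orientation.
task_weights = {"color": 0.2, "orientation": 1.0}
saliency_map = combine_feature_maps(feature_maps, task_weights)
print(most_salient_location(saliency_map))
```

With equal weights the color singleton would win on bottom-up contrast alone. Down-weighting color shifts the peak of the map to the orientation singleton, which is the kind of task modulation the paragraph describes.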

3. Our approach

3.1. Received items as a “visual” perception of the network's activity

In our context the stimuli are not visual signals but communication signals. We are aware of the activity of our networks and communities through the information we receive (email, RSS feeds, IM). In our research we assume that this perception can be represented like a visual scene in which each signal is an incoming message rather than a visual stimulus. The signal's properties are not color, luminance or motion but the message's author, content, date, popularity, etc.

3.2. Feature maps (bottom-up)

As described above, features are attention-attracting properties that guide covert attention. In a communication context we propose 7 salient preattentive features that could attract the receiver's attention without any knowledge of the user (i.e. without regard to the user's current interests). For each feature, every signal (e.g. incoming message) is scored against that feature to produce a feature map (i.e. the distribution of the signals' saliency for this feature).

Author's influence: As suggested by [20], prioritizing messages by the contact's importance improves the email system, because fewer unsolicited attentional demands come from important senders. Without knowing the user's current context, the definition of an important sender can be based on his reputation or on his local affinity in the user's social network. Guéguen [7] has shown that information about the sender's reputation influences whether an email is read and answered. Hence the author's reputation feature map ranges from 0 (low reputation) to 1 (high reputation).

Popularity influence (collective attention): The popularity of an item clearly influences the selection of a message [9]. In large-scale communities (digg.com, YouTube, Del.icio.us, Slashdot) the popularity of an item (e.g. number of votes, views, comments) plays an important role because of information overload. In our context this social filtering process is seen in a different way. We call this behavior collective attention alignment: the user is attracted by resources to which many people have paid attention. By selecting popular items, the user aligns his attention with the attention of the whole community. A social attention aware system should therefore detect attention alignment disorders (e.g. a user unaware of messages that seem to have attracted the attention of his social network). The feature map characterizing the collective attention, or popularity, of a message is normalized from 0 (low collective attention) to 1 (high collective attention).

Temporal influence (lifecycle of a signal): Messages are generally ordered by time, because a new message attracts more attention than an old one, unless the message is a reminder (e.g. a call for a conference in 1 month). In that case the reminder attracts more and more of the user's attention until its deadline. The lifecycle feature map characterizing the temporal aspect of a message ranges from 0 (obsolete) to 1 (active).

Issue/topic scarcity: Without knowing the user's context, and therefore his current topics of interest, a message about an unusual topic by nature attracts more attention than a message about a common topic. The topic scarcity map ranges from 0 (common topic/issue) to 1 (rare topic/issue).

Medium influence: We introduce the influence of the medium using Media Richness Theory [2]. This theory suggests that media vary in certain characteristics that affect an individual's ability to communicate rich information. The richer the medium is in information, the less effort is required from the user to get the information. That is why a user will, a priori, be more attracted to multimedia content than to text. The medium map characterizing the attractiveness of the media type ranges from 0 (poor information medium, e.g. text) to 1 (rich information medium, e.g. video).

Audience focus: A user is more attracted by messages sent specifically to him than by messages sent to a large community or to a public or anonymous audience [5]. We can identify several types of audience: public, large, middle, small and personal. The audience map characterizing the attractiveness of the message's audience ranges from 0 (public/anonymous audience) to 1 (personal).

Explicit attention demand: A user is more attracted by an urgent message than by a normal message. The priority property of an email allows the sender to explicitly solicit the receiver's attention. Any form of explicit attention solicitation is a factor of attractiveness. The value ranges from 0 (no explicit solicitation) to 1 (urgent solicitation).
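As a rough illustration of how such feature maps might be built, the following Python sketch scores a small list of messages on three of the features above. Everything specific in it is an assumption made for the example and is not specified in the paper: the `Signal` record, the min-max normalization of vote counts for popularity, and the evenly spaced scores given to the five audience types.

```python
from dataclasses import dataclass

# Hypothetical, evenly spaced scores for the audience types named in the text.
AUDIENCE_SCORE = {"public": 0.0, "large": 0.25, "middle": 0.5, "small": 0.75, "personal": 1.0}


@dataclass
class Signal:
    ident: str
    votes: int      # raw popularity count (votes, views, comments...)
    audience: str   # one of the AUDIENCE_SCORE keys
    urgent: bool    # explicit attention demand set by the sender


def min_max(values):
    """Rescale raw values to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def build_feature_maps(signals):
    """Return one map per feature, each mapping a signal id to a value in [0, 1]."""
    ids = [s.ident for s in signals]
    popularity = dict(zip(ids, min_max([s.votes for s in signals])))
    audience = {s.ident: AUDIENCE_SCORE[s.audience] for s in signals}
    demand = {s.ident: 1.0 if s.urgent else 0.0 for s in signals}
    return {"popularity": popularity, "audience": audience, "explicit_demand": demand}


inbox = [
    Signal("m1", votes=0, audience="personal", urgent=True),
    Signal("m2", votes=50, audience="public", urgent=False),
    Signal("m3", votes=200, audience="large", urgent=False),
]
maps = build_feature_maps(inbox)
for feature, values in maps.items():
    print(feature, values)
```

The remaining features (reputation, lifecycle, topic scarcity, medium) would need their own scoring functions. The only property the later combination step relies on is that every map stays within the [0, 1] range given in the text.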

3.3. Intention profile (top-down)

For the user-driven, or top-down, activation we have created a simple intention profile. A user's intention profile, noted $P$, is the set of concepts $C$ (e.g. a task, an issue, a person) that interest the user in a given context (e.g. the user's current projects, current social environment). Each concept $c \in C$ is associated with a weight $\omega_c$ representing its level of interest. The set of weights is noted $W$:

$P = (C, W)$

The user also has a limited attention capacity. Because the user cannot pay attention to everything if he has, for instance, only 5 minutes, we force him or her to choose (behavior regulator) by adding an attention capacity noted $H$. The user's attention capacity is a function of the time interval $\Delta t$ (the longer the interval, the higher the capacity) and of the user's effort, as described by Kahneman [9], which we attribute to a context $k$ (e.g. more attention is required at work than at home). So $P$ also depends on $H$ through the following constraint on $W$:

$\sum_{c \in C} \omega_c \le H(\Delta t, k), \qquad \omega_c > \varepsilon$

with $\varepsilon$ the minimum possible attention level. Because of this limit the user has to choose his or her priorities (paying attention to only a subset of his social network or to a limited set of topics, according to his context). This intention profile² can be explicitly completed by the user or implicitly found by tracking the user's activity ([11] [14]).

² We can also use this profile to describe the user's attention and characterize it by studying its distribution: a dispersed user has a small attention level for many concepts, a concentrated user has a high level for very few concepts, and an overloaded user has a distribution whose area exceeds $H$.

Intention map: We assume that, for each signal $s$, a function can evaluate whether the concept $c$ is present in the item (e.g. whether the message's sender belongs to the current intention profile). We note $e_{c,s}$ the result of this evaluation (e.g. 0 or 1). We then build an intention map $I$ by computing, for each item $s$, its intention level $I_s$ such that

$I_s = \sum_{c \in C} \omega_c \, e_{c,s}$

3.4. Top-down and bottom-up influence

The saliency map, the output of the attention system, represents the final saliency levels of the items. The set of feature maps and the intention map are simply combined and normalized (different features contribute with different strengths to perceptual salience [12]) into a global saliency map $S$:

$S_s = \dfrac{\sum_{f \in F} \alpha_f \, f_s + \beta \, I_s}{\sum_{f \in F} \alpha_f + \beta}$

with $F$ the set of feature maps ($f_s$ being the value of feature map $f$ for signal $s$), the coefficients $\alpha_f$ representing the importance of each feature in the bottom-up activation, and $\beta$ the influence ratio between the top-down and bottom-up maps ($\beta = \sum_{f \in F} \alpha_f$ for an equal influence). This ratio can be adjusted according to the user's preference.

Fig. 2 – An overview of the attention aware system. [Diagram: incoming signals go through an attention process that builds one map per criterion and merges them into a saliency map (ranking signals per saliency). Map labels: Sender's Reputation/Trust, Signal's Lifecycle, Audience Focus, Collective Attention/Popularity, Media Choice, Explicit attention demand, Author Interest, Topic/Issue interest. Group labels: Salient Features (bottom-up activation); Intention Profile for a context k (top-down activation).]
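The pipeline of Sections 3.3 and 3.4 (intention profile, intention map, combination into a saliency map, ranking) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the capacity constraint is read as "the weights sum to at most H and each weight exceeds a minimum level", the intention level as the sum of the weights of the concepts found in a signal, and the saliency as the weighted, normalized sum of feature values and intention level. The function names, the feature values, the capacity of 2.0 and the minimum level of 0.05 are hypothetical; the profile weights reuse the example profile of the "at work" scenario in Section 4.

```python
def check_profile(profile, capacity, epsilon):
    """Capacity constraint: weights sum to at most H and each exceeds the minimum level."""
    return sum(profile.values()) <= capacity and all(w > epsilon for w in profile.values())


def intention_level(profile, concepts_in_signal):
    """Sum of the profile weights of the concepts present in the signal (presence is 0 or 1)."""
    return sum(w for concept, w in profile.items() if concept in concepts_in_signal)


def saliency(features, alphas, intention, beta=None):
    """Weighted, normalized combination of bottom-up feature values and top-down intention."""
    total_alpha = sum(alphas.values())
    if beta is None:
        beta = total_alpha  # equal bottom-up / top-down influence
    bottom_up = sum(alphas[f] * features[f] for f in alphas)
    return (bottom_up + beta * intention) / (total_alpha + beta)


def rank(signals, profile, alphas, top_n=None, threshold=0.0):
    """Return (id, saliency) pairs at or above the threshold, most salient first."""
    scored = []
    for ident, (features, concepts) in signals.items():
        level = saliency(features, alphas, intention_level(profile, concepts))
        if level >= threshold:
            scored.append((ident, level))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Profile weights from the "at work" scenario; all other numbers are made up.
profile = {"Colleague1": 1.0, "Colleague2": 0.5, "Colleague3": 0.2}
alphas = {"reputation": 1.0, "popularity": 1.0}
signals = {
    "a": ({"reputation": 0.2, "popularity": 0.1}, {"Colleague1"}),  # targeted sender
    "b": ({"reputation": 0.9, "popularity": 0.9}, set()),           # unexpected but salient
    "c": ({"reputation": 0.1, "popularity": 0.2}, {"Colleague3"}),  # weakly targeted
}

print(check_profile(profile, capacity=2.0, epsilon=0.05))
print([ident for ident, _ in rank(signals, profile, alphas, top_n=2)])
```

In this toy data the message from the most important colleague ranks first on top-down activation alone, while the untargeted but highly reputed and popular message still ranks second through bottom-up activation. Passing a `threshold` keeps only the messages salient enough to be notified directly.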

4. Scenarios

At work: managing the user's solicitations

A common scenario for a knowledge worker is a collaborative task. He wants to focus his attention mainly on messages sent by certain colleagues (and/or on certain topics related to his task). He configures an intention profile representing this degree of social focus in order to limit out-of-context solicitations (e.g. $P_k$ = {(Colleague1, 1), (Colleague2, 0.5), (Colleague3, 0.2)}). So as not to disturb the user, a perception threshold pS is set: only messages with a saliency higher than pS are notified directly to the user. The user is thus aware of targeted messages and of exceptionally important but unexpected solicitations. Furthermore, when he wants to check his messages, the attention aware system can rank them by saliency and recommend only the top-n messages to decrease the user's effort.

At home: managing RSS feeds

Having subscribed to several general and active communities or blogs, a user does not want to be overloaded by all sorts of articles. At home he prefers to be aware of new articles about politics and business, his favorite topics. He still wants to be aware of the other topics, but only of the popular articles. So he draws his preferences in an intention profile as follows:

Fig. 3 – Intention profile of the user when he is at home (chart categories: technolo…, Sport, Science, Politics, Business)

So, due to the influence of both popularity as a salient feature and the intention profile, the attention system lets through articles about business even when they are not very popular (top-down influence), but only the popular articles about technology (bottom-up influence).

Sensitive to the context

According to the context (at work, at home, or when switching from one task to another), the user can apply a specific intention profile to get a customized perception of his or her network's activity without being totally closed to unexpected or important events.

5. Conclusion

Our system attempts to solve a common problem in a rich environment, where the user is frequently interrupted by incoming messages and/or has to manage a lot of messages (email, RSS feeds, IM chat). Our original approach is to use existing visual attention models in this communication context, because the two problems are similar, by adapting signals from a visual perspective to a communication perspective. Thanks to a mix of bottom-up and top-down influence, this attention aware system:

• Is sensitive to the user's context (intention profile).
• Is able to filter unwanted messages according to the user's interests, but also to accept important but unexpected messages.
• Ranks messages not only on factors drawn from the message's content (topic scarcity), but also on social factors (popularity, author's reputation or trust, explicit attention demand, audience focus) and on temporal or medium-related ones.

This basic model of an attention aware system has to be evaluated and adjusted, notably the importance of each feature in the user's judgment of what is attractive. In parallel we are going to extend and consolidate it by studying more attention models, such as ART [1] and the SEEV model [6].

6. References

[1] Carpenter, G.A. & Grossberg, S. “Adaptive Resonance Theory”, in M.A. Arbib (Ed.), The Handbook of Brain Theory and Neural Networks, Second Edition, MIT Press (2003)

[2] Daft, R.L. & Lengel, R.H. “Information richness: a new approach to managerial behavior and organizational design”, in Cummings, L.L. & Staw, B.M. (Eds.), Research in Organizational Behavior 6, 191-233. Homewood, IL: JAI Press (1984)

[3] Davenport, T.H. & Beck, J.C. “The Attention Economy: Understanding the New Currency of Business”, Harvard Business School Press (2001)

[4] Dunbar, R.I.M. “Coevolution of neocortical size, group size and language in humans”, Behavioral and Brain Sciences 16(4): 681-735 (1993)

[5] Fisher D., Hogan B., Brush A.J., Smith M., Jacobs A. “Using social sorting to enhance email management”, Human-Computer Interaction Consortium (HCIC'06) (2006)

[6] Fleetwood M. “Refining Theoretical Models of Visual Sampling in Supervisory Control Tasks: Examining the Influence of Alarm Frequency, Effort, Value, and Salience”, PhD thesis, Rice University (2005)

[7] Guéguen N., Jacob C. “Solicitation by e-mail and solicitor's status: a field study of social influence on the web”, CyberPsychology & Behavior, 5(4), 377-383 (2002)

[8] Huberman and Wu “The Economics of Attention: Maximizing User Value in Information-Rich Environments” (2006)

[9] Kahneman, Daniel “Attention and Effort”, New Jersey: Prentice Hall (1973)

[10] Knobloch-Westerwick S. et al. “Impact of Popularity Indications on Readers' Selective Exposure to Online News”, Journal of Broadcasting & Electronic Media (2005)

[11] Michlmayr E. et al. “Add-A-Tag: Learning Adaptive User Profiles from Bookmark Collections”, International Conference on Weblogs and Social Media (2006)

[12] Nothdurft, H. “Salience from feature contrast: variations with texture density”, Vision Research, 40(23) (2000)

[13] Roda C., Thomas J. “Attention aware systems: Theories, applications, and research agenda”, Computers in Human Behavior (2006)

[14] Santos-Neto E. et al. “Tracking User Attention in Collaborative Tagging Communities”, submitted to CAMA'07 Workshop (2007)

[15] Senge, Peter M. “The Fifth Discipline – The Art & Practice of the Learning Organization”, New York, NY: Currency Doubleday (1990)

[16] Spira, Jonathan B. and Feintuch, Joshua B. “The Cost of Not Paying Attention: How Interruptions Impact Knowledge Worker Productivity”, Basex (2005)

[17] Stone L. in “HBR List (breakthrough ideas for 2007)”, Harvard Business Review (2007)

[18] Treisman A., Paterson R. “Emergent features, attention and object perception”, J. Exp. Psychol.: Human Perception and Performance (1984)

[19] Whittaker, S. and Sidner, C. “Email overload: exploring personal information management of email”, CHI '96 Conference Proceedings (1996)

[20] Whittaker et al. “Contact Management: Identifying Contacts to Support Long-Term Communication”, CSCW02 (2002)

[21] Wolfe J.M. “Guided search 2.0: a revised model of visual search”, Psychonomic Bulletin and Review (1994)