Communications, Collaboration, and Community

Anoop Gupta, Microsoft Research. Collaborators: Michael Cohen, Ross Cutler, Zicheng Liu, Yong Rui, Kentaro Toyama, Zhengyou Zhang, and others


Post on 11-Jan-2016


Page 1: Communications, Collaboration, and Community

1

Communications, Collaboration, and Community

Anoop Gupta

Microsoft Research

Collaborators:

Michael Cohen, Ross Cutler, Zicheng Liu, Yong Rui, Kentaro Toyama, Zhengyou Zhang, and others

Page 2: Communications, Collaboration, and Community

2

Deployment-Driven Multidisciplinary Research:

Challenges and Opportunities

Anoop Gupta

Microsoft Research

Collaborators:

Michael Cohen, Ross Cutler, Zicheng Liu, Yong Rui, Kentaro Toyama, Zhengyou Zhang, and others

Page 3: Communications, Collaboration, and Community

3

Collaboration and Multimedia Group

• 16 people
  – 9 Researchers, 5 R-SDEs, 1 Designer, 1 Usability
  – Diverse: Systems, Cog Psych, Sociologist, Vision, Graphics

• Focus:
  – Peripheral awareness and people-centric interfaces
  – Tele-presentation and tele-meeting technologies
  – Make audio-video information a first-class citizen
  – Enhanced online communities

=> Technologies, Applications, and Social Factors

Page 4: Communications, Collaboration, and Community

4

• Peripheral awareness and people-centric interfaces
  – How do we stay aware of relevant information without annoying notifications?
  – How do we stay aware of people, communicate with them, and bring them to the front of the user interface?
  – How can we leverage technology to provide a better idea of people/environment state?

Page 5: Communications, Collaboration, and Community

5

• Tele-presentations and tele-meetings
  – Leverage the combination of
    • cheap sensors (cameras, microphones, …),
    • cheap computing power, bandwidth, and storage,
    • advances in vision-graphics-SP technologies
  – Convincing remote presence and interactivity
  – Whiteboard, note-taking, local interaction tools
  – High quality recording and archiving
  – Rich indices and browsing support

Page 6: Communications, Collaboration, and Community

6

• Make audio-video information a first-class citizen
  – Low-cost and high-quality capture
  – Automatic index creation and highlights
  – Rich support for annotation and collaboration
  – Browsing tools and interfaces

Page 7: Communications, Collaboration, and Community

7

• Enhanced online communities
  – Tracking Interaction / Social History
  – Incentive Structures
    • Encourage high quality content creation
    • Encourage interaction
    • Discourage inappropriate behavior
  – Filtering and Synopsis
  – Community Portals

Page 8: Communications, Collaboration, and Community

8

Outline

• Our group

• Research approach

• Project samplings
  – Office activity modeling
  – Distributed meetings
  – Tele-presentations
  – Face modeling

• Concluding Remarks / Challenges

Page 9: Communications, Collaboration, and Community

9

Research Approach

• Deployment-driven research
  – End-users vs. other researchers as main customer
  – Robustness vs. Functionality
  – Multiple sensor technologies with graceful degradation
  – Value existing infrastructure
  – Simplicity of set-up and operation
  – Design with end-user in the loop
  – Field evaluations

• Multi-disciplinary tool-set

[Diagram labels: Build Prototype; Evaluation / Publication; Refine Prototype; Product Impact]

Page 10: Communications, Collaboration, and Community

10

1. Office Activity Modeling (joint with ASI group at MSR)

• Uses of Office Awareness
  – Intelligent messaging
    • Send messages on appropriate channel: instant message, office phone, e-mail, mobile, etc.
  – Intelligent instant messaging
    • Stopped typing = not there
  – Peripheral awareness for “buddies”
    • Is now a good time to drop by Jack’s office?
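As a rough illustration of the "stopped typing = not there" heuristic and of choosing a message channel from coarse office state, here is a minimal sketch. The idle threshold, the `OfficeState` fields, and the routing rules are all assumptions made for this example; the slides do not specify them.

```python
from dataclasses import dataclass

# Hypothetical threshold: idle time after which we assume the user has left.
IDLE_LIMIT_S = 120.0


@dataclass
class OfficeState:
    seconds_since_input: float  # time since the last keyboard/mouse event
    on_phone: bool              # e.g. reported by a TAPI-enabled phone
    in_meeting: bool            # e.g. taken from the calendar


def is_present(state: OfficeState) -> bool:
    """'Stopped typing = not there', except that a phone call also implies presence."""
    return state.on_phone or state.seconds_since_input < IDLE_LIMIT_S


def pick_channel(state: OfficeState) -> str:
    """Choose a delivery channel from coarse office state (illustrative rules only)."""
    if state.in_meeting:
        return "e-mail"            # do not interrupt a meeting
    if not is_present(state):
        return "mobile"            # nobody at the desk
    if state.on_phone:
        return "instant message"   # silent channel while the phone is busy
    return "office phone"
```

A real system would replace the fixed rules with learned or user-configured preferences; the point here is only that reliable, cheap signals already support useful routing decisions.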

Page 11: Communications, Collaboration, and Community

11

So how does the deployment-driven approach impact our decisions?

Page 12: Communications, Collaboration, and Community

12

Environment and Outputs

• Environment
  – Office with door (w/ window); Cubicle; Open plan; …
• Number of people
  – (0 / 1+) | (0 / 1 / 1+) | (0/1/2/3/…)
• Gross activity
  – At desk; On PC; On phone; In meeting; …
• Fine activity
  – Who are the people present
  – Reading; Answering mail; …
• Activity Trends
  – Usually comes in at 7am, leaves at 5pm
  – Never comes in on weekends
• …

Page 13: Communications, Collaboration, and Community

13

Sensors

• Keyboard / Mouse
• Calendar (appointment schedule)
• Desktop microphone
• TAPI-enabled phone (VoIP)
• Desktop camera
• Other:
  – Motion detector; high-quality microphone / headset; bird’s-eye camera; laser/IR gates; thermal cameras; etc.

Page 14: Communications, Collaboration, and Community

14

Making the Inferences… (in approximately increasing order of expected research interest)

• Use reliable sensors as much as possible
• Use reliable sensors to label data for other sensors
• For vision, stick to reliably extractable, robust cues (e.g., presence of motion, optic flow)
• “Quasi-supervised” learning, using data labeled as above
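One way to read "use reliable sensors to label data for other sensors" is sketched below: keyboard or phone activity labels a camera frame as "present", an assumed away signal labels it "absent", and those labels fit a threshold on a simple vision cue (motion energy). The `known_away` signal, the feature, and the midpoint-threshold classifier are assumptions for illustration, not the method used in the project.

```python
from statistics import mean
from typing import List, Optional, Tuple


def label_from_reliable(keyboard_active: bool, phone_in_use: bool,
                        known_away: bool) -> Optional[int]:
    """Return 1 (present), 0 (absent), or None when the reliable sensors are silent."""
    if keyboard_active or phone_in_use:
        return 1
    if known_away:  # hypothetical signal, e.g. an off-site calendar appointment
        return 0
    return None


def fit_motion_threshold(samples: List[Tuple[float, Optional[int]]]) -> float:
    """Fit a threshold on motion energy from the automatically labeled samples.

    Uses the midpoint between the two class means and assumes that
    'present' frames show more motion on average than 'absent' frames.
    """
    present = [m for m, y in samples if y == 1]
    absent = [m for m, y in samples if y == 0]
    if not present or not absent:
        raise ValueError("need labeled examples of both classes")
    return (mean(present) + mean(absent)) / 2.0


def predict_present(motion_energy: float, threshold: float) -> bool:
    """Classify a frame from the vision cue alone, e.g. when nobody is typing."""
    return motion_energy > threshold
```

Unlabeled frames (label `None`) are simply ignored during fitting; they are exactly the cases where the trained vision cue is needed at run time.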

Page 15: Communications, Collaboration, and Community

15

Results

• Eve/Priorities project at MSR (ASI)
  – Integrates capture of features (keyboard/mouse use, app use, vision, audio events, …)
  – Language for combining low-level features
  – Bayesian fusion
  – Vision component can determine whether person is facing front or not, but still not as robust as desired
• Current work in quasi-supervised learning of low-level features…

Hope to deploy base versions in summer
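To make the idea of Bayesian fusion concrete, the sketch below combines several sensor observations into a posterior probability that a person is present. It uses a naive Bayes simplification (observations treated as conditionally independent), which is an assumption of this example and not a description of how the Eve/Priorities system is built; the likelihood values in the usage are invented.

```python
from typing import Iterable, Tuple


def fuse(prior_present: float,
         likelihoods: Iterable[Tuple[float, float]]) -> float:
    """Posterior P(present | observations) under a naive Bayes assumption.

    Each entry is (P(obs | present), P(obs | absent)) for one sensor reading.
    """
    p_present = prior_present
    p_absent = 1.0 - prior_present
    for given_present, given_absent in likelihoods:
        p_present *= given_present
        p_absent *= given_absent
    total = p_present + p_absent
    if total == 0.0:
        raise ValueError("observations are impossible under both hypotheses")
    return p_present / total


# Usage with made-up likelihoods: keyboard activity and detected motion
posterior = fuse(0.5, [(0.8, 0.2), (0.9, 0.3)])
```

Each additional sensor multiplies in its own likelihood ratio, so a weak cue such as vision can sharpen the estimate without being trusted on its own.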

Page 16: Communications, Collaboration, and Community

16

Results (preliminary)

Concatenation of 3 sections of low-level vision data only, sampled from 8-hour log

Unsupervised clustering segments sections cleanly.
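The slide does not say which clustering algorithm was used. As a stand-in, the sketch below runs a plain one-dimensional k-means over a per-frame feature and then reports contiguous runs of the same cluster as segments. The feature values and the choice of k-means are assumptions for illustration.

```python
from typing import List, Tuple


def kmeans_1d(values: List[float], k: int,
              iterations: int = 20) -> Tuple[List[int], List[float]]:
    """Cluster scalar features with k-means; returns (labels, centers)."""
    ordered = sorted(values)
    # Spread the initial centers across the sorted data.
    centers = [ordered[(i * (len(ordered) - 1)) // max(k - 1, 1)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: nearest center for every value.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centers[c] = sum(members) / len(members)
    return labels, centers


def segments(labels: List[int]) -> List[Tuple[int, int, int]]:
    """Collapse per-frame labels into (label, start, end) runs; end is exclusive."""
    runs = []
    start = 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            runs.append((labels[start], start, i))
            start = i
    return runs
```

With three concatenated sections whose feature levels differ, the runs returned by `segments` line up with the section boundaries.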

Page 17: Communications, Collaboration, and Community

17

Results (preliminary)

Correlates with high keyboard/mouse activity, no speech

Ground truth: 1 person at monitor

Page 18: Communications, Collaboration, and Community

18

Benefits and Challenges

• Benefits
  – Prioritizing problems and context
  – How far we need to push the solution
  – Earlier benefits for end-users; enables social science research
• Drawbacks
  – Need substantial engineering (plus algorithmic) skills
  – Need multidisciplinary team

Page 19: Communications, Collaboration, and Community

19

2. Distributed Small Group Meetings

• Scenario:
  – Imagine 8-10 people
  – In conference room, from desktops, mobile
  – Rich back and forth interaction
  – Archival and browsing support

Page 20: Communications, Collaboration, and Community

20

Contextualized Research Challenges

• Novel camera, microphone, display systems
• Speaker tracking; multi-person tracking
• Gaze and pose correction
• Activity tracking and gesture recognition
• Graphical avatars and virtual environments
• Real and virtual camera management
• Automated indexing and browsing support
• Integration of handheld devices
• User interface / User experience

Page 21: Communications, Collaboration, and Community

First Prototype

[Figure labels: Meeting environment; Omni-directional camera; An example omni image; 360-degree panorama view]

Page 22: Communications, Collaboration, and Community

22

Second Prototype

• Cost $300 vs. $10K

• Much better quality ~3000 x 500 pixels

• All processing done on the PC

Page 23: Communications, Collaboration, and Community

23

Remote Interfaces

• All-up
• Computer controlled
• User controlled
• User + Computer + Overview

Page 24: Communications, Collaboration, and Community

24

Short/Medium Term Plan

• Cameras, Calibration, Stitching
  – Camera design to minimize parallax
  – Automatic camera calibration
  – Real-time on today’s processors
• Speaker detection and multiple-person detection
  – Microphone array sound source localization
  – Computer vision tracking of multiple people
  – Fusing A/V for better speaker detection
• Simple remote participation interface
• Automatic camera management
• Video compression, storage, and transmission
• Automatic index creation and meeting browsing

Expect to deploy in a few conference rooms during summer
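Microphone-array sound source localization is commonly built on the time difference of arrival between microphones. The sketch below shows the two-microphone case: estimate the delay by brute-force cross-correlation, then convert it to a bearing under a far-field assumption. The function names, sample rate, and spacing are illustrative assumptions; the slides do not describe the actual algorithm.

```python
import math
from typing import Sequence

SPEED_OF_SOUND = 343.0  # m/s, approximate value at room temperature


def best_lag(a: Sequence[float], b: Sequence[float], max_lag: int) -> int:
    """Lag in samples that maximizes the cross-correlation of a with b.

    A positive result means b is a delayed copy of a, i.e. the sound
    reached microphone a first.
    """
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, x in enumerate(a):
            j = i + lag
            if 0 <= j < len(b):
                score += x * b[j]
        if score > best_score:
            best, best_score = lag, score
    return best


def bearing_degrees(lag: int, sample_rate: float, mic_spacing: float) -> float:
    """Angle from broadside for a far-field source: sin(theta) = c * delay / spacing."""
    delay = lag / sample_rate
    s = SPEED_OF_SOUND * delay / mic_spacing
    s = max(-1.0, min(1.0, s))  # clamp against noise-induced overshoot
    return math.degrees(math.asin(s))
```

Room reverberation and background noise make the raw correlation peak unreliable in practice, which is one reason the plan above also calls for fusing audio with vision.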

Page 25: Communications, Collaboration, and Community

25

3. Tele-Presentations

• Enable people to
  – Easily broadcast/capture lectures (speaker and audience)
  – Esthetically pleasing
  – Participate from remote locations
• Solution components
  – Tracking cameras, microphone arrays, …
  – Video production rules from professionals
  – Mapping of rules to cameras and software video director
  – Remote presence and interactivity system (TELEP)
• First prototype being used in the small lecture room at MSR

Page 26: Communications, Collaboration, and Community
Page 27: Communications, Collaboration, and Community

27

Key Modules

• Speaker tracking and audience tracking
  – Computer-vision-based tracking
  – Microphone-array-based tracking

Page 28: Communications, Collaboration, and Community

28

Key modules (cont)

• Virtual video director (FSM)
  – Maintain min shot duration
  – Dynamic max shot duration
    • Function of shot quality
    • Triggers TIME_EXPIRE event
  – Monitoring status change
    • Triggers STATUS event
  – Encode editing knowledge into transition probabilities
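The director described above can be sketched as a small finite state machine. The shot names, durations, quality-to-duration formula, and transition probabilities below are invented for illustration; only the structure (minimum shot duration, quality-dependent maximum, TIME_EXPIRE and STATUS events, probabilistic transitions) comes from the slide.

```python
import random

MIN_SHOT_S = 4.0  # hypothetical minimum shot duration in seconds

# Hypothetical editing knowledge: probability of cutting from one shot to another.
TRANSITIONS = {
    "speaker": {"audience": 0.3, "overview": 0.7},
    "audience": {"speaker": 0.8, "overview": 0.2},
    "overview": {"speaker": 0.9, "audience": 0.1},
}


def max_shot_duration(quality: float) -> float:
    """Better shots may be held longer; quality is assumed to lie in [0, 1]."""
    return 5.0 + 25.0 * quality


class VideoDirector:
    def __init__(self, start="speaker", rng=None):
        self.shot = start
        self.elapsed = 0.0
        self.rng = rng or random.Random()

    def tick(self, dt, quality, status_changed=False):
        """Advance time by dt seconds; return the event that caused a cut, or None."""
        self.elapsed += dt
        if self.elapsed < MIN_SHOT_S:
            return None  # never cut before the minimum shot duration
        event = None
        if status_changed:
            event = "STATUS"        # e.g. the tracker lost the speaker
        elif self.elapsed >= max_shot_duration(quality):
            event = "TIME_EXPIRE"   # the shot has been held long enough
        if event is not None:
            self._cut()
        return event

    def _cut(self):
        # Sample the next shot according to the transition probabilities.
        options = TRANSITIONS[self.shot]
        self.shot = self.rng.choices(list(options), weights=list(options.values()))[0]
        self.elapsed = 0.0
```

In this sketch a status change that arrives before the minimum duration is dropped; a fuller version would remember it and act once the minimum has passed.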

Page 29: Communications, Collaboration, and Community

29

Initial Deployment Results

• Tested a human operator and our system concurrently
  – Field study
  – Lab study
• Results:
  – Human operator better, but difference is not statistically significant
  – People could not distinguish which operator was human and which was computer

Page 30: Communications, Collaboration, and Community

30

Technical Challenges

• Design and configuration of camera/microphone systems
• More robust lecturer tracking
  – Smooth tracking in close-up shots
  – Multiple lecturers
  – Lecturers move into the audience area
• More robust audience tracking
  – Background noise and room reverberation
• More sophisticated rules and knowledge
  – Human operators have much better ability to deal with exceptions
  – A flexible/learning automated camera management system

Page 31: Communications, Collaboration, and Community

31

4. Face Modeling

• Technical goals:
  – Build a realistic-looking face model from video images
  – The face model can be animated right away
  – Painless in data acquisition & efficient in model building
  – Commodity equipment (computer + camera)
  – No special requirement on the acquisition condition (background, lighting, …)
• Uses:
  – Enhanced chat / gaming environments
  – Conferencing over low-bandwidth links

Page 32: Communications, Collaboration, and Community

32

System Overview

Page 33: Communications, Collaboration, and Community

33

Examples

Page 34: Communications, Collaboration, and Community

34

Example Application: Virtual Poker

• Designed as a social interface

• Each player controls an avatar

• Some behaviors automatically generated

Page 35: Communications, Collaboration, and Community

35

Virtual Poker

• Players automatically turn to follow action/voice

“I guess it’s my turn”

Page 36: Communications, Collaboration, and Community

36

Research Challenges

• Teeth, tongue, eyes and hair

• Personalized facial expressions

• Real-time animation driven from video

• Yet more robust and easy to use

Page 37: Communications, Collaboration, and Community

37

Outline

• Our group

• Research approach

• Project samplings
  – Office activity modeling
  – Distributed meetings
  – Tele-presentations
  – Face modeling

• Concluding Remarks / Challenges

Page 38: Communications, Collaboration, and Community

38

Concluding Remarks

• Focus on deployment-driven research
  – Tremendous leverage in:
    • Prioritizing problems we explore
    • Context we assume while solving
    • How far we push the solution
    • Earlier benefits for end-users
    • Enabling social science research
    • Keeping management support

[Figure axes: Effort Spent vs. % Complete]

Page 39: Communications, Collaboration, and Community

39

  – Challenges:
    • Need more resources (or pursue fewer things)
    • Need substantial engineering (plus algorithmic) skills
    • Premier conferences do not appreciate engineering aspects
    • Not all important research yields to above constraints
  – Some solution options:
    • Community shared infrastructure (environments) into which things can be plugged (e.g., SUIF for compilers)
    • Premier conferences / senior researchers’ attitudes
    • Funding agency attitudes

Page 40: Communications, Collaboration, and Community

40

• Focus on multidisciplinary research
  – Tremendous leverage in providing:
    • More robust solutions (or solutions at all)
    • More cost effective solutions
    • Getting deployment of research ideas out to end-user and the knowledge from resulting feedback
  – Challenges:
    • Vision, Video, Graphics, Hardware, Speech, SP, …
    • Need diversity within the group plus close ties externally
    • Need supportive management and funding structure
    • Academic departments, lab research groups, conferences, tenure organized around traditional disciplinary boundaries
    • Discourages pushing one discipline as hard as possible when another provides an easier answer

Page 41: Communications, Collaboration, and Community

41

– Some solution components:

• Strong leaders (e.g., Hennessy – Brought Arch, Compilers, Prog. Lang, OS folks together)

• Premier conferences / senior researchers’ attitudes

• Funding agency attitudes

Page 42: Communications, Collaboration, and Community

42

Questions / Discussion

• Graphics: What is the killer application in the workplace?

• Vision: How can we identify the state of the art for a non-expert?

• Are you satisfied with the degree of connection with the end-user/reality in your sub-field?

• What do you think of the role of multi-disciplinary research? Who should do it?

• Do we have balance?

Page 43: Communications, Collaboration, and Community

43

• Graphics: What is the killer application in the workplace?
  – We have tried:
    • 3D Shell
    • 3D Avatars in tele-meetings
    • 3D in visualizations, …
    • …
  – Killer application still eludes us

Page 44: Communications, Collaboration, and Community

44

• Vision: Identifying the state of the art
  – E.g., Speech
    • Speaker dependent or independent
    • Size of vocabulary
    • Language model / Grammar / Domain
    • Microphone quality
  – What’s the equivalent for vision?
    • How can we characterize / partition / … the space in a way so that the non-expert knows when/where vision technology can be relied upon?

Page 45: Communications, Collaboration, and Community

45

Questions / Discussion