Managing Innovation: How Microsoft Research Works
Jim Gray
Distinguished Engineer
Microsoft Corporation
Actionable Ideas
- Co-locate if possible
- Adopt a “university model”
- Recruit from the top
- Recruit for passion and a desire to have impact
- Install a Research Program Management organization to orchestrate tech-transfer
- Institute an annual TechFest
Innovation: Build versus Buy versus Invest
- Build: Have in-house research
  - Bell Labs, IBM, GM, Pfizer, Merck, Microsoft…
- Buy: Acquire startups or whole companies
  - IBM, Cisco, Intel, Microsoft, Pfizer, Merck…
- Invest: All boats rise
  - Government research funding
  - IBM, Cisco, Intel, Microsoft, Pfizer, Merck…
- All 3 approaches valid
  - Complement one another
Companies Are Different
- Selected IT company FY02 R&D budgets:
  - Notice that R&D is correlated with margin
- IBM and HP have large service revenues
  - So, their “real” R&D investment rate is higher
- Dell, Accenture, EDS have modest R&D – innovate in other ways
Company      R&D    S G&A   Product   Gross   Other
Intel        15%    16%     19%       50%     -
Microsoft    15%    27%     18%       40%     -
IBM           6%    23%     31%       38%     2%
Oracle       12%    26%     26%       36%     -
HP            6%    16%     44%       27%     7%
Cisco        16%    25%     33%       26%     -
Dell          1%     8%     73%       18%     -
Accenture     0%    21%     47%       32%     -
EDS           0%     9%     69%        8%     14%
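The claim that R&D is correlated with margin can be checked against the nine companies listed. The sketch below computes a plain Pearson correlation between the R&D share and the "Gross" share. Reading "Gross" as the margin figure is an assumption, and nine data points are far too few for anything beyond a rough sanity check.

```python
from math import sqrt

# (R&D %, "Gross" %) per company, copied from the FY02 figures above.
# Treating "Gross" as the margin is an assumption of this sketch.
SHARES = {
    "Intel": (15, 50), "Microsoft": (15, 40), "IBM": (6, 38),
    "Oracle": (12, 36), "HP": (6, 27), "Cisco": (16, 26),
    "Dell": (1, 18), "Accenture": (0, 32), "EDS": (0, 8),
}

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

rd, gross = zip(*SHARES.values())
r = pearson(rd, gross)
print(f"correlation between R&D share and Gross share: {r:.2f}")
```

The coefficient comes out moderately positive, which is consistent with the slide's remark, though Cisco (high R&D, mid-range "Gross") and Accenture (no R&D, fairly high "Gross") pull against the trend.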
Most R&D Is D
How to Do Basic Research in Industry?
- Critical questions (from Rick Rashid)
  - How can I create and maintain a world class research organization in an industrial setting?
  - How do I keep the lines of communication open between product teams and researchers?
  - How do I get new technology into products quickly?
Approach: Adapt the Academic Model
- Organizational goal: Advance state of the art
- University organizational model
  - Flat structure, critical mass groups
- Open research environment
  - Aggressive publication in peer-reviewed literature
  - Frequent visitors, daily seminars
- Strong ties to University Research
  - Nearly 15% of basic research budget directly invested in Universities
    - Lab grants, research grants, fellowships, etc.
  - Hundreds of interns and visitors
Microsoft Research Today
- Founded in 1991
- Staff of over 700 in over 55 areas
- Internationally recognized research teams
- Research lab locations:
  - Redmond, Washington: 75%
  - San Francisco, California: 1%
  - Cambridge, United Kingdom: 10%
  - Beijing, People’s Republic of China: 10%
  - Mountain View, California: 5%
Microsoft Research: Expanding the State of the Art
- Thousands of peer-reviewed publications
  - 10%…30% of papers at our focus conferences
  - graphics, programming, systems, data management…
- Community leadership
  - Professional societies
  - Journals
  - Conferences
- Mentoring Interns
- Hosting academic summers and sabbaticals
- Special workshops
How To Build A Group
- Identify a promising area
- Hire the leader (internal or external)
- Support her/him
  - Build team around senior researcher
- Look for people who
  - Want to have impact
  - Have passion for their ideas
- Same template works for whole labs
  - Cambridge, Beijing, Silicon Valley
Keeping Open The Lines Of Communication To Product Teams
- Co-location helps: 75% “on campus”
- “How can I help?” attitude demonstrates willingness to “get dirty” to help product succeed
- Product group spin-offs build strong ties
- Over time a number of product groups evolved from research (e.g., Windows Media)
- Researchers involved in all corporate product reviews
MSR Relationship To MS Products
- Virtually every research group actively engaged with product groups
  - E.g., Windows, Office, streaming media, SQL, Exchange, IIS, Commerce Server, Visual Studio, consumer products, MSN, etc.
- Tech transfer:
  - Ideas
  - Code
  - People
  - Contacts
  - Recruiting
Focused Technology Transfer: Quickly getting technology into products
- Program management team with sole focus on tech transfer
- Researchers on product “advisory” boards
- “Mind-swaps” – joint product/research off-sites
- Joint product/research teams, e.g.,
  - ClearType (Windows XP)
  - Datamining (SQL 2000)
  - Natural Language & Speech (Office)
  - TabletPC
  - Smart Personal Objects (SPOT)
- Encourage and recognize contributions
MSR TechFest
- Internal open house for Microsoft Research
- Annual event since 2001
- ~7000 attendees
- 170 demos, 26 lectures
- “Research in progress”
  - Breadboard demos
  - This is research idea/prototype
- Great networking event:
  - Breaks down barriers
  - Serendipitous connections
Examples Of Technology Transfer
- Critical support technologies
  - Memory Optimization Technology enabled sim-ship of Win95/Office95
  - Automated bug detection in Windows 2000
- Key technologies that drive products
  - E.g., MS audio 4.0, ClearType, intelligent search, collaborative filtering, Intellimirror, etc.
- Incubated major products
  - Windows streaming media
  - Windows CE, TabletPC, eBook
  - Ecommerce, Datamining
  - Natural language and speech technologies, etc.
MSR Mission Statement
- Expand the state of the art in each of the areas in which we do research
- Rapidly transfer innovative technologies into Microsoft products
- Ensure that Microsoft products have a future
Personal Examples of R&D
- Scalable Servers
  - TerraServer
  - SkyServer
- Databases
  - Data Cube, Snapshot Isolation
  - SQL Stress testing
- Reliable Multicast
- Personal Media Management
TerraServer & TerraService
- TerraService: http://terraservice.net
  - A .NET web service
  - OpenGIS
  - Place Search
  - TerraServer Map Server
  - Landmarks & annotations layered on imagery
  - Used by thousands of real apps today
  - Shows
    - Web Services
    - Performance
- TerraServer: http://terraserver-usa.com
  - USGS Photo and Topo maps
  - 16 TB of data
  - Online since 1997
  - 7 billion pages served
  - 120 TB served
  - Shows
    - Scalability
    - Availability
    - Manageability
    - SQL + Windows
TerraServer Today and TerraServer Tomorrow
- Mirrored System versus SAN
  - 3 mirrored DB servers + spare versus 4 DB servers
- Commodity versus Enterprise
  - White box Dual Xeon versus 8-way branded
  - DAS 250GB SATA versus FC-SAN 73GB SCSI
  - No Tape versus LTO Tape Robot
  - $0.1M versus $1.8M
- Geoplex: 2 sites
  - You can afford 2!
- KVM / IP
World Wide Telescope
http://www.voforum.org/
- Premise: Most Astro data is online
- So, the Internet is the world’s best telescope:
  - Has data on every part of the sky
  - In every measured spectral band
  - As deep as the best instruments
  - It is up when you are up; the “seeing” is always great (no working at night, no clouds, no moons, no…)
- It’s a smart telescope: links objects and data to literature on them
Next-Generation Data Analysis
- Looking for
  - Needles in haystacks – the Higgs particle
  - Haystacks: Dark matter, Dark energy
- Needles are easier than haystacks
- Global statistics have poor scaling
  - Correlation functions are N^2, likelihood techniques N^3
- As data and computers grow at same rate, we can only keep up with N log N
- A way out?
  - data is fuzzy, answers are approximate
  - Requires combination of statistics and computer science
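The gap between N^2 and N log N can be seen in the simplest pair statistic: counting how many pairs of points lie within a distance r of each other. The sketch below is a one-dimensional toy, not an astronomy correlation function; the function names are illustrative only. The naive version inspects every pair, while the second version sorts once and uses binary search.

```python
from bisect import bisect_right

def count_pairs_naive(xs, r):
    """Count unordered pairs with |xi - xj| <= r by checking every pair: O(N^2)."""
    n = len(xs)
    return sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if abs(xs[i] - xs[j]) <= r
    )

def count_pairs_sorted(xs, r):
    """Same count via one sort plus a binary search per point: O(N log N)."""
    s = sorted(xs)
    total = 0
    for i, x in enumerate(s):
        # Index one past the last point that is at most r to the right of x.
        hi = bisect_right(s, x + r)
        total += hi - i - 1
    return total

points = [1, 2, 4, 9, 10]
print(count_pairs_naive(points, 2), count_pairs_sorted(points, 2))
```

In two or three dimensions the same idea usually needs a spatial index or tree instead of a sort, and approximate answers become attractive, which is where the slide's "combination of statistics and computer science" comes in.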
Data Federations Of Web Services
- Massive datasets live near their owners:
  - Near the instrument’s software pipeline
  - Near the applications
  - Near data knowledge and curation
  - Super Computer centers become Super Data Centers
- Each Archive publishes a web service
  - Schema: documents the data
  - Methods on objects (queries)
- Scientists get “personalized” extracts
- Uniform access to multiple Archives
  - A common global schema
- Challenge: What is the object model for your science?
[Figure: "Your program" and two "Web Service" boxes, labeled "Federation"]
Web Services – The Key?
- Web SERVER:
  - Given a URL + parameters
  - Returns a web page (often dynamic)
- Web SERVICE:
  - Given an XML document (SOAP msg)
  - Returns an XML document
  - Tools make this look like an RPC.
    - F(x,y,z) returns (u, v, w)
  - Distributed objects for the web.
  - + naming, discovery, security,..
- Internet-scale distributed computing
[Figure: "Your program" receives a web page over HTTP, or an object in XML over SOAP; in both cases the data ends up in your address space]
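The "XML document in, XML document out, made to look like an RPC" idea can be sketched in a few lines. Everything here is a simplification under stated assumptions: the element names are made up, there is no real SOAP envelope, the HTTP hop is replaced by a direct function call, and the arithmetic inside the service is a placeholder for whatever F actually computes.

```python
import xml.etree.ElementTree as ET

def service(request_xml: str) -> str:
    """Server side: takes an XML document, returns an XML document."""
    req = ET.fromstring(request_xml)
    x, y, z = (float(req.findtext(tag)) for tag in ("x", "y", "z"))
    resp = ET.Element("FResponse")
    # Placeholder computation; the slide does not say what F does.
    for tag, value in (("u", x + y + z), ("v", x * y * z), ("w", max(x, y, z))):
        ET.SubElement(resp, tag).text = repr(value)
    return ET.tostring(resp, encoding="unicode")

def F(x, y, z):
    """Client stub: hides the XML so the call looks like an ordinary function."""
    req = ET.Element("F")
    for tag, value in (("x", x), ("y", y), ("z", z)):
        ET.SubElement(req, tag).text = repr(float(value))
    # In a real deployment this line would be an HTTP POST to the service URL.
    response_xml = service(ET.tostring(req, encoding="unicode"))
    resp = ET.fromstring(response_xml)
    return tuple(float(resp.findtext(tag)) for tag in ("u", "v", "w"))

u, v, w = F(1, 2, 3)
print(u, v, w)
```

The caller only ever sees F(x, y, z) returning (u, v, w); the stub is the part that tooling would normally generate from the service description.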
Federating Astronomy Archives
- Great Test for data mining algorithms
- It is real and well documented data
  - High-dimensional data (with confidence intervals)
  - Spatial data
  - Temporal data
- Many different instruments from many different places and many different times
- Federation is a goal
- There is a lot of it (petabytes)
- Can share cross company
  - University researchers
[Figure: sky images from different surveys and wavelengths: IRAS 100 µm, ROSAT ~keV, DSS Optical, 2MASS 2 µm, IRAS 25 µm, NVSS 20 cm, WENSS 92 cm, GB 6 cm]
SkyServer – One such archive
SkyServer.SDSS.org
- Sloan Digital Sky Survey: Pixels + Data Mining
- 400 attributes per “object”
- Spectrograms for 1%
- Demo:
  - pixel space
  - record space
  - set space
  - teaching
SkyQuery: Federating Archives
http://skyquery.net/
- Distributed Query tool using a set of web services
- Federates ten astronomy archives from Pasadena, Chicago, Baltimore, Cambridge (England)
- Implemented in C# and .NET
- Allows queries like:
    SELECT o.objId, o.r, o.type, t.objId
    FROM SDSS:PhotoPrimary o,
         TWOMASS:PhotoPrimary t
    WHERE XMATCH(o,t) < 3.5
      AND AREA(181.3, -0.76, 6.5)
      AND o.type = 3
      AND (o.i - t.m_j) > 2
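The slide does not define XMATCH or AREA, so the following is only a rough sketch of what a positional cross-match inside a sky region involves. It assumes AREA(ra, dec, radius) is a cone with its center in degrees and its radius in arcminutes, and it substitutes a plain angular-separation cut in arcseconds for XMATCH; SkyQuery's real matching criterion may well be statistical rather than a fixed distance. The catalog rows and the function names are hypothetical.

```python
import math

def angular_sep_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation of two sky positions (degrees in, arcseconds out)."""
    r1, d1, r2, d2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine formula, which stays accurate for small separations.
    a = (math.sin((d2 - d1) / 2) ** 2
         + math.cos(d1) * math.cos(d2) * math.sin((r2 - r1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a)))) * 3600

def cross_match(cat_a, cat_b, tol_arcsec, center, radius_arcmin):
    """Pair objects from two catalogs that lie in the cone and within tol of each other."""
    ra0, dec0 = center

    def in_cone(obj):
        return angular_sep_arcsec(obj["ra"], obj["dec"], ra0, dec0) <= radius_arcmin * 60

    b_in_cone = [t for t in cat_b if in_cone(t)]
    # Naive nested loop; a real system would use a spatial index.
    return [
        (o["objId"], t["objId"])
        for o in cat_a if in_cone(o)
        for t in b_in_cone
        if angular_sep_arcsec(o["ra"], o["dec"], t["ra"], t["dec"]) <= tol_arcsec
    ]

# Hypothetical rows standing in for the two archives.
sdss = [
    {"objId": 1, "ra": 181.30, "dec": -0.76},
    {"objId": 2, "ra": 181.35, "dec": -0.80},
    {"objId": 3, "ra": 185.00, "dec": 2.00},
]
twomass = [
    {"objId": "a", "ra": 181.3001, "dec": -0.7601},
    {"objId": "b", "ra": 181.36, "dec": -0.80},
]
matches = cross_match(sdss, twomass, tol_arcsec=3.5, center=(181.3, -0.76), radius_arcmin=6.5)
print(matches)
```

The remaining predicates in the query (object type and color cut) would be ordinary filters applied to the matched pairs.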
SkyQuery Structure
- Each SkyNode publishes
  - Schema Web Service
  - Database Web Service
- Portal
  - Plans Query (2 phase)
  - Integrates answers
  - Is itself a web service
[Figure: the SkyQuery Portal with boxes labeled 2MASS, SDSS, INT, FIRST, and ImageCutout]
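A portal that federates several nodes can be sketched as follows. The slide says only that each node publishes a schema service and a database service and that the portal plans in two phases; the specific split used here (first ask every node for a count, then execute starting with the smallest result) is an assumption, as are the class names and the use of a shared key to join rows. Real archives would be joined by sky position instead.

```python
class SkyNode:
    """One archive; its methods stand in for the schema and database web services."""

    def __init__(self, name, rows):
        self.name, self.rows = name, rows

    def schema(self):
        return sorted({key for row in self.rows for key in row})

    def count(self, predicate):
        return sum(1 for row in self.rows if predicate(row))

    def query(self, predicate):
        return [row for row in self.rows if predicate(row)]


class Portal:
    def __init__(self, nodes):
        self.nodes = nodes

    def plan(self, predicate):
        # Phase 1 (assumed): ask each node how many rows qualify, smallest first.
        return sorted(self.nodes, key=lambda node: node.count(predicate))

    def run(self, predicate, key):
        # Phase 2 (assumed): execute in plan order, keeping keys present at every node.
        result = None
        for node in self.plan(predicate):
            rows = {row[key]: row for row in node.query(predicate)}
            if result is None:
                result = {k: {node.name: r} for k, r in rows.items()}
            else:
                result = {k: {**v, node.name: rows[k]}
                          for k, v in result.items() if k in rows}
        return result or {}


sdss = SkyNode("SDSS", [{"sky_id": 1, "mag": 17.0},
                        {"sky_id": 2, "mag": 21.0},
                        {"sky_id": 3, "mag": 18.0}])
twomass = SkyNode("TWOMASS", [{"sky_id": 1, "mag": 15.0},
                              {"sky_id": 3, "mag": 25.0}])
portal = Portal([sdss, twomass])
bright = lambda row: row["mag"] < 20
answer = portal.run(bright, key="sky_id")
print(answer)
```

Starting with the node that returns the fewest rows keeps the intermediate result small, which matters when each step is a call across the Internet.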
Databases: Theory to practice
- Data Cube
  - Wrote paper
  - SQL Server product and ISO Standard adopted idea
- Snapshot Isolation
  - Paper in 1996
  - Product in 2004
[Figure: a reader and two versions of the data, labeled old and new]
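The old/new version picture is the heart of snapshot isolation: a reader keeps seeing the version that was current when it started, even while a writer installs a newer one. The toy store below illustrates just that read rule. It is a sketch with hypothetical class names, and it deliberately omits the write-conflict check that a full snapshot isolation implementation also needs.

```python
class VersionedStore:
    """Toy multi-version store: writers append versions, readers see a fixed snapshot."""

    def __init__(self):
        self.versions = {}  # key -> list of (commit_timestamp, value), in commit order
        self.clock = 0

    def write(self, key, value):
        # Each write commits immediately with the next timestamp.
        self.clock += 1
        self.versions.setdefault(key, []).append((self.clock, value))

    def snapshot(self):
        return Snapshot(self, self.clock)


class Snapshot:
    def __init__(self, store, timestamp):
        self.store, self.timestamp = store, timestamp

    def read(self, key):
        # Return the newest version committed at or before the snapshot timestamp.
        for commit_ts, value in reversed(self.store.versions.get(key, [])):
            if commit_ts <= self.timestamp:
                return value
        return None


store = VersionedStore()
store.write("x", "old")
reader = store.snapshot()      # reader starts here
store.write("x", "new")        # writer installs a newer version
print(reader.read("x"), store.snapshot().read("x"))
```

The reader is never blocked by the writer and never sees a half-finished update, at the cost of keeping old versions around until no snapshot needs them.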
Databases: Stress Test
- Generate millions of random SQL queries
- Send them to 4 different products
- Compare the answers:
  - If all agree, good!
  - If not, a bug somewhere
- Found many bugs in DB products
- Much appreciated by MS DB group
- Tool cloned by other DB vendors
[Figure: Informix, Oracle, DB2, and SqlServer, with their answers compared for equality]
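The generate, send, and compare loop described above can be sketched with the standard library alone. This is not the actual tool: four in-memory SQLite databases stand in for the four products, one of them is wrapped with a deliberately injected bug so that there is something to find, and the query grammar is tiny. All function names are illustrative.

```python
import random
import sqlite3
from collections import Counter

def make_engine(rows):
    """An in-memory SQLite database standing in for one DB product."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (a INTEGER, b INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?, ?)", rows)
    # Sort the rows so that answers compare equal regardless of result order.
    return lambda sql: tuple(sorted(conn.execute(sql).fetchall()))

def buggy(engine):
    """Wrap an engine with an injected bug: it silently drops rows where a = 0."""
    return lambda sql: tuple(row for row in engine(sql) if row[0] != 0)

def random_query(rng):
    """Generate a random SELECT over table t from a very small grammar."""
    def predicate():
        column = rng.choice("ab")
        op = rng.choice(["<", "<=", "=", ">=", ">"])
        return f"{column} {op} {rng.randint(-3, 3)}"
    where = predicate()
    if rng.random() < 0.5:
        where += f" {rng.choice(['AND', 'OR'])} {predicate()}"
    return f"SELECT a, b FROM t WHERE {where}"

def stress(engines, n_queries, seed=0):
    """Run random queries on every engine; report (query, dissenting engines)."""
    rng = random.Random(seed)
    mismatches = []
    for _ in range(n_queries):
        sql = random_query(rng)
        answers = {name: run(sql) for name, run in engines.items()}
        majority, _votes = Counter(answers.values()).most_common(1)[0]
        dissent = sorted(name for name, ans in answers.items() if ans != majority)
        if dissent:
            mismatches.append((sql, dissent))
    return mismatches

rows = [(a, b) for a in range(-2, 3) for b in range(-2, 3)]
engines = {
    "W": make_engine(rows),
    "X": make_engine(rows),
    "Y": make_engine(rows),
    "Z": buggy(make_engine(rows)),
}
found = stress(engines, 200)
print(len(found), "disagreements; first:", found[0] if found else None)
```

No engine is treated as the reference. A disagreement only says that a bug exists somewhere, and the majority vote is a hint about where to look first.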
SQL Automated Test Example
Four SQL systems (W, X, Y, Z) on 2,000 statements
  W      X      Y      Z
  1672   1672   1672   1672
  232    234    241    31
  1      1      1      1
  31     15     12     28
  1      12     5      116
  0      29     32     4
  18     18     19     25
  45     19     18     113
Slide callouts: "Case"; "Error"; "All four agree 84%"; "W, X, and Y agree 95%"; "Problem with intermediate table."
PGM: Pretty Good Multicast
- Reliable multicast protocol
- Scales using hierarchy, suppression, and FEC “on-demand” (FEC on-demand is our contribution)
- Joint work with Cisco and others
- IETF standard
- Implemented prototype (Multicast PowerPoint)
- Shipped in Windows XP
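Why forward error correction helps a multicast sender can be shown with the simplest possible code, a single XOR parity packet. This is an illustration of the general idea only: the coding scheme PGM actually uses is not described on the slide, and the function names are hypothetical. The point is that one repair packet, sent once to the whole group, fixes a different loss at each receiver.

```python
from functools import reduce

def xor_parity(packets):
    """XOR equal-length packets together, byte by byte, into one parity packet."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*packets))

def repair(received, parity):
    """Rebuild the single missing packet (marked None) from the rest plus parity."""
    missing = [i for i, packet in enumerate(received) if packet is None]
    if len(missing) != 1:
        raise ValueError("XOR parity repairs exactly one loss per group")
    present = [packet for packet in received if packet is not None]
    fixed = list(received)
    fixed[missing[0]] = xor_parity(present + [parity])
    return fixed

group = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_parity(group)          # sent only when some receiver asks for repair

receiver_1 = [None, b"BBBB", b"CCCC"]    # lost the first packet
receiver_2 = [b"AAAA", b"BBBB", None]    # lost the third packet
print(repair(receiver_1, parity) == group, repair(receiver_2, parity) == group)
```

Without parity the sender would have to retransmit two different packets; with it, one multicast repair serves both receivers, which is what makes the "on-demand" variant scale.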
MyLifeBits
- “A lifetime store of everything”
- The experiment:
  - digitizing Gordon Bell’s life
- The software:
  - Based on SQL Server
  - Tools to capture web pages, IM chats, TV, radio & telephone
  - Reports, links, full text search, pivot by time or any other attribute
MyLifeBits Software
[Figure: architecture diagram with these components: Internet; MyLifeBits store (database, files); MyLifeBits Shell; voice annotation tool; text annotation tool; legacy applications; MAPI interface; legacy email client; browser tool; TV capture tool; TV EPG download tool; radio capture tool; radio EPG tool; telephone capture tool; PocketPC transfer tool; PocketRadio player]
Research Failures
- Not everything is a success
  - We had technology transfer failures
  - We had projects with little impact
- Success and Failure depend on environment
  - Even if you have a GREAT! idea
  - There are many exogenous factors in technology transfer
  - And, sometimes the idea or focus is wrong
- Allow people to fail once or twice.
Summary: Actionable Ideas
- Co-locate if possible
- Adopt a “university model”
- Recruit from the top
- Recruit for passion and a desire to have impact
- Install a Research Program Management organization to orchestrate tech-transfer
- Institute an annual TechFest
© 2003 Microsoft Corporation. All rights reserved.
This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.