Taking some of the hype out of Big Data again - MedTech Pharma, Nürnberg, July 2014

Post on 27-Jan-2015


DESCRIPTION

I was invited to redo the talk about Big Data that I gave in Berlin earlier this year - slides also here. The slides are similar but updated to reflect my new company, and some slides are new. Enjoy.

TRANSCRIPT

MedTech Pharma, Nürnberg 2014

Taking (some of) the mystery out of Big Data

Contact

Claus Stie Kallesøe

Founder, CEO

claus@gritsystems.dk

+45 30 14 15 36

Introduction

Big Data – Either VERY large datasets AND/OR other complexities

Characteristics of big data

Source: IBM methodology

A couple of words about scale

• 100’s of Megabytes
  • This should not be a problem. Can be handled with Matlab, R, Ruby
• 100/500 Gigabytes – 1 Terabyte
  • 2 Terabyte hard drives can be bought in the local shop for €100
  • Connect it to your laptop and install PostgreSQL or a NoSQL database on it
• > 5 Terabytes
  • Now you might have a size issue

Inspired by: http://www.chrisstucchio.com/blog/2013/hadoop_hatred.html
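
The rule of thumb on the scale slide can be written down as a small helper. This is only a sketch: the function name `suggest_tooling` is made up for illustration, the 1 Gigabyte cut-off for "100's of Megabytes" is an assumption, and the slide gives no guidance for the ranges it does not list (for example between 1 and 5 Terabytes), so the code says so instead of guessing.

```python
# Decimal units, as used on hard drive labels.
MB = 10**6
GB = 10**9
TB = 10**12


def suggest_tooling(size_bytes: int) -> str:
    """Map a dataset size to the advice given on the scale slide.

    The 1 GB cut-off is an assumption; the slide only says "100's of Megabytes".
    """
    if size_bytes < GB:
        return "not a problem: Matlab, R or Ruby on your laptop"
    if size_bytes <= TB:
        return "external hard drive plus PostgreSQL or a NoSQL database"
    if size_bytes <= 5 * TB:
        return "no guidance on the slide for this range"
    return "now you might have a size issue"


for size in (300 * MB, 500 * GB, 3 * TB, 8 * TB):
    print(size, "->", suggest_tooling(size))
```

The point of the sketch is the order of the checks: most datasets fall into the first two branches, which need nothing more than ordinary hardware.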

Big Data - “Definition”

"Big Data is high volume, high velocity, and/or high variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization."

Cool, but remember where we are!

Gartner Hype Cycle 2013

Big Data in Pharma R&D

What is Big Data in Pharma R&D?

• Many ideas/possibilities across Pharma R&D and market access
• But many of them are likely NOT “real” Big Data problems!
• Are they relevant and can they bring insights?
  • Yes, very much so
• Should we then find a way to handle them?
  • Absolutely

Disclaimer

• I am a (web) tech geek

• I have nothing against new technologies

• Like many other geeks I like them

• But do try to use the right tool for the right job

http://blog.mongohq.com/you-dont-have-big-data/

Another great tool - for some

Q: “Could you help me get to Nürnberg, pls?”
A: “Yes, absolutely. Not a problem”

Q: “Ok, btw I want to try the Endeavour”
A: “...ahh why?”

Q: “Because I have read it’s great”
A: “Yes, but the ICE….”

MapReduce explained in 41 words

Goal: Count the number of books in the library.

Map: You count up shelf #1, I count up shelf #2.

(The more people we get, the faster this part goes. )

Reduce: We all get together and add up our individual counts.

http://www.chrisstucchio.com/blog/2011/mapreduce_explained.html
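
The library example translates almost word for word into code. The following is a minimal single-machine sketch, not a Hadoop job: the `shelves` data and the helper names `count_shelf` and `add_counts` are invented for illustration, and the built-in `map` runs the counts one after another rather than in parallel.

```python
from functools import reduce

# Hypothetical library: each shelf is a list of book titles.
shelves = [
    ["Book A", "Book B", "Book C"],
    ["Book D"],
    ["Book E", "Book F"],
]


def count_shelf(shelf):
    """Map step: one person counts the books on one shelf."""
    return len(shelf)


def add_counts(left, right):
    """Reduce step: combine two partial counts into one."""
    return left + right


# Map: every shelf is counted independently of the others.
partial_counts = list(map(count_shelf, shelves))

# Reduce: add up the individual counts.
total_books = reduce(add_counts, partial_counts, 0)

print(partial_counts)  # [3, 1, 2]
print(total_books)     # 6
```

Because each shelf is counted independently, the map step is the part that could be handed out to more people (or machines); the reduce step only ever sees the small partial counts.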

What is it then? Linked data?

Does it matter what it is?

No!

It’s data - and potential analytics (business) opportunities.

Size and complexity should drive the technology

Technologies

Can we do anything on our own?

For many people/companies ”Big data technology” is a black box

”A lot of stuff”

And then the vendors go:

If { box = magic or money }
then { box = expensive }

Working within a community

A lot of tools available

From: http://people10.com/blog/ruby-on-rails-the-popular-platform-for-web-development/

New visualisations – easy and free

http://philogb.github.io/jit/demos.html

Automated calculations - can bring you far

Job submitted to async calculation server

https://circleci.com/
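
Submitting a calculation as an asynchronous job can be sketched with the Python standard library. This is an illustration under assumptions, not the setup shown on the slide: a thread pool inside the same process stands in for the separate calculation server, and `mean_response` with its readings is a made-up example calculation.

```python
from concurrent.futures import ThreadPoolExecutor


def mean_response(values):
    """Hypothetical calculation: the average of a list of assay readings."""
    return sum(values) / len(values)


# The pool stands in for the calculation server (an assumption for this sketch).
with ThreadPoolExecutor(max_workers=2) as pool:
    # submit() returns immediately with a Future; the job runs in the background.
    future = pool.submit(mean_response, [1.0, 2.0, 3.0, 6.0])

    # The caller is free to do other work here.

    # result() blocks until the job has finished and returns its value.
    result = future.result()

print(result)  # 3.0
```

The same submit-then-collect pattern applies when the worker is a real remote service; only the transport changes.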

Also a lot of great tools to handle data

Elasticsearch text indexes

• Indexed research assay metadata
  => Google-like search to find the relevant assay

• Indexed SharePoint project workspaces
  => Enable easy, fast cross-project queries to find trends
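
What a text index does "under the hood" can be shown with a toy inverted index. To be clear, this is not the Elasticsearch API and leaves out everything that makes Elasticsearch useful at scale (analysis, ranking, distribution); the `assays` records and the function names `build_index` and `search` are invented for illustration.

```python
from collections import defaultdict

# Hypothetical assay metadata records (invented for illustration).
assays = {
    "assay-1": "kinase inhibition assay human liver cells",
    "assay-2": "receptor binding assay rat brain tissue",
    "assay-3": "kinase binding assay human plasma",
}


def build_index(documents):
    """Build an inverted index: each term maps to the ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every term in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    matches = set(index.get(terms[0], set()))
    for term in terms[1:]:
        matches &= index.get(term, set())
    return matches


index = build_index(assays)
print(sorted(search(index, "kinase human")))  # ['assay-1', 'assay-3']
```

The lookup never scans the documents themselves, only the index, which is why this kind of search stays fast as the number of assays or project workspaces grows.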

Conclusion – Big data in Pharma R&D

• Many opportunities across R&D and market access

• More data linking and data analytics than Big Data

• You can use freely available tools on ”normal” hardware

• No magic ”Under the hood” – it’s just data

BUT you still need to define the questions you want to answer – before diving into technology!

www.gritsystems.dk

Ask….
