big data mgmt

Upload: tweet-binder

Post on 03-Apr-2018


  • 7/28/2019 Big Data Mgmt

    1/23

    Big Data MGMT

    Tweets and Stats

    By Tweet Category for iPad

    Date: 04/02/2013

    Analysis of the session 'Big Data MGMT' created with Tweet Category


    Session Big Data MGMT

    Introduction: This report was made with the iPad app Tweet Category and presents statistics and tweets grouped by category. To make reports like this one, download the app from http://www.TweetCategory.com. We created several categories for this event: one for each answer, plus positive tweets and others.
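The report does not say how Tweet Category assigns a tweet to a category. As a rough illustration only, the sketch below sorts tweets by the A1/Q1 style tags the chat participants used, then falls back to link, reply, and other. The rules, their order, and the function name `categorize` are assumptions made for this example, not the app's method.

```python
import re

# Hypothetical rules: they only mirror the category names listed in the report.
ANSWER_TAG = re.compile(r"\b[AQ]([1-8])\b")   # tags such as "A2:" or "Q7"
URL = re.compile(r"https?://\S+")

def categorize(text: str) -> str:
    """Return a category label for one tweet (illustrative rules only)."""
    match = ANSWER_TAG.search(text)
    if match:
        return f"Answer {match.group(1)}"
    if URL.search(text):
        return "Links"
    if text.startswith("@"):
        return "Replies"
    return "Other"

print(categorize("A8. The best IO is no IO #bigdatamgmt"))  # Answer 8
```

Matching both A and Q tags follows the report's own listings, where the moderator's question tweets appear in the corresponding answer category.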

    Statistics

    Potential impact

    Potential reach

    Categories

    Category             Total tweets    %   Original tweets   RT   Users
    Links                         240   33               143   97     100
    Other                         170   23                87   83      58
    Replies                        51    7                51    0      20
    Answer 2                       36    5                26   10      12
    Answer 6                       33    5                17   16      14
    Answer 3                       32    4                15   17      13
    Answer 5                       32    4                19   13      13
    Answer 1                       28    4                15   13      10
    Rest of categories            111   15                64   47      58

    Charts

    Tweets over time

    Time            Num. tweets
    13:20 27 mar    16
    01:34 28 mar    586
    13:48           18
    02:02 29 mar    50
    14:17           9
    02:31 30 mar    41
    14:45           5
    03:59 31 mar    7
    16:13           0

    Users by number of followers

    Num. followers  Num. users
    0-50            15
    50-100          23
    100-150         12
    150-200         8
    200-250         6
    250-300         4
    300-400         5
    400-500         8
    500-750         12
    750-1000        10
    1000-1500       8
    1500-5000       23
    5000-10000      5
    >10000          7

    Users by number of tweets

    Tweets per user  Num. users
    1                97
    2                17
    3                4
    4                5
    5                1
    >5               22

    Most Active Users (chart of tweets and followers per user)


    Charts

    Category    Total tweets    %   Original tweets   RT   Users   Impressions   Potential reach   Tweets/User   Followers/User
    Links                240   33               143   97     100       804.870           178.364           2,4            1.783
    Other                170   23                87   83      58       846.389           251.955           2,9            4.344
    Replies               51    7                51    0      20       248.269            61.137           2,5            3.056
    Answer 2              36    5                26   10      12       115.578            22.354           3,0            1.862
    Answer 6              33    5                17   16      14       127.768            27.087           2,4            1.934
    Answer 3              32    4                15   17      13       124.246            35.978           2,5            2.767
    Answer 5              32    4                19   13      13       130.020            47.221           2,5            3.632
    Answer 1              28    4                15   13      10        78.334            26.441           2,8            2.644
    Positive              25    3                16    9      11       135.377            34.948           2,3            3.177
    Questions             24    3                16    8      14       103.895            37.450           1,7            2.675
    Answer 7              22    3                10   12      13        95.607            38.249           1,7            2.942
    Answer 8              20    3                 9   11       9        83.706            31.685           2,2            3.520
    Answer 4              19    3                12    7      10        69.040            36.056           1,9            3.605
    Pictures               1    0                 1    0       1         1.792             1.792           1,0            1.792
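The report does not define its last two columns, but the printed values are consistent with Tweets/User being total tweets divided by users, rounded to one decimal, and Followers/User being potential reach divided by users with the fraction dropped. The sketch below is a check of that reading against rows of the table, not the app's documented formula. The function names are made up for this example; the report writes "." as the thousands separator and "," as the decimal mark.

```python
def parse_count(text: str) -> int:
    """Parse a count written with '.' as thousands separator, e.g. '178.364'."""
    return int(text.replace(".", ""))

def tweets_per_user(total_tweets: int, users: int) -> float:
    # Assumed formula: plain ratio, rounded to one decimal place.
    return round(total_tweets / users, 1)

def followers_per_user(potential_reach: int, users: int) -> int:
    # Assumed formula: integer division (the fraction is dropped, not rounded).
    return potential_reach // users

# "Links" row: 240 tweets, 100 users, potential reach 178.364
print(tweets_per_user(240, 100))                        # 2.4
print(followers_per_user(parse_count("178.364"), 100))  # 1783
```

The "Replies" row is the one that separates rounding from truncation: 61.137 / 20 is 3056.85, and the report prints 3.056.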


    11,2 5,0 20.307

    IBMbigdata         137 tweets
    Natasha_D_G        74 tweets
    BTRG_MikeMartin    46 tweets

    GlenGilmore        133.839 followers
    Timothy_Hughes     30.755 followers
    IBMSoftware        16.996 followers

    IBMbigdata         1.657.015 impressions
    jameskobielus      225.690 impressions
    furrier            208.995 impressions

    IBMbigdata         13 categories
    Natasha_D_G        13 categories
    jeffreyfkelly      13 categories

    IBMbigdata         38 RTs
    Natasha_D_G        37 RTs
    BigDataAlex        21 RTs

    IBMbigdata         99 original tweets
    Natasha_D_G        37 original tweets
    BTRG_MikeMartin    32 original tweets

    2565

    Users   Level         Followers
    2       Very low      0 to 10
    13      Low           10 to 50
    43      Medium-low    50 to 200
    23      Medium        200 to 500
    22      Medium-high   500 to 1000
    31      High          1000 to 5000
    12      Very high     >5000
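As an illustration of how a follower count maps onto the tiers above, here is a minimal Python sketch. The bounds and labels are copied from the report. The report's ranges share their endpoints, so where a boundary value such as exactly 10 or 50 belongs is an assumption: this sketch treats each upper bound as inclusive, which fits the last tier being written as ">5000". The name `follower_tier` is made up for this example.

```python
from bisect import bisect_left

# Upper bounds and labels copied from the report's tiers.
# Assumption: each upper bound is inclusive (10 is "Very low", 5000 is "High").
BOUNDS = [10, 50, 200, 500, 1000, 5000]
LABELS = ["Very low", "Low", "Medium-low", "Medium",
          "Medium-high", "High", "Very high"]

def follower_tier(followers: int) -> str:
    """Return the tier label for a follower count."""
    return LABELS[bisect_left(BOUNDS, followers)]

print(follower_tier(133839))  # Very high
print(follower_tier(120))     # Medium-low
```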


    User             Tweets
    IBMbigdata       137
    Natasha_D_G      74
    BTRG_MikeMartin  46
    jeffreyfkelly    42
    BigDataAlex      41
    jameskobielus    30
    zacharyjeans     27
    Dmattcarter      22
    InfoMgmtExec     19
    cristianmolaro   17
    furrier          15
    dvellante        14
    katsnelson       12
    johncrupi        11
    rkeshavmurthy    10
    dfloyer          8
    tomjkunkel       8
    IBM_InfoSphere   7
    TerraEchos       7
    BTRGIG           6
    CuneytG          6
    susvis           6
    kdnuggets        5
    IBM_DB2          4
    IBM_InfoMgmt_SE  4
    TheSocialPitt    4
    tmustacchio      4
    troycoleman      4
    Ercan__Yilmaz    3
    IBMRedbooks      3
    StacyLeidwinger  3
    timoelliott      3
    CrystaAnderson   2
    Ellen_Friedman   2
    GCSResearch      2
    IBMSmrtrCmptng   2
    IBM_Guardium     2
    IBMinfomgtFR     2
    K_Orovboni       2
    MoserMaCH        2
    PWIndustries     2
    abaum67          2
    camilo_rojas     2
    easysoft         2
    gzim             2
    ibm_iod          2
    jasebell         2
    karthik_ph       2
    nige25           2
    ASUG365          1
    AVialBoukobza    1
    AdvaiyaInc       1
    BButlerNWW       1
    BIABAYCOM        1
    BigDataCoaltion  1
    BostjanKozuh     1
    CGOC_Council     1
    CenturyLinkBiz   1
    ChristopheGC     1
    ForsythMAlexand  1
    FransBouma       1
    GlenGilmore      1
    IBMOptim4Oracle  1
    IBMPartnerPlan   1
    IBMPowerSystems  1
    IBMSoftware      1
    IBM_DWAnalytics  1
    IBMdatamag       1
    ITredux          1
    InfoMgmtPartner  1
    JaneTHoye        1
    Javier_A_Soto    1
    JohnEvans_IBM    1
    KeithBraswell    1
    LifeisData       1
    LubicaT          1
    MDI_LLC          1
    MULTILINKGIRL    1
    MattRMorrison    1
    Mbs_craig        1
    MhdKarneeb       1
    NicolasJMorales  1
    PR_KBrosey       1
    PabloJMoralesG   1
    PaminaPiegsa     1
    PlottingSuccess  1
    ReneeLivsey      1
    Storagecreep     1
    TT_Nicole        1
    TarekAbouAli1    1
    TheJillT         1
    Timothy_Hughes   1
    VinGAbr          1
    VinnieCardoso12  1
    WebhelpTSC       1
    annettefranz     1
    annickLEBER      1
    anupam_gaur      1
    battymarc        1
    bhurtibm         1


    bigdatasci       1
    billramo         1
    brunokilian      1
    cate             1
    claverieberge    1
    day_dree         1
    edd              1
    euclid_project   1
    fooisms          1
    heppenstance     1
    ibmpartners      1
    ideeHO           1
    jacqilevy        1
    jaumebp          1
    jennifer_dubow   1
    jsgarano         1
    jvfaulks         1
    kirstengraham    1
    ktwiter99        1
    malhotrayush     1
    marcusborba      1
    matt_parkerZT    1
    mervynvk         1
    mytek            1
    nigelwallis      1
    nkalaima         1
    padma8376        1
    paulawilesigmon  1
    piersgrundy      1
    plankers         1
    resilvajr        1
    roger_barnard    1
    sauravpoudel     1
    stephloverde     1
    stevengustafson  1
    storageio        1
    strataconf       1
    suvimarias       1
    swatzmystery     1
    swissjohnny      1
    tctjr            1
    theRab           1
    tinagroves       1
    tinyclues        1
    yeahsathish      1
    zaxar16          1
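The per-user counts listed above can be cross-checked against the report's summary figures. The Python sketch below groups the counts into the same buckets as the tweets-per-user chart (1, 2, 3, 4, 5, >5). The counts are transcribed from the list, with the 17 users at two tweets and the 97 users at one tweet written as repeats. The helper name `bucket` is made up for this example.

```python
from collections import Counter

# Per-user tweet counts transcribed from the list above, highest first.
top = [137, 74, 46, 42, 41, 30, 27, 22, 19, 17, 15, 14, 12, 11, 10,
       8, 8, 7, 7, 6, 6, 6, 5, 4, 4, 4, 4, 4, 3, 3, 3, 3]
counts = top + [2] * 17 + [1] * 97

def bucket(n: int) -> str:
    """Map a tweet count to the chart's bucket label."""
    return str(n) if n <= 5 else ">5"

histogram = Counter(bucket(n) for n in counts)

print(len(counts))   # 146
print(sum(counts))   # 733
print(histogram[">5"], histogram["1"])  # 22 97
```

The total of 733 tweets equals the sum of the "Total tweets" column in the category table, and 146 users matches the sum of the follower-level counts.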


    @furrier John Furrier
    where are all the big data apps? they are already here. Analytics & in memory make them better #bigdatamgmt
    Cat.: Questions

    @zacharyjeans Zachary Jeans
    Is @Spotify an in memory application? #BigDataMgmt
    Cat.: Questions

    @troycoleman Troy Coleman
    Do you see any in-memory databases running on z/OS? #bigdatamgmt
    Cat.: Questions

    @IBMbigdata IBM big data
    Great #bigdatamgmt contributions from @katsnelson @InfoMgmtExec @johncrupi @timoelliott @zacharyjeans @dfloyer @jasebell
    Cat.: Positive

    @CrystaAnderson Crysta Anderson
    Great #bigdatamgmt chat! Very interesting conversations. Thanks for herding, @thesocialpitt @ibmbigdata
    Cat.: Positive

    @johncrupi John Crupi
    In a year, will we still be talking about in-memory as a separate thing. Or will it just become in-memory analytics. #bigdatamgmt
    Cat.: Other

    @furrier John Furrier
    one issue is counterfeit Flash NAND devices data recovery not possible is a healthy industry of counterfeiting Flash NAND #bigdatamgmt
    Cat.: Other

    @dfloyer David Floyer
    #BigDataMgmt Using flash in conjunction with DRAM increases the scope of problems tackled and improves recoverability dramatically
    Cat.: Other

    @johncrupi John Crupi
    #m2m #IndustrialInternet analytics is the killer use case for in-memory, IMO. #bigdatamgmt
    Cat.: Other

    @johncrupi John Crupi
    We have to treat in-memory as the new architectural tier for real-time analytic apps #bigdatamgmt
    Cat.: Other

    @InfoMgmtExec Richard R. Lee
    #bigdatamgmt Info Mgmt has always been about "managing the bottlenecks". A major one has always been the db itself. In-Memory helps a lot.
    Cat.: Other

    @IBMbigdata IBM big data
    Welcome to the chat @GCSResearch! Glad to have you! #bigdatamgmt
    Cat.: Other

    @cristianmolaro Cristian Molaro
    A8 main role should be to accelerate access in relevant chunks... #bigdata is too big to be contained in memory... #bigdatamgmt
    Cat.: Answer 8

    @dvellante Dave Vellante
    A8. But no IO is expensive so in-memory in #bigdata has to be used judiciously #bigdatamgmt
    Cat.: Answer 8

    @Ercan__Yilmaz Ercan Yilmaz
    A7. #spark uses in memory querying of data #bigdatamgmt
    Cat.: Answer 7

    @dvellante Dave Vellante
    A7. yes and @jeffreyfkelly - interesting Aerospike - that's an extension of memory using flash #bigdatamgmt
    Cat.: Answer 7


    @dvellante Dave Vellante
    A6. Ask any DW practitioner and they'll tell you a story of "chasing the chips" #the_need_for_speed #bigdatamgmt
    Cat.: Answer 6

    @zacharyjeans Zachary Jeans
    A6: I don't know the answer. What are the stability issues with long term storage on physical media vs In Memory solutions? #BigDataMgmt
    Cat.: Answer 6

    @katsnelson Leon Katsnelson
    A5 right cost model for the right type of data. Nothing is cheap or expensive on its own. Too expensive for something #bigdatamgmt
    Cat.: Answer 5

    @BigDataAlex Alex Philp
    A5: It would take only one shelf of a flash-based storage system. #bigdatamgmt
    Cat.: Answer 5

    @Natasha_D_G Natasha Bishop
    A4: When data scientists can find answers 2 questions they didnt THINK to ask its a win #bigdatamgmt
    Cat.: Answer 4

    @BigDataAlex Alex Philp
    A4: Fire Scientists in Montana are using in-memory computing to better understand wild land fire given a changing climate. #bigdatamgmt
    Cat.: Answer 4

    @jeffreyfkelly Jeff Kelly
    A3 any transaction workload that requires real-time response in order to win/save/upsell the customer is in-memory candidate #bigdatamgmt
    Cat.: Answer 3

    @CuneytG Cuneyt Goksu
    A3 all oltp apps need to be fast. n memory is fast too. So any oltp app is in the scope of inmemory #bigdatamgmt
    Cat.: Answer 3

    @cristianmolaro Cristian Molaro
    A2 When you remove the I/O constraints by going on-memory you will hit the next performance wall: CPU #bigdatamgmt
    Cat.: Answer 2

    @Natasha_D_G Natasha Bishop
    A2: In-memory tech = gold in #CX tactics and can drive proactive #custserv: up-sell, cross sell #bigdatamgmt #cxo
    Cat.: Answer 2

    @cristianmolaro Cristian Molaro
    A1 faster data access enables real-time massive data processing: real-time #bigdata #bigdatamgmt
    Cat.: Answer 1

    @BigDataAlex Alex Philp
    A1: IMC reduces power and storage costs, revolutionizing access. #bigdatamgmt
    Cat.: Answer 1


    Annexes


    Category

    Answer 2
    Tweets to question 2

    Tweets:            26
    Users:             12
    Retweets:          10
    Potential reach:   22.354
    Potential impact:  115.578
    Tweets / user:     2,2

    Most Active Users

    1. jeffreyfkelly
    2. IBMbigdata
    3. Natasha_D_G
    4. cristianmolaro
    5. zacharyjeans

    Tweets from this category

    @IBMbigdata IBM big data
    Q2 What are the killer apps of in-memory tech? Share examples for good reference models #bigdatamgmt

    @BigDataAlex Alex Philp
    A2: Working with streaming data to analyze audio processing in real-time 32 petabytes a day burn rate. #bigdatamgmt

    @jeffreyfkelly Jeff Kelly
    A2 anything requiring speed-of-thought response time - allows for exploration of large data sets in near real-tim #bigdatamgmt

    @katsnelson Leon Katsnelson
    Q2 Call Detail Records processing in memory. 9 bilion CDRs per day. Can't think of a better case for memory #bigdatamgmt

    @Natasha_D_G Natasha Bishop
    A2: In-memory tech = gold in #CX tactics and can drive proactive #custserv: up-sell, cross sell #bigdatamgmt #cxo

    @Natasha_D_G Natasha Bishop
    Nice RT @katsnelson: Q2 Call Detail Recs processing in memory. 9 bilion CDRs per day. Can't think of a better case for memory #bigdatamgmt

    @IBMbigdata IBM big data
    Nice RT @katsnelson: Q2 Call Detail Records processing in memory. 9 bilion CDRs per day. Cant think of a better case for memory #bigdatamgmt

    @cristianmolaro Cristian Molaro
    A2 I cannot think about any application that would not take advantage of faster processing... #bigdatamgmt

    @IBMbigdata IBM big data
    Cash! RT @Natasha_D_G: A2: In-memory tech = gold in #CX tactics and can drive proactive #custserv: up-sell, cross sell #cxo #bigdatamgmt

    @InfoMgmtExec Richard R. Lee
    #bigdatamgmt A2 - Apps such as High Frequency Trading and Real-time Risk/Fraud Analysis come to mind as strong users In-Memory. Many more.

    @cristianmolaro Cristian Molaro
    A2 When you remove the I/O constraints by going on-memory you will hit the next performance wall: CPU #bigdatamgmt

    @IBMbigdata IBM big data
    Good 1s RT @InfoMgmtExec: #bigdatamgmt A2 - Apps such as High Frequency Trading and Real-time Risk/Fraud Analysis come to mind #bigdatamgmt

    @jeffreyfkelly Jeff Kelly
    A2 smart meter analytics #bigdatamgmt

    @katsnelson Leon Katsnelson
    A2 many apps where data is not valuable enough to even store on disk. In Streams we process stuff in memory and discard #bigdatamgmt

    @CuneytG Cuneyt Goksu
    A2 fraud detection and investigation is a good candidate #bigdatamgmt

    @zacharyjeans Zachary Jeans
    .@InfoMgmtExec: #bigdatamgmt A2 - Apps such as High Frequency Trading and Real-time Risk/Fraud Analysis come to mind #bigdatamgmt


    @IBMbigdata IBM big data
    Then what? RT @cristianmolaro: A2 When you remove I/O constraints by going on-memory you hit next performance wall: CPU #bigdatamgmt

    @TerraEchos TerraEchos, Inc.
    Definitely has great security applications! RT @CuneytG: A2 fraud detection and investigation is a good candidate #bigdatamgmt

    @Natasha_D_G Natasha Bishop
    Good for #finserv & #insurance RT @CuneytG: A2 fraud detection and investigation is a good candidate #bigdatamgmt

    @jeffreyfkelly Jeff Kelly
    A2 investigating network traffic issues, finding bottlenecks #bigdatamgmt

    @cristianmolaro Cristian Molaro
    A2 on-memory allows applications to fully exploit today's more and more powerful CPUs... good news for #bigdata! #bigdatamgmt

    @jeffreyfkelly Jeff Kelly
    A2 analyzing high-velocity financial data in trading scenarios - no time to lose in this use case! #bigdatamgmt

    @jeffreyfkelly Jeff Kelly
    A2 iterate, iterate, iterate #bigdatamgmt

    @IBMbigdata IBM big data
    Then iterate again RT @jeffreyfkelly: A2 iterate, iterate, iterate #bigdatamgmt

    @zacharyjeans Zachary Jeans
    A2: Logistics. SAP HANA reduced a chinese bottled water company's calculation time from 24 hours to under a minute. #BigDataMgmt

    @jeffreyfkelly Jeff Kelly
    A2 in-memory allows Data Scientists to ask more questions, to quickly refine questions, and to more quickly find answers #bigdatamgmt


    Category

    Answer 6
    Tweets to question 6

    Tweets:            17
    Users:             14
    Retweets:          16
    Potential reach:   27.087
    Potential impact:  127.768
    Tweets / user:     1,2

    Most Active Users

    1. IBMbigdata
    2. cristianmolaro
    3. BigDataAlex
    4. Natasha_D_G
    5. jeffreyfkelly

    Tweets from this category

    @IBMbigdata IBM big data
    Q6 How does in-memory support or supplement data warehousing? #edw #bigdatamgmt

    @BigDataAlex Alex Philp
    A6: IMC can help folks leverage their data warehouse - rewire the house for speed. #bigdatamgmt

    @jeffreyfkelly Jeff Kelly
    A6 back to economics - don't need your entire DW in-memory - use in-memory to supplement trad DW workloads #bigdatamgmt

    @dvellante Dave Vellante
    A6. DW/BI for years has been like a "snake swallowing a basketball" - in memory is critical to solve this problem #bigdatamgmt

    @IBMbigdata IBM big data
    I feel the need! RT @BigDataAlex: A6: IMC can help folks leverage their data warehouse - rewire the house for speed. #bigdatamgmt

    @zacharyjeans Zachary Jeans
    A6: I don't know the answer. What are the stability issues with long term storage on physical media vs In Memory solutions? #BigDataMgmt

    @dvellante Dave Vellante
    A6. Ask any DW practitioner and they'll tell you a story of "chasing the chips" #the_need_for_speed #bigdatamgmt

    @InfoMgmtExec Richard R. Lee
    #bigdatamgmt A6 In Memory EDW is Holy Grail. Makes EDW more of "real-time repository" that can better serve Operational & Analytical needs.

    @CuneytG Cuneyt Goksu
    A6 in memory analytics is needed if you expect fast reply from dw supported by hadoop #bigdatamgmt

    @cristianmolaro Cristian Molaro
    A6 some data warehousing appliances take advantage of in-memory processing of data. An example is IBM IDAA #bigdatamgmt

    @BigDataAlex Alex Philp
    A6: People need to save money in building and supporting their warehouse. IMC is one way to get there. #bigdatamgmt

    @cristianmolaro Cristian Molaro
    A6 in-memory processing has been for ages THE performance strategy of every database management system... bufferpools? #bigdatamgmt

    @jeffreyfkelly Jeff Kelly
    A6 must balance biz value of better performance via in-memory versus cost as applied to DW workloads - all workloads really #bigdatamgmt

    @BigDataAlex Alex Philp
    A6: #Forbes is writing about In Memory Computing - paradigm shifting. #bigdatamgmt

    @Natasha_D_G Natasha Bishop
    Funny....RT @BigDataAlex: A6: #Forbes is writing about In Memory Computing - paradigm shifting. #bigdatamgmt

    @cristianmolaro Cristian Molaro
    A6 often computer systems are CPU rich and Memory poor... in some cases adding more memory can be the best performance upgrade #bigdatamgmt


    @cristianmolaro Cristian Molaro
    A6 a huge amount of memory is not necessarily a recipe for great performance: the system has to divide info to conquer #bigdata #bigdatamgmt


    Category

    Answer 3
    Tweets to question 3

    Tweets:            15
    Users:             13
    Retweets:          17
    Potential reach:   35.978
    Potential impact:  124.246
    Tweets / user:     1,2

    Most Active Users

    1. IBMbigdata
    2. BigDataAlex
    3. Natasha_D_G
    4. InfoMgmtExec
    5. jeffreyfkelly

    Tweets from this category

    @IBMbigdata IBM big data
    Q3 in a minute #bigdatamgmt

    @IBMbigdata IBM big data
    Q3 What are in-memory's applications in transactional computing? #bigdatamgmt

    @InfoMgmtExec Richard R. Lee
    #bigdatamgmt A3 Orgs want entire Customer Base, Product Sku's & Pricing in Memory for rapid transaction processing. Customers will not wait!

    @jeffreyfkelly Jeff Kelly
    A3 ad tech - analyzing user data, real-time bidding, delivering persnalized content - in milliseconds #bigdatamgmt

    @katsnelson Leon Katsnelson
    A3 many Streams apps are transactional and Streams is always in memory. #bigdatamgmt

    @IBMbigdata IBM big data
    Pondering Q3, I see. What are in-memory's applications in transactional computing? #bigdatamgmt

    @CuneytG Cuneyt Goksu
    A3 all oltp apps need to be fast. n memory is fast too. So any oltp app is in the scope of inmemory #bigdatamgmt

    @BigDataAlex Alex Philp
    A3:Connecting the Internet of Things - IP addressable sensors to real-time calibrate our models for better predictive analytics #bigdatamgmt

    @IBMbigdata IBM big data
    Impatient souls RT @InfoMgmtExec: #bigdatamgmt A3 Orgs want entire Customer Base, Product Skus & Pricing in Memory #bigdatamgmt

    @Natasha_D_G Natasha Bishop
    Needed in our "instant" mrkt RT @InfoMgmtExec: #bigdatamgmt A3 Orgs want entire Customer Base, Product Skus & Pricing in Memory #bigdatamgmt

    @IBMbigdata IBM big data
    Customization RT @jeffreyfkelly: A3 ad tech, analyzing user data, real-time bidding, persnalized content in millisecs #bigdatamgmt

    @jeffreyfkelly Jeff Kelly
    A3 any transaction workload that requires real-time response in order to win/save/upsell the customer is in-memory candidate #bigdatamgmt

    @BigDataAlex Alex Philp
    A3:Working in the oil and gas industry-energy exploration requires millions of transactions a day for discovery of new resource #bigdatamgmt

    @furrier John Furrier
    A3: memory is making up for disk speed & is now becoming more important in software models-big oppty #bigdatamgmt

    @InfoMgmtExec Richard R. Lee
    #bigdatamgmt A3. In-Memory db will allow Predictive Models to be deployed into Transactional Work Flows for real-time scoring & prediction


    Category

    Answer 5
    Tweets to question 5

    Tweets:            19
    Users:             13
    Retweets:          13
    Potential reach:   47.221
    Potential impact:  130.020
    Tweets / user:     1,5

    Most Active Users

    1. IBMbigdata
    2. Natasha_D_G
    3. zacharyjeans
    4. BigDataAlex
    5. dvellante

    Tweets from this category

    @IBMbigdata IBM big data
    Q5 What are the economics? Is in-memory more expensive? Where does it make sense? #bigdatamgmt

    @BigDataAlex Alex Philp
    A5: Flash memory is cheap, and getting cheaper. #bigdatamgmt

    @zacharyjeans Zachary Jeans
    A5: In-Memory must either serve a mission critical system, or profit the company via efficiency gain. #BigDataMgmt

    @BigDataAlex Alex Philp
    A5: It takes 4 racks of disk storage to create a system capable of 1 million IOPS, or input/output operations per second. #bigdatamgmt

    @Natasha_D_G Natasha Bishop
    MT @jameskobielus: #bigdatamgmt A5: Yes, in-mem more expensive acq than HDD, but coming down. Cost per IOPS, though, in-mem cost-effective

    @BigDataAlex Alex Philp
    A5: It would take only one shelf of a flash-based storage system. #bigdatamgmt

    @jeffreyfkelly Jeff Kelly
    A5 hybrid approach - in-memory/disk - often needed to make economics work #bigdatamgmt

    @InfoMgmtExec Richard R. Lee
    Reductions in latency well worth the cost factors. @jameskobielus: @IBMbigdata #bigdatamgmt A5: Yes, in-mem more expensive acq than HDD.

    @zacharyjeans Zachary Jeans
    A5: We wouldn't even be talking In Memory solutions today if the price for RAM wasn't becoming so reasonable. #BigDataMgmt

    @katsnelson Leon Katsnelson
    A5 right cost model for the right type of data. Nothing is cheap or expensive on its own. Too expensive for something #bigdatamgmt

    @IBMbigdata IBM big data
    Nice! RT @BigDataAlex: A5: It would take only one shelf of a flash-based storage system. #bigdatamgmt

    @dvellante Dave Vellante
    A5. Isn't it really a balance? - hierarchy of media from in-memory->flash->spinning rust #bigdatamgmt

    @Natasha_D_G Natasha Bishop
    Gd point RT @katsnelson: A5 right cost model 4 right type data. Nothing cheap or expensive on its own. 2 expensive 4 something #bigdatamgmt

    @IBMbigdata IBM big data
    Rust - nice! RT @dvellante: A5. Isnt it really a balance? - hierarchy of media from in-memory->flash->spinning rust #bigdatamgmt

    @furrier John Furrier
    A5: opensource impacts the economics when talking mission critical; sw written to live in-memory is paradigm shift #disruption #bigdatamgmt

    @dvellante Dave Vellante
    A5. Best economic solution is intelligence in file sys where active data svcd fm fast memory and slow data is in the bit bucket #bigdatamgmt


    @InfoMgmtExec Richard R. Lee
    #bigdatamgmt A5 Economics self-evident. Living in real-time world using tools that are not real-time. Reducing Latency to Zero is end game.

    @dvellante Dave Vellante
    A5. imho less a matter of $ + more case of biz impact. If biz case=excellent $ of in-mem is irrelevant #bigdatamgmt

    @zacharyjeans Zachary Jeans
    A5: #BigDataMgmt #CXO #Leadfromwithin #LeadWithGiants #CXOTalk are all great twitter chats. #cmgrhangout


    Category

    Answer 1
    Tweets to question 1

    Tweets:            15
    Users:             10
    Retweets:          13
    Potential reach:   26.441
    Potential impact:  78.334
    Tweets / user:     1,5

    Most Active Users

    1. Natasha_D_G
    2. BigDataAlex
    3. IBMbigdata
    4. BTRGIG
    5. Dmattcarter

    Tweets from this category

    @IBMbigdata IBM big data
    Reminder: Use A1, A2, etc. as you respond to signify which question you are addressing, and include #bigdatamgmt

    @IBMbigdata IBM big data
    Q1 What is in-memory tech? How does it enable real-time speed-of-thought #analytics? #bigdatamgmt

    @BigDataAlex Alex Philp
    A1: In-Memory Computing (IMC) utilizes RAM-DRAM for extremely fast I/O, moving us away from slow, underutilized spinning disk #bigdatamgmt

    @Natasha_D_G Natasha Bishop
    A1:In-memory tech enables biz 2 utilize data stored in main memory vs fragmented/siloed trad databases #bigdatamgmt

    @jeffreyfkelly Jeff Kelly
    A1 in-memory refers to storing data in main memory (DRAM) rather than spinning disk #bigdatamgmt

    @IBMbigdata IBM big data
    Nobody likes slow RT @BigDataAlex: A1: In-Memory Computing (IMC) utilizes RAM-DRAM for extremely fast I/O #bigdatamgmt

    @Natasha_D_G Natasha Bishop
    A1: In simplest form: open book exams vs memorizing ans. Time it takes to search for answers test is over! #bigdatamgmt

    @BigDataAlex Alex Philp
    A1: IMC reduces power and storage costs, revolutionizing access. #bigdatamgmt

    @jeffreyfkelly Jeff Kelly
    A1 much faster to pull data from memory than disk - response time much quicker than spinning rusty metal allows #bigdatamgmt

    @IBMbigdata IBM big data
    Ha ha! RT @Natasha_D_G: A1: In simplest form: open book exams vs memorizing ans. #bigdatamgmt

    @cristianmolaro Cristian Molaro
    A1 memory access is way faster than disk I/O... even against SSD #bigdatamgmt

    @cristianmolaro Cristian Molaro
    A1 faster data access enables real-time massive data processing: real-time #bigdata #bigdatamgmt

    @jameskobielus jameskobielus
    #bigdatamgmt A1: Speed of thought is any tech that doesnt have any architectural bottlenecks that arbitrarily slow people's explorations

    @CuneytG Cuneyt Goksu
    A1 #bigdatamgmt inmemory means fast access to data #bigdatamgmt

    @Natasha_D_G Natasha Bishop
    A1: Memory makes diff! Ability 2 deliver accurate answer w/o pregnant pauses impacts biz agility #bigdatamgmt


    Category

    Answer 7
    Tweets to question 7

    Tweets:            10
    Users:             13
    Retweets:          12
    Potential reach:   38.249
    Potential impact:  95.607
    Tweets / user:     0,8

    Most Active Users

    1. IBMbigdata
    2. jeffreyfkelly
    3. BTRG_MikeMartin
    4. InfoMgmtExec
    5. Natasha_D_G

    Tweets from this category

    @IBMbigdata IBM big data
    Q7 Can in-memory techniques be applied to non-relational databases and/or #Hadoop? #bigdatamgmt

    @jeffreyfkelly Jeff Kelly
    A7 yes - see @aerospike, NoSQL, flash-optimized in-memory DB #bigdatamgmt

    @InfoMgmtExec Richard R. Lee
    #bigdatamgmt A7 -Time Series db's(Informix) will benefit substantially from In Memory. Critical to Smart Metering and Smart Grid strategies.

    @dfloyer David Floyer
    #BigDataMgmt A7 Of course! Databases such as Couchbase (Memcache) & Aerospike (Flash) use KV pairs in memory extensively for transactions

    @dvellante Dave Vellante
    A7. yes and @jeffreyfkelly - interesting Aerospike - that's an extension of memory using flash #bigdatamgmt

    @IBMbigdata IBM big data
    A7 MT @katsnelson: Hadoop is about data on disk. Streams does opposite i.e processes in-memory. IBM bundles Hadoop and Streams #bigdatamgmt

    @jeffreyfkelly Jeff Kelly
    A7 like the DW question, in-memory DB can supplement Hadoop batch analytics w/ real-time analytic queries #bigdatamgmt

    @BigDataAlex Alex Philp
    A7: InfoSphere #Streams brings "database" functions into IMC in real time for continuous query and calculations #bigdatamgmt

    @Ercan__Yilmaz Ercan Yilmaz
    A7. #spark uses in memory querying of data #bigdatamgmt

    @jeffreyfkelly Jeff Kelly
    A7 I believe there are in-memory instances of #Cassandra - anybody have info? #bigdatamgmt


    Category

    Answer 8
    Tweets to question 8

    Tweets:            9
    Users:             9
    Retweets:          11
    Potential reach:   31.685
    Potential impact:  83.706
    Tweets / user:     1,0

    Most Active Users

    1. IBMbigdata
    2. dvellante
    3. cristianmolaro
    4. BTRG_MikeMartin
    5. Natasha_D_G

    Tweets from this category

    @IBMbigdata IBM big data

    Last ? - Q8 What is main role for in-memory in #bigdata

    infrastructures? Where does flash memory fit? #bigdatamgmt

    @dvellante Dave Vellante

    A8. The best IO is no IO #bigdatamgmt

    @dvellante Dave Vellante

    A8. But no IO is expensive so in-memory in #bigdata has to be used

    judiciously #bigdatamgmt

    @IBMbigdata IBM big data

    No IO! No IO! No IO! RT @dvellante: A8. The best IO is no IO

    #bigdatamgmt

    @cristianmolaro Cristian Molaro

    A8 main role should be to accelerate access in relevant chunks...

    #bigdata is too big to be contained in memory... #bigdatamgmt

    @Natasha_D_G Natasha Bishop

    HA! RT @BTRG_MikeMartin: RT Favorite comment so far @dvellante:

    A8. The best IO is no IO #bigdatamgmt

    @jeffreyfkelly Jeff Kelly

    A8 in-memory should be used strategically in #BigData infrastructure

    where speed, performance gains outweigh costs #bigdatamgmt

    @cristianmolaro Cristian Molaro

    A8 I like the concept of multi-temperature storage: the hottest data

    stored on the faster (and more expensive) storage device #bigdatamgmt

    @cristianmolaro Cristian Molaro

    A8 not all the #bigdata has the same requirements for access

    performance: keep the hot data close to you and in memory

    #bigdatamgmt

    Category: Answer 4 (Tweets to question 4)

    Tweets: 12
    Users: 10
    Retweets: 7
    Potential reach: 36.056
    Potential impact: 69.040
    Tweets / user: 1,2

    Most Active Users

    1. Natasha_D_G
    2. IBMbigdata
    3. BigDataAlex
    4. InfoMgmtExec
    5. jeffreyfkelly

    Tweets from this category

    @IBMbigdata IBM big data

    Q4 How does in-memory support greater data scientist productivity?

    #bigdatamgmt

    @BigDataAlex Alex Philp

    A4: HPC, next gen chip design, less I/O disk functions in our code,

    converging toward better scientific computing. #bigdatamgmt

    @jeffreyfkelly Jeff Kelly

    A4 less trips to the watercooler waiting for query response

    #bigdatamgmt

    @Natasha_D_G Natasha Bishop

    Indeed! RT @jeffreyfkelly: A4 less trips to the watercooler waiting for

    query response #bigdatamgmt

    @Natasha_D_G Natasha Bishop

    A4: Data scientist gain major advantage when they can access &

    digest massive amts data in secs. #bigdatamgmt

    @InfoMgmtExec Richard R. Lee

    #bigdatamgmt A4 - Decision Scientists spend way too much time

    today conditioning & gathering data. In Memory can have it all in

    one place.

    @BigDataAlex Alex Philp

    A4: Fire Scientists in Montana are using in-memory computing to

    better understand wild land fire given a changing climate. #bigdatamgmt

    @Natasha_D_G Natasha Bishop

    A4: When data scientists can find answers 2 questions they didnt

    THINK to ask its a win #bigdatamgmt

    @IBMbigdata IBM big data

    Nice point RT @Natasha_D_G: A4: When data scientists can find

    answers 2 questions they didnt THINK to ask its a win #bigdatamgmt

    @Ercan__Yilmaz Ercan Yilmaz

    A4. To the effect that it improves data munging and visualization, it

    helps #bigdatamgmt

    @InfoMgmtExec Richard R. Lee

    #bigdatamgmt A4 - In-Memory allows DS to create a "Memory Palace"

    for Models, A/B Tests, Algorithms in development, etc. All in real-time.

    @IBMbigdata IBM big data

    A palace! RT InfoMgmtExec: A4 - In-Memory allows DS to create a

    "Memory Palace" for Models, A/B Tests, Algorithms in dev

    #bigdatamgmt

    Glossary

    Page 1: General Overview: This page shows at a glance the evolution and the global statistics of the session.

    Statistics

    - Number of Tweets: Total number of tweets sent during the session, RTs and replies included. Below this figure is a breakdown by type of tweet: original tweets (those containing text only), tweets with links, retweets, conversations (tweets that are part of a conversation between several users), check-ins and photos.

    - Number of users: Total number of users who participated in the session using the given hashtag. It also includes users who only sent RTs.

    - Potential Impact: Number of impressions of the hashtag, that is, the number of times people could have seen it. This number is calculated by multiplying each user's number of followers by the number of tweets that user sent, and then adding those results. Example: if one user sends 2 tweets and has 100 followers, that user generates 200 impressions. If another user sends 3 tweets and has 50 followers, that user generates 150 impressions, for a session total of 350 impressions.
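    The impact calculation described above can be sketched in a few lines of Python. This is only an illustration of the stated formula, not the app's actual code; the function name `potential_impact` and the `(tweets, followers)` pair layout are assumptions made for the example.

```python
def potential_impact(users):
    """Sum of followers * tweets over all users.

    `users` is a list of (tweets_sent, followers) pairs; this layout is
    a hypothetical choice for the sketch, not the app's data model.
    """
    return sum(tweets * followers for tweets, followers in users)


# The two users from the glossary example:
# 2 tweets with 100 followers, and 3 tweets with 50 followers.
session = [(2, 100), (3, 50)]
print(potential_impact(session))  # 200 + 150 = 350
```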

    - Potential Reach: Number of users who could have seen the hashtag and so could have been impacted by it. This number is calculated by adding the followers of every user who participated in the session. Using the previous example, if the session had 2 users, one with 100 followers and the other with 50, the reach would be 150 followers, regardless of the number of tweets sent. IMPORTANT: both the impact and the reach are 'potential' because not everyone may have seen the hashtag, and participants can have followers in common.
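    As a rough sketch of the reach calculation, the snippet below simply adds up follower counts. The function name `potential_reach` is an assumption for illustration. Because it only sees counts, it cannot detect followers shared between participants, which is one reason the result is only 'potential'.

```python
def potential_reach(follower_counts):
    """Add the follower counts of every participating user.

    Shared followers are counted once per participant they follow,
    so the result is an upper bound on the distinct audience.
    """
    return sum(follower_counts)


# Two participants with 100 and 50 followers, as in the example.
# The number of tweets each one sent does not matter here.
print(potential_reach([100, 50]))  # 150
```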

    - Average number of Tweets per user: the average number of tweets sent by each user. This number is calculated by dividing the number of tweets by the number of users who participated. RTs included.

    - Average followers per user: the average number of followers of the users in the session. This figure indicates how influential the participants in the session are. Given that the average Twitter user has about 250 followers, you can check whether the participants in your session exceed that average. This number is calculated by dividing the sum of followers by the number of users who participated.
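    Both averages follow directly from the session totals. The sketch below reuses the two users from the impact example; the name `session_averages` and the `(tweets, followers)` pair layout are assumptions for illustration only.

```python
def session_averages(users):
    """Return (average tweets per user, average followers per user).

    `users` is a list of (tweets_sent, followers) pairs, one per
    participant (a hypothetical layout for this sketch).
    """
    if not users:
        raise ValueError("the session has no users")
    total_tweets = sum(tweets for tweets, _ in users)
    total_followers = sum(followers for _, followers in users)
    return total_tweets / len(users), total_followers / len(users)


# 5 tweets and 150 followers spread over 2 users.
avg_tweets, avg_followers = session_averages([(2, 100), (3, 50)])
print(avg_tweets, avg_followers)  # 2.5 75.0
```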

    - Difference between Total Tweets and Tweets: Total Tweets include RTs, links, replies and 'Tweets'. 'Tweets' are the ones containing only text.

    Charts

    The Tweet Category report includes several types of charts:

    Temporal Evolution: shows the time evolution of the tweets sent by users. Tweet Category takes the first and last tweet

    and draws the timeline of the session. Thanks to this chart you will be able to identify the moment people tweeted the most

    or the least.

    Influence of Users: shows the influence of the users who participated in the session. The vertical axis shows the number of users and the horizontal axis the number of followers those users have. Moving to the right of the graph, you see the users with more followers and therefore more influence. The higher the columns on the right, the greater the influence of your users.
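    One way to build the columns of such a chart is to count users per follower range. The sketch below uses the ranges shown in this report's chart; treating each upper edge as inclusive (so 50 followers falls in "0-50") is an assumption, since the report does not say how boundaries are handled, and `follower_histogram` is a hypothetical name.

```python
from bisect import bisect_left
from collections import Counter

# Upper edges of the follower ranges used in the report's chart.
UPPER_EDGES = [50, 100, 150, 200, 250, 300, 400, 500,
               750, 1000, 1500, 5000, 10000]
LABELS = ["0-50", "50-100", "100-150", "150-200", "200-250", "250-300",
          "300-400", "400-500", "500-750", "750-1000", "1000-1500",
          "1500-5000", "5000-10000", ">10000"]


def follower_histogram(follower_counts):
    """Count users per follower range, in chart order.

    Assumption: an upper edge belongs to the lower range.
    """
    counts = Counter(LABELS[bisect_left(UPPER_EDGES, followers)]
                     for followers in follower_counts)
    return [(label, counts[label]) for label in LABELS]


for label, num_users in follower_histogram([12, 50, 51, 980, 25000]):
    if num_users:
        print(label, num_users)
```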

    User activity: shows the number of tweets sent by users. The vertical axis shows the number of tweets sent and the

    horizontal one the number of users who have participated.

    Page 2: Statistics of the categories: on this page you will find the detailed statistics for each category.

    Rankings of categories:

    This ranking shows which categories have reached the top 5 according to several statistics. Note that although a category may have more tweets than another one, it can still have fewer impressions (lower impact). The rankings show the categories with the highest reach, impact, number of users, number of tweets and number of RTs.

    Charts:

    Impact by category: This graph shows which category has the highest number of impressions and therefore the highest

    impact.

    Tweets by category Chart: This chart shows which category has the highest number of total tweets.

    Users by category Chart: This chart shows which category has the highest number of users.

    Table of categories:

    This table shows detailed statistics for each category; these are the same variables as the global statistics of the session, but applied to each category. You can therefore see which category gets more impact, more users, and so on. It is worth taking a moment to study this table, as it can lead to very interesting conclusions.

    Page 3: User Rankings

    Tweet Category offers different kinds of user rankings:

    Most active users: the ones who tweeted the most using the hashtag. RTs included.

    Most popular users: the ones who have the highest number of followers in the session.

    Users with the highest impact: the ones who generated the highest number of impressions.

    Most participative users: the ones who participated in more categories.

    Most retweeter users: the ones who sent the highest number of RTs.

    Most original users: the ones who sent the highest number of original tweets (No RTs).