
  • A new approach in FFG to achieve deeper insights into project data

    TAFTIE Expert Session

    Sept 14, 2017

    Harald Hochreiter, Daniel Vilsecker

    Text Mining

  • INNOVATION MAPPING IN FFG

    FFG tried to map innovation trends from the projects we fund

    Thematic teams were created that met every 1-2 months to discuss projects and exchange experience (e.g. mobility, services)

    Worked well for short-term requests and strengthened thematic bridges between programmes and departments

    Did not succeed in identifying larger trends or in developing thematic briefings that stakeholders would have been interested in

    Österreichische Forschungsförderungsgesellschaft | Sensengasse 1 | 1090 Wien | www.ffg.at

  • SENTIMENT ANALYSIS


  • MACHINE LEARNING

    DeepMind's AlphaGo computer program has recently beaten the human world champions of Go


  • TEXT MINING


  • Text Mining

  • DEFINITION OF TEXT MINING

    The overarching goal is to turn text into data for analysis, via application of natural language processing (NLP) and analytical methods.

    Text analysis involves information retrieval, lexical analysis to study word frequency distributions, pattern recognition, tagging/annotation, information extraction, data mining techniques including link and association analysis, visualization, and predictive analytics. (Source: Wikipedia)
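
As a minimal illustration of "turning text into data", the sketch below counts word frequencies in a short sample sentence. The stop-word list, the tokenisation rule and the sample text are assumptions made for this example, not part of the FFG setup.

```python
import re
from collections import Counter

# Hypothetical mini stop-word list; a real one would be much longer.
STOP_WORDS = {"the", "a", "of", "and", "for", "in", "to", "is"}

def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-zäöüß]+", text.lower())

def word_frequencies(text):
    """Count how often each non-stop-word token occurs."""
    return Counter(t for t in tokenize(text) if t not in STOP_WORDS)

sample = "Text mining turns text into data. Data mining finds patterns in data."
freq = word_frequencies(sample)
print(freq.most_common(3))
```

A frequency table of this kind is also the raw material for a word cloud.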


  • WORD CLOUD OF MPG ACTIVITY REPORT


  • K-Means Clustering


  • FFG DATA

    20,000 project applications

    12,750 abstracts selected (German only; specific programmes excluded)

    Analysed with unsupervised k-means clustering, aiming at 15 clusters

    Clusters contain between 238 and 2486 projects
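
The idea of clustering abstracts with k-means can be sketched in plain Python as follows. This is a toy under stated assumptions, not the FFG pipeline: the TF-IDF weighting, the deterministic farthest-first initialisation and the four pre-stemmed "abstracts" are all chosen for illustration only, and the real analysis used k=15 on 12,750 abstracts.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Turn tokenized documents into length-normalised TF-IDF vectors."""
    vocab = sorted({t for d in docs for t in d})
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))
    idf = {t: math.log(n / df[t]) + 1.0 for t in vocab}
    vectors = []
    for d in docs:
        tf = Counter(d)
        vec = [tf[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vocab, vectors

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def farthest_first(vectors, k):
    """Deterministic initialisation (an assumption): start with the first
    vector, then repeatedly add the vector farthest from all centroids."""
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        nxt = max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids

def kmeans(vectors, k, iterations=20):
    """Lloyd's algorithm: assign to nearest centroid, then recompute means."""
    centroids = farthest_first(vectors, k)
    labels = [0] * len(vectors)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda c: sq_dist(v, centroids[c]))
                  for v in vectors]
        for c in range(k):
            members = [v for v, l in zip(vectors, labels) if l == c]
            if members:  # keep the old centroid if a cluster runs empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Toy "abstracts", already tokenized and stemmed.
docs = [
    ["softwar", "app", "plattform"],
    ["softwar", "mobil", "app"],
    ["patient", "klinisch", "studi"],
    ["patient", "medizin", "studi"],
]
vocab, vectors = tfidf_vectors(docs)
labels, _ = kmeans(vectors, k=2)
print(labels)  # the two software and the two medical documents are grouped
```

In practice a library implementation would normally be preferred; the hand-written version only makes the two steps (vectorise, then cluster) visible.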


  • Clustering vs SIC codes (Standard Industrial Classification codes)

  • K-Means Clustering (k=15)

    Cluster sizes (number of projects): 2486, 2222, 1344, 1027, 855, 831, 750, 676, 486, 451, 450, 322, 319, 289, 238

    Clusters contain between 238 and 2486 projects

  • INDIVIDUAL CLUSTER

    Word stems shown for this cluster (1344 projects): kund, softwar, mobil, plattform, servic, produkt, automat, digital, app, technisch, prototyp, onlin, markt

  • INDIVIDUAL CLUSTER

    Word stems shown for this cluster (450 projects): patient, klinisch, medizin, studi, behandl, erkrank, human, zell, wirksam, antikorp, wirkstoff, substanz, therapeut
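
The slides do not say how the word stems for each cluster were selected. One simple possibility, shown here purely as an assumption, is to rank stems by how much more often they occur inside a cluster than outside it. The function name, the scoring rule and the toy documents are hypothetical.

```python
from collections import Counter

def top_terms(docs, labels, cluster, n=3):
    """Rank stems by in-cluster document share minus share in other clusters."""
    inside = [d for d, l in zip(docs, labels) if l == cluster]
    outside = [d for d, l in zip(docs, labels) if l != cluster]
    df_in = Counter(t for d in inside for t in set(d))
    df_out = Counter(t for d in outside for t in set(d))

    def score(t):
        share_out = df_out[t] / len(outside) if outside else 0.0
        return df_in[t] / len(inside) - share_out

    # Sort by descending score, breaking ties alphabetically.
    return sorted(df_in, key=lambda t: (-score(t), t))[:n]

# Toy stemmed documents with hand-assigned cluster labels.
docs = [
    ["softwar", "app", "plattform", "markt"],
    ["softwar", "mobil", "app", "markt"],
    ["patient", "klinisch", "studi", "markt"],
    ["patient", "medizin", "studi"],
]
labels = [0, 0, 1, 1]
print(top_terms(docs, labels, 0))
print(top_terms(docs, labels, 1))
```

Subtracting the share outside the cluster pushes down stems such as "markt" that occur everywhere and therefore say little about any single cluster.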

  • Clustering result vs SIC Codes
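
A comparison of this kind can be set up as a cross-tabulation of cluster labels against industry codes. The sketch below uses invented cluster ids and placeholder codes ("A", "B", "C"), not real SIC codes or FFG data.

```python
from collections import Counter

def crosstab(cluster_labels, sic_codes):
    """Count projects for every (cluster, SIC code) combination."""
    return Counter(zip(cluster_labels, sic_codes))

# Invented toy data: one cluster id and one placeholder code per project.
clusters = [0, 0, 0, 1, 1]
sic = ["A", "A", "B", "C", "C"]
table = crosstab(clusters, sic)
for (cluster, code), count in sorted(table.items()):
    print(cluster, code, count)
```

A cluster whose projects spread over many codes would point to a theme that cuts across the industry classification.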


  • Identify trends over time

  • 1,279 PROJECTS IN 2012

    Figure: cluster map of the 2012 projects. Cluster labels, as far as recoverable from the slide:

    software / application / platform; web / system / mobil; production / construction; component / wood / plastic; material / steel / ceramic; raw material / biomass; processing / building; surface / purification; packaging / resilient; substrate / bacteria; medicine / patient; vaccine / disease; laser / visual; sensor / calculate; media / social; building / smart grid / energy; facility / heat pump / electricity; led / durability / battery; score / education; woman / needs; child / youth; pupil / teaching; engine / gas; vehicle / hybrid

  • 1,382 PROJECTS IN 2013

    Figure: cluster map of the 2013 projects. Cluster labels, as far as recoverable from the slide:

    sensor / engine / automatic / control; service / platform / online; waste / processing; alloy / copper; surface / lamination / diagnostic; cell / protein; patient / clinical; mobil / software; smart grid / energy; facility / photovoltaic; child / youth; women / men / social; vehicle / driving; traffic / multimodal; big data / country; IEA / tool; optical / led; visualization / real-time; component / manufacturing; raw material / chemical; emission / fuel / biomass; material / steel / consolidation; solar / concept; hybrid / adaptive

  • Self-Organising Maps (SOM)

  • SOM Example (Scatter View)


  • SOM Example (Density View)


  • Next steps

  • NEXT STEPS

    Work on visualisation (heatmaps etc.)

    Start working on supervised learning using ontologies (e.g. which projects contain aspects of digitisation?)

    Analyse individual sectors (e.g. what are the research trends in the forest industry?)

    Scale up and integrate it into routine data analysis
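
One way to approach the ontology question ("which projects contain aspects of digitisation?") is to match project stems against concept vocabularies. The sketch below is only an assumption about how such tagging might start: the mini-ontology, the `min_hits` threshold and the project data are invented. Tags produced this way could later serve as rough training labels for a supervised classifier.

```python
# Hypothetical mini-ontology: concept -> word stems that indicate it.
ONTOLOGY = {
    "digitisation": {"digital", "softwar", "app", "plattform", "onlin"},
    "health": {"patient", "klinisch", "medizin", "therapeut"},
}

def tag_concepts(tokens, ontology=ONTOLOGY, min_hits=2):
    """Return the concepts for which at least `min_hits` distinct stems occur."""
    present = set(tokens)
    return sorted(c for c, stems in ontology.items()
                  if len(present & stems) >= min_hits)

# Invented project abstracts, already tokenized and stemmed.
projects = {
    "P1": ["softwar", "app", "kund", "markt"],
    "P2": ["patient", "klinisch", "studi"],
    "P3": ["patient", "app", "digital", "medizin"],
}
tags = {pid: tag_concepts(tokens) for pid, tokens in projects.items()}
print(tags)
```

Requiring two distinct stems per concept is a crude guard against single-word false positives; the right threshold would have to be checked against manually reviewed projects.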


  • FEEDBACK AND DISCUSSION

    Comments?

    Questions?

    Suggestions?

    Thank you!
