Text Mining - TAFTIE · 2017-09-18
TRANSCRIPT
-
A new approach in FFG to achieve
deeper insights into project data
TAFTIE Expert Session
Sept 14, 2017
Harald Hochreiter, Daniel Vilsecker
Text Mining
-
INNOVATION MAPPING IN FFG
FFG tried to map innovation trends from the projects we fund
Thematic teams were created that met every 1-2 months to discuss projects and exchange experience (e.g. mobility, services)
This worked well for short-term requests and strengthened thematic bridges between programmes and departments
It did not succeed in identifying larger trends or in developing thematic briefings that stakeholders would have been interested in
Österreichische Forschungsförderungsgesellschaft | Sensengasse 1 | 1090 Wien | www.ffg.at
-
SENTIMENT ANALYSIS
-
MACHINE LEARNING
DeepMind's AlphaGo computer program has recently beaten the human world champions of Go
-
TEXT MINING
-
Text Mining
-
DEFINITION OF TEXT MINING
The overarching goal is to turn text into data for analysis,
via application of natural language processing (NLP) and
analytical methods.
Text analysis involves information retrieval, lexical analysis
to study word frequency distributions, pattern recognition,
tagging/annotation, information extraction, data mining
techniques including link and association analysis,
visualization, and predictive analytics. (Source: Wikipedia)
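The "lexical analysis to study word frequency distributions" step can be illustrated with a minimal sketch. The stop-word list and the two example abstracts are made up for illustration; they are not FFG data, and the slides do not say which tooling was used.

```python
import re
from collections import Counter

# Hypothetical stop-word list, kept tiny for illustration.
STOP_WORDS = {"the", "a", "of", "and", "to", "in", "for", "is"}

def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-zäöüß]+", text.lower())

def word_frequencies(documents):
    """Count how often each non-stop-word token occurs across all documents."""
    counts = Counter()
    for doc in documents:
        counts.update(tok for tok in tokenize(doc) if tok not in STOP_WORDS)
    return counts

# Made-up example abstracts, not FFG data.
docs = [
    "A mobile platform for the analysis of sensor data",
    "Sensor data for the control of a heat pump",
]
freq = word_frequencies(docs)
print(freq.most_common(3))
```

A frequency table like this is also the usual starting point for a word cloud, where font size follows the count.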
-
WORD CLOUD OF MPG ACTIVITY REPORT
-
K-Means Clustering
-
FFG DATA
20,000 project applications
12,750 abstracts selected (German only; specific programmes excluded)
Analysed with unsupervised k-means clustering, aiming at 15 clusters
Clusters contain between 238 and 2486 projects
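The pipeline described above (abstracts turned into vectors, then grouped by unsupervised k-means) can be sketched as follows. The slides only state that k-means with 15 clusters was used; the TF-IDF weighting, the deterministic farthest-point seeding (used here instead of the usual random seeding so the result is reproducible) and the four mini abstracts are assumptions for illustration. In practice a library implementation would likely replace the hand-written loops.

```python
import math
import re
from collections import Counter

def tfidf_vectors(documents):
    """Return (vocabulary, unit-length TF-IDF vectors) for a list of texts."""
    tokenized = [re.findall(r"\w+", doc.lower()) for doc in documents]
    vocab = sorted({tok for doc in tokenized for tok in doc})
    doc_freq = Counter(tok for doc in tokenized for tok in set(doc))
    vectors = []
    for doc in tokenized:
        tf = Counter(doc)
        vec = [tf[t] * math.log(len(documents) / doc_freq[t]) for t in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vocab, vectors

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def farthest_point_init(vectors, k):
    """Deterministic seeding: take vector 0, then repeatedly add the vector
    that is farthest from every centroid chosen so far."""
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        farthest = max(
            vectors,
            key=lambda v: min(squared_distance(v, c) for c in centroids),
        )
        centroids.append(list(farthest))
    return centroids

def kmeans(vectors, k, max_iter=100):
    """Lloyd's algorithm: alternate nearest-centroid assignment and mean update."""
    centroids = farthest_point_init(vectors, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:  # assignments stable, so the run has converged
            break
        labels = new_labels
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Made-up mini abstracts (not FFG data); the slides report k=15.
abstracts = [
    "software platform mobile app",
    "mobile app software service",
    "patient clinical therapy study",
    "clinical study patient antibody",
]
vocab, vectors = tfidf_vectors(abstracts)
labels, centroids = kmeans(vectors, k=2)
print(labels)  # [0, 0, 1, 1]
```

Counting how often each label occurs gives the cluster sizes; for the FFG data these ranged from 238 to 2486 projects.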
-
Clustering vs SIC codes (Standard Industrial Classification codes)
-
K-Means Clustering (k=15)
Cluster sizes (projects per cluster): 2486, 2222, 1344, 1027, 855, 831, 750, 676, 486, 451, 450, 322, 319, 289, 238
Clusters contain between 238 and 2486 projects
-
INDIVIDUAL CLUSTER
Top terms (stemmed): kund, softwar, mobil, plattform, servic, produkt, automat, digital, app, technisch, prototyp, onlin, markt
Cluster size: 1344 projects
-
INDIVIDUAL CLUSTER
Top terms (stemmed): patient, klinisch, medizin, studi, behandl, erkrank, human, zell, wirksam, antikorp, wirkstoff, substanz, therapeut
Cluster size: 450 projects
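The slides do not say how the terms shown for each cluster were chosen. One common approach, assumed here, is to list the vocabulary terms with the highest weight in the cluster centroid. The vocabulary and centroid values below are hypothetical.

```python
def top_terms(vocab, centroid, n=5):
    """Return up to n vocabulary terms with the largest positive centroid weight."""
    ranked = sorted(zip(vocab, centroid), key=lambda pair: pair[1], reverse=True)
    return [term for term, weight in ranked[:n] if weight > 0]

# Hypothetical stemmed vocabulary and centroid weights, for illustration only.
vocab = ["app", "kund", "patient", "softwar", "zell"]
centroid = [0.21, 0.48, 0.0, 0.35, 0.0]
print(top_terms(vocab, centroid, n=3))  # ['kund', 'softwar', 'app']
```

The truncated word forms on the slides (kund, softwar, servic) suggest that the German abstracts were stemmed before vectorisation, so that inflected forms of a word count as one term.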
-
Clustering result vs SIC Codes
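A comparison of cluster membership with SIC codes can be sketched as a cross-tabulation. The labels below are placeholders ("A", "B", "C" are not real SIC codes), and the purity score is one possible summary measure, not necessarily the one used on the slide.

```python
from collections import Counter, defaultdict

def crosstab(cluster_labels, sic_codes):
    """Count projects for every (cluster, SIC code) pair."""
    table = defaultdict(Counter)
    for cluster, sic in zip(cluster_labels, sic_codes):
        table[cluster][sic] += 1
    return table

def purity(table):
    """Share of projects whose SIC code is the most common one in their cluster."""
    total = sum(sum(row.values()) for row in table.values())
    dominant = sum(max(row.values()) for row in table.values())
    return dominant / total

# Hypothetical labels for six projects; "A", "B", "C" stand in for SIC codes.
clusters = [0, 0, 0, 1, 1, 1]
sic = ["A", "A", "B", "C", "C", "C"]
table = crosstab(clusters, sic)
print(dict(table[0]))  # {'A': 2, 'B': 1}
print(purity(table))
```

A purity close to 1 would mean the text clusters largely reproduce the industry classification; a low value would mean the clusters cut across it.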
-
Identify trends over time
-
1279 PROJECTS IN 2012
Cluster labels (from the figure):
software / application / platform
web / system / mobil
production / construction
component / wood / plastic
material / steel / ceramic
raw material / biomass processing / building
surface / purification
packaging / resilient
substrate / bacteria
medicine / patient
vaccine / disease
laser / visual sensor / calculate
media / social
building / smart grid / energy
facility / heat pump / electricity
led / durability / battery
score / education
woman / needs
child / youth
pupil / teaching
engine / gas vehicle / hybrid
-
1382 PROJECTS IN 2013
Cluster labels (from the figure):
sensor / engine / automatic / control
service / platform / online
waste / processing
alloy / copper
surface / lamination / diagnostic
cell / protein
patient / clinical
mobil / software
smart grid / energy
facility / photovoltaic
child / youth
women / men / social
vehicle / driving
traffic / multimodal
big data / country
IEA / tool
optical / led
visualization / real-time
component / manufacturing
raw material / chemical
emission / fuel / biomass
material / steel / consolidation
solar / concept
hybrid / adaptive
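The slides compare cluster labels from one year to the next. A simpler, complementary way to spot shifts, sketched here as an assumption rather than the method behind the figures, is to compare the share of abstracts mentioning each term in two years. The abstracts are made up.

```python
from collections import Counter

def term_shares(abstracts):
    """Fraction of abstracts in which each term appears at least once."""
    counts = Counter()
    for text in abstracts:
        counts.update(set(text.lower().split()))
    return {term: n / len(abstracts) for term, n in counts.items()}

def share_changes(earlier, later):
    """Change in document share per term between two years, largest rise first."""
    before, after = term_shares(earlier), term_shares(later)
    terms = set(before) | set(after)
    changes = {t: after.get(t, 0.0) - before.get(t, 0.0) for t in terms}
    return sorted(changes.items(), key=lambda item: item[1], reverse=True)

# Made-up abstracts standing in for two funding years (not FFG data).
year_a = ["laser sensor", "heat pump", "sensor battery", "smart grid"]
year_b = ["big data platform", "sensor control", "big data tool", "smart grid"]
changes = dict(share_changes(year_a, year_b))
print(changes["big"], changes["sensor"], changes["smart"])  # 0.5 -0.25 0.0
```

Using shares rather than raw counts keeps years with different project totals (1279 in 2012, 1382 in 2013) comparable.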
-
Self Organising Maps (SOM)
-
SOM Example (Scatter View)
-
SOM Example (Density View)
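A self-organising map places similar inputs on nearby nodes of a 2-D grid. The sketch below is a minimal implementation under stated assumptions: a rectangular grid, a Gaussian neighbourhood, and linearly decaying learning rate and radius. It also assumes that a scatter view plots each sample at its best-matching node and that a density view counts samples per node. The input points are made up; real input would be document vectors.

```python
import math
import random

def best_matching_unit(weights, x):
    """Grid position of the node whose weight vector is closest to x."""
    return min(
        weights,
        key=lambda node: sum((w - v) ** 2 for w, v in zip(weights[node], x)),
    )

def train_som(data, rows, cols, epochs=100, lr0=0.5, seed=0):
    """Train a rows x cols map on a list of equal-length numeric vectors."""
    rng = random.Random(seed)
    dim = len(data[0])
    # One weight vector per grid node, initialised with random values in [0, 1).
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    radius0 = max(rows, cols) / 2
    for epoch in range(epochs):
        decay = 1 - epoch / epochs  # shrinks linearly from 1 towards 0
        lr = lr0 * decay
        radius = max(radius0 * decay, 0.5)
        for x in data:
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                grid_d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                influence = math.exp(-grid_d2 / (2 * radius ** 2))
                # Pull the node towards the sample, more strongly near the BMU.
                for i in range(dim):
                    w[i] += lr * influence * (x[i] - w[i])
    return weights

def hit_counts(weights, data):
    """Number of samples mapped to each node (the basis of a density view)."""
    hits = {node: 0 for node in weights}
    for x in data:
        hits[best_matching_unit(weights, x)] += 1
    return hits

# Made-up 2-D points scaled to [0, 1].
data = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]]
som = train_som(data, rows=3, cols=3)
positions = [best_matching_unit(som, x) for x in data]  # scatter view
hits = hit_counts(som, data)                            # density view
print(positions)
print(hits)
```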
-
Next steps
-
NEXT STEPS
Work on visualisation (heatmaps etc.)
Start working on supervised learning using ontologies (e.g. which projects contain aspects of digitisation?)
Analyse individual sectors (e.g. what are the research trends in the forest industry?)
Scale up and integrate the approach into routine data analysis
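The question "which projects contain aspects of digitisation?" can be approached by tagging abstracts with ontology concepts; such tags could then serve as training labels for a supervised classifier. The sketch below shows only the tagging step. The mini-ontology and its indicator terms are hypothetical and far smaller than a real ontology would be.

```python
import re

# Hypothetical mini-ontology: concept -> indicator terms (illustrative only).
ONTOLOGY = {
    "digitisation": {"software", "app", "platform", "digital", "online"},
    "energy": {"photovoltaic", "battery", "grid"},
}

def tag_concepts(abstract, ontology=ONTOLOGY):
    """Return the sorted concepts that have at least one indicator term in the abstract."""
    tokens = set(re.findall(r"\w+", abstract.lower()))
    return sorted(c for c, terms in ontology.items() if tokens & terms)

print(tag_concepts("An online platform for battery monitoring"))
# ['digitisation', 'energy']
print(tag_concepts("Wood processing"))
# []
```

Exact keyword matching misses inflected forms and synonyms, which is one reason to train a classifier on the tagged examples rather than rely on the term lists alone.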
-
FEEDBACK AND DISCUSSION
Comments?
Questions?
Suggestions?
Thank you!