exploring automated patent search with...

29
Exploring Automated Patent Search with KNIME Possibilities, Limits, Future Alexander Klenner-Bajaja, PhD [email protected]

Upload: others

Post on 09-Jun-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Exploring Automated Patent Search with KNIMEfiles.meetup.com/19483337/KNIME_Den_Haag_Alexander... · 2016-03-14 · Exploring Automated Patent Search with KNIME Possibilities, Limits,

Exploring Automated Patent

Search with KNIME Possibilities, Limits, Future

Alexander Klenner-Bajaja, PhD

[email protected]

Page 2: Exploring Automated Patent Search with KNIMEfiles.meetup.com/19483337/KNIME_Den_Haag_Alexander... · 2016-03-14 · Exploring Automated Patent Search with KNIME Possibilities, Limits,

European Patent Office

2

Offices: Berlin, Vienna, Munich, The Hague (Rijswijk), Brussels

Staff: over 7000

Mission: Granting European Patents

New building in Rijswijk,

finished approx. 2018

Page 3: Exploring Automated Patent Search with KNIMEfiles.meetup.com/19483337/KNIME_Den_Haag_Alexander... · 2016-03-14 · Exploring Automated Patent Search with KNIME Possibilities, Limits,

Why Search – European Patent Convention

3

Information Management`s

Task: Support Search

Page 4: Exploring Automated Patent Search with KNIMEfiles.meetup.com/19483337/KNIME_Den_Haag_Alexander... · 2016-03-14 · Exploring Automated Patent Search with KNIME Possibilities, Limits,

Patent Search – Filing numbers increase

4

2005 -----------------------------------------------------------2015

Page 5: Exploring Automated Patent Search with KNIMEfiles.meetup.com/19483337/KNIME_Den_Haag_Alexander... · 2016-03-14 · Exploring Automated Patent Search with KNIME Possibilities, Limits,

Introduction – What do we want, where are we?

5

Application Query Formulation

PriorArt

Search

relevant documents

Page 6: Exploring Automated Patent Search with KNIMEfiles.meetup.com/19483337/KNIME_Den_Haag_Alexander... · 2016-03-14 · Exploring Automated Patent Search with KNIME Possibilities, Limits,

MetaData

Neo4J

The EPO`s Tool landscape

6

Translation

API Boolean Search

Engine

Lucene Search

Engine

Fulltext

MongoDB GoldStandard

SQL

Search

Evaluation

ImageDB

Concept

Extraction

Page 7: Exploring Automated Patent Search with KNIMEfiles.meetup.com/19483337/KNIME_Den_Haag_Alexander... · 2016-03-14 · Exploring Automated Patent Search with KNIME Possibilities, Limits,

MetaData

Neo4J

The EPO`s Tool landscape

7

Translation

API Boolean Search

Engine

Lucene Search

Engine

Fulltext

MongoDB

GoldStandard

SQL

Search

Evaluation

ImageDB

Concept

Extraction

Page 8: Exploring Automated Patent Search with KNIMEfiles.meetup.com/19483337/KNIME_Den_Haag_Alexander... · 2016-03-14 · Exploring Automated Patent Search with KNIME Possibilities, Limits,

MetaData

Neo4J

The EPO`s Tool landscape

8

Translation

API Boolean Search

Engine

Lucene Search

Engine

Fulltext

MongoDB

GoldStandard

SQL

Search

Evaluation

ImageDB

Concept

Extraction

Page 9: Exploring Automated Patent Search with KNIMEfiles.meetup.com/19483337/KNIME_Den_Haag_Alexander... · 2016-03-14 · Exploring Automated Patent Search with KNIME Possibilities, Limits,

MetaData

Neo4J

The EPO`s Tool landscape

9

Translation

API Boolean Search

Engine

Lucene Search

Engine

Fulltext

MongoDB

GoldStandard

SQL

Search

Evaluation

ImageDB

Concept

Extraction

Page 10: Exploring Automated Patent Search with KNIMEfiles.meetup.com/19483337/KNIME_Den_Haag_Alexander... · 2016-03-14 · Exploring Automated Patent Search with KNIME Possibilities, Limits,

One platform to allow rapid prototyping and

evaluation

10

Page 11: Exploring Automated Patent Search with KNIMEfiles.meetup.com/19483337/KNIME_Den_Haag_Alexander... · 2016-03-14 · Exploring Automated Patent Search with KNIME Possibilities, Limits,

The current Search System

A Lucene elastic search based system, documents are returned as

ranked lists

11

Search

Spacek

Lucene query1

Page 12: Exploring Automated Patent Search with KNIMEfiles.meetup.com/19483337/KNIME_Den_Haag_Alexander... · 2016-03-14 · Exploring Automated Patent Search with KNIME Possibilities, Limits,

Patent Gold Standards

We have “manually” curated search reports for about 40 million simple

patent families

The relevant documents are mentioned in the search report as either

–X(I,N),A,Y,... documents

12

median: 5 citations

in search reports

Page 13: Exploring Automated Patent Search with KNIMEfiles.meetup.com/19483337/KNIME_Den_Haag_Alexander... · 2016-03-14 · Exploring Automated Patent Search with KNIME Possibilities, Limits,

Setting up a benchmarking environment

We need to move away from anecdotal evidence to statistically

meaningful facts

TAPAS

13

SEARCH

INDEX

Applications

Method 1 Method 2

MAP:0.4 MAP:0.2

Patent Corpus

* Exploiting real queries

Evaluate!

Page 14: Exploring Automated Patent Search with KNIMEfiles.meetup.com/19483337/KNIME_Den_Haag_Alexander... · 2016-03-14 · Exploring Automated Patent Search with KNIME Possibilities, Limits,

Using KNIME to enhance automated search

14

Application

Query

Formulation

PriorArt

Search

relevant documents

Evaluation and Feedback

Page 15: Exploring Automated Patent Search with KNIMEfiles.meetup.com/19483337/KNIME_Den_Haag_Alexander... · 2016-03-14 · Exploring Automated Patent Search with KNIME Possibilities, Limits,

Use Case: Does Translation help to find relevant

Prior Art?

15

Page 16: Exploring Automated Patent Search with KNIMEfiles.meetup.com/19483337/KNIME_Den_Haag_Alexander... · 2016-03-14 · Exploring Automated Patent Search with KNIME Possibilities, Limits,

Evaluating the results – Does Translation help to

find relevant Prior Art?

16

RelevantExpected

Combinedresults

TranslatedQueries

OriginalQueries

%-relevant retrieved 100 36,92307692 27,69230769 11,53846154

0

10

20

30

40

50

60

70

80

90

100

%-r

ele

va

nt

retr

ieve

d%-relevant retrieved

Page 17: Exploring Automated Patent Search with KNIMEfiles.meetup.com/19483337/KNIME_Den_Haag_Alexander... · 2016-03-14 · Exploring Automated Patent Search with KNIME Possibilities, Limits,

Patents have multi-modal information content: Images

Images

– Chemical Formulas

– Flow Diagrams

– Circuits

– Technical Drawings

17

Page 18: Exploring Automated Patent Search with KNIMEfiles.meetup.com/19483337/KNIME_Den_Haag_Alexander... · 2016-03-14 · Exploring Automated Patent Search with KNIME Possibilities, Limits,

Image Search the Google way

Standard Google image search (very strong on real world images)

is currently not suited for technical drawings

Extremely complicated endeavour

18

Google

results

Page 19: Exploring Automated Patent Search with KNIMEfiles.meetup.com/19483337/KNIME_Den_Haag_Alexander... · 2016-03-14 · Exploring Automated Patent Search with KNIME Possibilities, Limits,

Use Case: Image Search

19

Search

Space

Query

State of the art

Image processing

Filtering and

Visualisation

Page 20: Exploring Automated Patent Search with KNIMEfiles.meetup.com/19483337/KNIME_Den_Haag_Alexander... · 2016-03-14 · Exploring Automated Patent Search with KNIME Possibilities, Limits,

Image Search Adaptive Hierarchical Density Histogram (AHDH)

closest matchAHDH

20

New

results

Exploit distribution of image points

Page 21: Exploring Automated Patent Search with KNIMEfiles.meetup.com/19483337/KNIME_Den_Haag_Alexander... · 2016-03-14 · Exploring Automated Patent Search with KNIME Possibilities, Limits,

Can we optimize our Search Tool Parameters?

21

Page 22: Exploring Automated Patent Search with KNIMEfiles.meetup.com/19483337/KNIME_Den_Haag_Alexander... · 2016-03-14 · Exploring Automated Patent Search with KNIME Possibilities, Limits,

Use Case: Parameter Optimisation

22

Swarm Optimization

Page 23: Exploring Automated Patent Search with KNIMEfiles.meetup.com/19483337/KNIME_Den_Haag_Alexander... · 2016-03-14 · Exploring Automated Patent Search with KNIME Possibilities, Limits,

Can we optimize our Search Tool Parameters?

23

Start

End

Page 24: Exploring Automated Patent Search with KNIMEfiles.meetup.com/19483337/KNIME_Den_Haag_Alexander... · 2016-03-14 · Exploring Automated Patent Search with KNIME Possibilities, Limits,

What are the possibilities?

We were able to create a system where all components sharing the

same (table based) interface.

We make full use of many existing components and combine them

successfully with internal and external additional developments

– Text mining

– Image Similarity

– Machine Learning based document similarity

– Influence of translation to patent search

Rapid Prototyping

Evaluation and Feedback

24

Page 25: Exploring Automated Patent Search with KNIMEfiles.meetup.com/19483337/KNIME_Den_Haag_Alexander... · 2016-03-14 · Exploring Automated Patent Search with KNIME Possibilities, Limits,

What are the limits?

Very practical issues like memory freezes of the eclipse Environment

Sometimes extreme over head introduction e.g. text mining nodes

(String to Document)

Not suitable from our experience for full corpus analytics (100 million

full text documents and more)

Getting lost in “yellow” nodes (ungroup, pivot, regroup, filter, missing

value,...)

25

Page 26: Exploring Automated Patent Search with KNIMEfiles.meetup.com/19483337/KNIME_Den_Haag_Alexander... · 2016-03-14 · Exploring Automated Patent Search with KNIME Possibilities, Limits,

Future

Exposing stable services as web service through the KNIME server

web-interface

Exposing workflow creation to a wider range of user (right now its IM

only)

Connecting more services

Using of streaming to overcome some of the limits

26

Page 27: Exploring Automated Patent Search with KNIMEfiles.meetup.com/19483337/KNIME_Den_Haag_Alexander... · 2016-03-14 · Exploring Automated Patent Search with KNIME Possibilities, Limits,

27

Thank you

for your attention

[email protected]

Page 28: Exploring Automated Patent Search with KNIMEfiles.meetup.com/19483337/KNIME_Den_Haag_Alexander... · 2016-03-14 · Exploring Automated Patent Search with KNIME Possibilities, Limits,

28

Page 29: Exploring Automated Patent Search with KNIMEfiles.meetup.com/19483337/KNIME_Den_Haag_Alexander... · 2016-03-14 · Exploring Automated Patent Search with KNIME Possibilities, Limits,

Supplement

29