termset metadata tagging presentation - taxonomy bootcamp london 2016

Post on 15-Apr-2017

TRANSCRIPT

TAGGING DOCUMENTS MADE EASY, USING MACHINE LEARNING

Brendan Clarke

brendan@termset.com

www.termSet.com

BRENDAN CLARKE

• A Microsoft ECM expert

• Co-Founded TermSet three years ago

• Got the scars from real world IA projects

AGENDA

• Creating taxonomies: 7

• NLP: 3

• Demo: 10

• Tagging: 10

• Demo: 10

PART ONE – APPROACHES FOR BUILDING TAXONOMIES

TOP DOWN – APPROACH

• Defines top-level containers and works downwards

• Usually broad (3-10 wide) and shallow (3-4 deep)

• Simple, high level classification (functional)

TOP DOWN – TERMS

• Manually defined or replicated from existing structures

• Imported from other systems

• Industry standards / purchased taxonomies

TOP DOWN – SUMMARY

• People / Committee Driven approach

• Some guesswork of what terms should be

• Simple, high level classification (functional) – Way better than folders!

BOTTOM UP – APPROACH

• Terms driven by the words and phrases within your content

• More complex taxonomies

• Detailed, accurate terms that are subject or facet level

BOTTOM UP – TERMS

• Manual analysis of the documents

• Statistical analysis of terms and phrases

• Natural Language processing
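
As a rough illustration of the "statistical analysis of terms and phrases" bullet, the sketch below counts how many documents each word or two-word phrase appears in and keeps the ones shared by several documents as candidate terms. It is not TermSet's method. The stop-word list, the function names and the `min_docs` threshold are all assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical stop-word list, kept tiny for illustration only.
STOP_WORDS = {"a", "an", "and", "the", "of", "for", "to", "in", "is", "on", "with"}


def tokenize(text):
    """Lower-case the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def candidate_terms(documents, max_words=2, min_docs=2):
    """Return (term, document count) pairs for words and short phrases.

    A phrase is skipped if it starts or ends with a stop word, and a term
    is kept only if it occurs in at least `min_docs` documents.
    """
    doc_freq = Counter()
    for text in documents:
        tokens = tokenize(text)
        seen = set()  # count each term at most once per document
        for n in range(1, max_words + 1):
            for i in range(len(tokens) - n + 1):
                gram = tokens[i:i + n]
                if gram[0] in STOP_WORDS or gram[-1] in STOP_WORDS:
                    continue
                seen.add(" ".join(gram))
        doc_freq.update(seen)
    return [(term, count) for term, count in doc_freq.most_common() if count >= min_docs]


docs = [
    "Health and safety policy for the London office",
    "Updated health and safety policy",
    "Invoice for office supplies",
]
terms = dict(candidate_terms(docs))
print(terms)
```

Here "safety policy" and "office" each appear in two documents and survive, while "london" appears in only one and is dropped. A real corpus would need a fuller stop-word list and a human review of the candidates.
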

BOTTOM UP – SUMMARY

• Technology driven approach (or a very tough people process)

• Produces detailed taxonomies that reflect the actual content

• Extra granularity of tagging

AND THE WINNER IS…

• Combining top down and bottom up is the best approach

• Top down classifies the type of documents

• Bottom up classifies the subject of the document

• New technology allows bottom up to be realistic

TermSet adds accurate consistent metadata without placing any burden on end users or your IT team.

• Builds taxonomies (bottom up) using NLP

• Applies tags

• Metadata as a service™

WHAT EXACTLY IS NLP ?

DEMO – CREATING TERMS FROM YOUR DOCUMENTS USING NLP

PART TWO – APPLYING YOUR TAGS

MANUAL TAGGING

• Adoption problem

• Asbestos problem / GIGO

• Challenging to do retrospectively (migration tools can help)

MANUAL TAGGING

• Infer as many terms as possible from: document types, location, function

• Mandate as few tags as possible

• Stay shallow or flat with hierarchies
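
One way to read "infer as many terms as possible" is to derive tags from where a file lives and what kind of file it is, so users are not asked for them. The sketch below is only an illustration: the two lookup tables, the `infer_tags` function and the example path are hypothetical and would be replaced by your own site structure.

```python
from pathlib import PurePosixPath

# Hypothetical lookup tables; a real deployment would supply its own.
FOLDER_TO_FUNCTION = {"hr": "Human Resources", "finance": "Finance", "legal": "Legal"}
EXTENSION_TO_TYPE = {".docx": "Document", ".xlsx": "Spreadsheet", ".pptx": "Presentation"}


def infer_tags(path):
    """Infer Function and Document type tags from a document's path."""
    p = PurePosixPath(path.lower())
    tags = {}
    # Location: the first folder that maps to a known business function wins.
    for part in p.parts[:-1]:
        if part in FOLDER_TO_FUNCTION:
            tags["Function"] = FOLDER_TO_FUNCTION[part]
            break
    # Document type: taken from the file extension when it is recognised.
    doc_type = EXTENSION_TO_TYPE.get(p.suffix)
    if doc_type:
        tags["Document type"] = doc_type
    return tags


inferred = infer_tags("/sites/intranet/HR/Policies/leave-policy.docx")
print(inferred)
```

Anything inferred this way does not need to be a mandatory field, which keeps the number of tags users must enter small.
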

MACHINE TAGGING

• Simple machine tagging can use search to match taxonomy terms to the content of documents

• More advanced taggers allow rules or weights to be assigned to each tag (tags not context aware)

• New technologies (NLP) provide a new approach to creating taxonomies
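
The first two bullets can be sketched in a few lines: search each document for taxonomy phrases, add up a weight per match, and apply a tag once its score reaches a threshold. The taxonomy, the weights, the threshold and the `score_tags` function below are invented for illustration and do not describe any particular product.

```python
import re

# Hypothetical taxonomy: each tag lists phrases to search for, with a weight.
TAXONOMY = {
    "Health and Safety": {"health and safety": 3.0, "risk assessment": 2.0, "hazard": 1.0},
    "Procurement": {"purchase order": 3.0, "supplier": 1.0, "invoice": 1.0},
}


def score_tags(text, taxonomy=TAXONOMY, threshold=3.0):
    """Return tags whose weighted phrase matches reach the threshold."""
    # Normalise to lower-case words separated by single spaces.
    normalized = " ".join(re.findall(r"[a-z]+", text.lower()))
    scores = {}
    for tag, phrases in taxonomy.items():
        score = 0.0
        for phrase, weight in phrases.items():
            # Whole-word matches only, so "hazard" does not match "hazardous".
            hits = re.findall(rf"\b{re.escape(phrase)}\b", normalized)
            score += weight * len(hits)
        if score >= threshold:
            scores[tag] = score
    return scores


text = "Risk assessment: one hazard was found. The supplier was informed."
result = score_tags(text)
print(result)
```

The example text scores 3.0 for "Health and Safety" (one "risk assessment" plus one "hazard") and only 1.0 for "Procurement", so only the first tag is applied. The scoring is the same whichever library the document sits in, which is the "not context aware" limitation the slide mentions.
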

TERMSET TAGGING

• TermSet recommends the right taxonomies for each library (context aware tagging)

• TermSet automates building the underlying IA in SharePoint

• Extra cool NLP tags can be added (Summaries, Sentiment and Language)

• Monitors for new documents and terms arriving into your world

DEMO – TAGGING DOCUMENTS

WRAP UP

• TermSet automates a bottom up approach to create and use taxonomies for SharePoint

• Visit www.termset.com or e-mail brendan@termset.com for a free licence

• If you need assistance with top down taxonomies, or you use a different DMS, e-mail me to join the beta program for www.taxononica.com
