the information school at the university of washington information inventories bob boiko uw ischool...
TRANSCRIPT
The Information School at the University of Washington
Information Inventories
Bob Boiko
UW iSchool
ischool.washington.edu
Metatorial Services Inc.
www.metatorial.com
What we will cover
• What is an inventory?
• What are the deliverables?
What is an Info Inventory?
What is an inventory anywhere?
What’s the Point?
• What files exactly do we have?
• Can we get them good and organized for the system we are building?
– Use
– Audience
– Subject
– Types
– Formats
– Source
A Readiness Inventory vs. Info Inventory
Doc Inventory for a Readiness Assessment | Info Inventory for Some Project
Docs about the initiative | Docs for the initiative
A small number | A large number
Informally organized | Formally organized
No metadata | As much metadata as possible
No further destination | Destined for other people to use
Info Inventory vs. Info Audit
Inventory | Audit
Find files | Find opportunities
Tag files | Define gaps
Deliver files | Deliver recommendations
What’s the Overall Process?
1. Establish a domain of interest
2. Establish where the files and other information within the domain live
3. Establish a metadata set
4. Tag for that set
5. Amass the metadata and files for delivery
An Iterative Model
Ready, shoot, aim, shoot, aim, shoot, aim…
Do the least work on the most files
• Domain → Shares → Directories → Files
• Files (small random sample) → Files (bigger sample) → Files (bigger still) → Files (all)
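One way to picture the growing samples is the minimal Python sketch below. It is an illustration only: the function name `growing_samples`, the sample fractions, and the choice to nest each sample inside the next (so earlier tagging work carries forward) are assumptions, not part of the original slides.

```python
import random


def growing_samples(files, fractions=(0.01, 0.1, 0.5, 1.0), seed=0):
    """Return progressively larger random samples of the file list.

    The list is shuffled once, and each sample is a longer prefix of the
    shuffled list, so every sample contains all of the smaller ones.
    The fractions are hypothetical; pick whatever steps suit the project.
    """
    shuffled = list(files)
    random.Random(seed).shuffle(shuffled)  # seeded so the samples are repeatable
    samples = []
    for fraction in fractions:
        # Always take at least one file when there are any files at all.
        count = max(1, round(len(shuffled) * fraction)) if shuffled else 0
        samples.append(shuffled[:count])
    return samples


if __name__ == "__main__":
    files = [f"file{i}.doc" for i in range(200)]
    for sample in growing_samples(files):
        print(len(sample), "files to look at in this pass")
```

Because each pass reuses the same shuffled order, whatever was learned from the small sample stays valid in the bigger ones.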
What are the Deliverables?
1. Establish a domain
• A strategy
2. Establish locations
• Location list with priorities
3. Establish metadata
• Metatorial guide
• Process plan
4. Meta-tag
• Collection tools
• A Repository
• Reports
5. Amass the metadata and files
• Delivery tools
• Metadata collection
• File collection
The Domain of Interest
• What statement describes the files we are looking for?
• How do you know if a file qualifies?
• Where do these kinds of files reside?
– LAN
– WAN
– Local Hard Drives
– Webs
– Public Sources
• How can you get access to them?
Deliverables: A Strategy
How you will interact with the organization to find and tag information
• What: A Written Plan with
– Who you need
– What you will need from them
• How
– Whatever mandate you might have
– Consensus building
– Establish span of control
– Provide plans
– Get buy-in
Deliverables: A Location List
What files and other information we will tag
• What: spreadsheet, database table or XML structure
– Location
– Number of files and size
– Types of files
– Process for deepening the analysis
• How
– Browsing, observing, asking
– File statistics
– Sampling
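The file statistics for one row of a location list could be gathered with something like the Python sketch below. It is a rough illustration under stated assumptions: the function names `scan` and `summarize_location`, the row layout, and the use of file extensions as a stand-in for "types of files" are all hypothetical choices, not something the slides prescribe.

```python
import os
from collections import Counter


def scan(root):
    """Yield (path, size_in_bytes) for every readable file under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                yield path, os.path.getsize(path)
            except OSError:
                continue  # skip files that vanish or cannot be read


def summarize_location(location, files):
    """Build one location-list row from an iterable of (path, size) pairs."""
    file_count = 0
    total_bytes = 0
    types = Counter()
    for path, size in files:
        file_count += 1
        total_bytes += size
        # Assumption: the lowercased extension approximates the file type.
        ext = os.path.splitext(path)[1].lower() or "(none)"
        types[ext] += 1
    return {
        "location": location,
        "files": file_count,
        "bytes": total_bytes,
        "types": dict(types),
    }


if __name__ == "__main__":
    # "." is only a placeholder; point this at a share or directory in the domain.
    print(summarize_location(".", scan(".")))
```

Rows like this can be pasted into a spreadsheet and sorted by file count or size to help set priorities.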
What Will it Take: Metadata ROI
• What metadata do we need?
• What metadata can we afford?
– What will each kind cost?
– Who will each kind require?
What Will it Take: Tagging
For each type of metadata:
• One value or many?
• How long will it take per file?
• What expertise will taggers need?
• With what certainty will taggers be able to discern metadata?
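The per-file time question turns into a simple effort estimate: total hours equal the number of files times the seconds spent on each kind of metadata, divided by 3600. The Python sketch below works one example. The function name `tagging_hours` and every number in it are made up for illustration.

```python
def tagging_hours(file_count, seconds_per_file_by_field):
    """Estimate total tagging effort in hours.

    seconds_per_file_by_field maps each kind of metadata to the time a
    tagger is expected to spend on it for one file.
    """
    seconds_per_file = sum(seconds_per_file_by_field.values())
    return file_count * seconds_per_file / 3600


if __name__ == "__main__":
    # Hypothetical figures: 10,000 files and three kinds of metadata.
    estimate = tagging_hours(10_000, {"audience": 20, "subject": 60, "format": 10})
    print(f"{estimate:.0f} hours")  # 10,000 files x 90 seconds = 250 hours
```

Dropping or automating the slowest field changes the total quickly, which is the trade-off behind asking what metadata the project can afford.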
Typical Tagging Profile
Should I Automate?
For each type of metadata:
– Is it auto-detectable?
– In what percent of the files?
– What will it take to create a tool?
– Is it worth it?
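These questions can be roughed out on a sample before any tool is built. The Python sketch below is an assumption-laden illustration: it treats the format tag as detectable from the file extension, measures what fraction of a sample the detector covers, and converts that into manual hours saved. The extension mapping, the function names, and the numbers are all hypothetical.

```python
import os

# Hypothetical mapping; real allowed values would come from the project's own guide.
FORMAT_BY_EXTENSION = {
    ".doc": "Word",
    ".xls": "Excel",
    ".pdf": "PDF",
    ".htm": "HTML",
    ".html": "HTML",
}


def detect_format(filename):
    """Return a format tag guessed from the extension, or None if unknown."""
    ext = os.path.splitext(filename)[1].lower()
    return FORMAT_BY_EXTENSION.get(ext)


def coverage(sample, detector):
    """Fraction of the sample for which the detector returns a value."""
    if not sample:
        return 0.0
    detected = sum(1 for name in sample if detector(name) is not None)
    return detected / len(sample)


def hours_saved(file_count, coverage_fraction, manual_seconds_per_file):
    """Manual tagging hours avoided if the detector handles its share of files."""
    return file_count * coverage_fraction * manual_seconds_per_file / 3600


if __name__ == "__main__":
    sample = ["plan.doc", "budget.XLS", "notes", "scan.tif"]
    share = coverage(sample, detect_format)
    print(f"auto-detectable in {share:.0%} of the sample")
    print(f"{hours_saved(7200, share, 10):.0f} hours saved")
```

If the hours saved are smaller than the hours needed to build and test the tool, manual tagging is probably the cheaper route for that kind of metadata.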
Deliverables: Metatorial Guide
A definitive guide to how to tag
• What: An MS Word file or Web page
– Why are you tagging?
– What is the overall process?
– For each tag:
  • What does it mean?
  • When do you use it?
  • What are its allowed values?
• How:
– Existing metadata distinctions
– File statistics
– Automated metadata discovery tools
– Feedback and revision process
Deliverables: Process Plan
What each person should be doing and when
• What: MS Project or other planning system
– Each person’s time commitment
– Each person’s assignment
– Due dates
• How
– Lots of negotiation
– Process for constant evaluation and reassignment
– Process for training
– Relief valves
– Process for QC
Deliverables: Collection Tools
Aids to effective data entry
• What: Templates and small programs
– Preloaded spreadsheets
– Web forms
– Data validation
• How
– Automated metadata discovery tools
– MS Office power use & programming
– Web programming
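Data validation in a collection tool can be as small as checking each entered row against the allowed values. The Python sketch below shows the idea under loud assumptions: the field names, the vocabularies, the semicolon separator for multiple values, and the function name `validate_row` are hypothetical, and a real tool would load them from the project's own tagging rules.

```python
# Hypothetical controlled vocabularies, for illustration only.
ALLOWED_VALUES = {
    "audience": {"staff", "customers", "partners"},
    "format": {"Word", "Excel", "PDF", "HTML"},
}

# Assumption: only these fields may carry more than one value per file.
MULTI_VALUED = {"audience"}


def validate_row(row):
    """Return a list of problems found in one row of collected metadata."""
    errors = []
    if not row.get("path"):
        errors.append("path: missing")
    for field, allowed in ALLOWED_VALUES.items():
        raw = row.get(field) or ""
        # Multiple values are entered separated by semicolons.
        values = [v.strip() for v in raw.split(";") if v.strip()]
        if not values:
            errors.append(f"{field}: missing")
            continue
        if len(values) > 1 and field not in MULTI_VALUED:
            errors.append(f"{field}: only one value allowed")
        for value in values:
            if value not in allowed:
                errors.append(f"{field}: '{value}' is not an allowed value")
    return errors


if __name__ == "__main__":
    good = {"path": "plan.doc", "audience": "staff; partners", "format": "Word"}
    bad = {"path": "", "audience": "everyone", "format": "Word;PDF"}
    print(validate_row(good))
    print(validate_row(bad))
```

Returning every problem at once, rather than stopping at the first, lets a tagger fix a row in one pass.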
Deliverables: A Repository
A place to put the metadata you collect
• What
– Databases and/or XML structures
– Controlled vocabularies
– Taxonomies
– Management info
• How
– Loaders from collection tools
– Schema development
– RDB or XML programming
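For the relational option, a repository could start from a schema as small as the one sketched below, using Python's built-in `sqlite3` module. This is one possible layout, not the layout: the table names, columns, and the `tag_file` loader are assumptions made for illustration. The `terms` table plays the role of a controlled vocabulary, and the `tagger` column is a small piece of management info.

```python
import sqlite3

# Hypothetical schema: files, vocabulary terms, and the tags that link them.
SCHEMA = """
CREATE TABLE files (
    id   INTEGER PRIMARY KEY,
    path TEXT NOT NULL UNIQUE
);
CREATE TABLE terms (
    id    INTEGER PRIMARY KEY,
    field TEXT NOT NULL,
    value TEXT NOT NULL,
    UNIQUE (field, value)
);
CREATE TABLE tags (
    file_id INTEGER NOT NULL REFERENCES files(id),
    term_id INTEGER NOT NULL REFERENCES terms(id),
    tagger  TEXT,
    PRIMARY KEY (file_id, term_id)
);
"""


def open_repository(path=":memory:"):
    """Create the tables in a new SQLite database and return the connection."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def tag_file(conn, path, field, value, tagger=None):
    """Attach one vocabulary term to a file, rejecting values not in the vocabulary."""
    term = conn.execute(
        "SELECT id FROM terms WHERE field = ? AND value = ?", (field, value)
    ).fetchone()
    if term is None:
        raise ValueError(f"{field}={value!r} is not in the controlled vocabulary")
    conn.execute("INSERT OR IGNORE INTO files (path) VALUES (?)", (path,))
    file_id = conn.execute("SELECT id FROM files WHERE path = ?", (path,)).fetchone()[0]
    conn.execute(
        "INSERT OR REPLACE INTO tags (file_id, term_id, tagger) VALUES (?, ?, ?)",
        (file_id, term[0], tagger),
    )


if __name__ == "__main__":
    conn = open_repository()
    conn.execute("INSERT INTO terms (field, value) VALUES ('audience', 'staff')")
    tag_file(conn, "//share/plan.doc", "audience", "staff", tagger="pat")
    print(conn.execute("SELECT COUNT(*) FROM tags").fetchone()[0], "tag stored")
```

Keeping terms in their own table means a file with many values for one field simply gets many rows in `tags`.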
Deliverables: Reports
What do we have, what do we still need, and how are we doing?
• What
– Word files
– Email messages
– Spreadsheets
• How
– RDB or XML programming
– Statistical analysis
– Roughing it out
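A progress report of the "what do we have, what do we still need" kind can be roughed out with a few lines of counting. The Python sketch below assumes the collected metadata is available as a list of dictionaries, one per file; the field names and the function name `completion_report` are hypothetical.

```python
def completion_report(records, fields):
    """Count, for each metadata field, how many files are tagged and how many are not."""
    total = len(records)
    report = {}
    for field in fields:
        # A field counts as tagged only if it is present and non-empty.
        tagged = sum(1 for record in records if record.get(field))
        report[field] = {
            "tagged": tagged,
            "missing": total - tagged,
            "percent": round(100 * tagged / total, 1) if total else 0.0,
        }
    return report


if __name__ == "__main__":
    records = [
        {"path": "plan.doc", "audience": "staff", "subject": "budget"},
        {"path": "memo.doc", "audience": "staff", "subject": ""},
        {"path": "scan.tif", "audience": None},
        {"path": "notes"},
    ]
    for field, stats in completion_report(records, ["audience", "subject"]).items():
        print(field, stats)
```

The resulting numbers can go straight into a spreadsheet or an email message.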
Deliverables: Metadata & Files
The results in a useful way
• What
– The database or XML in a friendly form
– Web sites with navigation, metadata and files
– CDs or DVDs with UI
• How
– RDB or XML programming
– File collection tools
– UI creation (HTML or otherwise)
– Final reports
What We Will Cover
• What are the goals of the project?
• What is the overall process?
• What are the deliverables?
• What does the plan look like?
The Team
• Project management
– Traffic manager
– Issues manager
• Process designer
• Tool developer
• Quality measurement and control staff
The Rest of the Organization
• Who
– Info Taggers
– Info Finders
– QC staff
– People in charge of the above
• How much time can you expect from them?
• How much mind-share can you expect?
• How will you establish span of control?
Third Parties
• Software development
• Tagging support
• Project management
In Sum…
• The goal of the project is to amass and deliver a well-described body of information
• The process is to establish the guidelines, tag the files, and collect them for delivery
• The ultimate deliverable is a set of files and their related metadata
• The plan matches a small team with the largest possible staff of knowledgeable insiders and a small set of external experts.
Nagging Questions
• Can you get mindshare?
• How do you know what your reuse rights are on each file?
• What do you do with composite files?
• When do you stop?
• How do you avoid the bottlenecks of the SMEs?
• How do you take back an early mistake?
• How do you scale back?