THE YEE CATALOGING RULES: FRBRIZED CATALOGING RULES WITH AN RDF DATA MODEL FOR THE SEMANTIC WEB

Presented to ALCTS FRBR Interest Group, ALA Annual 2010, Friday, June 25, 2010


BY

Martha M. Yee
Cataloging Supervisor
UCLA Film & Television Archive
[email protected]
http://myee.bol.ucla.edu

INTRODUCTION

1. Yee Cataloging Rules and why they are more FRBR than FRBR

2. RDF resists hierarchy?

3. RDF too binary for our data?

YEE CATALOGING RULES

You can find these rules and the data model (including the RDF schema and some RDF examples) at:

http://myee.bol.ucla.edu

YEE CATALOGING RULES

1. Unlike RDA, my rules consider cataloging to be in its essence a set of decisions about indexing and display, rather than pushing indexing and display out of the cataloging rules and into “implementation decisions.”

YEE CATALOGING RULES

2. Unlike RDA and AACR2R, my rules start with the work rather than the item. My rules recognize that cataloging is not the creation of a single bibliographic record but rather the knitting of millions of records into a catalog.

YEE CATALOGING RULES

2., cont. However, my rules also require carefully recording the language on the item so as to provide historical evidence for our decisions regarding works and expressions, authorship, etc.

YEE CATALOGING RULES

2., cont. But Martha, you may be asking, are not these two approaches irreconcilable?

YEE CATALOGING RULES

2. My rules go back to the method used in Panizzi’s British Museum Catalog, using what I call “degression,” that is:

data elements that apply to all expressions of a work are recorded only at the work level

data elements that apply to all manifestations of an expression are recorded only at the expression level
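
A minimal sketch of degression as described above, assuming a simple parent chain from manifestation to expression to work. The class name `Entity`, the element names, and every sample value except "2nd rev. ed." are hypothetical and are not taken from the rules themselves.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Entity:
    """One node in a work > expression > manifestation chain (hypothetical model)."""
    level: str
    elements: dict = field(default_factory=dict)
    parent: Optional["Entity"] = None

    def lookup(self, name):
        """Return the value recorded at this level or at the nearest level above it."""
        node = self
        while node is not None:
            if name in node.elements:
                return node.elements[name]
            node = node.parent
        return None


# Each element is recorded once, at the highest level to which it applies.
work = Entity("work", {"creator": "Example Author", "title": "Example Title"})
expression = Entity("expression", {"edition": "2nd rev. ed."}, parent=work)
manifestation = Entity("manifestation", {"publisher": "Example Press"}, parent=expression)

# A display built for the manifestation can still show work-level data.
print(manifestation.lookup("creator"))
print(manifestation.lookup("edition"))
```

Nothing is copied downward in this sketch. The manifestation reaches the creator by walking up the chain, which is the sense in which an element is "recorded only at the work level."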

YEE CATALOGING RULES

2, cont. I find FRBR’s use of the term “abstract” for the expression and work levels to be somewhat misleading.

YEE CATALOGING RULES

2, cont. In fact, manifestations, not just expressions and works, have always been described based on evidence found on specific copies or items.

YEE CATALOGING RULES

2, cont. The assumption of the cataloger has always been that the transcribed data from the copy sitting in front of her provides useful information about the manifestation, expression, and work contained in that item.

YEE CATALOGING RULES

2, cont. Unlike FRBR and RDA, in my rules the phrase “2nd rev. ed.” is mapped to expression even though it is transcribed.

YEE CATALOGING RULES

2, cont. This approach to edition statements matches the definition of expression in FRBR better than the FRBR mapping, which maps an edition statement to the manifestation entity.

YEE CATALOGING RULES

2, cont. This is why I say that my rules are more FRBR than FRBR!

YEE CATALOGING RULES

2, cont. Unlike FRBR, my rules consider an edition of a particular work with illustrations or with commentary to be a new expression of that work with additions, not two different works (1. work and 2. illustrations or commentary).

YEE CATALOGING RULES

3. My rules emphasize the importance of creating language-based human-readable identifiers for works, creators, subjects, genres, and forms, using the name commonly known in the language and script preferred by the catalog user.

YEE CATALOGING RULES

3. Only such language-based human-readable identifiers for works, creators, subjects, genres, and forms will allow the catalog user to scan large retrievals efficiently and recognize the entity sought.

YEE CATALOGING RULES

4. My rules emphasize the importance of hierarchically structured displays to allow the catalog user to scan large retrievals of thousands, even tens of thousands of items efficiently.

YEE CATALOGING RULES

4. Examples of hierarchically structured displays:

All the expressions of a work, and, separately, all the works related to that work, and, separately, all the works about that work.

YEE CATALOGING RULES

4. Examples of hierarchically structured displays:

All the works on a subject, and, separately, all the works on broader subjects, and, separately, all the works on narrower subjects, and, separately, all the works on related subjects.

YEE CATALOGING RULES

4. Examples of hierarchically structured displays:

All the works that take a particular disciplinary perspective (identified using a classification number), and, separately, all the works that take a broader, narrower or related perspective.

YEE CATALOGING RULES

5. My rules do not consider a change of name to be a change of identity. This includes pseudonyms, corporate name changes, journal title changes, textbook title changes, etc.

YEE CATALOGING RULES

5. There is no credible evidence that catalog users consider a change of name to be a change of identity.

Also, it is more likely that we can reach international agreement on entity definition if we dispense with this false economy.

YEE CATALOGING RULES

6. Unlike RDA, my rules do not require that a particular data element be confined to one FRBR level. My rules recognize that a film editor may sometimes create a new work and may sometimes create a new expression.

YEE CATALOGING RULES

7. In order to recognize the fact that the subject of a book or a film could be a work, a person, a concept, an object, an event, or a place, all classes in the model, it was necessary to define subject itself as a property (a relationship) rather than a class in its own right. All subject properties are defined as having a domain of resource, meaning there is no constraint as to the class to which these subject properties apply.
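
A minimal sketch of subject modeled as a property rather than a class, using plain Python tuples as stand-ins for RDF triples. The identifiers, the property name `hasSubject`, and the direction of the property are all assumptions made for illustration; the actual RDF schema is the one published with the rules.

```python
# Each triple is (resource, property, value). All identifiers are hypothetical.
triples = [
    ("work:film1", "rdf:type", "Work"),
    ("person:p1", "rdf:type", "Person"),
    ("place:pl1", "rdf:type", "Place"),
    ("work:book1", "rdf:type", "Work"),
    # One subject property links the book to members of three different classes.
    ("work:book1", "hasSubject", "work:film1"),
    ("work:book1", "hasSubject", "person:p1"),
    ("work:book1", "hasSubject", "place:pl1"),
]


def class_of(resource):
    """Return the class declared for a resource, or None if it has none."""
    for s, p, o in triples:
        if s == resource and p == "rdf:type":
            return o
    return None


def subjects_of(work):
    """List (subject, class) pairs for a work, whatever class each subject is in."""
    return [(o, class_of(o)) for s, p, o in triples if s == work and p == "hasSubject"]


for subject, cls in subjects_of("work:book1"):
    print(subject, cls)
```

Because no class is required of the thing at the other end of the property, a work, a person, and a place can all be subjects without a separate "Subject" class having to contain them.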

RDF RESISTS HIERARCHY?

RDF seems to resist hierarchy even more than the data models underlying our current ILS systems. Every link is a one-to-one link rather than a one-to-many link.

Hierarchy is an essential tool for allowing users to navigate efficiently through hundreds of thousands, even millions, of records.
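
To make the one-to-one versus one-to-many point concrete, here is a small sketch in which each link is a separate triple and the hierarchy has to be assembled at display time. The property names and identifiers are hypothetical, and the sketch says nothing about whether this regrouping is practical at catalog scale.

```python
from collections import defaultdict

# Every link is its own (child, property, parent) statement; none is one-to-many.
triples = [
    ("expr:e1", "expressionOf", "work:w1"),
    ("expr:e2", "expressionOf", "work:w1"),
    ("manif:m1", "manifestationOf", "expr:e1"),
    ("manif:m2", "manifestationOf", "expr:e1"),
    ("manif:m3", "manifestationOf", "expr:e2"),
]

# The one-to-many view exists only after the individual links are regrouped.
children = defaultdict(list)
for child, _prop, parent in triples:
    children[parent].append(child)


def render(node, depth=0):
    """Return indented display lines for a node and everything beneath it."""
    lines = ["  " * depth + node]
    for child in sorted(children.get(node, [])):
        lines.extend(render(child, depth + 1))
    return lines


print("\n".join(render("work:w1")))
```

The hierarchy is not stored anywhere in the triples themselves. It is a by-product of collecting and sorting them, which is work the indexing and display layer has to do for every retrieval.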

RDF TOO BINARY FOR OUR DATA?

If subject itself is a property, a relationship between two subjects becomes a property of a property. Technically this is possible in RDF but it becomes very complex.
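
One narrow case of a statement about a property can be sketched with `rdfs:subPropertyOf`, a term from RDF Schema whose subject is itself a property. The `ex:` property names and the identifiers are hypothetical, and this sketch does not cover the general property-of-a-property case described above.

```python
# Triples as plain tuples. The third triple is a statement about a property.
triples = [
    ("work:book1", "ex:hasPrimarySubject", "concept:c1"),
    ("work:book1", "ex:hasSubject", "place:pl1"),
    ("ex:hasPrimarySubject", "rdfs:subPropertyOf", "ex:hasSubject"),
]


def subproperties(prop):
    """Return prop plus every property declared, directly or indirectly, beneath it."""
    found = {prop}
    changed = True
    while changed:
        changed = False
        for s, p, o in triples:
            if p == "rdfs:subPropertyOf" and o in found and s not in found:
                found.add(s)
                changed = True
    return found


def objects(resource, prop):
    """Values linked to a resource by prop or by any of its sub-properties."""
    props = subproperties(prop)
    return sorted(o for s, p, o in triples if s == resource and p in props)


print(objects("work:book1", "ex:hasSubject"))
```

Even in this small case, answering "what are the subjects of this work?" requires reading the statements about the properties first and only then the statements about the work.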

RDF TOO BINARY FOR OUR DATA?

In fact, in my attempt to model our data using RDF, I frequently felt I needed to create a property of a property.

I would like to generalize from this observation to suggest that RDF may be too binary for our data.

RDF TOO BINARY FOR OUR DATA?

I believe that the relational databases we are using now were always too binary for our data and that is why our current ILS systems perform so badly at indexing and displaying our data.

Rows and columns in tables may be fine for business inventory work, but they don’t seem to work for the complex and hierarchical relationships we need to demonstrate in our catalogs.

RDF TOO BINARY FOR OUR DATA?

I am beginning to suspect that RDF’s classes and properties are just rows and columns in a new guise.

RDF TOO BINARY FOR OUR DATA?

Why is it that every different group that tries to model our data in RDF comes up with a completely different model? Perhaps we are thrashing about trying to jam our data into a model that does not fit?

RDF TOO BINARY FOR OUR DATA?

FRBR classes for creators:

Person (3.2.5)
Corporate Body (3.2.6)

RDF TOO BINARY FOR OUR DATA?

FRAD classes for creators (April 1, 2007 draft; can’t afford the $84.00 and didn’t have time for ILL):

Person
Family
Corporate Body
Name
Identifier
Controlled Access Point
Rules
Agency

RDF TOO BINARY FOR OUR DATA?

VIAF classes for creators:

#AuthorityAgency
#NameAuthority
#NameAuthorityCluster
#Heading
#EstablishedHeading
#XRef4xx
#XRef5xx

RDF CHALLENGE

As the scientific method teaches us, I cannot prove that it is impossible to build a catalog using RDF.

All that can be proved is that it IS possible, and that can be proved by doing it.

RDF CHALLENGE

Therefore, I would like to challenge those who think that semantic web technology IS the way forward for us to build a demonstration system and then show us how we can search for a known work using a variant of the author’s name and a variant of the title, and how multiple works are displayed when the user does a search, such as a subject search, that retrieves thousands of works.
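
The first half of the challenge can be stated as a small behavioural sketch: a search on any variant of the creator's name together with any variant of the title should reach the same work. This is plain in-memory Python, not the semantic web demonstration system asked for above, and every name, title, and identifier in it is made up.

```python
import unicodedata


def normalize(text):
    """Fold case and drop accents and punctuation so variant forms compare equal."""
    decomposed = unicodedata.normalize("NFKD", text)
    kept = [ch for ch in decomposed if ch.isalnum() or ch.isspace()]
    return " ".join("".join(kept).casefold().split())


# One identity per creator and per work, each carrying all of its known names.
creators = {
    "creator:c1": ["Example, Author A.", "Pseudonym, P."],
}
works = {
    "work:w1": {"creator": "creator:c1", "titles": ["Preferred Title", "Variant Title"]},
    "work:w2": {"creator": "creator:c1", "titles": ["Another Title"]},
}


def find_works(name, title):
    """Return ids of works whose creator and title match any recorded variant."""
    wanted_name = normalize(name)
    wanted_title = normalize(title)
    creator_ids = {
        cid for cid, names in creators.items()
        if wanted_name in {normalize(n) for n in names}
    }
    return sorted(
        wid for wid, work in works.items()
        if work["creator"] in creator_ids
        and wanted_title in {normalize(t) for t in work["titles"]}
    )


print(find_works("pseudonym p", "VARIANT TITLE"))
```

The variant names hang off a single creator identity, in keeping with rule 5: a change of name is not treated as a change of identity.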

READ MORE ABOUT IT

Yee, Martha M. "Can Bibliographic Data be Put Directly Onto the Semantic Web?" Information Technology and Libraries 28:2 (June, 2009): 55-80. Also available on the Web at:

http://repositories.cdlib.org/postprints/3369