freebase schema

Post on 24-Jan-2015


Freebase Schema

Jamie Taylor

Wednesday, December 8, 2010

Goals

• Schema: The Freebase Data Model

• Schema as API

• Schema patterns


Freebase is a collection of facts

Sofia Coppola directed Marie Antoinette

Freebase only contains nodes and links

Freebase is a Graph


Freebase is a labeled Graph

[Figure: a graph of people and films connected by labeled links: directed, parent, sibling, child, wrote, starred_in]
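A labeled graph like this can be sketched as a list of (node, label, node) facts. The following is only an illustration: the helper names `build_index` and `follow` are made up here, and the small fact list is chosen for the example rather than taken from Freebase.

```python
from collections import defaultdict

# Each fact is one labeled link between two nodes.
# The fact list is illustrative, not a Freebase export.
FACTS = [
    ("Sofia Coppola", "directed", "Marie Antoinette"),
    ("Francis Ford Coppola", "directed", "Bram Stoker's Dracula"),
    ("Francis Ford Coppola", "child", "Sofia Coppola"),
    ("Sofia Coppola", "parent", "Francis Ford Coppola"),
]

def build_index(facts):
    """Index outgoing links by (node, label) for quick lookup."""
    index = defaultdict(list)
    for subject, label, obj in facts:
        index[(subject, label)].append(obj)
    return index

def follow(index, node, *labels):
    """Follow a chain of labeled links starting from one node."""
    frontier = [node]
    for label in labels:
        frontier = [obj for n in frontier for obj in index.get((n, label), [])]
    return frontier

index = build_index(FACTS)
# Films directed by Francis Ford Coppola's children.
print(follow(index, "Francis Ford Coppola", "child", "directed"))
```

Following two labels in a row ("child", then "directed") is the same kind of walk the later nested queries express.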


Schema

"All the things you can say about something in Freebase"

Schema is the data model for Freebase


All nodes are “/type/object”

[Figure: node /m/02vyw with id "/m/02vyw", name (/type/object/name) "Francis Coppola", and types /people/person and /film/director]

[{
  "id": "/m/02vyw",
  "name": null,
  "type": [{}]
}]


Types suggest properties to use

[Figure: node /m/02vyw (type /film/director) linked by the property /film/director/film to node /en/bram_stokers_dracula; the node identifiers are /type/object/id values]

Queries follow schema

[{
  "id": "/en/francis_ford_coppola",
  "/film/director/film": [{
    "id": null,
    "name": null
  }]
}]
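Because the query is plain JSON, it can be built as ordinary data before it is sent anywhere. Here is a minimal sketch in Python, assuming only the query shape shown on the slide; the function name `films_by_director_query` is hypothetical, and no request is made because the service endpoint is not part of this deck.

```python
import json

def films_by_director_query(director_id):
    """Build the slide's query as plain Python data.

    None serializes to JSON null, which marks a value to be filled in.
    """
    return [{
        "id": director_id,
        "/film/director/film": [{"id": None, "name": None}],
    }]

query = films_by_director_query("/en/francis_ford_coppola")
print(json.dumps(query, indent=2))
```

The property key follows the schema directly: the Type /film/director plus its property film gives /film/director/film.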


Properties link the graph together

[Figure: node /m/02vyw (type /film/director) linked by /film/director/film to /en/bram_stokers_dracula, which in turn has an outgoing written_by link]

Queries follow schema

[{
  "id": "/en/francis_ford_coppola",
  "/film/director/film": [{
    "id": "/en/bram_stokers_dracula",
    "written_by": null
  }]
}]

Name is returned (how to get ID?)

How to get all the writers for all of Coppola’s movies?
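One plausible answer to both questions, assuming the usual query conventions: expand written_by into a sub-query that asks for "id" as well as "name", and leave the film "id" as null so every film matches. The sketch below builds that query and flattens a result of the same shape. The helper `writers_by_film` is hypothetical, and `sample_result` is placeholder data, not real Freebase output.

```python
import json

# A null film "id" matches every film; the written_by sub-query
# asks for each writer's id in addition to the name.
query = [{
    "id": "/en/francis_ford_coppola",
    "/film/director/film": [{
        "id": None,
        "name": None,
        "written_by": [{"id": None, "name": None}],
    }],
}]

def writers_by_film(result):
    """Map each film id to the list of writer ids in a query result."""
    films = result[0]["/film/director/film"]
    return {f["id"]: [w["id"] for w in f["written_by"]] for f in films}

# Placeholder data shaped like the query; not real Freebase output.
sample_result = [{
    "id": "/en/francis_ford_coppola",
    "/film/director/film": [{
        "id": "/en/example_film",
        "name": "Example Film",
        "written_by": [{"id": "/en/example_writer", "name": "Example Writer"}],
    }],
}]

print(json.dumps(query, indent=2))
print(writers_by_film(sample_result))
```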


Core Concepts


Instance:

• Topic: "a thing in the world"

• Blade Runner, Ridley Scott, NBC, Last Proof

Schema:

• Types - Categorical collections of instances

• Properties - Relationships between instances

Core Concepts

An instance may have multiple Types

• "Co-Types" (Types are mix-ins)

• Arnold Schwarzenegger

• Person, Actor, Politician, Sports Figure
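Co-Types can be pictured as a set of type ids attached to one topic, with each Type mixed in independently of the others. This is a rough sketch only: the `Topic` class is hypothetical and the type ids are illustrative stand-ins for Person, Actor, and Politician.

```python
from dataclasses import dataclass, field

@dataclass
class Topic:
    """A topic carrying any number of co-types (hypothetical model)."""
    name: str
    types: set = field(default_factory=set)

    def add_type(self, type_id):
        self.types.add(type_id)

    def has_type(self, type_id):
        return type_id in self.types

arnold = Topic("Arnold Schwarzenegger")
# Illustrative type ids; each one is mixed in without affecting the others.
for type_id in ("/people/person", "/film/actor", "/government/politician"):
    arnold.add_type(type_id)

print(sorted(arnold.types))
```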


Lessons from everyday vocabulary

[Figure: Wikipedia Word Frequency, plotting Frequency (0 to 20,000,000) against Rank (0 to 120)]

Data from Victor S. Grishchenko

Schema Principle #1

Use Co-Types Liberally:

Use a few large, encompassing Types to provide general information

Use several smaller, fine grained Types to provide detailed information

Event Example:

• Film Festival

• Battle of Waterloo

Core Concepts

Properties are defined on Types

• Properties are the vocabulary for a specific Type

• An instance must be “an instance of a type” before it can use the Type’s properties to describe itself
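The rule that a topic must carry a Type before using that Type's properties can be sketched as a guard on assignment. Everything here is a hypothetical model (the `SCHEMA` table, the `Topic` class, and `set_value`), not how Freebase itself is implemented.

```python
# Hypothetical mini-schema: each Type lists the properties it defines.
SCHEMA = {
    "/people/person": {"children", "gender"},
    "/film/director": {"film"},
}

class Topic:
    def __init__(self, topic_id, types=()):
        self.id = topic_id
        self.types = set(types)
        self.values = {}

    def set_value(self, type_id, prop, value):
        """Record a value, but only for properties of a Type the topic has."""
        if type_id not in self.types:
            raise ValueError(f"{self.id} is not an instance of {type_id}")
        if prop not in SCHEMA.get(type_id, set()):
            raise KeyError(f"{type_id} does not define {prop!r}")
        self.values.setdefault(f"{type_id}/{prop}", []).append(value)

coppola = Topic("/en/francis_ford_coppola", types=["/people/person"])
try:
    coppola.set_value("/film/director", "film", "/en/bram_stokers_dracula")
except ValueError as err:
    print(err)  # rejected: the topic is not yet a /film/director

coppola.types.add("/film/director")
coppola.set_value("/film/director", "film", "/en/bram_stokers_dracula")
print(coppola.values)
```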

Relational DB vs RDF

Core Concepts

• A Property Value has a specific Type

• "Expected Type"

• A Property has exactly one Expected Type

Manufactures

Expected Type ~ RDFS Range


Core Concepts

Expected Types (Property Values):

• Value Types (literals)

• String (two flavors), Integer, Float, DateTime, boolean

• Object Types

• Everything Else
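The split between Value Types and Object Types can be illustrated with a small check against a property's single expected type. This is a sketch under assumptions: the literal type ids in `VALUE_TYPES`, their Python stand-ins, and the `accepts` helper are all chosen for illustration.

```python
# Assumed literal type ids and the Python classes standing in for them.
VALUE_TYPES = {
    "/type/text": str,
    "/type/int": int,
    "/type/float": float,
    "/type/boolean": bool,
    "/type/datetime": str,  # kept as an ISO-style string in this sketch
}

# Each property has exactly one expected type.
EXPECTED_TYPE = {
    "/people/person/date_of_birth": "/type/datetime",
    "/film/director/film": "/film/film",
}

def accepts(prop, value, target_types=frozenset()):
    """Return True if the value fits the property's expected type.

    Literals are checked by Python class; object values are checked by
    whether the target topic carries the expected Type.
    """
    expected = EXPECTED_TYPE[prop]
    if expected in VALUE_TYPES:
        return isinstance(value, VALUE_TYPES[expected])
    return expected in target_types

print(accepts("/people/person/date_of_birth", "1942-07-13"))
print(accepts("/film/director/film", "/en/blade_runner", {"/film/film"}))
print(accepts("/film/director/film", "/en/blade_runner", {"/people/person"}))
```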


/type/object

Everything in Freebase has this Type

Provides basic properties

• Type

• Name

• .......

All other Properties come from some other Type!

contrast to common topic


/common/topic

"Topics"

• Things we have discourse about

• Provides properties:

• Alias

• Article

• Image

• Weblinks

• Assumed to be an "Included Type" for any "standard" type


Schema Patterns

Compound Value

Mediator

Phylogeny

Enumeration


Compound Value

Two or more properties which can only be interpreted with regard to one another

Population

• Dated Integer ("when did this location have that many people")

Movie Budget

• Dated money value

• Date, Currency, Amount

Ticker Symbol

• Exchange, Symbol

complex literal


Compound Value

{
  "id": "/en/apocalypse_now",
  "type": "/film/film",
  "estimated_budget": [{
    "currency": null,
    "amount": null,
    "valid_date": null
  }]
}
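On the client side, a compound value is naturally read as one small record whose fields only make sense together. A minimal sketch, assuming a result shaped like the query above: `DatedMoneyValue` and `parse_budgets` are hypothetical names, and the sample uses only the amount (31MM) and date (1979) shown on the slide, leaving the currency unset.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class DatedMoneyValue:
    """Amount, currency and date, meaningful only as a group."""
    amount: Optional[float]
    currency: Optional[str]
    valid_date: Optional[str]

def parse_budgets(film):
    """Turn each estimated_budget entry into one compound value."""
    return [
        DatedMoneyValue(b.get("amount"), b.get("currency"), b.get("valid_date"))
        for b in film.get("estimated_budget", [])
    ]

# Sample shaped like the query; the currency is not shown on the slide.
sample = {
    "id": "/en/apocalypse_now",
    "type": "/film/film",
    "estimated_budget": [
        {"amount": 31000000, "currency": None, "valid_date": "1979"},
    ],
}

for budget in parse_budgets(sample):
    print(budget)
```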

[Figure: /en/apocalypse_now linked by estimated_budget to a compound value node with currency, amount (31MM), and valid_date (1979)]

Mediator

An annotation on the link between two Topics

• Requires an object between the two Topics

• The Topics become separated by two properties

[Figure: actor, performance, film; the performance node carries the character]

• Also useful for indicating the dates when a relationship existed (e.g., education, employment, etc.)

combine date annotation and character = tv character


Mediator

{
  "id": "/en/marie_antoinette_2006",
  "type": "/film/film",
  "starring": [{
    "actor": null,
    "character": null
  }]
}
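Each entry under starring is the mediator (a performance) sitting between the film and the actor, so reaching the actor takes two hops. A short sketch of flattening such a result: the `cast_list` helper and the film id `/en/blade_runner` are assumptions, while the actor and character names are the ones used elsewhere in this deck.

```python
def cast_list(film):
    """Flatten performance (mediator) nodes into (actor, character) pairs."""
    return [(p["actor"], p["character"]) for p in film.get("starring", [])]

# Sample shaped like the query result; the film id is assumed.
sample = {
    "id": "/en/blade_runner",
    "type": "/film/film",
    "starring": [
        {"actor": "Harrison Ford", "character": "Rick Deckard"},
    ],
}

# film -> starring -> performance -> actor: two properties apart.
print(cast_list(sample))
```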


Phylogeny

Examples:

• /location/location/containedby

• /computer/computer/parent_model

• /tv/tv_program/spin_offs

Used when instances form a hierarchy

Phylogeny properties have an expected Type which is the same as the Type on which the property is defined.
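Because the expected Type matches the defining Type, the same property can be followed repeatedly to climb the hierarchy. A sketch of that walk over an in-memory table: the `ancestors` helper is hypothetical, and every id other than `/en/fairfax_california` is an illustrative placeholder.

```python
from collections import deque

# containedby links; ids other than /en/fairfax_california are placeholders.
CONTAINED_BY = {
    "/en/fairfax_california": ["/en/marin_county"],
    "/en/marin_county": ["/en/california"],
    "/en/california": ["/en/united_states"],
}

def ancestors(node, parents):
    """Collect every container reachable by following the same property."""
    seen, order = set(), []
    queue = deque(parents.get(node, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # guards against cycles and shared ancestors
        seen.add(current)
        order.append(current)
        queue.extend(parents.get(current, []))
    return order

print(ancestors("/en/fairfax_california", CONTAINED_BY))
```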


Phylogeny

Why can I use the short name?

{
  "id": "/en/fairfax_california",
  "/location/location/containedby": [{
    "id": null,
    "containedby": [{
      "id": null
    }]
  }]
}
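A likely answer is that the outer property's expected Type (/location/location) sets the context for the sub-query, so an unqualified name inside it resolves against that Type. The sketch below imitates that resolution. It is an assumption-laden model, not the query engine's actual algorithm: the `EXPECTED_TYPE` table and the `expand` function are hypothetical.

```python
# Hypothetical lookup: the expected Type of each fully qualified property.
EXPECTED_TYPE = {
    "/location/location/containedby": "/location/location",
}

# Properties of /type/object, treated here as always usable by short name.
OBJECT_PROPS = {"id", "name", "type"}

def expand(query, context_type):
    """Rewrite short property names into fully qualified ones."""
    out = {}
    for key, value in query.items():
        if key in OBJECT_PROPS or key.startswith("/"):
            full = key
        else:
            full = f"{context_type}/{key}"
        if isinstance(value, list) and value and isinstance(value[0], dict):
            # The sub-query's context is the property's expected Type.
            inner_type = EXPECTED_TYPE[full]
            value = [expand(v, inner_type) for v in value]
        out[full] = value
    return out

query = {
    "id": "/en/fairfax_california",
    "/location/location/containedby": [{
        "id": None,
        "containedby": [{"id": None}],
    }],
}

print(expand(query, "/type/object"))
```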


Enumerated Value

Closed collection of “values” for a property

Constrains relations to fixed set of objects

• /people/person/gender

{ female, male, other }

• /visual_art/visual_artist/art_forms

{ drawing, painting, printmaking, photography, ... }
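A closed set of values maps naturally onto an enumeration that rejects anything outside the set. A minimal sketch using the gender values listed above; the `Gender` class and `parse_gender` helper are hypothetical names.

```python
from enum import Enum

class Gender(Enum):
    """The closed set of values listed for /people/person/gender."""
    FEMALE = "female"
    MALE = "male"
    OTHER = "other"

def parse_gender(value):
    """Return the matching member, or raise ValueError if outside the set."""
    return Gender(value)

print(parse_gender("female"))
try:
    parse_gender("unknown")
except ValueError as err:
    print(err)
```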


Explore the Freebase Graph

[Figure: a graph of people and films connected by labeled links: directed, parent, sibling, child, wrote, starred_in]

Explore the Freebase Graph

[{
  "id": null,
  "type": "/film/director"
}]

Explore the Freebase Graph

[{
  "id": null,
  "type": "/film/director",
  "/people/person/children": [{
    "id": null,
    "type": "/film/director"
  }]
}]

Explore the Freebase Graph

[{
  "id": null,
  "type": "/film/director",
  "film": [],
  "/people/person/children": [{
    "id": null,
    "type": "/film/director",
    "film": []
  }]
}]

Explore the Freebase Graph

[{
  "id": null,
  "type": "/film/director",
  "film": [],
  "/people/person/children": [{
    "id": null,
    "type": "/film/director",
    "film": [{
      "name": null,
      "starring": [{
        "actor": null
      }]
    }]
  }]
}]

[Figure: the same facts drawn as nodes and links. "Harrison Ford" (types actor and person, date_of_birth 1942-07-13) connects through a performance node to the film "Blade Runner" and the film character "Rick Deckard". The types "film", "actor", and "person" are themselves nodes, linked to their instances by type/instance links. The property "date of birth" is a node with expected_type date_time. The domains /film and /people hold the types under namespace keys such as film (key), people (key), and person (key). Legend: /type/object + /common/topic; /type/object; outgoing and incoming links; key value (key); outgoing property; literal value; /namespace; object type; domain.]

It’s all nodes and links!

Domains, Bases and Commons

[Figure: "commons" "domains" alongside individuals' "bases" (example: BladeRunner), connected by a "promote" step]

Questions?!

Docs: www.freebase.com/docs

Wiki: wiki.freebase.com

Mailing List: lists.freebase.com
