cis550 handout 41 cis 550 fall 2001 handout 4 object oriented databases

42
CIS550 Handout 4 1 CIS 550 Fall 2001 Handout 4 Object Oriented Databases

Upload: cade-daughtery

Post on 15-Jan-2016

223 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: CIS550 Handout 41 CIS 550 Fall 2001 Handout 4 Object Oriented Databases

CIS550 Handout 4 1

CIS 550Fall 2001

Handout 4 Object Oriented Databases

Page 2: CIS550 Handout 41 CIS 550 Fall 2001 Handout 4 Object Oriented Databases

CIS550 Handout 4 2

Why OODBS?• Consider a relational database in which we

have information about Books:

• Problem: Redundancy in design is forced by the fact that we cannot have set valued attributes.

ISBN Title Author Keyword Publisher0-201 Compilers J.D. Ullman Compiler Addison Wesley0-201 Compilers J.D. Ullman Grammar Addison Wesley0-201 Compilers A.V. Aho Compiler Addison Wesley0-201 Compilers A.V. Aho Grammar Addison Wesley0-201 Compilers J.E. Hopcroft Compiler Addison Wesley0-201 Compilers J.E. Hopcroft Grammar Addison Wesley

Page 3: CIS550 Handout 41 CIS 550 Fall 2001 Handout 4 Object Oriented Databases

CIS550 Handout 4 3

“Normalized” Design• We could imagine another design:

ISBN Title Publisher 0-201 Compilers Addison Wesley

ISBN Author ISBN Keyword0-201 J.D. Ullman 0-201 Compiler0-201 A.V. Aho 0-201 Grammar0-201 J.E. Hopcroft

Page 4: CIS550 Handout 41 CIS 550 Fall 2001 Handout 4 Object Oriented Databases

CIS550 Handout 4 4

Problems with NormalizedDesign

• The “object” book has been split over different relations.

• Artificial keys are introduced when modeling sets of sets. E.g. Cars can be be bought with certain features (CD, air conditioning, leather seats, etc). For each car, various packages of these features are available, each package being a particular set of features.

Page 5: CIS550 Handout 41 CIS 550 Fall 2001 Handout 4 Object Oriented Databases

CIS550 Handout 4 5

Problems, cont.

• Impossible to represent order in relational data model

• While SQL can perform “group-by”, it cannot return a grouped result. E.g. “Give me all the authors together with a list of books written by that author”.

• We can call SQL from a “host” PL, but can we call external programs from SQL?

Page 6: CIS550 Handout 41 CIS 550 Fall 2001 Handout 4 Object Oriented Databases

CIS550 Handout 4 6

The natural solution

• This is an example of the nested relational model, in which values may themselves be sets or tuples.

• What we’d like is a type system that allows us freely to combine sets and tuples (non-1NF relations)

• This implies new operators: nest and unnest.

ISBN Title Authors Keywords Publisher 0-201 Compilers {J.D. Ullman, {Compiler, Addison Wesley A.V. Aho, Grammar} J.E. Hopcroft}

Page 7: CIS550 Handout 41 CIS 550 Fall 2001 Handout 4 Object Oriented Databases

CIS550 Handout 4 7

Unnest• Given nested relation R on the previous

slide, we can unnest on one of the set-valued fields:

ISBN Title Author Keyword Publisher 0-201 Compilers J.D. Ullman {Compiler, Addison Wesley Grammar}0-201 Compilers A.V. Aho {Compiler, Addison Wesley Grammar}0-201 Compilers J.E. Hopcroft {Compiler, Addison Wesley Grammar}

:UNNEST (R){Author}

Page 8: CIS550 Handout 41 CIS 550 Fall 2001 Handout 4 Object Oriented Databases

CIS550 Handout 4 8

Nest

• We can also and get back the original relation.

• But this isn't true in general! Consider the following example:

(R){Author}

NEST

A B1 {2,3}1 {3,4}

Page 9: CIS550 Handout 41 CIS 550 Fall 2001 Handout 4 Object Oriented Databases

CIS550 Handout 4 9

Object-orientation• Complex type system, including atomic

types (string, integer, float, char, etc); collection types (sets, bags, lists); and the record type.

• Object identity gives identification of objects independent of value. Contrast this with the notion of a key in the relational model.

• Values with identity are called objects.

Page 10: CIS550 Handout 41 CIS 550 Fall 2001 Handout 4 Object Oriented Databases

CIS550 Handout 4 10

Object-orientation, cont.• Methods model the behavior of an

object. For example, modifying the value of an object is often dangerous, and is encapsulated by methods (e.g. RaiseSalary and MovePoint).

• Types + Identity + Methods are Classes.• Each of these new types gives rise to

new operations on those types(as with nest and unnest)

Page 11: CIS550 Handout 41 CIS 550 Fall 2001 Handout 4 Object Oriented Databases

CIS550 Handout 4 11

Classes, Types and Names

• Consider an schema representing persons, employees and consultants.

2O

class Person inherit Object public type tuple(name:string, male: boolean, spouse: Person, children: list(Person), dob: Date) public method age: integerend;

Page 12: CIS550 Handout 41 CIS 550 Fall 2001 Handout 4 Object Oriented Databases

CIS550 Handout 4 12

Schema, cont.class Employee inherit Person public type tuple(salary: integer, subordinates: unique set(Employee))end;

class Consultant inherit Person public type tuple(phone: integer, rate: integer, num_hours:integer) public method earnings: integerend;

Page 13: CIS550 Handout 41 CIS 550 Fall 2001 Handout 4 Object Oriented Databases

CIS550 Handout 4 13

Schema, cont.

• Note that Date is a user-defined type, and that Persons and Employees are the only entry-points into the database.

type Date tuple(month: integer, day: integer, year: integer)

name Persons: unique set(Person)name Employees: unique set(Employee)

Page 14: CIS550 Handout 41 CIS 550 Fall 2001 Handout 4 Object Oriented Databases

CIS550 Handout 4 14

Select…from…where• Print all females in the database.

• Print the values of all females in the database.

select pfrom p in Personswhere not(p.male)

Result is of type set(Person)

select *pfrom p in Personswhere not(p.male)

Result is of type set(tuple(name:string, male: boolean, spouse: Person, children: list(Person), dob: Date))

Page 15: CIS550 Handout 41 CIS 550 Fall 2001 Handout 4 Object Oriented Databases

CIS550 Handout 4 15

Select…from…where, cont.

• Print the names of all females in the database.

Result is of type set(string).

select distinct p.namefrom p in Personswhere not(p.male)

Page 16: CIS550 Handout 41 CIS 550 Fall 2001 Handout 4 Object Oriented Databases

CIS550 Handout 4 16

More queries...• Print the name and birth-date of all

females in the database who were born in July.

Result is of type set(tuple(name:string, dob: Date))

select struct(name: p.name, dob: p.dob)from p in Personswhere not(p.male) and p.dob.month= 7

Page 17: CIS550 Handout 41 CIS 550 Fall 2001 Handout 4 Object Oriented Databases

CIS550 Handout 4 17

Path expressions• Print the names of all spouses.

• Print the names and dob of all spouses who were born in June.

Result is of type set(tuple(name:string, dob:Date)).

select p.spouse.namefrom p in Persons

select struct(name: p.spouse.name, dob: p.spouse.dob)from p in Personswhere p.spouse.dob.month=6

Result is of type set(string).

Page 18: CIS550 Handout 41 CIS 550 Fall 2001 Handout 4 Object Oriented Databases

CIS550 Handout 4 18

Method Invocation

• Methods are functions which implicitly have as a parameter “self” and optional additional parameters. None of the example methods have any additional parameters.

• Print the names and age of all employees.

select struct(name: e.name, age: e.age())from e in Employees

Page 19: CIS550 Handout 41 CIS 550 Fall 2001 Handout 4 Object Oriented Databases

CIS550 Handout 4 19

Aggregate Operators• Aggregate operator count operates on

collections of elements of any type, min and max operate on collections of ordered types, and sum operates on collections of numbers.

• Print the number of Employees.

• Note the difference from SQL:

count(Employees) Result is of type integer.

select count(*)from Employees

Page 20: CIS550 Handout 41 CIS 550 Fall 2001 Handout 4 Object Oriented Databases

CIS550 Handout 4 20

Aggregate operators, cont.• Print the maximum number of children

of any person.

• Print the name and names of children of the persons with the maximum number of children.

max(select count(p.children) from p in Persons)

select struct(name: p. name, children: select c.name from c in p.children)from p in Persons where count(p.children)= max(select count(p1.children) from p1 in Person)

Page 21: CIS550 Handout 41 CIS 550 Fall 2001 Handout 4 Object Oriented Databases

CIS550 Handout 4 21

Unnesting• Print all pairs (manager, employee) of

managers in Employees.

• Print the names of all subordinates of Joe Brown.

select struct(manager: e.name, employee: s.name)from e in Employees, s in e.subordinateswhere count(e.subordinates) >0

select s.namefrom e in Employees, s in e.subordinateswhere e.name="Joe Brown"

Page 22: CIS550 Handout 41 CIS 550 Fall 2001 Handout 4 Object Oriented Databases

CIS550 Handout 4 22

An incorrect version• Compare the previous query with the

following (incorrect) query:

• The incorrect query would cause a compile time error because subordinates is of type set(Employee), and you can only use projection over a record type.

select e.subordinates.namefrom e in Employees

Page 23: CIS550 Handout 41 CIS 550 Fall 2001 Handout 4 Object Oriented Databases

CIS550 Handout 4 23

Flatten• Flatten gets rid of one level of set

(collection) nesting, e.g. it takes something of type set(set(element)) and produces something of type set(element) by taking the union of all sets of elements. (This is not the same as the “unnest” operator we have discussed.)

flatten({{1,2},{3,4},{1,5}}) = {1,2,3,4,5,1}

Page 24: CIS550 Handout 41 CIS 550 Fall 2001 Handout 4 Object Oriented Databases

CIS550 Handout 4 24

An example...• Print the names of employees who are

supervised by someone.

Note that

is of type set(set(Employees)).

select s.name from s in flatten(select e.supervises from e in Employees)

(select e.supervises from e in Employees)

Page 25: CIS550 Handout 41 CIS 550 Fall 2001 Handout 4 Object Oriented Databases

CIS550 Handout 4 25

Group by• Group by introduces a level of set nesting,

and produces something of type set(tuple(label:..., partition: set(...))).

• For each salary, print the employees earning that salary.

Result is of type set(tuple(salary: integer,

partition: set(struct(e:Employee)))).

select * from e in Employeesgroup by e.salary

Page 26: CIS550 Handout 41 CIS 550 Fall 2001 Handout 4 Object Oriented Databases

CIS550 Handout 4 26

Example, cont.

• We could also have written the previous query as:

select struct(salary: e.salary, partition:(select * from e1 in Employees where e.salary=e1.salary))from e in Employees

Page 27: CIS550 Handout 41 CIS 550 Fall 2001 Handout 4 Object Oriented Databases

CIS550 Handout 4 27

Group by … having• “For each salary earned by more than 2

employees, print the set of names of employees earning that salary.”

• Result is of type set(tuple(salary:integer, name:set(string))).

select struct(salary: e.salary, names: (select e1.name from e1 in partition))from e in Employeegroup by e.salary having count(partition)>2

Page 28: CIS550 Handout 41 CIS 550 Fall 2001 Handout 4 Object Oriented Databases

CIS550 Handout 4 28

Indexing into a list.• Lists are ordered sets, and are treated as

arrays indexed starting from 0. We cannot into sets.

• “Print all first-born children.”

• Result is of type set(Person). Note the explicit test that the list was non-empty. We could also have used count.

select p.children[0]from p in Personwhere p.children<>list()

Page 29: CIS550 Handout 41 CIS 550 Fall 2001 Handout 4 Object Oriented Databases

CIS550 Handout 4 29

Lists, cont.

• “Print the names of all second-born children.” (Recall that list indexing starts with 0.)

• Result is of type set(string).

select p.children[1].namefrom p in Personwhere count(p.children)> 1

Page 30: CIS550 Handout 41 CIS 550 Fall 2001 Handout 4 Object Oriented Databases

CIS550 Handout 4 30

Sort... in… by• “select...from...where” creates a bag of

elements from an input set, “select distinct...from...where” creates a set. To create a list of element as output from an input set, we use “sort...in...by”.

• “Print the names and ages of employees by increasing age.”

• Result type is list(tuple(name:string, age:integer)).

sort x in (select struct(name:e.name, age:e.age()) from e in Employees)by x.age()

Page 31: CIS550 Handout 41 CIS 550 Fall 2001 Handout 4 Object Oriented Databases

CIS550 Handout 4 31

Creating sets from lists• A list can be converted to a set using

listtoset. For example, suppose we want to print each person with children represented as a set rather than a list.

• Result type is set(tuple(name:string,children:set(Person)).

select struct(name:p.name, children: listtoset(p.children))from p in Person

Page 32: CIS550 Handout 41 CIS 550 Fall 2001 Handout 4 Object Oriented Databases

CIS550 Handout 4 32

Set operations: difference

• “Print the employees who are supervised by noone.”

• Result type is set(Employee).

select e from Employees where not(e in flatten(select e1.supervises from e1 in Employees)) OR(select e in Employees) -flatten(select e.supervises from e in Employees)

Page 33: CIS550 Handout 41 CIS 550 Fall 2001 Handout 4 Object Oriented Databases

CIS550 Handout 4 33

Set operations: union• “Print the names of all people who are

either employees or consultants.”

• But we cannot do this because “Consultants” is not a named entry point into our database!

select e.name from e in Employees+ select c.name from c in Consultants

Page 34: CIS550 Handout 41 CIS 550 Fall 2001 Handout 4 Object Oriented Databases

CIS550 Handout 4 34

Casting• Casting is sometimes necessary to support type

inferencing. • For example, suppose we know that all people in

our database are either Employees or Consultants: “Print the names of all consultants.”

• Result type is set(string). The type of the expression “select (Person)e from e in Employees” is set(Person).

select c.name from c in(Persons - (select (Person)e from e in Employees))

Page 35: CIS550 Handout 41 CIS 550 Fall 2001 Handout 4 Object Oriented Databases

CIS550 Handout 4 35

Casting, cont.

• We could also have said

• Since Employee inherits Person, the type of the expression would have been the most general type (Person), and Employee would be implicitly cast to a Person.

Persons - Employees

Page 36: CIS550 Handout 41 CIS 550 Fall 2001 Handout 4 Object Oriented Databases

CIS550 Handout 4 36

Casting, cont.• “Print names and earnings of consultants.”

• Note that

is of type set(Person), and that the Person type does not have the method earnings.

select struct(name:p.name, earnings:((Consultant)p).earnings())from p in (Persons - (select (Person)e from e in Employees))

Persons - (select (Person)e from e in Employees

Page 37: CIS550 Handout 41 CIS 550 Fall 2001 Handout 4 Object Oriented Databases

CIS550 Handout 4 37

Quantifier Expressions• Expressions “exists” and “for all” also

exist in OQL. For example:• “Print the names of people who have had

at least one boy.”

• “Print the names of people who had all boys.”

select struct(name:p.name)from p in Personswhere exists c in p.children: c.male

select struct(name: p.name)from p in Personswhere forall c in p.children: c.male

Page 38: CIS550 Handout 41 CIS 550 Fall 2001 Handout 4 Object Oriented Databases

CIS550 Handout 4 38

Element• Sometimes we want to turn a collection

containing one element into the element type.

• Note that this is not always safe, i.e. testing whether a collection has a single element is not something that can be checked at compile time, so a run-time error may occur.

Page 39: CIS550 Handout 41 CIS 550 Fall 2001 Handout 4 Object Oriented Databases

CIS550 Handout 4 39

Element, an example• “For all people with single-child families, print

the person’s name and their child’s name.”

• Note that this is safe since we explicitly checked the number of children.

select struct(name:p.name, cname: element(select c.name from c in p.children))from p in Personswhere count(p.children)=1

Page 40: CIS550 Handout 41 CIS 550 Fall 2001 Handout 4 Object Oriented Databases

CIS550 Handout 4 40

Element, another example• “Print the tuple of information about John

Kramer.”• If we know “name” is a key for Person,

then the following query will be correct:

• However, we do not really have the ability to specify keys in so there is no way that this can be checked at compile time!

element(select *p from p in Persons where p.name="John Kramer")

2O

Page 41: CIS550 Handout 41 CIS 550 Fall 2001 Handout 4 Object Oriented Databases

CIS550 Handout 4 41

Summary• OQL is a language for complex object

databases. The types include:– base types- integer, string, boolean– tuple type - tuple( )– the collection types - unique set (set), set

(bag) and list – objects (class types)

• For each type, basic operations (including constructor and destructor operations) are defined.

NNlll :,...,:,: 2211

Page 42: CIS550 Handout 41 CIS 550 Fall 2001 Handout 4 Object Oriented Databases

CIS550 Handout 4 42

Summary, cont• For example,

– For objects o, we can dereference (*o) and cast ((Person)o).

– For tuples, we can project over attributes (e.g. p.name).

– For sets, we can select elements, flatten, extract an element from a singleton set, take unions (+ or union), set difference (- or except) and intersection (* or intersect).

– For lists, we can extract indexed elements and create lists by specifying an ordering on elements.