modern’ databases software

Upload: xanaoct

Post on 06-Apr-2018

217 views

Category:

Documents


0 download

TRANSCRIPT

  • 8/2/2019 Modern Databases Software

    1/34

    Modern Databases Software

    Mahesh Bane

  • 8/2/2019 Modern Databases Software

    2/34

    'Modern' Databases

    In This Lecture

    'Modern' Databases

    Distributed DBs

    Web-based DBs Object Oriented DBs

    Semistructured Data and XML

    Multimedia DBs

    For more information

    Connolly and Begg chapters 22-28

    Ullman and Widom chapter 4

  • 8/2/2019 Modern Databases Software

    3/34

    'Modern' Databases

    Other Sorts of DB

    We have lookedmainly at relationaldatabases

    Relational model

    SQL

    Design techniques

    Transactions

    Many of these topicsrelied on relationalconcepts

    There are severalother types of DB inuse today

    Distributed DBs

    Object DBs

    Multimedia DBs

    Temporal DBs

    Logic DBs

  • 8/2/2019 Modern Databases Software

    4/34

    'Modern' Databases

    Distributed Databases

    A distributed DBsystem consists ofseveral sites

    Sites are connectedby a network

    Each site can holddata and process it

    It shouldnt matterwhere the data is -the system is a singleentity

    Distributed databasemanagement system(DDBMS)

    A DBMS (or set ofthem) to control thedatabases

    Communicationsoftware to handleinteraction betweensites

  • 8/2/2019 Modern Databases Software

    5/34

    'Modern' Databases

    Client/Server Architecture

    The client/serverarchitecture is ageneral model for

    systems where aservice is provided byone system (theserver) to another

    (the client)

    Server

    Hosts the DBMS anddatabase

    Stores the data

    Client

    User programs thatuse the database

    Use the server fordatabase access

  • 8/2/2019 Modern Databases Software

    6/34

    'Modern' Databases

    Distributed Databases

    Network

    Client(s)

    Server

    Client(s)

    Server

    Client(s)

    Server

    Client(s)

    Server

    Client(s)

    Server

  • 8/2/2019 Modern Databases Software

    7/34

    'Modern' Databases

    Web-based Databases

    Database access overthe internet

    Web-based clients

    Web server

    Database server(s)

    Web server servespages to browsers

    (clients) and canaccess database(s)

    Typical operation

    Client sends a requestfor a page to the web

    server Web server sends SQL

    to database

    The web server usesresults to create page

    The page is returned tothe client

  • 8/2/2019 Modern Databases Software

    8/34

    'Modern' Databases

    Web-based Databases

    Client

    (Browser)

    Web

    Server

    Database

    Server

    HTTP request

    SQL query

    SQL result

    HTML page

  • 8/2/2019 Modern Databases Software

    9/34

    'Modern' Databases

    Web-based Databases

    Advantages

    World-wide access

    Internet protocols

    (HTTP, SSL, etc) giveuniform access andsecurity

    Database structure ishidden from clients

    Uses a familiarinterface

    Disadvantages

    Security can be aproblem if you are not

    careful Interface is less

    flexible usingstandard browsers

    Limited interactivity

    over slow connections

  • 8/2/2019 Modern Databases Software

    10/34

    'Modern' Databases

    Object Oriented Databases

    Relational DBs

    The database cantsee datas internal

    structure so cant usecomplex data

    Relational modelgives a simple, andquite powerful,

    structure - but isquite rigid

    Object Oriented DBs

    Use concepts fromobject oriented

    design/programming OO concepts

    Encapsulation

    Inheritance

    Polymoprhism

    OODBMS

  • 8/2/2019 Modern Databases Software

    11/34

    'Modern' Databases

    Object Oriented Databases

    An object orienteddatabase (OODB) is acollection of

    persistent objects Objects - instances of

    a defined class

    Persistent - object

    exist independently ofany program

    An object orientedDBMS

    Manages a collection

    of objects

    Allows objects to bemade persistent

    Permits queries to bemade of the objects

    Does all the normalDBMS things as well

  • 8/2/2019 Modern Databases Software

    12/34

    'Modern' Databases

    OODB example

    In lecture 10 we hada store with differentsorts of products

    Books

    CDs

    DVDs

    This lead to missing

    data among thevarious types

    OODB solution

    We make an abstractProduct class

    Book, CD, and DVDare each a concretesubclass of Product

    The database is apersistent collection

    of Products

  • 8/2/2019 Modern Databases Software

    13/34

    'Modern' Databases

    OODB Example

    Product is abstract

    You cannot make aProduct directly

    You can, however,make a Book, CD, orDVD, and these areProducts

    Product

    Book CD DVD

    Price

    Title

    Shipping

    Author

    Publisher

    Artist Producer

    Director

  • 8/2/2019 Modern Databases Software

    14/34

    'Modern' Databases

    Object Oriented Databases

    Advantages

    Good integration withJava, C++, etc

    Can store complexinformation

    Fast to recover wholeobjects

    Has the advantages ofthe (familiar) objectparadigm

    Disadvantages

    There is no underlyingtheory to match the

    relational model Can be more complex

    and less efficient

    OODB queries tend tobe procedural, unlike

    SQL

  • 8/2/2019 Modern Databases Software

    15/34

    'Modern' Databases

    Object Relational Databases

    Extend a RDBMS withobject concepts

    Data values can be

    objects of arbitrarycomplexity

    These objects haveinheritance etc.

    You can query the

    objects as well as thetables

    An object relationaldatabase

    Retains most of the

    structure of therelational model

    Needs extensions toquery languages (SQLor relational algebra)

  • 8/2/2019 Modern Databases Software

    16/34

    'Modern' Databases

    Semistructured data

    Semistructured Data: A new data modeldesigned to cope with problems ofinformation integration.

    XML : A standard language for describingsemistructured data schemas andrepresenting data.

  • 8/2/2019 Modern Databases Software

    17/34

    'Modern' Databases

    The Information-Integration

    Problem Related data exists in many places and could, in

    principle, work together.

    But different databases differ in:

    Model (relational, object-oriented?).

    Schema (normalised/ not normalized?).

    Terminology: are consultants employees?Retirees? Subcontractors?

    Conventions (meters versus feet?).

    How do we model information residing inheterogeneous sources (if we cannot combine it all ina single new database)?

  • 8/2/2019 Modern Databases Software

    18/34

    'Modern' Databases

    Example

    Suppose we are integrating information about bars insome town.

    Every bar has a database.

    One may use a relational DBMS; another keeps themenu in an MS-Word document.

    One stores the phones of distributors, another doesnot.

    One distinguishes ales from other beers, another

    doesnt. One counts beer inventory by bottles, another by

    cases.

  • 8/2/2019 Modern Databases Software

    19/34

    'Modern' Databases

    Semistructured Data

    Purpose: represent data from independent sourcesmore flexibly than either relational or object-orientedmodels.

    Think of objects, but with the type of each object its

    own business, not that of its class. Labels to indicate meaning of substructures.

    Data is self-describing: structural information is part ofthe data.

  • 8/2/2019 Modern Databases Software

    20/34

    'Modern' Databases

    Graphs of Semistructured

    Data Nodes = objects.

    Labels on arcs (attributes, relationships).

    Atomic values at leaf nodes (nodes with no arcs out).

    Flexibility: no restriction on:

    Labels out of a node.

    Number of successors with a given label.

  • 8/2/2019 Modern Databases Software

    21/34

    'Modern' Databases

    Example: Data Graph

    Bud

    A.B.

    Gold1995

    MapleJoes

    Mlob

    beer beerbar

    manfmanf

    servedAt

    name

    namename

    addr

    prize

    year award

    root

    The bar objectfor Joes Bar

    The beer objectfor Bud

    Notice anew kindof data.

  • 8/2/2019 Modern Databases Software

    22/34

    'Modern' Databases

    XML

    XML = Extensible Markup Language.

    While HTML uses tags for formatting (e.g., italic), XMLuses tags for semantics (e.g., this is an address).

    Key idea: create tag sets for a domain (e.g., bars), andtranslate all data into properly tagged XML documents.

    Well formed XML - XML which is syntactically correct;tags and their nesting totally arbitrary.

    Valid XML - XML which has DTD (document type

    definition); imposes some structure on the tags, butmuch more flexible than relational database schema.

  • 8/2/2019 Modern Databases Software

    23/34

    'Modern' Databases

    XML and Semistructured

    Data Well-Formed XML with nested tags is exactly the same

    idea as trees of semistructured data.

    XML also enables non-tree structures (with references

    to IDs of nodes), as does the semistructured datamodel.

  • 8/2/2019 Modern Databases Software

    24/34

    'Modern' Databases

    Example: Well-Formed XML

    Joes Bar

    Bud2.50

    Miller

    3.00

  • 8/2/2019 Modern Databases Software

    25/34

    'Modern' Databases

    Example

    The XML document is:

    Joes Bar

    Bud 2.50 Miller 3.00

    PRICE

    BAR

    BAR

    BARS

    NAME . . .

    BAR

    PRICENAME

    BEERBEER

    NAME

  • 8/2/2019 Modern Databases Software

    26/34

    'Modern' Databases

    XPATH and XQUERY

    XPATH is a language for describing paths in XMLdocuments.

    Really think of the semistructured data graph and its

    paths. Why do we need path description language: cant

    get at the data using just Relation.Attributeexpressions.

    XQUERY is a full query language for XML documents

    with power similar to OQL (Object Query Language,query language for object-oriented databases).

  • 8/2/2019 Modern Databases Software

    27/34

    'Modern' Databases

    Multimedia Databases

    Multimedia DBs canstore complexinformation

    Images

    Music and audio

    Video and animation

    Full texts of books

    Web pages

    They can be used ina wide range ofapplication areas

    Entertainment

    Marketing

    Medical imaging

    Digital publishing

    GeographicInformation Systems

  • 8/2/2019 Modern Databases Software

    28/34

    'Modern' Databases

    Querying Multimedia DBs

    Metadata searches

    Information about themultimedia data

    (metadata) is stored This can be kept in a

    standard relationaldatabase and queriednormally

    Limited by theamount of metadataavailable

    Content searches

    The multimedia datais searched directly

    Potential for muchmore flexible search

    Depends on the typeof data being used

    Often difficult to

    determine what thecorrect results are

  • 8/2/2019 Modern Databases Software

    29/34

    'Modern' Databases

    Metadata Searches

    Example - indexingfilms we might store

    Title

    Year

    Genre(s)

    Actor(s)

    Director(s)

    Producer(s)

    We can then searchfor things like

    Films starring Kevin

    Spacey Films directed by

    Peter Jackson

    Dramas produced in2000

    We dont actuallysearch the films

  • 8/2/2019 Modern Databases Software

    30/34

    'Modern' Databases

    Metadata Searches

    Advantages

    Metadata can bestructured in a

    traditional DBMS Metadata is generally

    concise and soefficient to store

    Metadata enriches the

    content

    Disadvantages

    Metadata cant alwaysbe found

    automatically, and sorequires data entry

    It restricts the sortsof queries that can bemade

  • 8/2/2019 Modern Databases Software

    31/34

    'Modern' Databases

    Content Searches

    An alternative tometadata is to searchthe content directly

    Multimedia is lessstructured thanmetadata

    It is a richer source ofinformation but

    harder to process

    Example of contentbased retrieval

    Find images similar to

    a given sample Hum a tune and find

    out what it is

    Search for features,such as cuts or

    transitions in films

  • 8/2/2019 Modern Databases Software

    32/34

    'Modern' Databases

    Content-Based Retrieval

    QBIC (Query By Image Content)

    from IBM - searches for images

    having similar colour or layout

    http://wwwqbic.almaden.ibm.com/cgi-bin/stamps-demo

  • 8/2/2019 Modern Databases Software

    33/34

    'Modern' Databases

    Content-Based Retrieval

    Image retrieval ishard

    It is often not clear

    when two images aresimilar

    Image interpretationis unsolved andexpensive

    Different peopleexpect differentthings

    Do we look for?

    Images of roses

    Images of red things?

    Images of flowers?

    Images of red flowers?

    Images of red roses?

  • 8/2/2019 Modern Databases Software

    34/34

    'Modern' Databases

    Other Topics

    Temporal DBs

    Storing data thatchanges over time

    Can ask about thehistory of the DBrather than just thecurrent state

    System time vs real

    time

    Logic DBs

    A database is a set offacts and rules for

    manipulating them(like a Prologprogram)

    The DBMS maintainsand controls these

    facts and rules A query is made by

    applying the rules tothe facts