b103 basics_architecture

21
Module 3: Teradata Basics After completing this module, you will be able to: List and describe the major components of the Teradata architecture. Describe how the components interact to manage incoming and outgoing data List 5 types of Teradata database objects

Upload: anand

Post on 23-Dec-2015

18 views

Category:

Documents


2 download

DESCRIPTION

B103 Basics_Architecture

TRANSCRIPT

Page 1: B103 Basics_Architecture

Module 3: Teradata Basics

After completing this module, you will be able to:

List and describe the major components of the Teradata architecture.

Describe how the components interact to manage incoming and outgoing data

List 5 types of Teradata database objects

Page 2: B103 Basics_Architecture

Teradata Storage Architecture

Notes:

The Parsing Engine dispatches request to insert a row.

The Message Passing Layer insures that a row gets to the appropriate AMP (Access Module Processor).

The AMP stores the row on its associated (logical) disk.

An AMP manages a logical or virtual disk which is mapped to multiple physical disks in a disk array.

Teradata

AMP 4AMP 3AMP 1 AMP 2

Parsing Engine(s)

Message Passing Layer

18

254 41

1290 75

80

32

667

25

Records From Client (in random sequence)

2 32 67 12 90 6 54 75 18 25 80 41

Page 3: B103 Basics_Architecture

Teradata Retrieval Architecture

Notes:

The Parsing Engine dispatches a request to retrieve one or more rows.

The Message Passing Layer insures that the appropriate AMP(s) are activated.

The AMP(s) locate and retrieve desired row(s) in parallel access.

Message Passing Layer returns to retrieved rows to PE.

The PE returns row(s) to requesting client application.

Teradata

AMP 4AMP 3AMP 1 AMP 2

Parsing Engine(s)

Message Passing Layer

18

254 41

1290 75

80

32

667

25

Rows retrieved from table

2 32 67 12 90 6 54 75 18 25 80 41

Page 4: B103 Basics_Architecture

Multiple Tables on Multiple AMPs

EMPLOYEE RowsDEPARTMENT RowsJOB Rows

EMPLOYEE Table DEPARTMENT Table JOB Table

Parsing Engine

AMP #1 AMP #2 AMP #3 AMP #4

Message Passing Layer

Notes:

Some rows from each table may be found on each AMP.

Each AMP may have rows from all tables.

Ideally, each AMP will hold roughly the same amount of data.

EMPLOYEE RowsDEPARTMENT RowsJOB Rows

EMPLOYEE RowsDEPARTMENT RowsJOB Rows

EMPLOYEE RowsDEPARTMENT RowsJOB Rows

Page 5: B103 Basics_Architecture

Linear Growth and Expandability

AMPAMP

ParsingEngine

AMP

SESSIONS

PARALLEL PROCESSING

DATADisk

DiskDisk

ParsingEngine

ParsingEngine

Notes:

• Teradata is a linearly expandable RDBMS.

• Components may be added as requirements grow.

• Linear scalability allows for increased workload without decreased throughput.

• Performance impact of adding components is shown below.

USERS AMPs DATA PerformanceSame Same Same SameDouble Double Same SameSame Double Double SameSame Double Same Double

Page 6: B103 Basics_Architecture

Teradata Objects

There are eight types of objects which may be found in a Teradata database/user.Tables – rows and columns of data Views – predefined subsets of existing tablesMacros – predefined, stored SQL statementsTriggers – SQL statements associated with a tableStored Procedures – program stored within TeradataJoin and Hash Indexes – separate index structures stored as objects within a databasePermanent Journals – table used to store before and/or after images for recovery

DEFINITIONS OF ALL DATABASE

OBJECTS

DD/D

These objects are created, maintained and deleted using Structured Query Language (SQL).

Object definitions are stored in the Data Dictionary / Directory (DD/D).

DATABASE or USER

TABLE 2 TABLE 3TABLE 1

VIEW 2 VIEW 3VIEW 1

MACRO 2 MACRO 3MACRO 1

TRIGGER 2 TRIGGER 3TRIGGER 1

Stored Procedure 1 Stored Procedure 2 Stored Procedure 2

Join/Hash Index 1 Join/Hash Index 2 Join/Hash Index 3

Permanent Journal These aren't directly accessed by users.

Page 7: B103 Basics_Architecture

The Data Dictionary Directory (DD/D)

The DD/D ...

– is an integrated set of system tables

– contains definitions of and information about all objects in the system

– is entirely maintained by the RDBMS

– is “data about the data” or “metadata”

– is distributed across all AMPs like all tables

– may be queried by administrators or support staff

– is accessed via Teradata supplied views

Examples of DD/D views:

DBC.Tables - information about all tables

DBC.Users - information about all users

DBC.AllRights - information about access rights

DBC.AllSpace - information about space utilization

Page 8: B103 Basics_Architecture

Structured Query Language (SQL)

SQL is a query language for Relational Database Systems.– A fourth-generation language– A set-oriented language– A non-procedural language

(e.g, doesn’t have IF, GO TO, DO, FOR NEXT, or PERFORM statements)

SQL consists of:

Data Definition Language (DDL)– Defines database structures (tables, users, views, macros, triggers, etc.)

CREATE DROP ALTER

Data Manipulation Language (DML)– Manipulates rows and data values

SELECT INSERT UPDATE DELETE

Data Control Language (DCL)– Grants and revokes access rights

GRANT REVOKE

Teradata SQL also includes Teradata Extensions to SQL

HELP SHOW EXPLAIN CREATE MACRO

Page 9: B103 Basics_Architecture

CREATE TABLE – Example of DDL

CREATE TABLE Employee,FALLBACK(employee_number INTEGER NOT NULL,manager_emp_number INTEGER,dept_number SMALLINT,job_code INTEGER COMPRESS,last_name CHAR(20) NOT NULL,first_name VARCHAR (20),hire_date DATE FORMAT 'YYYY-MM-DD',birth_date DATE FORMAT 'YYYY-MM-DD',salary_amount DECIMAL (10,2)) UNIQUE PRIMARY INDEX (employee_number),INDEX (dept_number);

Other DDL Examples

CREATE INDEX (job_code) ON Employee ;

DROP INDEX (job_code) ON Employee ;

DROP TABLE Employee ;

Page 10: B103 Basics_Architecture

Views

Views are pre-defined subsets of existing tables consisting of specified columns and/or rows from the table(s).

A single table view:

– is a window into an underlying table

– allows users to read and update a subset of the underlying table

– has no data of its own

MANAGEREMPLOYEE EMP DEPT JOB LAST FIRST HIRE BIRTH SALARYNUMBER NUMBER NUMBER CODE NAME NAME DATE DATE AMOUNT

1006 1019 301 312101 Stein John 861015 631015 39450001008 1019 301 312102 Kanieski Carol 870201 680517 39250001005 0801 403 431100 Ryan Loretta 861015 650910 41200001004 1003 401 412101 Johnson Darlene 861015 560423 46300001007 1005 403 432101 Villegas Arnando 870102 470131 59700001003 0801 401 411100 Trader James 860731 570619 4785000

EMPLOYEE (Table)

PK FK FK FK

EMP NO DEPT NO LAST NAME FIRST NAME HIRE DATE

1005 403 Villegas Arnando 870102 801 403 Ryan Loretta 861015

Emp_403 (View)

Page 11: B103 Basics_Architecture

Multi-Table Views

A multi-table view allows users to access data from multiple tables as if it were in a single table. Multi-table views are also called join views. Join views are used for reading only, not updating.

EMPLOYEE (Table)

1006 1019 301 312101 Stein John 861015 631015 39450001008 1019 301 312102 Kanieski Carol 870201 680517 39250001005 0801 403 431100 Ryan Loretta 861015 650910 41200001004 1003 401 412101 Johnson Darlene 861015 560423 46300001007 1005 403 432101 Villegas Arnando 870102 470131 59700001003 0801 401 411100 Trader James 860731 570619 4785000

MANAGEREMPLOYEE EMP DEPT JOB LAST FIRST HIRE BIRTH SALARYNUMBER NUMBER NUMBER CODE NAME NAME DATE DATE AMOUNT

PK FK FK FK

MANAGERDEPT DEPARTMENT BUDGET EMPNUMBER NAME AMOUNT NUMBER

501 marketing sales 80050000 1017301 research and development 46560000 1019302 product planning 22600000 1016403 education 93200000 1005402 software support 30800000 1011401 customer support 98230000 1003201 technical operations 29380000 1025

PK FK

DEPARTMENT (Table)

LAST DEPARTMENT NAME NAME

Stein research & developmentKanieski research & developmentRyan educationJohnson customer supportVillegas educationTrader customer support

EmpDept (View)

"Joined Together"

Page 12: B103 Basics_Architecture

SELECT – Example of DML

The SELECT statement is used to retrieve data from tables.

Who was hired on October 15, 1986?

1006 1019 301 312101 Stein John 861015 631015 39450001008 1019 301 312102 Kanieski Carol 870201 680517 39250001005 0801 403 431100 Ryan Loretta 861015 650910 41200001004 1003 401 412101 Johnson Darlene 861015 560423 46300001007 1005 403 432101 Villegas Arnando 870102 470131 59700001003 0801 401 411100 Trader James 860731 570619 4785000

EMPLOYEE (partial listing)

MANAGEREMPLOYEE EMP DEPT JOB LAST FIRST HIRE BIRTH SALARYNUMBER NUMBER NUMBER CODE NAME NAME DATE DATE AMOUNT

PK FK FK FK

SELECT Last_Name ,First_Name

FROM EmployeeWHERE Hire_Date = '1986-10-15';

Result

LASTNAMEStein RyanJohnson

FIRSTNAMEJohnLorettaDarlene

Page 13: B103 Basics_Architecture

The JOIN Operation

A join operation is used when the SQL query requires information from multiple tables.

Who works in Research and Development?EMPLOYEE

1006 1019 301 312101 Stein John 861015 631015 39450001008 1019 301 312102 Kanieski Carol 870201 680517 39250001005 0801 403 431100 Ryan Loretta 861015 650910 41200001004 1003 401 412101 Johnson Darlene 861015 560423 46300001007 1005 403 432101 Villegas Arnando 870102 470131 59700001003 0801 401 411100 Trader James 860731 570619 4785000

MANAGEREMPLOYEE EMP DEPT JOB LAST FIRST HIRE BIRTH SALARYNUMBER NUMBER NUMBER CODE NAME NAME DATE DATE AMOUNT

PK FK FK FK

MANAGERDEPT DEPARTMENT BUDGET EMPNUMBER NAME AMOUNT NUMBER

501 marketing sales 80050000 1017301 research and development 46560000 1019302 product planning 22600000 1016403 education 93200000 1005402 software support 30800000 1011401 customer support 98230000 1003201 technical operations 29380000 1025

PK FK

DEPARTMENTResult

LASTNAMEStein Kanieski

FIRSTNAMEJohnCarol

Page 14: B103 Basics_Architecture

Macros – Teradata SQL Extension

A MACRO is a predefined set of SQL statements which is logically stored in a database.

Macros may be created for frequently occurring queries of sets of operations.

Macros have many features and benefits:

• Simplify end-user access

• Control which operations may be performed by users

• May accept user-provided parameter values

• Are stored on the RDBMS, thus available to all clients

• Reduces query size, thus reduces LAN/channel traffic

• Are optimized at execution time

• May contain multiple SQL statements

To create a macro:

CREATE MACRO Customer_List AS (SELECT customer_name FROM Customer;);

To execute a macro:

EXEC Customer_List;

To replace a macro:

REPLACE MACRO Customer_List AS

(SELECT customer_name, customer_number FROM Customer;);

Page 15: B103 Basics_Architecture

HELP Commands – Teradata SQL Extension

Databases and Users:

HELP DATABASE Customer_Service ;

HELP USER Dave_Jones ;

Tables, Views, and Macros:

HELP TABLE Employee ;

HELP VIEW Emp;

HELP MACRO Payroll_3;

HELP COLUMN Employee.*;

Employee.last_name;

Emp.* ;

Emp.last;

HELP INDEX Employee;

HELP STATISTICS Employee;

HELP CONSTRAINT Employee.over_21;

Page 16: B103 Basics_Architecture

Example of HELP DATABASE

HELP DATABASE Customer_Service;

*** Help information returned. 15 rows.

*** Total elapsed time was 1 second.

Table/View/Macro name Kind CommentContact T ?Customer T ?Cust_Comp_Orders V ?Cust_Pend_Orders V ?Cust_Order_ix I ?Department T ?Employee T ?Employee_Phone T ?Job T ?Location T ?Location_Employee T ?Location_Phone T ?Orders T ?Set_Ansidate_on M ?Set_Integerdate_on M ?

Command:

Page 17: B103 Basics_Architecture

SHOW Command – Teradata SQL Extension

SHOW commands display how an object was created.

Command Returns statement

SHOW TABLE tablename; CREATE TABLE statement

SHOW VIEW viewname; CREATE VIEW ...

SHOW MACRO macroname; CREATE MACRO ...

SHOW TRIGGER triggername; CREATE TRIGGER …

SHOW PROCEDURE procedurename; CREATE PROCEDURE …

SHOW TABLE Employee;

CREATE SET TABLE CUSTOMER_SERVICE.Employee, FALLBACK, NO BEFORE JOURNAL, NO AFTER JOURNAL, CHECKSUM = DEFAULT ( employee_number INTEGER, manager_employee_number INTEGER, department_number INTEGER, job_code INTEGER, : salary_amount DECIMAL(10,2) NOT NULL)UNIQUE PRIMARY INDEX ( employee_number );

Page 18: B103 Basics_Architecture

EXPLAIN Facility – Teradata SQL Extension

The EXPLAIN modifier in front of any SQL statement generates an English translation of the Parser’s plan.

The request is fully parsed and optimized, but not actually executed.

EXPLAIN returns:

• Text showing how a statement will be processed (a plan)

• An estimate of how many rows will be involved

• A relative cost of the request (in units of time)

This information is useful for:

• predicting row counts

• predicting performance

• testing queries before production

• analyzing various approaches to a problem EXPLAIN

EXPLAIN SELECT last_name, department_number FROM Employee ;

Explanation (partial):

3) We do an all-AMPs RETRIEVE step from CUSTOMER_SERVICE.Employee by way of an all-rows scan with no residual conditions into Spool 1, which is built locally on the AMPs. The size of Spool 1 is estimated to be 24 rows. The estimated time for this step is 0.15 seconds.

Page 19: B103 Basics_Architecture

Teradata Features Review

• Designed for decision-support and tactical queries

• Ideal for data warehouse applications

• Parallelism makes possible access to very large tables

• Performance increase is linear as components are added

• Uses standard SQL

• Runs as a “database server” to client applications

• Runs on multiple hardware platforms

• Open architecture – uses industry standard components

Win XPWin 2000

UNIXClient

MainframeClient

TeradataDATABASE

Page 20: B103 Basics_Architecture

Review Questions

1. What language is used to access a Teradata table?

2. What are five Teradata database objects?

3. What are four major components of the Teradata architecture?

4. What are views?

5. What are macros?

Page 21: B103 Basics_Architecture

Module 3: Review Question Answers

1. What language is used to access a Teradata table?

2. What are five Teradata database objects?

3. What are four major components of the Teradata architecture?

4. What are views?

5. What are macros?

SQL

Tables, views, macros, triggers, and stored procedures

PE, AMPs, Vdisks, and Message Passing Layer

- Subset of rows and columns or one or more tables- Virtual tables- Window into one or more tables

Predefined, stored set of SQL statements