b103 basics_architecture
DESCRIPTION
B103 Basics_ArchitectureTRANSCRIPT
Module 3: Teradata Basics
After completing this module, you will be able to:
List and describe the major components of the Teradata architecture.
Describe how the components interact to manage incoming and outgoing data
List 5 types of Teradata database objects
Teradata Storage Architecture
Notes:
The Parsing Engine dispatches request to insert a row.
The Message Passing Layer insures that a row gets to the appropriate AMP (Access Module Processor).
The AMP stores the row on its associated (logical) disk.
An AMP manages a logical or virtual disk which is mapped to multiple physical disks in a disk array.
Teradata
AMP 4AMP 3AMP 1 AMP 2
Parsing Engine(s)
Message Passing Layer
18
254 41
1290 75
80
32
667
25
Records From Client (in random sequence)
2 32 67 12 90 6 54 75 18 25 80 41
Teradata Retrieval Architecture
Notes:
The Parsing Engine dispatches a request to retrieve one or more rows.
The Message Passing Layer insures that the appropriate AMP(s) are activated.
The AMP(s) locate and retrieve desired row(s) in parallel access.
Message Passing Layer returns to retrieved rows to PE.
The PE returns row(s) to requesting client application.
Teradata
AMP 4AMP 3AMP 1 AMP 2
Parsing Engine(s)
Message Passing Layer
18
254 41
1290 75
80
32
667
25
Rows retrieved from table
2 32 67 12 90 6 54 75 18 25 80 41
Multiple Tables on Multiple AMPs
EMPLOYEE RowsDEPARTMENT RowsJOB Rows
EMPLOYEE Table DEPARTMENT Table JOB Table
Parsing Engine
AMP #1 AMP #2 AMP #3 AMP #4
Message Passing Layer
Notes:
Some rows from each table may be found on each AMP.
Each AMP may have rows from all tables.
Ideally, each AMP will hold roughly the same amount of data.
EMPLOYEE RowsDEPARTMENT RowsJOB Rows
EMPLOYEE RowsDEPARTMENT RowsJOB Rows
EMPLOYEE RowsDEPARTMENT RowsJOB Rows
Linear Growth and Expandability
AMPAMP
ParsingEngine
AMP
SESSIONS
PARALLEL PROCESSING
DATADisk
DiskDisk
ParsingEngine
ParsingEngine
Notes:
• Teradata is a linearly expandable RDBMS.
• Components may be added as requirements grow.
• Linear scalability allows for increased workload without decreased throughput.
• Performance impact of adding components is shown below.
USERS AMPs DATA PerformanceSame Same Same SameDouble Double Same SameSame Double Double SameSame Double Same Double
Teradata Objects
There are eight types of objects which may be found in a Teradata database/user.Tables – rows and columns of data Views – predefined subsets of existing tablesMacros – predefined, stored SQL statementsTriggers – SQL statements associated with a tableStored Procedures – program stored within TeradataJoin and Hash Indexes – separate index structures stored as objects within a databasePermanent Journals – table used to store before and/or after images for recovery
DEFINITIONS OF ALL DATABASE
OBJECTS
DD/D
These objects are created, maintained and deleted using Structured Query Language (SQL).
Object definitions are stored in the Data Dictionary / Directory (DD/D).
DATABASE or USER
TABLE 2 TABLE 3TABLE 1
VIEW 2 VIEW 3VIEW 1
MACRO 2 MACRO 3MACRO 1
TRIGGER 2 TRIGGER 3TRIGGER 1
Stored Procedure 1 Stored Procedure 2 Stored Procedure 2
Join/Hash Index 1 Join/Hash Index 2 Join/Hash Index 3
Permanent Journal These aren't directly accessed by users.
The Data Dictionary Directory (DD/D)
The DD/D ...
– is an integrated set of system tables
– contains definitions of and information about all objects in the system
– is entirely maintained by the RDBMS
– is “data about the data” or “metadata”
– is distributed across all AMPs like all tables
– may be queried by administrators or support staff
– is accessed via Teradata supplied views
Examples of DD/D views:
DBC.Tables - information about all tables
DBC.Users - information about all users
DBC.AllRights - information about access rights
DBC.AllSpace - information about space utilization
Structured Query Language (SQL)
SQL is a query language for Relational Database Systems.– A fourth-generation language– A set-oriented language– A non-procedural language
(e.g, doesn’t have IF, GO TO, DO, FOR NEXT, or PERFORM statements)
SQL consists of:
Data Definition Language (DDL)– Defines database structures (tables, users, views, macros, triggers, etc.)
CREATE DROP ALTER
Data Manipulation Language (DML)– Manipulates rows and data values
SELECT INSERT UPDATE DELETE
Data Control Language (DCL)– Grants and revokes access rights
GRANT REVOKE
Teradata SQL also includes Teradata Extensions to SQL
HELP SHOW EXPLAIN CREATE MACRO
CREATE TABLE – Example of DDL
CREATE TABLE Employee,FALLBACK(employee_number INTEGER NOT NULL,manager_emp_number INTEGER,dept_number SMALLINT,job_code INTEGER COMPRESS,last_name CHAR(20) NOT NULL,first_name VARCHAR (20),hire_date DATE FORMAT 'YYYY-MM-DD',birth_date DATE FORMAT 'YYYY-MM-DD',salary_amount DECIMAL (10,2)) UNIQUE PRIMARY INDEX (employee_number),INDEX (dept_number);
Other DDL Examples
CREATE INDEX (job_code) ON Employee ;
DROP INDEX (job_code) ON Employee ;
DROP TABLE Employee ;
Views
Views are pre-defined subsets of existing tables consisting of specified columns and/or rows from the table(s).
A single table view:
– is a window into an underlying table
– allows users to read and update a subset of the underlying table
– has no data of its own
MANAGEREMPLOYEE EMP DEPT JOB LAST FIRST HIRE BIRTH SALARYNUMBER NUMBER NUMBER CODE NAME NAME DATE DATE AMOUNT
1006 1019 301 312101 Stein John 861015 631015 39450001008 1019 301 312102 Kanieski Carol 870201 680517 39250001005 0801 403 431100 Ryan Loretta 861015 650910 41200001004 1003 401 412101 Johnson Darlene 861015 560423 46300001007 1005 403 432101 Villegas Arnando 870102 470131 59700001003 0801 401 411100 Trader James 860731 570619 4785000
EMPLOYEE (Table)
PK FK FK FK
EMP NO DEPT NO LAST NAME FIRST NAME HIRE DATE
1005 403 Villegas Arnando 870102 801 403 Ryan Loretta 861015
Emp_403 (View)
Multi-Table Views
A multi-table view allows users to access data from multiple tables as if it were in a single table. Multi-table views are also called join views. Join views are used for reading only, not updating.
EMPLOYEE (Table)
1006 1019 301 312101 Stein John 861015 631015 39450001008 1019 301 312102 Kanieski Carol 870201 680517 39250001005 0801 403 431100 Ryan Loretta 861015 650910 41200001004 1003 401 412101 Johnson Darlene 861015 560423 46300001007 1005 403 432101 Villegas Arnando 870102 470131 59700001003 0801 401 411100 Trader James 860731 570619 4785000
MANAGEREMPLOYEE EMP DEPT JOB LAST FIRST HIRE BIRTH SALARYNUMBER NUMBER NUMBER CODE NAME NAME DATE DATE AMOUNT
PK FK FK FK
MANAGERDEPT DEPARTMENT BUDGET EMPNUMBER NAME AMOUNT NUMBER
501 marketing sales 80050000 1017301 research and development 46560000 1019302 product planning 22600000 1016403 education 93200000 1005402 software support 30800000 1011401 customer support 98230000 1003201 technical operations 29380000 1025
PK FK
DEPARTMENT (Table)
LAST DEPARTMENT NAME NAME
Stein research & developmentKanieski research & developmentRyan educationJohnson customer supportVillegas educationTrader customer support
EmpDept (View)
"Joined Together"
SELECT – Example of DML
The SELECT statement is used to retrieve data from tables.
Who was hired on October 15, 1986?
1006 1019 301 312101 Stein John 861015 631015 39450001008 1019 301 312102 Kanieski Carol 870201 680517 39250001005 0801 403 431100 Ryan Loretta 861015 650910 41200001004 1003 401 412101 Johnson Darlene 861015 560423 46300001007 1005 403 432101 Villegas Arnando 870102 470131 59700001003 0801 401 411100 Trader James 860731 570619 4785000
EMPLOYEE (partial listing)
MANAGEREMPLOYEE EMP DEPT JOB LAST FIRST HIRE BIRTH SALARYNUMBER NUMBER NUMBER CODE NAME NAME DATE DATE AMOUNT
PK FK FK FK
SELECT Last_Name ,First_Name
FROM EmployeeWHERE Hire_Date = '1986-10-15';
Result
LASTNAMEStein RyanJohnson
FIRSTNAMEJohnLorettaDarlene
The JOIN Operation
A join operation is used when the SQL query requires information from multiple tables.
Who works in Research and Development?EMPLOYEE
1006 1019 301 312101 Stein John 861015 631015 39450001008 1019 301 312102 Kanieski Carol 870201 680517 39250001005 0801 403 431100 Ryan Loretta 861015 650910 41200001004 1003 401 412101 Johnson Darlene 861015 560423 46300001007 1005 403 432101 Villegas Arnando 870102 470131 59700001003 0801 401 411100 Trader James 860731 570619 4785000
MANAGEREMPLOYEE EMP DEPT JOB LAST FIRST HIRE BIRTH SALARYNUMBER NUMBER NUMBER CODE NAME NAME DATE DATE AMOUNT
PK FK FK FK
MANAGERDEPT DEPARTMENT BUDGET EMPNUMBER NAME AMOUNT NUMBER
501 marketing sales 80050000 1017301 research and development 46560000 1019302 product planning 22600000 1016403 education 93200000 1005402 software support 30800000 1011401 customer support 98230000 1003201 technical operations 29380000 1025
PK FK
DEPARTMENTResult
LASTNAMEStein Kanieski
FIRSTNAMEJohnCarol
Macros – Teradata SQL Extension
A MACRO is a predefined set of SQL statements which is logically stored in a database.
Macros may be created for frequently occurring queries of sets of operations.
Macros have many features and benefits:
• Simplify end-user access
• Control which operations may be performed by users
• May accept user-provided parameter values
• Are stored on the RDBMS, thus available to all clients
• Reduces query size, thus reduces LAN/channel traffic
• Are optimized at execution time
• May contain multiple SQL statements
To create a macro:
CREATE MACRO Customer_List AS (SELECT customer_name FROM Customer;);
To execute a macro:
EXEC Customer_List;
To replace a macro:
REPLACE MACRO Customer_List AS
(SELECT customer_name, customer_number FROM Customer;);
HELP Commands – Teradata SQL Extension
Databases and Users:
HELP DATABASE Customer_Service ;
HELP USER Dave_Jones ;
Tables, Views, and Macros:
HELP TABLE Employee ;
HELP VIEW Emp;
HELP MACRO Payroll_3;
HELP COLUMN Employee.*;
Employee.last_name;
Emp.* ;
Emp.last;
HELP INDEX Employee;
HELP STATISTICS Employee;
HELP CONSTRAINT Employee.over_21;
Example of HELP DATABASE
HELP DATABASE Customer_Service;
*** Help information returned. 15 rows.
*** Total elapsed time was 1 second.
Table/View/Macro name Kind CommentContact T ?Customer T ?Cust_Comp_Orders V ?Cust_Pend_Orders V ?Cust_Order_ix I ?Department T ?Employee T ?Employee_Phone T ?Job T ?Location T ?Location_Employee T ?Location_Phone T ?Orders T ?Set_Ansidate_on M ?Set_Integerdate_on M ?
Command:
SHOW Command – Teradata SQL Extension
SHOW commands display how an object was created.
Command Returns statement
SHOW TABLE tablename; CREATE TABLE statement
SHOW VIEW viewname; CREATE VIEW ...
SHOW MACRO macroname; CREATE MACRO ...
SHOW TRIGGER triggername; CREATE TRIGGER …
SHOW PROCEDURE procedurename; CREATE PROCEDURE …
SHOW TABLE Employee;
CREATE SET TABLE CUSTOMER_SERVICE.Employee, FALLBACK, NO BEFORE JOURNAL, NO AFTER JOURNAL, CHECKSUM = DEFAULT ( employee_number INTEGER, manager_employee_number INTEGER, department_number INTEGER, job_code INTEGER, : salary_amount DECIMAL(10,2) NOT NULL)UNIQUE PRIMARY INDEX ( employee_number );
EXPLAIN Facility – Teradata SQL Extension
The EXPLAIN modifier in front of any SQL statement generates an English translation of the Parser’s plan.
The request is fully parsed and optimized, but not actually executed.
EXPLAIN returns:
• Text showing how a statement will be processed (a plan)
• An estimate of how many rows will be involved
• A relative cost of the request (in units of time)
This information is useful for:
• predicting row counts
• predicting performance
• testing queries before production
• analyzing various approaches to a problem EXPLAIN
EXPLAIN SELECT last_name, department_number FROM Employee ;
Explanation (partial):
3) We do an all-AMPs RETRIEVE step from CUSTOMER_SERVICE.Employee by way of an all-rows scan with no residual conditions into Spool 1, which is built locally on the AMPs. The size of Spool 1 is estimated to be 24 rows. The estimated time for this step is 0.15 seconds.
Teradata Features Review
• Designed for decision-support and tactical queries
• Ideal for data warehouse applications
• Parallelism makes possible access to very large tables
• Performance increase is linear as components are added
• Uses standard SQL
• Runs as a “database server” to client applications
• Runs on multiple hardware platforms
• Open architecture – uses industry standard components
Win XPWin 2000
UNIXClient
MainframeClient
TeradataDATABASE
Review Questions
1. What language is used to access a Teradata table?
2. What are five Teradata database objects?
3. What are four major components of the Teradata architecture?
4. What are views?
5. What are macros?
Module 3: Review Question Answers
1. What language is used to access a Teradata table?
2. What are five Teradata database objects?
3. What are four major components of the Teradata architecture?
4. What are views?
5. What are macros?
SQL
Tables, views, macros, triggers, and stored procedures
PE, AMPs, Vdisks, and Message Passing Layer
- Subset of rows and columns or one or more tables- Virtual tables- Window into one or more tables
Predefined, stored set of SQL statements