b210 sizing
Post on 02-Apr-2018
232 Views
Preview:
TRANSCRIPT
-
7/27/2019 B210 Sizing
1/51
Module 10: Sizing
After completing this module, you will be able to:
Determine column sizing requirements based on chosen
data type.
Determine physical table row size.
Determine table sizing requirements via estimates and
empirical evidence.
Determine index sizing requirements via estimates andempirical evidence.
-
7/27/2019 B210 Sizing
2/51
Row Format
Only if Variable Length
Columns are declared.
R
o
w
Le
n
g
t
h
Row ID
Row
HashUniq.
Value
S
pa
r
e
P
r
e
s
e
nc
e
B
i
t
s
VAR
CHAR
Offsets
Fixed
Length
Columns
Uncom-
pressed
Columns
VAR
CHAR
Columns
R
e
f.
A
rr
a
y
P
t
r.
2 4 4 1 2 2 2 2 0-n Bytes 2
1 - n Bytes Data for Compressible
Columns that are neither
Compressed nor NULL.
Part.
#
Only for a V2R5 Table
with a PPI.
SELECT * returns columns in the order declared in the CREATE TABLE.
-
7/27/2019 B210 Sizing
3/51
-
7/27/2019 B210 Sizing
4/51
NULL and COMPRESS
The default for all columns is NULLable and NOT compressible.
NULL column values are NOT automatically compressed.
COMPRESS compresses NULL column values to zero space.
COMPRESS compresses NULL and occurrences of to
zero space.
CREATE TABLE Employee
( emp_num INTEGER NOT NULL,dept_num INTEGER COMPRESS,
country CHAR(20) COMPRESS 'Australia',:
COMPRESS all fixed length columns where at least 10% of the rows
participate. COMPRESS creates smaller rows, therefore more rows/block and fewer
blocks.
You cannot use the COMPRESS option with:
Primary Index
Referenced and referencing data columns (PK-FK)
VARCHAR, VARBYTE, and VARGRAPHIC data types CHAR, BYTE, and GRAPHIC data types > 255 characters
How many bits are allocated?
01
2
-
7/27/2019 B210 Sizing
5/51
NOT NULL Clause
In relational systems, NULL means missing or unknown.
The default for all columns is NULLable.
Specify NOT NULL on the CREATE TABLE as appropriate:
All PRIMARY KEY columns
All UNIQUE columns
All columns with NN (No NULLS) constraints
Any index may contain NULLs unless explicitly prohibited.
NOT NULL permits storage of zero and blank values.
-
7/27/2019 B210 Sizing
6/51
Multi-Value Compression (V2R5 Feature)
What is Multi-Value Compression?
A V2R5 enhancement that allows up to 255 distinct values (plus NULL)to be
compressed per fixed width column a.k.a., Value List Compression.
Enhances system cost and performance of high data volume applications.
Provides significant performance improvement for general ad-hocworkloads and full-table scan applications.
Reduces storage cost by storing more logical data per unit of physical
capacity. Performance is improved because there is less physical data to
retrieve during scan-oriented queries.
Best candidate for compression - fixed width column with a small number
of frequently occurring values in large table used frequently for FTSs.
Compression in V2R4 provided up to 25% capacity savings and scan
performance increases up to 35%.
V2R5 Compression is expected to yield even greater savings.
-
7/27/2019 B210 Sizing
7/51
Rules for Compression
Up to 255 values (plus NULL) can be compressed per column.
The maximum size of a compress value remains 255 bytes.
Only fixed-width columns can be compressed.
Primary index columns cannot be compressed.
ALTER TABLE does not support compression on existing columns.
You can not add a compress value to a column if doing so would cause
the table header row to exceed its maximum length (64 KB).
Number of Presence Bits needed for Multi-Value Compression:
Compressed Values # of Bits
1 12 - 3 2
4 - 7 3
8 - 15 4
16 - 31 5
32 - 63 6
64 - 127 7
128 - 255 8
Note:
If column is "nullable",
there will be 1 additional
presence bit.
-
7/27/2019 B210 Sizing
8/51
Implementing Multi-Value Compression
ALTER TABLE accepts a list of values
for field attribute 'COMPRESS'.
Add a Education column.
ALTER TABLE People
ADD Education CHAR(10)COMPRESS
('ELEMENTARY',
'MIDDLE',
'HIGH',
'COLLEGE',
'POST GRAD') ;
CREATE TABLE accepts a list of values
for field attribute 'COMPRESS'.Compress the top 15 most populated
countries. Compress the Sex field.
CREATE TABLE People
( Name VARCHAR(100),Address VARCHAR(100),
Country CHAR(100) COMPRESS
( 'Australia', 'Bangladesh',
'Brazil', 'China', 'England',
'France', 'Germany', 'India',
'Indonesia', 'Japan', 'Mexico',
'Nigeria', 'Pakistan',
'Russian Federation',
'United States of America'
),
Sex CHAR(1) COMPRESS ('F', 'M')
);
-
7/27/2019 B210 Sizing
9/51
-
7/27/2019 B210 Sizing
10/51
V2R5 Compression
Compression is not supported on the following data types:
INTERVAL TIME TIMESTAMP
VARCHAR VARBYTE VARGRAPHIC
Compression is supported on the following data types:
DATE (4) CHAR (N) (N
-
7/27/2019 B210 Sizing
11/51
ANSI and Teradata Data Types
Teradata ANSI (Core) equivalent
INTEGER INTEGER
SMALLINT SMALLINT
BYTEINT
DATE
TIME, TIME WITH TIME ZONE
TIMESTAMP, TIMESTAMP WITH TIME ZONE
FLOAT FLOAT
FLOAT REAL
FLOAT DOUBLE PRECISION
DECIMAL(n,m) DECIMAL(n,m)
DECIMAL(n,m) NUMERIC(n,m)
CHAR(n) CHAR(n)
VARCHAR(n), CHAR VARYING(n)
LONG VARCHAR
BYTE(n), VARBYTE(n)
GRAPHIC(n)VARGRAPHIC(n), LONG VARGRAPHIC
-
7/27/2019 B210 Sizing
12/51
Data Types
-128 to +127 Non-ANSI
S
I
G
N
BYTEINT
-32,768 to +32,767SMALLINT
S
I
G
N
-2,147,483,648 to +2,147,483,647INTEGER
S
I
G
N
-
7/27/2019 B210 Sizing
13/51
Data Types (Date and Time)
((YYYY - 1900)) * 10000 + (MM * 100) + DDDATE (4 Bytes)
TIME (6 Bytes) hh:mm:ss.ssssss
TIME WITH
TIME ZONE (8 Bytes) hh:mm:ss.ssssss +/- hh.mm
TIMESTAMP (10 Bytes) Date + Time
TIMESTAMP WITH
TIME ZONE (12 Bytes)
Date + Time + Zone
DATE, TIME, and TIMESTAMP are also SQL functions.
CURRENT_DATE, CURRENT_TIME, and CURRENT_TIMESTAMP represent
values.
-
7/27/2019 B210 Sizing
14/51
Data Types (Decimal, Numeric, Float, Real,
Double Precision)
Decimal and Numeric
DECIMAL [ ( n [ , m ] ) ]
n = 1 - 18
m = 0 - n
Default = (5, 0)
Stored in scaled binary
Number of Number of
Digits Bytes
1 to 2 1 byte
3 to 4 2 bytes
5 to 9 4 bytes
10 to 18 8 bytes
FLOAT, REAL, and DOUBLE PRECISION
Notes:
Range is 2 * 10 -307 to 2 * 10 +308
15 significant decimal digit precision.
Manipulated in IEEE floating point format.
Corresponds to, but is not identical to, IBM
normalized 64 bit floating point.
S
I
G
N
Exponent
11 Bits
Fraction/Mantissa
52 Bits
8 Bytes
-
7/27/2019 B210 Sizing
15/51
-
7/27/2019 B210 Sizing
16/51
Data Types (BYTE and VARBYTE)
BYTE
Stored in host format.
Never translated by the Teradata Database
Handled as if they were n-byte, unsigned binary
integers.
Suitable for digitized image information
VARBYTE ( n ) n = 1 - 64000
Variable length binary string
2 byte column offset identifies location in row
2
BYTE ( n ) n = 1 - 64000
Fixed length binary string
-
7/27/2019 B210 Sizing
17/51
Data Types (GRAPHIC, VARGRAPHIC,
LONG VARGRAPHIC)
2 bytes per character.
Used for double-byte Kanji and Hiragana, and
Chinese double-byte Hanzi values.
VARGRAPHIC
GRAPHIC ( n ) n = 1 - 32000Fixed length multi-byte character string;
n is the length in logical characters.
VARGRAPHIC ( n ) n = 1 - 32000
Variable length multi-byte character
string; n is the length in logicalcharacters.
2 byte column offset identifies location in row
2
2LONG VARGRAPHIC
Equivalent to VARGRAPHIC (32000)
2 byte column offset identifies location in row
-
7/27/2019 B210 Sizing
18/51
Variable Column Offsets
50 75 75 100 c1 data c3 data
c1 c2 c3 x 5 7 10 5 0
0
Offset Array Variable Length Data
Offset values tell the starting location of a variable column.
Determine the column length by subtracting its starting location
from the next columns starting location.
The definition of variable length columns requires one additional 2-byte offset that locates the end of the final variable column.
-
7/27/2019 B210 Sizing
19/51
Sizing Considerations
Compress only columns where at least 10% to 20% of the rows participate.
COMPRESS will create smaller rows, and smaller rows are generally more
efficient.
Compress columns whose NULL values are not subject to changes.
Compression saves space but costs computational overhead.
Adding a column that is not compressible expands all rows.
Adding a column that is compressible and there are no spare presence bits
expands all rows.
Dropping a column changes all row sizes where data is present.
Use VARCHAR whenever the space savings offset the overhead.
-
7/27/2019 B210 Sizing
20/51
-
7/27/2019 B210 Sizing
21/51
Example: Sizing a Row
SUPV JOB LAST FIRST HIRE BIRTH SALARY
EMP # EMP # DEPT # CODE NAME NAME DATE DATE AMOUNT
PK,SA FK FK FK NN NN NN NN NN
INT INT INT SMALL Char Char DATE DATE DECINT Fix 20 Var 30 (10,2)
EMPLOYEE
Using this logical row layout, the next page will size a typical row of the Employee table.
-
7/27/2019 B210 Sizing
22/51
Example: Completing the Row Size Calculation Form
Data Type # of Columns Size TOTAL
BYTEINT * 1 =SMALLINT 1 * 2 = 2
INTEGER 3 * 4 = 12
DATE 2 * 4 = 8
TIME(ANSI) * 6 =
TIME with ZONE * 8 =
TIMESTAMP * 10 =
TIMESTAMP/ ZONE * 12 =DECIMAL 1-2 * 1 =
3-4 * 2 =
5-9 * 4 =
10-18 1 * 8 = 8
FLOAT * 8 =
Fixed 1 SUM(n) = 20
Variable 1 SUM(a) = 14LOGICAL SIZE = 64
Overhead = 14
Partitioned Primary Index Overhead (2) = 0
Variable Column Offsets(_1_ * 2 ) + 2 ; zero if no variable columns = 4
___0__ Bits for Compressible Columns
___3__ Nullable Columns
___3__ / 8 (Quotient only) = 0PHYSICAL ROW SIZE = 82
Table Name EMPLOYEE
Column Name Type Max Average
First Name CV 30 14
SUM(a) = 14
Variable Column Data Detail
SUM(a) = SUM of the AVERAGE number of bytes
expected for the variable column.
SUM(n) = SUM of the CHAR and GRAPHIC column
bytes.
** For V2, round up to an even number of bytes.
-
7/27/2019 B210 Sizing
23/51
Row Size Calculation Form
Data Type # of Columns Size TOTAL
BYTEINT * 1 =SMALLINT * 2 =
INTEGER * 4 =
DATE * 4 =
TIME(ANSI) * 6 =
TIME with ZONE * 8 =
TIMESTAMP * 10 =
TIMESTAMP/ ZONE * 12 =DECIMAL 1-2 * 1 =
3-4 * 2 =
5-9 * 4 =
10-18 * 8 =
FLOAT * 8 =
Fixed SUM(n) =
Variable SUM(a) =
LOGICAL SIZE =
Overhead = 14
Partitioned Primary Index Overhead (2) =
Variable Column Offsets(__ * 2 ) + 2 ; zero if no variable columns =
_____ Bits for Compressible Columns
_____ Nullable Columns
_____ / 8 (Quotient only) =PHYSICAL ROW SIZE =
Table Name ____________________
Column Name Type Max Average
SUM(a) =
Variable Column Data Detail
SUM(a) = SUM of the AVERAGE number of bytes
expected for the variable column.
SUM(n) = SUM of the CHAR and GRAPHIC column
bytes.
** For V2, round up to an even number of bytes.
-
7/27/2019 B210 Sizing
24/51
Row Size Exercise
CALL
CALL
NUMBER
PK, SA
TAKEN BY
EMPLOYEE
NUMBER
CUSTOMER
NUMBER
FK, NN FK
PLACED BY
CONTACT
NUMBER
FK
PLACED BY
EMPLOYEE
NUMBER
FK
ORIGINAL
CALL
NUMBER
FK
UPI NUSI
CALL
DATE
NN
NUSI
SYSTEM
CATEGORY
PART
NUMBER
CALL
TIME
NN
CALL
STATUS
CODE
FK, NN
CALL
TYPE
CODE
FK, NN
CALL
PRIORITY
CODE
FK, NN
NUSINUSI
FK FK
AREA
CODE PHONE EXTENSION
100 100
DOMAIN
NAME
Call_Number
Employee_Number
Customer_Number
Contact_Number
Date
Time
Call_Status_Code
Call_Type_Code
Call_Priority_Code
Area_Code
Phone
Extension
System_Number
Part_Category
MAX
BYTES
4
4
4
4
4
6
2
2
2
2
4
4
4
4
DATA
TYPE
INT
INT
INT
INT
D
TIME
CF
SMALL INT
INT
INT
INT
INT
SMALL INT
SMALL INT
Example ofConstraint Number
This table will be partitioned via RANGE_N on Call_Date with Monthly intervals.
-
7/27/2019 B210 Sizing
25/51
Row Size Calculation Form
Data Type # of Columns Size TOTAL
BYTEINT * 1 =SMALLINT * 2 =
INTEGER * 4 =
DATE * 4 =
TIME(ANSI) * 6 =
TIME with ZONE * 8 =
TIMESTAMP * 10 =
TIMESTAMP/ ZONE * 12 =DECIMAL 1-2 * 1 =
3-4 * 2 =
5-9 * 4 =
10-18 * 8 =
FLOAT * 8 =
Fixed SUM(n) =
Variable SUM(a) =
LOGICAL SIZE =
Overhead = 14
Partitioned Primary Index Overhead (2) =
Variable Column Offsets(___ * 2 ) + 2 ; zero if no variable columns =
______ Bits for Compressible Columns
______ Nullable Columns
______ / 8 (Quotient only) =PHYSICAL ROW SIZE =
Table Name CALL
Column Name Type Max Average
SUM(a) =
Variable Column Data Detail
SUM(a) = SUM of the AVERAGE number of bytes
expected for the variable column.
SUM(n) = SUM of the CHAR and GRAPHIC column
bytes.
** For V2, round up to an even number of bytes.
3 6
10 40
1 4
1 6
1 2
0
58
2
9
9 +175 76
-
7/27/2019 B210 Sizing
26/51
-
7/27/2019 B210 Sizing
27/51
Table Headers
One row per AMP per Table.
Table headers are a separate subtable.
Minimum table header block size is 512
bytes (1 sector) per AMP.
Typically, a table header will be at least
1024 bytes. For example:
Tables with 4 or more columns
Tables with 2 columns and a NUSI
Compressed values are maintained
within the table header.
Multiple values that are compressedfor a column are maintained in an
array in the table header.
Maximum value of approximately 64K.
The base table header covers all of its
secondary index subtables.
STANDARD ROW HEADERLENGTH, ROW ID, PRESENCE/SPARE BYTES
FIELD 2 OFFSETFIELD 3 OFFSET
FIELD 4 OFFSET
FIELD 5 OFFSET
FIELD 6 OFFSET
FIELD 7 OFFSET
FIELD 8 OFFSET
FIELD 9 OFFSET
EXTRA OFFSET
DATABASE AND TABLENAMES
DATABASE ID
OTHER INTERNAL INFOCREATION DATE, PROTECTION, TYPE OFJOURNALING, JOURNAL ID, STRUCTVERSION, etc.
FIELD
1
INDEX DESCRIPTORS(36 BYTES * # INDEXES) PLUS 20 BYTESPER INDEX COLUMN
F2
ALWAYS NULL 3
FASTLOAD & RESTOREINFORMATION
USUALLY NULL
F4
BASE COLUMN INFOCOUNT OF COLUMNS, LOCATION OF FIRSTFIXED FIELD, NUMBER OF PRESENCE BITS,etc. 24 BYTES
COLUMN INFORMATION FOREACH COLUMN
20 BYTES PER COLUMN(+ COMPRESS VALUE)DATA TYPE, OFFSET WITHIN ROWNULLIBLE/NOT NULLIBLE, COMPRESS/NOCOMPRESS, PRESENCE BIT LOCATION, etc.
FIEL
D
5
RESTARTABLE SORTINFORMATION
USUALLY NULL
F
6
ALWAYS NULL 7
ALWAYS NULL 8
ALWAYS NULL 9
ROW LENGTH orREF. ARRAY POINTER
V2R4 Example of Table Header
-
7/27/2019 B210 Sizing
28/51
Sizing a Data Table
Block sizes vary within a table, so compute a range.
Typically, the maximum block size is 63.5 KB bytes, and a typical block size is 48 KB.
Formula:
(BlockSize - 38) / RowSize = RowsPerBlock (rounded down)
RowCount / RowsPerBlock = Blocks (rounded up)
NumAmps * 1024 = Header(Blocks * BlockSize) + Header = NO FALLBACK (BlockSize = Typical Block Size)
(Blocks * BlockSize) * 2 + Header = FALLBACK (BlockSize = Typical Block Size)
Parameters:
38 = Block Header + Block Trailer BlockSize = Typical block size in bytes
1024 = Typical table header size NumAmps = Number of AMPs in the systemRowCount = Number of table rows expected
RowSize = Physical row size
Note:
For large tables, table headers and block overhead (38 bytes) add a minimal
amount of size to the table. Therefore, multiply row size by number of rows
and double for Fallback.
-
7/27/2019 B210 Sizing
29/51
Table Sizing Exercise
Given this data, estimate the size of a table with Fallback and a typical block size of 48K.
BlockSize = 49,152 bytes (48K) NumAmps = 20
RowCount = 501,000,000
RowSize = 98 bytes (includes overhead)
Formula: (BlockSize - 38) / RowSize = RowsPerBlock (round down)
RowCount / RowsPerBlock = Blocks (round up)
NumAmps * 1024 = Header
(Blocks * BlockSize) + Header = No Fallback
(Blocks * BlockSize) * 2 + Header = Fallback
Calculation:
(49,152 - 38) / 98 = 501 rows per block501,000,000 / 501 = 1,000,000 blocks
20 * 1024 = 20,480 for table headers
(1,000,000 * 49,152) + 20,480 = 49,152,020,480 (No Fallback)
(1,000,000 * 49,152) * 2 + 20,480 = 98,304,020,480 (Fallback)
An easier way to estimate this table size:501,000,000 x 98 bytes x 2 (Fallback) = 98,196,000,000
-
7/27/2019 B210 Sizing
30/51
Estimating the Size of a USI Subtable
Spare &
Presence
Row
Length
Row ID of USI
Row
Hash
Uniq.
Value
Secondary
IndexValue
Base Table RowIdentifier Ref.
ArrayPointer
2 4 4 2 7 Variable 2 (opt.) 4 4 2Bytes
Row Offsets
& Misc. (>=7)
Row
Hash
Uniq.
ValuePart.
#
There is one Index row for each Base Table row.USI subtable row size = (Index value size + 29 or 31)
Where 31 = 4 (Row Header and Row Ref. Array pointer)
+ 8 (This row's Row ID)
+ 9 (Spare, presence, and offset bytes)
+ 2 (Optional Partition # for PPI tables)
+ 8 (Base table Row ID)
To estimate the amount of space needed for a USI subtable, you can use the
following formulas.
For tables with NPPI, USI Subtable Size = (Row count) * (index value size + 29)
For tables with PPI, USI Subtable Size = (Row count) * (index value size + 31)
Note: Double this figure for Fallback.
-
7/27/2019 B210 Sizing
31/51
Estimating the Size of a NUSI Subtable
Row Offsets
& Misc. (>=7)
Spare &
Presence
Row
Length
Row ID of NUSI
RowHash
Uniq.Value
Secondary
Index
Value
Ref.
Array
Pointer
2 4 4 2 7 Variable 8/10 8/10 2Bytes
Table Row ID List
P RH U
2 4 4
P RH U
2 4 4
There is at least one index row per AMP for each distinct index value that is in the base table
on that AMP.
To estimate the size of a NUSI subtable,
Size = (Row count) * 8 (or 10 for PPI tables)
+ ( (#distinct values) * (Index value size + 21) * MIN ( #AMPs, Rows per value ) )
MIN( ___ , ___ ) use the smaller of the two values.
Double this figure for Fallback.
More AMPs than typical rows/value:
(10 AMPS, 5 rows/value)
NOT Every AMP has every value.
NOT Every AMP has a subtable row for every value.
More strongly selective. Fewer rows returned from an equality search.
Example:
More typical rows/value than AMPS:
(50 rows/value, 10 AMPS)
Every AMP probably has every value.
Every AMP has a subtable row for every value.
More weakly selective. More rows returned from an equality search.
-
7/27/2019 B210 Sizing
32/51
Estimating the Size
of a Reference Index Subtable
Row OffsetsOverhead
Row
Length
Row ID of RI
Row
Hash
Uniq.
Value
Foreign
KeyValue
Ref.
ArrayPointer
2 4 4 2 6 1 0+ Variable 4 2Bytes
Valid
Flag Count
Optional Presence and
variable length indicators
There is one reference index row for each distinct foreign key value.
RI subtable row size = (Index value size + 25)
Where 25 = 4 (Row Header and Row Ref. Array pointer)
+ 8 (This row's Row ID)
+ 8 (Overhead and row offset bytes)
+ 1 (Validity flag)
+ 4 (Count)
To estimate the size of a Reference Index (RI) subtable, you can use the
following formula.
RI Subtable Size = (Distinct count) * (index size + 25)
Double this figure for Fallback.
-
7/27/2019 B210 Sizing
33/51
Index Sizing Exercise
A 1,000,000 row Fallback table, on a 20-AMP system, has an Integer USI and a
50 Row per Value CHAR(29) NUSI with 20,000 distinct values.
Assume the table is partitioned. Estimate the space for each secondary index.
Formulas:
USI Size = Row Count * (IndexValueSize + 31)
NUSI Size = (Row Count * 10)
+
(#distinct values) * (Index value size + 21) * MIN ( #AMPs , Rows per value )
1,000,000 * (4 + 31 + 1) * 2 = 72,000,000
(10,000,000 + (20,000 * (29 + 21) * 20)) * 2 = 60,000,000
Note: +1 - Rows are allocated on even offsets within the data block.
-
7/27/2019 B210 Sizing
34/51
Empirical Sizing
The best way to size a production table, including indexes is:
1. Load a known percentage of rows onto the system.
2. Query the DD through the view DBC.TableSize.
3. Create one index.
4. Query the DD through the view DBC.TableSize.
5. Repeat steps 3 and 4 as necessary.
6. Multiply the results to determine the production size.
Example:
Step 1 Load 1% of a table onto a system.
Step 2 SELECT SUM(CurrentPerm) FROM DBC.Tablesize
WHERE DatabaseName = DATABASE
AND TableName = 'Daily_Sales' ;
Sum(CurrentPerm)
671,744Step 3 CREATE INDEX (sales_date) ON Daily_Sales;
Step 4 SELECT SUM(CurrentPerm) FROM DBC.Tablesize
WHERE DatabaseName = DATABASE
AND TableName = 'Daily_Sales' ;
Sum(CurrentPerm)914,944
Therefore, index size is:
914,944
671,744243,200
Note: The same query
without the SUM
keyword returns
per/AMP figures which
reveal distribution
efficiency.
-
7/27/2019 B210 Sizing
35/51
Spool Space
Maximum spool space needs vary with table size, use (type of application),
and frequency of use. Large systems use more spool to duplicate tables on each AMP.
Cylinders not currently used for data may be used for spool.
The user's spool amount may be changed dynamically.
Avoid unnecessary copying or redistribution of entire tables to spool.
As user con currency and/or SQL complexi ty inc reases, add mo re SPOOL.
Running out of Spool Space
If a user exceeds their Spool space limit, they will receive the following errormessage.
2646 No more spool space in username
If the AMP runs out of Spool space (insufficient available cylinders for Spool),
the following message will be displayed.
2507 - Out of spool space on disk
-
7/27/2019 B210 Sizing
36/51
Release of Spool
Intermediate Spool
Intermediate Spool results are held until the (LastUse) Explain step.
Output Spool
Output Spool results are held until:
Last spool Response - BTEQ
CLOSE cursor - PreProcessor
ERQ, Terminate function - CLI
Session ends (Job Abort, timeout, logoff, etc.)
System is restarted
System Restart - each AMP rebuilds its Master Index from its Cylinder
Indexes.
The AMPs delete all spool files by moving them to the Free Cylinder List.
This costs only one I/O per spool cylinder, and saves maintaining the
Master Index on disk.
-
7/27/2019 B210 Sizing
37/51
System Sizing Exercise
Some general guidelines for estimating system size:
20% (or more) of total space for Spool5% of total space for PJs/development/staging
5% of total space for DBC & Transient Journal
20 - 50% of data size for indexes
If not using Fallback, multiply the amount of raw data by a factor of 2 or 3.
If using Fallback, multiply the amount of raw data by a factor of 4 or 5.
Example:
User raw data 400 GB
Estimate of Vdisk space needed 1600 - 2000 GB
Proof:
Estimate of Vdisk space 1600 - 2000 GB- Spool (20%) - 400 GB
- PJs/development/staging (5%) - 100 GB
- DBC & Transient Journal (5%) - 100 GB
1000 - 1400 GB
User raw data 400 GB
20 - 50% for indexes 80 - 200 GBFallback (for data and indexes) 960 - 1200 GB
A 4 node system, each
with 7 AMPs, and each
AMP with 72 GB of Vdiskspace would meet this
requirement.
4 x 7 x 72 GB = 2016 GB
of available space
-
7/27/2019 B210 Sizing
38/51
Sizing Summary
Accurate row counts and sizes are needed to get good space estimates.
Database sizing includes:
Tables + Fallback +
Secondary Indexes + Fallback +
Reference Indexes + Fallback +
Join Indexes + Fallback +
Hash Indexes + Fallback +
Permanent Journal (dual or single) +
Stored Procedure space +Spool space +
Temporary space
-
7/27/2019 B210 Sizing
39/51
Review Questions
1. Which of the following can be used with the COMPRESS option?
a. Referencing columns - Foreign Key
b. Referenced column - Primary Key as a USIc. Unique Primary Index
d. Non-unique Secondary Index
2. Which section of a row identifies the starting location of variable length data column data and is
present only if variable length columns are declared?
a. Uncompressed Columns
b. VARCHAR Columns
c. Presence Bits
d. Column Offsets
3. How can you override the default that a column with a NULL value will require row space?
a. Use the COMPRESS option on the column as part of the CREATE TABLE statement.
b. When creating the user, set the default so that columns will default to COMPRESS when creating
a table.
c. Use the NOT NULL option on the column as part of the CREATE TABLE statementd. Use the DEFAULT NULL option on the column as part of the CREATE TABLE statement.
4. What is the minimum space the table headers will take for a 6-column table on a 10 AMP system?
a. 10240 bytes
b. 4096 bytes
c. 5120 bytes
d. 1024 bytes
5. What DD view can you query to get sizing information about tables? _____________________
-
7/27/2019 B210 Sizing
40/51
Module 10: Review Question Answers
1. Which of the following can be used with the COMPRESS option?
a. Referencing columns - Foreign Key
b. Referenced column - Primary Key as a USIc. Unique Primary Index
d. Non-unique Secondary Index
2. Which section of a row identifies the starting location of variable length data column data and is
present only if variable length columns are declared?
a. Uncompressed Columns
b. VARCHAR Columns
c. Presence Bits
d. Column Offsets
3. How can you override the default that a column with a NULL value will require row space?
a. Use the COMPRESS option on the column as part of the CREATE TABLE statement.
b. When creating the user, set the default so that columns will default to COMPRESS when creating
a table.
c. Use the NOT NULL option on the column as part of the CREATE TABLE statementd. Use the DEFAULT NULL option on the column as part of the CREATE TABLE statement.
4. What is the minimum space the table headers will take for a 6-column table on a 10 AMP system?
a. 10240 bytes
b. 4096 bytes
c. 5120 bytes
d. 1024 bytes
5. What DD view can you query to get sizing information about tables? DBC.Tablesize
-
7/27/2019 B210 Sizing
41/51
Lab Exercises
Lab Exercise 10-1
PurposeIn this lab, you will compress multiple values for a column in order to reduce Perm space.
What you need
Populated AU.Accounts table and an empty table in your database
Tasks
1. Populate your Accounts table from the AU.Accounts table using the INSERT/SELECT statement:
INSERT INTO Accounts SELECT * FROM AU.Accounts;
Using the DBC.TableSize view, what is the amount of Perm space used. Accounts =___________
2. Create a new table, named "Accounts_MVC", based on the Accounts table except compress the
following city names:
Culver City, Hermosa Beach, Los Angeles, and Santa Monica
Populate your Accounts_MVC table from the AU.Accounts table using INSERT/SELECT.
Using the DBC.TableSize view, what is the amount of Perm space used. Accounts_MVC =___________
-
7/27/2019 B210 Sizing
42/51
Lab Exercises
Lab Exercise 10-2
PurposeIn this lab, you will use populate tables, determine tables sizes, and create secondary indexes.
What you need
Populated AU.Trans table and an empty table in your database
Tasks
1. Determine the size of your empty Trans table using the DBC.TableSize view (SELECT with and withoutthe SUM aggregate function).
Size of empty Trans = _______________
What size are the table headers on each AMP? _______________
2. Using SHOW TABLE, the Row Size Calculation form and the Sizing a Data Table Formula, estimate
the size of this table; assume 15,000 rows.
Estimated size of Trans = _______________
3. Populate your Trans table from the AU.Trans table using the following INSERT/SELECT statement:
INSERT INTO Trans SELECT * FROM AU.TRANS;
Use the SELECT COUNT(*) function to verify the number of rows. ___________
-
7/27/2019 B210 Sizing
43/51
Lab Exercises
Lab Exercise 10-2 (cont.)
Tasks
4. Using the DBC.TableSize view, determine the actual size of the Trans table by using the SUM aggregate
function.
Size of populated Trans = _______________
5. Create a USI on the Trans_Number column.
Estimate the size of the USI = _______________
Actual size of the USI = _______________ (use the empirical sizing technique)
6. Create a NUSI on the Trans_ID column.
Estimate the size of the NUSI = ______________ (Hint: use DISTINCT function)Actual size of the NUS I= ______________ (use the empirical sizing technique)
-
7/27/2019 B210 Sizing
44/51
Lab Exercises
Lab Exercise 10-3
PurposeIn this lab, you will determine tables sizes and establish referential integrity between two tables.
What you need
Populated PD tables and empty tables in your database
Tasks
1. Populate your Employee and Emp_Phone tables from the PD.Employee and PD.Emp_Phone tablesusing the following INSERT/SELECT statements.
INSERT INTO Employee SELECT * FROM PD.Employee;
INSERT INTO Emp_Phone SELECT * FROM PD.Emp_Phone;
2. Using the DBC.TableSize view, determine the actual size of the Emp_Phone table by using the SUM
aggregate function.
Size of populated Emp_Phone table = _______________
-
7/27/2019 B210 Sizing
45/51
Lab Exercises
Lab Exercise 10-3 (cont.)
Tasks
3. The Foreign key is Employee_Number in PD.Emp_Phone and the Primary Key is the
Employee_Number in PD.Employee.
Create a References constraint on Employee_Number using the following SQL statements.
ALTER TABLE Emp_Phone ADD CONSTRAINT fk1
FOREIGN KEY (Employee_Number)
REFERENCES Employee (Employee_Number);
(use the HELP CONSTRAINT Emp_Phone.fk1; to view constraint information.
4. Using the DBC.TableSize view, determine the actual size of the Emp_Phone table by using the SUM
aggregate function.
Estimate the size of the Reference Index = _______________
Size of populated Emp_Phone with references index = _______________
Size of references index = _______________
5. Drop the Foreign Key constraint by executing the following SQL command.
ALTER TABLE Emp_Phone DROP CONSTRAINT fk1;
-
7/27/2019 B210 Sizing
46/51
Lab Solutions for Lab 10-1
Lab Exercise 10-1
1. Populate your Accounts table from the AU.Accounts table using the INSERT/SELECT statement:
INSERT INTO Accounts SELECT * FROM AU.Accounts;
Using the DBC.TableSize view, what is the amount of Perm space used. Accounts = 1,804,288
2. Create a new table, named "Accounts_MVC", based on the Accounts table except compress the
following city names: Culver City, Hermosa Beach, Los Angeles, and Santa Monica
CREATE SET TABLE Accounts_MVC, FALLBACK,
NO BEFORE JOURNAL, NO AFTER JOURNAL
(ACCOUNT_NUMBER INTEGER NOT NULL,
NUMBER INTEGER,
STREET CHAR(25),
CITY CHAR(20) COMPRESS ('Hermosa Beach', 'Culver City', 'Los Angeles','Santa Monica'),
STATE CHAR(2),ZIP_CODE INTEGER,
BALANCE_FORWARD DECIMAL(10,2),
BALANCE_CURRENT DECIMAL(10,2))
PRIMARY INDEX ( ACCOUNT_NUMBER );
Populate your Accounts_MVC table from the AU.Accounts table using INSERT/SELECT.
Using the DBC.TableSize view, what is the amount of Perm space used. Accounts_MVC = 1,404,828
-
7/27/2019 B210 Sizing
47/51
Lab Solutions for Lab 10-2
Lab Exercise 10-2
1. Determine the size of your empty Trans table using the DBC.Tablesize view (SELECT with and without the SUMaggregate function).
SELECT SUM(CurrentPerm) FROM DBC.Tablesize
WHERE DatabaseName = DATABASE AND TableName = 'Trans' ;
Sum(CurrentPerm)
8192 Size of empty Trans = 8192 (Captured from an 8 AMP system)
What size are the table headers on each AMP? 1024
2. Using SHOW TABLE, the Row Size Calculation form and the Sizing a Data Table Formula, estimate the size of this
table; assume 15,000 rows.
Each row is 24 bytes long plus 14 bytes for overhead = 38 bytes
38 x 15,000 = 570,000 x 2 (Fallback) = 1,140,000 bytes approx.
Estimated size of Trans = 1,140,000
3. Populate your Trans table from the AU.Trans table using the following INSERT/SELECT statement:
INSERT INTO Trans SELECT * FROM AU.TRANS;
Use the SELECT COUNT(*) function to verify the number of rows. 15,000
SELECT COUNT(*) FROM Trans; Count(*)15000
-
7/27/2019 B210 Sizing
48/51
Lab Solutions for Lab 10-2 (cont.)
Lab Exercise 10-2 (cont.)4. Using the DBC.Tablesize view, determine the actual size of the Trans table by using the SUM aggregate function.
Size of populated Trans = 1,153,024
SELECT SUM(CurrentPerm) FROM DBC.Tablesize
WHERE DatabaseName = DATABASE
AND TableName = 'Trans' ;
Sum(CurrentPerm)
1153024 (Estimated size was 1,140,000)
5. Create a USI on the Trans_Number column.
CREATE UNIQUE INDEX (Trans_Number) on Trans;
Estimate the size of the USI = 960,000
(4 + 29) x 15,000 = 480,000 x 2 (Fallback) = 990,000 bytes approx.
Actual size of the USI = 1,026,048 (use the empirical sizing technique)
SELECT SUM(CurrentPerm) FROM DBC.Tablesize
WHERE DatabaseName = DATABASE
AND TableName = 'Trans' ;
Sum(CurrentPerm)2179072 2,179,072 - 1,153,024 = 1,026,048
-
7/27/2019 B210 Sizing
49/51
Lab Solutions for Lab 10-2 (cont.)
Lab Exercise 10-2 (cont.)6. Create a NUSI on the Trans_ID column.
CREATE INDEX (Trans_ID) on Trans;
Estimate the size of the NUSI = _______________
SELECT COUNT(DISTINCT(Trans_ID)) FROM Trans;
Count(Distinct(TRANS_ID))975
(15,000 x 8) + ( 975 x (4 + 21) x 8 ) = 315,000 bytes approx.
315,000 x 2 (Fallback) = 630,400 bytes approx.
Actual size of the NUSI = 523,264 (use the empirical sizing technique)
SELECT SUM(CurrentPerm) FROM DBC.Tablesize
WHERE DatabaseName = DATABASE
AND TableName = 'Trans' ;
Sum(CurrentPerm)
2702336 2,702,336 - 2,179,072 = 523,264
-
7/27/2019 B210 Sizing
50/51
Lab Solutions for Lab 10-3
Lab Exercise 10-31. Populate your Employee and Emp_Phone tables from the PD.Employee and PD.Emp_Phone tables using the following
INSERT/SELECT statement:
INSERT INTO Employee SELECT * FROM PD.Employee;
INSERT INTO Emp_Phone SELECT * FROM PD.Emp_Phone;
2. Using the DBC.Tablesize view, determine the actual size of the Emp_Phone table by using the SUM aggregate function.
Size of populated Emp_Phone = 124,928
SELECT SUM (CurrentPerm) FROM DBC.Tablesize
WHERE DatabaseName = DATABASE
AND TableName = 'Emp_Phone' ; Sum(CurrentPerm)
124,928
3. The Foreign key is Employee_Number in the Emp_Phone table and the Primary Key is the Employee_Number in the
Employee table.
Create a References constraint on Employee_Number using the following SQL statements.
ALTER TABLE Emp_Phone ADD CONSTRAINT fk1 FOREIGN KEY (Employee_Number)
REFERENCES Employee (Employee_Number);
(use the HELP CONSTRAINT Emp_Phone.fk1; to view constraint information.
HELP CONSTRAINT Emp_Phone.fk1;
Name Type State Index ID Foreign Key Columns ...FK1 REFERENCE VALID 0 EMPLOYEE_NUMBER ...
-
7/27/2019 B210 Sizing
51/51
Lab Solutions for Lab 10-3 (cont.)
Lab Exercise 10-3 (cont.)
4. Using the DBC.Tablesize view, determine the actual size of the Emp_Phone table by using the SUM aggregate function.
SELECT COUNT(DISTINCT(Employee_Number)) AS "Count"
FROM Emp_Phone;
Count
1000
(4 + 25) x 1,000 = 29,000 x 2 (Fallback) = 58,000 bytes approx.
Estimate the size of the Reference Index = 58,000
Size of populated Emp_Phone with references index = 190,464
SELECT SUM (CurrentPerm) FROM DBC.Tablesize
WHERE DatabaseName = DATABASE
AND TableName = 'Emp_Phone' ;
Sum(CurrentPerm)
190464 190,464 - 124,928 = 65,536
Size of references index = 65,536
5. Drop the Foreign Key constraint by executing the following SQL command.
top related