surrogate key generator implementation in

Upload: anilkumar-reddy

Post on 15-Oct-2015

21 views

Category:

Documents


0 download

DESCRIPTION

SURROGATE KEY GENEARATION IN DATASTAGE

TRANSCRIPT

  • 5/25/2018 Surrogate Key Generator Implementation In

    1/10

    Surrogate Key Generator Implementation in Datastage 8.1, 8.5 and 9.1

    The Surrogate Key Generator stage is a processing stage that generates surrogate keycolumns and maintains the key source.

    A surrogate key is a unique primary key that is not derived from the data that itrepresents, therefore changes to the data will not change the primary key. In a starschema database, surrogate keys are used to join a fact table to a dimension table.

    Surrogate key generator stage uses:

    1. Create or delete the key source before other jobs run2. Update a state file with a range of key values3. Generate surrogate key columns and pass them to the next stage in the job4. View the contents of the state file

    Generated keys are 64 bit integers and the key source can be stat file or databasesequence.

    Surrogate keys are used to join a dimension table to a fact table in a star schemadatabase.

    When the SCD stage performs a dimension lookup :A)If a matching record is found, it retrieves the value of the existing surrogate key.B) If a match is not found, the stage obtains a new surrogate key value by using thederivation of the Surrogate Key column on the Dim Update tab.

    If you want the SCD stage to generate new surrogate keys by using a

    key source that you created with a Surrogate Key Generator stage asdescribed in Surrogate Key Generator. If you want to use your own method to handle surrogate keys, youshould derive the Surrogate Key column from a source column.You can replace the dimension information in the source data stream with thesurrogate key value by mapping the Surrogate Key column to the output link.

    Creating the key Source :

    Drag the surrogate key stage from palette to parallel job canvas with no input andoutput links.

  • 5/25/2018 Surrogate Key Generator Implementation In

    2/10

    Double click on the surrogate key stage and click on properties tab.

    Properties:

    Key Source Action = create

    Source Type : FlatFile or Database sequence(in this case we are using FlatFile)

    When you run the job it will create an empty file.

    http://4.bp.blogspot.com/-cEJGs7ImY5s/UmZnDocjQHI/AAAAAAAAAPc/qTZDnyvm0Ro/s1600/2.jpghttp://2.bp.blogspot.com/-Zf2DQsMXOUc/UmZnDRU68-I/AAAAAAAAAPg/OTjznDOkqMs/s1600/1.jpghttp://4.bp.blogspot.com/-cEJGs7ImY5s/UmZnDocjQHI/AAAAAAAAAPc/qTZDnyvm0Ro/s1600/2.jpghttp://2.bp.blogspot.com/-Zf2DQsMXOUc/UmZnDRU68-I/AAAAAAAAAPg/OTjznDOkqMs/s1600/1.jpg
  • 5/25/2018 Surrogate Key Generator Implementation In

    3/10

    If you want to the check the content change the View Stat File = YES and check thejob log for details.

    skey_genstage,0: State file /tmp/skeycutomerdim.stat is empty.if you try to create the same file again job will abort with the following error.

    skey_genstage,0: Unable to create state file /tmp/skeycutomerdim.stat: Fileexists.Deleting the key source:

    Updating the stat File:

    To update the stat file add surrogate key stage to the job with single input link fromother stage.

    We use this process to update the stat file if it is corrupted or deleted.

    1 1. Open the surrogate key stage editor and go to theproperties tab.

    http://1.bp.blogspot.com/-IOlcWzccEwY/UmZnD7T-IgI/AAAAAAAAAPo/pHy4SDBVqA0/s1600/3.jpg
  • 5/25/2018 Surrogate Key Generator Implementation In

    4/10

    http://1.bp.blogspot.com/-MpUy79_VAKo/UmZnEQ_wQaI/AAAAAAAAAPs/7yeLO4_lm8A/s1600/4.jpg
  • 5/25/2018 Surrogate Key Generator Implementation In

    5/10

    If the stat file exists we can update otherwise we can create and update it.

    We are using SkeyValue parameter to update the stat file using transformer stage.

    http://2.bp.blogspot.com/-5TR5Nny-9Pw/UmZnFa1EA3I/AAAAAAAAAP4/Y4iVk7CSDb0/s1600/5.jpg
  • 5/25/2018 Surrogate Key Generator Implementation In

    6/10

    Generating Surrogate Keys:

    http://1.bp.blogspot.com/-RsEHR8bD2Zs/UmZnFYkVZ7I/AAAAAAAAAP8/vXevYbBf2jE/s1600/7.jpghttp://2.bp.blogspot.com/-pbe9o1tUmrY/UmZnF0-wM3I/AAAAAAAAAQk/chJTC6NXBBQ/s1600/6.jpghttp://1.bp.blogspot.com/-RsEHR8bD2Zs/UmZnFYkVZ7I/AAAAAAAAAP8/vXevYbBf2jE/s1600/7.jpghttp://2.bp.blogspot.com/-pbe9o1tUmrY/UmZnF0-wM3I/AAAAAAAAAQk/chJTC6NXBBQ/s1600/6.jpg
  • 5/25/2018 Surrogate Key Generator Implementation In

    7/10

    Now we have created stat file and will generate keys using the stat key file.

    Click on the surrogate keys stage and go to properties add add type a name for thesurrogate key column in the Generated Output Column Name property.

    Go to ouput and define the mapping like below.

    http://4.bp.blogspot.com/-xh-mcOfKviw/UmZnGYV61BI/AAAAAAAAAQM/4c2by2870Y4/s1600/8.jpg
  • 5/25/2018 Surrogate Key Generator Implementation In

    8/10

    Rowgen we are using 10 rows and hence when we run the job we see 10 skey values inthe output.I have updated the stat file with 100 and below is the output.

    http://2.bp.blogspot.com/-8-ju69Rr9SQ/UmZnG9TvK8I/AAAAAAAAAQY/jIWJI_CogVE/s1600/9a.jpghttp://2.bp.blogspot.com/-JgJub5OzrDo/UmZnGsweUII/AAAAAAAAAQU/XqoDzktQwMw/s1600/9.jpghttp://2.bp.blogspot.com/-8-ju69Rr9SQ/UmZnG9TvK8I/AAAAAAAAAQY/jIWJI_CogVE/s1600/9a.jpghttp://2.bp.blogspot.com/-JgJub5OzrDo/UmZnGsweUII/AAAAAAAAAQU/XqoDzktQwMw/s1600/9.jpg
  • 5/25/2018 Surrogate Key Generator Implementation In

    9/10

    If you want to generate the key value from begining you can use following property inthe surrogate key stage.

    http://3.bp.blogspot.com/-6LzfiQYUE3g/UmZnHpHWhqI/AAAAAAAAAQs/eGydMyVL9MI/s1600/9b.jpg
  • 5/25/2018 Surrogate Key Generator Implementation In

    10/10

    A. If the key source is a flat file, specify how keys are generated:

    1. To generate keys in sequence from the highest value that was last used, setthe Generate Key from Last Highest Value property to Yes. Any gaps in the keyrange are ignored.2. To specify a value to initialize the key source, add the File Initial Value property to

    the Options group, and specify the start value for key generation.3. To control the block size for key ranges, add the File Block Size property to the

    Options group, set this property to User specified, and specify a value for theblock size.B. If there is no input link, add the Number of Records property to the Options group,and specify how many records to generate.

    http://4.bp.blogspot.com/-XrdpVzjjUhQ/UmZnHs1XyZI/AAAAAAAAAQo/IFHKam9QGUw/s1600/9c.jpg