Selasa, 28 Agustus 2012

IBM Information Server 8.X (DataStage): Parallel Transformer Stage Variables And Properties

DataStage: What is Transformer Stage?

DataStage provides several stages to load the data into the data warehouse or data marts. The stages classified into General, Database, Developement and Debug, File, Processing, Real time etc. and the transformer stage is a processing stage.

Local Variables:

You can declare and use your own variables within a Transformer stage. Such variables are accessible only from the Transformer stage in which they declared. Since these variables are local to the transformer, the name local variables.

The uses:

  • The values assigned by expressions.
  • They are used in expressions which define an output column derivation.
  • Expressions evaluating a variable can include other variables or the variable being evaluated itself.

Advantageous of stage variables:

1. Same part of an expression used in multiple column derivations.

For example, IF (DSLINK1.col1 [1, 3] = "001") THEN { statements evaluated here} used in multiple output column derivations, the expression evaluates for each column that uses it. Use the stage variable by moving this substring calculation and the substring evaluates just once for every input row.

2. Where an expression includes calculated constant values.

A column definition might include a function call that returns a constant value, such as: Str(" ",20).This returns a string of 20 spaces. In this case, the function would be evaluated every time the column derivation is evaluated. It is more efficient to calculate the constant value just once for the whole Transformer.

Use the stage variable to use this function. This case, the function would still be evaluated once for every input row.

Assign the initial value from the Stage Properties to the stage variable and leave the derivation empty. This way, the function evaluated only once for whole transformer processing.

3. Where an expression requires a type conversion and used as a constant, or used in multiple places

System variables

WebSphere DataStage provides a set of variables containing useful system information that you can access from an output derivation or constraint.

Name Description:

@FALSE The value replaced with 0.

@TRUE The value replaced with 1.

@INROWNUM Input row counter.

@OUTROWNUM Output row counter (per link).

@NUMPARTITIONS The total number of partitions for the stage.

@PARTITIONNUM The partition number for the particular instance.

Transformer stage: Input page

Partitioning tab:

The Partitioning tab allows you to specify details about how the incoming data partitioned or collected when input to the Transformer stage. It also allows you to specify that the data should be sorted on input.

By default the Transformer stage will attempt to preserve partitioning of incoming data, or use its own partitioning method according to what the previous stage in the job dictates.

In The transformer stage, the stage tab allows to set various properties in the General like Execution mode, Preserve partitioning, Advanced tab, Triggers tab etc.

For more info on learning IBM Information server 8.x or DataStage 8.x, I invite you to visit the where you will learn more on Data Warehousing concepts, designing and developing DataStage Server Jobs, Parallel Jobs, containers and sequence jobs in detail and sign up for the Special Offer: Grab the FREE 7 Day Mini Email Course on how to "Master the DataStage"

The article is submitted by Yogi Talakanti, creator of Data Warehousing expert Solutions.

2 komentar:

  1. I enjoy what you’ve got right here, adore what you’re stating and the way you say it.This is a very interesting article.
    Sony laptops

  2. It is good to see the good explanataion on transformer stage with real time scenarios examples. Keep updating latest techniques on this . Thanks

    Transformer Stage