Course Overview:

This course enables project administrators and ETL developers to acquire the skills necessary to develop parallel jobs in DataStage. Students will learn to create parallel jobs that access sequential and relational data, and to combine and transform that data using functions and other job components.

Course Content:


DataStage Overview

♦ DataStage History

♦ DataStage Architecture

♦ DataStage Topology

♦ DataStage Components

♦ Server Components

♦ Client Components

♦ DataStage Workflow

Types of DataStage Jobs

♦ Parallel Jobs

♦ Server Jobs

♦ Job Sequences

DataStage Administrator

♦ Creating Projects

♦ Enabling RCP

♦ Overview of other mandatory properties

Server Jobs

♦ Creating a simple server job

♦ Monitoring the jobs at runtime

♦ Transformer and lookups

♦ Stage variables

Parallel Jobs

♦ Parallel job overview

♦ Implementing pipelining and parallelism techniques in jobs

♦ Compile the job and run it in Director

♦ View the job log

♦ Subroutines

Partitioning techniques

♦ Describe parallel processing architecture

♦ Describe pipeline and partition parallelism

♦ List and describe partitioning and collecting algorithms

♦ Describe configuration files

♦ Explain OSH & Score

Sequential or unstructured data

♦ Sequential File stage

♦ Data Set stage

♦ File Set stage

♦ Complex Flat File stage

♦ Fixed length and Variable length files

♦ Create jobs that read from and write to sequential files

♦ Read from multiple files using file patterns

♦ Null handling in Sequential File Stage

♦ Handling Excel sheets and .csv files

Configuration files

♦ Overview of Configuration files

♦ Node concepts

♦ Conductors, Section leaders and Players

♦ Creating Configuration files

Grouping Data

♦ Combine data using the Lookup stage

♦ Combine data using the Merge stage

♦ Combine data using the Join stage

♦ Combine data using the Funnel stage

Sorting and Aggregating Data

♦ Sort data using in-stage sorts and Sort stage

♦ Combine data using the Aggregator stage

♦ Remove Duplicates stage

Transforming Data

♦ Transforming data with constraints and derivations

♦ Column derivations using user-defined code and system functions

♦ Filter records based on business logic

♦ Control data flow based on data conditions

Repository Functions

♦ Perform a simple Find

♦ Perform an Advanced Find

♦ Perform an impact analysis

♦ Compare the differences between two table definitions or two jobs

Working with Relational Data

♦ Import Table Definitions for relational tables

♦ Create Data Connections

♦ Use Connector stages in a job

♦ Use SQL Builder to define SQL Select statements

♦ Use SQL Builder to define SQL Insert and Update statements

♦ Use the DB2 Enterprise stage

Metadata in Parallel Framework

♦ Explain schemas

♦ Create schemas

♦ Explain Runtime Column Propagation (RCP)

♦ Build a job that reads data from a sequential file using a schema

♦ Build a shared container

Controlling Job Flow

♦ Use the DataStage Job Sequencer to build a job that controls a sequence of jobs

♦ Use Sequencer links and stages to control the order in which a set of jobs runs

♦ Use Sequencer triggers and stages to control the conditions under which jobs run

♦ Define user variables

♦ Enable restart

♦ Handle errors and exceptions

♦ Custom job triggers in sequence jobs

Advanced Features

♦ Runtime Column Propagation (RCP)

♦ Multiple Instances

♦ Subroutines

♦ Command Line

♦ Implementing parameters in jobs

♦ Creating parameter sets

♦ Standardization of DataStage jobs

♦ Best practices in job design

We assure a 100% job guarantee and placement. Contact us for a free demo.
