DataStage Training

DataStage Syllabus

DataStage Introduction

DataStage Overview

  • DataStage History
  • DataStage Architecture
  • DataStage Topology
  • DataStage Components
  • Server Components
  • Client Components
  • DataStage Workflow

Types of DataStage Job

  • Parallel Jobs
  • Server Jobs
  • Job Sequences

DataStage Administrator

  • Creating Projects
  • Enabling RCP
  • Overview of other mandatory properties

Server Jobs

  • Creating a simple server job
  • Monitoring jobs at runtime
  • Transformer and lookups
  • Stage variables

Parallel Jobs

  • Parallel job overview
  • Implementing pipeline and partition parallelism in jobs
  • Compile the job and run it from Director
  • View the job log
  • Subroutines

Partitioning techniques

  • Describe parallel processing architecture
  • Describe pipeline & partition parallelism
  • List and describe partitioning and collecting algorithms
  • Describe configuration files
  • Explain OSH & Score
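
As a rough illustration of the partitioning and collecting topics listed above, here is a minimal Python sketch of round-robin partitioning, hash (key-based) partitioning, and an ordered collector. It is not DataStage code: the function names are hypothetical, and `crc32` stands in for whatever hash function the parallel engine actually uses.

```python
from zlib import crc32


def round_robin_partition(rows, num_partitions):
    """Deal rows out to partitions in turn: row 0 -> p0, row 1 -> p1, ..."""
    partitions = [[] for _ in range(num_partitions)]
    for i, row in enumerate(rows):
        partitions[i % num_partitions].append(row)
    return partitions


def hash_partition(rows, key, num_partitions):
    """Send every row with the same key value to the same partition."""
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        # crc32 is an illustrative, run-to-run stable hash (an assumption,
        # not the engine's real hash function).
        idx = crc32(str(row[key]).encode("utf-8")) % num_partitions
        partitions[idx].append(row)
    return partitions


def collect_ordered(partitions):
    """Ordered-style collector: read all of partition 0, then 1, and so on."""
    return [row for part in partitions for row in part]


if __name__ == "__main__":
    regions = ["east", "west", "east", "north", "west", "east"]
    rows = [{"id": i, "region": r} for i, r in enumerate(regions)]

    print(round_robin_partition(rows, 2))
    print(hash_partition(rows, "region", 2))
```

Round-robin gives evenly sized partitions but scatters related rows, while hash partitioning keeps rows with the same key together, which is what key-based stages such as Join, Aggregator, and Remove Duplicates rely on.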

Sequential or unstructured data

  • Sequential File stage
  • Data Set stage
  • File Set stage
  • Complex Flat File stage
  • Fixed-length and variable-length files
  • Create jobs that read from and write to sequential files
  • Read from multiple files using file patterns
  • Null handling in Sequential File Stage
  • Handling Excel sheets and .csv files

Configuration files

  • Overview of Configuration files
  • Node concepts
  • Conductor, section leaders, and players
  • Creating Configuration files

Combining Data

  • Combine data using the Lookup stage
  • Combine data using the Merge stage
  • Combine data using the Join stage
  • Combine data using the Funnel stage

Sorting and Aggregating Data

  • Sort data using in-stage sorts and the Sort stage
  • Aggregate data using the Aggregator stage
  • Remove duplicates using the Remove Duplicates stage

Transforming Data

  • Transforming data with constraints and derivations
  • Column derivations using user-defined code and system functions
  • Filter records based on business logic
  • Control data flow based on data conditions

Repository Functions

  • Perform a simple Find
  • Perform an Advanced Find
  • Perform an impact analysis
  • Compare the differences between two table definitions or two jobs

Working with Relational Data

  • Import Table Definitions for relational tables
  • Create Data Connections
  • Use Connector stages in a job
  • Use SQL Builder to define SQL Select statements
  • Use SQL Builder to define SQL Insert and Update statements
  • Use the DB2 Enterprise stage

Metadata in Parallel Framework

  • Explain schemas
  • Create schemas
  • Explain Runtime Column Propagation (RCP)
  • Build a job that reads data from a sequential file using a schema
  • Build a shared container

Controlling Job Flow

  • Use the DataStage Job Sequencer to build a job that controls a sequence of jobs
  • Use Sequencer links and stages to control the order in which a set of jobs runs
  • Use Sequencer triggers and stages to control the conditions under which jobs run
  • Define user variables
  • Enable restart
  • Handle errors and exceptions
  • Custom job triggers in a sequence job

Advanced Features

  • Runtime Column Propagation (RCP)
  • Multiple Instances
  • Subroutines
  • Command Line
  • Implementing parameters in jobs
  • Creating parameter sets
  • Standardization of DataStage jobs
  • Etiquette in job design