This course gives project administrators and ETL developers the skills needed to develop parallel jobs in DataStage. Students learn to create parallel jobs that access sequential and relational data, and to combine and transform that data using functions and other job components.
Introduction
DataStage Overview
♦ DataStage History
♦ DataStage Architecture
♦ DataStage Topology
♦ DataStage Components
♦ Server Components
♦ Client Components
♦ DataStage Workflow
Types of DataStage Jobs
♦ Parallel Jobs
♦ Server Jobs
♦ Job Sequences
DataStage Administrator
♦ Creating Projects
♦ Enabling RCP
♦ Overview of other mandatory properties
Server Jobs
♦ Creating a simple server job
♦ Monitoring the jobs at runtime
♦ Transformer and lookups
♦ Stage variables
Parallel Jobs
♦ Parallel job overview
♦ Implementing pipelining and parallelism techniques in jobs
♦ Compile and trigger the job in Director
♦ View the job log
♦ Subroutines
Partitioning techniques
♦ Describe parallel processing architecture
♦ Describe pipeline and partition parallelism
♦ List and describe partitioning and collecting algorithms
♦ Describe configuration files
♦ Explain OSH & Score
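As a rough illustration of two of the partitioning algorithms covered in this module, the Python sketch below shows how round-robin and hash partitioning might assign rows to partitions. It is not DataStage code: the function names, the sample rows, and the use of CRC32 as the hash are assumptions made for illustration only.

```python
import zlib


def round_robin_partition(rows, num_partitions):
    """Deal rows out to partitions in turn, giving near-equal partition sizes."""
    partitions = [[] for _ in range(num_partitions)]
    for index, row in enumerate(rows):
        partitions[index % num_partitions].append(row)
    return partitions


def hash_partition(rows, key, num_partitions):
    """Send every row with the same key value to the same partition."""
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        # CRC32 is used only because it is deterministic across runs;
        # it is an illustrative choice, not what DataStage uses internally.
        hashed = zlib.crc32(str(row[key]).encode("utf-8"))
        partitions[hashed % num_partitions].append(row)
    return partitions


# Hypothetical sample data: five rows keyed by customer.
rows = [
    {"cust": "A", "amt": 10},
    {"cust": "B", "amt": 20},
    {"cust": "A", "amt": 30},
    {"cust": "C", "amt": 40},
    {"cust": "B", "amt": 50},
]

rr = round_robin_partition(rows, 2)
hp = hash_partition(rows, "cust", 2)

for name, parts in (("round robin", rr), ("hash on cust", hp)):
    print(name, [[r["cust"] for r in p] for p in parts])
```

Round robin balances row counts but scatters rows that share a key, while hash partitioning keeps rows with the same key together, which matters for key-based operations such as joins, aggregation, and duplicate removal.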
Sequential or unstructured data
♦ Sequential File stage
♦ Data Set stage
♦ File Set stage
♦ Complex Flat File stage
♦ Fixed length and Variable length files
♦ Create jobs that read from and write to sequential files
♦ Read from multiple files using file patterns
♦ Null handling in Sequential File Stage
♦ Handling Excel sheets and .csv files
Configuration files
♦ Overview of Configuration files
♦ Node concepts
♦ Conductors, Section leaders and Players
♦ Creating Configuration files
Combining Data
♦ Combine data using the Lookup stage
♦ Combine data using the Merge stage
♦ Combine data using the Join stage
♦ Combine data using the Funnel stage
Sorting and Aggregating Data
♦ Sort data using in-stage sorts and Sort stage
♦ Combine data using the Aggregator stage
♦ Remove duplicates using the Remove Duplicates stage
Transforming Data
♦ Transforming the data with constraints and derivations
♦ Column derivations using user defined code and system functions
♦ Filter records based on business logic
♦ Control data flow based on data conditions
Repository Functions
♦ Perform a simple Find
♦ Perform an Advanced Find
♦ Perform an impact analysis
♦ Compare the differences between two Table Definitions and Jobs
Working with Relational Data
♦ Import Table Definitions for relational tables
♦ Create Data Connections
♦ Use Connector stages in a job
♦ Use SQL Builder to define SQL Select statements
♦ Use SQL Builder to define SQL Insert and Update statements
♦ Use the DB2 Enterprise stage
Metadata in Parallel Framework
♦ Explain schemas
♦ Create schemas
♦ Explain Runtime Column Propagation (RCP)
♦ Build a job that reads data from a sequential file using a schema
♦ Build a shared container
Controlling Job Flow
♦ Use the DataStage Job Sequencer to build a job that controls a sequence of jobs
♦ Use Sequencer links and stages to control the order in which a set of jobs runs
♦ Use Sequencer triggers and stages to control the conditions under which jobs run
♦ Define user variables
♦ Enable restart
♦ Handle errors and exceptions
♦ Custom job triggers in a Sequence job
Advanced Features
♦ Runtime Column Propagation (RCP)
♦ Multiple Instances
♦ Subroutines
♦ Command Line
♦ Implementing parameters in jobs
♦ Creating parameter sets
♦ Standardization of DataStage jobs
♦ Etiquette in job design
We offer a 100% job and placement guarantee. Contact us for a free demo.
Copyright © 2017 - Developed by Infihive Consulting Services LLC