IBM DataStage-aaS Anywhere Developers Training
This comprehensive course empowers you to master data integration and transformation pipelines. You’ll efficiently manage DataStage Flows, handle file and data processing stages, and seamlessly connect to cloud and database connectors. Learn to automate pipelines with Watson Pipelines, ensuring optimized performance and streamlined data workflows. By the end of this course, you’ll be equipped with the skills to handle parallel frameworks, multi-instance jobs, and advanced data transformations.
Audience:
This course is appropriate for learners new to DataStage and for experienced DataStage users new to DataStage-aaS Anywhere.
Duration: This is a 4-day instructor-led course, delivered in 5-hour sessions each day.
Overview
- Introduction to Cloud Pak for Data
- Introduction to DataStage
- Parallel Framework
Working with DataStage
- Projects
- Stages, Links and RCP
- Invocation IDs
- Table Definitions
- DataStage Flow
- Compile and Run
- Adaptable Flows
- Subflows
- Multi-instance Jobs
- Import and Export flows
File Stages
- Sequential File Stage
- Data Set Stage
Data Processing Stages
- Sorting Data
- Sort Stage
- Remove Duplicates Stage
- Aggregator Stage
- Combining Data
- Lookup Stage
- Join Stage
- Merge Stage
Data Connector Stages
- Db2 Connector Stage
- HTTP Connector Stage
Cloud Connector Stages (Choice of any two Cloud Connectors)
- Amazon S3 Connector
- Amazon Redshift Connector
- Microsoft Azure Connector for File Storage
- Microsoft Azure SQL Server Connector Stage
- Google BigQuery Connector
- Snowflake Connector
Data Transformation Stages
- Transformer Stage
- Hierarchical Data Stage
Watson Pipelines
- Introduction
- Pipeline Nodes
- Trigger Conditions
- Handling Errors
- Run Pipelines