Number of questions: 64
Time allowed in minutes: 90
Required passing score: 64%
Languages: English, Japanese
Configuration (7%)
Describe how to properly configure DataStage.
Identify tasks required to create and configure a project to be used for jobs.
Given a configuration file, identify its components and its overall intended purpose.
Demonstrate proper use of node pools.
Demonstrate knowledge of framework schema.
Identify the method of importing, sharing, and managing metadata.
Demonstrate knowledge of runtime column propagation (RCP).
Persistent Storage (15%)
Explain the process of importing/exporting data to/from the framework.
Demonstrate proper use of a Sequential File stage.
Demonstrate proper usage of Complex Flat File stage.
Demonstrate proper usage of FileSets and DataSets.
Demonstrate use of FTP stage for remote data.
Demonstrate use of restructure stages.
Identify importing/exporting of XML data.
Knowledge of balanced optimization for Hadoop and integration of Oozie workflows.
Demonstrate proper use of File Connector stage.
Demonstrate use of DataStage to handle various types of data including unstructured, hierarchical, Cloud, and Hadoop.
Parallel Architecture (9%)
Demonstrate proper use of data partitioning and collecting.
Demonstrate knowledge of parallel execution.
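The partitioning objective above can be illustrated with a minimal sketch (plain Python, not DataStage itself; the function and field names are hypothetical): key-based (hash) partitioning routes each row to a node by hashing its key column, so rows sharing a key always land on the same node, which is the precondition for per-key processing in stages such as Join or Aggregator.

```python
# Illustrative sketch of key-based (hash) partitioning across nodes.
# This is not DataStage code; names here are made up for the example.

def hash_partition(rows, key, num_nodes):
    """Assign each row to a node by hashing its key column."""
    partitions = [[] for _ in range(num_nodes)]
    for row in rows:
        node = hash(row[key]) % num_nodes
        partitions[node].append(row)
    return partitions

rows = [{"cust_id": 1, "amt": 10},
        {"cust_id": 2, "amt": 20},
        {"cust_id": 1, "amt": 30}]

parts = hash_partition(rows, "cust_id", 4)
# Both rows for cust_id 1 land in the same partition, so a per-key
# stage running on that node sees the complete group.
```

Round-robin or random partitioning would balance the row counts better but would scatter same-key rows across nodes, which is why partitioning method and downstream stage requirements must be chosen together.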
Databases (10%)
Demonstrate proper selection of database stages and database specific stage properties.
Identify source database options.
Demonstrate knowledge of target database options.
Demonstrate knowledge of the different SQL input/creation options and when to use each.
Data Transformation (12%)
Demonstrate knowledge of default type conversions, output mappings, and associated warnings.
Demonstrate proper selections of Transformer stage vs. other stages.
Describe Transformer stage capabilities.
Demonstrate the use of Transformer stage variables.
Identify process to add functionality not provided by existing DataStage stages.
Demonstrate proper use of SCD stage.
Demonstrate job design knowledge of using runtime column propagation (RCP).
Demonstrate knowledge of Transformer stage input and output loop processing.
Job Components (8%)
Demonstrate knowledge of Join, Lookup and Merge stages.
Demonstrate knowledge of Sort stage.
Demonstrate understanding of Aggregator stage.
Describe proper usage of change capture/change apply.
Demonstrate knowledge of real-time components.
Job Design (14%)
Demonstrate knowledge of shared containers.
Describe how to minimize Sorts and repartitions.
Demonstrate knowledge of creating restart points and methodologies.
Explain the process necessary to run multiple copies of the source.
Knowledge of creating DataStage jobs that can be used as a service.
Knowledge of balanced optimization.
Describe the purpose and uses of parameter sets and how they compare with other approaches for parameterizing jobs.
Demonstrate the ability to create and use Data Rules using the Data Rules stage to measure the quality of data.
Demonstrate various methods of using DataStage to handle encrypted data.
Monitor and Troubleshoot (9%)
Demonstrate knowledge of parallel job score.
Identify and define environment variables that control DataStage with regard to added functionality and reporting.
Given a process list, identify the conductor, section leader, and player processes.
Identify areas that may improve performance.
Demonstrate knowledge of runtime metadata analysis and performance monitoring.
Ability to monitor DataStage jobs using the Job Log and Operations Console.
Job Management and Deployment (8%)
Demonstrate knowledge of DataStage Designer Repository utilities such as advanced find, impact analysis, and job compare.
Articulate the change control process.
Knowledge of Source Code Control Integration.
Demonstrate the ability to define packages, import, and export using the ISTool utility.
Demonstrate the ability to perform admin tasks with tools such as Directory Admin.
Job Control and Runtime Management (8%)
Demonstrate knowledge of message handlers.
Demonstrate the ability to use the dsjob command line utility.
Demonstrate ability to use job sequencers.
Create and manage encrypted passwords and credentials files.
IBM Certified Solution Developer – InfoSphere DataStage v11.3
Job Role Description / Target Audience
This certification demonstrates that the successful candidate has the knowledge and skills necessary to professionally design and develop an efficient, scalable DataStage solution to a complex enterprise-level business problem; configure a scalable parallel environment, including clustered and distributed configurations; collect, report on, and resolve issues identified through key application performance indicators; and extend the capabilities of the parallel framework using the provided APIs (build-ops and wrappers).
Recommended Prerequisite Skills
Able to design and develop a scalable complex solution using an optimum number of stages.
Able to select the optimal data partitioning methodology.
Able to configure a distributed or non-symmetric environment.
Should be proficient with BuildOps and wrappers.
Should understand how to tune a parallel application to determine where bottlenecks exist and how to eliminate them.
Should understand basic configuration issues for all relational databases and be highly proficient in at least one.
Able to improve job design by implementing new product features in DataStage v11.3.
Able to monitor DataStage jobs via the Job Log and the Operations Console.
Assume you have before and after data sets and want to identify and process all of the changes
between the two data sets. Assuming data is properly partitioned and sorted, which of the
following should be used?
B. Change Apply
C. Change Capture
D. Change Capture and Change Apply
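As background for the question above: Change Capture compares a before and an after data set (both partitioned and sorted on the change key) and emits rows tagged with a change code; Change Apply then applies those coded changes to the before data to reproduce the after data. A minimal Python sketch of the idea follows — the string codes and dict-based data sets are illustrative only, not DataStage's internal representation.

```python
# Sketch of change capture/apply between keyed before/after data sets.
# Change codes here are illustrative: "insert", "delete", "edit", "copy".

def change_capture(before, after):
    """Compare two dicts keyed on the change key; return coded changes."""
    changes = []
    for key in sorted(set(before) | set(after)):
        if key not in before:
            changes.append((key, "insert", after[key]))
        elif key not in after:
            changes.append((key, "delete", None))
        elif before[key] != after[key]:
            changes.append((key, "edit", after[key]))
        else:
            changes.append((key, "copy", before[key]))
    return changes

def change_apply(before, changes):
    """Apply coded changes to the before data set."""
    result = dict(before)
    for key, code, value in changes:
        if code == "delete":
            result.pop(key, None)
        elif code in ("insert", "edit"):
            result[key] = value
    return result

before = {1: "a", 2: "b", 3: "c"}
after = {1: "a", 2: "B", 4: "d"}
changes = change_capture(before, after)
assert change_apply(before, changes) == after
```

The round trip — capture the deltas, then apply them to the before image — is the combined pattern the fourth answer option describes.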
In the Masking Policy Editor of the Data Masking stage in your job, you have specified the Random Replacement masking policy for a column containing credit card numbers. For the Copy option you have specified “(1,2)(3,4)”.
What changes will be made to a credit card number, such as 6327664369, to mask it?
A. Digits 1 through 2 and digits 3 through 4 will be randomly changed. The rest of the digits will remain the same.
B. Digits 1 through 2 and digits 3 through 4 will remain the same. The rest of the digits will be randomly changed.
C. The 2 digits starting at digit 1 and the 4 digits starting at digit 3 will remain the same. The rest of the digits will be randomly changed.
D. The 2 digits starting at digit 1 and the 4 digits starting at digit 3 will be randomly changed. The rest of the digits will remain the same.
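The mechanics behind this question can be sketched in Python. The sketch assumes the Copy option's `(start, length)` pairs name runs of digits (1-based positions) that are carried through unchanged, while every digit outside the listed runs is randomly replaced; the function name and implementation are hypothetical, not the stage's actual code.

```python
import random

def mask_random_replacement(value, copy_pairs, rng=random):
    """Randomly replace digits, except runs listed in copy_pairs.

    copy_pairs: list of (start, length) with 1-based start positions.
    Assumption: digits inside those runs are copied through unchanged,
    mirroring the stage's Copy option semantics.
    """
    keep = set()
    for start, length in copy_pairs:
        keep.update(range(start - 1, start - 1 + length))
    out = []
    for i, ch in enumerate(value):
        if i in keep or not ch.isdigit():
            out.append(ch)            # preserved position
        else:
            out.append(str(rng.randint(0, 9)))  # randomized position
    return "".join(out)

masked = mask_random_replacement("6327664369", [(1, 2), (3, 4)])
# Under this reading, the 2 digits starting at digit 1 and the 4 digits
# starting at digit 3 (i.e., "632766") survive; digits 7-10 are randomized.
```

Note that the runs `(1,2)` and `(3,4)` overlap-free cover positions 1-6, so the masked value always begins with the original first six digits.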
What is the primary advantage of creating data rules within Information Analyzer as opposed to
creating them within the Data Rules stage?
A. Data rules cannot be created within the Data Rules stage. They must first be created in
Information Analyzer before they can be used in the Data Rules stage.
B. Rules created within Information Analyzer can be tested and debugged on non-operational data
in a testing environment before they are put into production.
C. Rules created in the DataStage Data Rules stage have to be compiled into an executable form
before they can be used.
D. The variables in rules created in DataStage Data Rules stage must first be bound to data
columns or literals before they can be run.
Which stages will require a schema file when runtime column propagation (RCP) is enabled?
A. Data Set
B. Column Import
C. Internal Source
D. External Target
E. Make Subrecord
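For context on the question above: when RCP is enabled, column definitions can be omitted from the job design, and stages that read or write external data (for example Column Import or External Target) must then obtain their record layout from a schema file written in the parallel framework's record syntax. A small illustrative schema file follows — the field names and properties are invented for the example, though the `record { ... } ( ... )` shape follows the framework's documented syntax.

```
record
  {final_delim=end, delim=','}
(
  cust_id:int32;
  cust_name:string[max=30];
  balance:decimal[10,2];
)
```

The property block controls import/export formatting (here comma-delimited fields), while the parenthesized list supplies the column names and types that RCP propagates downstream.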
Which of the following actions are available when editing a message handler?
A. Abort Job
B. Demote to warning
C. Promote to warning
D. Promote to informational