Talend Fundamentals Training

Talend Studio for Data Integration training is aimed at data scientists and data-warehouse builders who need the skills to design ETL processes to populate their data stores. Talend Studio dramatically improves the efficiency of data integration Job design through an easy-to-use graphical development environment. With integrated connectors to source and target systems, it enables rapid deployment and reduces maintenance costs. It supports all types of data integration, migration, and synchronization.
This course helps you become productive with Talend Studio for Data Integration as quickly as possible. It focuses on the basic capabilities of Studio and how to use them to build reliable, maintainable data integration tasks that solve practical problems, including extracting data from common database and file formats, transforming it, and integrating it into targets.

 

Target Audience: Anyone who wants to use Talend Studio to perform data integration tasks, including software developers and development managers.

Prerequisites: Basic knowledge of computing, including familiarity with Java or another programming language, SQL, and general database concepts.

 

Description                      Days   Price (excl. VAT)
Talend Fundamentals Training*    3      ZAR R 20,000 / USD $ 2,000
  • Lunch, refreshments and training material included.
  • Classes start at 9:00am for 9:30am
  • South Africa training locations: Johannesburg, Cape Town, Durban
  • Global training locations: USA, Canada, UK, Dubai, Europe

Course Overview

After completing this course, you will be able to:
    • Create a project
    • Create and run a Job that reads, converts, and writes data
    • Merge data from several sources within a Job
    • Save a schema for repeated use
    • Create and use metadata and context variables within Jobs
    • Connect to, read from, and write to a database from a Job
    • Access a web service from a Job
    • Work with master Jobs and subJobs
    • Build, export, and test-run Jobs outside Studio
    • Implement basic error-handling techniques
    • Use best practices for Job and component naming, hints, and documentation
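
As a rough illustration of the kind of Job described above (one that reads, converts, and writes data, and sets aside rows it cannot process), the sketch below shows the same pattern in plain Java. It is not code generated by Talend Studio and does not use any Talend API. The class name ReadConvertWriteSketch, the semicolon-delimited "id;first;last" row layout, and the reject rule are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical, hand-written sketch of a read -> convert -> write flow.
// In Studio the same flow would be assembled graphically from components.
public class ReadConvertWriteSketch {

    /** Holds the two output flows: converted rows and rejected rows. */
    public static final class Result {
        public final List<String> output = new ArrayList<>();
        public final List<String> rejects = new ArrayList<>();
    }

    /** Converts "id;first;last" rows into "id;FIRST LAST" rows. */
    public static Result run(List<String> inputLines) {
        Result result = new Result();
        for (String line : inputLines) {
            // Read: split the delimited row, keeping empty trailing fields.
            String[] fields = line.split(";", -1);

            // Reject rows with the wrong column count or a missing id.
            if (fields.length != 3 || fields[0].trim().isEmpty()) {
                result.rejects.add(line);
                continue;
            }

            // Convert: combine two columns into one and upper-case the result.
            String fullName = fields[1].trim() + " " + fields[2].trim();

            // Write: append the converted row to the main output flow.
            result.output.add(fields[0].trim() + ";" + fullName.toUpperCase(Locale.ROOT));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(
                "1;Ada;Lovelace",
                "2;Grace;Hopper",
                "row without delimiters");
        Result result = run(input);
        result.output.forEach(System.out::println);
        result.rejects.forEach(row -> System.out.println("REJECT: " + row));
    }
}
```

Keeping rejected rows in a separate list, rather than dropping them, mirrors the idea of capturing rejects so that bad input can be inspected after the run.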

Getting started
    • Starting Talend Studio
    • Creating a first Job
    • Running a Job
Working with files
    • Reading an input file
    • Transforming data
    • Running a Job
    • Combining columns
    • Duplicating a Job
Joining data sources
    • Creating metadata
    • Joining data sources
    • Capturing rejects
    • Correcting a lookup
Filtering data
    • Filtering output data
    • Using multiple filters
Using context variables
    • Understanding and using context variables
    • Using repository context variables
Error handling
    • Detecting and handling basic errors
    • Raising a warning
Generic schemas
    • Setting up sales data files
    • Creating customer metadata
    • Creating product metadata
Working with databases
    • Creating database metadata
    • Creating a customer table
    • Creating a product table
    • Setting up a sales table
    • Joining data
    • Finalizing a Job
Creating master Jobs
    • Controlling Job execution using a master Job
Working with web services
    • Accessing a web service
Running a stand-alone Job
    • Building a Job
    • Modifying a Job
Documenting a Job
    • Using best practices while documenting a Job
Dimensional Concepts
    Entity-relationship (ER) modelling
    Dimensional modelling (DM)
    Relationships between DM and ER
    Why dimensional modelling?

The Dimensional Model
    The Dimensional Model
        Facts
        Attributes
        Dimensions
    Primary, foreign and surrogate keys
    Keys
        Primary
        Foreign
        Surrogate
    Granularity of facts

Inside Dimension Tables
    Drilling down
    High quality verbose attributes
    Degenerate dimensions
    Time dimension
        Time dimension hierarchy
        Time dimension granularity
    Location dimension
    Party-role dimensions
    Large dimensions
    Mini dimensions
    Slowly changing dimensions
    Multi-valued facts
    Snowflake schemas
Inside Fact Tables
    Type of fact
        Additive
        Semi-additive
        Non-additive
    “Fact-less” fact tables
    Fact table families
        Value chains
        Heterogeneous product schema
        Aggregates
An Architectural Approach

    Data marts
        Conformed dimensions
        Conformed fact definitions
        Data mart granularity
    The data mart matrix

Building Dimensional Models

    Building the data mart matrix
    Four steps to define a data mart
        Step 1: Choose the data mart
        Step 2: Declare the grain
        Step 3: Choose the dimensions
        Step 4: Choose the facts
    Design principles
Data Quality
    Data quality improvement
    Data quality assurance