Inflow’s Pentaho ETL Kettle online tutorial

Did you know that more than 60% of a Business Intelligence project is about the data warehouse and ETL (data integration)? Customers want it; don’t you want to learn how to deliver it?

The course is the outcome of 5 years of developing successful projects.
It is not your regular, average (heavy) course with boring technical details about Pentaho Kettle functionality. (Well, it does that too, but not just that.)
It includes hands-on work and my personal tips on how to handle customer requests, and also when to think twice before getting into tasks that will be hard to accomplish.

Of course, that is assuming the customer understands the task (meaning: the money he is going to spend) and its complexity, and you agree that this is the right solution. Then the course will help you deliver.

Tip #1: there is always more than one way to solve data integration with Pentaho. In fact, it doesn’t need to be the best way; it just needs to work in sync with the customer’s needs. I also put in a lot of case studies, examples, real-life scenarios and my personal opinion on the matter. (You don’t have to agree.)
I tried to add some humor and sarcasm about the way customers ask for data integration requirements and how we as developers solve stuff.

“What I need is very simple. Do you see that man? Do you see the moon? Just put the man on the moon.”

Yeah, right… and by the way, the budget is $1.99. Which kind of Pentaho Kettle provider would you like to be?

How the course was built: I separated the course into four sections.
I built it in such a way that you can jump to a specific step you want to learn (if you’re already an expert on Pentaho Kettle)
and see the example, or, if you’re a newbie (sorry to call you that, but you are; everybody was once),
you can go step by step while gaining knowledge and expertise.


Concepts of data integration. This section deals with questions like:

  • What is data integration?
  • When do we use data integration?
  • Data warehouse structure considerations for business intelligence. (In the end, the idea is to extract the data, not just to load it.)
  • What other data integration tools are there, and why did we decide to work with Pentaho Kettle?

If you’re already a developer, you can skip the concepts and go to the next section.

Software to install. Software that I recommend using for a better data integration flow; I go over some 10 installations that you’ll need, including the example database.

Hands-on real scenario with Pentaho Kettle. The walk-through MySQL database has several examples of real-life scenarios; I have chosen two of them in order to take a source (square) database, develop a data integration solution with all the steps needed, and then load it into the target (round) database. In this section (most of the course) I take you step by step, from easy and understandable features to more complex scenarios. The beauty of it is that you can go over the whole flow from beginning to end, or use it as a dictionary and look up an example of a specific Pentaho Kettle step.
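To give you a small taste of what a finished transformation looks like when it runs (the course itself does all of this in Spoon, the Kettle GUI), here is a minimal sketch that executes one through Kettle’s Java API. The file name load_customers.ktr and the variable TARGET_SCHEMA are placeholders I made up for the example.

    import org.pentaho.di.core.KettleEnvironment;
    import org.pentaho.di.trans.Trans;
    import org.pentaho.di.trans.TransMeta;

    public class RunCustomerLoad {
        public static void main(String[] args) throws Exception {
            // Boot the Kettle engine (step registry, plugins, logging)
            KettleEnvironment.init();

            // A transformation saved from Spoon; the file name is just an example
            TransMeta meta = new TransMeta("load_customers.ktr");
            Trans trans = new Trans(meta);

            // Pass a variable the transformation is assumed to reference, e.g. ${TARGET_SCHEMA}
            trans.setVariable("TARGET_SCHEMA", "dwh");

            trans.execute(null);        // start all steps of the source-to-target flow
            trans.waitUntilFinished();  // block until the last row reaches the target

            if (trans.getErrors() > 0) {
                throw new RuntimeException("Load finished with errors, check the log");
            }
        }
    }

The same run can also be done with the pan command-line tool that ships with Kettle; the Java version is simply easier to show on one page.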

    Some of the subjects:

  • Understanding the Pentaho Kettle environment
  • Working with variables
  • Repository
  • Connections
  • Jobs
  • Transformations
  • Input steps (tables, files)
  • Handling files
  • Slowly changing dimensions
  • Handling text / strings
  • Sorting / merging / lookups
  • Mapping
  • Calculations
  • Output steps (table output, insert/update, update, dimension lookup…)
  • Scripting steps
  • Handling data types

And many more… I also added five case studies of using steps outside our scenario, because I thought they would help you understand the tool and improve your abilities; a small sketch of how a job runs from code follows below.
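Since Jobs and Transformations are both in the list above: a job is what chains transformations together (plus mail, file checks, and so on). As a hedged sketch, assuming a job file called nightly_load.kjb saved from Spoon, running it from code looks roughly like this:

    import org.pentaho.di.core.KettleEnvironment;
    import org.pentaho.di.core.Result;
    import org.pentaho.di.job.Job;
    import org.pentaho.di.job.JobMeta;

    public class RunNightlyJob {
        public static void main(String[] args) throws Exception {
            KettleEnvironment.init();

            // A job saved from Spoon that chains the transformations; the file name is an example
            JobMeta jobMeta = new JobMeta("nightly_load.kjb", null);
            Job job = new Job(null, jobMeta);   // null = no repository in this sketch

            job.start();
            job.waitUntilFinished();

            Result result = job.getResult();
            if (result.getNrErrors() > 0) {
                throw new RuntimeException("Job failed, see the Kettle log");
            }
        }
    }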

Going to production. There are steps you need to develop in order to make your data integration stable and reliable; some of them are listed below, with a small sketch after the list:

  • Validating data
  • Error handling
  • Logging
  • Performance
  • Scheduling
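As a small, hedged sketch of the production angle (not the course material itself): when the job above is launched from a scheduler such as cron, you typically raise the log level while you stabilize the flow and turn errors into a non-zero exit code so the scheduler can alert you. The file name and the chosen log level here are assumptions for the example.

    import org.pentaho.di.core.KettleEnvironment;
    import org.pentaho.di.core.logging.LogLevel;
    import org.pentaho.di.job.Job;
    import org.pentaho.di.job.JobMeta;

    public class ScheduledRun {
        public static void main(String[] args) throws Exception {
            KettleEnvironment.init();

            JobMeta jobMeta = new JobMeta("nightly_load.kjb", null);
            Job job = new Job(null, jobMeta);

            // More detail in the log while stabilizing the flow; Basic once it is stable
            job.setLogLevel(LogLevel.DETAILED);

            job.start();
            job.waitUntilFinished();

            // A non-zero exit code lets cron (or any scheduler) detect a failed run
            if (job.getResult().getNrErrors() > 0) {
                System.exit(1);
            }
        }
    }

Kitchen, the command-line tool that ships with Kettle for running jobs, does the same thing and is what usually gets handed to the scheduler.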

For more information about Pentaho Kettle, see Wikipedia.