Pentaho Kettle ETL tutorial materials
software and books you should consider having
books
book #1: “Pentaho Kettle Solutions”
by: Matt Casters (who developed Kettle before it was acquired by Pentaho), Roland Bouman and Jos van Dongen.
I use the book as a reference while I develop; it is my guide to specific topics. I bought it as a PDF, and I
search it before I go looking for more information online.
From my point of view this is the “Bible” of Pentaho Kettle.
I think my course will give you most of the material you need, but to get the complete picture I
recommend buying the book too (I did). You will need it for day-to-day development.
It covers some complex topics:
unstructured data handling, web services, XML, clustering, big data and more.
I want to compliment Matt Casters for developing Pentaho Kettle; it is an unbelievable thing for one person to achieve on his own.
book #2: “Pentaho Solutions”
by: Roland Bouman and Jos van Dongen
It covers the full Pentaho suite, with Kettle covered as one part of that suite.
book #3: “Pentaho 3.2 Data Integration”
by: María Carina Roldán. It is a beginner's guide to Pentaho Kettle.
book #4: “The Data Warehouse Toolkit”
by: Ralph Kimball. The “bible” of dimensional modeling, explaining the 34 guidelines for developing an optimized data warehouse,
which in our case is the “target” that the data gets loaded into.
A must-have book if you intend to be a top-notch ETL developer.
software
materials for the course
tool | description | link |
---|---|---|
MySQL | The database we will use for the source and DWH DBs | click here |
JRE | Java Runtime Environment | click here |
Navicat | Best database manager for MySQL and more | click here |
Notepad++ | Text editor | click here |
Sakila database | Sample movie rental DB | click here |
Pentaho PDI | Pentaho Data Integration, a.k.a. Kettle | click here |
Expresso | Regular expression development tool | click here |
Power Architect | Data modeling & profiling tool | click here |
date file | The date dimension | |
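
The date file itself is not reproduced here. To give a rough idea of what a date dimension file usually holds, here is a minimal Python sketch that generates one row per calendar day and writes the rows as CSV. The column names (`date_key`, `full_date`, `quarter` and so on) and the function names are assumptions made for illustration; they are not the layout of the course file.

```python
import csv
import io
from datetime import date, timedelta

# Hypothetical column layout; the real course file may differ.
FIELDS = [
    "date_key", "full_date", "year", "quarter",
    "month", "day_of_month", "day_of_week", "is_weekend",
]

def date_dimension_rows(start, end):
    """Yield one dict per calendar day from start to end (inclusive)."""
    current = start
    while current <= end:
        weekday = current.isoweekday()  # 1 = Monday ... 7 = Sunday
        yield {
            "date_key": int(current.strftime("%Y%m%d")),  # e.g. 20240101
            "full_date": current.isoformat(),
            "year": current.year,
            "quarter": (current.month - 1) // 3 + 1,
            "month": current.month,
            "day_of_month": current.day,
            "day_of_week": weekday,
            "is_weekend": weekday >= 6,
        }
        current += timedelta(days=1)

def write_date_dimension(rows, out):
    """Write the rows as CSV with a header line to a file-like object."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)

if __name__ == "__main__":
    buffer = io.StringIO()
    write_date_dimension(
        date_dimension_rows(date(2024, 1, 1), date(2024, 1, 7)), buffer
    )
    print(buffer.getvalue())
```

A file like this can then be loaded into the DWH database once, and fact rows can reference it through the integer `date_key`.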