Jobs are used to coordinate ETL activities such as:
- Defining the flow and dependencies that control the order in which the transformations run.
- Preparing for execution by checking conditions such as "Is my source file available?" or "Does a table exist?"
- Performing bulk load database operations.
- Assisting file management, such as posting or retrieving files using FTP, copying files, and deleting files.
- Sending success or failure notifications through email.
For this part of the tutorial, imagine that an external system is responsible for placing your sales_data.csv input file in its source location every Saturday night at 9 p.m. You want to create a job that verifies that the file has arrived and then runs the transformation to load the records into the database. In a subsequent exercise, you will schedule the job to run every Sunday morning at 9 a.m.
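To make the control flow of such a job concrete, here is a minimal Python sketch of the same logic: check that the file exists, and only then run the load. It is an illustration only, not how the tool builds or executes jobs. The names `run_job` and `load_records` are hypothetical, and `load_records` is just a stand-in for the Getting Started transformation.

```python
from pathlib import Path
from typing import Callable


def run_job(source_file: Path, run_transformation: Callable[[Path], None]) -> str:
    """Mimic the job: verify the source file exists, then run the transformation."""
    if not source_file.is_file():
        # The external system has not delivered the file, so stop here.
        return "aborted: source file not found"
    # The precondition holds, so hand the file to the transformation.
    run_transformation(source_file)
    return "success"


def load_records(path: Path) -> None:
    # Hypothetical stand-in for the transformation that loads the records.
    print(f"Loading records from {path}")


if __name__ == "__main__":
    # Assumes sales_data.csv sits in the current directory; adjust as needed.
    print(run_job(Path("sales_data.csv"), load_records))
```

The file check comes first so that a missing file stops the run before any database work starts, which is the same ordering the job in this exercise enforces.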
The following steps assume that you have built a Getting Started transformation as described in Step 1: Extract and load data of the tutorial.