Introducing On-Demand Pipeline Execution in AWS Data Pipeline

Marc Beitchman is a Software Development Engineer in the AWS Database Services team.

You can now trigger activation of pipelines in AWS Data Pipeline using the new on-demand schedule type. This functionality is available through the existing AWS Data Pipeline activation API. On-demand schedules make it easy to integrate pipelines in AWS Data Pipeline with other AWS services and with on-premises orchestration engines.

For example, you can build AWS Lambda functions that activate an AWS Data Pipeline execution in response to Amazon CloudWatch cron expression events or Amazon S3 event notifications. You can also invoke the AWS Data Pipeline activation API directly from the AWS CLI or an AWS SDK.
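As a rough illustration of the Lambda approach, here is a minimal Python sketch that calls the activation API through boto3. The function names and the PIPELINE_ID environment variable are assumptions made for this example, not part of the official samples, and the Lambda execution role is assumed to have permission to call datapipeline:ActivatePipeline.

```python
import os


def activate_on_demand(pipeline_id, client=None):
    """Activate an on-demand pipeline and return the ID that was activated.

    `client` is any object exposing boto3's Data Pipeline `activate_pipeline`
    call; it is injectable so the function can be exercised with a stub.
    """
    if client is None:
        import boto3  # imported lazily; only needed when no client is supplied
        client = boto3.client("datapipeline")
    client.activate_pipeline(pipelineId=pipeline_id)
    return pipeline_id


def lambda_handler(event, context):
    # PIPELINE_ID is a hypothetical environment variable set on the function.
    pipeline_id = os.environ["PIPELINE_ID"]
    # The same handler could be wired to a CloudWatch schedule or an S3
    # notification; the event payload is not inspected in this sketch.
    return {"activated": activate_on_demand(pipeline_id)}
```

The equivalent one-off call from the AWS CLI is aws datapipeline activate-pipeline --pipeline-id followed by your pipeline ID.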

To get started, create a new pipeline and, in the Default object, set the property "scheduleType": "ondemand". Setting this property enables on-demand activation of the pipeline.

Note: Activating an on-demand pipeline that is already running cancels the current run and starts a new one. If you do not want activation to cancel a run in progress, check the state of the pipeline before you activate it.
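One way to guard against cancelling a run in progress is to read the pipeline state before activating. The sketch below uses boto3's describe_pipelines call and looks for the @pipelineState field. The helper names and the set of states treated as "busy" are assumptions for illustration; confirm the state values your own pipelines report before relying on them.

```python
# Assumption: these state names mean a run is in progress or about to start.
# Verify them against the @pipelineState values your pipelines actually show.
BUSY_STATES = frozenset({"ACTIVATING", "SCHEDULED", "RUNNING"})


def pipeline_state(client, pipeline_id):
    """Return the pipeline's @pipelineState value, or None if it is absent."""
    response = client.describe_pipelines(pipelineIds=[pipeline_id])
    description = response["pipelineDescriptionList"][0]
    for field in description["fields"]:
        if field["key"] == "@pipelineState":
            return field.get("stringValue")
    return None


def activate_if_idle(client, pipeline_id, busy_states=BUSY_STATES):
    """Activate only when no run appears to be in progress.

    Returns True if activation was requested, False if it was skipped.
    """
    if pipeline_state(client, pipeline_id) in busy_states:
        return False
    client.activate_pipeline(pipelineId=pipeline_id)
    return True
```

Here client would typically be boto3.client("datapipeline"). The check and the activation are separate calls, so two callers racing each other could still both activate; this sketch does not try to prevent that.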

Below is a simple example of a default object configured for on-demand activation.

{
  "id": "Default",
  "scheduleType": "ondemand"
}
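If you create pipelines programmatically rather than from a definition file, the same Default object can be sent through boto3's create_pipeline and put_pipeline_definition calls, which expect each object as an id, a name, and a list of key/value fields. The helper names below are hypothetical, and a real pipeline would also need activities and resources passed in through extra_objects; treat this as a sketch of the shape rather than a complete definition.

```python
def to_pipeline_object(obj):
    """Convert a definition-file style object into the low-level API shape."""
    fields = []
    for key, value in obj.items():
        if key in ("id", "name"):
            continue  # id and name are top-level attributes, not fields
        if isinstance(value, dict) and "ref" in value:
            # References to other pipeline objects use refValue.
            fields.append({"key": key, "refValue": value["ref"]})
        else:
            fields.append({"key": key, "stringValue": str(value)})
    return {"id": obj["id"], "name": obj.get("name", obj["id"]), "fields": fields}


def create_on_demand_pipeline(client, name, unique_id, extra_objects=()):
    """Create a pipeline whose Default object uses the on-demand schedule type."""
    default = {"id": "Default", "scheduleType": "ondemand"}
    pipeline_id = client.create_pipeline(name=name, uniqueId=unique_id)["pipelineId"]
    objects = [to_pipeline_object(o) for o in (default, *extra_objects)]
    client.put_pipeline_definition(pipelineId=pipeline_id, pipelineObjects=objects)
    return pipeline_id
```

In practice you would also inspect the response from put_pipeline_definition for validation errors before activating the pipeline.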

The screenshot below shows an on-demand pipeline with two Hadoop activities. The pipeline has been run three times.

Check out our samples in the AWS Data Pipeline samples GitHub repository. These samples show you how to create an AWS Lambda function that triggers an on-demand pipeline activation in response to ObjectCreated (new file) events in Amazon S3, and how to trigger an on-demand pipeline activation in response to Amazon CloudWatch cron expression events.

If you have questions or suggestions, please leave a comment below.
