AWS Big Data Blog

Implement a Real-time, Sliding-Window Application Using Amazon Kinesis and Apache Storm

Rahul Bhartia is an AWS Solutions Architect.

Streams of data are becoming ubiquitous today – clickstreams, log streams, event streams, and more. The need for real-time processing of high-volume data streams is pushing the limits of traditional data processing infrastructures. Building a clickstream monitoring system, for example, where data is in the form of a continuous clickstream rather than discrete data sets, requires the use of continuous processing rather than ad-hoc, one-time queries.

Developers can use Apache Storm and Amazon Kinesis to quickly and cost-effectively build an application that continuously processes very high volumes of streaming data. To help developers integrate Apache Storm with Amazon Kinesis, earlier this year we launched the Amazon Kinesis Storm Spout. Last week we released an update to the Spout to support Ack/Fail semantics. With this update, the Spout now re-emits failed messages up to the configured retry limit, making it easier to build reliable data processing applications. The updated Amazon Kinesis Storm Spout is available on GitHub.
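To make the Ack/Fail behavior concrete, here is a minimal, library-free sketch of the idea: a source that re-emits a failed message until a retry limit is reached. It is an illustration only and does not use the Amazon Kinesis Storm Spout or Apache Storm APIs; the names `RetryingSource`, `run`, and `retry_limit` are hypothetical, and the choice to set aside a message once its retries are exhausted is an assumption made for this example.

```python
from collections import deque


class RetryingSource:
    """Toy stand-in for a spout (hypothetical; not the Kinesis Storm Spout API)."""

    def __init__(self, records, retry_limit):
        # Each record gets a message id so acks and fails can be matched to it.
        self.pending = deque(enumerate(records))
        self.retry_limit = retry_limit
        self.in_flight = {}  # message id -> record awaiting ack or fail
        self.attempts = {}   # message id -> number of re-emits so far
        self.dropped = []    # records whose retries were exhausted (assumed policy)

    def next_tuple(self):
        """Emit the next record, or None when nothing is waiting."""
        if not self.pending:
            return None
        msg_id, record = self.pending.popleft()
        self.in_flight[msg_id] = record
        return msg_id, record

    def ack(self, msg_id):
        """Downstream processing succeeded: forget the message."""
        self.in_flight.pop(msg_id, None)
        self.attempts.pop(msg_id, None)

    def fail(self, msg_id):
        """Downstream processing failed: re-queue until the retry limit is hit."""
        record = self.in_flight.pop(msg_id, None)
        if record is None:
            return
        retries = self.attempts.get(msg_id, 0)
        if retries < self.retry_limit:
            self.attempts[msg_id] = retries + 1
            self.pending.append((msg_id, record))
        else:
            self.attempts.pop(msg_id, None)
            self.dropped.append(record)


def run(source, process):
    """Drive the source to completion, acking successes and failing errors."""
    processed = []
    while True:
        emitted = source.next_tuple()
        if emitted is None:
            return processed
        msg_id, record = emitted
        try:
            processed.append(process(record))
            source.ack(msg_id)
        except Exception:
            source.fail(msg_id)
```

With `retry_limit=2`, a record is attempted at most three times in this sketch: the original emit plus two re-emits. A record that keeps failing ends up in `dropped` instead of blocking the rest of the stream.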

Check out the white paper to learn how the entire stack works, all the way from ingestion to visualization, and see our GitHub repository for further instructions on how to build and deploy it yourself.

If you have questions or suggestions, please leave a comment below.

Do more with Amazon Kinesis!

Processing Amazon Kinesis Stream Data Using Amazon KCL for Node.js

Hosting Amazon Kinesis Applications on AWS Elastic Beanstalk

Snakes in the Stream! Feeding and Eating Amazon Kinesis Streams with Python