Internet-facing web applications are frequently scanned and probed by various sources, sometimes for good and other times to identify weaknesses. It takes some sleuthing to determine the probable intent of such exploit attempts, especially if you do not have tools in place that identify them. One way you can identify and block unwanted traffic is to use AWS WAF, a web application firewall that helps protect web applications from exploit attempts that can compromise security or place unnecessary load on your application.
Typically, exploit attempts are automated. Their intent is to collect information about your web application, such as the software version and exposed URLs. Think of these attempts as “reconnaissance missions” that gather data about where your web application might be vulnerable. To find out what is vulnerable, these exploit attempts send out a series of requests to see if they get any responses. Along the way, these attempts usually generate several error codes (HTTP 4xx error codes) as they try to determine what is exposed. Even normal requests can generate these error codes, but if you see a high number of errors coming from a single IP address, this is a good indication that somebody (or something) might not have good intentions for your web application. If you are delivering your web application with Amazon CloudFront, you can see these error codes in CloudFront access logs. Based on these error codes, you can configure AWS Lambda to update AWS WAF and block requests that generate too many error codes.
In this blog post, I show you how to create a Lambda function that automatically parses CloudFront access logs as they are delivered to Amazon S3, counts the number of bad requests from unique sources (IP addresses), and updates AWS WAF to block further requests from those IP addresses. I also provide a CloudFormation template that creates the web access control list (ACL), rule sets, Lambda function, and logging S3 bucket so that you can try this yourself.
This solution expands on a recent blog post by my colleague Heitor Vital, who provided a comprehensive how-to guide for implementing rate-based blacklisting. I use the same concept in this post, but I block IP addresses based on the number of HTTP 4xx error codes instead of total requests. This solution assumes you have a CloudFront distribution and are already familiar with CloudFormation.
The following architecture diagram shows the flow of this solution, which works in this way:
- CloudFront delivers access log files for your distribution up to several times an hour to the S3 bucket you have configured. This bucket must reside in the same AWS region as the Lambda function and the CloudFormation stack that you create for this example.
- As new log files are delivered to the S3 bucket, the custom Lambda function is triggered. The Lambda function parses the log files and looks for requests that resulted in error codes 400, 403, 404, and 405. The function counts the number of bad requests and temporarily stores the results in current_outstanding_requesters.json in the configured S3 bucket.
- IP addresses whose number of bad requests is above the threshold that you define are blocked by updating an auto block IP match condition in AWS WAF.
- The Lambda function also updates CloudWatch with custom metrics that count the number of requests and the number of IP addresses blocked (you could set CloudWatch alarms on these metrics as well).
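The parsing and counting steps above can be sketched in a few lines of Python. This is a minimal illustration, not the function shipped with the template: the names count_bad_requests and find_offenders are hypothetical, the field positions (c-ip in column 4, sc-status in column 8, counting from zero) follow the standard tab-separated CloudFront access log layout, and bucketing requests by calendar minute is an assumption about how the one-minute window is measured.

```python
from collections import Counter

ERROR_CODES = {"400", "403", "404", "405"}  # status codes named in this post
REQUEST_THRESHOLD = 50                      # template default for Request Threshold


def count_bad_requests(log_lines):
    """Count error responses per (client IP, minute) in CloudFront log lines."""
    counts = Counter()
    for line in log_lines:
        # Skip blank lines and the "#Version" / "#Fields" header lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # malformed or truncated line
        date, time_of_day, client_ip, status = fields[0], fields[1], fields[4], fields[8]
        if status in ERROR_CODES:
            minute = f"{date} {time_of_day[:5]}"  # truncate HH:MM:SS to HH:MM
            counts[(client_ip, minute)] += 1
    return counts


def find_offenders(counts, threshold=REQUEST_THRESHOLD):
    """Return IPs whose bad-request count was above the threshold in any minute."""
    return {ip for (ip, _minute), n in counts.items() if n > threshold}
```

Keeping the count keyed by both IP address and minute means that a client spreading a few errors over a long period is not penalized, while a burst of errors inside one minute is.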
Lambda function overview
Here's how the Lambda function is composed:
The relevant Python modules (shown in the following screenshot) are imported for use.
The next section (shown in the following screenshot) contains some configurable items such as error codes (line 29) for which to search, and the OUTPUT_FILE_NAME for storage of the function’s output in S3. The LINE_FORMAT parameters determine the format of the log; the settings are for CloudFront, but the function may be modified for other log formats.
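To make the role of these settings concrete, here is a hedged sketch of what such a configuration block and a format-driven parser could look like. OUTPUT_FILE_NAME and LINE_FORMAT are names from the function, but the exact keys and values below, the parse_line helper, and the SPACE_FORMAT example are assumptions for illustration only.

```python
# Illustrative values; the real function's settings may differ.
ERROR_CODES = ["400", "403", "404", "405"]
OUTPUT_FILE_NAME = "current_outstanding_requesters.json"
LINE_FORMAT = {
    "delimiter": "\t",  # CloudFront access logs are tab-separated
    "date": 0,
    "time": 1,
    "source_ip": 4,     # c-ip field
    "code": 8,          # sc-status field
}

# Hypothetical format for a space-separated log: "<ip> <date> <time> <status>".
SPACE_FORMAT = {"delimiter": " ", "source_ip": 0, "date": 1, "time": 2, "code": 3}


def parse_line(line, line_format=LINE_FORMAT):
    """Return (ip, timestamp, status) for a log line, or None if it cannot be parsed."""
    if line.startswith("#"):
        return None  # header/comment line
    fields = line.rstrip("\n").split(line_format["delimiter"])
    last_index = max(line_format[name] for name in ("date", "time", "source_ip", "code"))
    if len(fields) <= last_index:
        return None  # blank or truncated line
    timestamp = fields[line_format["date"]] + " " + fields[line_format["time"]]
    return fields[line_format["source_ip"]], timestamp, fields[line_format["code"]]
```

With this shape, adapting the parser to another log source is a matter of passing a different format dictionary, for example parse_line(line, SPACE_FORMAT), rather than changing the parsing code.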
The following list outlines the main functions and their purpose:
- def get_outstanding_requesters – Handles the parsing of the log file.
- def merge_current_blocked_requesters – Handles the expiration of blocks.
- def write_output – Writes the current IP block status to current_outstanding_requesters.json.
- def waf_get_ip_set – Gets the current AWS WAF service IPSet.
- def get_ip_set_already_blocked – Determines if the IPSet is already blocked.
- def update_waf_ip_set – Performs the update to the AWS WAF IPSet.
- def lambda_handler – Reads the values configured in CloudFormation such as S3 bucket, and updates CloudWatch metrics.
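As an illustration of the last steps in this list, the following sketch shows one way to turn "what is blocked now" and "what should be blocked" into an IPSet update. It is not the code from the template: build_ip_set_updates and apply_ip_set_updates are hypothetical names, and waf_client is assumed to be a boto3 client for the classic AWS WAF API (boto3.client("waf")), whose get_change_token and update_ip_set calls are used here.

```python
def build_ip_set_updates(already_blocked, should_be_blocked):
    """Diff two sets of IPv4 addresses into a list of IPSet update entries."""
    updates = []
    # Addresses that should be blocked but are not yet in the IPSet.
    for ip in sorted(should_be_blocked - already_blocked):
        updates.append({
            "Action": "INSERT",
            "IPSetDescriptor": {"Type": "IPV4", "Value": f"{ip}/32"},
        })
    # Addresses whose block has lapsed and should leave the IPSet.
    for ip in sorted(already_blocked - should_be_blocked):
        updates.append({
            "Action": "DELETE",
            "IPSetDescriptor": {"Type": "IPV4", "Value": f"{ip}/32"},
        })
    return updates


def apply_ip_set_updates(waf_client, ip_set_id, updates):
    """Send the updates to AWS WAF; returns False when there is nothing to change."""
    if not updates:
        return False
    # Every change to a classic WAF resource requires a fresh change token.
    token = waf_client.get_change_token()["ChangeToken"]
    waf_client.update_ip_set(IPSetId=ip_set_id, ChangeToken=token, Updates=updates)
    return True
```

Computing the difference first keeps the number of WAF API calls low: if a log file produces no change in the set of blocked addresses, no update request is made at all.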
Deploying the Auto Block Solution—Using the AWS Management Console
Let’s get started! In the CloudFormation console:
- Ensure you select a region where Lambda is available. See the Region Table for current availability.
- Click Create Stack, and then specify the template URL: https://s3.amazonaws.com/awswaf.us-east-1/block-bad-behaving-ips/block-bad-behaving-ips_template.json. Click Next.
Type a name for your stack as well as the following parameters (also shown in the following screenshot):
- S3 Location – The stack will create a new S3 bucket where CloudFront log files are to be stored.
- Request Threshold – The number of bad requests in a one-minute period from a single source IP address to trigger a block condition. The default value is 50.
- WAF Block Period – The duration (in seconds) for which the bad IP address will be blocked. The default value is 4 hours (14400 seconds).
- Optional: Complete the creation of the stack by entering tag details. Click Next.
- Acknowledge that you are aware of the changes and associated costs with this stack creation by selecting the check box in the Capabilities section. Click Create.
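To see how the WAF Block Period parameter plays out over time, consider this minimal sketch of block expiration. The merge_blocks helper and the shape of its inputs (a dictionary mapping each IP address to the epoch second at which its block started) are assumptions made for illustration, as is the choice to restart the block period when an already blocked address offends again.

```python
BLOCK_PERIOD = 14400  # seconds; template default of 4 hours


def merge_blocks(previous, new_offenders, now, block_period=BLOCK_PERIOD):
    """Drop expired blocks and add new ones.

    previous: dict of IP -> block start time (epoch seconds)
    new_offenders: iterable of IPs found in the latest log file
    now: current time in epoch seconds
    """
    # Keep only blocks that are still inside their block period.
    active = {ip: start for ip, start in previous.items() if now - start < block_period}
    # New or repeat offenders start (or restart) their block period now.
    for ip in new_offenders:
        active[ip] = now
    return active
```

For example, with the default of 14400 seconds, an address blocked at 09:00 drops out of the result of the first call made at or after 13:00, unless it appears among the new offenders again. In a Lambda function, now would typically come from time.time().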
Testing the Auto Block Solution
After the stack has been created, you can test the Lambda function by copying this test file into the S3 bucket that was created. You can also test the function by copying your own CloudFront logs from your current logging bucket. In the Lambda console, click the Monitoring tab and then click through to view the logs in CloudWatch Logs. You will also notice that the S3 bucket now contains a file, current_outstanding_requesters.json, which details the IP addresses that are currently blocked. This is how the Lambda function stores state between invocations.
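Because Lambda functions keep no memory between invocations, the state file is what carries the block list from one log delivery to the next. The sketch below shows the general pattern of reading and writing such a file. The load_state and save_state helpers and the JSON layout (IP address mapped to block start time) are hypothetical, and s3_client is assumed to be a boto3 S3 client (boto3.client("s3")).

```python
import json

OUTPUT_FILE_NAME = "current_outstanding_requesters.json"


def load_state(s3_client, bucket, key=OUTPUT_FILE_NAME):
    """Return the saved state, or an empty dict if no state file exists yet."""
    try:
        body = s3_client.get_object(Bucket=bucket, Key=key)["Body"].read()
    except s3_client.exceptions.NoSuchKey:
        return {}  # first invocation: nothing has been blocked so far
    return json.loads(body)


def save_state(s3_client, bucket, state, key=OUTPUT_FILE_NAME):
    """Write the state back to S3 as JSON for the next invocation."""
    s3_client.put_object(
        Bucket=bucket,
        Key=key,
        Body=json.dumps(state).encode("utf-8"),
        ContentType="application/json",
    )
```

Opening the file in the S3 console after a test run is a quick way to confirm which addresses the function currently considers blocked.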
If you are satisfied that the auto block solution is working correctly, you may want to configure a CloudFront distribution to store log files in the new bucket by using the CloudFront console. As the logs are processed, you can verify the Lambda function is running correctly in the Lambda console. Additionally, you can use the AWS WAF console to check on current IP blocks. In the AWS WAF console, you will see the web ACL named Malicious Requesters, and an Auto Block Rule linked to an IP match condition called Auto Block Set. You will also see a Manual Block Rule to which you can add IP addresses you want to block manually.
Finally, to enforce the blocking, configure AWS WAF in CloudFront by associating the web ACL with your CloudFront distribution. In the CloudFront console, select Distribution Settings on the distribution you wish to enable for AWS WAF, and modify the AWS WAF Web ACL setting, as shown in the following screenshot.
Remember that 4xx errors can also be returned in response to normal requests. Determine which threshold is right for you, and ensure you do not have any missing pages or images somewhere on your site that are generating 404 errors. If you are not sure which threshold is right for you, test the rules first by changing the rule action to Count instead of Block, and viewing web request samples. When you are confident in your rules, you can change the rule action to Block.
This blog post has shown you how to configure a solution that automatically blocks IP addresses based on their error count. If you have ideas or questions, submit them in the “Comments” section below or on the AWS WAF forum.