How to beat application DDoS attacks with CrowdSec & Cloudflare
Distributed denial-of-service (DDoS) attacks have targeted all types of businesses over the past few years. Attackers have used them for a long time, and although they are among the most common attacks, they remain extremely effective and harmful.
The concept is simple: hackers hammer a given target from many different locations to take it down (and usually ask for money afterward as a condition to stop the attack).
There are several types of DDoS:
- L3 DDoS (referring to OSI Model Layer 3). L3 is the network layer, and L3 DDoS attacks typically target network equipment and connections, flooding them until there are simply too many packets to handle. It’s a tremendously effective attack, but most major hosting providers and hyperscalers are well protected against it nowadays.
- L7 DDoS (for Layer 7) directly targets applications. The goal here isn’t to shut down the network but rather the service itself, by flooding it with application-level requests (usually web requests), leading to a resource shortage (CPU, RAM, or both).
E-commerce sites are among the usual victims: an e-commerce site that is down is a site that isn’t making money. There are many ways and tools to perform this kind of attack and many layers of defense, but today we will focus on application-layer (Layer 7) distributed denial of service, or L7 DDoS for short.
How do application-layer DDoS attacks work?
Thousands of machines, behind thousands of distinct public IP addresses, send HTTP requests to a specific target. How widely the attack is distributed is a key part of what makes it difficult to mitigate.
L7 DDoS attacks are complex to handle. With only one or two HTTP requests per source, simple IP-level remediation is nearly useless: banning the IPs found in your logs is pointless, as they are never reused. One solution is to bring geolocation into the countermeasures: attacks come mostly from countries X and Y, or from Autonomous Systems (AS) X and Y. However, banning these countries outright means false positives and collateral damage: legitimate users from the countries where the attack originates would be denied access to the website, which might not be acceptable (e.g. for an e-commerce site).
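To make the problem concrete, here is a minimal Python sketch on synthetic traffic. The addresses, the country codes, and the per-IP threshold of 10 are all made up for illustration. It shows that a per-IP counter finds nothing to ban, while counting distinct IPs per country makes the attack stand out.

```python
from collections import Counter, defaultdict


def per_ip_offenders(requests, max_per_ip):
    """Return the IPs that sent more than max_per_ip requests."""
    counts = Counter(ip for ip, _country in requests)
    return {ip for ip, n in counts.items() if n > max_per_ip}


def distinct_ips_by_country(requests):
    """Map each country code to the set of distinct source IPs seen."""
    seen = defaultdict(set)
    for ip, country in requests:
        seen[country].add(ip)
    return seen


# Synthetic traffic (hypothetical): 1,000 attacking IPs from country "XX"
# sending two requests each, plus 30 legitimate visitors from "FR"
# sending five requests each.
attack = [(f"10.0.{i // 250}.{i % 250}", "XX") for i in range(1000) for _ in range(2)]
legit = [(f"192.0.2.{i}", "FR") for i in range(30) for _ in range(5)]
requests = attack + legit

offenders = per_ip_offenders(requests, max_per_ip=10)
by_country = distinct_ips_by_country(requests)

print(len(offenders))                                   # IPs a per-IP rule would ban
print({c: len(ips) for c, ips in by_country.items()})   # distinct IPs per country
```

No single address crosses the per-IP threshold, yet one country accounts for a thousand distinct sources. That per-country (or per-AS) signal is the one worth acting on.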
But fear not, using captchas as a countermeasure rather than purely dropping the attackers would allow legitimate humans to go through and bad bots to stay outside at a reasonable cost. Let’s see how that works out with CrowdSec.
We started a t2.medium instance (2 Xeon cores @ 2.4 GHz / 4 GB RAM) with Apache2 and MySQL, running WordPress with WooCommerce.
We used a third-party “booter” (read: DoS/DDoS service) to generate the attack. After launching a 20-minute L7 DDoS from this platform, we saw between one and two thousand distinct IPs being involved, and inevitably, after a minute or so, the site crashed.
To provide a bit more context, here are some metrics about one of the attacks:
CPU usage remains very high during the whole test.
Traffic decreases because the website is no longer accessible.
During the attack, we saw:
- 70K HTTP requests
- 1,150 unique IP addresses (fairly well distributed across the IPv4 address space)
Our take on the problem
As is very often the case with “state-of-the-art” L7 DDoS, we assume that we are only going to see one or two HTTP requests per attacking IP and that we need to be able to make decisions at the country or AS level. This is where the Cloudflare bouncer comes into play.
Why use a Cloudflare bouncer? Through its API, Cloudflare allows setting rules that target not only IPs and ranges but also ASes and countries, which is convenient for our use case, and the available remediations include captchas and JavaScript challenges. Here is what we would like to achieve (national flags are arbitrarily pictured and do not reflect any reality):
During the L7 DDoS, the countries where most of the attack originates will be subject to captchas (China, Indonesia, and India in our example), while other countries (here France and Spain) will stay unaffected. Legitimate users from the affected countries will still get through, simply by solving a captcha.
To achieve this, the CrowdSec agent will read Apache2 logs, detect the sources of the ongoing DDoS, and emit decisions that the Cloudflare bouncer will consume to instruct Cloudflare (grey block) to block the attackers. To that end, we will create a CrowdSec scenario that detects excessive traffic from a given country, autonomous system, or IP/range. When it triggers, captcha rules are applied to the source country/AS/range/IP via the Cloudflare API. Note that for this to work correctly, Apache2 must be configured to handle the X-Forwarded-For HTTP header so that its logs show the actual attackers’ IPs and not Cloudflare’s.
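The X-Forwarded-For requirement can be illustrated with a small Python sketch of the underlying logic. Apache does this through its own configuration, not through code like this. The trusted range below is a documentation network standing in for the proxy provider's published ranges, and `client_ip` is a hypothetical helper name.

```python
import ipaddress

# Hypothetical trusted proxy range (a documentation network). In practice this
# would be the list of ranges published by the reverse-proxy provider.
TRUSTED_PROXIES = [ipaddress.ip_network("203.0.113.0/24")]


def is_trusted(addr):
    """Return True if addr belongs to a trusted proxy range.

    Raises ValueError if addr is not a valid IP address.
    """
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in TRUSTED_PROXIES)


def client_ip(peer_addr, x_forwarded_for):
    """Return the address a request should be attributed to in the logs."""
    # Only honour the header when the TCP peer is a trusted proxy;
    # otherwise anyone could spoof their source address.
    if not is_trusted(peer_addr) or not x_forwarded_for:
        return peer_addr
    hops = [h.strip() for h in x_forwarded_for.split(",") if h.strip()]
    # Walk from the right: the right-most hop that is not a trusted proxy
    # is the closest address we can rely on.
    for hop in reversed(hops):
        try:
            if not is_trusted(hop):
                return hop
        except ValueError:
            return peer_addr  # malformed header: fall back to the peer
    return peer_addr


print(client_ip("203.0.113.7", "198.51.100.23"))
```

Without this step, every request would appear to come from the proxy itself, and the per-country and per-AS counts would describe Cloudflare instead of the attackers.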
Trial by fire
We installed CrowdSec (straight from the doc) along with the Cloudflare bouncer. No specific tuning was done, and CrowdSec is running on the targeted machine. Firing the same test leads to very different results.
First of all, from a Cloudflare standpoint, we can see that our rules are hitting spot-on.
Regarding resources consumed by the machine, the results are very different. Within 5 minutes after the attack started, resources are back to normal, and around 2 of those 5 minutes are the delay for Cloudflare to apply the rules.
Both CPU and network go back to normal within 5 minutes.
Here is a short video showing what the attack and the remediation look like.
The scenario itself is rather straightforward and will simultaneously count the number of IPs coming from the same range, country, or AS. If this number reaches a threshold, we will emit a decision of type captcha on the source country, range, or AS.
type: leaky
#debug: true
name: crowdsecurity/http-ddos-by-ASN
description: "Detect and prevent applicative DDoS by leveraging CF-like bouncers"
filter: "evt.Meta.ASNNumber != '0' && evt.Meta.service == 'http' && evt.Meta.http_status == '200' && evt.Parsed.static_ressource == 'false'"
groupby: "evt.Meta.ASNNumber"
distinct: "evt.Meta.source_ip"
capacity: 20
leakspeed: "10s"
blackhole: 5m
labels:
  service: http
  type: scan
  remediation: true
scope:
  type: AS
  expression: evt.Meta.ASNNumber
---
type: leaky
#debug: true
name: crowdsecurity/http-ddos-by-cn
description: "Detect and prevent applicative DDoS by leveraging CF-like bouncers"
filter: "evt.Meta.IsoCode != '' && evt.Meta.service == 'http' && evt.Meta.http_status == '200' && evt.Parsed.static_ressource == 'false'"
groupby: "evt.Meta.IsoCode"
distinct: "evt.Meta.source_ip"
capacity: 50
leakspeed: "30s"
blackhole: 5m
labels:
  service: http
  type: scan
  remediation: true
scope:
  type: Country
  expression: evt.Meta.IsoCode
In case you are not familiar with the syntax of a scenario, let’s take a look at the crowdsecurity/http-ddos-by-cn scenario.
The scenario filters on incoming HTTP requests that are not targeting static resources.
filter: "evt.Meta.IsoCode != '' && evt.Meta.service == 'http' && evt.Meta.http_status == '200' && evt.Parsed.static_ressource == 'false'"
The bucket will group events by source country and will only count distinct IPs (as the goal is to count the number of unique IPs coming from a country at a given time).
groupby: "evt.Meta.IsoCode"
distinct: "evt.Meta.source_ip"
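The bucket has a capacity of 50 and leaks one entry every 30 seconds: once more than 50 distinct IPs from the same country accumulate faster than the leak can absorb them, the bucket overflows and the scenario triggers. After an overflow, the blackhole setting silences that bucket for 5 minutes so the same country does not raise a new alert on every request.
capacity: 50
leakspeed: "30s"
blackhole: 5m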
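The interaction between groupby, distinct, capacity, leakspeed, and blackhole is easier to see in a toy simulation. The Python sketch below is a simplified model written for this article, not CrowdSec's implementation: the class name, the `detect` helper, and the exact overflow and reset rules are assumptions.

```python
from collections import defaultdict


class DistinctLeakyBucket:
    """Toy leaky bucket that only counts distinct source IPs."""

    def __init__(self, capacity, leak_interval, blackhole):
        self.capacity = capacity            # entries held before overflowing
        self.leak_interval = leak_interval  # seconds for one entry to leak out
        self.blackhole = blackhole          # seconds of silence after an overflow
        self.level = 0
        self.seen = set()
        self.last_leak = None
        self.muted_until = float("-inf")

    def pour(self, now, source_ip):
        """Add one event; return True if the bucket overflows."""
        if now < self.muted_until:
            return False  # blackhole period: ignore events
        if self.last_leak is None:
            self.last_leak = now
        # Leak one entry per elapsed leak_interval.
        leaked = int((now - self.last_leak) // self.leak_interval)
        if leaked:
            self.level = max(0, self.level - leaked)
            self.last_leak += leaked * self.leak_interval
            if self.level == 0:
                self.seen.clear()
        # "distinct": an IP already counted does not fill the bucket further.
        if source_ip in self.seen:
            return False
        self.seen.add(source_ip)
        self.level += 1
        if self.level > self.capacity:
            self.level = 0
            self.seen.clear()
            self.last_leak = None
            self.muted_until = now + self.blackhole
            return True
        return False


def detect(events, capacity=50, leak_interval=30.0, blackhole=300.0):
    """events: iterable of (timestamp, country, ip), sorted by timestamp."""
    # "groupby": one bucket per country code.
    buckets = defaultdict(lambda: DistinctLeakyBucket(capacity, leak_interval, blackhole))
    return [(ts, country) for ts, country, ip in events if buckets[country].pour(ts, ip)]


# Hypothetical traffic: one new IP per second from "XX" for a minute, and ten
# returning visitors from "FR" over the same period.
xx = [(t, "XX", f"10.0.0.{t}") for t in range(60)]
fr = [(t, "FR", f"192.0.2.{t % 10}") for t in range(60)]
alerts = detect(sorted(xx + fr))
print(alerts)
```

In this model only the "XX" bucket overflows, and it does so once: the ten returning "FR" visitors never fill their bucket, and the blackhole period swallows the remaining "XX" events.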
If the scenario is triggered, it will launch a remediation against the source country.
labels:
  remediation: true
scope:
  type: Country
  expression: evt.Meta.IsoCode
By default, CrowdSec will only process decisions that target specific IPs. In our case, however, we need to be able to apply decisions targeting autonomous systems or even countries. To do so, we need to edit /etc/crowdsec/profiles.yaml to add the appropriate blocks:
name: default_ip_remediation
#debug: true
filters:
 - Alert.Remediation == true && Alert.GetScope() == "Ip"
decisions:
 - type: ban
   duration: 4h
on_success: break
---
name: default_country_remediation
#debug: true
filters:
 - Alert.Remediation == true && Alert.GetScope() == "Country"
decisions:
 - type: ban
   duration: 4h
on_success: break
---
name: default_AS_remediation
#debug: true
filters:
 - Alert.Remediation == true && Alert.GetScope() == "As"
decisions:
 - type: ban
   duration: 4h
on_success: break
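For readers unfamiliar with profiles, the following Python sketch models how such a list could be evaluated: profiles are tried in order, the first one whose filter matches produces its decisions, and `on_success: break` stops the evaluation. This is an illustrative model only. The dictionary layout and the `decisions_for` helper are assumptions, and scope strings are compared verbatim, without modelling any normalization CrowdSec may apply.

```python
# Simplified, hypothetical representation of the three profiles above.
PROFILES = [
    {"name": "default_ip_remediation", "scope": "Ip",
     "decision": {"type": "ban", "duration": "4h"}, "on_success": "break"},
    {"name": "default_country_remediation", "scope": "Country",
     "decision": {"type": "ban", "duration": "4h"}, "on_success": "break"},
    {"name": "default_AS_remediation", "scope": "As",
     "decision": {"type": "ban", "duration": "4h"}, "on_success": "break"},
]


def decisions_for(alert, profiles=PROFILES):
    """Return the decisions produced for an alert by the first matching profiles."""
    decisions = []
    for profile in profiles:
        # Mirrors: Alert.Remediation == true && Alert.GetScope() == "<scope>"
        if alert["remediation"] and alert["scope"] == profile["scope"]:
            decisions.append({**profile["decision"],
                              "scope": alert["scope"],
                              "value": alert["value"]})
            if profile["on_success"] == "break":
                break  # stop at the first profile that matched
    return decisions


country_alert = {"remediation": True, "scope": "Country", "value": "XX"}
print(decisions_for(country_alert))
```

An alert whose scope matches none of the profiles yields no decision at all, which is why the country and AS blocks have to be added explicitly.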
Leveraging the Cloudflare bouncer
Once we have detected the attack sources, we can leverage Cloudflare features to stop the attack selectively: the bouncer will create and update firewall rules in Cloudflare via its API to protect against the ongoing attacks. We don’t want to ban legitimate users from the affected countries, so we can set up rules that will only force users to pass a Captcha. Humans will go through, and bots won’t.
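To give an idea of what such a rule can look like, here is a hedged Python sketch that turns a list of decisions into a single Cloudflare filter expression. The field names `ip.geoip.country`, `ip.geoip.asnum`, and `ip.src` come from Cloudflare's rules language as we understand it. The `build_expression` helper and the decision layout are hypothetical, and the rules the bouncer actually generates may be structured differently.

```python
def build_expression(decisions):
    """Build one filter expression covering country, AS, and IP/range decisions."""
    countries = sorted({d["value"].upper() for d in decisions
                        if d["scope"].lower() == "country"})
    asns = sorted({int(d["value"]) for d in decisions
                   if d["scope"].lower() == "as"})
    sources = sorted({d["value"] for d in decisions
                      if d["scope"].lower() in ("ip", "range")})

    clauses = []
    if countries:
        quoted = " ".join(f'"{c}"' for c in countries)
        clauses.append(f"ip.geoip.country in {{{quoted}}}")
    if asns:
        clauses.append(f"ip.geoip.asnum in {{{' '.join(str(a) for a in asns)}}}")
    if sources:
        clauses.append(f"ip.src in {{{' '.join(sources)}}}")
    # Any matching clause is enough to trigger the challenge.
    return " or ".join(f"({c})" for c in clauses)


# Hypothetical decisions; 64496 is an AS number reserved for documentation.
decisions = [
    {"scope": "Country", "value": "CN"},
    {"scope": "Country", "value": "ID"},
    {"scope": "As", "value": "64496"},
    {"scope": "Ip", "value": "192.0.2.10"},
]
print(build_expression(decisions))
```

Pairing an expression like this with a captcha action, not a block action, is what keeps legitimate visitors from the listed countries able to reach the site.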
A rule generated by the bouncer, exposing various countries, IPs and AS to captchas, effectively blocking bots.
While we are already pleased with these early results, it’s only the beginning. We are planning to enhance these capabilities as follows:
- The threshold defined in the scenarios will vary from one infrastructure to another: this is something that will be auto-configured by observing the average traffic by country in normal conditions to set appropriate thresholds.
- The booter service we used is somewhat disappointing and unstable: the number of IPs involved varies between attacks but rarely exceeds 1,200. Have a good L7 DDoS booter service to recommend?
- Current scenarios are great to stop raw L7 DDoS, but we will improve them to address more business cases such as credit card stuffing.
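As an illustration of the auto-configuration idea in the first item above, one simple approach would be to derive a bucket capacity from the number of distinct IPs observed per time window under normal traffic. The Python sketch below is an assumption about how this might be done, not a description of a planned CrowdSec feature. The `suggest_capacity` helper, the multiplier `k`, and the floor value are arbitrary.

```python
import math
import statistics


def suggest_capacity(samples, k=3.0, floor=10):
    """Suggest a capacity from per-window distinct-IP counts seen in normal traffic.

    samples: distinct IPs observed per window for one country (or AS).
    k:       how many standard deviations above the mean to tolerate.
    floor:   lower bound so that quiet countries are not flagged by a few visitors.
    """
    mean = statistics.fmean(samples)
    spread = statistics.pstdev(samples)
    return max(floor, math.ceil(mean + k * spread))


# Hypothetical baseline: distinct IPs per 30-second window for one country.
baseline = [10, 12, 8, 10]
print(suggest_capacity(baseline))
```

A site with heavy, bursty traffic from a given country would end up with a higher capacity for that country than a site that rarely sees visitors from it.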
Now you know how easy it is to mitigate an application-layer DDoS attack using CrowdSec and Cloudflare. So why not take the leap?