Preemptively blocking malicious IPs is not just good for your security posture; it’s also good for your wallet.
In this article, I’ll explain how you can track remediation metrics using your CrowdSec Security Engine and how you can estimate the actual cost savings enabled by the CrowdSec Blocklists.
Remediation Component metrics
With the release of CrowdSec 1.6.3, we added support for Remediation Component metrics. Supported Remediation Components collect metrics on how many blocks they serve and break these metrics down by the source of the block. Currently, both the Nginx and Firewall Remediation Components support this feature, and we’re working on adding it to the rest of our growing collection.
To enable the Remediation Component metrics, simply upgrade your Remediation Component to the newest version.
For instance, if you’re using the iptables Remediation Component on Debian, you can use the following command:
sudo apt install --only-upgrade crowdsec-firewall-bouncer-iptables
After your Remediation Component has been upgraded, you can check the metrics using cscli or on the CrowdSec Console.
cscli metrics show bouncers
Depending on the type of Remediation Component used, you will see different metrics displayed. The iptables Remediation Component used above will display packets and bytes blocked, while the Nginx Remediation Component will display requests blocked.
This happens because the two Remediation Components block at different layers of the OSI 7-layer network model: the firewall drops packets at the network and transport layers (3 and 4), while Nginx rejects fully formed HTTP requests at the application layer (7).
While these metrics are interesting by themselves, they are hard to translate into cost savings that a business would care about.
So, let me walk you through the process we use to turn numbers from the world of packets and requests into numbers from the world of euros and dollars!
What is the actual value of a blocked attack?
For this section, we start with the Nginx Remediation Component because the request metric it tracks is easier to reason about. So, imagine a Remediation Component metric output telling us that we blocked 42 requests in the past hour.
Now, naively, we could say the value of the Remediation Component, in this case, is that it saved us 42 log lines in storage and 42 times our average webpage size in egress traffic. However, with this naive view, we would be committing a statistical fallacy known in the literature as survivorship bias.
The name of this fallacy comes from investigations into airplane armor during the Second World War. Engineers in the US were given the task of improving the armor of the bombers sent over Germany. As armor is heavy, they needed to prioritize where to put extra plates. Plotting the locations of all the damage seen by the repairmen in the US onto a plane, you get this famous picture.
A naive approach would be to improve the armor in the places with a lot of red dots. However, the best place to put more armor is the spots without any red dots. The reason is that the planes the repairmen saw were the ones that got damaged and still managed to come back. If we assume the German air defense hit planes more or less at random, then the white spots on the plane mark the places where a hit prevents the plane from returning.
In our case, a similar bias exists in the metrics collected by the Remediation Component.
Say an attacker wants to try 6 different attacks before moving on to the next server, and moves on instantly once they are blocked. If we block them on attempt number one, the reported Remediation Component metric will be one request, but the actual saving would be 6 requests.
So, to get the actual traffic savings, we had to estimate the number of tries an attacker will typically make before moving on to the next victim. For this, we hosted a honeypot instance running juicy software for attackers to try (WordPress) and combed through the logs.
Over the three months of running this experiment, we detected about 23,000 distinct attacks. After tossing out outlier attackers — like the one bot that hit the same endpoint about 5,000 times — we computed the average number of requests per attack, which comes out to around 10.
This leads us to our first data point.
Estimate #1: The average attacker will try to get in about 10 times before moving on to the next target.
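To make this concrete, here is a minimal sketch of how such an average can be computed; the request counts and the outlier cutoff below are made-up illustrations, not our actual honeypot data.

```python
# Illustrative sketch: made-up request counts per distinct attacker,
# NOT the real honeypot dataset.
requests_per_attack = [8, 12, 9, 11, 7, 13, 10, 5000]  # 5000 = outlier bot

# Toss out extreme outliers, like bots hammering a single endpoint.
OUTLIER_CUTOFF = 100  # hypothetical threshold
trimmed = [n for n in requests_per_attack if n <= OUTLIER_CUTOFF]

average = sum(trimmed) / len(trimmed)
print(f"average requests per attack: {average:.1f}")  # 10.0 for this data
```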
Based on our estimate, we can now say that each attack we block would otherwise have produced about 10 requests to our application. As a webserver like Nginx or Apache logs every request that reaches it, we can quickly deduce our next estimate.
Estimate #2: The average attacker will produce about 10 lines of log data before moving on to the next target.
A natural next question is how much space these log lines take up. This matters because storing logs is both a regulatory requirement in most places and something that hardware vendors and cloud providers charge money for.
Logging a simple request in the common log format used by web servers such as Apache or Nginx takes up about 170 bytes of storage on disk, which would result in 1.7 kB of storage saved per blocked attack. However, most of the time, these logs will be compressed.
This cost-saving measure is very effective for web server logs because the data contains a lot of duplication. The compression ratio of HTTP logs can vary quite a bit, but it is usually high, so we assume 90% compression for these logs. This leads us to our next estimate.
Estimate #3: The average attacker will produce about 0.17 kB of logs to be stored.
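As a sanity check, here is the arithmetic behind this estimate written out as a small sketch, using the figures from above:

```python
BYTES_PER_LOG_LINE = 170   # common log format, approximate
REQUESTS_PER_ATTACK = 10   # Estimate #1
COMPRESSION = 0.9          # assumed 90% compression for web server logs

raw_bytes = BYTES_PER_LOG_LINE * REQUESTS_PER_ATTACK  # 1700 B = 1.7 kB
stored_bytes = raw_bytes * (1 - COMPRESSION)          # 170 B = 0.17 kB
print(f"log storage per blocked attack: {stored_bytes / 1000:.2f} kB")
```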
Aside from the cost of log storage, cloud providers usually also charge for egress traffic from your webserver. This means that any response sent to a malicious client also incurs a cost for the business.
To give an estimate for this, we rely on data collected by the HTTP Archive, which keeps a record of web performance information such as the size of webpages and the technologies they use. We use its page weight report to calculate the size of the responses generated by an average webserver.
For the total weight of a website, the archive calculates an average of about 2.6 MB. This means that when someone opens a browser tab and navigates to a website such as crowdsec.net, they can expect about 2.6 MB of data from the webserver. However, for our use case, this would be a significant overestimation: when a browser opens a website, it also resolves and downloads all the embedded content, such as images, fonts, and CSS.
When an attacker, usually an automated bot, opens a website, it typically just downloads the HTML without resolving the embedded content. This means the stat we care about is the size of the average HTML document, which the HTTP Archive puts at 33.8 kB. Taking into account our factor of 10 requests per attack, this yields our next estimate.
Estimate #4: The average attacker will produce about 340 kB of egress traffic.
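The egress estimate follows the same pattern; this is again just a sketch of the arithmetic above:

```python
AVG_HTML_KB = 33.8         # HTTP Archive: average HTML document size
REQUESTS_PER_ATTACK = 10   # Estimate #1

egress_kb = AVG_HTML_KB * REQUESTS_PER_ATTACK  # 338 kB, i.e. about 340 kB
print(f"egress per blocked attack: {egress_kb:.0f} kB")
```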
Now we have all the metrics needed to turn the metrics exposed by the Remediation Component into metrics our business cares about — or do we?
Addendum: From packets to requests
In the previous sections, we assumed that our remediation happens at the application level, meaning each block represents an actual, fully formed HTTP request. However, in the example metrics at the beginning, our Remediation Component only showed a statistic for the number of packets dropped. This is not the same metric.
This difference comes from the fact that most firewalls don’t block at the application level. Instead, a firewall usually works by preventing TCP connections from being established. This means that when an attacker, Alice, tries to connect to the server Bob, and Bob’s firewall has banned Alice, Alice will not get a response to her request to open a TCP connection.
However, as the internet is fundamentally unreliable and packets can simply be lost in transit, Alice’s TCP client will usually retry automatically. On Linux machines, the number of retries is specified by the tcp_syn_retries kernel parameter, which defaults to six. Therefore, when we move from packets to requests, we assume that each attack is worth about seven packets: the initial connection attempt plus six retries.
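Under this assumption, converting the firewall’s packet counter into blocked attack attempts is a simple division. Here is a sketch; the packet count used is a hypothetical figure chosen to line up with the numbers below.

```python
# 1 initial SYN + 6 retransmits (Linux tcp_syn_retries default) = 7 packets
SYN_PACKETS_PER_ATTEMPT = 1 + 6

def packets_to_attacks(dropped_packets: int) -> int:
    """Estimate blocked attack attempts from a firewall's packet-drop counter."""
    return round(dropped_packets / SYN_PACKETS_PER_ATTEMPT)

print(packets_to_attacks(93_800))  # hypothetical counter -> ~13,400 attempts
```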
Applying our formulas to the metrics above gives us the following statistics.
Within the last seven days, our CrowdSec Security Engine has prevented around 13,400 attack attempts, saving us around 2 MB of compressed server logs and a good 4.5 GB of egress traffic.
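For completeness, here is the whole chain written out as a sketch combining the estimates above:

```python
ATTACKS_BLOCKED = 13_400     # from ~93,800 dropped packets / 7

LOG_KB_PER_ATTACK = 0.17     # Estimate #3
EGRESS_KB_PER_ATTACK = 338   # Estimate #4 (33.8 kB x 10)

logs_mb = ATTACKS_BLOCKED * LOG_KB_PER_ATTACK / 1000        # ~2.3 MB
egress_gb = ATTACKS_BLOCKED * EGRESS_KB_PER_ATTACK / 1e6    # ~4.5 GB
print(f"compressed logs saved: ~{logs_mb:.1f} MB")
print(f"egress traffic saved:  ~{egress_gb:.1f} GB")
```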
Not too bad for our little WordPress honeypot, is it?
Hope you found this breakdown interesting, and don’t forget — we are safer together!