Apr 11, 2024

Network Effect x AI: Transforming CTI into Tactical Threat Intelligence

In the vast landscape of cybersecurity, the significance of IP addresses cannot be overstated. Unlike domain names and hashes, IP addresses play a pivotal role in facilitating remote attacks, serving as the launching pad for cybercriminal operations worldwide. However, they also present a unique opportunity for defenders.

In this article, we delve into the realm of IP-based cyber defense, exploring how innovative approaches are reshaping the battleground against malicious actors and transforming CTI into crowd-powered Tactical Threat Intelligence (TTI).

The opportunity

Unlike domain names and hashes, IP addresses aren’t available in infinite numbers. They were, are, and will be used to carry out most remote attacks for as long as we can foresee.

AI-based attacks, most phishing attacks (apart from SMS-based ones), exploitations, brute force, and scouting techniques rely on IP addresses. They are the rockets carrying the payload from the cybercriminal homebase toward their targets. Interestingly enough, they also have a higher associated cost for rental and maintenance than domain names, making them a durable indicator of compromise (IoC).

If an IP supporting intrusive operations is burnt, relocating those tools or reorganizing those operations imposes a non-zero cost on cybercriminals. The more IPs are known, tagged, and burnt, the higher the cost becomes for the attackers.

The problem

As IoCs, IP addresses also come with drawbacks. Namely, it’s hard to know who owns and uses them, whether they are under the custody of an end user, a corporation, an MSP, or an MSSP, and what level of diligence is applied to them. It’s equally hard to know whether they belong to a 4G, DSL, or fiber connection, a hosting farm, a VPN, a residential proxy system, etc.

The solution

The CrowdSec Network of Security Engines now spans 3000+ autonomous systems in 180+ countries, tens of thousands of users, and hundreds of thousands of workloads. It pumps north of 10M reports daily, offering us a unique Network Effect to leverage. To date, the exclusivity — the % of IP addresses known to CrowdSec and no other CTI vendor — is above 30% in all contexts and can reach >60% even for very sophisticated actors.

CrowdSec publishes a Security Engine, under MIT license, that identifies dangerous behaviors in any type of log and applies remediations against the IP addresses carrying those attacks. Fundamentally, the Security Engine operates as an expert system, confronting facts (logs) with rules (behavioral scenarios).
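
To make the "facts versus rules" idea concrete, here is a minimal sketch of a behavioral rule evaluated against parsed log events. It is an illustration only, not the Security Engine's actual implementation: the `Event` and `Scenario` types, the `ssh_failed_auth` label, and the sliding-window threshold logic are all assumptions made for this example.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """A fact: one parsed log line (hypothetical shape)."""
    timestamp: float  # seconds
    source_ip: str
    kind: str         # e.g. "ssh_failed_auth" (hypothetical label)


@dataclass(frozen=True)
class Scenario:
    """A rule: trigger when `threshold` matching events fall within `window` seconds."""
    name: str
    event_kind: str
    threshold: int
    window: float


def detect(events, scenario):
    """Return the IPs whose matching events reach the threshold inside the window."""
    recent = defaultdict(deque)  # source_ip -> timestamps still inside the window
    offenders = []
    for ev in sorted(events, key=lambda e: e.timestamp):
        if ev.kind != scenario.event_kind:
            continue
        stamps = recent[ev.source_ip]
        stamps.append(ev.timestamp)
        # Drop timestamps that have slid out of the window.
        while stamps and ev.timestamp - stamps[0] > scenario.window:
            stamps.popleft()
        if len(stamps) >= scenario.threshold and ev.source_ip not in offenders:
            offenders.append(ev.source_ip)
    return offenders
```

An IP that produces five failed logins in a few seconds would trip a scenario with `threshold=5, window=10.0`, while an IP that fails twice, minutes apart, would not.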

Acting as a global interferometer, all Security Engines share the IPs, their behaviors, and the timestamp of their last attacks with CrowdSec. With that data in hand, the CrowdSec consensus engine generates a real-time map of IPs used by cybercriminals and shares them with users to generate proactive, 0 false positive blocklists.
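
The consensus step can be pictured with a small sketch. The article does not describe the actual consensus criteria, so the rule used here (an IP must be reported recently by a minimum number of distinct engines) and every name and default value in the code are assumptions chosen for illustration.

```python
from collections import defaultdict


def build_blocklist(reports, now, min_reporters=3, max_age=86400.0):
    """Keep IPs reported by at least `min_reporters` distinct engines within `max_age` seconds.

    `reports` is an iterable of (reporter_id, ip, behavior, last_seen) tuples.
    Both thresholds are hypothetical defaults, not CrowdSec's real values.
    """
    reporters_by_ip = defaultdict(set)
    for reporter_id, ip, _behavior, last_seen in reports:
        if now - last_seen <= max_age:  # ignore stale sightings
            reporters_by_ip[ip].add(reporter_id)
    return sorted(
        ip for ip, reporters in reporters_by_ip.items()
        if len(reporters) >= min_reporters
    )
```

Counting distinct reporters rather than raw reports means a single noisy or misconfigured engine cannot push an IP onto the list by itself, which is one simple way to keep false positives low.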

Network Effect x AI, transforming CTI into TTI

Currently, the software relies on static rules, defined to detect abuse in the most common services exposed on your machine (SSH, web firewall, etc.). Once an attack is detected and remediated, an alert containing the malevolent IP and the nature of the attack is shared with the CrowdSec servers. To reward this contribution, users receive the Community IP Blocklist, which contains the IPs most reported across our Network. This blocklist is constantly evolving and is updated in near real time. Any data curated enough to be injected directly into an edge filtering component qualifies as Tactical Threat Intelligence.

At each of these steps, we are heavily developing AI to improve the overall system performance. Ultimately, CrowdSec will transition from an Expert System to an ML/DL-driven decision-making system.

First, we introduced Bayesian statistical modeling in the CrowdSec Security Engine to let users configure scenarios based on rules and on the probability of events happening. For users, this opened the door to crafting a large number of customizable scenarios with various levels of sensitivity and trust, and to applying graded responses.
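
As a rough sketch of how probabilities can drive graded responses, the snippet below updates the probability that an IP is malicious from a list of observed events, then maps the result to an action. It assumes the events are conditionally independent (a naive Bayes simplification), and the function names, likelihood values, and thresholds are hypothetical rather than taken from the Security Engine.

```python
import math


def posterior_malicious(prior, observations):
    """Posterior probability that an IP is malicious.

    `observations` is a list of (p_event_given_malicious, p_event_given_benign)
    pairs, one per observed event. Events are assumed independent given the class.
    """
    log_odds = math.log(prior / (1.0 - prior))
    for p_malicious, p_benign in observations:
        log_odds += math.log(p_malicious / p_benign)  # add the log likelihood ratio
    return 1.0 / (1.0 + math.exp(-log_odds))


def graded_response(probability, captcha_at=0.5, ban_at=0.9):
    """Map a probability to a remediation; both thresholds are illustrative."""
    if probability >= ban_at:
        return "ban"
    if probability >= captcha_at:
        return "captcha"
    return "allow"
```

With a prior of 0.01 and events that are twelve times more likely from a malicious IP than from a benign one, one event keeps the IP allowed, two events push it past the captcha threshold, and three push it past the ban threshold.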

Currently, we are working on assisting the detection step with AI to cover a wider range of threats and adapt to ever-changing attack patterns. This allows CrowdSec to protect users from 0-day vulnerabilities without waiting for a detection rule to be crafted. To achieve this, we are training an anomaly detection model that parses and analyzes each part of a web request using natural language processing. The model is trained on a constantly evolving dataset of attacks in a way that minimizes the number of false positives. Eventually, it will be embedded into the CrowdSec engine so that detection remains fast, efficient, and privacy-preserving for the user. In the future, we envision users fine-tuning the models based on the traffic they report.

AI is also a first-class contributor to CrowdSec’s TTI. We use machine learning to sort, analyze, and classify the 12 million attack signals received every day, to bring more context to the attacks detected by the crowd. We developed a system able to classify an IP address as a VPN or a proxy and redistribute this information through a specific blocklist our users can subscribe to. We also aim to detect wider attack patterns and botnets by analyzing threat actors’ interactions in our data lake.

We use Graph Network Analysis to isolate criminal IPs targeting a specific vertical or organization from common actors targeting a wider part of the Internet. This information is crucial for our users, as it helps them save time by ignoring alerts emanating from the Background Noise of the Internet while being more proactive against specific threats. We are also experimenting with embeddings and high-dimensional search engines to let users build custom blocklists and find reported IP addresses that share common patterns.
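
The distinction between targeted actors and background noise can be illustrated with a deliberately simple stand-in for graph analysis: a bipartite graph of IPs and the verticals they attack, reduced to edge counts per IP. This is not the method used in production; the function name, the `min_reports` and `focus_ratio` parameters, and the labels are assumptions for the example.

```python
from collections import Counter, defaultdict


def classify_ips(edges, min_reports=3, focus_ratio=0.8):
    """Label each IP as 'targeted' (focused on one vertical) or 'background'.

    `edges` is an iterable of (ip, target_vertical) pairs, one per report.
    An IP is 'targeted' when it has at least `min_reports` reports and at least
    `focus_ratio` of them hit the same vertical. Both values are illustrative.
    """
    verticals_by_ip = defaultdict(Counter)
    for ip, vertical in edges:
        verticals_by_ip[ip][vertical] += 1

    labels = {}
    for ip, counts in verticals_by_ip.items():
        total = sum(counts.values())
        top_vertical, top_count = counts.most_common(1)[0]
        if total >= min_reports and top_count / total >= focus_ratio:
            labels[ip] = ("targeted", top_vertical)
        else:
            labels[ip] = ("background", None)
    return labels
```

An IP seen only against healthcare targets comes out as targeted, while one spraying every vertical evenly is treated as background noise.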

Combining graph embeddings into larger AI models will allow us to detect more complex attack patterns on a larger timescale. It would help to better lock on to isolated IP addresses with a low signal-to-noise ratio, like those used sequentially to achieve a global task. For example, IP1 can be used to scan/scout, IP2 (from another range) to brute-force credentials, and IP3 (from yet another range) to exploit vulnerabilities. Even though those three addresses might be under the custody of the same cybercriminal group, those events may appear unrelated to an individual machine. However, at the scale of the CrowdSec Network, those patterns will allow us to attribute those three IPs to the same adversarial entity, allowing users to preemptively block IP2 and IP3 as soon as IP1 knocks at their digital door.
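
The IP1/IP2/IP3 example can be sketched with plain co-occurrence counting instead of graph embeddings, which is a much cruder technique but shows why network scale matters. The sketch links two IPs when they appear against the same target on enough distinct targets, then merges linked IPs into groups with a union-find structure. It ignores timing and attack stage, and all names and thresholds are assumptions.

```python
from collections import defaultdict
from itertools import combinations


def group_ips(events, min_shared_targets=2):
    """Group IPs that repeatedly hit the same targets.

    `events` is an iterable of (target_id, ip) pairs. Two IPs are linked when
    they co-occur on at least `min_shared_targets` distinct targets (an
    illustrative threshold). Returns a list of sets of linked IPs.
    """
    ips_by_target = defaultdict(set)
    for target, ip in events:
        ips_by_target[target].add(ip)

    shared = defaultdict(int)  # (ip_a, ip_b) -> number of targets hit by both
    for ips in ips_by_target.values():
        for a, b in combinations(sorted(ips), 2):
            shared[(a, b)] += 1

    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (a, b), count in shared.items():
        if count >= min_shared_targets:
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for ip in parent:
        groups[find(ip)].add(ip)
    return [group for group in groups.values() if len(group) > 1]
```

A single machine sees three unrelated IPs. Across many machines, the same trio showing up together becomes a signal, so once IP1 is observed at a new target, the rest of its group can be blocked ahead of time.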

The end game

With these models, we intend to detect what we call Global Offensive AIs (GO.AI): combinations of multiple large language models trained to achieve every task from target scouting to breaching and evasion. Research has shown that such AIs can hack websites without human feedback. A GO.AI is an aggregation of several sub-agents trained on specific datasets, such as whitepapers, CVE databases, CTF logs, server logs, and WAF rules, acting together toward a final goal. Those AIs can scan the exposed surface and launch a global coordinated attack on the internet-facing part of a corporation in seconds.

It would be really difficult to identify a single actor working for a GO.AI in any one phase. However, their techniques, behaviors, and timings, once correlated across the CrowdSec Network, should be detectable by defensive AI models trained to recognize cohorts of similar behaviors and attack patterns in the large number of samples provided by our network.

To sum up

In the ever-evolving landscape of cybersecurity threats, we cannot emphasize enough the importance of proactive defense mechanisms. Initiatives like the CrowdSec Network Effect are transforming the way we combat cyber threats. By harnessing the power of AI, we’re not only detecting and neutralizing attacks but also paving the way for a more resilient future.

We hope you found this exploration of IP-based defense strategies insightful. Stay tuned and join us on our journey to explore different methods of defending against the evolving threat of Global Offensive AIs (GO.AI). Together, we can bolster our defenses and safeguard the digital realm against adversarial incursions.