FAQ

What is CrowdSec all about?

We’re a company founded by privacy-aware infosec professionals. We want to make the world a better (=safer) place. Read more about our background here. Our manifesto explains the basics. We believe in and trust in the community. We want it to work together for the benefit of everyone (except the cyber criminals, obviously).

Who can benefit from privileged access to the CrowdSec IP reputation database?

Certain experts can get free, read-only access to our database, under conditions defined by us. If you are interested, contact us and be prepared to show that you belong to one of the following categories:

Professional security researcher
Student in data science or security
Others interested in doing research on our data (contact us to discuss)

Before accessing the data, you will have to sign a waiver and an NDA. People in professional categories that aren’t listed can still contact us to ask for access; authorization will be granted if the purpose of the study is both ethical and legitimate.

Are you planning to open source the consensus engine?

Yes. Our objective is to provide a tool that is as open-source as it can be. Open-sourcing our consensus will take a bit of time, but it is definitely on the flight plan. We will share more news in due time.

What is CrowdSec’s relationship with Fail2ban?

We have a deep admiration for Fail2ban and are in contact with some of its contributors. Cyril started it as a Python exercise for himself, then many others made it the default security component we all know. (We do not name them here, to avoid forgetting anyone and to respect those who do not want exposure, but you all know who you are.)

Nevertheless, Fail2ban was created 16 years ago, in Python. CrowdSec builds on its philosophy, but the company behind it provides more manpower and a long-term sustainable model, allowing high-profile developers and security experts to dedicate themselves 100% to this software. Also, years apart, we make different choices and adopt newer models: a decoupled approach, a faster language (Golang), an inference engine, YAML & Grok, IPv6, an API-first approach, multi-layer awareness, a hub to find your configurations, IP reputation, multi-OS compatibility, etc.

Whatever future awaits CrowdSec, the team extends its greetings to this formidable piece of software and its team, which have written part of security history on Unix hosts.

How can I unban myself?

Anyone can unban themselves. If the machine behind an IP has been cleaned of the security breach that probably led to the nefarious actions, we have no reason to keep that IP in our database forever. It could also happen that the Consensus wrongly evaluated an IP as dangerous. But we cannot ignore the fact that hackers may want to unban themselves too. That is why the first removal is made within 24h, the second request to unban the same IP takes more time, and the third one even more. This prevents hackers from unbanning themselves too easily. The required Captcha also discourages automated attempts to clear many IPs. Any IP that hasn’t been spotted for at least 72 hours is automatically cleared, without the need for admins to unban it themselves. You can access the unban page right here to remove your IP from CrowdSec’s blocklist.
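
To illustrate the growing delay between unban requests, here is a minimal Python sketch. Only the 24h starting point comes from this FAQ; the doubling factor and the function name unban_delay are assumptions made for the example and do not describe the actual server-side schedule.

```python
from datetime import timedelta

# From the FAQ: the first removal is made within 24h.
BASE_DELAY = timedelta(hours=24)
# Assumption for illustration only: each repeated request doubles the delay.
GROWTH_FACTOR = 2


def unban_delay(previous_requests: int) -> timedelta:
    """Return the waiting time for an unban request.

    previous_requests is the number of unban requests already made
    for the same IP (0 for the first request).
    """
    if previous_requests < 0:
        raise ValueError("previous_requests must be non-negative")
    return BASE_DELAY * (GROWTH_FACTOR ** previous_requests)
```

Under these assumed values, a first request waits at most 24 hours, a second one 48 hours, and a third one 96 hours.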

How can I know that CrowdSec won’t block NAT IPs?

We highly recommend that users always choose the “softest” remedy. Here is why. A single IP can give access to a large number of users. Think of a large corporate network allowing 35 000 users to surf the net through 4 proxies, for example. If you ban one IP, you could block some of the 34 999 legitimate users just to stop one hacker, and that would be overkill. Also, some IPs are used by CGNAT (Carrier-Grade NAT) or belong to variable IP pools. Users behind those IPs are not always the same, so blocking the IP is neither a real option nor an efficient remedy.

To avoid these problems, CrowdSec uses several mechanisms. One of them is to keep an IP in our database for only 72h. If the IP hasn’t shown any sign of further aggression within this timeframe, we consider that it has been cleaned or that it was a variable IP, and it is removed from the blocklist.

As a user, you can also choose smarter ways than simply dropping the connection at your firewall. If you protect your home network from scans, it’s not going to harm anyone if you ban the IP in your firewall, but if you run an e-commerce website, for example, you may want to be more careful. Depending on which technology you use, you could send a Captcha, reduce user rights, require a mobile authentication factor, slow down the connection, etc. Bouncers come in all shapes to cover a wide variety of use cases. Choose them wisely, and please use the least aggressive remedy that will keep your assets safe.
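
The 72h retention rule can be sketched as follows. This is an illustrative Python example, not CrowdSec’s server code: the last_seen mapping and the prune_blocklist name are hypothetical, and only the 72-hour window comes from this FAQ.

```python
from datetime import datetime, timedelta

# From the FAQ: an IP not seen again for 72h is removed from the blocklist.
RETENTION = timedelta(hours=72)


def prune_blocklist(last_seen: dict[str, datetime], now: datetime) -> dict[str, datetime]:
    """Keep only the IPs whose last aggressive signal is under 72h old.

    last_seen maps an IP address to the time of its most recent signal
    (a hypothetical data layout chosen for this example).
    """
    return {ip: seen for ip, seen in last_seen.items() if now - seen < RETENTION}
```

An IP that keeps misbehaving refreshes its last-seen time and stays listed, while a cleaned machine or a recycled dynamic address drops out on its own.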

How can I make sure to avoid poisoning and false positives when using CrowdSec?

Every network member (a watcher sharing their signals) gets a trust rank (TR). By consistently sending back valuable and exact information, a member improves their TR over time. A daemon that reports valuable information for months, with 100% accuracy, will eventually reach the maximum TR. Feeding the system with wrong information results in a severe and immediate loss of TR. This mechanism is made to avoid poisoning.

Members of any TR can partake in the consensus, but only those with the highest TR can publish to the database without needing validation from our own honeypot network. Even then, a report has to pass the test of the canary list, meaning the reported IP must not be one of the canaries. Canaries are whitelisted IPs known to be trustworthy, like the Google bot, Microsoft updates, etc. If a scenario is too sensitive or twitchy, it might shoot a canary. This mechanism is made to avoid false positives.

An ML algorithm will (soon) be trained on our honeypot network logs to further rule out false positives and to highlight low-noise attacks, such as IPs working in a coordinated fashion where some of them aren’t directly violating rules (for example, one doing a basic port check before another one compromises a machine). All those mechanisms (and more to come) contribute to what we call the Consensus chamber (Consensus for short), where the decision is taken whether or not to ban the IP responsible for an alert.
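
As a rough illustration of how a trust rank and a canary list can work together, here is a small Python sketch. Every number and name in it (MAX_TR, REWARD, PENALTY_FACTOR, the Watcher class, the placeholder canary address) is an assumption made for the example; the real Consensus logic is not described in this FAQ beyond the principles above.

```python
from dataclasses import dataclass

# All values below are hypothetical, chosen only to make the sketch concrete.
MAX_TR = 100          # highest trust rank
REWARD = 1            # slow gain for each accurate report
PENALTY_FACTOR = 0.5  # severe, immediate loss for a wrong report

# Placeholder canary list; a documentation-range address stands in
# for trusted IPs such as well-known crawlers or update servers.
CANARIES = frozenset({"192.0.2.10"})


@dataclass
class Watcher:
    trust_rank: int = 0

    def record_report(self, accurate: bool) -> None:
        """Raise the rank slowly on accurate reports, cut it sharply otherwise."""
        if accurate:
            self.trust_rank = min(MAX_TR, self.trust_rank + REWARD)
        else:
            self.trust_rank = int(self.trust_rank * PENALTY_FACTOR)


def can_publish_directly(watcher: Watcher, reported_ip: str) -> bool:
    """True if a report may skip honeypot validation.

    A report against a canary never passes, whatever the rank;
    otherwise only the maximum rank publishes directly.
    """
    if reported_ip in CANARIES:
        return False
    return watcher.trust_rank == MAX_TR
```

The asymmetry between the small reward and the large penalty is what makes poisoning expensive in this sketch: months of accurate reporting are lost with a single bad report.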

Is CrowdSec GDPR compliant?

We constantly enforce three principles:

Collect the minimum possible data, only one item in this case: the aggressive IP. (Time & scenario are not private data per se)
Keep the data only for the necessary period of time
Let anyone remove their IP from our database (but it can be reintroduced automatically if the machine was not cleaned, and the cooldown between requests gets exponentially longer)

Lawyers are also contracted to review our policies and validate that our processes are GDPR compliant. We will, very soon, release more details and legal work around those points, as well as update our website’s footer & cookies to be fully compliant with international regulations. More information about our compliance with GDPR can be found in this dedicated section.

Can I have more information about data ownership?

The “Watcher tier” consists of people using the software, sending us their signals and, in return, benefiting, for free, from the global, curated IP reputation database. We send them, at very regular intervals, a list of IP addresses considered dangerous for them, which they can safely ban or regulate in any way they see fit. The IP blocklists and global database belong to us, but a full, unlimited right of use is granted to users, as long as they share the IPs they block through their CrowdSec instance(s).

The data sent by each instance of CrowdSec consists only of a timestamp, the scenario triggered, and the aggressive IP. This data, collected worldwide on our servers, is then curated to avoid false positives and poisoning, and redistributed to each user sharing signals, based on their (self-declared) technological footprint. Hence, if you are running Magento, for instance, you only get the IPs that are nefarious to Magento. This avoids overloading the machine with a very large ban list and also lowers the chances of a false positive.

This curated data is CrowdSec property, and a usage right is given to users receiving an IP list. The list can even be used outside of the context of CrowdSec: if you use CrowdSec and share the IPs you block with us, nothing prevents you from using the ban list you receive in your SIEM or other security tools.
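
The shape of a signal and the idea of a footprint-tailored list can be sketched in Python as below. The three fields of Signal come from this FAQ; everything else (the class and function names, the curated mapping from IP to targeted technologies, the scenario label) is hypothetical and only meant to make the description concrete.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Signal:
    """The only data an instance shares, per the FAQ: time, scenario, IP."""
    timestamp: datetime
    scenario: str  # e.g. "example/http-bruteforce" (hypothetical label)
    ip: str


def tailor_blocklist(curated: dict[str, set[str]], footprint: set[str]) -> set[str]:
    """Return only the IPs relevant to a user's self-declared footprint.

    curated maps each dangerous IP to the technologies it was seen
    attacking (a hypothetical layout chosen for this example).
    """
    return {ip for ip, techs in curated.items() if techs & footprint}
```

For example, a user declaring only {"magento"} would receive the IPs tagged with Magento and none of those seen exclusively against other stacks, which keeps the local ban list short.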

How is the data treated and where is it stored?

Data is processed by our online servers. CrowdSec’s team is made up of former pentesters, DevOps, SecOps, and SecDevOps engineers, some with a decade of experience in secure hosting. This doesn’t mean we never make mistakes, but at the very least, we have decent field knowledge and standards. Our servers are secured and maintained. Obviously, being breached would deeply damage our reputation and the trust relationship with our community, so this point is not taken lightly.

Even though we take all the measures we think appropriate to protect our servers, nothing vital transits through the collection servers should they be compromised. The data they gather is not sensitive, confidential, or private material. We would potentially miss some signals, but that’s pretty much it. The consensus servers, the ones deciding whether an IP is dangerous or not, are not publicly exposed and are also heavily secured to avoid any breach. The storage of this data isn’t exposed either and is only accessed through an internal API. If the data were to be wiped out, by accident or intentionally, the network would quickly regenerate a consensus within a few hours. The servers distributing the consensus (i.e. the IP blocklist) don’t contain any sensitive information either.

Is CrowdSec IPv6 compatible?

The software supports IPv6, as do its API & bouncers. The IP reputation system also applies to the IPv6 address space.

How is the collected information being used (server-side treatment and online dependency)?

Server-side treatments involve the following:

Collecting information (IP / Timestamp / Scenario) sent by the network members accepting to share them
Distributing a curated IP blocklist, tailor-made for each member according to their choices in the back office (coming soon)

The reputation system (feeding your local daemon with IPs to block) can be deactivated and/or replaced by another source of reputation in the configuration, so the software can function in a fully standalone manner if you want absolutely no dependency on any online service. With the local API (LAPI, as of v1.0), Security Engines can be deployed & configured 100% offline if you want to.

What is the CrowdSec taxonomy?

Here are the keywords we use about CrowdSec:

Watcher: A user who shares the IPs they block using the behavior engine
Security Engine: The piece of software you can download from GitHub or directly in packages and execute on your Internet-exposed machines
Alert: A clear behavior extracted from a log source by a scenario, upon which a bouncer can be activated
Parser: Normalizes and enriches (geography, 3rd party whitelist, etc.) logs, signals & data
Data source: Allows log, signal, or data acquisition (logfile, rsyslogd, cloud trail, MQTT, etc.)
Scenario: A YAML file describing a behavior, used to identify aggressions
Bouncer: A component enforcing the decision, which is set up in the Local or Online interface. The decision can be drop, captcha, mfa, privilege drop, rate or speed limiting, etc. It can work at any level: IP, session, or business logic. You could, for example, use a TCP wrapper in an IoT IDE to handle incoming connections, or send a Captcha from your Magento if an IP is marked as dangerous.
Collection: A group of scenarios, parsers and data sources focused on a precise vertical (eCommerce), technical context (Magento) or generic template (LAMP)
Cscli: The command line tool to interact with the daemon, dashboard and database
Consensus: The group of algorithms and data sources contributing to the evaluation of an Alert. This is a server-side treatment (explained further in this FAQ), made to avoid false positives & poisoning.
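
To make the Bouncer entry more concrete, here is a minimal Python sketch of a bouncer that enforces a decision while never exceeding the harshest remedy a site is willing to apply. It is an illustration only: the Remediation levels, their ordering, and the bounce function are assumptions made for this example, not the interface of any actual CrowdSec bouncer.

```python
from enum import IntEnum


class Remediation(IntEnum):
    """Hypothetical remedies, ordered from softest to most aggressive."""
    ALLOW = 0
    CAPTCHA = 1
    DROP = 2


def bounce(ip: str, decisions: dict[str, Remediation], ceiling: Remediation) -> Remediation:
    """Pick the remedy to apply to an incoming IP.

    decisions maps flagged IPs to the decided remedy; unknown IPs are
    allowed. ceiling is the most aggressive remedy this site accepts,
    so an e-commerce site can cap everything at CAPTCHA.
    """
    decided = decisions.get(ip, Remediation.ALLOW)
    return min(decided, ceiling)
```

A home firewall might set the ceiling to DROP, while a shop that cannot afford to lose legitimate customers behind a shared IP would set it to CAPTCHA.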

Can I have more information about the CrowdSec software license?

CrowdSec is licensed under the MIT open source license. You can find a copy of the text here:

“Copyright 2020, CrowdSec SAS (http://crowdsec.net)

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.”

That’s all there is to it. As with Debian, you can do anything you want with it, for free, period. You just need to include this license when you redistribute the product.

Note that this licensing policy applies only to the CrowdSec software as distributed through GitHub or CrowdSec’s website, or inherited from an initial installation from those sources. Notably excluded are the data received through the software, which are subject to the CrowdSec EULA https://booking.crowdsec.net/crowdsec-eula, which notably prohibits distributing or marketing the data (IP addresses, scenarios,…) in any way whatsoever, whether free of charge or for a fee.