AI For Bot Detection
October 17, 2023

How to Block Bad Bots

One of our customers recently told us that they never block bad bots, because every time they do, the bots just come back, often with a larger, more sustained and even more malicious attack. The attitude was simply, “Whatever we do, they just come back anyway, so what’s the point? At the next high tide, our defensive sand castle will simply be washed away.”

If your only tool is a hammer, your clients become the nail

If you rely only on an existing WAF, IP reputation, old-school fingerprinting or CAPTCHA-based detection, then sadly this is probably going to be your reality. Blocking can really backfire. Blocking an IP, a range of IPs, or an entire country or two seems to work for a while, until the bots try a new IP range or start using botnets or residential proxies to avoid detection. If you tighten account authentication, for example by adding MFA for all users, you will invariably suffer customer churn and much higher support and onboarding costs. Forcing all users to solve a CAPTCHA will also inevitably lead to greater customer churn and lower sales.

Once blocked, the attacker may well return with a more powerful attack than before. The bot attacker has a job to do, and is effectively earning money from exploiting your services. You’ve just made life harder for them, so they are probably going to re-tool and try again.

Blocking malicious bots is still absolutely essential. The key is to understand which bots to block and to break down your entire threat by risk cohort, so you can treat each visitor cohort with a set of tailored responses appropriate to its actual threat. The goal is not to block bots indiscriminately; it's to know which bots you have to block, while improving the experience of all your legitimate visitors rather than punishing them.
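As a rough illustration of cohort-based treatment, the sketch below maps each visitor cohort to a single response. The cohort names follow the categories described in this post; the response labels and the mapping itself are assumptions made for the example, not VerifiedVisitors' actual policy.

```python
from enum import Enum


class Cohort(Enum):
    VERIFIED_VISITOR = "verified visitor"
    GOOD_BOT = "good bot"
    FAKE_BOT = "fake bot"
    LIKELY_AUTOMATED = "likely automated"
    AUTOMATED = "automated"
    ACCOUNT_TAKEOVER = "account take-over"


# Hypothetical policy table: one tailored response per cohort.
# The labels on the right are illustrative, not a product setting.
RESPONSES = {
    Cohort.VERIFIED_VISITOR: "allow",
    Cohort.GOOD_BOT: "allow",
    Cohort.FAKE_BOT: "block",
    Cohort.LIKELY_AUTOMATED: "challenge",
    Cohort.AUTOMATED: "block",
    Cohort.ACCOUNT_TAKEOVER: "block_at_edge",
}


def respond(cohort: Cohort) -> str:
    """Return the response configured for a visitor cohort."""
    return RESPONSES[cohort]
```

The point of the table is that legitimate visitors never see a challenge, while only the cohorts that warrant it are blocked.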

Detecting and Managing Bad Bots

At VerifiedVisitors we have a totally different approach. The clue is in the name. We first support your actual verified visitors, who are likely to be your most loyal and valuable customers. Customers usually come back to a service, and have readily identifiable patterns of behaviour. Our AI platform learns from the actual traffic on your website, and starts to build cohorts of visitors according to a whole range of threat detectors, including their actual behaviour. As we learn more about the actual verified visitors, we can gradually place more trust in them. They make repeat visits, behave normally, pass our sophisticated fingerprinting and platform tests, and have valid mouse trails. This means they start to gain our trust, even though they are still actively monitored.

In the VerifiedVisitors portal's risk surface area map, you can clearly see the verified visitor traffic portion in green. Identifying the core human visitor traffic means that we can allow our verified visitors to carry on enjoying the benefits of the service, without punishing them or treating them as if they were bots or potentially malicious.

Risk Surface Area Cohort Monitoring - allows you to see at a glance how to protect your services from bot attacks

Recognizing Good Bot Traffic

Next we also want to set aside all the good bot traffic - the good bots, such as Googlebot, that everyone wants indexing their website. Why? Bad bots masquerade as these good bots, and once whitelisted by the WAF they can cause absolute havoc behind the firewall. Yes, you can manage this manually: maintain robots.txt, look up each good bot’s digital provenance and IP range, run reverse DNS lookups, monitor all of this over time, and regularly check your logs for new crawler user agents. If you are checking web logs and doing this manually today, it’s a massive waste of time. VerifiedVisitors has over 40 categories of bots and a recommendation engine that you simply apply. This builds your good bot access list instantly. We even have a tool that syncs with your existing robots.txt file to ensure your stated bot policy is actually enforced, not just sign-posted in robots.txt. VerifiedVisitors uses our AI engine to verify each good bot, checking that its behaviour is consistent with, for example, the real Googlebot, and that it has the correct digital signature and digital provenance. Now both the good bots and the verified visitors are managed, and you can concentrate on the bad bots.
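To show what enforcing a robots.txt policy, rather than just publishing it, could look like, here is a minimal sketch using Python's standard urllib.robotparser module. The sample robots.txt, the function names and the allow/block labels are assumptions for illustration. This is not the VerifiedVisitors sync tool, and it presumes the crawler's identity has already been verified by other means.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: Googlebot may crawl everything except /private/,
# every other crawler is disallowed entirely.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /private/

User-agent: *
Disallow: /
"""


def build_policy(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a policy object that can be queried."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def enforce(parser: RobotFileParser, verified_bot_name: str, path: str) -> str:
    """Turn the stated policy into an allow/block decision for one request.

    verified_bot_name must come from a verification step, never from the
    raw User-Agent header alone, which any bot can forge.
    """
    return "allow" if parser.can_fetch(verified_bot_name, path) else "block"
```

With this policy, a verified Googlebot requesting /products/1 would be allowed, while the same crawler requesting /private/report, or any other crawler requesting any path, would be blocked instead of merely being asked to stay away.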

Fake Bots

Fake bots are simply fake good bots - impersonators masquerading as well-known legitimate services in the hope of being whitelisted. These fake bots are extremely sneaky, and it’s all too easy to add them to your whitelist if you are not constantly vigilant, or don’t have automated tools such as VerifiedVisitors actively monitoring and managing your good bot access list.
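One manual way to catch an impersonator is a forward-confirmed reverse DNS check: resolve the visitor's IP to a hostname, confirm the hostname belongs to the crawler's operator, then resolve that hostname back and confirm it returns the same IP. The sketch below applies this to a visitor claiming to be Googlebot. The function names are hypothetical, and the hostname suffixes are an assumption based on Google's published guidance, so check the current documentation before relying on them. The lookups are passed in as parameters so the logic can be exercised without network access.

```python
import socket
from typing import Callable, List

# Assumed hostname suffixes for genuine Googlebot reverse DNS records.
GOOGLEBOT_SUFFIXES = (".googlebot.com", ".google.com")


def reverse_lookup(ip: str) -> str:
    """Return the PTR hostname for an IP address."""
    return socket.gethostbyaddr(ip)[0]


def forward_lookup(hostname: str) -> List[str]:
    """Return the IPv4 addresses a hostname resolves to (IPv4 only)."""
    return socket.gethostbyname_ex(hostname)[2]


def is_verified_googlebot(
    ip: str,
    reverse: Callable[[str], str] = reverse_lookup,
    forward: Callable[[str], List[str]] = forward_lookup,
) -> bool:
    """Forward-confirmed reverse DNS check for a claimed Googlebot IP."""
    try:
        hostname = reverse(ip).rstrip(".").lower()
        if not hostname.endswith(GOOGLEBOT_SUFFIXES):
            return False  # PTR record is not under an expected domain
        return ip in forward(hostname)  # hostname must map back to the IP
    except OSError:
        # socket.herror and socket.gaierror are OSError subclasses:
        # a failed lookup is treated as "not verified".
        return False
```

A visitor whose user agent says Googlebot but which fails this check is a candidate for the fake bot cohort. Doing this by hand for every crawler, and keeping it current, is exactly the effort that automated tooling is meant to remove.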

Likely Automated Category

Likely automated is the category we use for suspicious activity from visitors that looks likely to be bot-driven but is not absolutely definitive. For example, they may have failed some platform, canvas or fingerprint tests but passed others, and originate in data centers. This cohort of traffic often needs further analysis, and VerifiedVisitors has a range of techniques we can use to tease it apart into human or automated. For example, we make extensive use of:

  • Behavioural Testing: Our AI platform learns from the actual traffic patterns on your API or websites. If the likely automated traffic displays bot-like behaviour, it’s much more likely that it is, in fact, a bot.
  • Additional AI-based Testing: If we strongly suspect that the traffic is a bot, then slowing it down and performing additional analysis is often a good technique. After all, no one cares about bot latency at this point. This allows us to collect additional data and telemetry, such as more mouse trail data or more detailed canvas tests, so we can establish whether it is definitely a bot or not.
  • Challenging the Bot: We may serve the bot a CAPTCHA to monitor whether it passes or fails. Please note that if the bot passes, it doesn’t mean it's classified as human! It simply means it has been further tested to establish its real origin. Bots often pass CAPTCHAs but leave their unique signatures - for example, using CAPTCHA farms often results in a significant delay, as the CAPTCHA is passed off to a service for human completion, often overseas.
  • Honey Traps: Feeding bots fake content which a human won’t find, but which a bot will pick up when crawling the page.
  • Blocking: This could be an option, particularly if the likely automated traffic is displaying behavioural signs of an attack; for example, the bot may be targeting a payment gateway, API or other more sensitive area of the service.
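To make the idea of teasing this cohort apart more concrete, here is a small sketch that combines three of the signals above: a behaviour score, a honey trap hit and CAPTCHA solve time. Everything in it is an assumption for illustration, including the field names, the thresholds and the output labels. In particular, the behaviour score stands in for the output of a model that is not shown.

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative thresholds only; real values would come from observed traffic.
BOT_BEHAVIOUR_THRESHOLD = 0.8
HUMAN_BEHAVIOUR_THRESHOLD = 0.2
CAPTCHA_FARM_DELAY_SECONDS = 30.0


@dataclass
class Signals:
    behaviour_score: float  # 0.0 = human-like, 1.0 = bot-like (hypothetical)
    hit_honey_trap: bool  # requested content hidden from human visitors
    captcha_solve_seconds: Optional[float] = None  # None if not solved/served


def triage(signals: Signals) -> str:
    """Resolve a likely automated visitor to 'automated', 'human' or 'undecided'."""
    # A human never sees the trap content, so a hit is strong evidence.
    if signals.hit_honey_trap:
        return "automated"
    # A solved but very slow CAPTCHA is consistent with a CAPTCHA farm.
    solve_time = signals.captcha_solve_seconds
    if solve_time is not None and solve_time >= CAPTCHA_FARM_DELAY_SECONDS:
        return "automated"
    if signals.behaviour_score >= BOT_BEHAVIOUR_THRESHOLD:
        return "automated"
    # Passing a CAPTCHA alone proves nothing; it must agree with behaviour.
    if solve_time is not None and signals.behaviour_score <= HUMAN_BEHAVIOUR_THRESHOLD:
        return "human"
    return "undecided"  # keep collecting telemetry
```

Note that a passed CAPTCHA never classifies a visitor as human on its own, which mirrors the caution above. Visitors that remain undecided are the ones worth slowing down for further testing.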

Automated Bots

Automated bots have failed core tests and have proven themselves to be bots. Although they may not be actively targeting customer login paths, they represent a real threat. Again, blocking is a good option here, as we know this traffic is definitely bot traffic. VerifiedVisitors has dynamic rules, so once you put in a rule blocking automated bot traffic from a particular path, domain or API service, it's blocked - no matter how the bot tries to rotate user agents or IPs, or disguise its origins.
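The reason such a rule survives IP and user agent rotation is that it keys on the detected cohort and the protected path, not on anything the bot controls. How VerifiedVisitors implements its dynamic rules is not described here, so the following is only a sketch of the idea, with hypothetical class and function names and shell-style path patterns chosen for brevity.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import List


@dataclass(frozen=True)
class Rule:
    cohort: str  # cohort the rule applies to, e.g. "automated"
    path_pattern: str  # shell-style pattern, e.g. "/api/*"
    action: str  # e.g. "block"


@dataclass(frozen=True)
class Request:
    cohort: str  # assigned by detection, independent of IP and user agent
    path: str
    ip: str
    user_agent: str


def decide(request: Request, rules: List[Rule], default: str = "allow") -> str:
    """Return the action of the first rule matching cohort and path."""
    for rule in rules:
        if rule.cohort == request.cohort and fnmatchcase(request.path, rule.path_pattern):
            return rule.action
    return default
```

Because decide never reads request.ip or request.user_agent, a bot that rotates either one but is still detected as automated gets the same decision on the protected path.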

Account Take-Over Bots

In terms of our threat levels, account take-over bots are amongst the highest threats to your business. They are automated and actively target account login paths. This attack type can’t be tolerated. You need to actively block this traffic at the network edge, before it has a chance to do any damage. Bots logging in to actual accounts represent a data breach. Even if it was a lucky brute-force fluke, it doesn’t matter; it’s still a data breach.
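One simple signal for this kind of attack is a burst of failed logins from the same client. The sketch below counts failures in a sliding time window and reports when a client exceeds a limit. It is a deliberately basic illustration, not a description of VerifiedVisitors' detection: the class name, the limits and the idea of keying on a client identifier such as a device fingerprint are all assumptions.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class LoginMonitor:
    """Track failed logins per client within a sliding time window."""

    def __init__(self, max_failures: int = 5, window_seconds: float = 60.0):
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        self._failures: Dict[str, Deque[float]] = defaultdict(deque)

    def record_failure(self, client_id: str, timestamp: float) -> bool:
        """Record a failed login; return True if the client should be blocked.

        client_id is a hypothetical stable identifier (for example a device
        fingerprint) rather than an IP address, which bots rotate freely.
        """
        window = self._failures[client_id]
        window.append(timestamp)
        # Drop failures that have fallen out of the time window.
        while window and timestamp - window[0] > self.window_seconds:
            window.popleft()
        return len(window) > self.max_failures
```

For example, with max_failures=3, the fourth failure from the same client within a minute returns True, at which point the traffic can be blocked before any guess succeeds.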

The Importance of Bot Management

In today's landscape, where bots have become increasingly sophisticated, investing in a robust bot management solution is crucial. Such a solution should be built from the ground up using AI and machine learning to detect bot activity in real time and mitigate malicious bot activity automatically.

A comprehensive bot management strategy enables you to:

  • Manage cohorts of visitors according to risk.
  • Identify verified visitors.
  • Use a recommendation engine to automate a comprehensive good bot access list.
  • Have a wide range of bot mitigations tailored for each bot threat type.

Photo by Jaime Spaniol on Unsplash
