Bot Detection Accuracy Matrix

How does VerifiedVisitors validate zero trust at the network edge? How do we ensure we are constantly tracking progress to zero trust?

VerifiedVisitors uses Machine Learning to verify the authenticity of incoming traffic. In Machine Learning, the validation tool most commonly used to check model results over time is the confusion matrix.

Confused? Let’s make sense of this in the simple diagram below showing how the matrix works:

Bot Confusion Matrix

Let’s take a flip of a coin. We predict heads or tails. Our prediction is either right or wrong. So there are only four possibilities for each of our predictions vs. what actually happened.

We call heads and the result is heads: it’s a true positive. We call heads and the result is tails: it’s a false positive.

Obviously, we then have the exact inverse.

We call tails and the result is tails: it’s a true negative. We call tails and the result is heads: it’s a false negative.
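Here’s a minimal sketch of that coin-flip matrix in code, assuming scikit-learn is available; the sequence of calls and results is invented purely for illustration:

```python
# Minimal coin-flip confusion matrix, with "heads" as the positive class.
# The flips below are made up for illustration only.
from sklearn.metrics import confusion_matrix

actual    = ["heads", "heads", "tails", "tails", "heads", "tails"]
predicted = ["heads", "tails", "tails", "heads", "heads", "tails"]

# Rows are what actually happened, columns are what we predicted.
# labels=["heads", "tails"] puts the positive class ("heads") first.
matrix = confusion_matrix(actual, predicted, labels=["heads", "tails"])
tp, fn, fp, tn = matrix.ravel()

print(f"true positives:  {tp}")  # we called heads, it was heads
print(f"false negatives: {fn}")  # we called tails, it was heads
print(f"false positives: {fp}")  # we called heads, it was tails
print(f"true negatives:  {tn}")  # we called tails, it was tails
```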

So, now that we’ve understood the confusion matrix, how does it apply to bots?

Bot Detection Rates

When we predict the correct results for humans and bots, as shown by the green arrow, we’re in a great place. Instead of forcing all users into a rigorous authentication process, legitimate users pass through seamlessly, we get better traction, users are happier, and all the bots are stopped.

Let’s look at what happens when things go wrong, following the red arrow for false results shown above.

The false positive is when we predict it’s a bot, but it actually proves to be human. In this case, we are certainly creating an issue for our human visitors. We are accusing a potential customer of not being human, and may ask them to complete a secondary verification process, or even a CAPTCHA.

One problem with using CAPTCHA as a feedback loop is, of course, that bots do solve CAPTCHAs. Human CAPTCHA farms are now pretty common, and have even been incorporated into “bots as a service” offerings.

However, a very low false positive rate isn’t something you want to boast about for another major reason. It only tells you one half of the story, and in this case it really is a shaggy dog story. Why?

Looking again at our confusion matrix, it’s the false negatives that cause the real damage. In this case our system says the visitor is human. We trust that visitor, but it’s really a bot.

The vast majority of bot detection vendors do not have a robust confusion matrix model, as they aren’t using Machine Learning at the heart of their detection model. They aren't looking at the whole picture.

When we develop our models, we want to prioritise eliminating false negatives over false positives. Why? The reason is simple: if we challenge a small additional percentage of humans, we don’t breach our zero trust. Labelling a bot as a human and creating a false negative is what we need to avoid at all costs. The false negatives create the real problem, because we’ve breached our zero trust principles and allowed a bot access to our protected space.
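As a rough sketch of what prioritising false negatives over false positives can look like in practice (not our actual model — the features, class weights and threshold below are invented for illustration), a classifier can be trained with a heavier penalty for missing a bot, and the decision threshold lowered at inference:

```python
# Illustrative sketch only: toy data, arbitrary weights and threshold.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy training data: each row is a visitor, label 1 = bot, 0 = human.
X = np.random.rand(1000, 4)                  # e.g. request rate, header entropy, ...
y = (X[:, 0] + X[:, 1] > 1.2).astype(int)

# class_weight makes a missed bot (false negative) cost more during training
# than a challenged human (false positive).
model = LogisticRegression(class_weight={0: 1, 1: 5}).fit(X, y)

# At inference time, lowering the decision threshold for "bot" trades a few
# extra challenges for fewer missed bots.
bot_probability = model.predict_proba(X)[:, 1]
flag_as_bot = bot_probability > 0.3          # default would be 0.5
```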

Let’s take a real world example so this is clear. We all remember the dreaded PCR tests. What we want to avoid is telling a person the test proves they don’t have COVID when they do. Armed with that false knowledge, they can of course go on to spread the disease in the community.

The false positive, where the test says they have COVID but they don’t, is at least safe for the community, as hopefully responsible people will want to isolate and recover at home.

How are the false positive and false negative rates related? No model is perfect, and in fact a perfect result is usually a sign of overfitting.

However, it’s usually the case that if you prioritise accuracy in one area of the matrix, for example the false positives, it will impact accuracy in another area. This isn’t a linear trade-off: it varies from model to model and dataset to dataset, and there are plenty of examples where the generalisation doesn’t hold. The fact remains that concentrating on lowering the false positive rate means making it easier for humans to pass, which in turn makes it easier for the bots, since the bots only have to emulate human behaviour.
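To make the trade-off concrete, here’s a small sketch with synthetic scores: as the threshold for calling a visitor a bot rises, the false positive rate falls while the false negative rate climbs. All numbers are invented for illustration:

```python
# Synthetic example of the false positive / false negative trade-off.
import numpy as np

rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=5000)                        # 1 = bot, 0 = human
# Bots tend to score higher, but the two distributions overlap.
scores = np.clip(rng.normal(0.35 + 0.3 * y, 0.2), 0, 1)

for threshold in (0.1, 0.3, 0.5, 0.7, 0.9):
    pred = (scores > threshold).astype(int)
    fp = np.sum((pred == 1) & (y == 0))                  # humans challenged
    fn = np.sum((pred == 0) & (y == 1))                  # bots missed
    tn = np.sum((pred == 0) & (y == 0))
    tp = np.sum((pred == 1) & (y == 1))
    fpr = fp / (fp + tn)
    fnr = fn / (fn + tp)
    print(f"threshold {threshold:.1f}: FPR {fpr:.1%}  FNR {fnr:.1%}")
```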

What's clear is that the more humans we challenge, the more labels we get right, and the more our model improves.

When we label the bot as a human, the bot doesn’t tell us.

It carries on being a bot, and it hides in the dataset, incorrectly labelled as human. At VerifiedVisitors we prioritise the false negatives, using our Red Team and threat intelligence to continually try to break our models.

So how can our customers trust us? Do we have the best false positive rate? Do we have the best false negative rate?

Our response is quite simple. Please don’t trust us.

We have a zero trust model for a reason.

Our “playback” feature allows our customers to see exactly which visitors were blocked and why, so they can validate and independently check the results in their SOC, or with a SIEM or other analysis tools they may have.

Headline figures such as 99.99% effectiveness rates, or 0.01% false positive rates, really don't mean anything on their own. Being 99.99% effective for all bot detection across all customers is meaningless: that 0.01% could be the bot that’s currently hitting your website.
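To see why, here’s a back-of-the-envelope sketch with invented traffic numbers: with heavily imbalanced traffic, a detector can report roughly 99.99% accuracy while still letting 5% of bots straight through:

```python
# Invented traffic numbers, purely to show how a headline figure can mislead.
human_requests = 999_000
bot_requests   = 1_000

humans_correctly_passed = 998_950                                    # true negatives
bots_correctly_blocked  = 950                                        # true positives
bots_missed             = bot_requests - bots_correctly_blocked      # 50 false negatives
humans_challenged       = human_requests - humans_correctly_passed   # 50 false positives

accuracy = (humans_correctly_passed + bots_correctly_blocked) / (human_requests + bot_requests)
false_negative_rate = bots_missed / bot_requests

print(f"headline accuracy:   {accuracy:.2%}")             # ~99.99%
print(f"false negative rate: {false_negative_rate:.2%}")  # 5% of bots still get through
```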

Adopting a zero trust model means we offer our customers a systematic way of measuring and validating bot traffic. We show both the false positive and the false negative ratios in our reports, so they can be incorporated into our customers’ own KPIs and reporting.


Photo by Markus Spiske on Unsplash
