
Block Bad Bots Using .htaccess

It is astonishing to think that 2012 was the year that traffic generated by automated bots and spiders on the internet outgrew human traffic. Since then, bots and spiders have only grown in number and sophistication.
Whilst many bots are 'good' bots that do things like look through your site to calculate your position in search engine indexes or help you find the cheapest deal on your car insurance, many others are far less benign and more hostile: probing your website for security flaws or systems they can exploit.
If you have ever had a WordPress website hacked (and really ‑ who hasn't), it would inevitably have been a bot, rather than an actual human mortal, doing the hacking.
If you spot bots in your server logs that are behaving oddly: perhaps trying to access different variations of admin or wp URLs on your site in the hope of finding a log‑in, or simply bogging your website down with irrelevant traffic, there are steps you can take to banish them.
The important thing to bear in mind here is that these solutions rely on the User‑Agent string. A really nefarious bot developer would likely change this often, so you may lose the battle with a particularly insistent android, but for the most part, these steps will help.
robots.txt
The first step is inside robots.txt. This is a file in the root of your domain which politely tells bots whether you would really rather they gave your website a skip. It's a little like those 'no soliciting' stickers your grandparents have on their front door, and about as useful.
Nevertheless, here is an example of how to block a bot called 'CuteStat' in your robots.txt file:
```
User-agent: CuteStat
Disallow: /
```

This simply says "If you are a bot called CuteStat, you are not allowed anywhere beneath the root of this domain". This is actually a genuine example ‑ CuteStat are incredibly annoying, but at least they do pay attention to this disallow...
You can see my robots.txt file here if you are interested. As I mentioned though, these are really only useful against 'good' robots that actually pay any attention to the robots.txt standard. Like those religious doorstep visitors whom you simply cannot stop from interrupting your tea time, bad robots and bad robot developers will simply ignore it.
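If you want to check that a robots.txt rule does what you think before a bot ever reads it, Python's standard library ships a parser that follows the same standard. Here is a small sketch using the CuteStat example above, plus a catch‑all entry that allows everyone else:

```python
from urllib.robotparser import RobotFileParser

# The same rules as the CuteStat example, plus a catch-all
# entry that allows every other bot.
rules = [
    "User-agent: CuteStat",
    "Disallow: /",
    "",
    "User-agent: *",
    "Disallow:",
]

rp = RobotFileParser()
rp.parse(rules)

# CuteStat is banned everywhere; other bots are allowed.
print(rp.can_fetch("CuteStat", "https://example.com/any/page"))      # False
print(rp.can_fetch("SomeOtherBot", "https://example.com/any/page"))  # True
```

Of course, this only tells you what a well‑behaved bot *should* do with your rules ‑ not what a bad one will.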
.htaccess
Your second option is a little more technical, and will only work if you're on an Apache server with access to .htaccess. Here, you can query a visitor's User‑Agent string and determine whether or not to allow them access to the site:
```
SetEnvIfNoCase User-Agent .*ahrefsbot.* bad_bot
SetEnvIfNoCase User-Agent .*dotbot.* bad_bot

<Limit GET POST HEAD>
    Order Allow,Deny
    Allow from all
    Deny from env=bad_bot
</Limit>
```

Here, we are setting a variable called bad_bot based on whether the User‑Agent contains specific strings, and then allowing everybody to access the site unless that variable is set.
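One caveat: the Order/Allow/Deny directives are the Apache 2.2 syntax, and are deprecated on Apache 2.4 and later in favour of Require. If your host runs 2.4, the equivalent rule would look roughly like this (a sketch, using the same example bot names):

```
SetEnvIfNoCase User-Agent .*ahrefsbot.* bad_bot
SetEnvIfNoCase User-Agent .*dotbot.* bad_bot

<RequireAll>
    Require all granted
    Require not env bad_bot
</RequireAll>
```

Blocked visitors receive a 403, so you can sanity‑check a deployed rule by spoofing the User‑Agent yourself, for example with curl -A "AhrefsBot" followed by your site's URL.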
I've left a couple of bot examples in the code block above, but you could create a list as long as necessary simply by replicating the first line and changing the bot name, for example:
```
SetEnvIfNoCase User-Agent .*Go-http-client.* bad_bot
```

One quick word of warning: the more directives you have in your .htaccess file, the more CPU and memory your server needs to use to serve your website. Consequently, this can ‑ in theory ‑ slow your website down and increase your TTFB (time to first byte).
That said, the difference at this level is imperceptible; just don't put hundreds and hundreds in there!
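Before adding a new pattern to .htaccess, it can help to sanity‑check which User‑Agent strings it would actually catch. This little Python sketch mimics the case‑insensitive matching that SetEnvIfNoCase performs, using the example patterns from above:

```python
import re

# Hypothetical patterns mirroring the .htaccess examples above.
BAD_BOT_PATTERNS = [r"ahrefsbot", r"dotbot", r"Go-http-client"]

def is_bad_bot(user_agent: str) -> bool:
    # SetEnvIfNoCase matches case-insensitively, so we do too.
    return any(re.search(p, user_agent, re.IGNORECASE) for p in BAD_BOT_PATTERNS)

print(is_bad_bot("Mozilla/5.0 (compatible; AhrefsBot/7.0)"))   # True
print(is_bad_bot("Go-http-client/1.1"))                        # True
print(is_bad_bot("Mozilla/5.0 (Windows NT 10.0; Win64; x64)")) # False
```

It is only a rough stand‑in for Apache's own regex engine, but it is a quick way to catch a pattern that is too broad before it starts blocking real visitors.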