
Block Bad Bots Using .htaccess
Block bad bots with .htaccess and robots.txt: why unwanted crawlers waste bandwidth, where rules help, and what to treat with caution.
.htaccess is a configuration file used by Apache-based web servers. It gives site administrators control over various settings on a per-directory basis, without having to alter the main server configuration directly.
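As a rough illustration of the kind of per-directory rule this allows, here is a minimal sketch of a .htaccess fragment that refuses requests from certain crawlers. It assumes mod_rewrite is enabled and that the server's AllowOverride setting permits rewrite directives in .htaccess; the user-agent strings "ExampleBadBot" and "AnotherScraper" are hypothetical placeholders, not names taken from the articles below.

```apache
# Sketch only: "ExampleBadBot" and "AnotherScraper" are hypothetical
# user-agent strings used for illustration.
<IfModule mod_rewrite.c>
    RewriteEngine On
    # Match the User-Agent header, case-insensitively, against either placeholder
    RewriteCond %{HTTP_USER_AGENT} (ExampleBadBot|AnotherScraper) [NC]
    # Answer any matching request with 403 Forbidden
    RewriteRule ^ - [F,L]
</IfModule>
```

User-agent strings are easy to spoof, so a rule like this only deters crawlers that identify themselves honestly.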
Below is the subset of articles from my blog that deal specifically with .htaccess. Although I've worked with it for many years, I've rarely written about it; so far I've published only two articles, both listed below.

.htaccess: Block bad bots with .htaccess and robots.txt: why unwanted crawlers waste bandwidth, where rules help, and what to treat with caution.

urllist.txt from sitemap.xml: Using PHP, it is quick and easy to automatically generate your urllist.txt sitemap from your sitemap.xml file (for example, one produced by gatsby-plugin-sitemap).