Robots.txt parser / generator
Updated Sep 18, 2018 - TypeScript
Front-end workflow to start a new project with Eleventy and Webpack; generates a robots.txt.
The repository contains a Google-based robots.txt parser and matcher as a C++ library (compliant with C++17).
🚫🤖 Overrides /robots.txt to disallow all web crawlers, regardless of the settings stored in the database. Compatible with Liferay 7.0, 7.1, 7.2, 7.3, and 7.4.
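For reference, the blanket-disallow robots.txt that such an override serves is only two lines:

```
User-agent: *
Disallow: /
```

A `Disallow: /` rule under the wildcard `User-agent: *` group tells every compliant crawler to skip the entire site.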
This is a Python crawler that disregards robots.txt rules and downloads disallowed resources.
A tool for debugging robots.txt
🌐 A Google Chrome extension that displays the contents of a website's robots.txt and sitemap.xml files.
🤖 Robots.txt generator done right.
A lightweight and simple robots.txt parser in node
A simple Python program that finds the robots.txt file of any website.
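Honoring a site's robots.txt does not require a third-party library in Python: the standard library ships `urllib.robotparser`. A minimal offline sketch (rules supplied inline via `parse()` rather than fetched over the network; the sample rules are illustrative):

```python
import urllib.robotparser

# Sample rules, as they would appear in a site's /robots.txt.
rules = """\
User-agent: *
Disallow: /private/
""".splitlines()

rp = urllib.robotparser.RobotFileParser()
rp.parse(rules)

print(rp.can_fetch("*", "/private/secret.html"))  # False
print(rp.can_fetch("*", "/index.html"))           # True
```

In a real crawler you would call `rp.set_url("https://example.com/robots.txt")` followed by `rp.read()` instead of `parse()`, and check `can_fetch()` before each request.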
A lightweight crawler frontier implementation in TypeScript using Redis.
This is a collection of robots.txt templates
'noindex' is a movement for drawing soft boundaries on the internet for search engines and generative AI crawlers.
A small, tested, no-frills parser of robots.txt files in Swift.
Sitemaps and Robots.txt for websites around the world.
Fully native robots.txt parsing component without any dependencies.
A simple-to-use, multi-threaded web crawler written in C with libcURL and Lexbor.
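Several of the entries above are robots.txt generators rather than parsers. The core of such a tool can be sketched as a function that maps user agents to disallowed paths and renders the grouped text format; the function name and rule shape below are hypothetical, not taken from any listed project:

```python
def generate_robots_txt(rules: dict[str, list[str]], sitemap=None) -> str:
    """Render a robots.txt from a mapping of user agent -> disallowed paths."""
    groups = []
    for agent, paths in rules.items():
        lines = [f"User-agent: {agent}"]
        # An empty "Disallow:" value means nothing is disallowed for this agent.
        lines += [f"Disallow: {p}" for p in paths] if paths else ["Disallow:"]
        groups.append("\n".join(lines))
    if sitemap:
        groups.append(f"Sitemap: {sitemap}")
    # Groups are separated by a blank line, per the robots.txt convention.
    return "\n\n".join(groups) + "\n"


print(generate_robots_txt(
    {"*": ["/admin/", "/tmp/"], "GPTBot": ["/"]},
    sitemap="https://example.com/sitemap.xml",
))
```

Real generators add per-agent `Allow` rules, `Crawl-delay`, and validation on top, but the output format is this simple.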