advertools - online marketing productivity and analysis tools
A simple and flexible web crawler that respects robots.txt policies and crawl delays.
A set of reusable Java components that implement functionality common to any web crawler
Ultimate Website Sitemap Parser
The robots.txt exclusion protocol implementation for Go language
Simple robots.txt template. Keeps unwanted robots out (disallow) and whitelists (allow) legitimate user agents. Useful for all websites.
A simple but powerful web crawler library for .NET
Determine if a page may be crawled from robots.txt, robots meta tags and robot headers
Tame the robots crawling and indexing your Nuxt site.
A pure-Python robots.txt parser with support for modern conventions.
Gatsby plugin that automatically creates robots.txt for your site
Parser for robots.txt for node.js
.NET Core plugin manager: extend web applications with plugin technology, enabling true SOLID and DRY principles when developing applications
NodeJS robots.txt parser with support for wildcard (*) matching.
Open-Source Python Based SEO Web Crawler
Python-based web crawling script with randomized intervals, user-agent rotation, and proxy server IP rotation to evade bot detection and prevent blocking.
An Astro project template for decent projects: auth, i18next, Bootstrap, sitemap, webworker, robots.txt, preact, react, endpoints, endpoint clients, OAuth, various Astro features and data loading preconfigured
Java sitemap generator. This library generates a web sitemap, can ping Google, and can generate an RSS feed, robots.txt, and more, in a friendly, easy-to-use Java 8 functional style of programming
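Many of the projects above parse robots.txt to decide which URLs a crawler may fetch. As a minimal sketch of that idea, Python's standard library ships `urllib.robotparser`; the robots.txt content and the user-agent names below are hypothetical examples, not taken from any of the listed projects.

```python
# Minimal sketch: checking crawl permissions against a robots.txt
# file using Python's standard-library urllib.robotparser.
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: a wildcard group with a crawl delay,
# plus a group that blocks one crawler entirely.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 5

User-agent: BadBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# The wildcard group applies to unlisted agents: public pages are
# allowed, anything under /private/ is not.
print(parser.can_fetch("MyCrawler", "https://example.com/index.html"))   # True
print(parser.can_fetch("MyCrawler", "https://example.com/private/x"))    # False

# BadBot is disallowed everywhere by its own group.
print(parser.can_fetch("BadBot", "https://example.com/index.html"))      # False

# Crawl-delay declared for the wildcard group.
print(parser.crawl_delay("MyCrawler"))                                   # 5
```

A polite crawler would sleep for the reported crawl delay between requests to the same host.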