Web Crawler 3000 v1.0 - The Ultimate Link Gobbler Edition 🕸️

@anima-regem anima-regem released this 07 Nov 15:10
· 2 commits to main since this release

Web Crawler 3000 v1.0 - The Ultimate Link Gobbler Edition brings you a Go-powered web-crawling experience that’s faster, smarter, and more ravenous for links than ever!

🔥 Key Highlights:

  • BFS-Driven Crawling: This release introduces true queue-based Breadth-First Search (BFS), making sure no link gets left behind (or visited twice).
  • Absolute URL Resolution: Our crawler has finally figured out how to handle those pesky relative URLs! It now resolves every link against the page it was found on, so relative paths become absolute URLs and the crawler can explore every corner of a site without wandering off-track.
  • Intelligent Link Filtering: No more crawling into “mailto:” or “tel:” links. This bot is web-only, and it’s serious about it! If it’s not HTTP or HTTPS, Web Crawler 3000 will pass.
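
To make the URL resolution and scheme filtering above concrete, here is a minimal sketch built on Go's standard `net/url` package. The helper name `resolveLink` and the example URLs are assumptions made for illustration; the release itself may structure this logic differently.

```go
package main

import (
	"fmt"
	"net/url"
)

// resolveLink is a hypothetical helper (not taken from the project).
// It resolves href against the page URL it was found on and reports
// whether the result uses the http or https scheme.
func resolveLink(pageURL, href string) (string, bool) {
	base, err := url.Parse(pageURL)
	if err != nil {
		return "", false // the page URL itself is malformed
	}
	ref, err := url.Parse(href)
	if err != nil {
		return "", false // malformed href: skip it
	}
	abs := base.ResolveReference(ref)
	if abs.Scheme != "http" && abs.Scheme != "https" {
		return "", false // mailto:, tel:, and other non-web schemes
	}
	return abs.String(), true
}

func main() {
	page := "https://example.com/docs/index.html"
	hrefs := []string{"../about", "guide.html", "mailto:hi@example.com", "tel:+15550100"}
	for _, href := range hrefs {
		if abs, ok := resolveLink(page, href); ok {
			fmt.Println(href, "->", abs)
		} else {
			fmt.Println(href, "-> skipped")
		}
	}
}
```

Checking the scheme after resolution, rather than before, means a relative link inherits the scheme of the page it came from and passes the filter, while `mailto:` and `tel:` links keep their own scheme and are dropped.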

🛠️ Improved Robustness:

  • Cycle Detection: The crawler uses a links map to keep track of where it’s been, so it won’t waste time re-visiting links or get stuck in loops.
  • Streamlined Link Queue: All links are stored in a queue, ensuring a smooth, organized crawl from start to finish.
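
The queue and the visited map described above can be sketched together as follows. This is a simplified illustration, not the project's actual code: `crawlBFS`, the `fetchLinks` callback, and the in-memory `site` map are assumed names standing in for real page fetching and link extraction.

```go
package main

import "fmt"

// crawlBFS visits pages in breadth-first order starting from start.
// fetchLinks is a stand-in for "download the page and extract its links".
func crawlBFS(start string, fetchLinks func(string) []string) []string {
	visited := map[string]bool{start: true} // pages already queued or crawled
	queue := []string{start}                // FIFO queue of pages to crawl
	var order []string

	for len(queue) > 0 {
		page := queue[0]
		queue = queue[1:]
		order = append(order, page)

		for _, link := range fetchLinks(page) {
			if !visited[link] {
				visited[link] = true // mark on enqueue so a link is queued only once
				queue = append(queue, link)
			}
		}
	}
	return order
}

func main() {
	// A tiny fake site containing cycles: "/" <-> "/a" and "/a" <-> "/c".
	site := map[string][]string{
		"/":  {"/a", "/b"},
		"/a": {"/", "/c"},
		"/b": {"/c"},
		"/c": {"/a"},
	}
	order := crawlBFS("/", func(page string) []string { return site[page] })
	fmt.Println(order) // prints [/ /a /b /c]
}
```

Marking a link as visited when it is enqueued, rather than when it is dequeued, keeps duplicates out of the queue even when several pages point at the same target.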

💡 Perfect For:

  • Exploring Website Structures: Peek into the skeleton of any website, gather its links, and see how its pages connect.
  • Testing Crawling Logic: Want to build your own web scraper? This project is a solid foundation with easy-to-read Go code.
  • Just For Fun: Crawl, conquer, and marvel as Web Crawler 3000 does all the hard work while you watch the links flow.

Enjoy the power of web crawling in one easy-to-use, BFS-optimized Go package!