- feat(sync): subscribe to page updates to perform async handling of data
- feat(js): add init of script parsing
- feat(worker): add tls support
- chore(request): add custom domain redirect policy
- chore(glob): fix glob crawl establish
- chore(crawl): fix crawl asset detection and trailing start
- feat(fs): add temp storage resource handling (#112)
- feat(url-glob): URL globbing (#113, thanks to @roniemartinez)
- chore(request): fix resource success handling
- feat(proxies): add proxy support
- feat(decentralization): add workload split
- perf(crawl): add join handle task management
- perf(links): add fast pre serialized url anchor link extracting and reduced memory usage
- perf(links): fix case sensitivity handling
- perf(crawl): reduce memory usage on link gathering
- chore(crawl): remove `Website.reset` method and improve crawl handling resource usage (`reset` not needed now)
- chore(crawl): add heap usage of links visited
- perf(crawl): massive scans capability to utilize more cpu
- feat(timeout): add optional `configuration.request_timeout` duration
- build(tokio): remove unused `net` feature
- chore(docs): add missing scrape section
- perf(req): enable brotli
- chore(tls): add ALPN tls defaults
- chore(statics): add initial static media ignore
- chore(robots): add shared client handling across parsers
- feat(crawl): add subdomain and tld crawling
- perf(links): filter dup links after async batch
- chore(delay): fix crawl delay thread groups
- perf(page): slim channel page sending required props
- feat(regex): add optional regex black listing
- chore(bin): fix bin executable #17
- feat(cli): add cli separation binary #17
- feat(robots): add robots crawl delay respect and ua assign #24
- feat(async): add async page body gathering
- perf(latency): add connection re-use across request #25