A simple Node.js web scraper using website-scraper to download an entire website.
Make sure you have Node.js installed on your machine.
- Clone the repository:
  git clone https://github.com/Bahrul-Rozak/url-to-code.git
- Navigate to the project directory (git clone creates a folder named after the repository):
  cd url-to-code
- Install dependencies:
  npm install
- Open index.js in your preferred code editor.
- Set the websiteUrl variable to the URL of the website you want to scrape:
  const websiteUrl = 'https://example.com';
- Customize other options if needed (e.g., maxDepth, directory).
- Run the scraper:
  node index.mjs
- Check the ./result directory for the downloaded website.
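
Since the whole run depends on the websiteUrl value, it can help to check it before starting the scraper. The snippet below is a minimal sketch, not code from the repository: checkWebsiteUrl is a hypothetical helper name, and it relies only on the URL class built into Node.js.

```javascript
// Hypothetical helper (not part of the repository): checks that a
// websiteUrl value is an absolute http(s) URL before scraping starts.
function checkWebsiteUrl(value) {
  let parsed;
  try {
    // The built-in URL constructor throws on strings it cannot parse.
    parsed = new URL(value);
  } catch {
    throw new Error(`Not a valid absolute URL: ${value}`);
  }
  if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') {
    throw new Error(`Expected an http(s) URL, got protocol: ${parsed.protocol}`);
  }
  // href is the normalized form of the URL.
  return parsed.href;
}

console.log(checkWebsiteUrl('https://example.com')); // https://example.com/
```

A value such as 'example.com' (no scheme) or 'ftp://example.com' makes the helper throw, so a typo is caught before anything is downloaded.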
urls: An array of URLs to scrape.
urlFilter: A function to filter URLs. The example keeps only URLs that start with the specified websiteUrl.
recursive: If true, the scraper will follow links recursively.
maxDepth: Maximum recursion depth.
prettifyUrls: If true, URLs will be prettified.
filenameGenerator: File naming strategy, set to 'bySiteStructure' in the example.
directory: Output directory for the downloaded website.
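
The options above could be assembled roughly as follows. This is a sketch under assumptions, not the repository's actual index file: buildScraperOptions is a hypothetical helper name, the maxDepth value of 3 is an arbitrary placeholder, and the urlFilter shown is one plausible way to keep only URLs that start with websiteUrl.

```javascript
// Hypothetical helper: builds an options object from the option names
// described above. The values are illustrative, not the project's defaults.
function buildScraperOptions(websiteUrl, directory = './result') {
  return {
    urls: [websiteUrl],
    // Keep only URLs that begin with the site being scraped.
    urlFilter: (url) => url.startsWith(websiteUrl),
    recursive: true,
    maxDepth: 3, // assumed value; adjust to how deep links should be followed
    prettifyUrls: true,
    filenameGenerator: 'bySiteStructure',
    directory,
  };
}

const options = buildScraperOptions('https://example.com');
console.log(options.urlFilter('https://example.com/about'));  // true
console.log(options.urlFilter('https://other.example.org/')); // false

// In an ES module, the object would then be handed to the library:
//   import scrape from 'website-scraper';
//   await scrape(options);
```

Keeping the options in a plain function makes the filter easy to try on sample URLs without starting a download.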
- Thanks to website-scraper for providing an easy-to-use web scraping library.
Happy downloading! 🕸️