A Playwright-based web scraper that collects internship listings from Internshala, written in Python. Scraped data is stored in a CSV file.
This project is for educational purposes only. I am not responsible for any misuse of this code.
- Store data in a CSV file
- Store data in a Google Sheet (optional)
- Keeps the Google Sheet data synced using GitHub Actions (optional)
Make sure you have the following dependencies installed:
- Python 3.x
- Playwright library for Python
You can install them with the following command:
pip install playwright && playwright install chromium
- Clone the repository
- Install the dependencies using
pip install -r requirements.txt
- Run the script using
python main.py
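The CSV storage step can be sketched roughly as follows. The field names (`title`, `company`, `stipend`, `location`) are assumptions for illustration, not necessarily the scraper's actual schema:

```python
import csv

# Hypothetical rows as the scraper might collect them (field names are assumed)
internships = [
    {"title": "Data Science Intern", "company": "Acme", "stipend": "10000", "location": "Remote"},
    {"title": "Web Dev Intern", "company": "Globex", "stipend": "8000", "location": "Delhi"},
]

def save_to_csv(rows, path="internships.csv"):
    # Write a header row followed by one row per internship
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=rows[0].keys())
        writer.writeheader()
        writer.writerows(rows)

save_to_csv(internships)
```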
Optional steps (for Google Sheets mode only):
- Create a new Google Sheet
- Create a new project in Google Cloud Platform
- Follow this guide for setting up the Google Sheets API
- Download the JSON key file and add all the credentials to the `.env` file (refer to `.env.example`)
- Get the Google Sheet ID from the URL, e.g. https://docs.google.com/spreadsheets/d/GOOGLE_SHEET_ID/edit
- Add `GOOGLE_SHEET_ID` to the `.env` file
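Once `GOOGLE_SHEET_ID` is in the `.env` file, the script can rebuild the sheet URL from the environment. A minimal sketch, assuming the variable name above; the loading helper is illustrative (the real script would typically load `.env` via python-dotenv):

```python
import os

# Normally loaded from the .env file (e.g. via python-dotenv); set here for illustration
os.environ.setdefault("GOOGLE_SHEET_ID", "GOOGLE_SHEET_ID")

def sheet_url():
    # Build the spreadsheet URL from the ID stored in the environment
    sheet_id = os.environ["GOOGLE_SHEET_ID"]
    return f"https://docs.google.com/spreadsheets/d/{sheet_id}/edit"

print(sheet_url())
```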
Optional steps (for syncing Google Sheets using GitHub Actions):
- GitHub Actions are already set up in the repository
- Install the GitHub CLI, or add the secrets from the `.env` file to the repository manually
- With the GitHub CLI, run:
gh secret set -R <your-username/your-repo> -f .env
- `--headful`: Run the script in non-headless mode (show the browser): python main.py --headful
- `--gs`: Run the script in Google Sheets mode (store data in Google Sheets): python main.py --gs
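The two flags above can be parsed with argparse; a minimal sketch of how `main.py` might wire them up (the actual script may differ):

```python
import argparse

def parse_args(argv=None):
    # Both flags are optional booleans, matching the modes described above
    parser = argparse.ArgumentParser(description="Internshala scraper")
    parser.add_argument("--headful", action="store_true",
                        help="run in non-headless mode (show the browser)")
    parser.add_argument("--gs", action="store_true",
                        help="store data in Google Sheets as well")
    return parser.parse_args(argv)

args = parse_args(["--headful"])
print(args.headful, args.gs)  # → True False
```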