Soodud is a web app that scrapes product data from online stores with Python, then uses hierarchical cluster analysis implemented in C++ to group comparable products across stores. The results are stored in PostgreSQL, served through a Django REST API, and consumed by a React and TailwindCSS frontend.
CI/CD is implemented through GitHub Actions and Docker Compose. Nginx and fail2ban are used to compress, cache, and serve static files, provide rate limiting, and detect malicious bots. All commits are run through flake8 and other pre-commit hooks.
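To illustrate the product-matching idea, here is a minimal sketch of hierarchical (agglomerative) clustering over product names. It is written in Python for readability and is not the project's C++ implementation: the function names, the single-linkage rule, the `difflib` string similarity, and the `0.8` threshold are all assumptions chosen for the example.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio between two product names (case-insensitive)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def cluster_products(names: list[str], threshold: float = 0.8) -> list[list[str]]:
    """Agglomerative clustering of product names (hypothetical helper).

    Start with one cluster per name, then repeatedly merge the two most
    similar clusters until no pair reaches `threshold`.
    """
    clusters = [[name] for name in names]
    while len(clusters) > 1:
        best_score, best_pair = 0.0, None
        for i, j in combinations(range(len(clusters)), 2):
            # Single linkage: two clusters are as similar as their closest members.
            score = max(similarity(a, b) for a in clusters[i] for b in clusters[j])
            if score > best_score:
                best_score, best_pair = score, (i, j)
        if best_pair is None or best_score < threshold:
            break  # nothing left that is similar enough to merge
        i, j = best_pair  # i < j, so deleting j does not shift i
        clusters[i].extend(clusters[j])
        del clusters[j]
    return clusters


if __name__ == "__main__":
    # Made-up names standing in for scraped listings from different stores.
    print(cluster_products(["Milk 2.5% 1L", "milk 2.5% 1l", "Rye bread 500g"]))
```

Each resulting cluster stands for one product as listed by different stores. A real matcher would likely compare more than names (for example quantity or brand), which this sketch leaves out.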
- Clone the project.
- Create a valid `.env` file based on `.env.example`.
- In order to contribute, first install the required git commit hooks with `cd django && pipenv run pre-commit install`.
- Install dependencies using `cd client && npm install --dev` and `cd django && pipenv install`.
- Build the C++ project and move `clustering/out/clustering.(so|pyd)` into the `django/data/stores/` directory.
- Start the webpack dev server using `cd client && npm run server`.
- Start the Python virtual environment with `cd django && pipenv shell`.
- Start the Django dev server with `tools/start_server.sh`.
- To scrape new product data and form updated product clusters, run `tools/run_service.sh launch` and `tools/run_service.sh match` respectively.
- If this is your initial configuration, temporarily disable HTTPS in `nginx/nginx.conf` by commenting out the `include` directive.
- Run Docker Compose with `tools/compose.sh`.
- Create a new cron job with `tools/cron.txt` as a reference. This will ensure that the product database is updated once a day.
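The setup step that moves the built clustering library into `django/data/stores/` could be scripted roughly as follows. This is a hypothetical helper, not part of the repository: the name `install_clustering_module` and the `repo_root` parameter are assumptions, and the sketch only moves an already built file, it does not run the C++ build.

```python
import shutil
from pathlib import Path


def install_clustering_module(repo_root: Path) -> Path:
    """Move the built clustering library into django/data/stores/ and return its new path."""
    out_dir = repo_root / "clustering" / "out"
    target_dir = repo_root / "django" / "data" / "stores"
    # The build output is clustering.so or clustering.pyd, depending on the platform.
    candidates = [out_dir / "clustering.so", out_dir / "clustering.pyd"]
    built = next((path for path in candidates if path.is_file()), None)
    if built is None:
        raise FileNotFoundError(f"no clustering.so or clustering.pyd found in {out_dir}")
    target_dir.mkdir(parents=True, exist_ok=True)
    destination = target_dir / built.name
    shutil.move(str(built), str(destination))
    return destination


if __name__ == "__main__":
    # Assumes the script is run from the repository root.
    print(install_clustering_module(Path.cwd()))
```

Failing loudly when neither file exists makes a skipped or broken build obvious before the Django server tries to import the module.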