MUT

Market Understanding Tool

About

This project builds a data analysis pipeline for data science job openings posted on Indeed. The pipeline is not limited to data science, though: it can classify job openings from any sector.

This pipeline generates a .html file with:

Clusters 2D Graph

Clusters Keywords Ranking

TF-IDF Ranking
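
To give an idea of what a TF-IDF ranking measures, here is a minimal sketch in plain Python. It is not the code this project uses: the function name `tfidf_ranking`, the whitespace tokenizer, and the choice to sum scores across documents are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_ranking(documents, top_n=5):
    """Rank terms by their TF-IDF score summed over all documents.

    Hypothetical helper for illustration; the real pipeline may tokenize
    and weight terms differently.
    """
    # Naive tokenizer (assumption): lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    scores = Counter()
    for tokens in tokenized:
        if not tokens:
            continue  # skip empty job descriptions
        counts = Counter(tokens)
        for term, count in counts.items():
            tf = count / len(tokens)                # term frequency
            idf = math.log(n_docs / doc_freq[term])  # inverse document frequency
            scores[term] += tf * idf
    return scores.most_common(top_n)


if __name__ == "__main__":
    jobs = ["python sql python", "sql excel", "sql spark"]
    for term, score in tfidf_ranking(jobs, top_n=3):
        print(f"{term}: {score:.3f}")
```

A term that appears in every job description (here `sql`) gets a score of zero, which is why TF-IDF tends to surface the words that distinguish one group of openings from another.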

Check out "Brazilian Data Science Jobs Market: A Deep Analysis" on the web!

Project Details

Folders

Folder	Description
db/	Folder where your Scrapy database will be saved
output/	Folder where your graphs and results will be saved

Files

ARGS	USAGE
[db-title]	Title of your Scrapy database (e.g., datascience_db)
[urls-file]	Name of the file listing your Indeed URLs (see sample.urls)
[toxicwords-file]	Name of the file listing words to exclude from the analysis (see sample.toxicwords)
[num-clusters]	Number of clusters to identify, given as a range (e.g., 2-8) or a single value (e.g., 8)
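
The two accepted forms of [num-clusters] can be illustrated with a small parser. This is a sketch only: `parse_num_clusters` is a hypothetical name, and treating the range as inclusive on both ends is an assumption, not something taken from app.py.

```python
def parse_num_clusters(value):
    """Turn 'N' or 'LOW-HIGH' into a list of cluster counts to try.

    Hypothetical helper; the inclusive upper bound is an assumption.
    """
    if "-" in value:
        low_text, high_text = value.split("-", 1)
        low, high = int(low_text), int(high_text)
    else:
        low = high = int(value)
    if low < 1 or high < low:
        raise ValueError(f"invalid cluster specification: {value!r}")
    return list(range(low, high + 1))


if __name__ == "__main__":
    print(parse_num_clusters("2-8"))
    print(parse_num_clusters("8"))
```

With a range, an analysis would typically be run once per candidate value so the results can be compared; a single value runs it once.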

Requirements

Paraphrasing The Beatles: "All you need is Docker 🐳"

Install

1. Clone this repo 🍕

git clone https://github.com/HelioNeves/mut.git
cd mut

2. Basic building 🔧

docker build . -t mut

Running this awesome docker image

1. Load ubuntu layer 🌈

docker run -ti --name MUT-env mut /bin/bash

2. Once inside Ubuntu, run the pipeline's Python scripts 🐍

Scrapy

python3 scraper.py [db-title] [urls-file]

Analytics app

python3 app.py [db-title] [toxicwords-file] [num-clusters]
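
The two steps share the same [db-title], and the scraper has to run before the analytics app. As a rough sketch, a small driver could assemble both command lines in that order. The function `build_pipeline_commands` is hypothetical and not part of this repo; the example values come from the arguments table above.

```python
import shlex


def build_pipeline_commands(db_title, urls_file, toxicwords_file, num_clusters):
    """Return the scraper and analytics commands, in the order they should run.

    Hypothetical helper for illustration only.
    """
    scraper_cmd = ["python3", "scraper.py", db_title, urls_file]
    app_cmd = ["python3", "app.py", db_title, toxicwords_file, str(num_clusters)]
    return scraper_cmd, app_cmd


if __name__ == "__main__":
    commands = build_pipeline_commands(
        "datascience_db", "sample.urls", "sample.toxicwords", "2-8"
    )
    for cmd in commands:
        # Print each command; pass cmd to subprocess.run(cmd, check=True)
        # instead if you want to execute it inside the container.
        print(shlex.join(cmd))
```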
