This is the repository of my internship project at the DDL lab in Lyon.
The model we built measures gender bias in French job offers on Indeed.
For more information about the research methodology, or for questions about collaboration, please contact bennadjifella@yahoo.fr and marc.allassonniere-tang@mnhn.fr.
We scraped our data from Indeed Canada. This data helped us calculate gender inequality in job offers.
- To scrape data from Indeed Canada and Indeed France, see Scraping_ca.R and Scraping_fr.R in the Scrap folder.
- To analyse the data, see Second_model.R in the Models folder. An earlier, lower-performing model is kept in First_model.R.
- Data_raw contains our raw data (Indeed_search_CA.csv) and the manually annotated data (qualitative result.csv).
- We evaluated our model on new data (Indeed_director.csv), which we scraped with New_scrap.R.
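As a rough illustration of the scraping step, here is a minimal sketch that extracts job titles and company names with rvest. It is not the code in Scraping_ca.R or Scraping_fr.R: the HTML snippet, the CSS classes (`job`, `title`, `company`) and the example offers are all assumptions made up for this example, and Indeed's real page structure is different and changes over time.

```r
library(rvest)

# Hypothetical markup standing in for a results page (not Indeed's real HTML).
page <- minimal_html('
  <div class="job"><h2 class="title">Directeur commercial</h2><span class="company">Exemple SA</span></div>
  <div class="job"><h2 class="title">Vendeur (H/F)</h2><span class="company">Exemple SARL</span></div>
')

# Select one node per job offer, then pull one field per offer.
jobs <- html_elements(page, "div.job")
offers <- data.frame(
  title = html_text2(html_element(jobs, "h2.title")),
  company = html_text2(html_element(jobs, ".company")),
  stringsAsFactors = FALSE
)

print(offers)
```

Calling `html_element()` on the set of job nodes returns exactly one result per offer (or a missing value), which keeps the columns aligned even when a field is absent from some offers.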
For this project, we used the following:
Main tools:
Main libraries:
- tidyverse
- dplyr
- stringr
- xml2
- rvest
- ggplot2
- party
To read the report for this project, please click here.
For the presentation: https://view.genially.com/6294c70801cdc900183e3a78/dossier-reporting-internship-report
The rule-based classification model of this project will be replaced in the future by a machine learning (ML) model.
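To give an idea of what a rule-based classification of job titles can look like, here is a minimal sketch using stringr. The function name `classify_title`, the two patterns and the labels "inclusive" and "unmarked" are illustrative assumptions; they are not the rules implemented in Second_model.R.

```r
library(stringr)

# Hypothetical rules, for illustration only (not the rules of Second_model.R).
classify_title <- function(title) {
  title <- str_to_lower(title)
  # Rule 1: explicit tag such as "(H/F)" or "F/H".
  inclusive_tag <- str_detect(title, "\\b[hf]\\s*/\\s*[hf]\\b")
  # Rule 2: paired forms such as "directeur/directrice",
  # detected by a slash followed by a word with a feminine ending.
  paired_form <- str_detect(title, "\\w+\\s*/\\s*\\w+(rice|euse|ienne)\\b")
  ifelse(inclusive_tag | paired_form, "inclusive", "unmarked")
}

titles <- c(
  "Directeur commercial",
  "Directeur/Directrice commercial(e)",
  "Vendeur (H/F)"
)
labels <- classify_title(titles)
print(data.frame(title = titles, label = labels))
```

A rule set like this is easy to read and audit, but it only catches the spellings it lists, which is one reason an ML model could be a useful replacement.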