GlobalFishingWatch/paper-forced-labor-responsible-ml

paper-forced-labor-responsible-ml

This repository is a companion to the manuscript “Towards a responsible machine learning approach to identify forced labor at sea”, by Rocío Joo et al.

Reproducibility

For reproducibility, we created a script that contains all the reproducible analyses in the paper. Due to confidentiality agreements, the negative cases used for validation cannot be shared, so we modified the code to run without them. Because Mac and Linux/Windows handle random seeds differently when using the ranger package, the script includes our results from both Mac and Linux so that users can compare them with their own output. The data (with anonymized vessels) needed to run the script is here.
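As a rough illustration of the platform comparison described above, the sketch below picks a reference result set based on the operating system and checks a new run against it. It is written in Python purely for illustration (ranger is an R package, and this is not the repository's code); the names `REFERENCE`, `reference_for`, and `matches_reference` are hypothetical, and the numbers are placeholders, not results from the paper.

```python
import math
import platform

# Placeholder reference values for illustration only.
# These are NOT results from the paper.
REFERENCE = {
    "Darwin": {"recall": 0.90},  # Mac
    "Linux": {"recall": 0.91},   # Linux (also used for Windows here)
}


def reference_for(system=None):
    """Return the reference results for the given (or current) operating system."""
    system = system or platform.system()
    # Assumption: Windows behaves like Linux with respect to seeds,
    # so anything that is not Mac is compared against the Linux results.
    key = "Darwin" if system == "Darwin" else "Linux"
    return REFERENCE[key]


def matches_reference(results, reference, tol=1e-6):
    """Check that every reference metric is reproduced within an absolute tolerance."""
    return all(
        math.isclose(results[name], value, abs_tol=tol)
        for name, value in reference.items()
    )


if __name__ == "__main__":
    my_results = {"recall": 0.91}  # hypothetical output of a local run
    print(matches_reference(my_results, reference_for()))
```

If the check fails on one platform but passes against the other platform's reference values, the difference most likely comes from seed handling rather than from the data or the code.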

Original code for data processing and analysis

  • These scripts cannot be run without access to Global Fishing Watch tables in BigQuery.
  • First step: Run queries to match tables of vessel information and compute movement patterns.
  • Second step: Process the data to be in the right format for the model.
  • Third, fourth and fifth steps: Run sensitivity analyses for the number of bags, the hyperparameter values of the random forests, and the number of initial random seeds.
  • Sixth step: With those optimal values, run the model, generate predictions, and compute performance and fairness.
  • Seventh step: Run an additional analysis of the ports used by those predicted as positives.
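
To make the idea of a sensitivity analysis on the number of random seeds more concrete, here is a minimal sketch in Python. It is one plausible reading of that step, not the procedure used in the repository: `noisy_score` is a hypothetical stand-in for training and scoring a model with a given seed, and `spread_by_seed_count` measures how much the seed-averaged score still varies for each candidate number of seeds.

```python
import random
import statistics


def noisy_score(seed):
    """Hypothetical stand-in for fitting a model with `seed` and returning a score."""
    rng = random.Random(seed)
    return 0.8 + rng.gauss(0, 0.05)


def spread_by_seed_count(seed_counts, repeats=30):
    """For each number of seeds, average the score over that many seeds,
    repeat with fresh seeds, and report the standard deviation of the averages."""
    spread = {}
    next_seed = 0  # hand out non-overlapping seeds across all repeats
    for n in seed_counts:
        averages = []
        for _ in range(repeats):
            seeds = range(next_seed, next_seed + n)
            next_seed += n
            averages.append(statistics.mean(noisy_score(s) for s in seeds))
        spread[n] = statistics.stdev(averages)
    return spread


if __name__ == "__main__":
    for n, sd in spread_by_seed_count([1, 10, 100]).items():
        print(f"{n:>3} seeds: spread of averaged score = {sd:.4f}")
```

Under this reading, a reasonable choice is the smallest number of seeds beyond which the spread stops shrinking meaningfully; the same loop structure would apply to the number of bags or to hyperparameter values.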