This project is a student pair-matching system that matches local students with incoming international students based on their common interests, courses, and faculties. Higher-education institutions can use it to run an effective buddy system in which international students are paired with local students who share their interests.
The best way to use this project is through a Docker container, as it ensures that all dependencies are installed and the script runs in a controlled environment. To use the Docker container, follow the steps below:
- Create a directory on your local machine where you want to store the project files. You can do this by running the command below:
mkdir esn-buddy-matcher
cd esn-buddy-matcher
- Create a docker-compose.yml file in the directory you created above. You can do this by running the command below:
touch docker-compose.yml
- Open the docker-compose.yml file in a text editor and add the following content:
services:
  buddy-matcher:
    image: daiigr/buddy-matcher:latest
    container_name: buddy-matcher
    volumes:
      - ./config:/config
      - ./input:/input
      - ./output:/output
- Run the following command to start the Docker container:
docker compose up
- The first time you run the container, the necessary config, input, and output directories will be created. You can then add the required files to the input and config directories. Once the files are added, start the Docker container again by running the command below:
docker compose up
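Before starting the container again, it can help to confirm that the input and config files are where the script expects them. The following is a minimal sketch, not part of the project: the helper name `missing_files` is hypothetical, and the file list is simply the set of file names mentioned elsewhere in this README, so adjust it if your setup differs.

```python
from pathlib import Path

# File names taken from this README; this list is an assumption about
# what the script needs and may not be exhaustive.
EXPECTED_FILES = [
    "input/local_students.csv",
    "input/incoming_students.csv",
    "config/hobbies.csv",
    "config/faculty_distances.xlsx",
    "config/local_students_column_renames.csv",
    "config/incoming_students_column_renames.csv",
]


def missing_files(base_dir="."):
    """Return the expected files that are not present under base_dir."""
    base = Path(base_dir)
    return [name for name in EXPECTED_FILES if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_files()
    if missing:
        print("Missing files:")
        for name in missing:
            print(f"  {name}")
    else:
        print("All expected files are in place.")
```

Run it from the directory that contains docker-compose.yml so the relative paths resolve against the mounted folders.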
- Clone the project to your local system. You can clone using the command below:
git clone <Repository-URL>
cd path-to-folder
- Run the bash shell script run.sh in your terminal. This script will install the necessary dependencies and execute the main Python script. Use the command:
./run.sh
- Ensure you have python3 installed on your system. If not, download and install Python first.
- Clone the project to your local system. You can clone using the command below:
git clone <Repository-URL>
cd path-to-folder
- Set up the virtual environment. You can create one by running the command:
python3 -m venv .venv
- Enter the following command to activate the environment:
- On macOS/Linux:
source .venv/bin/activate
- Install the required dependencies:
pip install -r requirements.txt
- Once the above steps are done, run the bash shell script run.sh in your terminal. This script will install the necessary dependencies and execute the main Python script. Use the command:
./run.sh
- If you prefer to run the script manually, you can run the following command:
python3 src/main.py
- The script processes the incoming and local student data from the input folder. It uses the hobbies from config/hobbies.csv and the faculty distances from config/faculty_distances.xlsx.
- All results are written to the output folder. If outliers are detected, two separate reports are generated: a buddy pair report ignoring outliers, and one including outliers. If there are no outliers, only one report is generated.
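To illustrate how shared hobbies and faculty distances could feed into a match, here is a hypothetical scoring sketch. It is not the project's actual algorithm: the function `match_score`, the record fields `hobbies` and `faculty`, the weights, and the sample distance are all assumptions made for the example.

```python
def match_score(local, incoming, faculty_distances,
                hobby_weight=1.0, distance_weight=1.0):
    """Score a candidate pair: more shared hobbies raise the score,
    a larger faculty distance lowers it (illustrative formula only)."""
    shared_hobbies = set(local["hobbies"]) & set(incoming["hobbies"])
    if local["faculty"] == incoming["faculty"]:
        distance = 0.0
    else:
        # Distances are looked up by unordered faculty pair.
        pair = frozenset((local["faculty"], incoming["faculty"]))
        distance = faculty_distances[pair]
    return hobby_weight * len(shared_hobbies) - distance_weight * distance


# Made-up sample data for the sketch.
local = {"hobbies": ["hiking", "chess"], "faculty": "Science"}
incoming = {"hobbies": ["chess", "cooking"], "faculty": "Arts"}
distances = {frozenset(("Science", "Arts")): 0.5}

print(match_score(local, incoming, distances))  # → 0.5
```

In this sketch one shared hobby contributes 1.0 and the faculty distance subtracts 0.5. The real script may weigh these inputs differently or use other criteria entirely.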
- The input/local_students.csv and input/incoming_students.csv files contain the information about the local and incoming students, respectively. The columns in these files are self-explanatory and contain the information needed for the match-making process.
- The hobbies are read from config/hobbies.csv, which is a simple list of hobbies.
- The config/faculty_distances.xlsx file contains the distances between faculties at the institution. This information helps deliver more refined match-making results.
- The config/local_students_column_renames.csv and config/incoming_students_column_renames.csv files map the input column names to a standard form to facilitate data processing. Practically speaking, they map one set of header names to another so that the headers match during processing.
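The header mapping described above can be sketched as follows. The layout of the rename files is not documented here, so this example assumes a two-column CSV without a header row (original name, standard name). The helper names `load_renames` and `rename_columns` and the sample data are hypothetical.

```python
import csv
import io


def load_renames(rename_file):
    """Read (original, standard) header pairs from a two-column CSV file object.
    Assumes no header row; rows with fewer than two cells are skipped."""
    return {row[0]: row[1] for row in csv.reader(rename_file) if len(row) >= 2}


def rename_columns(rows, renames):
    """Return rows with keys replaced according to renames; other keys are kept."""
    return [
        {renames.get(key, key): value for key, value in row.items()}
        for row in rows
    ]


# In-memory stand-ins for a rename file and a student file (made-up content).
renames_csv = io.StringIO("First name,first_name\nHome faculty,faculty\n")
students_csv = io.StringIO(
    "First name,Home faculty,Email\nAda,Science,ada@example.org\n"
)

renames = load_renames(renames_csv)
students = rename_columns(list(csv.DictReader(students_csv)), renames)
print(students)
# → [{'first_name': 'Ada', 'faculty': 'Science', 'Email': 'ada@example.org'}]
```

Columns without an entry in the rename mapping, such as Email in this example, keep their original header.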
- Output files are saved in the output directory.
- If outliers are detected, two separate reports are generated. One is 'matching_report_no_outliers.csv', which contains a buddy pair report ignoring outliers; the other, 'matching_report_with_outliers.csv', includes outliers. Each row in each report represents a buddy pair, with relevant matching information included.
- If there are no outliers, only a single report, 'matching_report.csv', is generated.
Please note, each time the script is run, a new output file is created with the timestamp in the filename to avoid overwriting previous results. Please make sure to review the latest file for the most recent results.
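Because several timestamped files can accumulate in the output directory, a small helper can pick out the newest one. This is a sketch only: the function name `latest_report` is hypothetical, and it selects by file modification time rather than by parsing the timestamp in the filename, whose exact format is not specified here.

```python
from pathlib import Path


def latest_report(output_dir="output", pattern="*.csv"):
    """Return the most recently modified file matching pattern,
    or None if the directory is missing or has no matching files."""
    directory = Path(output_dir)
    if not directory.is_dir():
        return None
    candidates = list(directory.glob(pattern))
    if not candidates:
        return None
    return max(candidates, key=lambda path: path.stat().st_mtime)


if __name__ == "__main__":
    newest = latest_report()
    print(newest if newest is not None else "No report found.")
```

Pass a narrower pattern, for example one that begins with matching_report_no_outliers, to restrict the search to a single report type.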
That's it! You've now successfully set up and run the ESN Buddy Matcher.