Installation

pip install -r requirements.txt
spaCy additionally requires its small English model:
- python -m spacy download en_core_web_sm
Download the main dataset here. Place it in the data directory.
Finally, run jupyter lab to view the code.
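
To confirm the installation steps above succeeded, a small check like the following may help. It is only a sketch: the helper name missing_packages is hypothetical, and it checks just the two packages this README names (spacy and the en_core_web_sm model, which spaCy installs as a regular Python package), not everything in requirements.txt.

```python
import importlib.util

# Packages named in this README; extend the tuple with entries from requirements.txt as needed.
REQUIRED = ("spacy", "en_core_web_sm")


def missing_packages(names=REQUIRED):
    """Return the top-level package names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("spaCy and the en_core_web_sm model are importable.")
```

If en_core_web_sm is reported as missing, rerun the spaCy download command listed above.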

Data exploration

NOTE: make sure to run all scripts (i.e., python files) from their specific path in the terminal.
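
If you prefer not to change directories by hand, the note above can be approximated with a small wrapper. This is a sketch under assumptions: run_from_own_dir is a hypothetical helper, not part of this repository, and it presumes the scripts resolve their files relative to the working directory.

```python
import subprocess
import sys
from pathlib import Path


def run_from_own_dir(script_path):
    """Run a Python script with its own folder as the working directory."""
    script = Path(script_path).resolve()
    return subprocess.run(
        [sys.executable, script.name],  # same interpreter as the caller
        cwd=script.parent,              # relative paths inside the script resolve here
        check=True,                     # raise CalledProcessError if the script fails
    )
```

For example, calling run_from_own_dir on the path to retrieve_OCM_labels.py would start it from the folder that contains it; supply the path as it exists in your checkout.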

First, you need to run the retrieve_OCM_labels.py script. It will create two json files that are needed for data exploration.
The Data Exploration & Processing notebook is then ready to be executed. This may take some time, since it validates that each row's language is English. You don't need to run it, since I already ran all cells and saved en_data.csv in the data directory.
- In case you want to run it, please make sure to download the fpsc3 dataset and place it in the data directory.
The Chosen Categories and Cultures Distribution notebooks explore the categories that could potentially be chosen in my thesis, and the cultures within the eHRAF database given these categories.
All images will be saved within the exploration directory.
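
The English-language validation mentioned above is the slow step, because every row has to pass through a detector. The sketch below shows the general shape of such a filter. It is not the notebook's code: keep_english, the column name "text", and toy_detector are all hypothetical, and the toy detector is a stand-in for whichever real language detector the notebook uses.

```python
import csv
import io


def keep_english(rows, text_column, detect_language):
    """Yield only the rows whose text is detected as English ("en")."""
    for row in rows:
        text = (row.get(text_column) or "").strip()
        if text and detect_language(text) == "en":
            yield row


def toy_detector(text):
    # Stand-in for a real language detector; for demonstration only.
    return "en" if " the " in f" {text.lower()} " else "other"


# Two in-memory sample rows instead of the real dataset.
sample = io.StringIO(
    "text\n"
    "The village holds the feast.\n"
    "Le village organise la fête.\n"
)
english_rows = list(keep_english(csv.DictReader(sample), "text", toy_detector))
print(len(english_rows))  # prints 1
```

Passing the detector as an argument keeps the filter independent of any particular detection library.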

Training

NOTE: both models have already been run, and the results match those listed in my paper. There is no need to rerun them and wait for training, but feel free to do so.

The Model 112 (Training102) notebook contains all the code needed to train the models. The dataset it uses is data/en_data.csv.
- The results will be saved as pickle files.
In the Model 113 (Analysis, hidden cues, text) notebook, I analyze the results of the model by reading the results saved during training.
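
The hand-off between the two notebooks (training writes pickle files, analysis reads them back) can be sketched as follows. The file name results.pkl, the helper names, and the dictionary contents are placeholders chosen for illustration; they are not the notebooks' actual names or results.

```python
import pickle
import tempfile
from pathlib import Path


def save_results(results, path):
    """Serialize a results object to disk, as the training step would."""
    with open(path, "wb") as f:
        pickle.dump(results, f)


def load_results(path):
    """Read a results object back, as the analysis step would."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Placeholder values only; these are not results from the thesis.
placeholder = {"model": "example", "scores": [0.0, 0.0]}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "results.pkl"
    save_results(placeholder, path)
    restored = load_results(path)
```

Only unpickle files you trust, since loading a pickle can execute arbitrary code.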

hasan-sh/masters-thesis

Folders and files

Latest commit

History

Repository files navigation

Installation

Data exploration

Training

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages