Code to download a dataset and filter it according to its metadata

EllieBennett91/ORACC-download

ORACC-download

This code allows you to download data from the Open Richly Annotated Cuneiform Corpus (ORACC).

This project remixes code from Niek Veldhuis' Computational Assyriology (Compass) project. I did not write the original code; I put two parts of it together and added filters to produce the ORACC dataset I required for my project.

In the folder where you download this file, you need a folder called "output". Your data will be saved to this folder.
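If you would rather not create the folder by hand, a minimal sketch like the following could do it for you. The helper name `ensure_output_dir` is hypothetical and is not part of this repository; the only thing taken from the instructions above is the folder name "output".

```python
from pathlib import Path


def ensure_output_dir(base="."):
    """Create an "output" folder inside `base` if it is missing.

    `ensure_output_dir` is a hypothetical helper for illustration only.
    It returns the path of the folder so later code can save files into it.
    """
    out = Path(base) / "output"
    # exist_ok=True makes this safe to call when the folder already exists.
    out.mkdir(parents=True, exist_ok=True)
    return out


if __name__ == "__main__":
    # Run from the folder that holds the downloaded code file.
    print(ensure_output_dir())
```

Calling it more than once is harmless, so it can sit at the top of a script or notebook.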

The code produces a table in which every line represents a text and every word is represented as a lemma, in the form lemma[guideword]POS. This data is taken from ORACC, where you can find more information about these terms. The texts included are filtered according to their metadata, which is also provided by ORACC.
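To work with the lemma[guideword]POS form afterwards, it can help to split each word into its three parts. The sketch below is an illustration only: the names `Lemma` and `parse_token` are hypothetical and not part of this repository, and it assumes that neither the lemma nor the guideword contains square brackets and that the POS tag contains no spaces. The sample word `šarru[king]N` is used purely to show the shape of the data.

```python
import re
from typing import NamedTuple, Optional


class Lemma(NamedTuple):
    """One word of the output table, split into its parts (hypothetical type)."""
    lemma: str
    guideword: str
    pos: str


# Assumed shape: lemma, then [guideword], then the POS tag with no spaces.
_TOKEN = re.compile(
    r"^(?P<lemma>[^\[\]]+)\[(?P<guideword>[^\[\]]*)\](?P<pos>[^\[\]\s]*)$"
)


def parse_token(token: str) -> Optional[Lemma]:
    """Split a lemma[guideword]POS string; return None if it does not fit."""
    match = _TOKEN.match(token)
    if match is None:
        return None
    return Lemma(match.group("lemma"), match.group("guideword"), match.group("pos"))


if __name__ == "__main__":
    print(parse_token("šarru[king]N"))
```

Returning `None` for anything that does not fit the pattern, rather than raising an error, lets you count or inspect unexpected words instead of stopping on the first one.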

The code is full of comments, but if you have any issues please contact me at eleanor.bennett@helsinki.fi.

Licensing

You are free to use, reuse, and remix this code, but please credit me (Ellie Bennett) and Niek Veldhuis.

(CC BY 4.0)
