Finding Data (including legacy versions)
#Basic Overview
This page assumes you know the legacy Perseus document name and/or the standard common name for a particular primary source file and are trying to find its URN and/or its location in the CVS repo, or the other way around. For secondary sources the process is much the same, but please also see the wiki page dedicated to secondary sources.
Some resources that are most frequently helpful in finding files and document names include:
- The P4 to P5 Text Migration Spreadsheet
  While the P4 to P5 Text Migration Spreadsheet is mostly deprecated with respect to tracking files (for more on current ways of tracking files, see Tracking Data), it can still be a good place to look if you're trying to find a particular text.
- The Perseus Catalog
  The atom links for particular editions or works are frequently the most helpful. You can find them under "Alternate Representations" in the upper right-hand corner after you've clicked on a particular author, work, or edition. You can also find them directly by URN. Here is an example. There is also an ATOM Response API that can be searched programmatically; see here for more information.
- The CVS repo
  You'll need a Tufts UTLN to gain access to the CVS repos with legacy data; otherwise, ask someone with access to find the file for you. If you choose the latter, find the CVS file name in the P4 to P5 Text Migration Spreadsheet and enter an issue in the appropriate Git repo.
- The Perseus canonical repos
  The Perseus canonical repos are kept in the PerseusDL GitHub account, organized by language of the original edition. For more on the canonical repos, see the next section.
#Canonical repos
PerseusDL/canonical was the first public GitHub repository home for the TEI XML texts of the Perseus Digital Library.
As our strategy for working with the texts through GitHub evolved and the repository grew, we decided to move the texts to individual repositories, subdivided by the CTS namespace to which the texts have been assigned. In general, the namespace corresponds to the language of original transmission for the work.
Greek works are now in http://github.com/PerseusDL/canonical-greekLit
Latin works are now in http://github.com/PerseusDL/canonical-latinLit
Anglo-Saxon works are now in http://github.com/PerseusDL/canonical-angLit
Italian works are now in http://github.com/PerseusDL/canonical-itaLit
Norse works are now in http://github.com/PerseusDL/canonical-norseLit
Farsi works are now in http://github.com/PerseusDL/canonical-farsiLit
For now, you can find secondary sources and reference works in http://github.com/PerseusDL/canonical-pdlrefwk, but this is subject to change. For more on this, see Secondary Sources.
If you are unsure where to find a work you are interested in, please use the Perseus Catalog. The Catalog interface prominently displays the CTS URN for each edition or translation of a work. The file structure in our GitHub repositories for texts currently adheres to the following pattern:
canonical-NAMESPACE/data/TEXTGROUP/WORK/TEXTGROUP-WORK-VERSION.xml
More information on the CTS identifier structure of the Perseus texts can be found in the Catalog documentation.
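As a rough illustration of the pattern above, the following Python sketch turns a version-level CTS URN into a repository name and a file path. The function name `urn_to_repo_path`, the `sep` parameter, and the sample URN are assumptions made for this example and are not part of any Perseus tooling. The joining character in the file name is taken literally from the pattern as written; check it against the actual repository before relying on it, and remember that these locations are subject to change.

```python
from pathlib import PurePosixPath

# CTS namespaces that have a canonical-NAMESPACE repository listed on this page.
KNOWN_NAMESPACES = {"greekLit", "latinLit", "angLit", "itaLit", "norseLit", "farsiLit"}


def urn_to_repo_path(urn: str, sep: str = "-") -> tuple[str, str]:
    """Return (repository name, path inside the repository) for a CTS URN.

    Hypothetical helper: `sep` is the character joining TEXTGROUP, WORK and
    VERSION in the file name. The default follows the pattern on this page
    and is an assumption, not a verified naming rule.
    """
    parts = urn.split(":")
    if len(parts) < 4 or parts[0] != "urn" or parts[1] != "cts":
        raise ValueError(f"not a CTS URN: {urn!r}")

    namespace, work_id = parts[2], parts[3]
    if namespace not in KNOWN_NAMESPACES:
        raise ValueError(f"no canonical repository listed for namespace {namespace!r}")

    # A version-level identifier looks like TEXTGROUP.WORK.VERSION.
    pieces = work_id.split(".", 2)
    if len(pieces) != 3:
        raise ValueError(f"expected TEXTGROUP.WORK.VERSION, got {work_id!r}")

    textgroup, work, _version = pieces
    filename = sep.join(pieces) + ".xml"
    path = PurePosixPath("data") / textgroup / work / filename
    return f"canonical-{namespace}", str(path)


if __name__ == "__main__":
    # Sample URN chosen for illustration only.
    repo, path = urn_to_repo_path("urn:cts:greekLit:tlg0012.tlg001.perseus-grc1")
    print(repo)  # canonical-greekLit
    print(path)  # data/tlg0012/tlg001/tlg0012-tlg001-perseus-grc1.xml
```

Any passage reference after the fourth colon of the URN is ignored, since the pattern only identifies the file for a whole edition or translation. Pass a different `sep` if the files in the repository you are looking at are joined with another character.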
Note that all GitHub file locations are subject to change, and that URLs to the GitHub files should NOT be used as permanent stable identifiers for the Perseus texts. Information on where and how to find stable identifiers for the Perseus texts is provided at http://sites.tufts.edu/perseusupdates/beta-features/perseus-stable-uris/ and http://sites.tufts.edu/perseuscatalog/documentation/user-guide/catalogdata-uris/
Got questions that aren't answered on any of these pages or their links? See Questions and Decisions, Asking and Reaching.