Tyndale House STEPBible Data Repository

Data created for www.STEPBible.org by Tyndale House Cambridge with CC BY-NC 4.0
(the code for STEPBible.org is on a separate licence)

This licence allows...

This public licence allows you to:

  • Include any part of STEPBible-Data in free software or free publications without requesting permission  
    (Though we'd love to hear from you about your project when you make it available.)
  • Request permission if your project is not free. A reasonable request is unlikely to be refused.
  • Download the data and reformat it for your application, without changing the data itself.
    Any changes to the formatting, order and application can be made without needing to record them.
  • Make changes to the data and record the differences
    You can make corrections or report possible errors to be checked at TyndaleStepATgmail.com. Any changes made to the data should be recorded and made available to subsequent users.
  • Refer others to this repository as the source of the data.
    Updates or corrections are easier to implement when the data is distributed from a single source. You are welcome to make a mirror, so long as it is kept up-to-date and has a link back here.

And you should:

Tyndale House is...

Tyndale House is an international Biblical Research Institute. About 50 scholars at a time work there, often from more than a dozen countries. Most work on private research, and some work on special projects for Tyndale House, such as this one.

The data in this repository is created and curated collaboratively by Tyndale scholars, directed by David Instone-Brewer with the oversight of Peter Williams, and by their successors.

The repository aims to provide reliable and freely usable data for studying the Bible without any denominational or doctrinal bias. Much of the data is derived from other publicly licensed sources, and has been compared with other non-public sources so that differences can be checked by Tyndale scholars. Corrections and proposed updates are welcomed - please send them to TyndaleStepATgmail.com for checking.

Datasets available

The following datasets are already posted:

  • Bible modules for OSIS Sword software
    Bibles in the same format as Crosswire modules, which can be used in any Sword-compatible software.

  • TTESV - Tyndale Translation tags for ESV
    Tags for Greek & Hebrew Extended Strongs (compatible with original Strongs) for the translated text of the ESV.

  • TOTHT - Tyndale OT Hebrew Tagged text
    The Leningrad codex based on Westminster via OpenScriptures, with full morphological and semantic tags for all words, prefixes and suffixes. Semantic tags use the extended Strongs linked to BDB by OS, are backwardly compatible with simple Strongs tags and include all affixes (as defined in TBESH). Morphological tags are from ETCBC converted to the format of OS (similar to Westminster) with different morphology for Ketiv/Qere when needed.

  • TANTT - Tyndale Amalgamated NT Tagged texts
    Greek text created from the SBLGNT+apparatus, following the decisions made by NA28, listing the major editions that also use that form (SBL, Treg, TR, Byz, WH, NA28). Variants are being added from major editions plus the first four centuries of MSS (from Bunning). All words are tagged lexically (extended Strongs linked to LSJ) and morphologically (Robinson based on Tauber plus a few missing details) plus context-sensitive meanings for words with more than one meaning. For copyright reasons, any words, variants or punctuation that occur only in NA27 and/or in NA28 are omitted, so that this data cannot be used to reconstruct those texts.

  • TBESH - Tyndale Brief lexicon of Extended Strongs for Hebrew
    Abridged BDB linked to extended Strongs (compatible with OpenScriptures and backwardly compatible with original Strongs)

  • TBESG - Tyndale Brief lexicon of Extended Strongs for Greek
    Brief definitions for all Greek Bible words (NT, LXX, Apoc, & variants) using corrected Abbott-Smith when available, completed with other similar definitions. Backwardly compatible with original Strongs.

  • TIPNR - Tyndale Individualised Proper Names with all References
    Every name in the Bible, linked to all Hebrew & Greek forms of that name and separated into individual people & places. Each form of the names for each individual includes exhaustive refs for where that individual is named with data of their spouses, siblings and offspring or the places' geolocation (based on OpenBible).

  • TVTMS - Tyndale Versification Traditions with Methodology for Standardisation: Eng+Heb+Lat+Grk+Others
    All the versification differences in the OT traditional texts in Hebrew, Latin and Greek, and NT early versification, compared with English standard (defined by NRSV which is virtually identical to KJV). Bible translations have an almost infinite variety of versifications because they may follow (for example) Latin in several sections, Hebrew in a few and English most of the time. The Methodology provides simple rules for every section, such as "if this chapter has 29 verses, it is using Greek versification". Using this, a whole Bible can be reversified according to English or traditional Hebrew or Greek or Latin versification, or compared with Bibles using that versification.

  • TEHMC - Tyndale Expansion of Hebrew Morphology Codes
    Hebrew morphology codes with expanded explanations in terms of parsing, meaning and example. The codes are based on OpenScriptures, which is similar to the Westminster code system used in BibleWorks and other commercial software. They include extra codes which occur in STEPBible data and which distinguish sequential perfectives, gentilics, gender/location for personal pronouns, and non-Jussive/Cohortative as well as Jussive/Cohortative & possibly-Jussive/Cohortative forms.

  • TEGMC - Tyndale Expansion of Greek Morphology Codes
    Greek morphology codes with expanded explanations in terms of parsing, meaning and example. The codes are based on Robinson, developed for the Majority text and used in most open-source texts. They include extra codes which occur in STEPBible data and which distinguish persons in possessive and reflexive pronouns, 2nd forms of verbs, and deponent forms from ambiguous passive/middle forms.

Datasets coming

The following datasets are still being finished and/or checked. If you see data that you need which isn't yet available, please contact us and perhaps you can become part of the checking process.

  • TOTGT - Tyndale OT Greek Tagged text
    LXX text with later Ecclesiastical variants. The base text is Rahlfs with variants from the Apostolic Bible (based on Sixtine, Aldine and Complutensian texts). Both have been tagged to LSJ (compatible with extended Strongs) and most of the morphology has been tagged (based on CCAT), but variant tagging needs completing.

  • TFBDB - Tyndale Formatted full BDB lexicon
    Full BDB formatted for easy reading (all bibliographic data hidden as hover-text) linked to extended Strongs (compatible with OpenScriptures and backwardly compatible with original Strongs)

  • TFLSJ - Tyndale Formatted full LSJ lexicon
    Full LSJ entries for all Bible words (NT, LXX, Apoc & variants), formatted for easy reading (all bibliographic data hidden as hover-text) linked to extended Strongs (backwardly compatible with original Strongs).

  • TOTMM - Tyndale OT Manuscripts and Meanings
    Translation, Hebrew form and witnesses for each variant that affects the meaning of the text, as determined by Barthélemy's UBS committee. Also, alternate meanings found in standard translations. Shown as alternate renderings of a base text (ESV 2011).

  • TNTMM - Tyndale NT Manuscripts and Meanings
    Translation, Greek form and witnesses up to 400 AD for each variant that affects the meaning of the text, as determined by the UBS apparatus. Also, alternate meanings found in standard translations. Shown as alternate renderings of a base text (ESV 2011).

Data format

Data is in plain unicode text (UTF-8) with fields separated by tabs, so that they can be loaded into any text editor or spreadsheet.

  • To open in a spreadsheet (e.g. Excel): In GitHub, click on the file, then "Download", then Save (Ctrl+S) to your drive. In Excel, "Browse" for it using "All Files" (not "All Excel Files") and open it. When asked, select "Delimited", "Tab", "General".

  • By default, datasets are one-line records, so a Record ends with a NewLine, and each line has identical fields.

  • Some datasets have multi-line records. Records are separated by a line starting with "$". The first line is a Header with fields that apply to each subsequent subRecord line. SubRecord lines all start with a tab.
    For example, in the ProperNames dataset, the first line is a header with information about the type (individual, place, title etc) and other data. These details apply to each of the subsequent subRecords which contain fields for the specific tag, Hebrew/Greek, translation, and the list of references. So the Header effectively contains fields which belong to each of its subRecords and would be identical for each of them if they were included on each line.
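
The multi-line layout described above can be read with a small state machine. The following is a minimal sketch in Python, not the project's own loader: the names `Record` and `parse_multiline_records` are hypothetical, the sample text is invented placeholder data rather than content from any dataset, and it assumes the input holds only records (no file preamble or comment lines).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Record:
    """One multi-line record: a Header line plus its subRecord lines."""
    header: List[str]
    sub_records: List[List[str]] = field(default_factory=list)


def parse_multiline_records(text: str) -> List[Record]:
    records: List[Record] = []
    current: Optional[Record] = None
    expect_header = True  # assumption: the input may start directly with a Header
    for line in text.splitlines():
        if line.startswith("$"):
            # A line starting with "$" separates records.
            current = None
            expect_header = True
        elif not line.strip():
            continue  # assumption: blank lines carry no data
        elif line.startswith("\t"):
            # SubRecord lines start with a tab; drop it and split the fields.
            if current is not None:
                current.sub_records.append(line[1:].split("\t"))
        elif expect_header:
            current = Record(header=line.split("\t"))
            records.append(current)
            expect_header = False
        # Any other line is ignored in this sketch.
    return records


# Invented placeholder data, only to show the shape of the format.
sample = (
    "$\n"
    "individual\tExampleName\n"
    "\tTAG1\tform1\ttranslation1\tGen.1.1; Gen.1.2\n"
    "\tTAG2\tform2\ttranslation2\tExo.1.1\n"
    "$\n"
    "place\tExamplePlace\n"
    "\tTAG3\tform3\ttranslation3\tLev.1.1\n"
)
records = parse_multiline_records(sample)
for record in records:
    for sub in record.sub_records:
        # Header fields apply to every subRecord, so they can be joined back on.
        print(record.header + sub)
```

Joining `record.header + sub` reproduces the flat one-line form, which is what the Header would look like if it were repeated on every subRecord line.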

  • Hebrew glyphs are separated and normalised in the order:
    consonant; sin/shin dot; dagesh; vowel; metheg/raphe; accents

    • Glyphs NOT used for Hebrew include:
      װ ױ ײ ﭏ ײַ שׁ שׂ שּׁ שּׂ אַ אָ אּ בּ גּ דּ הּ וּ זּ טּ יּ ךּ כּ לּ מּ נּ סּ ףּ פּ צּ קּ רּ שּ תּ וֹ בֿ כֿ פֿ ﬠ ﬡ ﬢ ﬣ ﬤ ﬥ ﬦ ﬧ ﬨ
  • Greek glyphs are normalised to include only:
    ; · . , ᾽ ά ά ὰ ᾷ ᾷ ἀ Ἀ Ἀ ἁ Ἁ ἄ ἄ Ἄ Ἄ ἅ ἂ ἂ ἅ ἃ ἃ ᾶ ᾳ ἆ ἆ έ έ ὲ ἐ Ἐ Ἐ ἑ Ἑ ἔ Ἔ ἒ ἕ ἕ Ἒ Ἕ Ἕ ἓ ἓ ή ή ὴ ῇ ῇ ἠ Ἠ Ἠ ἡ Ἡ ἤ ἤ Ἤ Ἤ ἢ ἢ ἥ ἥ Ἢ Ἢ ἣ ἣ ᾖ ᾖ ᾗ ᾗ ᾗ ῆ ῃ ῄ ῄ ἦ ἦ Ἦ Ἦ ἧ ἧ ᾐ ᾐ ᾑ ᾔ ᾔ ί ί ὶ ϊ ΐ ΐ ΐ ῒ ῒ ἰ Ἰ Ἰ ἱ Ἱ ἴ ἴ Ἴ Ἴ ἵ ἵ Ἵ Ἵ ἳ ἳ ῖ ἶ ἶ ἷ ἷ ό ό ὸ ὀ Ὀ Ὀ ὁ Ὁ ὄ ὄ Ὄ Ὄ ὅ ὅ ὂ ὂ Ὅ ὃ ὃ Ὃ Ὃ Ὃ ῥ Ῥ ̔Ρ ύ ύ Ύ ὺ ϋ ΰ ΰ ΰ ῢ ῢ ὐ ὑ Ὑ ὔ ὔ ὒ ὒ ὕ ὕ ὓ ὓ ῦ ὖ ὖ ὗ ὗ ώ ώ ὼ ῷ ῷ ὠ Ὠ ὡ Ὡ ὤ ὤ Ὤ ὢ ὢ ὥ ὥ Ὥ Ὥ ᾦ ᾧ ᾧ Ὧ ᾯ ᾯ ῶ ῳ ῴ ῴ ὦ ὦ Ὦ ὧ ὧ ὧ ᾠ ᾠ ς

  • Glyphs NOT used for Greek include:
    ; ' ᾿ ` ῾ ’ ‘ ‛ ′ ΄ ʹ̛̀́̓̒̓̔̕ ʹ ʻ ʼ ʽ ʾ ʿ ˈ ˊ ˋ ' ` ´ o ά ὰ ᾷ ἀ Ἀ ἁ Ἁ ἄ Ἄ ἅ ἂ ἃ ᾶ ᾳ ἆ έ ὲ ἐ Ἐ ἑ Ἑ ἔ Ἔ ἕ Ἕ ἓ ή ὴ ῇ ἠ Ἠ ἡ Ἡ ἤ Ἤ ἢ ἥ Ἢ ἣ ᾗ ῆ ῃ ῄ ἦ Ἦ ᾖ ἧ ᾐ ᾑ ᾔ i ί ὶ ϊ ΐ ῒ ἰ Ἰ ἱ Ἱ ἴ Ἴ ἵ Ἵ ἳ ῖ ἶ ἷ ό ὸ ὀ Ὀ ὁ Ὁ ὄ Ὄ ὅ ὂ Ὅ ὃ Ὃ ῥ Ῥ ύ ὺ ϋ ΰ ῢ ὐ ὑ Ὑ ὔ ὕ ὒ ὓ ῦ ὖ ὗ ώ ὼ ῷ ὠ ὡ Ὡ ὤ Ὤ ὢ ὥ Ὥ ᾦ ᾧ ᾯ ῶ ῳ ῴ ὦ Ὦ ὧ Ὧ ᾠ ϛ

  • Bible reference abbreviations are based on UBS with slightly different formatting:
    References are e.g. Gen.1.10-12; 1Ki.2.4,5; Phm.2; Job.1.3--2.4;
    OT: Gen, Exo, Lev, Num, Deu, Jos, Jdg, Rut, 1Sa, 2Sa, 1Ki, 2Ki, 1Ch, 2Ch, Ezr, Neh, Est, Job, Psa, Pro, Ecc, Sng, Isa, Jer, Lam, Ezk, Dan, Hos, Jol, Amo, Oba, Jon, Mic, Nam, Hab, Zep, Hag, Zec, Mal,
    Apoc: Tob, Jdt, EsG, Wis, Sir, Bar, LJe, S3Y, Sus, Bel, 1Ma, 2Ma, 3Ma, 4Ma, 1Es, 2Es, Man, Ps2, Oda, PsS, Alternate MSS: JsA, JdB, TbS, SsT, DnT, BlT,
    NT: Mat, Mrk, Luk, Jhn, Act, Rom, 1Co, 2Co, Gal, Eph, Php, Col, 1Th, 2Th, 1Ti, 2Ti, Tit, Phm, Heb, Jas, 1Pe, 2Pe, 1Jn, 2Jn, 3Jn, Jud, Rev
    (OT+NT are all based on the first 3 characters, except: Jdg, Sng, Ezk, Jol, Nam, Mrk, Jhn, Php, Phm, Jas, 1Jn, 2Jn, 3Jn)
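
As an illustration of the reference format above, the sketch below splits a reference string on ";" and checks each book abbreviation against the lists given here. It is a hedged example only: `BOOK_GROUPS` and `split_references` are hypothetical names, and the chapter/verse part (e.g. "1.3--2.4") is returned as an unparsed string because its grammar is only shown by example above.

```python
from typing import Dict, List, Tuple

OT = ("Gen, Exo, Lev, Num, Deu, Jos, Jdg, Rut, 1Sa, 2Sa, 1Ki, 2Ki, 1Ch, 2Ch, "
      "Ezr, Neh, Est, Job, Psa, Pro, Ecc, Sng, Isa, Jer, Lam, Ezk, Dan, Hos, "
      "Jol, Amo, Oba, Jon, Mic, Nam, Hab, Zep, Hag, Zec, Mal").split(", ")
APOC = ("Tob, Jdt, EsG, Wis, Sir, Bar, LJe, S3Y, Sus, Bel, 1Ma, 2Ma, 3Ma, 4Ma, "
        "1Es, 2Es, Man, Ps2, Oda, PsS").split(", ")
ALT_MSS = "JsA, JdB, TbS, SsT, DnT, BlT".split(", ")
NT = ("Mat, Mrk, Luk, Jhn, Act, Rom, 1Co, 2Co, Gal, Eph, Php, Col, 1Th, 2Th, "
      "1Ti, 2Ti, Tit, Phm, Heb, Jas, 1Pe, 2Pe, 1Jn, 2Jn, 3Jn, Jud, Rev").split(", ")

# Map each abbreviation to the group it is listed under.
BOOK_GROUPS: Dict[str, str] = {
    book: group
    for group, books in (("OT", OT), ("Apoc", APOC), ("Alt", ALT_MSS), ("NT", NT))
    for book in books
}


def split_references(refs: str) -> List[Tuple[str, str, str]]:
    """Return (group, book, location) for each ';'-separated reference."""
    result = []
    for item in refs.split(";"):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing ';'
        book, _, location = item.partition(".")
        if book not in BOOK_GROUPS:
            raise ValueError(f"unknown book abbreviation: {book!r}")
        result.append((BOOK_GROUPS[book], book, location))
    return result


parsed = split_references("Gen.1.10-12; 1Ki.2.4,5; Phm.2; Job.1.3--2.4;")
for group, book, location in parsed:
    print(group, book, location)
```

Validating the abbreviation first catches the cases that do not follow the first-three-characters rule (for example "Mrk" rather than "Mar") before any further processing of the chapter and verse part.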