Commit
Cleaned up the repo a bit; committed changes corresponding to HIPE2022 data release v1.0
Matteo Romanello committed Mar 3, 2022
1 parent eff4486, commit 188219b
Showing 40 changed files with 1,793 additions and 1,315 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,30 +1,30 @@ | ||
2022-01-13 15:59:16,835 - root - INFO - Start conversion of 5 files. | ||
2022-01-13 15:59:16,848 - root - INFO - Converting data/preparation/minireference/en/retokenized/cu31924087948174_0035.xmi into data/preparation/minireference/en/tsv/cu31924087948174_0035.tsv | ||
2022-01-13 15:59:17,075 - root - INFO - Hyphenation – Removed character - from spec-tator => spectator | ||
2022-01-13 15:59:17,076 - root - INFO - Hyphenation – Removed character - from her-self => herself | ||
2022-01-13 15:59:17,081 - root - INFO - Converting data/preparation/minireference/en/retokenized/cu31924087948174_0063.xmi into data/preparation/minireference/en/tsv/cu31924087948174_0063.tsv | ||
2022-01-13 15:59:17,274 - root - INFO - Hyphenation – Removed character - from occur-ring => occurring | ||
2022-01-13 15:59:17,275 - root - INFO - Hyphenation – Removed character - from κελαι-νώπαν => κελαινώπαν | ||
2022-01-13 15:59:17,276 - root - INFO - Hyphenation – Removed character - from mari-ners, => mariners, | ||
2022-01-13 15:59:17,279 - root - INFO - Hyphenation – Removed character - from dark-ened => darkened | ||
2022-01-13 15:59:17,284 - root - INFO - Converting data/preparation/minireference/en/retokenized/sophoclesplaysa05campgoog_0014.xmi into data/preparation/minireference/en/tsv/sophoclesplaysa05campgoog_0014.tsv | ||
2022-01-13 15:59:17,414 - root - ERROR - Transcript for entity Aristotle (Δ δίς 1. 15 § 13) is present in data/preparation/minireference/en/retokenized/sophoclesplaysa05campgoog_0014.xmi, yet entity is not marked as noisy. Levenshtein distance is computed nevertheless. | ||
2022-01-13 15:59:17,416 - root - INFO - Hyphenation – Removed character - from con-nected => connected | ||
2022-01-13 15:59:17,419 - root - INFO - Hyphenation – Removed character - from inter-polation => interpolation | ||
2022-01-13 15:59:17,422 - root - INFO - Converting data/preparation/minireference/en/retokenized/sophoclesplaysa05campgoog_0146.xmi into data/preparation/minireference/en/tsv/sophoclesplaysa05campgoog_0146.tsv | ||
2022-01-13 15:59:17,589 - root - ERROR - Transcript for entity 257. 1075 is present in data/preparation/minireference/en/retokenized/sophoclesplaysa05campgoog_0146.xmi, yet entity is not marked as noisy. Levenshtein distance is computed nevertheless. | ||
2022-01-13 15:59:17,590 - root - ERROR - Transcript for entity Philostratus ( Viz. Apoll. Δ. 22 § 5) is present in data/preparation/minireference/en/retokenized/sophoclesplaysa05campgoog_0146.xmi, yet entity is not marked as noisy. Levenshtein distance is computed nevertheless. | ||
2022-01-13 15:59:17,592 - root - INFO - Hyphenation – Removed character - from xpé-vov) => xpévov) | ||
2022-01-13 15:59:17,593 - root - INFO - Hyphenation – Removed character - from how-ever, => however, | ||
2022-01-13 15:59:17,597 - root - INFO - Hyphenation – Removed character - from per-son, => person, | ||
2022-01-13 15:59:17,604 - root - INFO - Hyphenation – Removed character - from διοί-yew => διοίyew | ||
2022-01-13 15:59:17,609 - root - INFO - Converting data/preparation/minireference/en/retokenized/sophoclesplaysa05campgoog_0288.xmi into data/preparation/minireference/en/tsv/sophoclesplaysa05campgoog_0288.tsv | ||
2022-01-13 15:59:17,758 - root - INFO - Hyphenation – Removed character - from re-corded => recorded | ||
2022-01-13 15:59:17,760 - root - INFO - Hyphenation – Removed character - from ξυν-ηρετεῖν => ξυνηρετεῖν | ||
2022-01-13 15:59:17,762 - root - INFO - Hyphenation – Removed character - from ξυνηρε-Toes: => ξυνηρεToes: | ||
2022-01-13 15:59:17,762 - root - INFO - Hyphenation – Removed character - from συῤνγ-ἥσεις => συῤνγἥσεις | ||
2022-01-13 15:59:17,763 - root - INFO - Hyphenation – Removed character - from ξυναρτί-oes.) => ξυναρτίoes.) | ||
2022-01-13 15:59:17,764 - root - INFO - Hyphenation – Removed character - from ξύμ-πλουν => ξύμπλουν | ||
2022-01-13 15:59:17,765 - root - INFO - Hyphenation – Removed character - from ellipti-cal => elliptical | ||
2022-01-13 15:59:17,770 - root - INFO - Hyphenation – Removed character - from *vio-lence’ => *violence’ | ||
2022-01-13 15:59:17,774 - root - INFO - Conversion completed. | ||
2022-02-14 10:54:32,861 - root - INFO - Start conversion of 5 files. | ||
2022-02-14 10:54:32,875 - root - INFO - Converting data/preparation/minireference/en/retokenized/cu31924087948174_0035.xmi into data/preparation/minireference/en/tsv/cu31924087948174_0035.tsv | ||
2022-02-14 10:54:33,114 - root - INFO - Hyphenation – Removed character - from spec-tator => spectator | ||
2022-02-14 10:54:33,116 - root - INFO - Hyphenation – Removed character - from her-self => herself | ||
2022-02-14 10:54:33,121 - root - INFO - Converting data/preparation/minireference/en/retokenized/cu31924087948174_0063.xmi into data/preparation/minireference/en/tsv/cu31924087948174_0063.tsv | ||
2022-02-14 10:54:33,330 - root - INFO - Hyphenation – Removed character - from occur-ring => occurring | ||
2022-02-14 10:54:33,331 - root - INFO - Hyphenation – Removed character - from κελαι-νώπαν => κελαινώπαν | ||
2022-02-14 10:54:33,332 - root - INFO - Hyphenation – Removed character - from mari-ners, => mariners, | ||
2022-02-14 10:54:33,335 - root - INFO - Hyphenation – Removed character - from dark-ened => darkened | ||
2022-02-14 10:54:33,339 - root - INFO - Converting data/preparation/minireference/en/retokenized/sophoclesplaysa05campgoog_0014.xmi into data/preparation/minireference/en/tsv/sophoclesplaysa05campgoog_0014.tsv | ||
2022-02-14 10:54:33,478 - root - ERROR - Transcript for entity Aristotle (Δ δίς 1. 15 § 13) is present in data/preparation/minireference/en/retokenized/sophoclesplaysa05campgoog_0014.xmi, yet entity is not marked as noisy. Levenshtein distance is computed nevertheless. | ||
2022-02-14 10:54:33,480 - root - INFO - Hyphenation – Removed character - from con-nected => connected | ||
2022-02-14 10:54:33,484 - root - INFO - Hyphenation – Removed character - from inter-polation => interpolation | ||
2022-02-14 10:54:33,487 - root - INFO - Converting data/preparation/minireference/en/retokenized/sophoclesplaysa05campgoog_0146.xmi into data/preparation/minireference/en/tsv/sophoclesplaysa05campgoog_0146.tsv | ||
2022-02-14 10:54:33,663 - root - ERROR - Transcript for entity 257. 1075 is present in data/preparation/minireference/en/retokenized/sophoclesplaysa05campgoog_0146.xmi, yet entity is not marked as noisy. Levenshtein distance is computed nevertheless. | ||
2022-02-14 10:54:33,664 - root - ERROR - Transcript for entity Philostratus ( Viz. Apoll. Δ. 22 § 5) is present in data/preparation/minireference/en/retokenized/sophoclesplaysa05campgoog_0146.xmi, yet entity is not marked as noisy. Levenshtein distance is computed nevertheless. | ||
2022-02-14 10:54:33,666 - root - INFO - Hyphenation – Removed character - from xpé-vov) => xpévov) | ||
2022-02-14 10:54:33,666 - root - INFO - Hyphenation – Removed character - from how-ever, => however, | ||
2022-02-14 10:54:33,671 - root - INFO - Hyphenation – Removed character - from per-son, => person, | ||
2022-02-14 10:54:33,679 - root - INFO - Hyphenation – Removed character - from διοί-yew => διοίyew | ||
2022-02-14 10:54:33,683 - root - INFO - Converting data/preparation/minireference/en/retokenized/sophoclesplaysa05campgoog_0288.xmi into data/preparation/minireference/en/tsv/sophoclesplaysa05campgoog_0288.tsv | ||
2022-02-14 10:54:33,840 - root - INFO - Hyphenation – Removed character - from re-corded => recorded | ||
2022-02-14 10:54:33,843 - root - INFO - Hyphenation – Removed character - from ξυν-ηρετεῖν => ξυνηρετεῖν | ||
2022-02-14 10:54:33,845 - root - INFO - Hyphenation – Removed character - from ξυνηρε-Toes: => ξυνηρεToes: | ||
2022-02-14 10:54:33,845 - root - INFO - Hyphenation – Removed character - from συῤνγ-ἥσεις => συῤνγἥσεις | ||
2022-02-14 10:54:33,846 - root - INFO - Hyphenation – Removed character - from ξυναρτί-oes.) => ξυναρτίoes.) | ||
2022-02-14 10:54:33,847 - root - INFO - Hyphenation – Removed character - from ξύμ-πλουν => ξύμπλουν | ||
2022-02-14 10:54:33,848 - root - INFO - Hyphenation – Removed character - from ellipti-cal => elliptical | ||
2022-02-14 10:54:33,853 - root - INFO - Hyphenation – Removed character - from *vio-lence’ => *violence’ | ||
2022-02-14 10:54:33,857 - root - INFO - Conversion completed. |
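The conversion log above appears to use the layout `timestamp - logger - LEVEL - message`. The following is a minimal sketch, assuming that layout, of how the dehyphenation pairs and the per-level counts could be pulled out of such a log for review. The function names (`parse_line`, `hyphenation_pairs`, `count_levels`) are hypothetical helpers and not part of the repository.

```python
import re
from collections import Counter

# Assumed layout, inferred from the log lines in this diff:
# "<date> <time>,<ms> - <logger> - <LEVEL> - <message>"
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) - "
    r"(?P<logger>\S+) - (?P<level>[A-Z]+) - (?P<msg>.*)$"
)

# Message shape copied from the log:
# "Hyphenation – Removed character - from <before> => <after>"
HYPHENATION = re.compile(
    r"^Hyphenation – Removed character - from (?P<before>.+) => (?P<after>.+)$"
)


def parse_line(line):
    """Return a dict with ts, logger, level and msg, or None if the line does not match."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None


def hyphenation_pairs(lines):
    """Collect (before, after) token pairs from the hyphenation messages."""
    pairs = []
    for line in lines:
        record = parse_line(line)
        if record is None:
            continue
        match = HYPHENATION.match(record["msg"])
        if match:
            pairs.append((match.group("before"), match.group("after")))
    return pairs


def count_levels(lines):
    """Count how many parsed log records fall under each level (INFO, ERROR, ...)."""
    levels = Counter()
    for line in lines:
        record = parse_line(line)
        if record is not None:
            levels[record["level"]] += 1
    return levels
```

Listing the pairs this way makes questionable joins such as `διοί-yew => διοίyew` easy to spot, and the ERROR count gives a quick check that the three "not marked as noisy" entities are unchanged between the two runs.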
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,32 +1,33 @@ | ||
2022-01-13 15:59:21,731 - __main__ - INFO - Read input from file data/preparation/minireference/en/tsv/cu31924087948174_0035.tsv | ||
2022-01-13 15:59:21,732 - __main__ - INFO - Read input from file data/preparation/minireference/en/tsv/cu31924087948174_0063.tsv | ||
2022-01-13 15:59:21,732 - __main__ - INFO - Read input from file data/preparation/minireference/en/tsv/sophoclesplaysa05campgoog_0014.tsv | ||
2022-01-13 15:59:21,732 - __main__ - INFO - Read input from file data/preparation/minireference/en/tsv/sophoclesplaysa05campgoog_0146.tsv | ||
2022-01-13 15:59:21,733 - __main__ - INFO - Read input from file data/preparation/minireference/en/tsv/sophoclesplaysa05campgoog_0288.tsv | ||
2022-01-13 15:59:21,734 - __main__ - INFO - Written sample to data/release/v0.1/HIPE-2022-ajmc-v0.1-sample-en.tsv | ||
2022-01-13 15:59:21,736 - __main__ - INFO - data/release/v0.1/HIPE-2022-ajmc-v0.1-sample-en.tsv contains all 5 expected documents | ||
2022-01-13 15:59:21,737 - __main__ - INFO - Read input from file data/preparation/minireference/en/tsv/cu31924087948174_0035-biblio.tsv | ||
2022-01-13 15:59:21,737 - __main__ - INFO - Read input from file data/preparation/minireference/en/tsv/cu31924087948174_0063-biblio.tsv | ||
2022-01-13 15:59:21,738 - __main__ - INFO - Read input from file data/preparation/minireference/en/tsv/sophoclesplaysa05campgoog_0014-biblio.tsv | ||
2022-01-13 15:59:21,738 - __main__ - INFO - Read input from file data/preparation/minireference/en/tsv/sophoclesplaysa05campgoog_0146-biblio.tsv | ||
2022-01-13 15:59:21,738 - __main__ - INFO - Read input from file data/preparation/minireference/en/tsv/sophoclesplaysa05campgoog_0288-biblio.tsv | ||
2022-01-13 15:59:21,739 - __main__ - INFO - Written sample to data/release/v0.1/HIPE-2022-ajmc_biblio-v0.1-sample-en.tsv | ||
2022-01-13 15:59:21,741 - __main__ - INFO - Read input from file data/preparation/minireference/de/tsv/Wecklein1894_0007.tsv | ||
2022-01-13 15:59:21,741 - __main__ - INFO - Read input from file data/preparation/minireference/de/tsv/Wecklein1894_0016.tsv | ||
2022-01-13 15:59:21,742 - __main__ - INFO - Read input from file data/preparation/minireference/de/tsv/Wecklein1894_0080.tsv | ||
2022-01-13 15:59:21,742 - __main__ - INFO - Read input from file data/preparation/minireference/de/tsv/Wecklein1894_0087.tsv | ||
2022-01-13 15:59:21,742 - __main__ - INFO - Read input from file data/preparation/minireference/de/tsv/sophokle1v3soph_0017.tsv | ||
2022-01-13 15:59:21,743 - __main__ - INFO - Read input from file data/preparation/minireference/de/tsv/sophokle1v3soph_0049.tsv | ||
2022-01-13 15:59:21,743 - __main__ - INFO - Read input from file data/preparation/minireference/de/tsv/sophokle1v3soph_0085.tsv | ||
2022-01-13 15:59:21,744 - __main__ - INFO - Read input from file data/preparation/minireference/de/tsv/sophokle1v3soph_0125.tsv | ||
2022-01-13 15:59:21,744 - __main__ - INFO - Written sample to data/release/v0.1/HIPE-2022-ajmc-v0.1-sample-de.tsv | ||
2022-01-13 15:59:21,746 - __main__ - INFO - data/release/v0.1/HIPE-2022-ajmc-v0.1-sample-de.tsv contains all 8 expected documents | ||
2022-01-13 15:59:21,746 - __main__ - INFO - Read input from file data/preparation/minireference/de/tsv/Wecklein1894_0007-biblio.tsv | ||
2022-01-13 15:59:21,746 - __main__ - INFO - Read input from file data/preparation/minireference/de/tsv/Wecklein1894_0016-biblio.tsv | ||
2022-01-13 15:59:21,747 - __main__ - INFO - Read input from file data/preparation/minireference/de/tsv/Wecklein1894_0080-biblio.tsv | ||
2022-01-13 15:59:21,747 - __main__ - INFO - Read input from file data/preparation/minireference/de/tsv/Wecklein1894_0087-biblio.tsv | ||
2022-01-13 15:59:21,747 - __main__ - INFO - Read input from file data/preparation/minireference/de/tsv/sophokle1v3soph_0017-biblio.tsv | ||
2022-01-13 15:59:21,748 - __main__ - INFO - Read input from file data/preparation/minireference/de/tsv/sophokle1v3soph_0049-biblio.tsv | ||
2022-01-13 15:59:21,748 - __main__ - INFO - Read input from file data/preparation/minireference/de/tsv/sophokle1v3soph_0085-biblio.tsv | ||
2022-01-13 15:59:21,748 - __main__ - INFO - Read input from file data/preparation/minireference/de/tsv/sophokle1v3soph_0125-biblio.tsv | ||
2022-01-13 15:59:21,749 - __main__ - INFO - Written sample to data/release/v0.1/HIPE-2022-ajmc_biblio-v0.1-sample-de.tsv | ||
2022-02-14 10:54:35,262 - __main__ - INFO - Created folder data/release/v1.0 as it did not exist | ||
2022-02-14 10:54:35,264 - __main__ - INFO - Read input from file data/preparation/minireference/en/tsv/cu31924087948174_0035.tsv | ||
2022-02-14 10:54:35,265 - __main__ - INFO - Read input from file data/preparation/minireference/en/tsv/cu31924087948174_0063.tsv | ||
2022-02-14 10:54:35,265 - __main__ - INFO - Read input from file data/preparation/minireference/en/tsv/sophoclesplaysa05campgoog_0014.tsv | ||
2022-02-14 10:54:35,266 - __main__ - INFO - Read input from file data/preparation/minireference/en/tsv/sophoclesplaysa05campgoog_0146.tsv | ||
2022-02-14 10:54:35,266 - __main__ - INFO - Read input from file data/preparation/minireference/en/tsv/sophoclesplaysa05campgoog_0288.tsv | ||
2022-02-14 10:54:35,268 - __main__ - INFO - Written sample to data/release/v1.0/HIPE-2022-v1.0-ajmc-sample-en.tsv | ||
2022-02-14 10:54:35,269 - __main__ - INFO - data/release/v1.0/HIPE-2022-v1.0-ajmc-sample-en.tsv contains all 5 expected documents | ||
2022-02-14 10:54:35,269 - __main__ - INFO - Read input from file data/preparation/minireference/en/tsv/cu31924087948174_0035-biblio.tsv | ||
2022-02-14 10:54:35,270 - __main__ - INFO - Read input from file data/preparation/minireference/en/tsv/cu31924087948174_0063-biblio.tsv | ||
2022-02-14 10:54:35,270 - __main__ - INFO - Read input from file data/preparation/minireference/en/tsv/sophoclesplaysa05campgoog_0014-biblio.tsv | ||
2022-02-14 10:54:35,271 - __main__ - INFO - Read input from file data/preparation/minireference/en/tsv/sophoclesplaysa05campgoog_0146-biblio.tsv | ||
2022-02-14 10:54:35,271 - __main__ - INFO - Read input from file data/preparation/minireference/en/tsv/sophoclesplaysa05campgoog_0288-biblio.tsv | ||
2022-02-14 10:54:35,272 - __main__ - INFO - Written sample to data/release/v1.0/HIPE-2022-v1.0-ajmc_biblio-sample-en.tsv | ||
2022-02-14 10:54:35,273 - __main__ - INFO - Read input from file data/preparation/minireference/de/tsv/Wecklein1894_0007.tsv | ||
2022-02-14 10:54:35,274 - __main__ - INFO - Read input from file data/preparation/minireference/de/tsv/Wecklein1894_0016.tsv | ||
2022-02-14 10:54:35,274 - __main__ - INFO - Read input from file data/preparation/minireference/de/tsv/Wecklein1894_0080.tsv | ||
2022-02-14 10:54:35,274 - __main__ - INFO - Read input from file data/preparation/minireference/de/tsv/Wecklein1894_0087.tsv | ||
2022-02-14 10:54:35,275 - __main__ - INFO - Read input from file data/preparation/minireference/de/tsv/sophokle1v3soph_0017.tsv | ||
2022-02-14 10:54:35,276 - __main__ - INFO - Read input from file data/preparation/minireference/de/tsv/sophokle1v3soph_0049.tsv | ||
2022-02-14 10:54:35,276 - __main__ - INFO - Read input from file data/preparation/minireference/de/tsv/sophokle1v3soph_0085.tsv | ||
2022-02-14 10:54:35,276 - __main__ - INFO - Read input from file data/preparation/minireference/de/tsv/sophokle1v3soph_0125.tsv | ||
2022-02-14 10:54:35,277 - __main__ - INFO - Written sample to data/release/v1.0/HIPE-2022-v1.0-ajmc-sample-de.tsv | ||
2022-02-14 10:54:35,279 - __main__ - INFO - data/release/v1.0/HIPE-2022-v1.0-ajmc-sample-de.tsv contains all 8 expected documents | ||
2022-02-14 10:54:35,280 - __main__ - INFO - Read input from file data/preparation/minireference/de/tsv/Wecklein1894_0007-biblio.tsv | ||
2022-02-14 10:54:35,280 - __main__ - INFO - Read input from file data/preparation/minireference/de/tsv/Wecklein1894_0016-biblio.tsv | ||
2022-02-14 10:54:35,281 - __main__ - INFO - Read input from file data/preparation/minireference/de/tsv/Wecklein1894_0080-biblio.tsv | ||
2022-02-14 10:54:35,281 - __main__ - INFO - Read input from file data/preparation/minireference/de/tsv/Wecklein1894_0087-biblio.tsv | ||
2022-02-14 10:54:35,282 - __main__ - INFO - Read input from file data/preparation/minireference/de/tsv/sophokle1v3soph_0017-biblio.tsv | ||
2022-02-14 10:54:35,282 - __main__ - INFO - Read input from file data/preparation/minireference/de/tsv/sophokle1v3soph_0049-biblio.tsv | ||
2022-02-14 10:54:35,283 - __main__ - INFO - Read input from file data/preparation/minireference/de/tsv/sophokle1v3soph_0085-biblio.tsv | ||
2022-02-14 10:54:35,283 - __main__ - INFO - Read input from file data/preparation/minireference/de/tsv/sophokle1v3soph_0125-biblio.tsv | ||
2022-02-14 10:54:35,284 - __main__ - INFO - Written sample to data/release/v1.0/HIPE-2022-v1.0-ajmc_biblio-sample-de.tsv |
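The "contains all N expected documents" messages above suggest that the release script checks each concatenated sample against its input files. A minimal sketch of such a check follows. It assumes that every document in the release TSV is introduced by a `# hipe2022:document_id = ...` comment line and that the document ID equals the input file name without its extension; both are assumptions made for illustration, and `document_ids` and `missing_documents` are hypothetical names, not the repository's own functions.

```python
from pathlib import Path

# Assumed metadata line that opens each document in the release TSV;
# adjust the prefix if the actual header differs.
DOC_ID_PREFIX = "# hipe2022:document_id = "


def document_ids(tsv_text):
    """Return the document IDs found in a concatenated TSV, in order of appearance."""
    return [
        line[len(DOC_ID_PREFIX):].strip()
        for line in tsv_text.splitlines()
        if line.startswith(DOC_ID_PREFIX)
    ]


def missing_documents(tsv_text, input_paths):
    """Return the expected document IDs (input file stems) absent from the TSV."""
    expected = [Path(path).stem for path in input_paths]
    found = set(document_ids(tsv_text))
    return [doc_id for doc_id in expected if doc_id not in found]
```

An empty list from `missing_documents` would correspond to the "contains all 5 expected documents" and "contains all 8 expected documents" messages in the log. No such message is logged for the `-biblio` samples, so the sketch makes no claim about how those are checked.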