From 5f8138a038797fe02e787a6c86b078662b7c8385 Mon Sep 17 00:00:00 2001
From: Simon Gray
Date: Tue, 30 Apr 2024 09:11:05 +0200
Subject: [PATCH] updated releases.md for release 2024-04-30

---
 pages/releases-da.md                    | 11 +++++++++--
 pages/releases-en.md                    |  7 +++++++
 src/main/dk/cst/dannet/db/bootstrap.clj |  4 ++--
 3 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/pages/releases-da.md b/pages/releases-da.md
index 97eff15..87f2839 100644
--- a/pages/releases-da.md
+++ b/pages/releases-da.md
@@ -1,5 +1,12 @@
 # Versioner
-De nye DanNet-versioner bruger udgivelsesdatoen som versionsnummer, formatteret som `YYYY-MM-DD`.
+De nye DanNet-versioner bruger udgivelsesdatoen som versionsnummer, formateret som `YYYY-MM-DD`.
+
+## **2024-04-30**: Forbedret CSV-eksport + andre små rettelser
+* CSV-eksporten er blevet forbedret ved...
+  1. ... at fjerne tilstedeværelsen af interne ID'er i `synsets.csv` (der henviser til ontologityper) og i stedet erstatte dem med den konkrete sammensætning af ontologityper.
+  2. .. at inkludere leksikale opslag i `words.csv`, som tidligere fejlagtigt blev udeladt.
+* Nogle leksikale opslag, der tidligere var af den generiske type `ontolex:LexicalEntry`, har nu mere specifikke typer, f.eks. `ontolex:Word`, `ontolex:MultiWordExpression` eller `ontolex:Affix`.
+* Nogle af ordklasserne for adjektiver tilføjet i udgivelsen `2023-05-11` manglede en ordklasse-relation og/eller forvekslede to separate relationstyper; dette er nu blevet rettet.
 
 ## **2023-11-28**: Korte etiketter
 * `dns:shortLabel`-varianter af de eksisterende synset-labels (udledt fra bl.a. ordfrekvenser fra [DDO](https://ordnet.dk/ddo)) er blevet tilføjet til DanNet-datasættet.
@@ -16,7 +23,7 @@ De nye DanNet-versioner bruger udgivelsesdatoen som versionsnummer, formatteret
 * Derudover er kønsdata fra de gamle versioner af DanNet også nu inkluderet. Det kan findes via den nye `dns:gender`-relation.
 * For bedre at kunne facilitere navigation af grafen på DanNet-hjemmesiden er en ny relation, `dns:linkedConcept`, blevet tilføjet til DanNet-skemaet. Denne relation er den omvendte relation af `wn:ili` og kan udledes i den store graf der kan udforskes på wordnet.dk/dannet.
 
-## **2023-06-01**: ~5000 links til Open English WordNet
+## **2023-06-01**: ~5000 links til Open English WordNet
 * Skemaoversættelserne er blevet opdateret.
 * Omtrent 5000 links er blevet tilføjet, som linker DanNet med [Open English WordNet](https://github.com/globalwordnet/english-wordnet) eller indirekte via [CILI](https://github.com/globalwordnet/cili).
 * OEWN-datasættet har fået et medfølgende datasæt der indeholder genererede etiketter for synsets, betydninger og ord.
diff --git a/pages/releases-en.md b/pages/releases-en.md
index daef4bf..c671732 100644
--- a/pages/releases-en.md
+++ b/pages/releases-en.md
@@ -1,6 +1,13 @@
 # Releases
 The newer DanNet releases use the release date as the version number, formatted as `YYYY-MM-DD`.
 
+## **2024-04-30**: Improved CSV export + other small fixes
+* The CSV export has been improved by...
+  1. ... removing the presence of internal IDs in `synsets.csv` (referring to ontological types) replacing them instead with the concrete mix of ontological types.
+  2. ... including lexical entries in `words.csv` which were previously erroneously excluded.
+* Some Lexical entries which were formerly of the generic `ontolex:LexicalEntry` type now have more specific types, e.g. `ontolex:Word`, `ontolex:MultiWordExpression`, or `ontolex:Affix`.
+* Some of the parts-of-speech for the adjectives added in release `2023-05-11` were missing a PoS relation and/or mixed up two separate relation types; this has now been fixed.
+
 ## **2023-11-28**: Short labels
 * `dns:shortLabel` variants of synset labels (derived from, amongst other things, word frequencies from [DDO](https://ordnet.dk/ddo)) have been added to the DanNet dataset.
 * `dns:source` is now used once again to link to the original dictionary entry sources such as DDO. The usage of `dc:source` was both problematic wrt. its definition in the schema, as well the annoying fact that `dc` in some cases results in confusion when used as an RDF prefix as it may be hardcoded to a specific IRI.
diff --git a/src/main/dk/cst/dannet/db/bootstrap.clj b/src/main/dk/cst/dannet/db/bootstrap.clj
index 2a57b41..b2934a6 100644
--- a/src/main/dk/cst/dannet/db/bootstrap.clj
+++ b/src/main/dk/cst/dannet/db/bootstrap.clj
@@ -92,7 +92,7 @@
   "2023-11-28")
 
 (def current-release
-  (str "2023-11-28" "-SNAPSHOT"))
+  (str "2024-04-30" #_"-SNAPSHOT"))
 
 (defn assert-expected-dannet-release!
   "Assert that the DanNet `model` is the expected release to boostrap from."
@@ -342,7 +342,7 @@
   This function survives between releases, but the functions it calls are all
   considered temporary and should be deleted when the release comes."
   [dataset]
-  (let [expected-release "2023-11-28-SNAPSHOT"]
+  (let [expected-release "2024-04-30"]
     (assert (= current-release expected-release)) ; another check
     (println "Applying release changes for" expected-release "...")
     (assign-specific-lexical-entry! dataset)