All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
- fix PopulationComparisonProcessor: fix measurement of Absolute Coveredness to not increase in case of a duplicated resource in one dataset without a corresponding resource in any other dataset
- changed PropertyComparisonProcessor: use deduplicated values for determining deviations and omissions to avoid duplicated reporting
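The effect of deduplicating values before reporting can be illustrated with a minimal Python sketch. It is not ABECTO's Java implementation: the plain lists of values and the function names `naive_omissions` and `deduplicated_omissions` are assumptions made for this example only.

```python
def naive_omissions(own_values, other_values):
    """Report every occurrence of a value that the other dataset has and we lack."""
    return [value for value in other_values if value not in own_values]


def deduplicated_omissions(own_values, other_values):
    """Report each missing value once by comparing deduplicated value sets."""
    return sorted(set(other_values) - set(own_values))


# Hypothetical values of one variable for corresponding resources.
own = ["Berlin"]
other = ["Berlin", "Hamburg", "Hamburg"]

reported_naively = naive_omissions(own, other)        # "Hamburg" appears twice
reported_once = deduplicated_omissions(own, other)    # "Hamburg" appears once
```

With duplicated values in the other dataset, the naive variant reports the same omission twice, while the deduplicated variant reports it once.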
3.1.1 - 2024-11-13
- fix wdMismatchFinder report: suppress empty strings as external value
3.1.0 - 2024-11-12
- fix wdMismatchFinder report: adjust to changed mismatches import file format
- extend wdMismatchFinder report: add reporting of value deviations for Wikidata qualifiers
3.0.1 - 2024-11-05
- fix PropertyComparisonProcessor: fix measurement of Absolute Coveredness
3.0.0 - 2024-10-04
- add measure Absolute Coveredness measured by PopulationComparisonProcessor and PropertyComparisonProcessor
- add measure Relative Coveredness measured by PopulationComparisonProcessor and PropertyComparisonProcessor
- add measure Duplicate Count measured by PopulationComparisonProcessor and PropertyComparisonProcessor
- add value filter condition to measurement reports by PropertyComparisonProcessor
- Breaking Change: use av:ResourceDuplicate instead of av:Issue to report duplicated resources
- upgrade to ABECTO Vocabulary version 1.3.1
- code refactoring
2.2.2 - 2024-06-08
- fix reports
2.2.1 - 2024-06-07
- increase minimum Java version to Java 17
- upgrade Apache Jena to v5.0.0
- upgrade Guava to v33.2.1-jre
2.2.0 - 2024-03-04
Note: This release causes a minor version increment, as the previous release wrongly caused a patch version increment only.
- fix measurementsMarkdown report: working together with parameter --reportOn
2.1.4 - 2024-03-01
- add report deviationsMarkdown
- fix a bug that caused a ClassCastException in the correspondence groups stream returned by Processor#getCorrespondenceGroups() if a literal is present in a used predefined metadata graph
2.1.3 - 2024-02-06
- fix PopulationComparisonProcessor and PropertyComparisonProcessor: avoid unnecessary rounding
- fix PopulationComparisonProcessor: align error rate parameter to PropertyComparisonProcessor
2.1.2 - 2023-12-13
- fix docker build
2.1.1 - 2023-12-09
- add benchmarks for PopulationComparisonProcessor and PropertyComparisonProcessor
- split project into subprojects abecto-core and abecto-benchmark
2.1.0 - 2023-04-17
- do not use av:relevantResource anymore
2.0.1 - 2023-04-15
- fix measurementsMarkdown report: restored dqv:computedOn erroneously replaced by av:associatedDataset
2.0.0 - 2023-04-15
- Breaking Change: merge LiteralValueComparisonProcessor and ResourceValueComparisonProcessor into PropertyComparisonProcessor to enable comparison of variables permitting literal and non-literal values
- change PropertyComparisonProcessor: changed absolute coverage measure and relative coverage measure to be fully based on deduplicated values
- change PropertyComparisonProcessor: changed count measure, deduplicated count measure, absolute coverage measure and relative coverage measure to ignore excluded values
- fix PropertyComparisonProcessor: fix handling for NaN values
- fix PropertyComparisonProcessor: fix deduplicated count measure to not subtract duplicated values from count twice
- Breaking Change: use av:associatedDataset instead of dqv:computedOn for av:MetaDataGraph (exports for existing results will not work anymore)
- hotfix SparqlSourceProcessor: filter statements containing IRIs with Newline (U+000A) character to work around DBpedia extraction framework Issue 748 and JENA-2351
- Breaking Change: removed reporting of unexpected type issues by PropertyComparisonProcessor
- Breaking Change: removed deprecated CompletenessProcessor (replaced by PopulationComparisonProcessor)
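The SparqlSourceProcessor hotfix in this release drops statements whose IRIs contain a newline character. The following Python sketch illustrates the idea only. It is not ABECTO's Java implementation: representing a statement as a tuple of IRI strings and the function name `drop_newline_iri_statements` are assumptions made for this example.

```python
NEWLINE = "\u000A"


def drop_newline_iri_statements(statements):
    """Return only the statements in which no term contains U+000A.

    Assumption: each statement is a (subject, predicate, object) tuple of
    IRI strings; a real RDF library would use dedicated node types.
    """
    return [
        statement
        for statement in statements
        if not any(NEWLINE in term for term in statement)
    ]


# Hypothetical input: the second statement has a broken object IRI.
statements = [
    ("http://example.org/a", "http://example.org/p", "http://example.org/b"),
    ("http://example.org/a", "http://example.org/p", "http://example.org/bro\nken"),
]
kept = drop_newline_iri_statements(statements)
```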
1.2.0 - 2023-03-09
- fix LiteralValueComparisonProcessor and ResourceValueComparisonProcessor: add handling for datasets not covering a compared variable
- fix PopulationComparisonProcessor: add handling for datasets not containing any resource for a compared aspect
- fix PopulationComparisonProcessor: report count value instead of deduplicated count value for count measure
- extend LiteralValueComparisonProcessor and ResourceValueComparisonProcessor: report deduplicated count measure
- extend PopulationComparisonProcessor: report deduplicated count measure
1.1.0 - 2023-02-02
- extend LiteralValueComparisonProcessor: add calculation of measurement count, absolute coverage, relative coverage and estimated completeness per variable
- extend ResourceValueComparisonProcessor: add calculation of measurement count, absolute coverage, relative coverage and estimated completeness per variable
- extend measurementsMarkdown export: add support for measurements with affected variable
- extend wdMismatchFinder export: enable reporting of missing values
- fix LiteralValueComparisonProcessor: possibly missing value deviations or value omissions in case of duplicated resources in other dataset
- fix ResourceValueComparisonProcessor: possibly missing value deviations or value omissions in case of duplicated resources in other dataset
- fix LiteralValueComparisonProcessor: possibly additional value deviations or value omissions in case of duplicated resources in the dataset
- fix ResourceValueComparisonProcessor: possibly additional value deviations or value omissions in case of duplicated resources in the dataset
- fix wdMismatchFinder export: adjust to format changes (T288511, T313468)
- renamed CompletenessProcessor into PopulationComparisonProcessor: deprecated dummy CompletenessProcessor class remains to avoid a breaking change
- change CompletenessProcessor/PopulationComparisonProcessor: avoid relativeCoverage measurement in case of no values in other dataset
- change CompletenessProcessor/PopulationComparisonProcessor: increase precision of measurement results from 2 to 16 digits
1.0.1 - 2022-08-18
- fix report wdMismatchFinder: remove < and > around external_url values
1.0.0 - 2022-08-17
- extend @Parameter annotation: add parameter converter expecting an implementation of the Jackson Converter interface and make use of it during processor initialization, to enable early execution failures due to invalid parameter values
- improve FileSourceProcessor: improve parser error logging
- Breaking Change: renamed FBRuleReasoningProcessor into ForwardRuleReasoningProcessor
- Breaking Change: SparqlConstructProcessor and SparqlSourceProcessor expect datatype xsd:string instead of av:SparqlQuery for parameter query, to ease configuration writing
- Breaking Change: av:VariablePaths of aspects expect datatype xsd:string instead of av:SparqlPropertyPath for the property av:propertyPath
- Breaking Change: removed support for the RDF datatypes av:SparqlPropertyPath and av:SparqlQuery
- remove custom xsd:dateTimeStamp mapping
0.10.0 - 2022-07-19
- extend report deviations: add column snippetToAnnotateValueComparedToAsWrong to ease wrong value annotation for future plan executions
- fix Step: consider associated dataset of predefined metadata graphs
- fix report engine: avoid character escaping of literals
0.9.1 - 2022-07-15
- fix wdMismatchFinder report: remove datatypes of wikidata_value and external_value
0.9.0 - 2022-07-15
- add result export template measurementsMarkdown
- fix wdMismatchFinder report: exclude not yet supported mismatches of (alternative) labels or missing values and qualifier values
0.8.0 - 2022-07-01
- extend LiteralValueComparisonProcessor: add parameter allowLangTagSkip to enable comparison of values from sources using and not using language tags
0.7.2 - 2022-06-30
- fix all reports
0.7.1 - 2022-06-30
- fix UsePresentMappingProcessor: fix logging
- fix CompletenessProcessor: add handling of missing aspect coverage by a dataset
- fix LiteralValueComparisonProcessor: add handling of missing aspect coverage by a dataset
- fix ResourceValueComparisonProcessor: add handling of missing aspect coverage by a dataset
- fix JaroWinklerMappingProcessor: add handling of missing aspect coverage by a dataset
- fix JaroWinklerMappingProcessor: update similarity library fixing a bug that might cause a lower similarity between resources with several values of the compared variable
- fix all reports: disable special character escaping
- hotfix for https://issues.apache.org/jira/browse/JENA-2335
0.7.0 - 2022-06-24
- extend FileSourceProcessor: permit multiple path parameter values
- rename RdfFileSourceProcessor into FileSourceProcessor
- improve UsePresentMappingProcessor: improve logging on missing aspect patterns
0.6.0 - 2022-06-23
- add CLI parameters --failOnDeviation, --failOnValueOmission, --failOnResourceOmission, --failOnWrongValue and --failOnIssue to enable exit code 1 in case of deviations/value omissions/resource omissions/wrong values/other issues
- add CLI parameter --reportOn to enable limited scope of reports and exit codes to a single dataset
- add result export template mappingReview
- extend UrlSourceProcessor: permit multiple url parameter values
- extend result export template resourceOmission: sort results, add optional rdfs:label
- fix result export template deviations: remove duplicated columns
- fix UrlSourceProcessor: use correct URL for request in heuristic language detection mode; enable followRedirects in brute force language detection mode
- fix all reports: include report templates into JAR
- TRIG output avoids base and empty prefix to ease result reading
0.5.0 - 2022-05-12
- add FBRuleReasoningProcessor: Generates statements using custom rules
- extend SparqlSourceProcessor: add retries on failures configurable with parameters chunkSizeDecreaseFactor and maxRetries
- extend SparqlSourceProcessor: add parameter followInverseUnlimited
- extend SparqlSourceProcessor: add parameter ignoreInverse
- extend SparqlSourceProcessor: add rdf:first and rdf:rest to default followUnlimited values
- extend LiteralValueComparisonProcessor: add parameter languageFilterPatterns
- extend LiteralValueComparisonProcessor: add parameter allowTimeSkip to enable date part of xsd:date and xsd:dateTime
- extend MappingProcessor: add persisting of transitive correspondences
- extend Parameters: enable use of different Collection subtypes for Processor Parameters
- extend Aspect: enable use of one pattern for multiple datasets
- extend logging: log Step processing start and completion
- extend logging: log CLI execution phases
- extend built in documentation (--help)
- add result export engine
- add result export template deviations
- add result export template resourceOmissions
- add result export template wdMismatchFinder (see Wikidata Mismatch Finder file format)
- add extraction of property path between key variable and other variables to make them available for exports
- add reuse of sources prefix definitions for the output file
- add CLI parameter --loadOnly to enable reuse of execution output for report generation
- fix EquivalentValueMappingProcessor: fix message format
- fix EquivalentValueMappingProcessor: skip resource with unbound variables
- fix SparqlSourceProcessor: enable arbitrary query lengths by updating Apache Jena fixing JENA-2257
- fix SparqlSourceProcessor: fix expected datatypes of some parameters
- fix SparqlSourceProcessor: increased compatibility to SPARQL endpoint implementations
- improve performance of several mapping and comparison processors
- changed logging format
- changed UsePresentMappingProcessor: replaced parameter assignmentPaths expecting SPARQL Property Paths with parameter variable expecting an aspect variable name
0.4.0 - 2022-01-12
- add EquivalentValueMappingProcessor: Provides correspondences based on equivalent values.
- rename vocabulary resource av:SparqlSelectQuery into av:SparqlQuery
- fix UrlSourceProcessor: parameter value can now be set
- fix UrlSourceProcessor: uses accept headers to explicitly request RDF in case of Content Negotiation
- fix output av:relevantResource statements in case of blank node aspects
0.3.0 - 2022-01-10
- transform ABECTO from a webservice into a command line tool with RDF dataset files as input and output
- merge CategoryCountProcessor into CompletenessProcessor
- rename RelationalMappingProcessor into FunctionalMappingProcessor
- rename LiteralDeviationProcessor into LiteralValueComparisonProcessor
- rename ResourceDeviationProcessor into ResourceValueComparisonProcessor
- rename categories into aspects
- rename mappings into correspondences
- add SparqlSourceProcessor: Extracts RDF from a SPARQL endpoint.
- remove ManualMappingProcessor: Predefined correspondences can be stated in the configuration directly.
- remove ManualCategoryProcessor: Aspects can be stated in the configuration directly.
- remove TransitiveMappingProcessor: Transitive correspondences are now added automatically
- remove Jupyter Notebook support to control ABECTO
- remove OpenlletReasoningProcessor: Enable publishing as packed binary version
0.2.1 - 2020-12-03
- fix HTML output in Jupyter Notebooks: resolve misplaced </div>
0.2.0 - 2020-12-03
- add UrlSourceProcessor: Loads an RDF document from a URL.
- add ExecutionRestController#getMetadata: return metadata of loaded sources used in this execution
- add UsePresentMappingProcessor: Provides mappings for resources connected in the ontologies with given property paths.
- add TransitiveMappingProcessor: Provides transitive closure of existing mappings.
- add CompletenessProcessor: Provides absolute and relative coverage statistics, omission detection, and duplicate detection of resources by category and ontologies.
- extend SparqlConstructProcessor: enable recursive generation of new triples with SPARQL Construct Query and add parameter maxIterations with default value 1
- extend Measurement Report for Jupyter Notebooks: alphabetical order of measurements, alphabetical order of dimensions, replace ontology UUIDs with ontology names in dimension columns
- add Omission Report for Jupyter Notebooks
- extend JaroWinklerMappingProcessor: add parameter defaultLangTag used as fallback locale for LowerCase conversion during case-insensitive mapping
- add /version API call returning the version of ABECTO
- add Mapping Report in Jupyter Notebooks: replacing heavy-weighted Mapping Review
- fix JaroWinklerMappingProcessor: ignore other categories, enable case-insensitive mapping
- fix Category: getPatternVariables() no longer returns helper Var for BlankNodePropertyLists and BlankNodePropertyListPaths introduced by Apache Jena, which cause Exceptions in CategoryCountProcessor
- fix Measurement and Omission: use abecto:ontology instead of abecto:knowledgeBase
- fix Measurement Report in Jupyter Notebooks: no dimensions column header concatenation of multiple measurement types
- fix AbstractRefinementProcessor: disable RDFS reasoning on input ontologies
- fix LiteralDeviationProcessor: correct handling of float and double, enable multiple values of same property
- fix Deviation Report in Jupyter Notebooks: solve omission of deviations
- fix HTML output in Jupyter Notebooks: add line-breaks to enable git diff for result
- remove Mapping Review in Jupyter Notebooks: replaced by simple Mapping Report
0.1.1 - 2020-05-29
- fix LiteralDeviationProcessor: support numerical value Infinite, correct mixed numeric type comparison, address precision issues of mixed number type comparison
- fix Deviation Report in Jupyter Notebooks: added missing IRI of resources with 2 or more deviations
- fix RelationalMappingProcessor: skip candidates with missing value for a variable, fix missing mappings in case of at least 3 incomplete ontologies
- fix ExecutionRestController#getData: include transformation nodes data
- fix ManualCategoryProcessor: allow empty parameters
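The precision issues of mixed number type comparison mentioned for LiteralDeviationProcessor can be reproduced in a few lines. The Python sketch below is an illustration only, not ABECTO's Java code: narrowing the more precise operand before comparing is one possible approach assumed here, and the function names `to_float32` and `equal_at_lower_precision` are hypothetical.

```python
import struct


def to_float32(value):
    """Round a 64-bit Python float to the nearest 32-bit float value."""
    return struct.unpack("f", struct.pack("f", value))[0]


def equal_at_lower_precision(single, double):
    """Compare a 32-bit value with a 64-bit value at 32-bit precision.

    Assumption: narrowing the 64-bit operand is one way to compare mixed
    numeric types; the changelog does not state the method actually used.
    """
    return single == to_float32(double)


single = to_float32(0.1)  # 0.1 as it would be stored in a 32-bit float
naive_equal = single == 0.1  # widened 32-bit value differs from 64-bit 0.1
narrowed_equal = equal_at_lower_precision(single, 0.1)
```

The naive comparison fails because the 32-bit representation of 0.1, once widened, is not the same number as the 64-bit representation, while the comparison at 32-bit precision treats both as equal.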
0.1.0 - 2020-05-05
- add RdfFileSourceProcessor: Loads an RDF document from the local file system.
- add JaroWinklerMappingProcessor: Provides mappings based on the Jaro-Winkler Similarity of string property values using our implementation from Efficient Bounded Jaro-Winkler Similarity Based Search.
- add ManualMappingProcessor: Enables users to manually adjust the mappings by providing or suppressing mappings.
- add RelationalMappingProcessor: Provides mappings based on the mappings of referenced resources.
- add OpenlletReasoningProcessor: Infers the logical consequences of the input RDF models utilizing the Openllet Reasoner to generate additional triples.
- add SparqlConstructProcessor: Applies a given SPARQL Construct Query to the input RDF models to generate additional triples.
- add CategoryCountProcessor: Measures the number of resources and property values per category.
- add LiteralDeviationProcessor: Detects deviations between the property values of mapped resources as defined in the categories.
- add ManualCategoryProcessor: Enables users to manually define resource categories and their properties.
- add ResourceDeviationProcessor: Detects deviations between the resource references of mapped resources as defined in the categories.