ckanext-fairdatapoint

A CKAN extension that provides a harvester for FAIR Data Points. In the future, this extension might also support the FAIR Data Point API.

Stages

The harvester runs in three stages. Each of these stages can be modified.

Gather stage. The gather stage uses the FairDataPointRecordProvider, which implements the IRecordProvider interface, to create a list of identifiers of the objects to include in the harvest. For a FAIR data point, this list includes catalogs and datasets. In the future, collections could be added.
Fetch stage. The fetch stage downloads the actual source data. In this stage, additional data from other sources may be included to better suit the DCAT profile expected by CKAN.
Import stage. The import stage does the actual import. So-called application profiles determine how the RDF from the FAIR data point is mapped to CKAN packages and resources. If a FAIR data point uses custom fields, a profile must be created for it. A profile can be defined as a Python class in the ckanext.fairdatapoint.profiles.py file. The new profile must be registered in the [ckan.rdf.profiles] section of setup.py. Which profile is used for a particular harvester is determined by the harvester configuration:

{ "profiles": "fairdatapoint_dcat_ap" }
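The flow through the three stages can be sketched in plain Python. This is only an illustration and not the extension's actual code: the `RecordProvider` protocol, its method names, `InMemoryProvider`, `title_profile` and the example URLs are all hypothetical stand-ins, and the real harvester works with RDF and the CKAN harvest framework rather than with strings and dicts.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Protocol


class RecordProvider(Protocol):
    """Hypothetical stand-in for the role a record provider plays (assumed method names)."""

    def get_record_ids(self) -> List[str]: ...

    def get_record_by_id(self, record_id: str) -> str: ...


@dataclass
class InMemoryProvider:
    """Toy provider backed by a dict instead of a real FAIR data point."""

    records: Dict[str, str] = field(default_factory=dict)

    def get_record_ids(self) -> List[str]:
        return sorted(self.records)

    def get_record_by_id(self, record_id: str) -> str:
        return self.records[record_id]


def gather(provider: RecordProvider) -> List[str]:
    # Gather stage: collect the identifiers of everything to harvest.
    return provider.get_record_ids()


def fetch(provider: RecordProvider, record_ids: List[str]) -> Dict[str, str]:
    # Fetch stage: download the source data for each identifier.
    return {rid: provider.get_record_by_id(rid) for rid in record_ids}


def import_records(fetched: Dict[str, str],
                   profile: Callable[[str, str], dict]) -> List[dict]:
    # Import stage: a profile decides how each record becomes a package dict.
    return [profile(rid, raw) for rid, raw in fetched.items()]


def title_profile(record_id: str, raw: str) -> dict:
    # Hypothetical profile: derive a package name from the last URL segment.
    return {"name": record_id.rsplit("/", 1)[-1], "notes": raw}


provider = InMemoryProvider({
    "https://fdp.example.org/dataset/b": "Dataset B",
    "https://fdp.example.org/catalog/a": "Catalog A",
})
packages = import_records(fetch(provider, gather(provider)), title_profile)
```

Swapping `title_profile` for another callable changes the mapping without touching the gather or fetch steps, which mirrors why custom fields are handled by adding a profile.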

To run the harvester from the command line:

ckan --config=<full path to CKAN ini-file> harvester run-test <id of harvester>

To rebuild the search index in case it is not automatically updated after clearing all packages from a harvester:

ckan --config=<full path to CKAN ini-file> search-index rebuild
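If these commands need to be scripted, the argument lists can be assembled in Python before handing them to `subprocess.run`. The sketch below only builds the two commands shown above; the helper names and the example ini path and harvester id are illustrative assumptions, not part of the extension.

```python
import shlex
from typing import List


def ckan_command(ini_path: str, *args: str) -> List[str]:
    # Build an argv list for the ckan CLI with an explicit config file.
    return ["ckan", f"--config={ini_path}", *args]


def harvester_run_test(ini_path: str, harvester_id: str) -> List[str]:
    # Equivalent of: ckan --config=<ini> harvester run-test <id>
    return ckan_command(ini_path, "harvester", "run-test", harvester_id)


def search_index_rebuild(ini_path: str) -> List[str]:
    # Equivalent of: ckan --config=<ini> search-index rebuild
    return ckan_command(ini_path, "search-index", "rebuild")


# Example values; replace with your own ini path and harvester id.
run_cmd = harvester_run_test("/etc/ckan/default/ckan.ini", "my-harvester")
rebuild_cmd = search_index_rebuild("/etc/ckan/default/ckan.ini")
print(shlex.join(run_cmd))
print(shlex.join(rebuild_cmd))
```

Passing one of these lists to `subprocess.run(cmd, check=True)` would execute it and raise if the command fails.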

For more information, go to GDI harvester information

Requirements

Compatibility with core CKAN versions:

CKAN version	Compatible?
2.10	tested

Installation

TODO: Add any additional install steps to the list below. For example installing any non-Python dependencies or adding any required config settings.

To install gdi-userportal-ckanext-fairdatapoint:

Activate your CKAN virtual environment, for example:

. /usr/lib/ckan/default/bin/activate

Clone the source and install it on the virtualenv:

git clone https://github.com/GenomicDataInfrastructure/gdi-userportal-ckanext-fairdatapoint.git
cd gdi-userportal-ckanext-fairdatapoint
pip install -e .
pip install -r requirements.txt

Add fairdatapoint to the ckan.plugins setting in your CKAN config file (by default the config file is located at /etc/ckan/default/ckan.ini).

Restart CKAN. For example, if you've deployed CKAN with Apache on Ubuntu:

sudo service apache2 reload

Config settings

Catalog harvesting

The setting ckanext.fairdatapoint.harvest_catalogs controls whether catalogs are harvested. It defaults to false. If set to true, CKAN will harvest catalogs as datasets.

The setting can be overridden per harvester by setting "harvest_catalogs": "true" or "harvest_catalogs": "false" in the harvester configuration JSON.

Label resolving

The harvester can resolve labels for fields whose value is a (resolvable) URI, such as Wikidata entities. This is controlled by the setting ckanext.fairdatapoint.resolve_labels. It defaults to true, but you can disable it globally by explicitly setting it to false.

The setting can be overridden per harvester by setting "resolve_labels": "true" or "resolve_labels": "false" in the harvester configuration JSON.
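The precedence described for both settings (harvester configuration first, then the global CKAN setting, then the built-in default) can be sketched as follows. This is a minimal illustration, not the extension's code: the function names and the set of strings treated as true are assumptions, and the global config is modelled as a plain dict.

```python
import json
from typing import Mapping

# Assumption: which string values count as "true" is chosen for this sketch.
TRUE_VALUES = {"true", "1", "yes"}


def as_bool(value) -> bool:
    # Accept real booleans as well as strings such as "true" / "false".
    if isinstance(value, bool):
        return value
    return str(value).strip().lower() in TRUE_VALUES


def effective_setting(key: str, harvester_config_json: str,
                      global_config: Mapping[str, str], default: bool) -> bool:
    # 1. A value in the harvester configuration JSON wins.
    harvester_config = json.loads(harvester_config_json) if harvester_config_json else {}
    if key in harvester_config:
        return as_bool(harvester_config[key])
    # 2. Otherwise fall back to the global ckanext.fairdatapoint.* setting.
    global_key = f"ckanext.fairdatapoint.{key}"
    if global_key in global_config:
        return as_bool(global_config[global_key])
    # 3. Otherwise use the documented default.
    return default


harvester_json = '{"profiles": "fairdatapoint_dcat_ap", "resolve_labels": "false"}'
global_settings = {"ckanext.fairdatapoint.harvest_catalogs": "true"}

resolve = effective_setting("resolve_labels", harvester_json, global_settings, True)
catalogs = effective_setting("harvest_catalogs", harvester_json, global_settings, False)
```

Here `resolve` is false because the harvester configuration overrides the default, while `catalogs` is true because only the global setting is present.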

Developer installation

To install ckanext-fairdatapoint for development, activate your CKAN virtualenv and do:

git clone https://github.com/GenomicDataInfrastructure/gdi-userportal-ckanext-fairdatapoint.git
cd gdi-userportal-ckanext-fairdatapoint
python setup.py develop
pip install -r dev-requirements.txt

The fairdatapoint plugin depends on ckanext-scheming, ckanext-harvest and ckanext-dcat. Make sure these are installed; if they are not, run:

pip install -e 'git+https://github.com/ckan/ckanext-scheming.git@release-3.0.0#egg=ckanext-scheming[requirements]'
pip install -e 'git+https://github.com/ckan/ckanext-harvest.git@v1.6.0#egg=ckanext-harvest[requirements]'
pip install -e 'git+https://github.com/ckan/ckanext-dcat.git@v2.1.0#egg=ckanext-dcat'
pip install -r https://raw.githubusercontent.com/ckan/ckanext-dcat/v2.1.0/requirements.txt

Tests

To run the tests, go to GDI harvester test information

Releasing a new version of ckanext-fairdatapoint

If ckanext-fairdatapoint should be available on PyPI you can follow these steps to publish a new version:

Update the version number in the setup.py file. See PEP 440 for how to choose version numbers.
Make sure you have the latest versions of the necessary packages:

pip install --upgrade setuptools wheel twine
Create source and binary distributions of the new version:
```
python setup.py sdist bdist_wheel && twine check dist/*
```
Fix any errors you get.
Upload the distributions to PyPI:
```
twine upload dist/*
```
Commit any outstanding changes:
```
git commit -a
git push
```
Tag the new release of the project on GitHub with the version number from the setup.py file. For example, if the version number in setup.py is 0.0.1, then do:
```
git tag 0.0.1
git push --tags
```
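Before tagging, it can help to confirm that the tag really matches the version in setup.py. The check below is a rough sketch under the assumption that setup.py contains a literal `version="..."` keyword argument; the helper names are hypothetical and the sample text stands in for the real file contents.

```python
import re
from typing import Optional


def version_from_setup(setup_py_text: str) -> Optional[str]:
    # Look for a literal version='...' or version="..." assignment.
    match = re.search(r"""version\s*=\s*['"]([^'"]+)['"]""", setup_py_text)
    return match.group(1) if match else None


def tag_matches_version(tag: str, setup_py_text: str) -> bool:
    # True only when a version was found and it equals the tag exactly.
    return version_from_setup(setup_py_text) == tag


# Sample text; in practice this would be read from the setup.py file.
sample_setup = 'setup(name="ckanext-fairdatapoint", version="0.0.1")'
found = version_from_setup(sample_setup)
ok = tag_matches_version("0.0.1", sample_setup)
```

A version computed at build time or read from another file would not be detected by this pattern.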

License

This work is licensed under multiple licenses. Because keeping this section up-to-date is challenging, here is a brief summary as of January 2024:

All original source code is licensed under AGPL.
All documentation is licensed under CC-BY-4.0.
Some configuration and data files are licensed under CC-BY-4.0.
For more accurate information, check the individual files.
