From ff1ebecd4738113c954ccda99d3088581b406c14 Mon Sep 17 00:00:00 2001
From: ci-bot
SciCat with docker compose.

Get set up with an instance of SciCat to explore the metadata catalog. SciCatLive provides a flexible and easy way to learn about SciCat and its features, aimed at people who are looking to integrate SciCat into their environment. For a user guide, please see the original documentation.

This project requires docker and docker compose. Docker compose version 2.29.0 or later is required to support this project.
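As a quick sanity check, the minimum-version requirement can be verified from the shell. This is an illustrative sketch: the `current` value is hard-coded here and would normally come from `docker compose version --short`.

```shell
# Sanity-check the Docker Compose version against the 2.29.0 minimum.
# Illustrative: substitute  current=$(docker compose version --short)
required="2.29.0"
current="2.31.0"
# `sort -V` orders version strings; if the smallest of the two is `required`,
# the installed version is new enough.
lowest=$(printf '%s\n%s\n' "$required" "$current" | sort -V | head -n1)
if [ "$lowest" = "$required" ]; then
  echo "compose version OK"
else
  echo "compose too old" >&2
fi
```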
SciCat has extra features as part of its core, as well as integrations with external services.

SciCat features that extend the backend are:
* Jobs - this mechanism posts to a message broker, which can then trigger downstream processes. To use this, a RabbitMQ server is enabled.
* Elasticsearch - creates an elasticsearch service to provide full-text search in the backend.

Services that can be integrated with SciCat are:
* LDAP - authentication and authorization from an LDAP server
* OIDC - authentication and authorization using an OIDC provider
* SearchAPI - for better free-text search in the metadata, based on the PANOSC search-api
* JupyterHub - adds an instance of JupyterHub which demonstrates ingestion and extraction of metadata using pyscicat
+
To enable extra services, configure them by:
1. setting docker compose env variables
2. using docker compose profiles
3. modifying the service-specific config
4. adding entrypoints
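As a minimal sketch, mechanisms (1) and (2) are both just environment variables from the shell's point of view. `JOBS_ENABLED` and `COMPOSE_PROFILES` are options documented in this README; the values here are illustrative.

```shell
# Mechanisms 1 and 2 are plain environment variables read by docker compose.
export JOBS_ENABLED=true          # 1. a docker compose env variable
export COMPOSE_PROFILES=analysis  # 2. a docker compose profile
echo "$JOBS_ENABLED $COMPOSE_PROFILES"
# afterwards one would run: docker compose up -d
```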
First stable version
Each service is exposed through the proxy at https://${service}.localhost/${prefix}. For example, the backend API can be explored through a Swagger UI at http://backend.localhost/explorer. For more information on the paths used by these routes, see the original documentation for these services.
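To illustrate the routing convention, the URL for any service can be assembled from its name and a path prefix. The `api/v3` prefix is an illustrative value, not necessarily a real backend route.

```shell
# Assemble a service URL following the ${service}.localhost/${prefix} convention.
service="backend"
prefix="api/v3"   # illustrative prefix
url="https://${service}.localhost/${prefix}"
echo "$url"
```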
Extra services and features
Dependencies

Here below we show the dependencies, including those of the extra services (if B depends on A, then we visualize it as A --> B):

graph TD
    subgraph services
        subgraph backend
            backends[v3*/v4*]
        end
        mongodb --> backend
        backend --> frontend
        backend --> searchapi
        backend --> jupyter
    end

    proxy -.- services

We flag with * the services which have extra internal dependencies, which are not shared.
Select the services

The user can selectively decide which containers to spin up, and the dependencies will be resolved accordingly. The available services are in the services folder and are named consistently.

For example, one could decide to only run the backend (be aware that this will not run the proxy, so the service will not be available at backend.localhost):

docker compose up -d backend

(or a list of services, for example, with the proxy: docker compose up -d backend proxy)

This will run, from the previous section, (1) and (2) but skip the rest.

docker compose up -d frontend

will run, from the previous section, (1), (2) and (4) but skip (5).

And

docker compose --profile search up -d searchapi

will run, from the previous section, (1) and (2), skip (3) and (4), and add the searchapi service.

Make sure to check the backend compatibility when choosing services and setting docker compose env vars and profiles.
For example, to use the Jobs functionality of SciCat, change JOBS_ENABLED to true before running your docker compose command, or simply export it in the shell. For all env configuration options see here.

For example,

docker compose --profile analysis

sets up a JupyterHub with some notebooks for ingesting data into SciCat, as well as the related services (backend, mongodb, proxy). For more information on the profiles available in SciCatLive see the following table.
First stable version

Release v3.0 is the first stable and reviewed version of SciCatLive.
Running this project on Windows is not officially supported; you should use Windows Subsystem for Linux (WSL). However, if you want to run it on Windows, you have to be careful about:
- This project makes use of symbolic links; Windows and git for Windows have to be configured to handle them.
- End of lines, specifically in shell scripts. If you have the git config parameter auto.crlf set to true, git will replace LF with CRLF, causing shell scripts and maybe other things to fail.
- This project uses the variable ${PWD} to ease path resolution in bind mounts. In PowerShell/Command Prompt, the PWD environment variable doesn't exist, so you would need to set it manually before running any docker compose command.
git clone https://github.com/SciCatProject/scicatlive.git

docker compose up -d

By running docker compose up -d these steps take place, and each service becomes available at http://${service}.localhost. The frontend is available at simply http://localhost.

Features and services can be enabled or configured by setting docker compose env variables, using docker compose profiles, modifying the service-specific config and adding entrypoints.
Docker compose env variables

They are used to modify existing services whenever enabling a feature requires changes in multiple services. They also have the advantage, compared to docker profiles, of not needing to define a new profile when a new combination of features becomes available. To set an env variable for docker compose, either assign it in the shell or change the .env file. To later unset it, either unset it from the shell or assign it an empty value, either in the shell or in the .env file.
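A minimal sketch of setting and then unsetting such a variable from the shell (`ELASTIC_ENABLED` is one of the documented options; the flow is illustrative):

```shell
# Enable a feature, then disable it again by assigning an empty value.
export ELASTIC_ENABLED=true   # feature on for subsequent docker compose runs
export ELASTIC_ENABLED=       # empty value: feature off again
echo "ELASTIC_ENABLED='${ELASTIC_ENABLED}'"
```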
Docker compose profiles

They are used when adding new services or grouping services together (and do not require changes in multiple services). To enable any, run docker compose --profile <PROFILE> up -d, or export the COMPOSE_PROFILES env variable as described here. If needed, the user can specify more than one profile in the CLI by repeating the flag, as --profile <PROFILE1> --profile <PROFILE2>.
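As a sketch, the two ways of enabling profiles are equivalent (profile names taken from this README):

```shell
# Equivalent to: docker compose --profile search --profile analysis up -d
export COMPOSE_PROFILES="search,analysis"
echo "$COMPOSE_PROFILES"
```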
| Name | Value: enabled services | Default | BE compatibility | Type | Description |
|---|---|---|---|---|---|
| COMPOSE_PROFILES | `analysis`: jupyter; `search`: searchapi; `'*'`: jupyter,searchapi | `''` | * | profile | Enables groups of extra services |
| BE_VERSION | `v3`: backend/v3; `v4`: backend/v4 | `v4` | mongodb, frontend | env | Sets the BE version to use in (2) of the default setup |
| JOBS_ENABLED | `true`: rabbitmq, archivemock, jobs feature | `''` | v3 | env | Creates a RabbitMQ message broker which the BE posts to and the archivemock listens to. It emulates the data long-term archive/retrieve workflow |
| ELASTIC_ENABLED | `true`: elastic feature | `''` | v4 | env | Creates an elastic search service and sets the BE to use it for full-text searches |
| LDAP_ENABLED | `true`: ldap auth | `''` | * | env | Creates an LDAP service and sets the BE to use it as authentication backend |
| OIDC_ENABLED | `true`: oidc auth | `''` | * | env | Creates an OIDC identity provider and sets the BE to use it as authentication backend |
| DEV | `true`: backend, frontend, searchapi, archivemock in DEV mode | `''` | * | env | The SciCat services' environment is prepared to ease the development in a standardized environment |
| <SERVICE>_HTTPS_URL | `<URL>`: HTTPS termination | `''` | * | env | Requests the TLS certificate for the URL to LetsEncrypt through the proxy |

After optionally setting any configuration option, one can still select the services to run as described here.
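For illustration, a `.env` file combining a few of the options above might look like this (the values are examples, not recommendations):

```
# .env — illustrative combination of the options documented above
BE_VERSION=v3
JOBS_ENABLED=true
COMPOSE_PROFILES=analysis
```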
DEV configuration

To provide a consistent environment where developers can work, the DEV=true option creates the SciCat services (see DEV from here for the list), but instead of running them, it just creates the base environment that each service requires. For example, for the backend, instead of running the web server, it creates a NODE environment with git where one can develop and run the unit tests. This is useful, as differences in environments often create collaboration problems. It should also provide an example of the configuration for running tests. Please refer to the services' README for additional information, or to the Dockerfile CMD of the components' GitHub repo if not specified otherwise. The DEV=true option affects the SciCat services only.
Please be patient when using DEV, as each container runs unit tests as part of its init, which might take a little while to finish. This is done to test the compatibility of upstream/latest with the docker compose setup (see warning). To see if any special precaution is required to run the tests, refer to the entrypoints/tests.sh mounted by the volumes. To disable test execution, just comment out the entrypoints/tests.sh mount on the respective service.
DEV mode is very convenient if using VSCode: after the docker services are running, one can attach to a container and start developing using all VSCode features, including version control and debugging.
To prevent unpushed git changes from being lost when a container is restarted, the work folder of each service, when in DEV mode, is mounted to a docker volume, with the naming convention ${COMPOSE_PROJECT_NAME}_<service>_dev. Make sure to push the relevant changes before removing docker volumes.
As the DEV containers pull from upstream/latest, there is no guarantee of their functioning outside of releases. If they fail to start, try, as a first option, to build the image from a tag (e.g. build context) using the TAG, and then git checkout that tag (e.g. set GITHUB_REPO including the branch, using the same syntax and value as the build context).

E.g., for the frontend:
 build:
-  context: https://github.com/SciCatProject/frontend.git
+  context: https://github.com/SciCatProject/frontend.git#v4.4.1
 environment:
-  GITHUB_REPO: https://github.com/SciCatProject/frontend.git
+  GITHUB_REPO: https://github.com/SciCatProject/frontend.git#v4.4.1
If you did not remove the volume, specified a new branch, and had any uncommitted changes, they will be stashed in order to checkout the selected branch. You can later reapply them with git stash apply.
You can enable TLS termination for desired services by setting <SERVICE>_HTTPS_URL to the full URL, including https://. The specified HTTPS URL will get a letsencrypt-generated certificate through the proxy settings. For more details see the proxy instructions. After setting some URLs, the required changes in dependent services are automatically resolved, as explained for example here. Whenever possible, we use either the docker internal network or the localhost subdomains.
Please make sure to set all required <SERVICE>_HTTPS_URL variables whenever enabling one, as mixing public URLs and localhost ones might be tricky. See, for example, what is described in the frontend documentation and the backend documentation.
The service-specific config can be changed whenever a service needs to be configured independently of the others.
Every service folder (inside the services parent directory) contains its configuration and some instructions, at least for the non-third-party containers.
For example, to configure the frontend , the user can change any file in the frontend config folder, for which instructions are available in the README file.
After any configuration change, docker compose up -d
must be rerun, to allow loading the changes.
Sometimes, it is useful to run init scripts (entrypoints) before the service starts. For example, for the frontend's composability, it is useful to specify its configuration through multiple JSON files, with different scopes, which are then merged by an init script. For this reason, one can define service-specific entrypoints (e.g. the frontend ones) which are run inside the container before the service starts (i.e. before the docker compose command is executed). Whenever these entrypoints are shared between services, it is recommended to place them in an entrypoints folder below the outermost service (e.g. this one).

To ease the iterative execution of multiple init scripts, one can leverage the loop_entrypoints utility, which loops alphabetically over /docker-entrypoints/*.sh and executes each. This is in use in some services (e.g. in the frontend), so one can add additional init steps by mounting them, one by one, as volumes inside the container in the /docker-entrypoints folder, naming them depending on the desired order (eventually renaming the existing ones as well).
To add init steps, mount the scripts as volumes under /docker-entrypoints/*.sh, naming them sequentially depending on the desired execution order, or override the entrypoint field in the service command. See for example here.
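The loop-entrypoints idea can be sketched in plain shell as follows. This is an illustrative reimplementation, not the utility's actual code, and the directory is relative here so the sketch is runnable anywhere.

```shell
# Run every mounted init script in alphabetical order before starting the
# service (POSIX glob expansion is already sorted).
ran=0
for f in ./docker-entrypoints/*.sh; do
  [ -e "$f" ] || continue   # glob did not match: no scripts mounted
  sh "$f"                   # execute the init script
  ran=$((ran+1))
done
echo "ran $ran entrypoint script(s)"
```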
Please note that services should, in general, be defined by their responsibility, rather than by their underlying technology, and should be named accordingly.
Basic

To add a new service (see the jupyter service for a minimal example), add a compose.yaml file and a README.md file in the service. If the service to add is not shared globally, but is specific to one particular service or another implementation of the same component, add it to the services folder relative to the affected service, and in (6) add it to its inclusion list. See an example of a service-relative services folder here and a relative inclusion list here.

Since some images are not built with multi-arch, in particular the SciCat ones, make sure to specify the platform of the service in the compose, when needed, to avoid possible issues when running docker compose up on different platforms, for example on Mac with arm64 architecture. See for example the searchapi compose.
To add a new service with advanced configuration (see the backend for an extensive example), if the service supports ENVs, leverage the include override feature from docker compose. For this:
1. create a compose.base.yaml file, e.g. here, which should contain the base configuration, i.e. the one where all ENVs are unset and the features are disabled (e.g. ELASTIC_ENABLED)
2. create a compose.<ENV>.yaml file, e.g. the backend v4 compose.elastic.yaml, with the additional/override config specific to the enabled feature
3. create an empty .compose.<ENV>.yaml, e.g. here. This is used whenever the ENV is unset, as described in the next step
4. use compose.yaml to merge the compose*.yaml files together, making sure to default to .compose.<ENV>.yaml whenever the ENV is not set. See an example here
5. for the backend service, add the selective include in the parent compose.yaml, e.g. here
6. eventually, add entrypoints for init logic, as described here, e.g. like here, including any ENV-specific logic. Remember to set the environment variable in the compose.yaml file. See, for example, the frontend entrypoint and compose file.
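The merge described in steps (1)-(5) can be sketched as a compose fragment. The paths and service name below are illustrative, not SciCatLive's actual compose files.

```yaml
# parent compose.yaml (sketch, not SciCatLive's actual file):
# merge the always-present base config with a per-feature override file.
# In SciCatLive the second path is chosen via env interpolation so that it
# falls back to the empty .compose.<ENV>.yaml when the ENV is unset.
include:
  - path:
      - services/backend/compose.base.yaml     # base: all features disabled
      - services/backend/compose.elastic.yaml  # override: ELASTIC_ENABLED=true case
```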
To use SciCat, please refer to the original documentation .
Backend service

The SciCat backend HTTP service.

Enable additional features

The BE_VERSION value controls which version of the backend should be started, either v3 or v4 (default).

Setting the BACKEND_HTTPS_URL and OIDC_ENABLED env variables requires changing the OIDC configuration, either in the v3 compose.oidc.yaml and providers.oidc.json, or the v4 env file.
Dependencies

Here below we show the internal dependencies of the service, which are not already covered here (if B depends on A, then we visualize it as A --> B). The same subdomain-to-service convention applies.
When setting BACKEND_HTTPS_URL and OIDC_ENABLED, you might need to also set KEYCLOAK_HTTPS_URL to correctly resolve the login flow redirects. A more detailed explanation for v3 can be found here, and it is similar for v4.

graph TD
    ldap --> backend
    keycloak --> backend
Keycloak (OIDC Identity provider)

OIDC is an authentication protocol that verifies user identities when they sign in to access digital resources. SciCat can use an OIDC service as a third-party authentication provider.

Configuration options

The Keycloak configuration is set by the .env file, and the created realm is in the facility-realm.json file.

For an extensive list of available options see here.

Realm creation is only done once, when the container is created.
Default configuration

The default configuration .env file creates the admin user with the admin password. The administration web UI is available at http://keycloak.localhost. Also, a realm called facility is created with the following user and group:
The users' groups are passed to SciCat backend via the OIDC ID Token, in the claim named accessGroups
(an array of strings). The name of the claim can be configured either in login-callbacks.js for v3 or with environment variables for v4.
LDAP (Lightweight Directory Access Protocol) is a protocol used to access and manage directory information such as user credentials. SciCat can use LDAP as a third-party authentication provider.

Configuration options

The OpenLDAP configuration is set by the .env file.

For an extensive list of available options see here.

You can add other users by editing the ldif file. User creation is only done once, when the container is created.

Default configuration

The default configuration .env file creates the dc=facility domain with the following user:
The SciCat backend v3 is the SciCat metadata catalogue RESTful API layer, built on top of the Loopback framework.

Configuration options

The v3 backend configuration is set through files. What follows is a list of available options. Some configurations are very verbose and could be simplified, but v3 has reached end of life in favour of v4, so there is no active development.

datasources.json

It allows setting the connection to the underlying mongo database. It consists of two blocks: the transient one, which should not be changed, and the mongo one, for which we list the configurable options.

mongo

TL;DR: in most cases, it is enough to set the desired url with the full connection string to your mongo instance.

| Name | Description | Value |
|---|---|---|
| host | mongodb host | "mongodb" |
| port | mongodb host port | "27017" |
| url | mongodb full URL. If set, all the other options, apart from useNewUrlParser and allowExtendedOperators, are discarded in favour of this one | "" |
| database | mongodb database | "dacat" |
| password | mongodb user password | "" |
| user | mongodb user | "" |

providers.json

It allows setting the authentication providers. The local block sets the local accounts.
Any file called providers*.json will be merged together by the merge_json.sh . This is done to allow better scoping of providers options.
local

The only option available is to either enable or disable the local authentication. Remove the block if you want to disable it.

config.local.js

It allows setting backend-specific configurations. Here are the commonly changed options.

exports

| Name | Description | Value |
|---|---|---|
| pidPrefix | prefix of the internal IDs | "PID.SAMPLE.PREFIX" |
| doiPrefix | prefix of the published DOIs | "DOI.SAMPLE.PREFIX" |
| policyPublicationShiftInYears | number of years before the data should be made open access. This is only an annotation in the metadata and no action is triggered after expiration | 3 |
| policyRetentionShiftInYears | number of years for which the data should be kept. This is only an annotation in the metadata and no action is triggered after expiration | 10 |
| site | name of the facility running SciCat | "SAMPLE-SITE" |
| queue | message broker flavour for the Jobs | "rabbitmq" |
| logbook.enabled | option to enable scichat | "false" |

Functional Accounts

There are a few functional accounts available for handling data:

| Username | Password | Usage |
|---|---|---|
| admin | 2jf70TPNZsS | Admin |
| ingestor | aman | Ingest datasets |
| archiveManager | aman | Manage archiving of datasets |
| proposalIngestor | aman | Ingest proposals |

Default configuration

In the default configuration folder config, the backend is set to use the mongo container.
Enable additional features

Additionally, by setting the env variable JOBS_ENABLED, the archive mock and rabbitmq services are started and the backend is configured to connect to them.

If LDAP_ENABLED is toggled, you can use LDAP to log in with an LDAP user.

If OIDC_ENABLED is toggled, you can use OIDC to log in with an OIDC user.
With DEV=true, since the v3 tests are supposed to run with an empty DB, the configured DB is dacat_test, which is empty. If you want to use the seeded one later during development, just set dacat as the database value in the file /home/node/app/server/datasources.json on the container.
Here below we show the internal dependencies of the service, which are not already covered here and here (if B depends on A, then we visualize it as A --> B). The same subdomain-to-service convention applies.

graph TD
    rabbitmq --> archivemock
    rabbitmq --> backend
    backend --> archivemock
Archive Mock

The Archive Mock simulates the interactions of an archival system with SciCat.

Service Requirements

The container uses environment variables for configuration.

Default configuration

By default, it is configured to connect to the backend v3 container with the admin account, and to the RabbitMQ container with the guest account. It will then handle all archival and retrieval jobs posted to RabbitMQ, and update the corresponding datasets accordingly in SciCat.
The SciCat backend v4 is a rewrite of the original backend, built on top of the NestJS framework.
Configuration options

The backend-next service is mainly configured via environment variables. For an extensive list of available options see here.

Functional Accounts

There are a few functional accounts available for handling data:

| Username | Password | Usage |
|---|---|---|
| admin | 2jf70TPNZsS | Admin |
| ingestor | aman | Ingest datasets |
| archiveManager | aman | Manage archiving of datasets |
| proposalIngestor | aman | Ingest proposals |

Default configuration

In the default configuration folder config, the backend is set to use the mongo container.
Enable additional features

Additionally, by setting the env variable ELASTIC_ENABLED, the elasticsearch service is started and the backend is configured to connect to it.

If LDAP_ENABLED is toggled, you can use LDAP to log in with an LDAP user.

If OIDC_ENABLED is toggled, you can use OIDC to log in with an OIDC user.
With DEV=true, since the container might have limited memory, it is recommended to run unit tests with the option --runInBand, as here, which makes the tests run sequentially, avoiding filling the RAM, which would make them freeze.
Here below we show the internal dependencies of the service, which are not already covered here and here (if B depends on A, then we visualize it as A --> B). The same subdomain-to-service convention applies.

graph TD
    elasticsearch --> backend
Frontend

The SciCat frontend is the SciCat metadata catalogue web UI, built on top of the Angular framework.

Configuration options

The frontend configuration is set by the config files. Files inside the config folder with a .json extension are merged respecting the alphabetical order of the files in the container, with config.v3.json applied depending on the BE_VERSION.
Please note that merging the config files is a functionality provided by SciCatLive and is not supported natively by the frontend.
For an extensive list of available options see here in the SciCat frontend section.
Default configuration

In the default configuration config, the frontend is set to call the backend service available at backend.localhost (either v4, by default, or v3 if specified otherwise by setting BE_VERSION).
For an explanation of how setting BE_VERSION
changes the environment creation see here .
Since there was a small breaking change from v3 to v4 when connecting to the backend, the BE_VERSION value controls whether the config.v3.json file, which is applied when BE_VERSION=v3, should be included in the config merge process.
With DEV=true, please use npm start -- --host 0.0.0.0. This is to allow traffic from any IP to the frontend component, and it is necessary since the component runs in the docker network.
Setting the BACKEND_HTTPS_URL env variable requires changing the backend URL used by the frontend. This is managed here.
When setting FRONTEND_HTTPS_URL, it is likely you also want to set BACKEND_HTTPS_URL, to allow communication between the two wherever the browser is accessed.
This Jupyter Notebook instance is preconfigured with an example notebook that shows the usage of Pyscicat .
Pyscicat Notebook

This notebook demonstrates all the major actions Pyscicat is capable of:
* logging into the SciCat backend
* dataset creation
* datablock creation
* attachment upload

.env file

It contains the environment variables for connecting to the backend service deployed by this project.

Thumbnail image

An example image that is used for the attachment upload demonstration.

Default configuration

This service is only dependent on the backend service, since it demonstrates communication with the latter through Pyscicat.
The notebooks are mounted to the container from the config/notebooks directory. The changes to these notebooks should not be contributed back to this repository, unless this is intentional. In the case you want to upstream changes to these notebooks, be sure to clear all the results from them.
The main readme covers all dependencies of this package.
Mongodb

The mongodb container is responsible for creating a mongodb container with initial metadata.

All the collection files with the related seed data are in the seed folder, together with the init script here.

To add more collections during the creation of the database:
1. add the corresponding file(s) here, keeping the convention: filename := collectionname.json
2. restart the docker container

These files are ingested into the database using mongo functionalities, bypassing the backend, i.e. they are not to be taken as examples of how to use the backend API.
Default configuration

In the default configuration init.sh, the seeding creates data in the mongodb database used by the backend service (either v4, by default, or v3 if specified otherwise by setting BE_VERSION).
For an explanation of how setting BE_VERSION
changes the environment creation see here .
BE_VERSION
","text":" Since v3 and v4 connect to two different DBs, the BE_VERSION environment variable controls which DB should be seeded ( dacat
for v3 and dacat-next
for v4 ).
The proxy acts as a reverse proxy to the SciCat Live containers.
"},{"location":"services/proxy/#configuration","title":"Configuration","text":""},{"location":"services/proxy/#env-file","title":".env file","text":"It sets proxy options which are rarely changed, for example, the default configuration with the docker network.
"},{"location":"services/proxy/#tlsenv-file","title":".tls.env file","text":"It can be customized to set the TLS options. This has an effect only if the service URLs exposed by traefik are reachable from the public web.
You need to set the letsencrypt options here .
"},{"location":"services/proxy/#enable-tls","title":"Enable TLS","text":" To enable TLS on specific services, you can set the <SERVICE>_HTTPS_URL
env var to the desired URL, including the https://
prefix, making sure that the URLs are reachable by letsencrypt
. See here for an example. This will request the certificate from letsencrypt
.
The SciCat seachAPI is the SciCat metadata catalogue standardised API for communication between SciCat and the PaN portal, built on top of the Loobpack framework.
"},{"location":"services/searchapi/#configuration-options","title":"Configuration options","text":"The searchapi configuration is set by the .env file . For an extensive list of available options see here .
"},{"location":"services/searchapi/#default-configuration","title":"Default configuration","text":" In the default configuration .env file , the searchapi is set to call the backend service
available at backend.localhost
(either v4 , by default, or v3 if specified otherwise by setting BE_VERSION
).
For an explanation of how setting BE_VERSION
changes the environment creation see here .
Get set up with an instance of SciCat to explore the metadata catalog. SciCatLive provides a flexible and easy way to learn about SciCat and its features, aimed at people who are looking to integrate SciCat into their environment. For a user guide, please see the original documentation .
This project requires Docker and Docker Compose; the Docker Compose version must be 2.29.0 or later.
"},{"location":"#first-stable-version","title":"First stable version","text":" Release v3.0
is the first stable and reviewed version of SciCatLive.
Running this project on Windows is not officially supported; you should use Windows Subsystem for Linux (WSL) instead.
However, if you want to run it on Windows, be careful about the following: - This project makes use of symbolic links; Windows and Git for Windows have to be configured to handle them . - Line endings, specifically in shell scripts. If you have the git config parameter core.autocrlf set to true , git will replace LF with CRLF, causing shell scripts (and possibly other things) to fail. - This project uses the variable ${PWD} to ease path resolution in bind mounts. In PowerShell/Command Prompt, the PWD environment variable doesn't exist, so you would need to set it manually before running any docker compose command.
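For example, in PowerShell the missing PWD variable can be defined from the current location before invoking compose (a sketch; adapt to your shell):

```powershell
# PowerShell: the bind mounts in this project reference ${PWD}, which
# PowerShell does not export by default — define it manually first.
$env:PWD = (Get-Location).Path
docker compose up -d
```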
git clone https://github.com/SciCatProject/scicatlive.git\n
docker compose up -d\n
Running docker compose up -d performs the following steps:
Each service is then reachable at http://${service}.localhost ; the frontend is available simply at http://localhost . Some services also expose routes at https://${service}.localhost/${prefix} . For example, the backend API can be explored through a Swagger UI at http://backend.localhost/explorer . For more information on the paths used by these routes, see the original documentation for these services. SciCat has extra features as part of its core, as well as integrations with external services.
SciCat features that extend the backend are: * Jobs - this mechanism posts to a message broker, which can then trigger downstream processes. To use it, a RabbitMQ server must be enabled. * Elasticsearch - creates an elasticsearch service to provide full-text search in the backend.
Services that can be integrated with SciCat are: * LDAP - authentication and authorization from an LDAP server * OIDC - authentication and authorization using an OIDC provider * SearchAPI - for better free-text search in the metadata, based on the PANOSC search-api * JupyterHub - adds an instance of JupyterHub which demonstrates ingestion and extraction of metadata using pyscicat .
To enable extra services, configure them by: 1. setting docker compose env variables 2. using docker compose profiles 3. modifying the service-specific config 4. adding entrypoints
"},{"location":"#dependencies","title":"Dependencies","text":" Here below we show the dependencies, including the ones of the extra services (if B
depends on A
, then we visualize it as A --> B
):
graph TD\n subgraph services\n subgraph backend\n backends[v3*/v4*]\n end\n mongodb --> backend\n backend --> frontend\n backend --> searchapi\n backend --> jupyter\n end\n\n proxy -.- services
We flag with *
the services which have extra internal dependencies, which are not shared.
The user can selectively decide the containers to spin up and the dependencies will be resolved accordingly. The available services are in the services folder and are called consistently.
For example, one could decide to only run the backend
by running (be aware that this will not run the proxy
, so the service will not be available at backend.localhost
):
docker compose up -d backend\n
(or a list of services; for example, with the proxy: docker compose up -d backend proxy
)
This will run, from the previous section , (1) and (2) but skip the rest.
Accordingly, docker compose up -d frontend\n
Will run, from the previous section , (1), (2) and (4) but skip (5).
And
docker compose --profile search up -d searchapi\n
Will run, from the previous section , (1) and (2), skip (3) and (4), and add the searchapi
service.
Make sure to check the backend compatibility when choosing services and setting docker compose env vars and profiles
.
They are used to modify existing services whenever enabling a feature requires changes in multiple services. They also have the advantage, compared to docker profiles, of not needing a new profile definition when a new combination of features becomes available. To set an env variable for docker compose, either assign it in the shell or change the .env file. To later unset it, either unset it in the shell or assign it an empty value, in the shell or in the .env file.
For example, to use the Jobs functionality of SciCat, set JOBS_ENABLED
to true before running your docker compose
command, or simply export it in the shell. For all env configuration options, see here .
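A minimal sketch of the mechanisms just described, run in an empty scratch directory (the .env content shown is illustrative; JOBS_ENABLED is the variable from the docs):

```shell
# Persist the toggle in the .env file read by docker compose
echo 'JOBS_ENABLED=true' >> .env
# ...or export it in the shell for the current session only
export JOBS_ENABLED=true
# To unset the feature later, assign it an empty value in the .env file
sed -i 's/^JOBS_ENABLED=.*/JOBS_ENABLED=/' .env
grep '^JOBS_ENABLED' .env   # -> JOBS_ENABLED=
```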
They are used when adding new services or grouping services together (and do not require changes in multiple services). To enable any, run docker compose --profile <PROFILE> up -d
, or export the COMPOSE_PROFILES
env variable as described here . If needed, the user can specify more than one profile in the CLI by using the flag as --profile <PROFILE1> --profile <PROFILE2>
.
For example, docker compose --profile analysis up -d
sets up a JupyterHub with some notebooks for ingesting data into SciCat, as well as the related services (backend, mongodb, proxy). For more information on the profiles available in SciCatLive, see the following table .
* COMPOSE_PROFILES (env) - analysis : jupyter; search : searchapi; '*' : jupyter,searchapi; '' : none (default). Works with any BE version.
* BE_VERSION (env) - v3 : backend/v3; v4 : backend/v4 (default). Sets the BE version to use in (2) of the default setup; also affects mongodb and frontend.
* JOBS_ENABLED (env, BE v3 only) - true : rabbitmq, archivemock (jobs feature); '' : disabled (default). Creates a RabbitMQ message broker which the BE posts to and the archivemock listens to, emulating the data long-term archive/retrieve workflow.
* ELASTIC_ENABLED (env, BE v4 only) - true : elastic (feature); '' : disabled (default). Creates an elasticsearch service and sets the BE to use it for full-text searches.
* LDAP_ENABLED (env) - true : ldap (auth); '' : disabled (default). Creates an LDAP service and sets the BE to use it as authentication backend.
* OIDC_ENABLED (env) - true : oidc (auth); '' : disabled (default). Creates an OIDC identity provider and sets the BE to use it as authentication backend.
* DEV (env) - true : backend, frontend, searchapi, archivemock in DEV mode; '' : disabled (default). The SciCat services' environment is prepared to ease development in a standardized environment.
* <SERVICE>_HTTPS_URL (env) - <URL> : HTTPS termination; '' : disabled (default). Requests the TLS certificate for the URL to LetsEncrypt through the proxy.

After optionally setting any configuration option, one can still select the services to run as described here .
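As an illustration of combining the options above, a hypothetical .env could select the v3 backend and enable the jobs feature (the values come from the table; the combination itself is just an example):

```shell
# .env — example combination: v3 backend with the RabbitMQ-backed jobs feature
BE_VERSION=v3
JOBS_ENABLED=true
```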
"},{"location":"#dev-configuration","title":"DEV configuration","text":"(click to expand) To provide a consistent environment where developers can work, the DEV=true
option creates the SciCat services (see DEV from here for the list), but instead of running them, it just creates the base environment that each service requires. For example, for the backend
, instead of running the web server, it creates a NODE environment with git
where one can develop and run the unit tests. This is useful as often differences in environments create collaboration problems. It should also provide an example of the configuration for running tests. Please refer to the services' README for additional information, or to the Dockerfile CMD
of the components' GitHub repo if not specified otherwise. The DEV=true
affects the SciCat services only.
Please be patient when using DEV, as each container runs unit tests as part of its init, which might take a while to finish. This is done to test the compatibility of upstream/latest with the docker compose
setup (see warning). To see if any special precaution is required to run the tests, refer to the entrypoints/tests.sh
mounted by the volumes. To disable test execution, just comment out the entrypoints/tests.sh
mount on the respective service.
This setup is very convenient when using VSCode : after the docker services are running, one can attach to a container and start developing using all VSCode features, including version control and debugging.
To prevent unpushed git changes from being lost when a container is restarted, the work folder of each service, when in DEV mode, is mounted to a docker volume, with the naming convention ${COMPOSE_PROJECT_NAME}_<service>_dev
. Make sure to push any relevant changes before removing docker volumes.
As the DEV containers pull from upstream/latest, there is no guarantee that they work outside of releases. If they fail to start, try, as a first option, building the image from a tag (e.g. in the build context ) and then checking out that tag (e.g. set GITHUB_REPO including the branch, using the same syntax and value as the build context).
e.g., for the frontend:
build:\n- context: https://github.com/SciCatProject/frontend.git\n+ context: https://github.com/SciCatProject/frontend.git#v4.4.1\n environment:\n- GITHUB_REPO: https://github.com/SciCatProject/frontend.git\n+ GITHUB_REPO: https://github.com/SciCatProject/frontend.git#v4.4.1\n
If you did not remove the volume, specified a new branch, and had any uncommitted changes, they will be stashed before checking out the selected branch. You can later reapply them with git stash apply
.
You can enable TLS termination for the desired services by setting <SERVICE>_HTTPS_URL
to the full URL, including the https://
prefix. The specified HTTPS URL will get a letsencrypt
-generated certificate through the proxy setting. For more details see the proxy instructions . After setting some URLs, the required changes in dependent services are resolved automatically, as explained for example here . Whenever possible, we use either the docker internal network or the localhost subdomains.
Please make sure to set all required <SERVICE>_HTTPS_URL
whenever enabling one, as mixing public URLs and localhost
ones might be tricky. See, for example, what is described in the frontend documentation and the backend documentation .
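For instance, TLS termination for the frontend and backend could be enabled together like this (a sketch; the domain names are placeholders):

```shell
# .env — hypothetical public URLs; both are set so the frontend can reach
# the backend over HTTPS, avoiding a mix of public and localhost URLs
FRONTEND_HTTPS_URL=https://scicat.example.com
BACKEND_HTTPS_URL=https://backend.scicat.example.com
```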
It can be changed whenever a service needs to be configured independently of the others.
Every service folder (inside the services parent directory) contains its configuration and some instructions, at least for the non-third-party containers.
For example, to configure the frontend , the user can change any file in the frontend config folder, for which instructions are available in the README file.
After any configuration change, docker compose up -d
must be rerun to load the changes.
Sometimes it is useful to run init scripts (entrypoints) before the service starts. For example, for the frontend
's composability, it is useful to specify its configuration through multiple JSON files with different scopes, which are then merged by an init script . For this reason, one can define service-specific entrypoints
(e.g. the frontend ones ) which are run inside the container before the service starts (i.e. before the docker compose command
is executed). Whenever these entrypoints are shared between services, it is recommended to place them in an entrypoints
folder below the outermost service (e.g. this one ).
To ease the iterative execution of multiple init scripts, one can leverage the loop_entrypoints utility, which loops alphabetically over /docker-entrypoints/*.sh
and executes each script. This is in use in some services (e.g. in the frontend ), so one can add additional init steps by mounting them, one by one, as volumes inside the container in the /docker-entrypoints
folder, naming them according to the desired order (renaming the existing ones as well, if necessary).
To add an entrypoint, mount the script(s) as volumes into /docker-entrypoints/*.sh
, naming them sequentially depending on the desired execution order, and set the entrypoint
field in the service command
accordingly. See for example here .
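A sketch of what such a mount could look like in a service's compose file (the script names and paths are illustrative):

```yaml
# Mount extra init scripts so loop_entrypoints picks them up alphabetically
# before the service command runs; the "10_"/"20_" prefixes fix the order.
services:
  frontend:
    volumes:
      - ./entrypoints/10_merge_config.sh:/docker-entrypoints/10_merge_config.sh:ro
      - ./entrypoints/20_extra_setup.sh:/docker-entrypoints/20_extra_setup.sh:ro
```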
"},{"location":"#add-a-new-service","title":"Add a new service","text":"Please note that services should, in general, be defined by their responsibility, rather than by their underlying technology, and should be named so.
"},{"location":"#basic","title":"Basic","text":"To add a new service (see the jupyter service for a minimal example):
compose.yaml
file README.md
file in the service * if the service to add is not shared globally, but specific to one particular service or another implementation of the same component, add it to the services
folder relative to the affected service, and in (6) add it to its inclusion list. See an example of a service relative services folder here and a relative inclusion list here .
Since some images, in particular the SciCat ones, are not built multi-arch, make sure to specify the platform of the service in the compose file when needed, to avoid possible issues when running docker compose up
on different platforms, for example on a Mac with arm64 architecture. See for example the searchapi compose .
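A minimal compose.yaml for a hypothetical new service could look like this (image and service names are placeholders):

```yaml
# Sketch of a new service definition; "platform" is pinned because some
# SciCat images are amd64-only, as noted above.
services:
  mynewservice:
    image: example.org/mynewservice:latest
    platform: linux/amd64
    restart: unless-stopped
```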
To add a new service, with advanced configuration (see the backend for an extensive example):
if applicable, i.e. if the service supports ENVs , leverage the include override feature from docker compose. For this: 1. create a compose.base.yaml
file, e.g. here , which should contain the base
configuration, i.e. the one where all ENVs are unset and the features are disabled (e.g. ELASTIC_ENABLED
) 2. create a compose.<ENV>.yaml
file, e.g. the backend v4 compose.elastic.yaml , with the additional/override config specific to the enabled feature 3. create a .compose.<ENV>.yaml
file, e.g. here . This is used whenever the ENV
is unset, as described in the next step 4. create a compose.yaml
to merge the compose*.yaml
files together, making sure to default to .compose.<ENV>.yaml
whenever the ENV
is not set. See an example here 5. as for the backend
service, add the selective include in the parent compose.yaml, e.g. here 6. if needed, add entrypoints for init logics, as described here , e.g. like here , including any ENV-specific logic. Remember to set the environment variable in the compose.yaml file. See, for example, the frontend entrypoint and compose file .
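One way such a merge could be expressed is via compose variable interpolation in the include paths. This is a sketch, not the project's actual file: ELASTIC_COMPOSE_FILE is a hypothetical helper variable, and .compose.elastic.yaml is assumed to be the empty placeholder described above.

```yaml
# compose.yaml — point ELASTIC_COMPOSE_FILE at compose.elastic.yaml to enable
# the feature; when it is unset, the empty placeholder is included instead,
# so the include list is always valid.
include:
  - path:
      - compose.base.yaml
      - ${ELASTIC_COMPOSE_FILE:-.compose.elastic.yaml}
```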
To use SciCat, please refer to the original documentation .
"},{"location":"services/backend/","title":"Backend service","text":""},{"location":"services/backend/#backend-service","title":"Backend service","text":"The SciCat backend HTTP service.
"},{"location":"services/backend/#enable-additional-features","title":"Enable additional features","text":" The BE_VERSION
value controls which version of the backend should be started, either v3 or v4 (default).
Setting the BACKEND_HTTPS_URL and OIDC_ENABLED env variables requires changing the OIDC configuration, either in the v3 compose.oidc.yaml and providers.oidc.json , or the v4 env file .
"},{"location":"services/backend/#dependencies","title":"Dependencies","text":" Here below we show the internal dependencies of the service, which are not already covered here (if B
depends on A
, then we visualize it as A --> B
). The same subdomain to service convention applies.
When setting BACKEND_HTTPS_URL
and OIDC_ENABLED
, you might need to also set KEYCLOAK_HTTPS_URL
to correctly resolve the login flow redirects. A more detailed explanation for v3 can be found here, and it is similar for v4.
graph TD\n ldap --> backend\n keycloak --> backend
"},{"location":"services/backend/services/keycloak/","title":"Keycloak (OIDC Identity provider)","text":""},{"location":"services/backend/services/keycloak/#keycloak-oidc-identity-provider","title":"Keycloak (OIDC Identity provider)","text":"OIDC is an authentication protocol that verifies user identities when they sign in to access digital resources. SciCat can use an OIDC service as third-party authentication provider.
"},{"location":"services/backend/services/keycloak/#configuration-options","title":"Configuration options","text":"The Keycloak configuration is set by the .env file and the realm created is in facility-realm.json file .
For an extensive list of available options see here .
Realm creation is only done once, when the container is created.
"},{"location":"services/backend/services/keycloak/#default-configuration","title":"Default configuration","text":" The default configuration .env file creates the admin
user with the admin
password. Administration web UI is available at http://keycloak.localhost
Also a realm called facility
is created with the following user and group:
The users' groups are passed to SciCat backend via the OIDC ID Token, in the claim named accessGroups
(an array of strings). The name of the claim can be configured either in login-callbacks.js for v3 or with environment variables for v4.
LDAP (Lightweight Directory Access Protocol) is a protocol used to access and manage directory information such as user credentials. SciCat can use LDAP as third-party authentication provider.
"},{"location":"services/backend/services/ldap/#configuration-options","title":"Configuration options","text":"The OpenLDAP configuration is set by the .env file .
For an extensive list of available options see here .
You can add other users by editing the ldif file . User creation is only done once, when the container is created.
"},{"location":"services/backend/services/ldap/#default-configuration","title":"Default configuration","text":" The default configuration .env file creates the dc=facility
domain with the following user:
The SciCat backend v3 is the SciCat metadata catalogue RESTful API layer, built on top of the Loopback framework.
"},{"location":"services/backend/services/v3/#configuration-options","title":"Configuration options","text":"The v3 backend configuration is set through files. What follows is a list of available options. Some configurations are very verbose and could be simplified, but v3 has reached end of life in favour of v4, thus there is no active development.
"},{"location":"services/backend/services/v3/#datasourcesjson","title":"datasources.json","text":"It allows setting the connection to the underlying mongo database. It consists of two blocks: It consists of two blocks, the transient one which should not be changed and the mongo one for which we list the options that can be configured.
"},{"location":"services/backend/services/v3/#mongo","title":"mongo","text":"TL;DR in most cases, is enough to set the desired url with the full connection string to your mongo instance.
Name Description Value host mongodb host \"mongodb\" port mongodb host port \"27017\" url mongodb full URL. If set, all the other options, apart from useNewUrlParser
and allowExtendedOperators
are discarded in favour of this one \"\" database mongodb database \"dacat\" password mongodb user password \"\" user mongodb user password \"\""},{"location":"services/backend/services/v3/#providersjson","title":"providers.json","text":" It allows setting the authentication providers. The local
block sets the local accounts.
Any file called providers*.json will be merged together by merge_json.sh . This is done to allow better scoping of provider options.
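The merging behaviour can be sketched with jq (merge_json.sh is the project's script; this jq one-liner is only an illustrative stand-in for the idea of combining scoped provider files):

```shell
# Two hypothetical scoped provider files...
echo '{"local": {}}' > providers.local.json
echo '{"ldap": {}}' > providers.ldap.json
# ...shallow-merged into a single providers.json, as merge_json.sh would do
jq -s 'add' providers.local.json providers.ldap.json > providers.json
jq -c . providers.json   # prints: {"local":{},"ldap":{}}
```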
"},{"location":"services/backend/services/v3/#local","title":"local","text":"The only option available is to either enable or disable the local authentication. Remove the block if you want to disable that.
"},{"location":"services/backend/services/v3/#configlocaljs","title":"config.local.js","text":"It allows setting backend-specific configurations. Here are the commonly changed options.
"},{"location":"services/backend/services/v3/#exports","title":"exports","text":"Name Description Value pidPrefix prefix of the internal IDs \"PID.SAMPLE.PREFIX\" doiPrefix prefix of the published DOIs \"DOI.SAMPLE.PREFIX\" policyPublicationShiftInYears number of years before the data should be made open access. This is only an annotation in the metadata and no action is triggered after expiration 3 policyRetentionShiftInYears number of years by which the data should be kept. This is only an annotation in the metadata and no action is triggered after expiration 10 site name of the facility runnin SciCat \"SAMPLE-SITE\" queue message broker flavour for the JOBs \"rabbitmq\" logbook.enabled option to enable scichat \"false\""},{"location":"services/backend/services/v3/#functional-accounts","title":"Functional Accounts","text":"There are a few functional accounts available for handling data:
Username Password Usage admin 2jf70TPNZsS Admin ingestor aman Ingest datasets archiveManager aman Manage archiving of datasets proposalIngestor aman Ingest proposals"},{"location":"services/backend/services/v3/#default-configuration","title":"Default configuration","text":"In the default configuration folder config , the backend is set to use the mongo container .
"},{"location":"services/backend/services/v3/#enable-additional-features","title":"Enable additional features","text":" Additionally, by setting the env variable JOBS_ENABLED
, the archive mock and rabbitmq services are started and the backend is configured to connect to them.
If LDAP_ENABLED
is toggled, you can use LDAP to log in with an LDAP user .
If OIDC_ENABLED
is toggled, you can use OIDC to log in with an OIDC user .
With DEV=true
, since the v3 tests are supposed to run against an empty DB, the configured DB is dacat_test, which is empty. If you want to use the seeded one later during development, just set dacat
as the database value in the file /home/node/app/server/datasources.json
on the container.
Below we show the internal dependencies of the service which are not already covered here and here (if B
depends on A
, we visualize it as A --> B
). The same subdomain-to-service convention applies.
graph TD\n rabbitmq --> archivemock\n rabbitmq --> backend\n backend --> archivemock
"},{"location":"services/backend/services/v3/services/archivemock/","title":"Archive Mock","text":""},{"location":"services/backend/services/v3/services/archivemock/#archive-mock","title":"Archive Mock","text":"The Archive Mock simulates the interactions of an archival mock with SciCat.
"},{"location":"services/backend/services/v3/services/archivemock/#service-requirements","title":"Service Requirements","text":"The container uses environment variables for configuration.
"},{"location":"services/backend/services/v3/services/archivemock/#default-configuraiton","title":"Default configuraiton","text":" By default, it is configured to connect to the backend v3 container with the admin
account, and to the RabbitMQ container with the guest
account. It will then handle all archival and retrieval jobs posted to RabbitMQ, and update the corresponding Datasets accordingly in Scicat.
The SciCat backend v4 is a rewrite of the original backend, built on top of the NestJS framework.
"},{"location":"services/backend/services/v4/#configuration-options","title":"Configuration options","text":"The backend-next service is mainly configured via environment variables. For an extensive list of available options see here .
"},{"location":"services/backend/services/v4/#functional-accounts","title":"Functional Accounts","text":"There are a few functional accounts available for handling data:
Username Password Usage admin 2jf70TPNZsS Admin ingestor aman Ingest datasets archiveManager aman Manage archiving of datasets proposalIngestor aman Ingest proposals"},{"location":"services/backend/services/v4/#default-configuration","title":"Default configuration","text":"In the default configuration folder config , the backend is set to use the mongo container .
"},{"location":"services/backend/services/v4/#enable-additional-features","title":"Enable additional features","text":" Additionally, by setting the env variable ELASTIC_ENABLED
, the elasticsearch service is started and the backend is configured to connect to it.
If LDAP_ENABLED
is toggled, you can use LDAP to log in with an LDAP user .
If OIDC_ENABLED
is toggled, you can use OIDC to log in with an OIDC user .
With DEV=true
, since the container might have limited memory, it is recommended to run the unit tests with the option --runInBand
, as here , which makes the tests run sequentially, avoiding filling up the RAM, which would make them freeze.
Below we show the internal dependencies of the service which are not already covered here and here (if B
depends on A
, we visualize it as A --> B
). The same subdomain-to-service convention applies.
graph TD\n elasticsearch --> backend
"},{"location":"services/frontend/","title":"Frontend","text":""},{"location":"services/frontend/#frontend","title":"Frontend","text":"The SciCat frontend is the SciCat metadata catalogue web UI, built on top of the Angular framework.
"},{"location":"services/frontend/#configuration-options","title":"Configuration options","text":" The frontend configuration is set by the config files . Files inside the config folder, with a .json
extension are merged respecting the alphabetical order of the files in the container , with config.v3.json applied depending on the BE_VERSION .
Please note that merging the config files is a functionality provided by SciCat Live
and is not supported natively by the frontend
.
For an extensive list of available options see here in the SciCat frontend section.
"},{"location":"services/frontend/#default-configuration","title":"Default configuration","text":" In the default configuration config , the frontend is set to call the backend service
available at backend.localhost
(either v4 , by default, or v3 if specified otherwise by setting BE_VERSION
).
For an explanation of how setting BE_VERSION
changes the environment creation see here .
Since there was a small breaking change from v3
to v4
when connecting to the backend
, the BE_VERSION
value controls whether the config.v3.json file , which is applied when BE_VERSION=v3
, should be included in the config merge process.
With DEV=true
, please use npm start -- --host 0.0.0.0
. This is to allow traffic from any IP to the frontend
component and it is necessary since the component runs in the docker network.
Setting the BACKEND_HTTPS_URL env variable requires changing the backend
URL used by the frontend
. This is managed here .
When setting FRONTEND_HTTPS_URL
, you will likely also want to set BACKEND_HTTPS_URL
, to allow communication between the two wherever the browser is accessed.
This Jupyter Notebook instance is preconfigured with an example notebook that shows the usage of Pyscicat .
"},{"location":"services/jupyter/#pyscicat-notebook","title":"Pyscicat Notebook","text":"This notebook demonstrates all the major actions Pyscicat is capable of: * logging into SciCat backend * dataset creation * datablock creation * attachment upload
"},{"location":"services/jupyter/#env-file","title":".env file","text":"It contains the environment variables for connecting to the backend service deployed by this project
"},{"location":"services/jupyter/#thumbnail-image","title":"Thumbnail image","text":"An example image that is used for the attachment upload demonstration
"},{"location":"services/jupyter/#default-configuration","title":"Default configuration","text":"This service is only dependant on the backend service, since it demonstrates communication with the latter through Pyscicat.
The notebooks are mounted into the container from the config/notebooks directory. Changes to these notebooks should not be contributed back to this repository unless intentional. If you want to upstream changes to these notebooks, be sure to clear all outputs from them.
The main readme covers all dependencies of this package.
"},{"location":"services/mongodb/","title":"Mongodb","text":""},{"location":"services/mongodb/#mongodb","title":"Mongodb","text":" The mongodb
container is responsible of creating a mongodb container with initial metadata.
All files collection created with relative data are in the seed folder and the init script here
To add more collections during the creation of the database: 1. add the corresponding file(s) here , keeping the convention filename := collectionname.json
. 2. restart the docker container.
These files are ingested into the database using mongo functionality, bypassing the backend, i.e. they are not to be taken as examples of how to use the backend API.
"},{"location":"services/mongodb/#default-configuration","title":"Default configuration","text":" In the default configuration init.sh , the seeding creates data in the mongodb database used by the backend
service (either v4 , by default, or v3 if specified otherwise by setting BE_VERSION
).
For an explanation of how setting BE_VERSION
changes the environment creation see here .
BE_VERSION
","text":" Since v3 and v4 connect to two different DBs, the BE_VERSION environment variable controls which DB should be seeded ( dacat
for v3 and dacat-next
for v4 ).
The proxy acts as a reverse proxy to the SciCat Live containers.
"},{"location":"services/proxy/#configuration","title":"Configuration","text":""},{"location":"services/proxy/#env-file","title":".env file","text":"It sets proxy options which are rarely changed, for example, the default configuration with the docker network.
"},{"location":"services/proxy/#tlsenv-file","title":".tls.env file","text":"It can be customized to set the TLS options. This has an effect only if the service URLs exposed by traefik are reachable from the public web.
You need to set the letsencrypt options here .
"},{"location":"services/proxy/#enable-tls","title":"Enable TLS","text":" To enable TLS on specific services, you can set the <SERVICE>_HTTPS_URL
env var to the desired URL, including the https://
prefix, making sure that the URLs are reachable by letsencrypt
. See here for an example. This will request the certificate from letsencrypt
.
The SciCat searchAPI is the SciCat metadata catalogue's standardised API for communication between SciCat and the PaN portal, built on top of the Loopback framework.
"},{"location":"services/searchapi/#configuration-options","title":"Configuration options","text":"The searchapi configuration is set by the .env file . For an extensive list of available options see here .
"},{"location":"services/searchapi/#default-configuration","title":"Default configuration","text":" In the default configuration .env file , the searchapi is set to call the backend service
available at backend.localhost
(either v4 , by default, or v3 if specified otherwise by setting BE_VERSION
).
For an explanation of how setting BE_VERSION
changes the environment creation see here .