A web application to extract and curate research software metadata following the CodeMeta software metadata standard.
SMECS facilitates the extraction of research software metadata from repositories on GitHub/GitLab. It offers a user-friendly graphical user interface for visualizing the retrieved metadata. This empowers researchers to create good metadata for their research software without reentering data which is already available elsewhere. Ultimately, SMECS delivers the curated metadata in JSON format, enhancing usability and accessibility.
Authors: Stephan Ferenz, Aida Jafarbigloo
The figure below illustrates the sequential processes and data flows within SMECS. First, users input data, triggering the tool to extract metadata associated with specific URLs. This metadata is then visualized, allowing users to review and interact with it. Users can curate, modify, and finalize the metadata according to their needs. Once satisfied, they can download the curated metadata in JSON format, providing an interoperable output for further use.
- Metadata Extraction Stage
- Metadata Extraction
- SMECS extracts metadata from GitHub and GitLab repositories. For details on the specific metadata that SMECS can extract, please refer to Metadata Terms in SMECS.
- API Interactions: Use GitHub and GitLab APIs to fetch relevant metadata.
- Data Parsing: Analyze the retrieved metadata and translate it into CodeMeta metadata for further processing.
- Cross-Walk and Metadata Mapping
- Standardization: Align metadata fields from GitHub and GitLab to a common dictionary.
- Field Matching: Map equivalent fields between GitHub and GitLab. For example, mapping GitHub "topics" to GitLab "keywords".
- Visualization and Curation Stage
- Visualization: Extracted metadata is displayed in a structured form.
- User Interface: Interactive and simple UI for exploring the extracted and curated metadata.
- Metadata Curation: Refine the extracted metadata based on user preferences.
- Missing Metadata Identification: Identify and highlight fields where metadata is absent.
- User Input for Missing Metadata: Enable users to add missing metadata directly via the user interface.
- Real-Time Metadata Curation: Show the CodeMeta JSON representation of the metadata in real time while the form is being edited. Changes flow in one direction only, from the form to the JSON view.
- Export Stage
- Export Formats: Save extracted and curated metadata in JSON format.
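The stages above can be sketched in a few lines of Python. This is a minimal illustration, not the actual SMECS implementation: the mapping table GITHUB_TO_CODEMETA, the helpers to_codemeta and missing_terms, and the sample repository record are assumptions made for this example. The keys on the left follow field names of the GitHub REST API repository response, and the values on the right are CodeMeta terms.

```python
import json

# Illustrative crosswalk (an assumption, not SMECS's real mapping):
# GitHub REST API repository field -> CodeMeta term.
GITHUB_TO_CODEMETA = {
    "name": "name",
    "description": "description",
    "html_url": "codeRepository",
    "topics": "keywords",
    "language": "programmingLanguage",
}


def to_codemeta(repo: dict) -> dict:
    """Translate a GitHub repository record into a CodeMeta-style dict."""
    metadata = {
        "@context": "https://w3id.org/codemeta/3.0",
        "@type": "SoftwareSourceCode",
    }
    for source_field, codemeta_term in GITHUB_TO_CODEMETA.items():
        value = repo.get(source_field)
        # Skip empty values so they can be reported as missing later.
        if value not in (None, "", []):
            metadata[codemeta_term] = value
    return metadata


def missing_terms(metadata: dict) -> list:
    """List the mapped CodeMeta terms that have no value yet."""
    return [term for term in GITHUB_TO_CODEMETA.values() if term not in metadata]


# Hypothetical record, shaped like a GitHub API response with no description.
sample_repo = {
    "name": "SMECS",
    "description": None,
    "html_url": "https://github.com/NFDI4Energy/SMECS",
    "topics": ["metadata", "codemeta"],
    "language": "Python",
}

metadata = to_codemeta(sample_repo)
print("Missing:", missing_terms(metadata))

# Curation step: the user supplies the missing value, then exports JSON.
metadata["description"] = "Extracts and curates research software metadata."
print(json.dumps(metadata, indent=2))
```

Keeping the crosswalk in a single dictionary means that the extraction, the missing-field check, and the export all read from the same list of terms, so adding a field requires only one new entry.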
- Cloning the repository
- Copy the URL of the project from Clone with HTTPS.
- Change the current working directory to the desired location.
- Run
git clone <URL>
in the command prompt. (Git Bash can be used as well.)
- Creating virtual environment
Make sure Python is installed.
- Ensure you can run Python from command prompt.
- On Windows: Run
py --version
- On Unix/macOS: Run
python3 --version
- Create the virtual environment by running this code in the command prompt.
- On Windows: Run
py -m venv <name-of-virtual-environment>
- On Unix/macOS: Run
python3 -m venv <name-of-virtual-environment>
For more details, visit Creation of virtual environments.
- Activate the virtual environment.
- On Windows: Run
env\Scripts\activate
- On Unix/macOS: Run
source env/bin/activate
Here, env is the name chosen for the virtual environment; replace it with the name you used when creating it. Note that activating the virtual environment changes the shell's prompt to show which virtual environment is in use.
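Besides the changed prompt, you can confirm from Python itself that a virtual environment is active. The sketch below relies on the standard-library behavior that sys.prefix differs from sys.base_prefix inside an environment created with venv. The helper name in_virtualenv is made up for this example.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-created environment."""
    # Inside a virtual environment, sys.prefix points at the environment,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("Virtual environment active:", in_virtualenv())
    print("Interpreter prefix:", sys.prefix)
```

Running this with the activated interpreter should report that the environment is active. If it does not, the activation step most likely ran in a different shell.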
- Managing Packages with pip
- Ensure you can run pip from command prompt.
- On Windows: Run
py -m pip --version
- On Unix/macOS: Run
python3 -m pip --version
- Install the requirements listed in requirements.txt.
- On Windows: Run
py -m pip install -r requirements.txt
- On Unix/macOS: Run
python3 -m pip install -r requirements.txt
For more details, visit Installing Packages.
Running the project
- Open the project in an editor (e.g. VS Code).
- Run the project.
- On Windows: Run
py manage.py runserver
- On Unix/macOS: Run
python3 manage.py runserver
- To see the output in the browser, follow the link shown in the terminal (e.g. http://127.0.0.1:8000/).
To get started with SMECS using Docker, follow the steps below:
- Prerequisites
- Make sure Docker is installed on your local machine.
- Cloning the Repository
git clone https://github.com/NFDI4Energy/SMECS.git
- Navigate to the Project Directory
cd SMECS
- Building the Docker Images
docker-compose build
- Starting the Services
docker-compose up
- Accessing the Application
- Navigate to
http://localhost:8000
in your web browser.
- Stopping the Services
docker-compose down
Setting Up GitLab/GitHub Personal Token
To enhance the functionality of this program and ensure secure interactions with the GitLab/GitHub API, users are required to provide their personal access token. Follow these steps to integrate your token:
- Generate a GitLab Token:
- Visit Create a personal access token for more information on how to generate a new token.
- Generate a GitHub Token:
- Visit Managing your personal access tokens for more information on how to generate a new token.
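To illustrate how such a token is typically used, the sketch below builds the API URL and authentication headers for a repository URL. It is an assumption-laden example, not the code SMECS runs: the function name api_request_parts and the rule "any host other than github.com is treated as GitLab" are made up here. The header formats themselves follow the public REST API conventions, where GitHub accepts the token in an Authorization header and GitLab accepts it in a PRIVATE-TOKEN header.

```python
from __future__ import annotations

from urllib.parse import quote, urlparse


def api_request_parts(repo_url: str, token: str) -> tuple[str, dict[str, str]]:
    """Return the (API URL, headers) pair for a repository URL.

    Hypothetical helper: hosts other than github.com are assumed to be GitLab.
    """
    parsed = urlparse(repo_url)
    path = parsed.path.strip("/")
    if path.endswith(".git"):
        path = path[: -len(".git")]

    if parsed.netloc == "github.com":
        # GitHub REST API: the token is sent in the Authorization header.
        headers = {
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        }
        return f"https://api.github.com/repos/{path}", headers

    # GitLab REST API: the project path must be URL-encoded ("/" -> "%2F"),
    # and the token is sent in the PRIVATE-TOKEN header.
    project_id = quote(path, safe="")
    return f"https://{parsed.netloc}/api/v4/projects/{project_id}", {"PRIVATE-TOKEN": token}


if __name__ == "__main__":
    # "example-token" is a placeholder; never hard-code a real token.
    url, headers = api_request_parts("https://github.com/NFDI4Energy/SMECS.git", "example-token")
    print(url)
    print(sorted(headers))
```

The returned pair can be passed to any HTTP client, for example urllib.request.Request(url, headers=headers). Reading the token from an environment variable or a prompt, rather than from source code, keeps it out of version control.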
Tip for developers
If the page does not refresh correctly, clear the browser cache. You can force Chrome to pull in new data and ignore the saved ("cached") data by using the keyboard shortcut Cmd+Shift+R on Mac, and Ctrl+F5 or Ctrl+Shift+R on Windows.
We believe in the power of collaboration and welcome contributions from the community to enhance the SMECS workflow. Whether you have found a bug, have a feature idea, or want to share feedback, your contribution matters. Feel free to submit a pull request, open up an issue, or reach out with any questions or concerns.
To see upcoming features, please refer to our open issues.
The code is licensed under the GNU Affero General Public License v3.0 or later (AGPL-3.0-or-later).
See LICENSE.txt for further information.
We would like to thank meta_tool for providing the foundational framework upon which this project is built.